Finger vein identification using fuzzy-based k-nearest centroid neighbor classifier
NASA Astrophysics Data System (ADS)
Rosdi, Bakhtiar Affendi; Jaafar, Haryati; Ramli, Dzati Athiar
2015-02-01
In this paper, a new approach for personal identification using finger vein image is presented. Finger vein is an emerging type of biometrics that attracts attention of researchers in biometrics area. As compared to other biometric traits such as face, fingerprint and iris, finger vein is more secured and hard to counterfeit since the features are inside the human body. So far, most of the researchers focus on how to extract robust features from the captured vein images. Not much research was conducted on the classification of the extracted features. In this paper, a new classifier called fuzzy-based k-nearest centroid neighbor (FkNCN) is applied to classify the finger vein image. The proposed FkNCN employs a surrounding rule to obtain the k-nearest centroid neighbors based on the spatial distributions of the training images and their distance to the test image. Then, the fuzzy membership function is utilized to assign the test image to the class which is frequently represented by the k-nearest centroid neighbors. Experimental evaluation using our own database which was collected from 492 fingers shows that the proposed FkNCN has better performance than the k-nearest neighbor, k-nearest-centroid neighbor and fuzzy-based-k-nearest neighbor classifiers. This shows that the proposed classifier is able to identify the finger vein image effectively.
Generative Models for Similarity-based Classification
2007-01-01
NC), local nearest centroid (local NC), k-nearest neighbors ( kNN ), and condensed nearest neighbors (CNN) are all similarity-based classifiers which...vector machine to the k nearest neighbors of the test sample [80]. The SVM- KNN method was developed to address the robustness and dimensionality...concerns that afflict nearest neighbors and SVMs. Similarly to the nearest-means classifier, the SVM- KNN is a hybrid local and global classifier developed
An Improvement To The k-Nearest Neighbor Classifier For ECG Database
NASA Astrophysics Data System (ADS)
Jaafar, Haryati; Hidayah Ramli, Nur; Nasir, Aimi Salihah Abdul
2018-03-01
The k nearest neighbor (kNN) is a non-parametric classifier and has been widely used for pattern classification. However, in practice, the performance of kNN often tends to fail due to the lack of information on how the samples are distributed among them. Moreover, kNN is no longer optimal when the training samples are limited. Another problem observed in kNN is regarding the weighting issues in assigning the class label before classification. Thus, to solve these limitations, a new classifier called Mahalanobis fuzzy k-nearest centroid neighbor (MFkNCN) is proposed in this study. Here, a Mahalanobis distance is applied to avoid the imbalance of samples distribition. Then, a surrounding rule is employed to obtain the nearest centroid neighbor based on the distributions of training samples and its distance to the query point. Consequently, the fuzzy membership function is employed to assign the query point to the class label which is frequently represented by the nearest centroid neighbor Experimental studies from electrocardiogram (ECG) signal is applied in this study. The classification performances are evaluated in two experimental steps i.e. different values of k and different sizes of feature dimensions. Subsequently, a comparative study of kNN, kNCN, FkNN and MFkCNN classifier is conducted to evaluate the performances of the proposed classifier. The results show that the performance of MFkNCN consistently exceeds the kNN, kNCN and FkNN with the best classification rates of 96.5%.
Comparative Analysis of Document level Text Classification Algorithms using R
NASA Astrophysics Data System (ADS)
Syamala, Maganti; Nalini, N. J., Dr; Maguluri, Lakshamanaphaneendra; Ragupathy, R., Dr.
2017-08-01
From the past few decades there has been tremendous volumes of data available in Internet either in structured or unstructured form. Also, there is an exponential growth of information on Internet, so there is an emergent need of text classifiers. Text mining is an interdisciplinary field which draws attention on information retrieval, data mining, machine learning, statistics and computational linguistics. And to handle this situation, a wide range of supervised learning algorithms has been introduced. Among all these K-Nearest Neighbor(KNN) is efficient and simplest classifier in text classification family. But KNN suffers from imbalanced class distribution and noisy term features. So, to cope up with this challenge we use document based centroid dimensionality reduction(CentroidDR) using R Programming. By combining these two text classification techniques, KNN and Centroid classifiers, we propose a scalable and effective flat classifier, called MCenKNN which works well substantially better than CenKNN.
Nation-Building Modeling and Resource Allocation Via Dynamic Programming
2014-09-01
Figure 2. RAND Study Models[59:98,115] (WMA) and used both the k-Nearest Neighbor ( KNN ) and Nearest Centroid (NC) algorithms to classify future features...The study found that KNN performed bet- ter than NC with 85% or greater accuracy in all test cases. The methodology was adopted for use under the...analysis feature of the model. 3.7.1 The No Surge Alternative. On the 10th of January 2007, President George W. Bush delivered a speech to the American
2012-03-01
WMA) and used both the k-Nearest Neighbor ( KNN ) and Nearest Centroid 27 (a) Coalition and Regional (b) Indigenous Figure 3. RAND Study Models[32:98,115...NC) algorithms to classify future features. The study found that KNN performed better than NC with 85% or greater accuracy in all test cases. The...the model. 4.2.1 No Surge. On the 10th of January 2007, President George W. Bush delivered a speech to the American Public outlining a new strategy in
Stapanian, Martin A.; Adams, Jean V.; Fennessy, M. Siobhan; Mack, John; Micacchion, Mick
2013-01-01
A persistent question among ecologists and environmental managers is whether constructed wetlands are structurally or functionally equivalent to naturally occurring wetlands. We examined 19 variables collected from 10 constructed and nine natural emergent wetlands in Ohio, USA. Our primary objective was to identify candidate indicators of wetland class (natural or constructed), based on measurements of soil properties and an index of vegetation integrity, that can be used to track the progress of constructed wetlands toward a natural state. The method of nearest shrunken centroids was used to find a subset of variables that would serve as the best classifiers of wetland class, and error rate was calculated using a five-fold cross-validation procedure. The shrunken differences of percent total organic carbon (% TOC) and percent dry weight of the soil exhibited the greatest distances from the overall centroid. Classification based on these two variables yielded a misclassification rate of 11% based on cross-validation. Our results indicate that % TOC and percent dry weight can be used as candidate indicators of the status of emergent, constructed wetlands in Ohio and for assessing the performance of mitigation. The method of nearest shrunken centroids has excellent potential for further applications in ecology.
Helweg, D A; Roitblat, H L; Nachtigall, P E; Hautus, M J
1996-01-01
We examined the ability of a bottlenose dolphin (Tursiops truncatus) to recognize aspect-dependent objects using echolocation. An aspect-dependent object such as a cube produces acoustically different echoes at different angles relative to the echolocation signal. The dolphin recognized the objects even though the objects were free to rotate and sway. A linear discriminant analysis and nearest centroid classifier could classify the objects using average amplitude, center frequency, and bandwidth of object echoes. The results show that dolphins can use varying acoustic properties to recognize constant objects and suggest that aspect-independent representations may be formed by combining information gleaned from multiple echoes.
Recognizing human activities using appearance metric feature and kinematics feature
NASA Astrophysics Data System (ADS)
Qian, Huimin; Zhou, Jun; Lu, Xinbiao; Wu, Xinye
2017-05-01
The problem of automatically recognizing human activities from videos through the fusion of the two most important cues, appearance metric feature and kinematics feature, is considered. And a system of two-dimensional (2-D) Poisson equations is introduced to extract the more discriminative appearance metric feature. Specifically, the moving human blobs are first detected out from the video by background subtraction technique to form a binary image sequence, from which the appearance feature designated as the motion accumulation image and the kinematics feature termed as centroid instantaneous velocity are extracted. Second, 2-D discrete Poisson equations are employed to reinterpret the motion accumulation image to produce a more differentiated Poisson silhouette image, from which the appearance feature vector is created through the dimension reduction technique called bidirectional 2-D principal component analysis, considering the balance between classification accuracy and time consumption. Finally, a cascaded classifier based on the nearest neighbor classifier and two directed acyclic graph support vector machine classifiers, integrated with the fusion of the appearance feature vector and centroid instantaneous velocity vector, is applied to recognize the human activities. Experimental results on the open databases and a homemade one confirm the recognition performance of the proposed algorithm.
Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.
2017-01-01
Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571
Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L
2017-02-14
Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.
Jaafar, Haryati; Ibrahim, Salwani; Ramli, Dzati Athiar
2015-01-01
Mobile implementation is a current trend in biometric design. This paper proposes a new approach to palm print recognition, in which smart phones are used to capture palm print images at a distance. A touchless system was developed because of public demand for privacy and sanitation. Robust hand tracking, image enhancement, and fast computation processing algorithms are required for effective touchless and mobile-based recognition. In this project, hand tracking and the region of interest (ROI) extraction method were discussed. A sliding neighborhood operation with local histogram equalization, followed by a local adaptive thresholding or LHEAT approach, was proposed in the image enhancement stage to manage low-quality palm print images. To accelerate the recognition process, a new classifier, improved fuzzy-based k nearest centroid neighbor (IFkNCN), was implemented. By removing outliers and reducing the amount of training data, this classifier exhibited faster computation. Our experimental results demonstrate that a touchless palm print system using LHEAT and IFkNCN achieves a promising recognition rate of 98.64%. PMID:26113861
Helweg, D A; Au, W W; Roitblat, H L; Nachtigall, P E
1996-04-01
The relationships between acoustic features of target echoes and the cognitive representations of the target formed by an echolocating dolphin will influence the ease with which the dolphin can recognize a target. A blindfolded Atlantic bottlenose dolphin (Tursiops truncatus) learned to match aspect-dependent three-dimensional targets (such as a cube) at haphazard orientations, although with some difficulty. This task may have been difficult because aspect-dependent targets produce different echoes at different orientations, which required the dolphin to have some capability for object constancy across changes in echo characteristics. Significant target-related differences in echo amplitude, rms bandwidth, and distributions of interhighlight intervals were observed among echoes collected when the dolphin was performing the task. Targets could be classified using a combination of energy flux density and rms bandwidth by a linear discriminant analysis and a nearest centroid classifier. Neither statistical model could classify targets without amplitude information, but the highest accuracy required spectral information as well. This suggests that the dolphin recognized the targets using a multidimensional representation containing amplitude and spectral information and that dolphins can form stable representations of targets regardless of orientation based on varying sensory properties.
Frog sound identification using extended k-nearest neighbor classifier
NASA Astrophysics Data System (ADS)
Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati
2017-09-01
Frog sound identification based on the vocalization becomes important for biological research and environmental monitoring. As a result, different types of feature extractions and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents a frog sound identification with Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and mutual sharing of neighborhood concepts, with the aims of improving the classification performance. It makes a prediction based on who are the nearest neighbors of the testing sample and who consider the testing sample as their nearest neighbors. In order to evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifier, k -Nearest Neighbor (KNN), Fuzzy k -Nearest Neighbor (FKNN) k - General Nearest Neighbor (KGNN)and Mutual k -Nearest Neighbor (MKNN) on the recorded sounds of 15 frog species obtained in Malaysia forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR) while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results have shown that the EKNCN classifier exhibits the best performance in terms of accuracy compared to the competing classifiers, KNN, FKNN, GKNN and MKNN for all cases.
Centre-based restricted nearest feature plane with angle classifier for face recognition
NASA Astrophysics Data System (ADS)
Tang, Linlin; Lu, Huifen; Zhao, Liang; Li, Zuohua
2017-10-01
An improved classifier based on the nearest feature plane (NFP), called the centre-based restricted nearest feature plane with the angle (RNFPA) classifier, is proposed for the face recognition problems here. The famous NFP uses the geometrical information of samples to increase the number of training samples, but it increases the computation complexity and it also has an inaccuracy problem coursed by the extended feature plane. To solve the above problems, RNFPA exploits a centre-based feature plane and utilizes a threshold of angle to restrict extended feature space. By choosing the appropriate angle threshold, RNFPA can improve the performance and decrease computation complexity. Experiments in the AT&T face database, AR face database and FERET face database are used to evaluate the proposed classifier. Compared with the original NFP classifier, the nearest feature line (NFL) classifier, the nearest neighbour (NN) classifier and some other improved NFP classifiers, the proposed one achieves competitive performance.
Topological side-chain classification of beta-turns: ideal motifs for peptidomimetic development.
Tran, Tran Trung; McKie, Jim; Meutermans, Wim D F; Bourne, Gregory T; Andrews, Peter R; Smythe, Mark L
2005-08-01
Beta-turns are important topological motifs for biological recognition of proteins and peptides. Organic molecules that sample the side chain positions of beta-turns have shown broad binding capacity to multiple different receptors, for example benzodiazepines. Beta-turns have traditionally been classified into various types based on the backbone dihedral angles (phi2, psi2, phi3 and psi3). Indeed, 57-68% of beta-turns are currently classified into 8 different backbone families (Type I, Type II, Type I', Type II', Type VIII, Type VIa1, Type VIa2 and Type VIb and Type IV which represents unclassified beta-turns). Although this classification of beta-turns has been useful, the resulting beta-turn types are not ideal for the design of beta-turn mimetics as they do not reflect topological features of the recognition elements, the side chains. To overcome this, we have extracted beta-turns from a data set of non-homologous and high-resolution protein crystal structures. The side chain positions, as defined by C(alpha)-C(beta) vectors, of these turns have been clustered using the kth nearest neighbor clustering and filtered nearest centroid sorting algorithms. Nine clusters were obtained that cluster 90% of the data, and the average intra-cluster RMSD of the four C(alpha)-C(beta) vectors is 0.36. The nine clusters therefore represent the topology of the side chain scaffold architecture of the vast majority of beta-turns. The mean structures of the nine clusters are useful for the development of beta-turn mimetics and as biological descriptors for focusing combinatorial chemistry towards biologically relevant topological space.
voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.
Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet
2017-01-01
RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
Zou, Meng; Liu, Zhaoqi; Zhang, Xiang-Sun; Wang, Yong
2015-10-15
In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size, but high-dimensional genomic profiles or clinical data. Therefore, sophisticated models and algorithms are in pressing need. In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motived by the connection between AUC score for classification accuracy evaluation and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem to a binary classification problem. Then an optimization model is formulated to directly maximize AUC and meanwhile minimize the number of selected features to construct a predictor in the nearest centroid classifier framework. NCC-AUC shows its great performance by validating both in genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meanings. Also NCC-AUC is more significant in separation of low and high risk cohorts than widely used Cox model (Cox proportional-hazards regression model) and L1-Cox model (L1 penalized in Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Further in an independent clinical data, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than Cox model and L1-Cox model in grouping patients into high and low risk categories. In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panel from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis. NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC. ywang@amss.ac.cn Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Error minimizing algorithms for nearest eighbor classifiers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Porter, Reid B; Hush, Don; Zimmer, G. Beate
2011-01-03
Stack Filters define a large class of discrete nonlinear filter first introd uced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. Wemore » use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.« less
Berke, Ethan M; Shi, Xun
2009-04-29
Travel time is an important metric of geographic access to health care. We compared strategies of estimating travel times when only subject ZIP code data were available. Using simulated data from New Hampshire and Arizona, we estimated travel times to nearest cancer centers by using: 1) geometric centroid of ZIP code polygons as origins, 2) population centroids as origin, 3) service area rings around each cancer center, assigning subjects to rings by assuming they are evenly distributed within their ZIP code, 4) service area rings around each center, assuming the subjects follow the population distribution within the ZIP code. We used travel times based on street addresses as true values to validate estimates. Population-based methods have smaller errors than geometry-based methods. Within categories (geometry or population), centroid and service area methods have similar errors. Errors are smaller in urban areas than in rural areas. Population-based methods are superior to the geometry-based methods, with the population centroid method appearing to be the best choice for estimating travel time. Estimates in rural areas are less reliable.
Angles-centroids fitting calibration and the centroid algorithm applied to reverse Hartmann test
NASA Astrophysics Data System (ADS)
Zhao, Zhu; Hui, Mei; Xia, Zhengzheng; Dong, Liquan; Liu, Ming; Liu, Xiaohua; Kong, Lingqin; Zhao, Yuejin
2017-02-01
In this paper, we develop an angles-centroids fitting (ACF) system and the centroid algorithm to calibrate the reverse Hartmann test (RHT) with sufficient precision. The essence of ACF calibration is to establish the relationship between ray angles and detector coordinates. Centroids computation is used to find correspondences between the rays of datum marks and detector pixels. Here, the point spread function of RHT is classified as circle of confusion (CoC), and the fitting of a CoC spot with 2D Gaussian profile to identify the centroid forms the basis of the centroid algorithm. Theoretical and experimental results of centroids computation demonstrate that the Gaussian fitting method has a less centroid shift or the shift grows at a slower pace when the quality of the image is reduced. In ACF tests, the optical instrumental alignments reach an overall accuracy of 0.1 pixel with the application of laser spot centroids tracking program. Locating the crystal at different positions, the feasibility and accuracy of ACF calibration are further validated to 10-6-10-4 rad root-mean-square error of the calibrations differences.
Wu, Baolin
2006-02-15
Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in the microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and prove to be useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the (1) penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discussed the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the (1) penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
Iris recognition using image moments and k-means algorithm.
Khan, Yaser Daanial; Khan, Sher Afzal; Ahmad, Farooq; Islam, Saeed
2014-01-01
This paper presents a biometric technique for identification of a person using the iris image. The iris is first segmented from the acquired image of an eye using an edge detection algorithm. The disk shaped area of the iris is transformed into a rectangular form. Described moments are extracted from the grayscale image which yields a feature vector containing scale, rotation, and translation invariant moments. Images are clustered using the k-means algorithm and centroids for each cluster are computed. An arbitrary image is assumed to belong to the cluster whose centroid is the nearest to the feature vector in terms of Euclidean distance computed. The described model exhibits an accuracy of 98.5%.
Iris Recognition Using Image Moments and k-Means Algorithm
Khan, Yaser Daanial; Khan, Sher Afzal; Ahmad, Farooq; Islam, Saeed
2014-01-01
This paper presents a biometric technique for identification of a person using the iris image. The iris is first segmented from the acquired image of an eye using an edge detection algorithm. The disk shaped area of the iris is transformed into a rectangular form. Described moments are extracted from the grayscale image which yields a feature vector containing scale, rotation, and translation invariant moments. Images are clustered using the k-means algorithm and centroids for each cluster are computed. An arbitrary image is assumed to belong to the cluster whose centroid is the nearest to the feature vector in terms of Euclidean distance computed. The described model exhibits an accuracy of 98.5%. PMID:24977221
Star centroiding error compensation for intensified star sensors.
Jiang, Jie; Xiong, Kun; Yu, Wenbo; Yan, Jinyun; Zhang, Guangjun
2016-12-26
A star sensor provides high-precision attitude information by capturing a stellar image; however, the traditional star sensor has poor dynamic performance, which is attributed to its low sensitivity. Regarding the intensified star sensor, the image intensifier is utilized to improve the sensitivity, thereby further improving the dynamic performance of the star sensor. However, the introduction of image intensifier results in star centroiding accuracy decrease, further influencing the attitude measurement precision of the star sensor. A star centroiding error compensation method for intensified star sensors is proposed in this paper to reduce the influences. First, the imaging model of the intensified detector, which includes the deformation parameter of the optical fiber panel, is established based on the orthographic projection through the analysis of errors introduced by the image intensifier. Thereafter, the position errors at the target points based on the model are obtained by using the Levenberg-Marquardt (LM) optimization method. Last, the nearest trigonometric interpolation method is presented to compensate for the arbitrary centroiding error of the image plane. Laboratory calibration result and night sky experiment result show that the compensation method effectively eliminates the error introduced by the image intensifier, thus remarkably improving the precision of the intensified star sensors.
Bias in error estimation when using cross-validation for model selection.
Varma, Sudhir; Simon, Richard
2006-02-23
Cross-validation (CV) is an effective method for estimating the prediction error of a classifier. Some recent articles have proposed methods for optimizing classifiers by choosing classifier parameter values that minimize the CV error estimate. We have evaluated the validity of using the CV error estimate of the optimized classifier as an estimate of the true error expected on independent data. We used CV to optimize the classification parameters for two kinds of classifiers; Shrunken Centroids and Support Vector Machines (SVM). Random training datasets were created, with no difference in the distribution of the features between the two classes. Using these "null" datasets, we selected classifier parameter values that minimized the CV error estimate. 10-fold CV was used for Shrunken Centroids while Leave-One-Out-CV (LOOCV) was used for the SVM. Independent test data was created to estimate the true error. With "null" and "non null" (with differential expression between the classes) data, we also tested a nested CV procedure, where an inner CV loop is used to perform the tuning of the parameters while an outer CV is used to compute an estimate of the error. The CV error estimate for the classifier with the optimal parameters was found to be a substantially biased estimate of the true error that the classifier would incur on independent data. Even though there is no real difference between the two classes for the "null" datasets, the CV error estimate for the Shrunken Centroid with the optimal parameters was less than 30% on 18.5% of simulated training data-sets. For SVM with optimal parameters the estimated error rate was less than 30% on 38% of "null" data-sets. Performance of the optimized classifiers on the independent test set was no better than chance. The nested CV procedure reduces the bias considerably and gives an estimate of the error that is very close to that obtained on the independent testing set for both Shrunken Centroids and SVM classifiers for "null" and "non-null" data distributions. We show that using CV to compute an error estimate for a classifier that has itself been tuned using CV gives a significantly biased estimate of the true error. Proper use of CV for estimating true error of a classifier developed using a well defined algorithm requires that all steps of the algorithm, including classifier parameter tuning, be repeated in each CV loop. A nested CV procedure provides an almost unbiased estimate of the true error.
Gonçalves, B; Marcelino, R; Torres-Ronda, L; Torrents, C; Sampaio, J
2016-07-01
Optimizing collective behaviour helps to increase performance in mutual tasks. In team sports settings, the small-sided games (SSG) have been used as key context tools to stress out the players' awareness about their in-game required behaviours. Research has mostly described these behaviours when confronting teams have the same number of players, disregarding the frequent situations of low and high inequality. This study compared the players' positioning dynamics when manipulating the number of opponents and teammates during professional and amateur football SSG. The participants played 4v3, 4v5 and 4v7 games, where one team was confronted with low-superiority, low- and high-inferiority situations, and their opponents with low-, medium- and high-cooperation situations. Positional data were used to calculate effective playing space and distances from each player to team centroid, opponent team centroid and nearest opponent. Outcomes suggested that increasing the number of opponents in professional teams resulted in moderate/large decrease in approximate entropy (ApEn) values to both distance to team and opponent team centroid (i.e., the variables present higher regularity/predictability pattern). In low-cooperation game scenarios, the ApEn in amateurs' tactical variables presented a moderate/large increase. The professional teams presented an increase in the distance to nearest opponent with the increase of the cooperation level. Increasing the number of opponents was effective to overemphasise the need to use local information in the positioning decision-making process from professionals. Conversely, amateur still rely on external informational feedback. Increasing the cooperation promoted more regularity in spatial organisation in amateurs and emphasise their players' local perceptions.
NASA Astrophysics Data System (ADS)
Sirait, Kamson; Tulus; Budhiarti Nababan, Erna
2017-12-01
Clustering methods that have high accuracy and time efficiency are necessary for the filtering process. One method that has been known and applied in clustering is K-Means Clustering. In its application, the determination of the begining value of the cluster center greatly affects the results of the K-Means algorithm. This research discusses the results of K-Means Clustering with starting centroid determination with a random and KD-Tree method. The initial determination of random centroid on the data set of 1000 student academic data to classify the potentially dropout has a sse value of 952972 for the quality variable and 232.48 for the GPA, whereas the initial centroid determination by KD-Tree has a sse value of 504302 for the quality variable and 214,37 for the GPA variable. The smaller sse values indicate that the result of K-Means Clustering with initial KD-Tree centroid selection have better accuracy than K-Means Clustering method with random initial centorid selection.
Analysis of k-means clustering approach on the breast cancer Wisconsin dataset.
Dubey, Ashutosh Kumar; Gupta, Umesh; Jain, Sonal
2016-11-01
Breast cancer is one of the most common cancers found worldwide and most frequently found in women. An early detection of breast cancer provides the possibility of its cure; therefore, a large number of studies are currently going on to identify methods that can detect breast cancer in its early stages. This study was aimed to find the effects of k-means clustering algorithm with different computation measures like centroid, distance, split method, epoch, attribute, and iteration and to carefully consider and identify the combination of measures that has potential of highly accurate clustering accuracy. K-means algorithm was used to evaluate the impact of clustering using centroid initialization, distance measures, and split methods. The experiments were performed using breast cancer Wisconsin (BCW) diagnostic dataset. Foggy and random centroids were used for the centroid initialization. In foggy centroid, based on random values, the first centroid was calculated. For random centroid, the initial centroid was considered as (0, 0). The results were obtained by employing k-means algorithm and are discussed with different cases considering variable parameters. The calculations were based on the centroid (foggy/random), distance (Euclidean/Manhattan/Pearson), split (simple/variance), threshold (constant epoch/same centroid), attribute (2-9), and iteration (4-10). Approximately, 92 % average positive prediction accuracy was obtained with this approach. Better results were found for the same centroid and the highest variance. The results achieved using Euclidean and Manhattan were better than the Pearson correlation. The findings of this work provided extensive understanding of the computational parameters that can be used with k-means. The results indicated that k-means has a potential to classify BCW dataset.
Hanigan, Ivan; Hall, Gillian; Dear, Keith B G
2006-09-13
To explain the possible effects of exposure to weather conditions on population health outcomes, weather data need to be calculated at a level in space and time that is appropriate for the health data. There are various ways of estimating exposure values from raw data collected at weather stations but the rationale for using one technique rather than another; the significance of the difference in the values obtained; and the effect these have on a research question are factors often not explicitly considered. In this study we compare different techniques for allocating weather data observations to small geographical areas and different options for weighting averages of these observations when calculating estimates of daily precipitation and temperature for Australian Postal Areas. Options that weight observations based on distance from population centroids and population size are more computationally intensive but give estimates that conceptually are more closely related to the experience of the population. Options based on values derived from sites internal to postal areas, or from nearest neighbour sites--that is, using proximity polygons around weather stations intersected with postal areas--tended to include fewer stations' observations in their estimates, and missing values were common. Options based on observations from stations within 50 kilometres radius of centroids and weighting of data by distance from centroids gave more complete estimates. Using the geographic centroid of the postal area gave estimates that differed slightly from the population weighted centroids and the population weighted average of sub-unit estimates. To calculate daily weather exposure values for analysis of health outcome data for small areas, the use of data from weather stations internal to the area only, or from neighbouring weather stations (allocated by the use of proximity polygons), is too limited. The most appropriate method conceptually is the use of weather data from sites within 50 kilometres radius of the area weighted to population centres, but a simpler acceptable option is to weight to the geographic centroid.
Examining change detection approaches for tropical mangrove monitoring
Myint, Soe W.; Franklin, Janet; Buenemann, Michaela; Kim, Won; Giri, Chandra
2014-01-01
This study evaluated the effectiveness of different band combinations and classifiers (unsupervised, supervised, object-oriented nearest neighbor, and object-oriented decision rule) for quantifying mangrove forest change using multitemporal Landsat data. A discriminant analysis using spectra of different vegetation types determined that bands 2 (0.52 to 0.6 μm), 5 (1.55 to 1.75 μm), and 7 (2.08 to 2.35 μm) were the most effective bands for differentiating mangrove forests from surrounding land cover types. A ranking of thirty-six change maps, produced by comparing the classification accuracy of twelve change detection approaches, was used. The object-based Nearest Neighbor classifier produced the highest mean overall accuracy (84 percent) regardless of band combinations. The automated decision rule-based approach (mean overall accuracy of 88 percent) as well as a composite of bands 2, 5, and 7 used with the unsupervised classifier and the same composite or all band difference with the object-oriented Nearest Neighbor classifier were the most effective approaches.
Bashar, Md Khayrul; Komatsu, Koji; Fujimori, Toshihiko; Kobayashi, Tetsuya J
2012-01-01
Accurate identification of cell nuclei and their tracking using three dimensional (3D) microscopic images is a demanding task in many biological studies. Manual identification of nuclei centroids from images is an error-prone task, sometimes impossible to accomplish due to low contrast and the presence of noise. Nonetheless, only a few methods are available for 3D bioimaging applications, which sharply contrast with 2D analysis, where many methods already exist. In addition, most methods essentially adopt segmentation for which a reliable solution is still unknown, especially for 3D bio-images having juxtaposed cells. In this work, we propose a new method that can directly extract nuclei centroids from fluorescence microscopy images. This method involves three steps: (i) Pre-processing, (ii) Local enhancement, and (iii) Centroid extraction. The first step includes two variations: first variation (Variant-1) uses the whole 3D pre-processed image, whereas the second one (Variant-2) modifies the preprocessed image to the candidate regions or the candidate hybrid image for further processing. At the second step, a multiscale cube filtering is employed in order to locally enhance the pre-processed image. Centroid extraction in the third step consists of three stages. In Stage-1, we compute a local characteristic ratio at every voxel and extract local maxima regions as candidate centroids using a ratio threshold. Stage-2 processing removes spurious centroids from Stage-1 results by analyzing shapes of intensity profiles from the enhanced image. An iterative procedure based on the nearest neighborhood principle is then proposed to combine if there are fragmented nuclei. Both qualitative and quantitative analyses on a set of 100 images of 3D mouse embryo are performed. Investigations reveal a promising achievement of the technique presented in terms of average sensitivity and precision (i.e., 88.04% and 91.30% for Variant-1; 86.19% and 95.00% for Variant-2), when compared with an existing method (86.06% and 90.11%), originally developed for analyzing C. elegans images.
Li, Qingbo; Hao, Can; Kang, Xue; Zhang, Jialin; Sun, Xuejun; Wang, Wenbo; Zeng, Haishan
2017-11-27
Combining Fourier transform infrared spectroscopy (FTIR) with endoscopy, it is expected that noninvasive, rapid detection of colorectal cancer can be performed in vivo in the future. In this study, Fourier transform infrared spectra were collected from 88 endoscopic biopsy colorectal tissue samples (41 colitis and 47 cancers). A new method, viz., entropy weight local-hyperplane k-nearest-neighbor (EWHK), which is an improved version of K-local hyperplane distance nearest-neighbor (HKNN), is proposed for tissue classification. In order to avoid limiting high dimensions and small values of the nearest neighbor, the new EWHK method calculates feature weights based on information entropy. The average results of the random classification showed that the EWHK classifier for differentiating cancer from colitis samples produced a sensitivity of 81.38% and a specificity of 92.69%.
Weighted Parzen Windows for Pattern Classification
1994-05-01
Nearest-Neighbor Rule The k-Nearest-Neighbor ( kNN ) technique is nonparametric, assuming nothing about the distribution of the data. Stated succinctly...probabilities P(wj I x) from samples." Raudys and Jain [20:255] advance this interpretation by pointing out that the kNN technique can be viewed as the...34Parzen window classifier with a hyper- rectangular window function." As with the Parzen-window technique, the kNN classifier is more accurate as the
1991-12-01
user using the ’: knn ’ option in the do-scenario command line). An instance of the K-Nearest Neighbor object is first created and initialized before...Navigation Computer HF High Frequency ILS Instrument Landing System KNN K - Nearest Neighbor LRU Line Replaceable Unit MC Mission Computer MTCA...approaches have been investigated here, K-nearest Neighbors ( KNN ) and neural networks (NN). Both approaches require that previously classified examples of
SU-F-J-109: Generate Synthetic CT From Cone Beam CT for CBCT-Based Dose Calculation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, H; Barbee, D; Wang, W
Purpose: The use of CBCT for dose calculation is limited by its HU inaccuracy from increased scatter. This study presents a method to generate synthetic CT images from CBCT data by a probabilistic classification that may be robust to CBCT noise. The feasibility of using the synthetic CT for dose calculation is evaluated in IMRT for unilateral H&N cancer. Methods: In the training phase, a fuzzy c-means classification was performed on HU vectors (CBCT, CT) of planning CT and registered day-1 CBCT image pair. Using the resulting centroid CBCT and CT values for five classified “tissue” types, a synthetic CTmore » for a daily CBCT was created by classifying each CBCT voxel to obtain its probability belonging to each tissue class, then assigning a CT HU with a probability-weighted summation of the classes’ CT centroids. Two synthetic CTs from a CBCT were generated: s-CT using the centroids from classification of individual patient CBCT/CT data; s2-CT using the same centroids for all patients to investigate the applicability of group-based centroids. IMRT dose calculations for five patients were performed on the synthetic CTs and compared with CT-planning doses by dose-volume statistics. Results: DVH curves of PTVs and critical organs calculated on s-CT and s2-CT agree with those from planning-CT within 3%, while doses calculated with heterogeneity off or on raw CBCT show DVH differences up to 15%. The differences in PTV D95% and spinal cord max are 0.6±0.6% and 0.6±0.3% for s-CT, and 1.6±1.7% and 1.9±1.7% for s2-CT. Gamma analysis (2%/2mm) shows 97.5±1.6% and 97.6±1.6% pass rates for using s-CTs and s2-CTs compared with CT-based doses, respectively. Conclusion: CBCT-synthesized CTs using individual or group-based centroids resulted in dose calculations that are comparable to CT-planning dose for unilateral H&N cancer. The method may provide a tool for accurate dose calculation based on daily CBCT.« less
AVNM: A Voting based Novel Mathematical Rule for Image Classification.
Vidyarthi, Ankit; Mittal, Namita
2016-12-01
In machine learning, the accuracy of the system depends upon classification result. Classification accuracy plays an imperative role in various domains. Non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Besides its easiness, simplicity and effectiveness characteristics, the main problem associated with KNN classifier is the selection of a number of nearest neighbors i.e. "k" for computation. At present, it is hard to find the optimal value of "k" using any statistical algorithm, which gives perfect accuracy in terms of low misclassification error rate. Motivated by the prescribed problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature like KNN. AVNM uses the weighted voting mechanism with sample space reduction to learn and examine the predicted class label for unidentified sample. AVNM is free from any initial selection of predefined variable and neighbor selection as found in KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments are made on 10 standard datasets taken from UCI database and one manually created dataset. The experimental result shows that the proposed AVNM rule outperforms the KNN classifier and its variants. Experimentation results based on confusion matrix accuracy parameter proves higher accuracy value with AVNM rule. The proposed AVNM rule is based on sample space reduction mechanism for identification of an optimal number of nearest neighbor selections. AVNM results in better classification accuracy and minimum error rate as compared with the state-of-art algorithm, KNN, and its variants. The proposed rule automates the selection of nearest neighbor selection and improves classification rate for UCI dataset and manually created dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Deep neural network-based domain adaptation for classification of remote sensing images
NASA Astrophysics Data System (ADS)
Ma, Li; Song, Jiazhen
2017-10-01
We investigate the effectiveness of deep neural network for cross-domain classification of remote sensing images in this paper. In the network, class centroid alignment is utilized as a domain adaptation strategy, making the network able to transfer knowledge from the source domain to target domain on a per-class basis. Since predicted labels of target data should be used to estimate the centroid of each class, we use overall centroid alignment as a coarse domain adaptation method to improve the estimation accuracy. In addition, rectified linear unit is used as the activation function to produce sparse features, which may improve the separation capability. The proposed network can provide both aligned features and an adaptive classifier, as well as obtain label-free classification of target domain data. The experimental results using Hyperion, NCALM, and WorldView-2 remote sensing images demonstrated the effectiveness of the proposed approach.
An incremental knowledge assimilation system (IKAS) for mine detection
NASA Astrophysics Data System (ADS)
Porway, Jake; Raju, Chaitanya; Varadarajan, Karthik Mahesh; Nguyen, Hieu; Yadegar, Joseph
2010-04-01
In this paper we present an adaptive incremental learning system for underwater mine detection and classification that utilizes statistical models of seabed texture and an adaptive nearest-neighbor classifier to identify varied underwater targets in many different environments. The first stage of processing uses our Background Adaptive ANomaly detector (BAAN), which identifies statistically likely target regions using Gabor filter responses over the image. Using this information, BAAN classifies the background type and updates its detection using background-specific parameters. To perform classification, a Fully Adaptive Nearest Neighbor (FAAN) determines the best label for each detection. FAAN uses an extremely fast version of Nearest Neighbor to find the most likely label for the target. The classifier perpetually assimilates new and relevant information into its existing knowledge database in an incremental fashion, allowing improved classification accuracy and capturing concept drift in the target classes. Experiments show that the system achieves >90% classification accuracy on underwater mine detection tasks performed on synthesized datasets provided by the Office of Naval Research. We have also demonstrated that the system can incrementally improve its detection accuracy by constantly learning from new samples.
Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.
Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong
2018-05-24
This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
Fuzzy-Rough Nearest Neighbour Classification
NASA Astrophysics Data System (ADS)
Jensen, Richard; Cornelis, Chris
A new fuzzy-rough nearest neighbour (FRNN) classification algorithm is presented in this paper, as an alternative to Sarkar's fuzzy-rough ownership function (FRNN-O) approach. By contrast to the latter, our method uses the nearest neighbours to construct lower and upper approximations of decision classes, and classifies test instances based on their membership to these approximations. In the experimental analysis, we evaluate our approach with both classical fuzzy-rough approximations (based on an implicator and a t-norm), as well as with the recently introduced vaguely quantified rough sets. Preliminary results are very good, and in general FRNN outperforms FRNN-O, as well as the traditional fuzzy nearest neighbour (FNN) algorithm.
AlShawaqfeh, M K; Wajid, B; Minamoto, Y; Markel, M; Lidbury, J A; Steiner, J M; Serpedin, E; Suchodolski, J S
2017-11-01
Recent studies have identified various bacterial groups that are altered in dogs with chronic inflammatory enteropathies (CE) compared to healthy dogs. The study aim was to use quantitative PCR (qPCR) assays to confirm these findings in a larger number of dogs, and to build a mathematical algorithm to report these microbiota changes as a dysbiosis index (DI). Fecal DNA from 95 healthy dogs and 106 dogs with histologically confirmed CE was analyzed. Samples were grouped into a training set and a validation set. Various mathematical models and combination of qPCR assays were evaluated to find a model with highest discriminatory power. The final qPCR panel consisted of eight bacterial groups: total bacteria, Faecalibacterium, Turicibacter, Escherichia coli, Streptococcus, Blautia, Fusobacterium and Clostridium hiranonis. The qPCR-based DI was built based on the nearest centroid classifier, and reports the degree of dysbiosis in a single numerical value that measures the closeness in the l2 - norm of the test sample to the mean prototype of each class. A negative DI indicates normobiosis, whereas a positive DI indicates dysbiosis. For a threshold of 0, the DI based on the combined dataset achieved 74% sensitivity and 95% specificity to separate healthy and CE dogs. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Spatial pattern recognition of seismic events in South West Colombia
NASA Astrophysics Data System (ADS)
Benítez, Hernán D.; Flórez, Juan F.; Duque, Diana P.; Benavides, Alberto; Lucía Baquero, Olga; Quintero, Jiber
2013-09-01
Recognition of seismogenic zones in geographical regions supports seismic hazard studies. This recognition is usually based on visual, qualitative and subjective analysis of data. Spatial pattern recognition provides a well founded means to obtain relevant information from large amounts of data. The purpose of this work is to identify and classify spatial patterns in instrumental data of the South West Colombian seismic database. In this research, clustering tendency analysis validates whether seismic database possesses a clustering structure. A non-supervised fuzzy clustering algorithm creates groups of seismic events. Given the sensitivity of fuzzy clustering algorithms to centroid initial positions, we proposed a methodology to initialize centroids that generates stable partitions with respect to centroid initialization. As a result of this work, a public software tool provides the user with the routines developed for clustering methodology. The analysis of the seismogenic zones obtained reveals meaningful spatial patterns in South-West Colombia. The clustering analysis provides a quantitative location and dispersion of seismogenic zones that facilitates seismological interpretations of seismic activities in South West Colombia.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mason, J.
CCHDT constructs and classifies various arrangements of hard disks of a single radius places on the unit square with periodic boundary conditions. Specifially, a given configuration is evolved to the nearest critical point on a smoothed hard disk energy fuction, and is classified by the adjacency matrix of the canonically labelled contact graph.
Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use
Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil
2013-01-01
The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and evaluate classifications measures exploiting characteristic signatures of such histograms. Two histograms matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648
NASA Astrophysics Data System (ADS)
Hu, Yifan; Han, Hao; Zhu, Wei; Li, Lihong; Pickhardt, Perry J.; Liang, Zhengrong
2016-03-01
Feature classification plays an important role in differentiation or computer-aided diagnosis (CADx) of suspicious lesions. As a widely used ensemble learning algorithm for classification, random forest (RF) has a distinguished performance for CADx. Our recent study has shown that the location index (LI), which is derived from the well-known kNN (k nearest neighbor) and wkNN (weighted k nearest neighbor) classifier [1], has also a distinguished role in the classification for CADx. Therefore, in this paper, based on the property that the LI will achieve a very high accuracy, we design an algorithm to integrate the LI into RF for improved or higher value of AUC (area under the curve of receiver operating characteristics -- ROC). Experiments were performed by the use of a database of 153 lesions (polyps), including 116 neoplastic lesions and 37 hyperplastic lesions, with comparison to the existing classifiers of RF and wkNN, respectively. A noticeable gain by the proposed integrated classifier was quantified by the AUC measure.
NASA Astrophysics Data System (ADS)
Chernick, Julian A.; Perlovsky, Leonid I.; Tye, David M.
1994-06-01
This paper describes applications of maximum likelihood adaptive neural system (MLANS) to the characterization of clutter in IR images and to the identification of targets. The characterization of image clutter is needed to improve target detection and to enhance the ability to compare performance of different algorithms using diverse imagery data. Enhanced unambiguous IFF is important for fratricide reduction while automatic cueing and targeting is becoming an ever increasing part of operations. We utilized MLANS which is a parametric neural network that combines optimal statistical techniques with a model-based approach. This paper shows that MLANS outperforms classical classifiers, the quadratic classifier and the nearest neighbor classifier, because on the one hand it is not limited to the usual Gaussian distribution assumption and can adapt in real time to the image clutter distribution; on the other hand MLANS learns from fewer samples and is more robust than the nearest neighbor classifiers. Future research will address uncooperative IFF using fused IR and MMW data.
Activity Recognition in Egocentric video using SVM, kNN and Combined SVMkNN Classifiers
NASA Astrophysics Data System (ADS)
Sanal Kumar, K. P.; Bhavani, R., Dr.
2017-08-01
Egocentric vision is a unique perspective in computer vision which is human centric. The recognition of egocentric actions is a challenging task which helps in assisting elderly people, disabled patients and so on. In this work, life logging activity videos are taken as input. There are 2 categories, first one is the top level and second one is second level. Here, the recognition is done using the features like Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH) and Trajectory. The features are fused together and it acts as a single feature. The extracted features are reduced using Principal Component Analysis (PCA). The features that are reduced are provided as input to the classifiers like Support Vector Machine (SVM), k nearest neighbor (kNN) and combined Support Vector Machine (SVM) and k Nearest Neighbor (kNN) (combined SVMkNN). These classifiers are evaluated and the combined SVMkNN provided better results than other classifiers in the literature.
Adaptive fuzzy leader clustering of complex data sets in pattern recognition
NASA Technical Reports Server (NTRS)
Newton, Scott C.; Pemmaraju, Surya; Mitra, Sunanda
1992-01-01
A modular, unsupervised neural network architecture for clustering and classification of complex data sets is presented. The adaptive fuzzy leader clustering (AFLC) architecture is a hybrid neural-fuzzy system that learns on-line in a stable and efficient manner. The initial classification is performed in two stages: a simple competitive stage and a distance metric comparison stage. The cluster prototypes are then incrementally updated by relocating the centroid positions from fuzzy C-means system equations for the centroids and the membership values. The AFLC algorithm is applied to the Anderson Iris data and laser-luminescent fingerprint image data. It is concluded that the AFLC algorithm successfully classifies features extracted from real data, discrete or continuous.
Automated quasi-3D spine curvature quantification and classification
NASA Astrophysics Data System (ADS)
Khilari, Rupal; Puchin, Juris; Okada, Kazunori
2018-02-01
Scoliosis is a highly prevalent spine deformity that has traditionally been diagnosed through measurement of the Cobb angle on radiographs. More recent technology such as the commercial EOS imaging system, although more accurate, also require manual intervention for selecting the extremes of the vertebrae forming the Cobb angle. This results in a high degree of inter and intra observer error in determining the extent of spine deformity. Our primary focus is to eliminate the need for manual intervention by robustly quantifying the curvature of the spine in three dimensions, making it consistent across multiple observers. Given the vertebrae centroids, the proposed Vertebrae Sequence Angle (VSA) estimation and segmentation algorithm finds the largest angle between consecutive pairs of centroids within multiple inflection points on the curve. To exploit existing clinical diagnostic standards, the algorithm uses a quasi-3-dimensional approach considering the curvature in the coronal and sagittal projection planes of the spine. Experiments were performed with manuallyannotated ground-truth classification of publicly available, centroid-annotated CT spine datasets. This was compared with the results obtained from manual Cobb and Centroid angle estimation methods. Using the VSA, we then automatically classify the occurrence and the severity of spine curvature based on Lenke's classification for idiopathic scoliosis. We observe that the results appear promising with a scoliotic angle lying within +/- 9° of the Cobb and Centroid angle, and vertebrae positions differing by at the most one position. Our system also resulted in perfect classification of scoliotic from healthy spines with our dataset with six cases.
Spatial-Temporal dynamics of Newtonian and viscoelastic turbulence
NASA Astrophysics Data System (ADS)
Wang, Sung-Ning; Graham, Michael
2015-11-01
Introducing a trace amount of polymer into liquid turbulent flows can result in substantial reduction of friction drag. This phenomenon has been widely used in fluid transport, such as the Alaska crude oil pipeline. However, the mechanism is not well understood. We conduct direct numerical simulations of Newtonian and viscoelastic turbulence in large domains, in which the flow shows different characteristics in different regions. In some areas the drag is low and vortex motions are quiescent, while in other areas the drag is higher and the motions are more active. To identify these regions, we apply a statistical method, k-means clustering, which partitions the observations into k clusters by assigning each observation to its nearest centroid. The resulting partition maximizes the between-cluster variance. In the simulations, the observations are the instantaneous wall shear rate. Regions with different levels of drag are automatically identified by the partitioning algorithm. We find that the velocity profiles of the centroids exhibit characteristics similar to the individual coherent structures observed in minimal domain simulations. In addition, as viscoelasticity increases, polymer stretch becomes strongly correlated with wall shear stress. This work was supported by NSF grant CBET-1510291.
Thanh Noi, Phan; Kappas, Martin
2017-01-01
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km2 within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909
Thanh Noi, Phan; Kappas, Martin
2017-12-22
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets.
NASA Technical Reports Server (NTRS)
Wharton, S. W.
1980-01-01
An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. The algorithm interfaces the rapid numerical processing capacity of a computer with the human ability to integrate qualitative information. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters and the analyst, who evaluate and elect to modify the cluster structure. Clusters can be deleted or lumped pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The ICAP was implemented in APL (A Programming Language), an interactive computer language. The flexibility of the algorithm was evaluated using data from different LANDSAT scenes to simulate two situations: one in which the analyst is assumed to have no prior knowledge about the data and wishes to have the clusters formed more or less automatically; and the other in which the analyst is assumed to have some knowledge about the data structure and wishes to use that information to closely supervise the clustering process. For comparison, an existing clustering method was also applied to the two data sets.
NASA Astrophysics Data System (ADS)
Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.
2016-03-01
Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from abnormal behavior that occurs in the case of a novelty. This will be evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determine if our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest point. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >=4 abnormal behaviour of the classifier, having a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings. The latter being predominantly vessels, cerebral falx, brain mask errors, choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.
ERIC Educational Resources Information Center
Pankau, Brian L.
2009-01-01
This empirical study evaluates the document category prediction effectiveness of Naive Bayes (NB) and K-Nearest Neighbor (KNN) classifier treatments built from different feature selection and machine learning settings and trained and tested against textual corpora of 2300 Gang-Of-Four (GOF) design pattern documents. Analysis of the experiment's…
NASA Astrophysics Data System (ADS)
Lestari, D.; Raharjo, D.; Bustamam, A.; Abdillah, B.; Widhianto, W.
2017-07-01
Dengue virus consists of 10 different constituent proteins and are classified into 4 major serotypes (DEN 1 - DEN 4). This study was designed to perform clustering against 30 protein sequences of dengue virus taken from Virus Pathogen Database and Analysis Resource (VIPR) using Regularized Markov Clustering (R-MCL) algorithm and then we analyze the result. By using Python program 3.4, R-MCL algorithm produces 8 clusters with more than one centroid in several clusters. The number of centroid shows the density level of interaction. Protein interactions that are connected in a tissue, form a complex protein that serves as a specific biological process unit. The analysis of result shows the R-MCL clustering produces clusters of dengue virus family based on the similarity role of their constituent protein, regardless of serotypes.
Analysis of miRNA expression profile based on SVM algorithm
NASA Astrophysics Data System (ADS)
Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian
2018-05-01
Based on mirna expression spectrum data set, a new data mining algorithm - tSVM - KNN (t statistic with support vector machine - k nearest neighbor) is proposed. the idea of the algorithm is: firstly, the feature selection of the data set is carried out by the unified measurement method; Secondly, SVM - KNN algorithm, which combines support vector machine (SVM) and k - nearest neighbor (k - nearest neighbor) is used as classifier. Simulation results show that SVM - KNN algorithm has better classification ability than SVM and KNN alone. Tsvm - KNN algorithm only needs 5 mirnas to obtain 96.08 % classification accuracy in terms of the number of mirna " tags" and recognition accuracy. compared with similar algorithms, tsvm - KNN algorithm has obvious advantages.
Minimum Expected Risk Estimation for Near-neighbor Classification
2006-04-01
We consider the problems of class probability estimation and classification when using near-neighbor classifiers, such as k-nearest neighbors ( kNN ...estimate for weighted kNN classifiers with different prior information, for a broad class of risk functions. Theory and simulations show how significant...the difference is compared to the standard maximum likelihood weighted kNN estimates. Comparisons are made with uniform weights, symmetric weights
A Proposed Methodology to Classify Frontier Capital Markets
2011-07-31
but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project involves basic...machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ), ensemble...Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F. Horizontal
A Proposed Methodology to Classify Frontier Capital Markets
2011-07-31
out of charity, but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project...identification, and machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ...Support Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F
Derrac, Joaquín; Triguero, Isaac; Garcia, Salvador; Herrera, Francisco
2012-10-01
Cooperative coevolution is a successful trend of evolutionary computation which allows us to define partitions of the domain of a given problem, or to integrate several related techniques into one, by the use of evolutionary algorithms. It is possible to apply it to the development of advanced classification methods, which integrate several machine learning techniques into a single proposal. A novel approach integrating instance selection, instance weighting, and feature weighting into the framework of a coevolutionary model is presented in this paper. We compare it with a wide range of evolutionary and nonevolutionary related methods, in order to show the benefits of the employment of coevolution to apply the techniques considered simultaneously. The results obtained, contrasted through nonparametric statistical tests, show that our proposal outperforms other methods in the comparison, thus becoming a suitable tool in the task of enhancing the nearest neighbor classifier.
Predicting Flavonoid UGT Regioselectivity
Jackson, Rhydon; Knisley, Debra; McIntosh, Cecilia; Pfeiffer, Phillip
2011-01-01
Machine learning was applied to a challenging and biologically significant protein classification problem: the prediction of avonoid UGT acceptor regioselectivity from primary sequence. Novel indices characterizing graphical models of residues were proposed and found to be widely distributed among existing amino acid indices and to cluster residues appropriately. UGT subsequences biochemically linked to regioselectivity were modeled as sets of index sequences. Several learning techniques incorporating these UGT models were compared with classifications based on standard sequence alignment scores. These techniques included an application of time series distance functions to protein classification. Time series distances defined on the index sequences were used in nearest neighbor and support vector machine classifiers. Additionally, Bayesian neural network classifiers were applied to the index sequences. The experiments identified improvements over the nearest neighbor and support vector machine classifications relying on standard alignment similarity scores, as well as strong correlations between specific subsequences and regioselectivities. PMID:21747849
The Use of Fuzzy Set Classification for Pattern Recognition of the Polygraph
1993-12-01
actual feature extraction was done, It was decided to use the K-nearest neighbor ( KNN ) the data was preprocessed. The electrocardiogram classifier in...showing heart pulse, and a low frequency not known beforehand, and the KNN classifier does not component showing blood volume. The derivative of...the characteristics of the conventional KNN these six derived signals were detrended and filtered, classification method is that it assigns each
A Novel Locally Linear KNN Method With Applications to Visual Recognition.
Liu, Qingfeng; Liu, Chengjun
2017-09-01
A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.
Comparison of four approaches to a rock facies classification problem
Dubois, M.K.; Bohling, Geoffrey C.; Chakrabarti, S.
2007-01-01
In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. ?? 2006 Elsevier Ltd. All rights reserved.
Nearest Neighbor Algorithms for Pattern Classification
NASA Technical Reports Server (NTRS)
Barrios, J. O.
1972-01-01
A solution of the discrimination problem is considered by means of the minimum distance classifier, commonly referred to as the nearest neighbor (NN) rule. The NN rule is nonparametric, or distribution free, in the sense that it does not depend on any assumptions about the underlying statistics for its application. The k-NN rule is a procedure that assigns an observation vector z to a category F if most of the k nearby observations x sub i are elements of F. The condensed nearest neighbor (CNN) rule may be used to reduce the size of the training set required categorize The Bayes risk serves merely as a reference-the limit of excellence beyond which it is not possible to go. The NN rule is bounded below by the Bayes risk and above by twice the Bayes risk.
Mars global digital dune database and initial science results
Hayward, R.K.; Mullins, K.F.; Fenton, L.K.; Hare, T.M.; Titus, T.N.; Bourke, M.C.; Colaprete, A.; Christensen, P.R.
2007-01-01
A new Mars Global Digital Dune Database (MGD3) constructed using Thermal Emission Imaging System (THEMIS) infrared (IR) images provides a comprehensive and quantitative view of the geographic distribution of moderate- to large-size dune fields (area >1 kM2) that will help researchers to understand global climatic and sedimentary processes that have shaped the surface of Mars. MGD3 extends from 65??N to 65??S latitude and includes ???550 dune fields, covering ???70,000 km2, with an estimated total volume of ???3,600 km3. This area, when combined with polar dune estimates, suggests moderate- to large-size dune field coverage on Mars may total ???800,000 km2, ???6 times less than the total areal estimate of ???5,000,000 km2 for terrestrial dunes. Where availability and quality of THEMIS visible (VIS) or Mars Orbiter Camera. narrow-angle (MOC NA) images allow, we classify dunes and include dune slipface measurements, which are derived from gross dune morphology and represent the prevailing wind direction at the last time of significant dune modification. For dunes located within craters, the azimuth from crater centroid to dune field centroid (referred to as dune centroid azimuth) is calculated and can provide an accurate method for tracking dune migration within smooth-floored craters. These indicators of wind direction are compared to output from a general circulation model (GCM). Dune centroid azimuth values generally correlate to regional wind patterns. Slipface orientations are less well correlated, suggesting that local topographic effects may play a larger role in dune orientation than regional winds. Copyright 2007 by the American Geophysical Union.
Bruno, Rossella; Alì, Greta; Giannini, Riccardo; Proietti, Agnese; Lucchi, Marco; Chella, Antonio; Melfi, Franca; Mussi, Alfredo; Fontanini, Gabriella
2017-01-10
Malignant pleural mesothelioma (MPM) is a rare asbestos related cancer, aggressive and unresponsive to therapies. Histological examination of pleural lesions is the gold standard of MPM diagnosis, although it is sometimes hard to discriminate the epithelioid type of MPM from benign mesothelial hyperplasia (MH).This work aims to define a new molecular tool for the differential diagnosis of MPM, using the expression profile of 117 genes deregulated in this tumour.The gene expression analysis was performed by nanoString System on tumour tissues from 36 epithelioid MPM and 17 MH patients, and on 14 mesothelial pleural samples analysed in a blind way. Data analysis included raw nanoString data normalization, unsupervised cluster analysis by Pearson correlation, non-parametric Mann Whitney U-test and molecular classification by the Uncorrelated Shrunken Centroid (USC) Algorithm.The Mann-Whitney U-test found 35 genes upregulated and 31 downregulated in MPM. The unsupervised cluster analysis revealed two clusters, one composed only of MPM and one only of MH samples, thus revealing class-specific gene profiles. The Uncorrelated Shrunken Centroid algorithm identified two classifiers, one including 22 genes and the other 40 genes, able to properly classify all the samples as benign or malignant using gene expression data; both classifiers were also able to correctly determine, in a blind analysis, the diagnostic categories of all the 14 unknown samples.In conclusion we delineated a diagnostic tool combining molecular data (gene expression) and computational analysis (USC algorithm), which can be applied in the clinical practice for the differential diagnosis of MPM.
Jastrow-like ground states for quantum many-body potentials with near-neighbors interactions
NASA Astrophysics Data System (ADS)
Baradaran, Marzieh; Carrasco, José A.; Finkel, Federico; González-López, Artemio
2018-01-01
We completely solve the problem of classifying all one-dimensional quantum potentials with nearest- and next-to-nearest-neighbors interactions whose ground state is Jastrow-like, i.e., of Jastrow type but depending only on differences of consecutive particles. In particular, we show that these models must necessarily contain a three-body interaction term, as was the case with all previously known examples. We discuss several particular instances of the general solution, including a new hyperbolic potential and a model with elliptic interactions which reduces to the known rational and trigonometric ones in appropriate limits.
Learning with imperfectly labeled patterns
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of learning in pattern recognition using imperfectly labeled patterns is considered. The performance of the Bayes and nearest neighbor classifiers with imperfect labels is discussed using a probabilistic model for the mislabeling of the training patterns. Schemes for training the classifier using both parametric and non parametric techniques are presented. Methods for the correction of imperfect labels were developed. To gain an understanding of the learning process, expressions are derived for success probability as a function of training time for a one dimensional increment error correction classifier with imperfect labels. Feature selection with imperfectly labeled patterns is described.
Thermography based diagnosis of ruptured anterior cruciate ligament (ACL) in canines
NASA Astrophysics Data System (ADS)
Lama, Norsang; Umbaugh, Scott E.; Mishra, Deependra; Dahal, Rohini; Marino, Dominic J.; Sackman, Joseph
2016-09-01
Anterior cruciate ligament (ACL) rupture in canines is a common orthopedic injury in veterinary medicine. Veterinarians use both imaging and non-imaging methods to diagnose the disease. Common imaging methods such as radiography, computed tomography (CT scan) and magnetic resonance imaging (MRI) have some disadvantages: expensive setup, high dose of radiation, and time-consuming. In this paper, we present an alternative diagnostic method based on feature extraction and pattern classification (FEPC) to diagnose abnormal patterns in ACL thermograms. The proposed method was experimented with a total of 30 thermograms for each camera view (anterior, lateral and posterior) including 14 disease and 16 non-disease cases provided from Long Island Veterinary Specialists. The normal and abnormal patterns in thermograms are analyzed in two steps: feature extraction and pattern classification. Texture features based on gray level co-occurrence matrices (GLCM), histogram features and spectral features are extracted from the color normalized thermograms and the computed feature vectors are applied to Nearest Neighbor (NN) classifier, K-Nearest Neighbor (KNN) classifier and Support Vector Machine (SVM) classifier with leave-one-out validation method. The algorithm gives the best classification success rate of 86.67% with a sensitivity of 85.71% and a specificity of 87.5% in ACL rupture detection using NN classifier for the lateral view and Norm-RGB-Lum color normalization method. Our results show that the proposed method has the potential to detect ACL rupture in canines.
Diagnosis of Tempromandibular Disorders Using Local Binary Patterns.
Haghnegahdar, A A; Kolahi, S; Khojastepour, L; Tajeripour, F
2018-03-01
Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cut prepared from each condyle, although images were limited to head of mandibular condyle. In order to extract features of images, first we use LBP and then histogram of oriented gradients. To reduce dimensionality, the linear algebra Singular Value Decomposition (SVD) is applied to the feature vectors matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. K nearest neighbor classifier achieves a very good accuracy (0.9242), moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, when other classifiers have lower accuracy, sensitivity and specificity. We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN has been the best classifier for our experiments in detecting patients from healthy individuals, by 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages.
Discrimination of different sub-basins on Tajo River based on water influence factor
NASA Astrophysics Data System (ADS)
Bermudez, R.; Gascó, J. M.; Tarquis, A. M.; Saa-Requejo, A.
2009-04-01
Numeric taxonomy has been applied to classify Tajo basin water (Spain) till Portugal border. Several stations, a total of 52, that estimate 15 water variables have been used in this study. The different groups have been obtained applying a Euclidean distance among stations (distance classification) and a Euclidean distance between each station and the centroid estimated among them (centroid classification), varying the number of parameters and with or without variable typification. In order to compare the classification a log-log relation has been established, between number of groups created and distances, to select the best one. It has been observed that centroid classification is more appropriate following in a more logic way the natural constrictions than the minimum distance among stations. Variable typification doesn't improve the classification except when the centroid method is applied. Taking in consideration the ions and the sum of them as variables, the classification improved. Stations are grouped based on electric conductivity (CE), total anions (TA), total cations (TC) and ions ratio (Na/Ca and Mg/Ca). For a given classification and comparing the different groups created a certain variation in ions concentration and ions ratio are observed. However, the variation in each ion among groups is different depending on the case. For the last group, regardless the classification, the increase in all ions is general. Comparing the dendrograms, and groups that originated, Tajo river basin can be sub dived in five sub-basins differentiated by the main influence on water: 1. With a higher ombrogenic influence (rain fed). 2. With ombrogenic and pedogenic influence (rain and groundwater fed). 3. With pedogenic influence. 4. With lithogenic influence (geological bedrock). 5. With a higher ombrogenic and lithogenic influence added.
Maheswaran, Ravi; Pearson, Tim; Beevers, Sean D.; Campbell, Michael J.; Wolfe, Charles D.
2016-01-01
Background and Purpose Few studies have examined the association between air pollutants and ischemic stroke subtypes. We examined acute effects of outdoor air pollutants (PM10, NO2, O3, CO, SO2) on subtypes and severity of incident ischemic stroke and investigated if pre-existing risk factors increased susceptibility. Methods We used a time stratified case-crossover study and stroke cases from the South London Stroke Register set up to capture all incident cases of first ever stroke occurring amongst residents in a geographically defined area. The Oxford clinical and TOAST etiological classifications were used to classify subtypes. A pragmatic clinical classification system was used to assess severity. Air pollution concentrations from the nearest background air pollution monitoring stations to patients’ residential postcode centroids were used. Lags from 0 to 6 days were investigated. Results There were 2590 incident cases of ischemic stroke (1995–2006). While there were associations at various lag times with several pollutants, overall, there was no consistent pattern between exposure and risk of ischemic stroke subtypes or severity. The possible exception was the association between NO2 exposure and small vessel disease stroke—adjusted odds ratio of 1.51 (1.12–2.02) associated with an inter-quartile range increase in the lag 0–6 day average for NO2. There were no clear associations in relation to pre-existing risk factors. Conclusions Overall, we found little consistent evidence of association between air pollutants and ischemic stroke subtypes and severity. There was however a suggestion that increasing NO2 exposure might be associated with higher risk of stroke caused by cerebrovascular small vessel disease. PMID:27362783
Comparison of Classifier Architectures for Online Neural Spike Sorting.
Saeed, Maryam; Khan, Amir Ali; Kamboh, Awais Mehmood
2017-04-01
High-density, intracranial recordings from micro-electrode arrays need to undergo Spike Sorting in order to associate the recorded neuronal spikes to particular neurons. This involves spike detection, feature extraction, and classification. To reduce the data transmission and power requirements, on-chip real-time processing is becoming very popular. However, high computational resources are required for classifiers in on-chip spike-sorters, making scalability a great challenge. In this review paper, we analyze several popular classifiers to propose five new hardware architectures using the off-chip training with on-chip classification approach. These include support vector classification, fuzzy C-means classification, self-organizing maps classification, moving-centroid K-means classification, and Cosine distance classification. The performance of these architectures is analyzed in terms of accuracy and resource requirement. We establish that the neural networks based Self-Organizing Maps classifier offers the most viable solution. A spike sorter based on the Self-Organizing Maps classifier, requires only 7.83% of computational resources of the best-reported spike sorter, hierarchical adaptive means, while offering a 3% better accuracy at 7 dB SNR.
Large margin nearest neighbor classifiers.
Domeniconi, Carlotta; Gunopulos, Dimitrios; Peng, Jing
2005-07-01
The nearest neighbor technique is a simple and appealing approach to addressing classification problems. It relies on the assumption of locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with a finite number of examples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. The employment of a locally adaptive metric becomes crucial in order to keep class conditional probabilities close to uniform, thereby minimizing the bias of estimates. We propose a technique that computes a locally flexible metric by means of support vector machines (SVMs). The decision function constructed by SVMs is used to determine the most discriminant direction in a neighborhood around the query. Such a direction provides a local feature weighting scheme. We formally show that our method increases the margin in the weighted space where classification takes place. Moreover, our method has the important advantage of online computational efficiency over competing locally adaptive techniques for nearest neighbor classification. We demonstrate the efficacy of our method using both real and simulated data.
Allen, Victoria W; Shirasu-Hiza, Mimi
2018-01-01
Despite being pervasive, the control of programmed grooming is poorly understood. We addressed this gap by developing a high-throughput platform that allows long-term detection of grooming in Drosophila melanogaster. In our method, a k-nearest neighbors algorithm automatically classifies fly behavior and finds grooming events with over 90% accuracy in diverse genotypes. Our data show that flies spend ~13% of their waking time grooming, driven largely by two major internal programs. One of these programs regulates the timing of grooming and involves the core circadian clock components cycle, clock, and period. The second program regulates the duration of grooming and, while dependent on cycle and clock, appears to be independent of period. This emerging dual control model in which one program controls timing and another controls duration, resembles the two-process regulatory model of sleep. Together, our quantitative approach presents the opportunity for further dissection of mechanisms controlling long-term grooming in Drosophila. PMID:29485401
Brain tissue segmentation in 4D CT using voxel classification
NASA Astrophysics Data System (ADS)
van den Boom, R.; Oei, M. T. H.; Lafebre, S.; Oostveen, L. J.; Meijer, F. J. A.; Steens, S. C. A.; Prokop, M.; van Ginneken, B.; Manniesing, R.
2012-02-01
A method is proposed to segment anatomical regions of the brain from 4D computer tomography (CT) patient data. The method consists of a three step voxel classification scheme, each step focusing on structures that are increasingly difficult to segment. The first step classifies air and bone, the second step classifies vessels and the third step classifies white matter, gray matter and cerebrospinal fluid. As features the time averaged intensity value and the temporal intensity change value were used. In each step, a k-Nearest-Neighbor classifier was used to classify the voxels. Training data was obtained by placing regions of interest in reconstructed 3D image data. The method has been applied to ten 4D CT cerebral patient data. A leave-one-out experiment showed consistent and accurate segmentation results.
Machine Learning in Intrusion Detection
2005-07-01
machine learning tasks. Anomaly detection provides the core technology for a broad spectrum of security-centric applications. In this dissertation, we examine various aspects of anomaly based intrusion detection in computer security. First, we present a new approach to learn program behavior for intrusion detection. Text categorization techniques are adopted to convert each process to a vector and calculate the similarity between two program activities. Then the k-nearest neighbor classifier is employed to classify program behavior as normal or intrusive. We demonstrate
Detecting the Difficulty Level of Foreign Language Texts
2010-02-01
continuous tenses), as well as part- of- speech labels for words. The authors used a k-Nearest Neighbor ( kNN ) classifier (Cover and Hart, 1967; Mitchell, 1997...anticipate, and influence these situations and to operate in them is found in foreign language speech and text. For this reason, military linguists are...the language model system, LGR is the prediction of one of the grammar-based classifiers, and CkNN is a confidence value of the kNN prediction for the
Perceptions and Expected Immediate Reactions to Severe Storm Displays.
Jon, Ihnji; Huang, Shih-Kai; Lindell, Michael K
2017-11-09
The National Weather Service has adopted warning polygons that more specifically indicate the risk area than its previous county-wide warnings. However, these polygons are not defined in terms of numerical strike probabilities (p s ). To better understand people's interpretations of warning polygons, 167 participants were shown 23 hypothetical scenarios in one of three information conditions-polygon-only (Condition A), polygon + tornadic storm cell (Condition B), and polygon + tornadic storm cell + flanking nontornadic storm cells (Condition C). Participants judged each polygon's p s and reported the likelihood of taking nine different response actions. The polygon-only condition replicated the results of previous studies; p s was highest at the polygon's centroid and declined in all directions from there. The two conditions displaying storm cells differed from the polygon-only condition only in having p s just as high at the polygon's edge nearest the storm cell as at its centroid. Overall, p s values were positively correlated with expectations of continuing normal activities, seeking information from social sources, seeking shelter, and evacuating by car. These results indicate that participants make more appropriate p s judgments when polygons are presented in their natural context of radar displays than when they are presented in isolation. However, the fact that p s judgments had moderately positive correlations with both sheltering (a generally appropriate response) and evacuation (a generally inappropriate response) suggests that experiment participants experience the same ambivalence about these two protective actions as people threatened by actual tornadoes. © 2017 Society for Risk Analysis.
Asteroid detection using a single multi-wavelength CCD scan
NASA Astrophysics Data System (ADS)
Melton, Jonathan
2016-09-01
Asteroid detection is a topic of great interest due to the possibility of diverting possibly dangerous asteroids or mining potentially lucrative ones. Currently, asteroid detection is generally performed by taking multiple images of the same patch of sky separated by 10-15 minutes, then subtracting the images to find movement. However, this is time consuming because of the need to revisit the same area multiple times per night. This paper describes an algorithm that can detect asteroids using a single CCD camera scan, thus cutting down on the time and cost of an asteroid survey. The algorithm is based on the fact that some telescopes scan the sky at multiple wavelengths with a small time separation between the wavelength components. As a result, an object moving with sufficient speed will appear in different places in different wavelength components of the same image. Using image processing techniques we detect the centroids of points of light in the first component and compare these positions to the centroids in the other components using a nearest neighbor algorithm. The algorithm was used on a test set of 49 images obtained from the Sloan telescope in New Mexico and found 100% of known asteroids with only 3 false positives. This algorithm has the advantage of decreasing the amount of time required to perform an asteroid scan, thus allowing more sky to be scanned in the same amount of time or freeing a telescope for other pursuits.
Hiscock, Rosemary; Pearce, Jamie; Blakely, Tony; Witten, Karen
2008-01-01
Objective To explore whether travel time access to the nearest general practitioner (GP) surgery (which is equivalent to U.S. primary care physician [PCP] office) and pharmacy predicts individual-level health service utilization and satisfaction. Data Sources GP and pharmacy addresses were obtained from the New Zealand Ministry of Health in 2003 and merged with a geographic boundaries data set. Travel times derived from these data were appended to the 2002/03 New Zealand Health Survey (N = 12,529). Study Design Multilevel logistic regression was used to model the relationship between travel time access and five health service outcomes: GP consultation, blood pressure test, cholesterol test, visit to pharmacy, and satisfaction with latest GP consultation. Data Collection/Extraction Travel times between each census meshblock centroid and the nearest GP and pharmacy were calculated using Geographical Information System. Principal Findings When travel times were long, blood pressure tests were less likely in urban areas (odds ratio [OR] 0.75 [0.59–0.97]), GP consultations were less likely in rural centers (OR 0.42 [0.22–0.78]) and pharmacy visits were less likely in highly rural areas (OR 0.36 [0.13–0.99]). There was some evidence of lower utilization in rural areas. Conclusions Locational access to GP surgeries and pharmacies appears to sometimes be associated with health service use but not satisfaction. PMID:18671752
ICAP - An Interactive Cluster Analysis Procedure for analyzing remotely sensed data
NASA Technical Reports Server (NTRS)
Wharton, S. W.; Turner, B. J.
1981-01-01
An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. ICAP differs from conventional clustering algorithms by allowing the analyst to optimize the cluster configuration by inspection, rather than by manipulating process parameters. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who can evaluate and elect to modify the cluster structure. Clusters can be deleted, or lumped together pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The principal advantage of this approach is that it allows prior information (when available) to be used directly in the analysis, since the analyst interacts with ICAP in a straightforward manner, using basic terms with which he is more likely to be familiar. Results from testing ICAP showed that an informed use of ICAP can improve classification, as compared to an existing cluster analysis procedure.
An ensemble of dissimilarity based classifiers for Mackerel gender determination
NASA Astrophysics Data System (ADS)
Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.
2014-03-01
Mackerel is an infravalored fish captured by European fishing vessels. A manner to add value to this specie can be achieved by trying to classify it attending to its sex. Colour measurements were performed on Mackerel females and males (fresh and defrozen) extracted gonads to obtain differences between sexes. Several linear and non linear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can been applied to this problem. However, theyare usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kind of dissimilarity based classifiers. The diversity is induced considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity.
Umut, İlhan; Çentik, Güven
2016-01-01
The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present. PMID:27213008
Umut, İlhan; Çentik, Güven
2016-01-01
The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present.
Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar
2016-04-01
Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generates an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. They often contain a large amount of expression, but only a fraction of it comprises genes that are significantly expressed. The precise identification of genes of interest that are responsible for causing cancer are imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
Efficient Fingercode Classification
NASA Astrophysics Data System (ADS)
Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang
In this paper, we present an efficient fingerprint classification algorithm which is an essential component in many critical security application systems e. g. systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by firstly classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation which is an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research firstly investigates the various fast search algorithms in vector quantization (VQ) and the potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.
Diagnosis of Tempromandibular Disorders Using Local Binary Patterns
Haghnegahdar, A.A.; Kolahi, S.; Khojastepour, L.; Tajeripour, F.
2018-01-01
Background: Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. Material and Methods: CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cut prepared from each condyle, although images were limited to head of mandibular condyle. In order to extract features of images, first we use LBP and then histogram of oriented gradients. To reduce dimensionality, the linear algebra Singular Value Decomposition (SVD) is applied to the feature vectors matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. Results: K nearest neighbor classifier achieves a very good accuracy (0.9242), moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, when other classifiers have lower accuracy, sensitivity and specificity. Conclusion: We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN has been the best classifier for our experiments in detecting patients from healthy individuals, by 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages. PMID:29732343
Local Subspace Classifier with Transform-Invariance for Image Classification
NASA Astrophysics Data System (ADS)
Hotta, Seiji
A family of linear subspace classifiers called local subspace classifier (LSC) outperforms the k-nearest neighbor rule (kNN) and conventional subspace classifiers in handwritten digit classification. However, LSC suffers very high sensitivity to image transformations because it uses projection and the Euclidean distances for classification. In this paper, I present a combination of a local subspace classifier (LSC) and a tangent distance (TD) for improving accuracy of handwritten digit recognition. In this classification rule, we can deal with transform-invariance easily because we are able to use tangent vectors for approximation of transformations. However, we cannot use tangent vectors in other type of images such as color images. Hence, kernel LSC (KLSC) is proposed for incorporating transform-invariance into LSC via kernel mapping. The performance of the proposed methods is verified with the experiments on handwritten digit and color image classification.
Shen, Hong-Bin; Chou, Kuo-Chen
2005-11-25
The nucleus is the brain of eukaryotic cells that guides the life processes of the cell by issuing key instructions. For in-depth understanding of the biochemical process of the nucleus, the knowledge of localization of nuclear proteins is very important. With the avalanche of protein sequences generated in the post-genomic era, it is highly desired to develop an automated method for fast annotating the subnuclear locations for numerous newly found nuclear protein sequences so as to be able to timely utilize them for basic research and drug discovery. In view of this, a novel approach is developed for predicting the protein subnuclear location. It is featured by introducing a powerful classifier, the optimized evidence-theoretic K-nearest classifier, and using the pseudo amino acid composition [K.C. Chou, PROTEINS: Structure, Function, and Genetics, 43 (2001) 246], which can incorporate a considerable amount of sequence-order effects, to represent protein samples. As a demonstration, identifications were performed for 370 nuclear proteins among the following 9 subnuclear locations: (1) Cajal body, (2) chromatin, (3) heterochromatin, (4) nuclear diffuse, (5) nuclear pore, (6) nuclear speckle, (7) nucleolus, (8) PcG body, and (9) PML body. The overall success rates thus obtained by both the re-substitution test and jackknife cross-validation test are significantly higher than those by existing classifiers on the same working dataset. It is anticipated that the powerful approach may also become a useful high throughput vehicle to bridge the huge gap occurring in the post-genomic era between the number of gene sequences in databases and the number of gene products that have been functionally characterized. The OET-KNN classifier will be available at www.pami.sjtu.edu.cn/people/hbshen.
Hussain, Lal
2018-06-01
Epilepsy is a neurological disorder produced due to abnormal excitability of neurons in the brain. The research reveals that brain activity is monitored through electroencephalogram (EEG) of patients suffered from seizure to detect the epileptic seizure. The performance of EEG detection based epilepsy require feature extracting strategies. In this research, we have extracted varying features extracting strategies based on time and frequency domain characteristics, nonlinear, wavelet based entropy and few statistical features. A deeper study was undertaken using novel machine learning classifiers by considering multiple factors. The support vector machine kernels are evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we computed the different distance metrics, Neighbor weights and Neighbors. Similarly, the decision trees we tuned the paramours based on maximum splits and split criteria and ensemble classifiers are evaluated based on different ensemble methods and learning rate. For training/testing tenfold Cross validation was employed and performance was evaluated in form of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis approach was performed using diverse features extracting strategies using robust machine learning classifiers with more advanced optimal options. Support Vector Machine linear kernel and KNN with City block distance metric give the overall highest accuracy of 99.5% which was higher than using the default parameters for these classifiers. Moreover, highest separation (AUC = 0.9991, 0.9990) were obtained at different kernel scales using SVM. Additionally, the K-nearest neighbors with inverse squared distance weight give higher performance at different Neighbors. Moreover, to distinguish the postictal heart rate oscillations from epileptic ictal subjects, and highest performance of 100% was obtained using different machine learning classifiers.
NASA Astrophysics Data System (ADS)
Schmalz, M.; Ritter, G.; Key, R.
Accurate and computationally efficient spectral signature classification is a crucial step in the nonimaging detection and recognition of spaceborne objects. In classical hyperspectral recognition applications using linear mixing models, signature classification accuracy depends on accurate spectral endmember discrimination [1]. If the endmembers cannot be classified correctly, then the signatures cannot be classified correctly, and object recognition from hyperspectral data will be inaccurate. In practice, the number of endmembers accurately classified often depends linearly on the number of inputs. This can lead to potentially severe classification errors in the presence of noise or densely interleaved signatures. In this paper, we present an comparison of emerging technologies for nonimaging spectral signature classfication based on a highly accurate, efficient search engine called Tabular Nearest Neighbor Encoding (TNE) [3,4] and a neural network technology called Morphological Neural Networks (MNNs) [5]. Based on prior results, TNE can optimize its classifier performance to track input nonergodicities, as well as yield measures of confidence or caution for evaluation of classification results. Unlike neural networks, TNE does not have a hidden intermediate data structure (e.g., the neural net weight matrix). Instead, TNE generates and exploits a user-accessible data structure called the agreement map (AM), which can be manipulated by Boolean logic operations to effect accurate classifier refinement algorithms. The open architecture and programmability of TNE's agreement map processing allows a TNE programmer or user to determine classification accuracy, as well as characterize in detail the signatures for which TNE did not obtain classification matches, and why such mis-matches occurred. In this study, we will compare TNE and MNN based endmember classification, using performance metrics such as probability of correct classification (Pd) and rate of false detections (Rfa). As proof of principle, we analyze classification of multiple closely spaced signatures from a NASA database of space material signatures. Additional analysis pertains to computational complexity and noise sensitivity, which are superior to Bayesian techniques based on classical neural networks. [1] Winter, M.E. "Fast autonomous spectral end-member determination in hyperspectral data," in Proceedings of the 13th International Conference On Applied Geologic Remote Sensing, Vancouver, B.C., Canada, pp. 337-44 (1999). [2] N. Keshava, "A survey of spectral unmixing algorithms," Lincoln Laboratory Journal 14:55-78 (2003). [3] Key, G., M.S. SCHMALZ, F.M. Caimi, and G.X. Ritter. "Performance analysis of tabular nearest neighbor encoding algorithm for joint compression and ATR", in Proceedings SPIE 3814:115-126 (1999). [4] Schmalz, M.S. and G. Key. "Algorithms for hyperspectral signature classification in unresolved object detection using tabular nearest neighbor encoding" in Proceedings of the 2007 AMOS Conference, Maui HI (2007). [5] Ritter, G.X., G. Urcid, and M.S. Schmalz. "Autonomous single-pass endmember approximation using lattice auto-associative memories", Neurocomputing (Elsevier), accepted (June 2008).
Delavarian, Mona; Towhidkhah, Farzad; Gharibzadeh, Shahriar; Dibajnia, Parvin
2011-07-12
Automatic classification of different behavioral disorders with many similarities (e.g. in symptoms) by using an automated approach will help psychiatrists to concentrate on correct disorder and its treatment as soon as possible, to avoid wasting time on diagnosis, and to increase the accuracy of diagnosis. In this study, we tried to differentiate and classify (diagnose) 306 children with many similar symptoms and different behavioral disorders such as ADHD, depression, anxiety, comorbid depression and anxiety and conduct disorder with high accuracy. Classification was based on the symptoms and their severity. With examining 16 different available classifiers, by using "Prtools", we have proposed nearest mean classifier as the most accurate classifier with 96.92% accuracy in this research. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
A faux hawk fullerene with PCBM-like properties
San, Long K.; Bukovsky, Eric V.; Larson, Bryon W.; ...
2014-12-16
Reaction of C 60, C 6F 5CF 2I, and SnH(n-Bu) 3 produced, among other unidentified fullerene derivatives, the two new compounds 1,9-C 60(CF 2C 6F 5)H (1) and 1,9-C 60(cyclo-CF 2(2-C 6F 4)) (2). The highest isolated yield of 1 was 35% based on C60. Depending on the reaction conditions, the relative amounts of 1 and 2 generated in situ were as high as 85% and 71%, respectively, based on HPLC peak integration and summing over all fullerene species present other than unreacted C60. Compound 1 is thermally stable in 1,2-dichlorobenzene (oDCB) at 160 °C but was rapidly converted tomore » 2 upon addition of Sn 2(n-Bu) 6 at this temperature. In contrast, complete conversion of 1 to 2 occurred within minutes, or hours, at 25 °C in 90/10 (v/v) PhCN/C 6D 6 by addition of stoichiometric, or sub-stoichiometric, amounts of proton sponge (PS) or cobaltocene (CoCp 2). DFT calculations indicate that when 1 is deprotonated, the anion C 60(CF 2C 6F 5)- can undergo facile intramolecular SNAr annulation to form 2 with concomitant loss of F-. To our knowledge this is the first observation of a fullerene-cage carbanion acting as an S NAr nucleophile towards an aromatic C–F bond. The gas-phase electron affinity (EA) of 2 was determined to be 2.805(10) eV by low-temperature PES, higher by 0.12(1) eV than the EA of C 60 and higher by 0.18(1) eV than the EA of phenyl-C 61-butyric acid methyl ester (PCBM). In contrast, the relative E 1/2(0/-) values of 2 and C 60, -0.01(1) and 0.00(1) V, respectively, are virtually the same (on this scale, and under the same conditions, the E1/2(0/-) of PCBM is -0.09 V). Time-resolved microwave conductivity charge-carrier yield × mobility values for organic photovoltaic active-layer-type blends of 2 and poly-3-hexylthiophene (P3HT) were comparable to those for equimolar blends of PCBM and P3HT. The structure of solvent-free crystals of 2 was determined by single-crystal X-ray diffraction. The number of nearest-neighbor fullerene–fullerene interactions with centroid···centroid (⊙···⊙) distances of ≤10.34 Å is significantly greater, and the average ⊙···⊙ distance is shorter, for 2 (10 nearest neighbors; ave. ⊙···⊙ distance = 10.09 Å) than for solvent-free crystals of PCBM (7 nearest neighbors; ave. ⊙···⊙ distance = 10.17 Å). Finally, the thermal stability of 2 was found to be far greater than that of PCBM.« less
A faux hawk fullerene with PCBM-like properties
DOE Office of Scientific and Technical Information (OSTI.GOV)
San, Long K.; Bukovsky, Eric V.; Larson, Bryon W.
Reaction of C 60, C 6F 5CF 2I, and SnH(n-Bu) 3 produced, among other unidentified fullerene derivatives, the two new compounds 1,9-C 60(CF 2C 6F 5)H (1) and 1,9-C 60(cyclo-CF 2(2-C 6F 4)) (2). The highest isolated yield of 1 was 35% based on C60. Depending on the reaction conditions, the relative amounts of 1 and 2 generated in situ were as high as 85% and 71%, respectively, based on HPLC peak integration and summing over all fullerene species present other than unreacted C60. Compound 1 is thermally stable in 1,2-dichlorobenzene (oDCB) at 160 °C but was rapidly converted tomore » 2 upon addition of Sn 2(n-Bu) 6 at this temperature. In contrast, complete conversion of 1 to 2 occurred within minutes, or hours, at 25 °C in 90/10 (v/v) PhCN/C 6D 6 by addition of stoichiometric, or sub-stoichiometric, amounts of proton sponge (PS) or cobaltocene (CoCp 2). DFT calculations indicate that when 1 is deprotonated, the anion C 60(CF 2C 6F 5)- can undergo facile intramolecular SNAr annulation to form 2 with concomitant loss of F-. To our knowledge this is the first observation of a fullerene-cage carbanion acting as an S NAr nucleophile towards an aromatic C–F bond. The gas-phase electron affinity (EA) of 2 was determined to be 2.805(10) eV by low-temperature PES, higher by 0.12(1) eV than the EA of C 60 and higher by 0.18(1) eV than the EA of phenyl-C 61-butyric acid methyl ester (PCBM). In contrast, the relative E 1/2(0/-) values of 2 and C 60, -0.01(1) and 0.00(1) V, respectively, are virtually the same (on this scale, and under the same conditions, the E1/2(0/-) of PCBM is -0.09 V). Time-resolved microwave conductivity charge-carrier yield × mobility values for organic photovoltaic active-layer-type blends of 2 and poly-3-hexylthiophene (P3HT) were comparable to those for equimolar blends of PCBM and P3HT. The structure of solvent-free crystals of 2 was determined by single-crystal X-ray diffraction. The number of nearest-neighbor fullerene–fullerene interactions with centroid···centroid (⊙···⊙) distances of ≤10.34 Å is significantly greater, and the average ⊙···⊙ distance is shorter, for 2 (10 nearest neighbors; ave. ⊙···⊙ distance = 10.09 Å) than for solvent-free crystals of PCBM (7 nearest neighbors; ave. ⊙···⊙ distance = 10.17 Å). Finally, the thermal stability of 2 was found to be far greater than that of PCBM.« less
NASA Technical Reports Server (NTRS)
McDowell, Mark
2004-01-01
An integrated algorithm for decomposing overlapping particle images (multi-particle objects) along with determining each object s constituent particle centroid(s) has been developed using image analysis techniques. The centroid finding algorithm uses a modified eight-direction search method for finding the perimeter of any enclosed object. The centroid is calculated using the intensity-weighted center of mass of the object. The overlap decomposition algorithm further analyzes the object data and breaks it down into its constituent particle centroid(s). This is accomplished with an artificial neural network, feature based technique and provides an efficient way of decomposing overlapping particles. Combining the centroid finding and overlap decomposition routines into a single algorithm allows us to accurately predict the error associated with finding the centroid(s) of particles in our experiments. This algorithm has been tested using real, simulated, and synthetic data and the results are presented and discussed.
Credit scoring analysis using weighted k nearest neighbor
NASA Astrophysics Data System (ADS)
Mukid, M. A.; Widiharih, T.; Rusgiyono, A.; Prahutama, A.
2018-05-01
Credit scoring is a quatitative method to evaluate the credit risk of loan applications. Both statistical methods and artificial intelligence are often used by credit analysts to help them decide whether the applicants are worthy of credit. These methods aim to predict future behavior in terms of credit risk based on past experience of customers with similar characteristics. This paper reviews the weighted k nearest neighbor (WKNN) method for credit assessment by considering the use of some kernels. We use credit data from a private bank in Indonesia. The result shows that the Gaussian kernel and rectangular kernel have a better performance based on the value of percentage corrected classified whose value is 82.4% respectively.
Kepler Planet Detection Metrics: Robovetter Completeness and Effectiveness for Data Release 25
NASA Technical Reports Server (NTRS)
Coughlin, Jeffrey L.
2017-01-01
In general, the Kepler pipeline identifies a list of Threshold Crossing Events (TCEs), which are periodic flux decrements meeting certain criteria (Jenkins, 2017). These TCEs are reviewed and those that appear consistent with astrophysically transiting or eclipsing systems are classified as Kepler Objects of Interest (KOIs). Further review is given to KOIs, which are then dispositioned as Planet Candidates (PCs) or False Positive (FPs). FPs are further denoted by four major flags that indicate if the signal is Not Transit-Like (NTL), due to a Stellar Eclipse (SS; previously referred to as Significant Secondary), and/or due to contamination from a source other than the target as evidenced by a Centroid Offset (CO) oran Ephemeris Match (EM) with another object. This entire TCE review process is known as dispositioning or vetting.In the first five Kepler mission planet candidate catalogs (Borucki et al., 2011a,b; Batalha et al., 2013; Burke et al., 2014; Rowe et al., 2015), TCEs were manually examined on an individual basis and dispositioned using various plots and quantitative diagnostic tests (see e.g., Coughlin, 2017). In the sixth catalog, Mullally et al. (2015a) employed partial automation via simple parameter cuts to automatically disposition a large fraction of TCEs as not transit-like. Mullally et al. (2015a) also used an automated technique known as the centroid Robovetter (Mullally, 2017) to automatically identify some FP KOIs due to centroid offsets - a telltale signature of light contamination from another target. The remaining targets were manually dispositioned. In the seventh catalog, Coughlin et al. (2016) automated theentire dispositioning process using what is collectively known simply as the Robovetter.In the eighth and final mission catalog, Thompson et al. (2017) use a revised Robovetter to automate the dispositioning of all TCEs with an emphasis on creating a catalog suitable for accurately determining planet occurrence rates. In order to calculate accurate occurrence rates, the completeness and effectiveness of the Robovetter must be characterized. We define these terms as applied to the Robovetter, following Thompson et al. (2017), as:1. Completeness: The fraction of transiting planets detected by the pipeline that are classified as planet candidates by the Robovetter.2. Effectiveness: The fraction of false positives detected by the pipeline that are classified as false positives by the Robovetter.The remainder of this document describes products that can be used to quantitatively assess Robovetter completeness and effectiveness for an arbitrary set of Kepler stars.
Lowest cost, nearest term options for a manned Mars mission
NASA Technical Reports Server (NTRS)
Sauls, Bob; Mortensen, Michael; Myers, Renee; Guacci, Giovanni; Montes, Fred
1992-01-01
This study is part of a NASA/USRA Advanced Design Program project executed for the purpose of examining the requirements of a first manned Mars mission. The mission, classified as a split/sprint mission, has been designed for a crew of six with a total manned trip time of one year.
Emotion recognition from multichannel EEG signals using K-nearest neighbor classification.
Li, Mi; Xu, Hongpei; Liu, Xingwang; Lu, Shengfu
2018-04-27
Many studies have been done on the emotion recognition based on multi-channel electroencephalogram (EEG) signals. This paper explores the influence of the emotion recognition accuracy of EEG signals in different frequency bands and different number of channels. We classified the emotional states in the valence and arousal dimensions using different combinations of EEG channels. Firstly, DEAP default preprocessed data were normalized. Next, EEG signals were divided into four frequency bands using discrete wavelet transform, and entropy and energy were calculated as features of K-nearest neighbor Classifier. The classification accuracies of the 10, 14, 18 and 32 EEG channels based on the Gamma frequency band were 89.54%, 92.28%, 93.72% and 95.70% in the valence dimension and 89.81%, 92.24%, 93.69% and 95.69% in the arousal dimension. As the number of channels increases, the classification accuracy of emotional states also increases, the classification accuracy of the gamma frequency band is greater than that of the beta frequency band followed by the alpha and theta frequency bands. This paper provided better frequency bands and channels reference for emotion recognition based on EEG.
Fall Detection Using Smartphone Audio Features.
Cheffena, Michael
2016-07-01
An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficents (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers: k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN) are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with ANN classifier with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirement for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
E-Nose Vapor Identification Based on Dempster-Shafer Fusion of Multiple Classifiers
NASA Technical Reports Server (NTRS)
Li, Winston; Leung, Henry; Kwan, Chiman; Linnell, Bruce R.
2005-01-01
Electronic nose (e-nose) vapor identification is an efficient approach to monitor air contaminants in space stations and shuttles in order to ensure the health and safety of astronauts. Data preprocessing (measurement denoising and feature extraction) and pattern classification are important components of an e-nose system. In this paper, a wavelet-based denoising method is applied to filter the noisy sensor measurements. Transient-state features are then extracted from the denoised sensor measurements, and are used to train multiple classifiers such as multi-layer perceptions (MLP), support vector machines (SVM), k nearest neighbor (KNN), and Parzen classifier. The Dempster-Shafer (DS) technique is used at the end to fuse the results of the multiple classifiers to get the final classification. Experimental analysis based on real vapor data shows that the wavelet denoising method can remove both random noise and outliers successfully, and the classification rate can be improved by using classifier fusion.
Relationship between ion pair geometries and electrostatic strengths in proteins.
Kumar, Sandeep; Nussinov, Ruth
2002-01-01
The electrostatic free energy contribution of an ion pair in a protein depends on two factors, geometrical orientation of the side-chain charged groups with respect to each other and the structural context of the ion pair in the protein. Conformers in NMR ensembles enable studies of the relationship between geometry and electrostatic strengths of ion pairs, because the protein structural contexts are highly similar across different conformers. We have studied this relationship using a dataset of 22 unique ion pairs in 14 NMR conformer ensembles for 11 nonhomologous proteins. In different NMR conformers, the ion pairs are classified as salt bridges, nitrogen-oxygen (N-O) bridges and longer-range ion pairs on the basis of geometrical criteria. In salt bridges, centroids of the side-chain charged groups and at least a pair of side-chain nitrogen and oxygen atoms of the ion-pairing residues are within a 4 A distance. In N-O bridges, at least a pair of the side-chain nitrogen and oxygen atoms of the ion-pairing residues are within 4 A distance, but the distance between the side-chain charged group centroids is greater than 4 A. In the longer-range ion pairs, the side-chain charged group centroids as well as the side-chain nitrogen and oxygen atoms are more than 4 A apart. Continuum electrostatic calculations indicate that most of the ion pairs have stabilizing electrostatic contributions when their side-chain charged group centroids are within 5 A distance. Hence, most (approximately 92%) of the salt bridges and a majority (68%) of the N-O bridges are stabilizing. Most (approximately 89%) of the destabilizing ion pairs are the longer-range ion pairs. In the NMR conformer ensembles, the electrostatic interaction between side-chain charged groups of the ion-pairing residues is the strongest for salt bridges, considerably weaker for N-O bridges, and the weakest for longer-range ion pairs. These results suggest empirical rules for stabilizing electrostatic interactions in proteins. PMID:12202384
Smeragliuolo, Anna H.; Long, John Davis; Bumanlag, Silverio Joseph; He, Victor; Lampe, Anna
2017-01-01
The objective of this study was to determine whether kinematic data collected by the Microsoft Kinect 2 (MK2) could be used to quantify postural stability in healthy subjects. Twelve subjects were recruited for the project, and were instructed to perform a sequence of simple postural stability tasks. The movement sequence was performed as subjects were seated on top of a force platform, and the MK2 was positioned in front of them. This sequence of tasks was performed by each subject under three different postural conditions: “both feet on the ground” (1), “One foot off the ground” (2), and “both feet off the ground” (3). We compared force platform and MK2 data to quantify the degree to which the MK2 was returning reliable data across subjects. We then applied a novel machine-learning paradigm to the MK2 data in order to determine the extent to which data from the MK2 could be used to reliably classify different postural conditions. Our initial comparison of force plate and MK2 data showed a strong agreement between the two devices, with strong Pearson correlations between the trunk centroids “Spine_Mid” (0.85 ± 0.06), “Neck” (0.86 ± 0.07) and “Head” (0.87 ± 0.07), and the center of pressure centroid inferred by the force platform. Mean accuracy for the machine learning classifier from MK2 was 97.0%, with a specific classification accuracy breakdown of 90.9%, 100%, and 100% for conditions 1 through 3, respectively. Mean accuracy for the machine learning classifier derived from the force platform data was lower at 84.4%. We conclude that data from the MK2 has sufficient information content to allow us to classify sequences of tasks being performed under different levels of postural stability. Future studies will focus on validating this protocol on large populations of individuals with actual balance impairments in order to create a toolkit that is clinically validated and available to the medical community. PMID:28196139
Dehbandi, Behdad; Barachant, Alexandre; Smeragliuolo, Anna H; Long, John Davis; Bumanlag, Silverio Joseph; He, Victor; Lampe, Anna; Putrino, David
2017-01-01
The objective of this study was to determine whether kinematic data collected by the Microsoft Kinect 2 (MK2) could be used to quantify postural stability in healthy subjects. Twelve subjects were recruited for the project, and were instructed to perform a sequence of simple postural stability tasks. The movement sequence was performed as subjects were seated on top of a force platform, and the MK2 was positioned in front of them. This sequence of tasks was performed by each subject under three different postural conditions: "both feet on the ground" (1), "One foot off the ground" (2), and "both feet off the ground" (3). We compared force platform and MK2 data to quantify the degree to which the MK2 was returning reliable data across subjects. We then applied a novel machine-learning paradigm to the MK2 data in order to determine the extent to which data from the MK2 could be used to reliably classify different postural conditions. Our initial comparison of force plate and MK2 data showed a strong agreement between the two devices, with strong Pearson correlations between the trunk centroids "Spine_Mid" (0.85 ± 0.06), "Neck" (0.86 ± 0.07) and "Head" (0.87 ± 0.07), and the center of pressure centroid inferred by the force platform. Mean accuracy for the machine learning classifier from MK2 was 97.0%, with a specific classification accuracy breakdown of 90.9%, 100%, and 100% for conditions 1 through 3, respectively. Mean accuracy for the machine learning classifier derived from the force platform data was lower at 84.4%. We conclude that data from the MK2 has sufficient information content to allow us to classify sequences of tasks being performed under different levels of postural stability. Future studies will focus on validating this protocol on large populations of individuals with actual balance impairments in order to create a toolkit that is clinically validated and available to the medical community.
Larrañaga, Ana; Bielza, Concha; Pongrácz, Péter; Faragó, Tamás; Bálint, Anna; Larrañaga, Pedro
2015-03-01
Barking is perhaps the most characteristic form of vocalization in dogs; however, very little is known about its role in the intraspecific communication of this species. Besides the obvious need for ethological research, both in the field and in the laboratory, the possible information content of barks can also be explored by computerized acoustic analyses. This study compares four different supervised learning methods (naive Bayes, classification trees, [Formula: see text]-nearest neighbors and logistic regression) combined with three strategies for selecting variables (all variables, filter and wrapper feature subset selections) to classify Mudi dogs by sex, age, context and individual from their barks. The classification accuracy of the models obtained was estimated by means of [Formula: see text]-fold cross-validation. Percentages of correct classifications were 85.13 % for determining sex, 80.25 % for predicting age (recodified as young, adult and old), 55.50 % for classifying contexts (seven situations) and 67.63 % for recognizing individuals (8 dogs), so the results are encouraging. The best-performing method was [Formula: see text]-nearest neighbors following a wrapper feature selection approach. The results for classifying contexts and recognizing individual dogs were better with this method than they were for other approaches reported in the specialized literature. This is the first time that the sex and age of domestic dogs have been predicted with the help of sound analysis. This study shows that dog barks carry ample information regarding the caller's indexical features. Our computerized analysis provides indirect proof that barks may serve as an important source of information for dogs as well.
Interface Prostheses With Classifier-Feedback-Based User Training.
Fang, Yinfeng; Zhou, Dalin; Li, Kairu; Liu, Honghai
2017-11-01
It is evident that user training significantly affects performance of pattern-recognition-based myoelectric prosthetic device control. Despite plausible classification accuracy on offline datasets, online accuracy usually suffers from the changes in physiological conditions and electrode displacement. The user ability in generating consistent electromyographic (EMG) patterns can be enhanced via proper user training strategies in order to improve online performance. This study proposes a clustering-feedback strategy that provides real-time feedback to users by means of a visualized online EMG signal input as well as the centroids of the training samples, whose dimensionality is reduced to minimal number by dimension reduction. Clustering feedback provides a criterion that guides users to adjust motion gestures and muscle contraction forces intentionally. The experiment results have demonstrated that hand motion recognition accuracy increases steadily along the progress of the clustering-feedback-based user training, while conventional classifier-feedback methods, i.e., label feedback, hardly achieve any improvement. The result concludes that the use of proper classifier feedback can accelerate the process of user training, and implies prosperous future for the amputees with limited or no experience in pattern-recognition-based prosthetic device manipulation.It is evident that user training significantly affects performance of pattern-recognition-based myoelectric prosthetic device control. Despite plausible classification accuracy on offline datasets, online accuracy usually suffers from the changes in physiological conditions and electrode displacement. The user ability in generating consistent electromyographic (EMG) patterns can be enhanced via proper user training strategies in order to improve online performance. This study proposes a clustering-feedback strategy that provides real-time feedback to users by means of a visualized online EMG signal input as well as the centroids of the training samples, whose dimensionality is reduced to minimal number by dimension reduction. Clustering feedback provides a criterion that guides users to adjust motion gestures and muscle contraction forces intentionally. The experiment results have demonstrated that hand motion recognition accuracy increases steadily along the progress of the clustering-feedback-based user training, while conventional classifier-feedback methods, i.e., label feedback, hardly achieve any improvement. The result concludes that the use of proper classifier feedback can accelerate the process of user training, and implies prosperous future for the amputees with limited or no experience in pattern-recognition-based prosthetic device manipulation.
The distance function effect on k-nearest neighbor classification for medical datasets.
Hu, Li-Yu; Huang, Min-Wei; Ke, Shih-Wen; Tsai, Chih-Fong
2016-01-01
K-nearest neighbor (k-NN) classification is conventional non-parametric classifier, which has been used as the baseline classifier in many pattern classification problems. It is based on measuring the distances between the test data and each of the training data to decide the final classification output. Since the Euclidean distance function is the most widely used distance metric in k-NN, no study examines the classification performance of k-NN by different distance functions, especially for various medical domain problems. Therefore, the aim of this paper is to investigate whether the distance function can affect the k-NN performance over different medical datasets. Our experiments are based on three different types of medical datasets containing categorical, numerical, and mixed types of data and four different distance functions including Euclidean, cosine, Chi square, and Minkowsky are used during k-NN classification individually. The experimental results show that using the Chi square distance function is the best choice for the three different types of datasets. However, using the cosine and Euclidean (and Minkowsky) distance function perform the worst over the mixed type of datasets. In this paper, we demonstrate that the chosen distance function can affect the classification accuracy of the k-NN classifier. For the medical domain datasets including the categorical, numerical, and mixed types of data, K-NN based on the Chi square distance function performs the best.
Malof, Jordan M.; Mazurowski, Maciej A.; Tourassi, Georgia D.
2013-01-01
Case selection is a useful approach for increasing the efficiency and performance of case-based classifiers. Multiple techniques have been designed to perform case selection. This paper empirically investigates how class imbalance in the available set of training cases can impact the performance of the resulting classifier as well as properties of the selected set. In this study, the experiments are performed using a dataset for the problem of detecting breast masses in screening mammograms. The classification problem was binary and we used a k-nearest neighbor classifier. The classifier’s performance was evaluated using the Receiver Operating Characteristic (ROC) area under the curve (AUC) measure. The experimental results indicate that although class imbalance reduces the performance of the derived classifier and the effectiveness of selection at improving overall classifier performance, case selection can still be beneficial, regardless of the level of class imbalance. PMID:21820273
Schwenk
1998-11-15
We present a new classification architecture based on autoassociative neural networks that are used to learn discriminant models of each class. The proposed architecture has several interesting properties with respect to other model-based classifiers like nearest-neighbors or radial basis functions: it has a low computational complexity and uses a compact distributed representation of the models. The classifier is also well suited for the incorporation of a priori knowledge by means of a problem-specific distance measure. In particular, we will show that tangent distance (Simard, Le Cun, & Denker, 1993) can be used to achieve transformation invariance during learning and recognition. We demonstrate the application of this classifier to optical character recognition, where it has achieved state-of-the-art results on several reference databases. Relations to other models, in particular those based on principal component analysis, are also discussed.
Automatic tissue characterization from ultrasound imagery
NASA Astrophysics Data System (ADS)
Kadah, Yasser M.; Farag, Aly A.; Youssef, Abou-Bakr M.; Badawi, Ahmed M.
1993-08-01
In this work, feature extraction algorithms are proposed to extract the tissue characterization parameters from liver images. Then the resulting parameter set is further processed to obtain the minimum number of parameters representing the most discriminating pattern space for classification. This preprocessing step was applied to over 120 pathology-investigated cases to obtain the learning data for designing the classifier. The extracted features are divided into independent training and test sets and are used to construct both statistical and neural classifiers. The optimal criteria for these classifiers are set to have minimum error, ease of implementation and learning, and the flexibility for future modifications. Various algorithms for implementing various classification techniques are presented and tested on the data. The best performance was obtained using a single layer tensor model functional link network. Also, the voting k-nearest neighbor classifier provided comparably good diagnostic rates.
Automatic morphological classification of galaxy images
Shamir, Lior
2009-01-01
We describe an image analysis supervised learning algorithm that can automatically classify galaxy images. The algorithm is first trained using a manually classified images of elliptical, spiral, and edge-on galaxies. A large set of image features is extracted from each image, and the most informative features are selected using Fisher scores. Test images can then be classified using a simple Weighted Nearest Neighbor rule such that the Fisher scores are used as the feature weights. Experimental results show that galaxy images from Galaxy Zoo can be classified automatically to spiral, elliptical and edge-on galaxies with accuracy of ~90% compared to classifications carried out by the author. Full compilable source code of the algorithm is available for free download, and its general-purpose nature makes it suitable for other uses that involve automatic image analysis of celestial objects. PMID:20161594
Chaotic particle swarm optimization with mutation for classification.
Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza
2015-01-01
In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allows us to overcome the problem of early convergence into a local minima associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and it tunes the best possible solution. Furthermore, to remove the irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, mutation-based classifier particle swarm optimization, it is checked out with three sets of data classifications namely, Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, particle swarm-classifier, genetic algorithm, and Imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews's correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms.
Local classifier weighting by quadratic programming.
Cevikalp, Hakan; Polikar, Robi
2008-10-01
It has been widely accepted that the classification accuracy can be improved by combining outputs of multiple classifiers. However, how to combine multiple classifiers with various (potentially conflicting) decisions is still an open problem. A rich collection of classifier combination procedures -- many of which are heuristic in nature -- have been developed for this goal. In this brief, we describe a dynamic approach to combine classifiers that have expertise in different regions of the input space. To this end, we use local classifier accuracy estimates to weight classifier outputs. Specifically, we estimate local recognition accuracies of classifiers near a query sample by utilizing its nearest neighbors, and then use these estimates to find the best weights of classifiers to label the query. The problem is formulated as a convex quadratic optimization problem, which returns optimal nonnegative classifier weights with respect to the chosen objective function, and the weights ensure that locally most accurate classifiers are weighted more heavily for labeling the query sample. Experimental results on several data sets indicate that the proposed weighting scheme outperforms other popular classifier combination schemes, particularly on problems with complex decision boundaries. Hence, the results indicate that local classification-accuracy-based combination techniques are well suited for decision making when the classifiers are trained by focusing on different regions of the input space.
A Comparison of Rule-Based, K-Nearest Neighbor, and Neural Net Classifiers for Automated
Tai-Hoon Cho; Richard W. Conners; Philip A. Araman
1991-01-01
Over the last few years the authors have been involved in research aimed at developing a machine vision system for locating and identifying surface defects on materials. The particular problem being studied involves locating surface defects on hardwood lumber in a species independent manner. Obviously, the accurate location and identification of defects is of paramount...
Autonomous target recognition using remotely sensed surface vibration measurements
NASA Astrophysics Data System (ADS)
Geurts, James; Ruck, Dennis W.; Rogers, Steven K.; Oxley, Mark E.; Barr, Dallas N.
1993-09-01
The remotely measured surface vibration signatures of tactical military ground vehicles are investigated for use in target classification and identification friend or foe (IFF) systems. The use of remote surface vibration sensing by a laser radar reduces the effects of partial occlusion, concealment, and camouflage experienced by automatic target recognition systems using traditional imagery in a tactical battlefield environment. Linear Predictive Coding (LPC) efficiently represents the vibration signatures and nearest neighbor classifiers exploit the LPC feature set using a variety of distortion metrics. Nearest neighbor classifiers achieve an 88 percent classification rate in an eight class problem, representing a classification performance increase of thirty percent from previous efforts. A novel confidence figure of merit is implemented to attain a 100 percent classification rate with less than 60 percent rejection. The high classification rates are achieved on a target set which would pose significant problems to traditional image-based recognition systems. The targets are presented to the sensor in a variety of aspects and engine speeds at a range of 1 kilometer. The classification rates achieved demonstrate the benefits of using remote vibration measurement in a ground IFF system. The signature modeling and classification system can also be used to identify rotary and fixed-wing targets.
Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan
2014-01-01
Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in olden days. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip which stimulates the progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with the conventional classifiers like support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases which serve as a boon to the medical community. This work further reduces the misclassification of cancers which is highly not allowed in cancer detection.
NASA Astrophysics Data System (ADS)
Winder, Anthony J.; Siemonsen, Susanne; Flottmann, Fabian; Fiehler, Jens; Forkert, Nils D.
2017-03-01
Voxel-based tissue outcome prediction in acute ischemic stroke patients is highly relevant for both clinical routine and research. Previous research has shown that features extracted from baseline multi-parametric MRI datasets have a high predictive value and can be used for the training of classifiers, which can generate tissue outcome predictions for both intravenous and conservative treatments. However, with the recent advent and popularization of intra-arterial thrombectomy treatment, novel research specifically addressing the utility of predictive classi- fiers for thrombectomy intervention is necessary for a holistic understanding of current stroke treatment options. The aim of this work was to develop three clinically viable tissue outcome prediction models using approximate nearest-neighbor, generalized linear model, and random decision forest approaches and to evaluate the accuracy of predicting tissue outcome after intra-arterial treatment. Therefore, the three machine learning models were trained, evaluated, and compared using datasets of 42 acute ischemic stroke patients treated with intra-arterial thrombectomy. Classifier training utilized eight voxel-based features extracted from baseline MRI datasets and five global features. Evaluation of classifier-based predictions was performed via comparison to the known tissue outcome, which was determined in follow-up imaging, using the Dice coefficient and leave-on-patient-out cross validation. The random decision forest prediction model led to the best tissue outcome predictions with a mean Dice coefficient of 0.37. The approximate nearest-neighbor and generalized linear model performed equally suboptimally with average Dice coefficients of 0.28 and 0.27 respectively, suggesting that both non-linearity and machine learning are desirable properties of a classifier well-suited to the intra-arterial tissue outcome prediction problem.
2013-01-01
Background Identifying the emotional state is helpful in applications involving patients with autism and other intellectual disabilities; computer-based training, human computer interaction etc. Electrocardiogram (ECG) signals, being an activity of the autonomous nervous system (ANS), reflect the underlying true emotional state of a person. However, the performance of various methods developed so far lacks accuracy, and more robust methods need to be developed to identify the emotional pattern associated with ECG signals. Methods Emotional ECG data was obtained from sixty participants by inducing the six basic emotional states (happiness, sadness, fear, disgust, surprise and neutral) using audio-visual stimuli. The non-linear feature ‘Hurst’ was computed using Rescaled Range Statistics (RRS) and Finite Variance Scaling (FVS) methods. New Hurst features were proposed by combining the existing RRS and FVS methods with Higher Order Statistics (HOS). The features were then classified using four classifiers – Bayesian Classifier, Regression Tree, K- nearest neighbor and Fuzzy K-nearest neighbor. Seventy percent of the features were used for training and thirty percent for testing the algorithm. Results Analysis of Variance (ANOVA) conveyed that Hurst and the proposed features were statistically significant (p < 0.001). Hurst computed using RRS and FVS methods showed similar classification accuracy. The features obtained by combining FVS and HOS performed better with a maximum accuracy of 92.87% and 76.45% for classifying the six emotional states using random and subject independent validation respectively. Conclusions The results indicate that the combination of non-linear analysis and HOS tend to capture the finer emotional changes that can be seen in healthy ECG data. This work can be further fine tuned to develop a real time system. PMID:23680041
Yang, Jiaojiao; Guo, Qian; Li, Wenjie; Wang, Suhong; Zou, Ling
2016-04-01
This paper aims to assist the individual clinical diagnosis of children with attention-deficit/hyperactivity disorder using electroencephalogram signal detection method.Firstly,in our experiments,we obtained and studied the electroencephalogram signals from fourteen attention-deficit/hyperactivity disorder children and sixteen typically developing children during the classic interference control task of Simon-spatial Stroop,and we completed electroencephalogram data preprocessing including filtering,segmentation,removal of artifacts and so on.Secondly,we selected the subset electroencephalogram electrodes using principal component analysis(PCA)method,and we collected the common channels of the optimal electrodes which occurrence rates were more than 90%in each kind of stimulation.We then extracted the latency(200~450ms)mean amplitude features of the common electrodes.Finally,we used the k-nearest neighbor(KNN)classifier based on Euclidean distance and the support vector machine(SVM)classifier based on radial basis kernel function to classify.From the experiment,at the same kind of interference control task,the attention-deficit/hyperactivity disorder children showed lower correct response rates and longer reaction time.The N2 emerged in prefrontal cortex while P2 presented in the inferior parietal area when all kinds of stimuli demonstrated.Meanwhile,the children with attention-deficit/hyperactivity disorder exhibited markedly reduced N2 and P2amplitude compared to typically developing children.KNN resulted in better classification accuracy than SVM classifier,and the best classification rate was 89.29%in StI task.The results showed that the electroencephalogram signals were different in the brain regions of prefrontal cortex and inferior parietal cortex between attention-deficit/hyperactivity disorder and typically developing children during the interference control task,which provided a scientific basis for the clinical diagnosis of attention-deficit/hyperactivity disorder individuals.
An ultra low power feature extraction and classification system for wearable seizure detection.
Page, Adam; Pramod Tim Oates, Siddharth; Mohsenin, Tinoosh
2015-01-01
In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198 feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among five different classifiers that were explored, logistic regression (LR) proved to have minimum hardware complexity while providing average F-1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, power consumption, and the lowest latency when compared to the previous work.
Motion Control of Drives for Prosthetic Hand Using Continuous Myoelectric Signals
NASA Astrophysics Data System (ADS)
Purushothaman, Geethanjali; Ray, Kalyan Kumar
2016-03-01
In this paper the authors present motion control of a prosthetic hand, through continuous myoelectric signal acquisition, classification and actuation of the prosthetic drive. A four channel continuous electromyogram (EMG) signal also known as myoelectric signals (MES) are acquired from the abled-body to classify the six unique movements of hand and wrist, viz, hand open (HO), hand close (HC), wrist flexion (WF), wrist extension (WE), ulnar deviation (UD) and radial deviation (RD). The classification technique involves in extracting the features/pattern through statistical time domain (TD) parameter/autoregressive coefficients (AR), which are reduced using principal component analysis (PCA). The reduced statistical TD features and or AR coefficients are used to classify the signal patterns through k nearest neighbour (kNN) as well as neural network (NN) classifier and the performance of the classifiers are compared. Performance comparison of the above two classifiers clearly shows that kNN classifier in identifying the hidden intended motion in the myoelectric signals is better than that of NN classifier. Once the classifier identifies the intended motion, the signal is amplified to actuate the three low power DC motor to perform the above mentioned movements.
Determining the Number of Clusters in a Data Set Without Graphical Interpretation
NASA Technical Reports Server (NTRS)
Aguirre, Nathan S.; Davies, Misty D.
2011-01-01
Cluster analysis is a data mining technique that is meant ot simplify the process of classifying data points. The basic clustering process requires an input of data points and the number of clusters wanted. The clustering algorithm will then pick starting C points for the clusters, which can be either random spatial points or random data points. It then assigns each data point to the nearest C point where "nearest usually means Euclidean distance, but some algorithms use another criterion. The next step is determining whether the clustering arrangement this found is within a certain tolerance. If it falls within this tolerance, the process ends. Otherwise the C points are adjusted based on how many data points are in each cluster, and the steps repeat until the algorithm converges,
Huang, Qiongyu; Sauer, John R.; Swatantran, Anu; Dubayah, Ralph
2016-01-01
Drastic shifts in species distributions are a cause of concern for ecologists. Such shifts pose great threat to biodiversity especially under unprecedented anthropogenic and natural disturbances. Many studies have documented recent shifts in species distributions. However, most of these studies are limited to regional scales, and do not consider the abundance structure within species ranges. Developing methods to detect systematic changes in species distributions over their full ranges is critical for understanding the impact of changing environments and for successful conservation planning. Here, we demonstrate a centroid model for range-wide analysis of distribution shifts using the North American Breeding Bird Survey. The centroid model is based on a hierarchical Bayesian framework which models population change within physiographic strata while accounting for several factors affecting species detectability. Yearly abundance-weighted range centroids are estimated. As case studies, we derive annual centroids for the Carolina wren and house finch in their ranges in the U.S. We further evaluate the first-difference correlation between species’ centroid movement and changes in winter severity, total population abundance. We also examined associations of change in centroids from sub-ranges. Change in full-range centroid movements of Carolina wren significantly correlate with snow cover days (r = −0.58). For both species, the full-range centroid shifts also have strong correlation with total abundance (r = 0.65, and 0.51 respectively). The movements of the full-range centroids of the two species are correlated strongly (up to r = 0.76) with that of the sub-ranges with more drastic population changes. Our study demonstrates the usefulness of centroids for analyzing distribution changes in a two-dimensional spatial context. Particularly it highlights applications that associate the centroid with factors such as environmental stressors, population characteristics, and progression of invasive species. Routine monitoring of changes in centroid will provide useful insights into long-term avian responses to environmental changes.
Chaotic Particle Swarm Optimization with Mutation for Classification
Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza
2015-01-01
In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allows us to overcome the problem of early convergence into a local minima associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and it tunes the best possible solution. Furthermore, to remove the irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, mutation-based classifier particle swarm optimization, it is checked out with three sets of data classifications namely, Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, particle swarm-classifier, genetic algorithm, and Imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews's correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms. PMID:25709937
The food retail environment and area deprivation in Glasgow City, UK
Macdonald, Laura; Ellaway, Anne; Macintyre, Sally
2009-01-01
It has previously been suggested that deprived neighbourhoods within modern cities have poor access to general amenities, for example, fewer food retail outlets. Here we examine the distribution of food retailers by deprivation in the City of Glasgow, UK. We obtained a list of 934 food retailers in Glasgow, UK, in 2007, and mapped these at address level. We categorised small areas (data zones) into quintiles of area deprivation using the 2006 Scottish Index of Multiple Deprivation Income sub-domain score. We computed mean number of retailers per 1000 residents per data zone, and mean network distance to nearest outlet from data zone centroid, for all retailers combined and for each of seven categories of retailer separately (i.e. bakers, butchers, fruit and vegetable sellers, fishmongers, convenience stores, supermarkets and delicatessens). The most deprived quintile (of areas) had the greatest mean number of total food retailers per 1000 residents while quintile 1 (least deprived) had the least, and this difference was statistically significant (Chi-square p < 0.01). The closest mean distance to the nearest food retailer was within quintile 3 while the furthest distance was within quintile 1, and this was also statistically significant (Chi-square p < 0.01). There was variation in the distribution of the seven different types of food retailers, and access to amenities depended upon the type of food retailer studied and whether proximity or density was measured. Overall the findings suggested that deprived neighbourhoods within the City of Glasgow did not necessarily have fewer food retail outlets. PMID:19660114
The food retail environment and area deprivation in Glasgow City, UK.
Macdonald, Laura; Ellaway, Anne; Macintyre, Sally
2009-08-06
It has previously been suggested that deprived neighbourhoods within modern cities have poor access to general amenities, for example, fewer food retail outlets. Here we examine the distribution of food retailers by deprivation in the City of Glasgow, UK.We obtained a list of 934 food retailers in Glasgow, UK, in 2007, and mapped these at address level. We categorised small areas (data zones) into quintiles of area deprivation using the 2006 Scottish Index of Multiple Deprivation Income sub-domain score. We computed mean number of retailers per 1000 residents per data zone, and mean network distance to nearest outlet from data zone centroid, for all retailers combined and for each of seven categories of retailer separately (i.e. bakers, butchers, fruit and vegetable sellers, fishmongers, convenience stores, supermarkets and delicatessens).The most deprived quintile (of areas) had the greatest mean number of total food retailers per 1000 residents while quintile 1 (least deprived) had the least, and this difference was statistically significant (Chi-square p < 0.01). The closest mean distance to the nearest food retailer was within quintile 3 while the furthest distance was within quintile 1, and this was also statistically significant (Chi-square p < 0.01). There was variation in the distribution of the seven different types of food retailers, and access to amenities depended upon the type of food retailer studied and whether proximity or density was measured. Overall the findings suggested that deprived neighbourhoods within the City of Glasgow did not necessarily have fewer food retail outlets.
Ronald E. McRoberts; Mark D. Nelson; Daniel G. Wendt
2002-01-01
For two large study areas in Minnesota, USA, stratified estimation using classified Landsat Thematic Mapper satellite imagery as the basis for stratification was used to estimate forest area. Measurements of forest inventory plots obtained for a 12-month period in 1998 and 1999 were used as the source of data for within-stratum estimates. These measurements further...
Portable Language-Independent Adaptive Translation from OCR. Phase 1
2009-04-01
including brute-force k-Nearest Neighbors ( kNN ), fast approximate kNN using hashed k-d trees, classification and regression trees, and locality...achieved by refinements in ground-truthing protocols. Recent algorithmic improvements to our approximate kNN classifier using hashed k-D trees allows...recent years discriminative training has been shown to outperform phonetic HMMs estimated using ML for speech recognition. Standard ML estimation
Diaz, Naryttza N; Krause, Lutz; Goesmann, Alexander; Niehaus, Karsten; Nattkemper, Tim W
2009-01-01
Background Metagenomics, or the sequencing and analysis of collective genomes (metagenomes) of microorganisms isolated from an environment, promises direct access to the "unculturable majority". This emerging field offers the potential to lay solid basis on our understanding of the entire living world. However, the taxonomic classification is an essential task in the analysis of metagenomics data sets that it is still far from being solved. We present a novel strategy to predict the taxonomic origin of environmental genomic fragments. The proposed classifier combines the idea of the k-nearest neighbor with strategies from kernel-based learning. Results Our novel strategy was extensively evaluated using the leave-one-out cross validation strategy on fragments of variable length (800 bp – 50 Kbp) from 373 completely sequenced genomes. TACOA is able to classify genomic fragments of length 800 bp and 1 Kbp with high accuracy until rank class. For longer fragments ≥ 3 Kbp accurate predictions are made at even deeper taxonomic ranks (order and genus). Remarkably, TACOA also produces reliable results when the taxonomic origin of a fragment is not represented in the reference set, thus classifying such fragments to its known broader taxonomic class or simply as "unknown". We compared the classification accuracy of TACOA with the latest intrinsic classifier PhyloPythia using 63 recently published complete genomes. For fragments of length 800 bp and 1 Kbp the overall accuracy of TACOA is higher than that obtained by PhyloPythia at all taxonomic ranks. For all fragment lengths, both methods achieved comparable high specificity results up to rank class and low false negative rates are also obtained. Conclusion An accurate multi-class taxonomic classifier was developed for environmental genomic fragments. TACOA can predict with high reliability the taxonomic origin of genomic fragments as short as 800 bp. The proposed method is transparent, fast, accurate and the reference set can be easily updated as newly sequenced genomes become available. Moreover, the method demonstrated to be competitive when compared to the most current classifier PhyloPythia and has the advantage that it can be locally installed and the reference set can be kept up-to-date. PMID:19210774
Aeroelastically coupled blades for vertical axis wind turbines
Paquette, Joshua; Barone, Matthew F.
2016-02-23
Various technologies described herein pertain to a vertical axis wind turbine blade configured to rotate about a rotation axis. The vertical axis wind turbine blade includes at least an attachment segment, a rear swept segment, and optionally, a forward swept segment. The attachment segment is contiguous with the forward swept segment, and the forward swept segment is contiguous with the rear swept segment. The attachment segment includes a first portion of a centroid axis, the forward swept segment includes a second portion of the centroid axis, and the rear swept segment includes a third portion of the centroid axis. The second portion of the centroid axis is angularly displaced ahead of the first portion of the centroid axis and the third portion of the centroid axis is angularly displaced behind the first portion of the centroid axis in the direction of rotation about the rotation axis.
A Doppler centroid estimation algorithm for SAR systems optimized for the quasi-homogeneous source
NASA Technical Reports Server (NTRS)
Jin, Michael Y.
1989-01-01
Radar signal processing applications frequently require an estimate of the Doppler centroid of a received signal. The Doppler centroid estimate is required for synthetic aperture radar (SAR) processing. It is also required for some applications involving target motion estimation and antenna pointing direction estimation. In some cases, the Doppler centroid can be accurately estimated based on available information regarding the terrain topography, the relative motion between the sensor and the terrain, and the antenna pointing direction. Often, the accuracy of the Doppler centroid estimate can be improved by analyzing the characteristics of the received SAR signal. This kind of signal processing is also referred to as clutterlock processing. A Doppler centroid estimation (DCE) algorithm is described which contains a linear estimator optimized for the type of terrain surface that can be modeled by a quasi-homogeneous source (QHS). Information on the following topics is presented: (1) an introduction to the theory of Doppler centroid estimation; (2) analysis of the performance characteristics of previously reported DCE algorithms; (3) comparison of these analysis results with experimental results; (4) a description and performance analysis of a Doppler centroid estimator which is optimized for a QHS; and (5) comparison of the performance of the optimal QHS Doppler centroid estimator with that of previously reported methods.
NASA Astrophysics Data System (ADS)
Wang, D.; Becker, N. C.; Weinstein, S.; Duputel, Z.; Rivera, L. A.; Hayes, G. P.; Hirshorn, B. F.; Bouchard, R. H.; Mungov, G.
2017-12-01
The Pacific Tsunami Warning Center (PTWC) began forecasting tsunamis in real-time using source parameters derived from real-time Centroid Moment Tensor (CMT) solutions in 2009. Both the USGS and PTWC typically obtain W-Phase CMT solutions for large earthquakes less than 30 minutes after earthquake origin time. Within seconds, and often before waves reach the nearest deep ocean bottom pressure sensor (DARTs), PTWC then generates a regional tsunami propagation forecast using its linear shallow water model. The model is initialized by the sea surface deformation that mimics the seafloor deformation based on Okada's (1985) dislocation model of a rectangular fault with a uniform slip. The fault length and width are empirical functions of the seismic moment. How well did this simple model perform? The DART records provide a very valuable dataset for model validation. We examine tsunami events of the last decade with earthquake magnitudes ranging from 6.5 to 9.0 including some deep events for which tsunamis were not expected. Most of the forecast results were obtained during the events. We also include events from before the implementation of the WCMT method at USGS and PTWC, 2006-2009. For these events, WCMTs were computed retrospectively (Duputel et al. 2012). We also re-ran the model with a larger domain for some events to include far-field DARTs that recorded a tsunami with identical source parameters used during the events. We conclude that our model results, in terms of maximum wave amplitude, are mostly within a factor of two of the observed at DART stations, with an average error of less than 40% for most events, including the 2010 Maule and the 2011 Tohoku tsunamis. However, the simple fault model with a uniform slip is too simplistic for the Tohoku tsunami. We note model results are sensitive to centroid location and depth, especially if the earthquake is close to land or inland. For the 2016 M7.8 New Zealand earthquake the initial forecast underestimated the tsunami because the initial WCMT centroid was on land (the epicenter was inland but most of the slips occurred offshore). Later WCMTs did provide better forecast. The model also failed to reproduce the observed tsunamis from earthquake-generated landslides. Sea level observations during the events are crucial in determining whether or not a forecast needs to be adjusted.
NASA Astrophysics Data System (ADS)
Wang, Dong
2016-03-01
Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent any unexpected gear failure because gear cracks lead to gear tooth breakage. Signal processing based methods mainly require expertize to explain gear fault signatures which is usually not easy to be achieved by ordinary users. In order to automatically identify different gear crack levels, intelligent gear crack identification methods should be developed. The previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance prediction accuracies of existing K-nearest neighbors based methods and extend identification of 3 different gear crack levels to identification of 5 different gear crack levels, redundant statistical features are constructed by using Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction of redundant statistical features is conducted to obtain new significant statistical features. At last, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments is investigated to demonstrate that the developed method provides higher prediction accuracies than the existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads. Based on the new significant statistical features, some other popular statistical models including linear discriminant analysis, quadratic discriminant analysis, classification and regression tree and naive Bayes classifier, are compared with the developed method. The results show that the developed method has the highest prediction accuracies among these statistical models. Additionally, selection of the number of new significant features and parameter selection of K-nearest neighbors are thoroughly investigated.
CCD centroiding analysis for Nano-JASMINE observation data
NASA Astrophysics Data System (ADS)
Niwa, Yoshito; Yano, Taihei; Araki, Hiroshi; Gouda, Naoteru; Kobayashi, Yukiyasu; Yamada, Yoshiyuki; Tazawa, Seiichi; Hanada, Hideo
2010-07-01
Nano-JASMINE is a very small satellite mission for global space astrometry with milli-arcsecond accuracy, which will be launched in 2011. In this mission, centroids of stars in CCD image frames are estimated with sub-pixel accuracy. In order to realize such a high precision centroiding an algorithm utilizing a least square method is employed. One of the advantages is that centroids can be calculated without explicit assumption of the point spread functions of stars. CCD centroiding experiment has been performed to investigate whether this data analysis is available, and centroids of artificial star images on a CCD are determined with a precision of less than 0.001 pixel. This result indicates parallaxes of stars within 300 pc from Sun can be observed in Nano-JASMINE.
NASA Astrophysics Data System (ADS)
Alqasemi, Umar; Kumavor, Patrick; Aguirre, Andres; Zhu, Quing
2012-12-01
Unique features and the underlining hypotheses of how these features may relate to the tumor physiology in coregistered ultrasound and photoacoustic images of ex vivo ovarian tissue are introduced. The images were first compressed with wavelet transform. The mean Radon transform of photoacoustic images was then computed and fitted with a Gaussian function to find the centroid of a suspicious area for shift-invariant recognition process. Twenty-four features were extracted from a training set by several methods, including Fourier transform, image statistics, and different composite filters. The features were chosen from more than 400 training images obtained from 33 ex vivo ovaries of 24 patients, and used to train three classifiers, including generalized linear model, neural network, and support vector machine (SVM). The SVM achieved the best training performance and was able to exclusively separate cancerous from non-cancerous cases with 100% sensitivity and specificity. At the end, the classifiers were used to test 95 new images obtained from 37 ovaries of 20 additional patients. The SVM classifier achieved 76.92% sensitivity and 95.12% specificity. Furthermore, if we assume that recognizing one image as a cancer is sufficient to consider an ovary as malignant, the SVM classifier achieves 100% sensitivity and 87.88% specificity.
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Sharkey, Joseph R; Horel, Scott; Dean, Wesley R
2010-05-25
There has been limited study of all types of food stores, such as traditional (supercenters, supermarkets, and grocery stores), convenience stores, and non-traditional (dollar stores, mass merchandisers, and pharmacies) as potential opportunities for purchase of fresh and processed (canned and frozen) fruits and vegetables, especially in small-town or rural areas. Data from the Brazos Valley Food Environment Project (BVFEP) are combined with 2000 U.S. Census data for 101 Census block groups (CBG) to examine neighborhood access to fruits and vegetables. BVFEP data included identification and geocoding of all food stores (n = 185) in six rural counties in Texas, using ground-truthed methods and on-site assessment of the availability and variety of fresh and processed fruits and vegetables in all food stores. Access from the population-weighted centroid of each CBG was measured using proximity (minimum network distance) and coverage (number of shopping opportunities) for a good selection of fresh and processed fruits and vegetables. Neighborhood inequalities (deprivation and vehicle ownership) and spatial access for fruits and vegetables were examined using Wilcoxon matched-pairs signed-rank test and multivariate regression models. The variety of fruits or vegetables was greater at supermarkets compared with grocery stores. Among non-traditional and convenience food stores, the largest variety was found at dollar stores. On average, rural neighborhoods were 9.9 miles to the nearest supermarket, 6.7 miles and 7.4 miles to the nearest food store with a good variety of fresh fruits and vegetables, respectively, and 4.7 miles and 4.5 miles to a good variety of fresh and processed fruits or vegetables. High deprivation or low vehicle ownership neighborhoods had better spatial access to a good variety of fruits and vegetables, both in the distance to the nearest source and in the number of shopping opportunities. Supermarkets and grocery stores are no longer the only shopping opportunities for fruits or vegetables. The inclusion of data on availability of fresh or processed fruits or vegetables in the measurements provides robust meaning to the concept of potential access in this large rural area.
2010-01-01
Objective There has been limited study of all types of food stores, such as traditional (supercenters, supermarkets, and grocery stores), convenience stores, and non-traditional (dollar stores, mass merchandisers, and pharmacies) as potential opportunities for purchase of fresh and processed (canned and frozen) fruits and vegetables, especially in small-town or rural areas. Methods Data from the Brazos Valley Food Environment Project (BVFEP) are combined with 2000 U.S. Census data for 101 Census block groups (CBG) to examine neighborhood access to fruits and vegetables. BVFEP data included identification and geocoding of all food stores (n = 185) in six rural counties in Texas, using ground-truthed methods and on-site assessment of the availability and variety of fresh and processed fruits and vegetables in all food stores. Access from the population-weighted centroid of each CBG was measured using proximity (minimum network distance) and coverage (number of shopping opportunities) for a good selection of fresh and processed fruits and vegetables. Neighborhood inequalities (deprivation and vehicle ownership) and spatial access for fruits and vegetables were examined using Wilcoxon matched-pairs signed-rank test and multivariate regression models. Results The variety of fruits or vegetables was greater at supermarkets compared with grocery stores. Among non-traditional and convenience food stores, the largest variety was found at dollar stores. On average, rural neighborhoods were 9.9 miles to the nearest supermarket, 6.7 miles and 7.4 miles to the nearest food store with a good variety of fresh fruits and vegetables, respectively, and 4.7 miles and 4.5 miles to a good variety of fresh and processed fruits or vegetables. High deprivation or low vehicle ownership neighborhoods had better spatial access to a good variety of fruits and vegetables, both in the distance to the nearest source and in the number of shopping opportunities. Conclusion Supermarkets and grocery stores are no longer the only shopping opportunities for fruits or vegetables. The inclusion of data on availability of fresh or processed fruits or vegetables in the measurements provides robust meaning to the concept of potential access in this large rural area. PMID:20500853
Land cover map for map zones 8 and 9 developed from SAGEMAP, GNN, and SWReGAP: a pilot for NWGAP
James S. Kagan; Janet L. Ohmann; Matthew Gregory; Claudine Tobalske
2008-01-01
As part of the Northwest Gap Analysis Project, land cover maps were generated for most of eastern Washington and eastern Oregon. The maps were derived from regional SAGEMAP and SWReGAP data sets using decision tree classifiers for nonforest areas, and Gradient Nearest Neighbor imputation modeling for forests and woodlands. The maps integrate data from regional...
Natural Language Processing Based Instrument for Classification of Free Text Medical Records
2016-01-01
According to the Ministry of Labor, Health and Social Affairs of Georgia a new health management system has to be introduced in the nearest future. In this context arises the problem of structuring and classifying documents containing all the history of medical services provided. The present work introduces the instrument for classification of medical records based on the Georgian language. It is the first attempt of such classification of the Georgian language based medical records. On the whole 24.855 examination records have been studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results obtained demonstrated that both machine learning methods performed successfully, with a little supremacy of SVM. In the process of classification a “shrink” method, based on features selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, on the second stage of classification into subclasses 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful. PMID:27668260
Tilbury, Karissa; Hocker, James; Wen, Bruce L.; Sandbo, Nathan; Singh, Vikas; Campagnola, Paul J.
2014-01-01
Abstract. Patients with idiopathic fibrosis (IPF) have poor long-term survival as there are limited diagnostic/prognostic tools or successful therapies. Remodeling of the extracellular matrix (ECM) has been implicated in IPF progression; however, the structural consequences on the collagen architecture have not received considerable attention. Here, we demonstrate that second harmonic generation (SHG) and multiphoton fluorescence microscopy can quantitatively differentiate normal and IPF human tissues. For SHG analysis, we developed a classifier based on wavelet transforms, principle component analysis, and a K-nearest-neighbor algorithm to classify the specific alterations of the collagen structure observed in IPF tissues. The resulting ROC curves obtained by varying the numbers of principal components and nearest neighbors yielded accuracies of >95%. In contrast, simpler metrics based on SHG intensity and collagen coverage in the image provided little or no discrimination. We also characterized the change in the elastin/collagen balance by simultaneously measuring the elastin autofluorescence and SHG intensities and found that the IPF tissues were less elastic relative to collagen. This is consistent with known mechanical consequences of the disease. Understanding ECM remodeling in IPF via nonlinear optical microscopy may enhance our ability to differentiate patients with rapid and slow progression and, thus, provide better prognostic information. PMID:25134793
Shrinkage simplex-centroid designs for a quadratic mixture model
NASA Astrophysics Data System (ADS)
Hasan, Taha; Ali, Sajid; Ahmed, Munir
2018-03-01
A simplex-centroid design for q mixture components comprises of all possible subsets of the q components, present in equal proportions. The design does not contain full mixture blends except the overall centroid. In real-life situations, all mixture blends comprise of at least a minimum proportion of each component. Here, we introduce simplex-centroid designs which contain complete blends but with some loss in D-efficiency and stability in G-efficiency. We call such designs as shrinkage simplex-centroid designs. Furthermore, we use the proposed designs to generate component-amount designs by their projection.
Exploiting Language Models to Classify Events from Twitter
Vo, Duc-Thuan; Hai, Vo Thuan; Ock, Cheol-Young
2015-01-01
Classifying events is challenging in Twitter because tweets texts have a large amount of temporal data with a lot of noise and various kinds of topics. In this paper, we propose a method to classify events from Twitter. We firstly find the distinguishing terms between tweets in events and measure their similarities with learning language models such as ConceptNet and a latent Dirichlet allocation method for selectional preferences (LDA-SP), which have been widely studied based on large text corpora within computational linguistic relations. The relationship of term words in tweets will be discovered by checking them under each model. We then proposed a method to compute the similarity between tweets based on tweets' features including common term words and relationships among their distinguishing term words. It will be explicit and convenient for applying to k-nearest neighbor techniques for classification. We carefully applied experiments on the Edinburgh Twitter Corpus to show that our method achieves competitive results for classifying events. PMID:26451139
Efficacy Evaluation of Different Wavelet Feature Extraction Methods on Brain MRI Tumor Detection
NASA Astrophysics Data System (ADS)
Nabizadeh, Nooshin; John, Nigel; Kubat, Miroslav
2014-03-01
Automated Magnetic Resonance Imaging brain tumor detection and segmentation is a challenging task. Among different available methods, feature-based methods are very dominant. While many feature extraction techniques have been employed, it is still not quite clear which of feature extraction methods should be preferred. To help improve the situation, we present the results of a study in which we evaluate the efficiency of using different wavelet transform features extraction methods in brain MRI abnormality detection. Applying T1-weighted brain image, Discrete Wavelet Transform (DWT), Discrete Wavelet Packet Transform (DWPT), Dual Tree Complex Wavelet Transform (DTCWT), and Complex Morlet Wavelet Transform (CMWT) methods are applied to construct the feature pool. Three various classifiers as Support Vector Machine, K Nearest Neighborhood, and Sparse Representation-Based Classifier are applied and compared for classifying the selected features. The results show that DTCWT and CMWT features classified with SVM, result in the highest classification accuracy, proving of capability of wavelet transform features to be informative in this application.
Centroid tracker and aimpoint selection
NASA Astrophysics Data System (ADS)
Venkateswarlu, Ronda; Sujata, K. V.; Venkateswara Rao, B.
1992-11-01
Autonomous fire and forget weapons have gained importance to achieve accurate first pass kill by hitting the target at an appropriate aim point. Centroid of the image presented by a target in the field of view (FOV) of a sensor is generally accepted as the aimpoint for these weapons. Centroid trackers are applicable only when the target image is of significant size in the FOV of the sensor but does not overflow the FOV. But as the range between the sensor and the target decreases the image of the target will grow and finally overflow the FOV at close ranges and the centroid point on the target will keep on changing which is not desirable. And also centroid need not be the most desired/vulnerable point on the target. For hardened targets like tanks, proper aimpoint selection and guidance up to almost zero range is essential to achieve maximum kill probability. This paper presents a centroid tracker realization. As centroid offers a stable tracking point, it can be used as a reference to select the proper aimpoint. The centroid and the desired aimpoint are simultaneously tracked to avoid jamming by flares and also to take care of the problems arising due to image overflow. Thresholding of gray level image to binary image is a crucial step in centroid tracker. Different thresholding algorithms are discussed and a suitable algorithm is chosen. The real-time hardware implementation of centroid tracker with a suitable thresholding technique is presented including the interfacing to a multimode tracker for autonomous target tracking and aimpoint selection. The hardware uses very high speed arithmetic and programmable logic devices to meet the speed requirement and a microprocessor based subsystem for the system control. The tracker has been evaluated in a field environment.
Yin, Xiaoming; Li, Xiang; Zhao, Liping; Fang, Zhongping
2009-11-10
A Shack-Hartmann wavefront sensor (SWHS) splits the incident wavefront into many subsections and transfers the distorted wavefront detection into the centroid measurement. The accuracy of the centroid measurement determines the accuracy of the SWHS. Many methods have been presented to improve the accuracy of the wavefront centroid measurement. However, most of these methods are discussed from the point of view of optics, based on the assumption that the spot intensity of the SHWS has a Gaussian distribution, which is not applicable to the digital SHWS. In this paper, we present a centroid measurement algorithm based on the adaptive thresholding and dynamic windowing method by utilizing image processing techniques for practical application of the digital SHWS in surface profile measurement. The method can detect the centroid of each focal spot precisely and robustly by eliminating the influence of various noises, such as diffraction of the digital SHWS, unevenness and instability of the light source, as well as deviation between the centroid of the focal spot and the center of the detection area. The experimental results demonstrate that the algorithm has better precision, repeatability, and stability compared with other commonly used centroid methods, such as the statistical averaging, thresholding, and windowing algorithms.
Machine learning approach to automatic exudate detection in retinal images from diabetic patients
NASA Astrophysics Data System (ADS)
Sopharak, Akara; Dailey, Matthew N.; Uyyanonvara, Bunyarit; Barman, Sarah; Williamson, Tom; Thet Nwe, Khine; Aye Moe, Yin
2010-01-01
Exudates are among the preliminary signs of diabetic retinopathy, a major cause of vision loss in diabetic patients. Early detection of exudates could improve patients' chances to avoid blindness. In this paper, we present a series of experiments on feature selection and exudates classification using naive Bayes and support vector machine (SVM) classifiers. We first fit the naive Bayes model to a training set consisting of 15 features extracted from each of 115,867 positive examples of exudate pixels and an equal number of negative examples. We then perform feature selection on the naive Bayes model, repeatedly removing features from the classifier, one by one, until classification performance stops improving. To find the best SVM, we begin with the best feature set from the naive Bayes classifier, and repeatedly add the previously-removed features to the classifier. For each combination of features, we perform a grid search to determine the best combination of hyperparameters ν (tolerance for training errors) and γ (radial basis function width). We compare the best naive Bayes and SVM classifiers to a baseline nearest neighbour (NN) classifier using the best feature sets from both classifiers. We find that the naive Bayes and SVM classifiers perform better than the NN classifier. The overall best sensitivity, specificity, precision, and accuracy are 92.28%, 98.52%, 53.05%, and 98.41%, respectively.
Directional phytoscreening: contaminant gradients in trees for plume delineation.
Limmer, Matt A; Shetty, Mikhil K; Markus, Samantha; Kroeker, Ryan; Parker, Beth L; Martinez, Camilo; Burken, Joel G
2013-08-20
Tree sampling methods have been used in phytoscreening applications to delineate contaminated soil and groundwater, augmenting traditional investigative methods that are time-consuming, resource-intensive, invasive, and costly. In the past decade, contaminant concentrations in tree tissues have been shown to reflect the extent and intensity of subsurface contamination. This paper investigates a new phytoscreening tool: directional tree coring, a concept originating from field data that indicated azimuthal concentrations in tree trunks reflected the concentration gradients in the groundwater around the tree. To experimentally test this hypothesis, large diameter trees were subjected to subsurface contaminant concentration gradients in a greenhouse study. These trees were then analyzed for azimuthal concentration gradients in aboveground tree tissues, revealing contaminant centroids located on the side of the tree nearest the most contaminated groundwater. Tree coring at three field sites revealed sufficiently steep contaminant gradients in trees reflected nearby groundwater contaminant gradients. In practice, trees possessing steep contaminant gradients are indicators of steep subsurface contaminant gradients, providing compass-like information about the contaminant gradient, pointing investigators toward higher concentration regions of the plume.
Takahashi, Hiro; Kobayashi, Takeshi; Honda, Hiroyuki
2005-01-15
For establishing prognostic predictors of various diseases using DNA microarray analysis technology, it is desired to find selectively significant genes for constructing the prognostic model and it is also necessary to eliminate non-specific genes or genes with error before constructing the model. We applied projective adaptive resonance theory (PART) to gene screening for DNA microarray data. Genes selected by PART were subjected to our FNN-SWEEP modeling method for the construction of a cancer class prediction model. The model performance was evaluated through comparison with a conventional screening signal-to-noise (S2N) method or nearest shrunken centroids (NSC) method. The FNN-SWEEP predictor with PART screening could discriminate classes of acute leukemia in blinded data with 97.1% accuracy and classes of lung cancer with 90.0% accuracy, while the predictor with S2N was only 85.3 and 70.0% or the predictor with NSC was 88.2 and 90.0%, respectively. The results have proven that PART was superior for gene screening. The software is available upon request from the authors. honda@nubio.nagoya-u.ac.jp
Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J
2017-01-01
Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces ‘big data’ exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics. PMID:27139154
Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J
2017-10-01
Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces 'big data' exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics.
Sharkey, Joseph R; Johnson, Cassandra M; Dean, Wesley R; Horel, Scott A
2011-01-25
Individuals and families are relying more on food prepared outside the home as a source for at-home and away-from-home consumption. Restricting the estimation of fast-food access to fast-food restaurants alone may underestimate potential spatial access to fast food. The study used data from the 2006 Brazos Valley Food Environment Project (BVFEP) and the 2000 U.S. Census Summary File 3 for six rural counties in the Texas Brazos Valley region. BVFEP ground-truthed data included identification and geocoding of all fast-food restaurants, convenience stores, supermarkets, and grocery stores in study area and on-site assessment of the availability and variety of fast-food lunch/dinner entrées and side dishes. Network distance was calculated from the population-weighted centroid of each census block group to all retail locations that marketed fast food (n = 205 fast-food opportunities). Spatial access to fast-food opportunities (FFO) was significantly better than to traditional fast-food restaurants (FFR). The median distance to the nearest FFO was 2.7 miles, compared with 4.5 miles to the nearest FFR. Residents of high deprivation neighborhoods had better spatial access to a variety of healthier fast-food entrée and side dish options than residents of low deprivation neighborhoods. Our analyses revealed that identifying fast-food restaurants as the sole source of fast-food entrées and side dishes underestimated neighborhood exposure to fast food, in terms of both neighborhood proximity and coverage. Potential interventions must consider all retail opportunities for fast food, and not just traditional FFR.
2011-01-01
Background Individuals and families are relying more on food prepared outside the home as a source for at-home and away-from-home consumption. Restricting the estimation of fast-food access to fast-food restaurants alone may underestimate potential spatial access to fast food. Methods The study used data from the 2006 Brazos Valley Food Environment Project (BVFEP) and the 2000 U.S. Census Summary File 3 for six rural counties in the Texas Brazos Valley region. BVFEP ground-truthed data included identification and geocoding of all fast-food restaurants, convenience stores, supermarkets, and grocery stores in study area and on-site assessment of the availability and variety of fast-food lunch/dinner entrées and side dishes. Network distance was calculated from the population-weighted centroid of each census block group to all retail locations that marketed fast food (n = 205 fast-food opportunities). Results Spatial access to fast-food opportunities (FFO) was significantly better than to traditional fast-food restaurants (FFR). The median distance to the nearest FFO was 2.7 miles, compared with 4.5 miles to the nearest FFR. Residents of high deprivation neighborhoods had better spatial access to a variety of healthier fast-food entrée and side dish options than residents of low deprivation neighborhoods. Conclusions Our analyses revealed that identifying fast-food restaurants as the sole source of fast-food entrées and side dishes underestimated neighborhood exposure to fast food, in terms of both neighborhood proximity and coverage. Potential interventions must consider all retail opportunities for fast food, and not just traditional FFR. PMID:21266055
Travel by public transit to mammography facilities in 6 US urban areas.
Graham, S; Lewis, B; Flanagan, B; Watson, M; Peipins, L
2015-12-01
We examined lack of private vehicle access and 30 minutes or longer public transportation travel time to mammography facilities for women 40 years of age or older in the urban areas of Boston, Philadelphia, San Antonio, San Diego, Denver, and Seattle to identify transit marginalized populations - women for whom these travel characteristics may jointly present a barrier to clinic access. This ecological study used sex and race/ethnicity data from the 2010 US Census and household vehicle availability data from the American Community Survey 2008-2012, all at Census tract level. Using the public transportation option on Google Trip Planner we obtained the travel time from the centroid of each census tract to all local mammography facilities to determine the nearest mammography facility in each urban area. Median travel times by public transportation to the nearest facility for women with no household access to a private vehicle were obtained by ranking travel time by population group across all U.S. census tracts in each urban area and across the entire study area. The overall median travel times for each urban area for women without household access to a private vehicle ranged from a low of 15 minutes in Boston and Philadelphia to 27 minutes in San Diego. The numbers and percentages of transit marginalized women were then calculated for all urban areas by population group. While black women were less likely to have private vehicle access, and both Hispanic and black women were more likely to be transit marginalized, this outcome varied by urban area. White women constituted the largest number of transit marginalized. Our results indicate that mammography facilities are favorably located for the large majority of women, although there are still substantial numbers for whom travel may likely present a barrier to mammography facility access.
Centroid of a Polygon--Three Views.
ERIC Educational Resources Information Center
Shilgalis, Thomas W.; Benson, Carol T.
2001-01-01
Investigates the idea of the center of mass of a polygon and illustrates centroids of polygons. Connects physics, mathematics, and technology to produces results that serve to generalize the notion of centroid to polygons other than triangles. (KHR)
A Cultural Resources Inventory of the John Martin Reservoir, Colorado.
1982-08-31
Service provides very useful infor- thesis 1.4. The Nearest Neighbor analysis with mation as to potential native plants and animals Z-coordinate cluster...expected relationships between Our analysis is far less specific about the base camps and many internally different artifact staple plant resources. The...artifacts were which had one of the lowest mean annual plant classified for the analysis in terms of their attri- production ratings. Twenty-nine
Adaptive fuzzy system for 3-D vision
NASA Technical Reports Server (NTRS)
Mitra, Sunanda
1993-01-01
An adaptive fuzzy system using the concept of the Adaptive Resonance Theory (ART) type neural network architecture and incorporating fuzzy c-means (FCM) system equations for reclassification of cluster centers was developed. The Adaptive Fuzzy Leader Clustering (AFLC) architecture is a hybrid neural-fuzzy system which learns on-line in a stable and efficient manner. The system uses a control structure similar to that found in the Adaptive Resonance Theory (ART-1) network to identify the cluster centers initially. The initial classification of an input takes place in a two stage process; a simple competitive stage and a distance metric comparison stage. The cluster prototypes are then incrementally updated by relocating the centroid positions from Fuzzy c-Means (FCM) system equations for the centroids and the membership values. The operational characteristics of AFLC and the critical parameters involved in its operation are discussed. The performance of the AFLC algorithm is presented through application of the algorithm to the Anderson Iris data, and laser-luminescent fingerprint image data. The AFLC algorithm successfully classifies features extracted from real data, discrete or continuous, indicating the potential strength of this new clustering algorithm in analyzing complex data sets. The hybrid neuro-fuzzy AFLC algorithm will enhance analysis of a number of difficult recognition and control problems involved with Tethered Satellite Systems and on-orbit space shuttle attitude controller.
Classifying epileptic EEG signals with delay permutation entropy and Multi-Scale K-means.
Zhu, Guohun; Li, Yan; Wen, Peng Paul; Wang, Shuaifang
2015-01-01
Most epileptic EEG classification algorithms are supervised and require large training datasets, that hinder their use in real time applications. This chapter proposes an unsupervised Multi-Scale K-means (MSK-means) MSK-means algorithm to distinguish epileptic EEG signals and identify epileptic zones. The random initialization of the K-means algorithm can lead to wrong clusters. Based on the characteristics of EEGs, the MSK-means MSK-means algorithm initializes the coarse-scale centroid of a cluster with a suitable scale factor. In this chapter, the MSK-means algorithm is proved theoretically superior to the K-means algorithm on efficiency. In addition, three classifiers: the K-means, MSK-means MSK-means and support vector machine (SVM), are used to identify seizure and localize epileptogenic zone using delay permutation entropy features. The experimental results demonstrate that identifying seizure with the MSK-means algorithm and delay permutation entropy achieves 4. 7 % higher accuracy than that of K-means, and 0. 7 % higher accuracy than that of the SVM.
Comparison of performance of some common Hartmann-Shack centroid estimation methods
NASA Astrophysics Data System (ADS)
Thatiparthi, C.; Ommani, A.; Burman, R.; Thapa, D.; Hutchings, N.; Lakshminarayanan, V.
2016-03-01
The accuracy of the estimation of optical aberrations by measuring the distorted wave front using a Hartmann-Shack wave front sensor (HSWS) is mainly dependent upon the measurement accuracy of the centroid of the focal spot. The most commonly used methods for centroid estimation such as the brightest spot centroid; first moment centroid; weighted center of gravity and intensity weighted center of gravity, are generally applied on the entire individual sub-apertures of the lens let array. However, these processes of centroid estimation are sensitive to the influence of reflections, scattered light, and noise; especially in the case where the signal spot area is smaller compared to the whole sub-aperture area. In this paper, we give a comparison of performance of the commonly used centroiding methods on estimation of optical aberrations, with and without the use of some pre-processing steps (thresholding, Gaussian smoothing and adaptive windowing). As an example we use the aberrations of the human eye model. This is done using the raw data collected from a custom made ophthalmic aberrometer and a model eye to emulate myopic and hyper-metropic defocus values up to 2 Diopters. We show that the use of any simple centroiding algorithm is sufficient in the case of ophthalmic applications for estimating aberrations within the typical clinically acceptable limits of a quarter Diopter margins, when certain pre-processing steps to reduce the impact of external factors are used.
Star sub-pixel centroid calculation based on multi-step minimum energy difference method
NASA Astrophysics Data System (ADS)
Wang, Duo; Han, YanLi; Sun, Tengfei
2013-09-01
The star's centroid plays a vital role in celestial navigation, star images which be gotten during daytime, due to the strong sky background, have a low SNR, and the star objectives are nearly submerged in the background, takes a great trouble to the centroid localization. Traditional methods, such as a moment method, weighted centroid calculation method is simple but has a big error, especially in the condition of a low SNR. Gaussian method has a high positioning accuracy, but the computational complexity. Analysis of the energy distribution in star image, a location method for star target centroids based on multi-step minimum energy difference is proposed. This method uses the linear superposition to narrow the centroid area, in the certain narrow area uses a certain number of interpolation to pixels for the pixels' segmentation, and then using the symmetry of the stellar energy distribution, tentatively to get the centroid position: assume that the current pixel is the star centroid position, and then calculates and gets the difference of the sum of the energy which in the symmetric direction(in this paper we take the two directions of transverse and longitudinal) and the equal step length(which can be decided through different conditions, the paper takes 9 as the step length) of the current pixel, and obtain the centroid position in this direction when the minimum difference appears, and so do the other directions, then the validation comparison of simulated star images, and compare with several traditional methods, experiments shows that the positioning accuracy of the method up to 0.001 pixel, has good effect to calculate the centroid of low SNR conditions; at the same time, uses this method on a star map which got at the fixed observation site during daytime in near-infrared band, compare the results of the paper's method with the position messages which were known of the star, it shows that :the multi-step minimum energy difference method achieves a better effect.
A recursive technique for adaptive vector quantization
NASA Technical Reports Server (NTRS)
Lindsay, Robert A.
1989-01-01
Vector Quantization (VQ) is fast becoming an accepted, if not preferred method for image compression. The VQ performs well when compressing all types of imagery including Video, Electro-Optical (EO), Infrared (IR), Synthetic Aperture Radar (SAR), Multi-Spectral (MS), and digital map data. The only requirement is to change the codebook to switch the compressor from one image sensor to another. There are several approaches for designing codebooks for a vector quantizer. Adaptive Vector Quantization is a procedure that simultaneously designs codebooks as the data is being encoded or quantized. This is done by computing the centroid as a recursive moving average where the centroids move after every vector is encoded. When computing the centroid of a fixed set of vectors the resultant centroid is identical to the previous centroid calculation. This method of centroid calculation can be easily combined with VQ encoding techniques. The defined quantizer changes after every encoded vector by recursively updating the centroid of minimum distance which is the selected by the encoder. Since the quantizer is changing definition or states after every encoded vector, the decoder must now receive updates to the codebook. This is done as side information by multiplexing bits into the compressed source data.
Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi
2016-05-01
Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for the nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using an field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k -means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k -means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power of that used by the PC.
NASA Astrophysics Data System (ADS)
Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying
2018-06-01
In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
Inspection of wear particles in oils by using a fuzzy classifier
NASA Astrophysics Data System (ADS)
Hamalainen, Jari J.; Enwald, Petri
1994-11-01
The reliability of stand-alone machines and larger production units can be improved by automated condition monitoring. Analysis of wear particles in lubricating or hydraulic oils helps diagnosing the wear states of machine parts. This paper presents a computer vision system for automated classification of wear particles. Digitized images from experiments with a bearing test bench, a hydraulic system with an industrial company, and oil samples from different industrial sources were used for algorithm development and testing. The wear particles were divided into four classes indicating different wear mechanisms: cutting wear, fatigue wear, adhesive wear, and abrasive wear. The results showed that the fuzzy K-nearest neighbor classifier utilized gave the same distribution of wear particles as the classification by a human expert.
A system for tracking and recognizing pedestrian faces using a network of loosely coupled cameras
NASA Astrophysics Data System (ADS)
Gagnon, L.; Laliberté, F.; Foucher, S.; Branzan Albu, A.; Laurendeau, D.
2006-05-01
A face recognition module has been developed for an intelligent multi-camera video surveillance system. The module can recognize a pedestrian face in terms of six basic emotions and the neutral state. Face and facial features detection (eyes, nasal root, nose and mouth) are first performed using cascades of boosted classifiers. These features are used to normalize the pose and dimension of the face image. Gabor filters are then sampled on a regular grid covering the face image to build a facial feature vector that feeds a nearest neighbor classifier with a cosine distance similarity measure for facial expression interpretation and face model construction. A graphical user interface allows the user to adjust the module parameters.
Recognizing ovarian cancer from co-registered ultrasound and photoacoustic images
NASA Astrophysics Data System (ADS)
Alqasemi, Umar; Kumavor, Patrick; Aguirre, Andres; Zhu, Quing
2013-03-01
Unique features in co-registered ultrasound and photoacoustic images of ex vivo ovarian tissue are introduced, along with the hypotheses of how these features may relate to the physiology of tumors. The images are compressed with wavelet transform, after which the mean Radon transform of the photoacoustic image is computed and fitted with a Gaussian function to find the centroid of the suspicious area for shift-invariant recognition process. In the next step, 24 features are extracted from a training set of images by several methods; including features from the Fourier domain, image statistics, and the outputs of different composite filters constructed from the joint frequency response of different cancerous images. The features were chosen from more than 400 training images obtained from 33 ex vivo ovaries of 24 patients, and used to train a support vector machine (SVM) structure. The SVM classifier was able to exclusively separate the cancerous from the non-cancerous cases with 100% sensitivity and specificity. At the end, the classifier was used to test 95 new images, obtained from 37 ovaries of 20 additional patients. The SVM classifier achieved 76.92% sensitivity and 95.12% specificity. Furthermore, if we assume that recognizing one image as a cancerous case is sufficient to consider the ovary as malignant, then the SVM classifier achieves 100% sensitivity and 87.88% specificity.
Automatic Recognition of Breathing Route During Sleep Using Snoring Sounds
NASA Astrophysics Data System (ADS)
Mikami, Tsuyoshi; Kojima, Yohichiro
This letter classifies snoring sounds into three breathing routes (oral, nasal, and oronasal) with discriminant analysis of the power spectra and k-nearest neighbor method. It is necessary to recognize breathing route during snoring, because oral snoring is a typical symptom of sleep apnea but we cannot know our own breathing and snoring condition during sleep. As a result, about 98.8% classification rate is obtained by using leave-one-out test for performance evaluation.
Heterogeneous Multi-Metric Learning for Multi-Sensor Fusion
2011-07-01
distance”. One of the most widely used methods is the k-nearest neighbor ( KNN ) method [4], which labels an input data sample to be the class with majority...despite of its simplicity, it can be an effective candidate and can be easily extended to handle multiple sensors. Distance based method such as KNN relies...Neighbor (LMNN) method [21] which will be briefly reviewed in the sequel. LMNN method tries to learn an optimal metric specifically for KNN classifier. The
NASA Astrophysics Data System (ADS)
Taha, Z.; Razman, M. A. M.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Adnan, F. A.; Sallehudin, M. F.; Mukai, Y.
2018-04-01
Fish Hunger behaviour is essential in determining the fish feeding routine, particularly for fish farmers. The inability to provide accurate feeding routines (under-feeding or over-feeding) may lead the death of the fish and consequently inhibits the quantity of the fish produced. Moreover, the excessive food that is not consumed by the fish will be dissolved in the water and accordingly reduce the water quality through the reduction of oxygen quantity. This problem also leads the death of the fish or even spur fish diseases. In the present study, a correlation of Barramundi fish-school behaviour with hunger condition through the hybrid data integration of image processing technique is established. The behaviour is clustered with respect to the position of the school size as well as the school density of the fish before feeding, during feeding and after feeding. The clustered fish behaviour is then classified through k-Nearest Neighbour (k-NN) learning algorithm. Three different variations of the algorithm namely cosine, cubic and weighted are assessed on its ability to classify the aforementioned fish hunger behaviour. It was found from the study that the weighted k-NN variation provides the best classification with an accuracy of 86.5%. Therefore, it could be concluded that the proposed integration technique may assist fish farmers in ascertaining fish feeding routine.
Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio
NASA Astrophysics Data System (ADS)
Nababan, A. A.; Sitompul, O. S.; Tulus
2018-04-01
K- Nearest Neighbor (KNN) is a good classifier, but from several studies, the result performance accuracy of KNN still lower than other methods. One of the causes of the low accuracy produced, because each attribute has the same effect on the classification process, while some less relevant characteristics lead to miss-classification of the class assignment for new data. In this research, we proposed Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio as a parameter to see the correlation between each attribute in the data and the Gain Ratio also will be used as the basis for weighting each attribute of the dataset. The accuracy of results is compared to the accuracy acquired from the original KNN method using 10-fold Cross-Validation with several datasets from the UCI Machine Learning repository and KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the result of the test, the proposed method was able to increase the classification accuracy of KNN, where the highest difference of accuracy obtained hayes-roth dataset is worth 12.73%, and the lowest difference of accuracy obtained in the abalone dataset of 0.07%. The average result of the accuracy of all dataset increases the accuracy by 5.33%.
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.
Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G
2017-09-01
To investigate whether the use of ensemble learning algorithms improve physical activity recognition accuracy compared to the single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one subject out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
Yousef, Malik; Khalifa, Waleed; AbedAllah, Loai
2016-12-22
The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that ECkNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.
Yousef, Malik; Khalifa, Waleed; AbdAllah, Loai
2016-12-01
The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.
Classification of vegetation types in military region
NASA Astrophysics Data System (ADS)
Gonçalves, Miguel; Silva, Jose Silvestre; Bioucas-Dias, Jose
2015-10-01
In decision-making process regarding planning and execution of military operations, the terrain is a determining factor. Aerial photographs are a source of vital information for the success of an operation in hostile region, namely when the cartographic information behind enemy lines is scarce or non-existent. The objective of present work is the development of a tool capable of processing aerial photos. The methodology implemented starts with feature extraction, followed by the application of an automatic selector of features. The next step, using the k-fold cross validation technique, estimates the input parameters for the following classifiers: Sparse Multinomial Logist Regression (SMLR), K Nearest Neighbor (KNN), Linear Classifier using Principal Component Expansion on the Joint Data (PCLDC) and Multi-Class Support Vector Machine (MSVM). These classifiers were used in two different studies with distinct objectives: discrimination of vegetation's density and identification of vegetation's main components. It was found that the best classifier on the first approach is the Sparse Logistic Multinomial Regression (SMLR). On the second approach, the implemented methodology applied to high resolution images showed that the better performance was achieved by KNN classifier and PCLDC. Comparing the two approaches there is a multiscale issue, in which for different resolutions, the best solution to the problem requires different classifiers and the extraction of different features.
Comparisons and Selections of Features and Classifiers for Short Text Classification
NASA Astrophysics Data System (ADS)
Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi
2017-10-01
Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.
Automatic Cataract Hardness Classification Ex Vivo by Ultrasound Techniques.
Caixinha, Miguel; Santos, Mário; Santos, Jaime
2016-04-01
To demonstrate the feasibility of a new methodology for cataract hardness characterization and automatic classification using ultrasound techniques, different cataract degrees were induced in 210 porcine lenses. A 25-MHz ultrasound transducer was used to obtain acoustical parameters (velocity and attenuation) and backscattering signals. B-Scan and parametric Nakagami images were constructed. Ninety-seven parameters were extracted and subjected to a Principal Component Analysis. Bayes, K-Nearest-Neighbours, Fisher Linear Discriminant and Support Vector Machine (SVM) classifiers were used to automatically classify the different cataract severities. Statistically significant increases with cataract formation were found for velocity, attenuation, mean brightness intensity of the B-Scan images and mean Nakagami m parameter (p < 0.01). The four classifiers showed a good performance for healthy versus cataractous lenses (F-measure ≥ 92.68%), while for initial versus severe cataracts the SVM classifier showed the higher performance (90.62%). The results showed that ultrasound techniques can be used for non-invasive cataract hardness characterization and automatic classification. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Developing a radiomics framework for classifying non-small cell lung carcinoma subtypes
NASA Astrophysics Data System (ADS)
Yu, Dongdong; Zang, Yali; Dong, Di; Zhou, Mu; Gevaert, Olivier; Fang, Mengjie; Shi, Jingyun; Tian, Jie
2017-03-01
Patient-targeted treatment of non-small cell lung carcinoma (NSCLC) has been well documented according to the histologic subtypes over the past decade. In parallel, recent development of quantitative image biomarkers has recently been highlighted as important diagnostic tools to facilitate histological subtype classification. In this study, we present a radiomics analysis that classifies the adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). We extract 52-dimensional, CT-based features (7 statistical features and 45 image texture features) to represent each nodule. We evaluate our approach on a clinical dataset including 324 ADCs and 110 SqCCs patients with CT image scans. Classification of these features is performed with four different machine-learning classifiers including Support Vector Machines with Radial Basis Function kernel (RBF-SVM), Random forest (RF), K-nearest neighbor (KNN), and RUSBoost algorithms. To improve the classifiers' performance, optimal feature subset is selected from the original feature set by using an iterative forward inclusion and backward eliminating algorithm. Extensive experimental results demonstrate that radiomics features achieve encouraging classification results on both complete feature set (AUC=0.89) and optimal feature subset (AUC=0.91).
Reliability of an experimental method to analyse the impact point on a golf ball during putting.
Richardson, Ashley K; Mitchell, Andrew C S; Hughes, Gerwyn
2015-06-01
This study aimed to examine the reliability of an experimental method identifying the location of the impact point on a golf ball during putting. Forty trials were completed using a mechanical putting robot set to reproduce a putt of 3.2 m, with four different putter-ball combinations. After locating the centre of the dimple pattern (centroid) the following variables were tested; distance of the impact point from the centroid, angle of the impact point from the centroid and distance of the impact point from the centroid derived from the X, Y coordinates. Good to excellent reliability was demonstrated in all impact variables reflected in very strong relative (ICC = 0.98-1.00) and absolute reliability (SEM% = 0.9-4.3%). The highest SEM% observed was 7% for the angle of the impact point from the centroid. In conclusion, the experimental method was shown to be reliable at locating the centroid location of a golf ball, therefore allowing for the identification of the point of impact with the putter head and is suitable for use in subsequent studies.
NASA Astrophysics Data System (ADS)
Coleman, J. E.; Ekdahl, C. A.; Moir, D. C.; Sullivan, G. W.; Crawford, M. T.
2014-09-01
Axial beam centroid and beam breakup (BBU) measurements were conducted on an 80 ns FWHM, intense relativistic electron bunch with an injected energy of 3.8 MV and current of 2.9 kA. The intense relativistic electron bunch is accelerated and transported through a nested solenoid and ferrite induction core lattice consisting of 64 elements, exiting the accelerator with a nominal energy of 19.8 MeV. The principal objective of these experiments is to quantify the coupling of the beam centroid motion to the BBU instability and validate the theory of this coupling for the first time. Time resolved centroid measurements indicate a reduction in the BBU amplitude, ⟨ξ⟩, of 19% and a reduction in the BBU growth rate (Γ) of 4% by reducing beam centroid misalignments ˜50% throughout the accelerator. An investigation into the contribution of the misaligned elements is made. An alignment algorithm is presented in addition to a qualitative comparison of experimental and calculated results which include axial beam centroid oscillations, BBU amplitude, and growth with different dipole steering.
A regularization approach to hydrofacies delineation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wohlberg, Brendt; Tartakovsky, Daniel
2009-01-01
We consider an inverse problem of identifying complex internal structures of composite (geological) materials from sparse measurements of system parameters and system states. Two conceptual frameworks for identifying internal boundaries between constitutive materials in a composite are considered. A sequential approach relies on support vector machines, nearest neighbor classifiers, or geostatistics to reconstruct boundaries from measurements of system parameters and then uses system states data to refine the reconstruction. A joint approach inverts the two data sets simultaneously by employing a regularization approach.
Huang, Yin-Fu; Wang, Chia-Ming; Liou, Sing-Wu
2013-01-01
A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make originally the ambiguous 5SS and 3SS header patterns more specific and concrete.
Wang, Chia-Ming; Liou, Sing-Wu
2013-01-01
A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make originally the ambiguous 5SS and 3SS header patterns more specific and concrete. PMID:23737711
Identity Recognition Algorithm Using Improved Gabor Feature Selection of Gait Energy Image
NASA Astrophysics Data System (ADS)
Chao, LIANG; Ling-yao, JIA; Dong-cheng, SHI
2017-01-01
This paper describes an effective gait recognition approach based on Gabor features of gait energy image. In this paper, the kernel Fisher analysis combined with kernel matrix is proposed to select dominant features. The nearest neighbor classifier based on whitened cosine distance is used to discriminate different gait patterns. The approach proposed is tested on the CASIA and USF gait databases. The results show that our approach outperforms other state of gait recognition approaches in terms of recognition accuracy and robustness.
Machine Learning Methods for Production Cases Analysis
NASA Astrophysics Data System (ADS)
Mokrova, Nataliya V.; Mokrov, Alexander M.; Safonova, Alexandra V.; Vishnyakov, Igor V.
2018-03-01
Approach to analysis of events occurring during the production process were proposed. Described machine learning system is able to solve classification tasks related to production control and hazard identification at an early stage. Descriptors of the internal production network data were used for training and testing of applied models. k-Nearest Neighbors and Random forest methods were used to illustrate and analyze proposed solution. The quality of the developed classifiers was estimated using standard statistical metrics, such as precision, recall and accuracy.
NASA Astrophysics Data System (ADS)
Adya Zizwan, Putra; Zarlis, Muhammad; Budhiarti Nababan, Erna
2017-12-01
The determination of Centroid on K-Means Algorithm directly affects the quality of the clustering results. Determination of centroid by using random numbers has many weaknesses. The GenClust algorithm that combines the use of Genetic Algorithms and K-Means uses a genetic algorithm to determine the centroid of each cluster. The use of the GenClust algorithm uses 50% chromosomes obtained through deterministic calculations and 50% is obtained from the generation of random numbers. This study will modify the use of the GenClust algorithm in which the chromosomes used are 100% obtained through deterministic calculations. The results of this study resulted in performance comparisons expressed in Mean Square Error influenced by centroid determination on K-Means method by using GenClust method, modified GenClust method and also classic K-Means.
A focal plane metrology system and PSF centroiding experiment
NASA Astrophysics Data System (ADS)
Li, Haitao; Li, Baoquan; Cao, Yang; Li, Ligang
2016-10-01
In this paper, we present an overview of a detector array equipment metrology testbed and a micro-pixel centroiding experiment currently under development at the National Space Science Center, Chinese Academy of Sciences. We discuss on-going development efforts aimed at calibrating the intra-/inter-pixel quantum efficiency and pixel positions for scientific grade CMOS detector, and review significant progress in achieving higher precision differential centroiding for pseudo star images in large area back-illuminated CMOS detector. Without calibration of pixel positions and intrapixel response, we have demonstrated that the standard deviation of differential centroiding is below 2.0e-3 pixels.
High-speed on-chip windowed centroiding using photodiode-based CMOS imager
NASA Technical Reports Server (NTRS)
Pain, Bedabrata (Inventor); Sun, Chao (Inventor); Yang, Guang (Inventor); Cunningham, Thomas J. (Inventor); Hancock, Bruce (Inventor)
2003-01-01
A centroid computation system is disclosed. The system has an imager array, a switching network, computation elements, and a divider circuit. The imager array has columns and rows of pixels. The switching network is adapted to receive pixel signals from the image array. The plurality of computation elements operates to compute inner products for at least x and y centroids. The plurality of computation elements has only passive elements to provide inner products of pixel signals the switching network. The divider circuit is adapted to receive the inner products and compute the x and y centroids.
High-speed on-chip windowed centroiding using photodiode-based CMOS imager
NASA Technical Reports Server (NTRS)
Pain, Bedabrata (Inventor); Sun, Chao (Inventor); Yang, Guang (Inventor); Cunningham, Thomas J. (Inventor); Hancock, Bruce (Inventor)
2004-01-01
A centroid computation system is disclosed. The system has an imager array, a switching network, computation elements, and a divider circuit. The imager array has columns and rows of pixels. The switching network is adapted to receive pixel signals from the image array. The plurality of computation elements operates to compute inner products for at least x and y centroids. The plurality of computation elements has only passive elements to provide inner products of pixel signals the switching network. The divider circuit is adapted to receive the inner products and compute the x and y centroids.
Wang, Yong; Wu, Qiao-Feng; Chen, Chen; Wu, Ling-Yun; Yan, Xian-Zhong; Yu, Shu-Guang; Zhang, Xiang-Sun; Liang, Fan-Rong
2012-01-01
Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between varies acupoints, remains largely unknown, which hinders its widespread use. In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of acupuncture effect, at molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in human. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also biomakers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provide evidence for the specificity of acupoints. Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Comparing with other existing methods, LPFS shows better performance to select a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analysis, for example cancer genomics.
2012-01-01
Background Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between varies acupoints, remains largely unknown, which hinders its widespread use. Results In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of acupuncture effect, at molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in human. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also biomakers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provide evidence for the specificity of acupoints. Conclusions Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Comparing with other existing methods, LPFS shows better performance to select a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analysis, for example cancer genomics. PMID:23046877
NASA Astrophysics Data System (ADS)
Sopharak, Akara; Uyyanonvara, Bunyarit; Barman, Sarah; Williamson, Thomas
To prevent blindness from diabetic retinopathy, periodic screening and early diagnosis are neccessary. Due to lack of expert ophthalmologists in rural area, automated early exudate (one of visible sign of diabetic retinopathy) detection could help to reduce the number of blindness in diabetic patients. Traditional automatic exudate detection methods are based on specific parameter configuration, while the machine learning approaches which seems more flexible may be computationally high cost. A comparative analysis of traditional and machine learning of exudates detection, namely, mathematical morphology, fuzzy c-means clustering, naive Bayesian classifier, Support Vector Machine and Nearest Neighbor classifier are presented. Detected exudates are validated with expert ophthalmologists' hand-drawn ground-truths. The sensitivity, specificity, precision, accuracy and time complexity of each method are also compared.
Orr, Lindsay; Hernández de la Peña, Lisandro; Roy, Pierre-Nicholas
2017-06-07
A derivation of quantum statistical mechanics based on the concept of a Feynman path centroid is presented for the case of generalized density operators using the projected density operator formalism of Blinov and Roy [J. Chem. Phys. 115, 7822-7831 (2001)]. The resulting centroid densities, centroid symbols, and centroid correlation functions are formulated and analyzed in the context of the canonical equilibrium picture of Jang and Voth [J. Chem. Phys. 111, 2357-2370 (1999)]. The case where the density operator projects onto a particular energy eigenstate of the system is discussed, and it is shown that one can extract microcanonical dynamical information from double Kubo transformed correlation functions. It is also shown that the proposed projection operator approach can be used to formally connect the centroid and Wigner phase-space distributions in the zero reciprocal temperature β limit. A Centroid Molecular Dynamics (CMD) approximation to the state-projected exact quantum dynamics is proposed and proven to be exact in the harmonic limit. The state projected CMD method is also tested numerically for a quartic oscillator and a double-well potential and found to be more accurate than canonical CMD. In the case of a ground state projection, this method can resolve tunnelling splittings of the double well problem in the higher barrier regime where canonical CMD fails. Finally, the state-projected CMD framework is cast in a path integral form.
NASA Astrophysics Data System (ADS)
Orr, Lindsay; Hernández de la Peña, Lisandro; Roy, Pierre-Nicholas
2017-06-01
A derivation of quantum statistical mechanics based on the concept of a Feynman path centroid is presented for the case of generalized density operators using the projected density operator formalism of Blinov and Roy [J. Chem. Phys. 115, 7822-7831 (2001)]. The resulting centroid densities, centroid symbols, and centroid correlation functions are formulated and analyzed in the context of the canonical equilibrium picture of Jang and Voth [J. Chem. Phys. 111, 2357-2370 (1999)]. The case where the density operator projects onto a particular energy eigenstate of the system is discussed, and it is shown that one can extract microcanonical dynamical information from double Kubo transformed correlation functions. It is also shown that the proposed projection operator approach can be used to formally connect the centroid and Wigner phase-space distributions in the zero reciprocal temperature β limit. A Centroid Molecular Dynamics (CMD) approximation to the state-projected exact quantum dynamics is proposed and proven to be exact in the harmonic limit. The state projected CMD method is also tested numerically for a quartic oscillator and a double-well potential and found to be more accurate than canonical CMD. In the case of a ground state projection, this method can resolve tunnelling splittings of the double well problem in the higher barrier regime where canonical CMD fails. Finally, the state-projected CMD framework is cast in a path integral form.
Classifying multispectral data by neural networks
NASA Technical Reports Server (NTRS)
Telfer, Brian A.; Szu, Harold H.; Kiang, Richard K.
1993-01-01
Several energy functions for synthesizing neural networks are tested on 2-D synthetic data and on Landsat-4 Thematic Mapper data. These new energy functions, designed specifically for minimizing misclassification error, in some cases yield significant improvements in classification accuracy over the standard least mean squares energy function. In addition to operating on networks with one output unit per class, a new energy function is tested for binary encoded outputs, which result in smaller network sizes. The Thematic Mapper data (four bands were used) is classified on a single pixel basis, to provide a starting benchmark against which further improvements will be measured. Improvements are underway to make use of both subpixel and superpixel (i.e. contextual or neighborhood) information in tile processing. For single pixel classification, the best neural network result is 78.7 percent, compared with 71.7 percent for a classical nearest neighbor classifier. The 78.7 percent result also improves on several earlier neural network results on this data.
NASA Astrophysics Data System (ADS)
Georganos, Stefanos; Grippa, Tais; Vanhuysse, Sabine; Lennert, Moritz; Shimoni, Michal; Wolff, Eléonore
2017-10-01
This study evaluates the impact of three Feature Selection (FS) algorithms in an Object Based Image Analysis (OBIA) framework for Very-High-Resolution (VHR) Land Use-Land Cover (LULC) classification. The three selected FS algorithms, Correlation Based Selection (CFS), Mean Decrease in Accuracy (MDA) and Random Forest (RF) based Recursive Feature Elimination (RFE), were tested on Support Vector Machine (SVM), K-Nearest Neighbor, and Random Forest (RF) classifiers. The results demonstrate that the accuracy of SVM and KNN classifiers are the most sensitive to FS. The RF appeared to be more robust to high dimensionality, although a significant increase in accuracy was found by using the RFE method. In terms of classification accuracy, SVM performed the best using FS, followed by RF and KNN. Finally, only a small number of features is needed to achieve the highest performance using each classifier. This study emphasizes the benefits of rigorous FS for maximizing performance, as well as for minimizing model complexity and interpretation.
Emotion detection model of Filipino music
NASA Astrophysics Data System (ADS)
Noblejas, Kathleen Alexis; Isidro, Daryl Arvin; Samonte, Mary Jane C.
2017-02-01
This research explored the creation of a model to detect emotion from Filipino songs. The emotion model used was based from Paul Ekman's six basic emotions. The songs were classified into the following genres: kundiman, novelty, pop, and rock. The songs were annotated by a group of music experts based on the emotion the song induces to the listener. Musical features of the songs were extracted using jAudio while the lyric features were extracted by Bag-of- Words feature representation. The audio and lyric features of the Filipino songs were extracted for classification by the chosen three classifiers, Naïve Bayes, Support Vector Machines, and k-Nearest Neighbors. The goal of the research was to know which classifier would work best for Filipino music. Evaluation was done by 10-fold cross validation and accuracy, precision, recall, and F-measure results were compared. Models were also tested with unknown test data to further determine the models' accuracy through the prediction results.
Object Classification in Semi Structured Enviroment Using Forward-Looking Sonar
dos Santos, Matheus; Ribeiro, Pedro Otávio; Núñez, Pedro; Botelho, Silvia
2017-01-01
The submarine exploration using robots has been increasing in recent years. The automation of tasks such as monitoring, inspection, and underwater maintenance requires the understanding of the robot’s environment. The object recognition in the scene is becoming a critical issue for these systems. On this work, an underwater object classification pipeline applied in acoustic images acquired by Forward-Looking Sonar (FLS) are studied. The object segmentation combines thresholding, connected pixels searching and peak of intensity analyzing techniques. The object descriptor extract intensity and geometric features of the detected objects. A comparison between the Support Vector Machine, K-Nearest Neighbors, and Random Trees classifiers are presented. An open-source tool was developed to annotate and classify the objects and evaluate their classification performance. The proposed method efficiently segments and classifies the structures in the scene using a real dataset acquired by an underwater vehicle in a harbor area. Experimental results demonstrate the robustness and accuracy of the method described in this paper. PMID:28961163
Zhang, Y N
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed.
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed. PMID:29075547
Nearest neighbor 3D segmentation with context features
NASA Astrophysics Data System (ADS)
Hristova, Evelin; Schulz, Heinrich; Brosch, Tom; Heinrich, Mattias P.; Nickisch, Hannes
2018-03-01
Automated and fast multi-label segmentation of medical images is challenging and clinically important. This paper builds upon a supervised machine learning framework that uses training data sets with dense organ annotations and vantage point trees to classify voxels in unseen images based on similarity of binary feature vectors extracted from the data. Without explicit model knowledge, the algorithm is applicable to different modalities and organs, and achieves high accuracy. The method is successfully tested on 70 abdominal CT and 42 pelvic MR images. With respect to ground truth, an average Dice overlap score of 0.76 for the CT segmentation of liver, spleen and kidneys is achieved. The mean score for the MR delineation of bladder, bones, prostate and rectum is 0.65. Additionally, we benchmark several variations of the main components of the method and reduce the computation time by up to 47% without significant loss of accuracy. The segmentation results are - for a nearest neighbor method - surprisingly accurate, robust as well as data and time efficient.
NASA Astrophysics Data System (ADS)
Purwanti, Endah; Calista, Evelyn
2017-05-01
Leukemia is a type of cancer which is caused by malignant neoplasms in leukocyte cells. Leukemia disease which can cause death quickly enough for the sufferer is a type of acute lymphocyte leukemia (ALL). In this study, we propose automatic detection of lymphocyte leukemia through classification of lymphocyte cell images obtained from peripheral blood smear single cell. There are two main objectives in this study. The first is to extract featuring cells. The second objective is to classify the lymphocyte cells into two classes, namely normal and abnormal lymphocytes. In conducting this study, we use combination of shape feature and histogram feature, and the classification algorithm is k-nearest Neighbour with k variation is 1, 3, 5, 7, 9, 11, 13, and 15. The best level of accuracy, sensitivity, and specificity in this study are 90%, 90%, and 90%, and they were obtained from combined features of area-perimeter-mean-standard deviation with k=7.
Deep neural networks for texture classification-A theoretical analysis.
Basu, Saikat; Mukhopadhyay, Supratik; Karki, Manohar; DiBiano, Robert; Ganguly, Sangram; Nemani, Ramakrishna; Gayaka, Shreekant
2018-01-01
We investigate the use of Deep Neural Networks for the classification of image datasets where texture features are important for generating class-conditional discriminative representations. To this end, we first derive the size of the feature space for some standard textural features extracted from the input dataset and then use the theory of Vapnik-Chervonenkis dimension to show that hand-crafted feature extraction creates low-dimensional representations which help in reducing the overall excess error rate. As a corollary to this analysis, we derive for the first time upper bounds on the VC dimension of Convolutional Neural Network as well as Dropout and Dropconnect networks and the relation between excess error rate of Dropout and Dropconnect networks. The concept of intrinsic dimension is used to validate the intuition that texture-based datasets are inherently higher dimensional as compared to handwritten digits or other object recognition datasets and hence more difficult to be shattered by neural networks. We then derive the mean distance from the centroid to the nearest and farthest sampling points in an n-dimensional manifold and show that the Relative Contrast of the sample data vanishes as dimensionality of the underlying vector space tends to infinity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Program Calculates Forces in Bolted Structural Joints
NASA Technical Reports Server (NTRS)
Buder, Daniel A.
2005-01-01
FORTRAN 77 computer program calculates forces in bolts in the joints of structures. This program is used in conjunction with the NASTRAN finite-element structural-analysis program. A mathematical model of a structure is first created by approximating its load-bearing members with representative finite elements, then NASTRAN calculates the forces and moments that each finite element contributes to grid points located throughout the structure. The user selects the finite elements that correspond to structural members that contribute loads to the joints of interest, and identifies the grid point nearest to each such joint. This program reads the pertinent NASTRAN output, combines the forces and moments from the contributing elements to determine the resultant force and moment acting at each proximate grid point, then transforms the forces and moments from these grid points to the centroids of the affected joints. Then the program uses these joint loads to obtain the axial and shear forces in the individual bolts. The program identifies which bolts bear the greatest axial and/or shear loads. The program also performs a fail-safe analysis in which the foregoing calculations are repeated for a sequence of cases in which each fastener, in turn, is assumed not to transmit an axial force.
Neural network classification of sweet potato embryos
NASA Astrophysics Data System (ADS)
Molto, Enrique; Harrell, Roy C.
1993-05-01
Somatic embryogenesis is a process that allows for the in vitro propagation of thousands of plants in sub-liter size vessels and has been successfully applied to many significant species. The heterogeneity of maturity and quality of embryos produced with this technique requires sorting to obtain a uniform product. An automated harvester is being developed at the University of Florida to sort embryos in vitro at different stages of maturation in a suspension culture. The system utilizes machine vision to characterize embryo morphology and a fluidic based separation device to isolate embryos associated with a pre-defined, targeted morphology. Two different backpropagation neural networks (BNN) were used to classify embryos based on information extracted from the vision system. One network utilized geometric features such as embryo area, length, and symmetry as inputs. The alternative network utilized polar coordinates of an embryo's perimeter with respect to its centroid as inputs. The performances of both techniques were compared with each other and with an embryo classification method based on linear discriminant analysis (LDA). Similar results were obtained with all three techniques. Classification efficiency was improved by reducing the dimension of the feature vector trough a forward stepwise analysis by LDA. In order to enhance the purity of the sample selected as harvestable, a reject to classify option was introduced in the model and analyzed. The best classifier performances (76% overall correct classifications, 75% harvestable objects properly classified, homogeneity improvement ratio 1.5) were obtained using 8 features in a BNN.
Random ensemble learning for EEG classification.
Hosseini, Mohammad-Parsa; Pompili, Dario; Elisevich, Kost; Soltanian-Zadeh, Hamid
2018-01-01
Real-time detection of seizure activity in epilepsy patients is critical in averting seizure activity and improving patients' quality of life. Accurate evaluation, presurgical assessment, seizure prevention, and emergency alerts all depend on the rapid detection of seizure onset. A new method of feature selection and classification for rapid and precise seizure detection is discussed wherein informative components of electroencephalogram (EEG)-derived data are extracted and an automatic method is presented using infinite independent component analysis (I-ICA) to select independent features. The feature space is divided into subspaces via random selection and multichannel support vector machines (SVMs) are used to classify these subspaces. The result of each classifier is then combined by majority voting to establish the final output. In addition, a random subspace ensemble using a combination of SVM, multilayer perceptron (MLP) neural network and an extended k-nearest neighbors (k-NN), called extended nearest neighbor (ENN), is developed for the EEG and electrocorticography (ECoG) big data problem. To evaluate the solution, a benchmark ECoG of eight patients with temporal and extratemporal epilepsy was implemented in a distributed computing framework as a multitier cloud-computing architecture. Using leave-one-out cross-validation, the accuracy, sensitivity, specificity, and both false positive and false negative ratios of the proposed method were found to be 0.97, 0.98, 0.96, 0.04, and 0.02, respectively. Application of the solution to cases under investigation with ECoG has also been effected to demonstrate its utility. Copyright © 2017 Elsevier B.V. All rights reserved.
Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry
2018-01-01
The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet’s effects on fish skin. PMID:29596375
Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry
2018-03-29
The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout ( Oncorhynchus mykiss ) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k -Nearest neighbours ( k -NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k -NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet's effects on fish skin.
Application of texture analysis method for mammogram density classification
NASA Astrophysics Data System (ADS)
Nithya, R.; Santhi, B.
2017-07-01
Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.
Automated diagnosis of dry eye using infrared thermography images
NASA Astrophysics Data System (ADS)
Acharya, U. Rajendra; Tan, Jen Hong; Koh, Joel E. W.; Sudarshan, Vidya K.; Yeo, Sharon; Too, Cheah Loon; Chua, Chua Kuang; Ng, E. Y. K.; Tong, Louis
2015-07-01
Dry Eye (DE) is a condition of either decreased tear production or increased tear film evaporation. Prolonged DE damages the cornea causing the corneal scarring, thinning and perforation. There is no single uniform diagnosis test available to date; combinations of diagnostic tests are to be performed to diagnose DE. The current diagnostic methods available are subjective, uncomfortable and invasive. Hence in this paper, we have developed an efficient, fast and non-invasive technique for the automated identification of normal and DE classes using infrared thermography images. The features are extracted from nonlinear method called Higher Order Spectra (HOS). Features are ranked using t-test ranking strategy. These ranked features are fed to various classifiers namely, K-Nearest Neighbor (KNN), Nave Bayesian Classifier (NBC), Decision Tree (DT), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM) to select the best classifier using minimum number of features. Our proposed system is able to identify the DE and normal classes automatically with classification accuracy of 99.8%, sensitivity of 99.8%, and specificity if 99.8% for left eye using PNN and KNN classifiers. And we have reported classification accuracy of 99.8%, sensitivity of 99.9%, and specificity if 99.4% for right eye using SVM classifier with polynomial order 2 kernel.
Jiang, Xiaoying; Wei, Rong; Zhao, Yanjun; Zhang, Tongliang
2008-05-01
The knowledge of subnuclear localization in eukaryotic cells is essential for understanding the life function of nucleus. Developing prediction methods and tools for proteins subnuclear localization become important research fields in protein science for special characteristics in cell nuclear. In this study, a novel approach has been proposed to predict protein subnuclear localization. Sample of protein is represented by Pseudo Amino Acid (PseAA) composition based on approximate entropy (ApEn) concept, which reflects the complexity of time series. A novel ensemble classifier is designed incorporating three AdaBoost classifiers. The base classifier algorithms in three AdaBoost are decision stumps, fuzzy K nearest neighbors classifier, and radial basis-support vector machines, respectively. Different PseAA compositions are used as input data of different AdaBoost classifier in ensemble. Genetic algorithm is used to optimize the dimension and weight factor of PseAA composition. Two datasets often used in published works are used to validate the performance of the proposed approach. The obtained results of Jackknife cross-validation test are higher and more balance than them of other methods on same datasets. The promising results indicate that the proposed approach is effective and practical. It might become a useful tool in protein subnuclear localization. The software in Matlab and supplementary materials are available freely by contacting the corresponding author.
Rajagopal, Rekha; Ranganathan, Vidhyapriya
2018-06-05
Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.
NASA Astrophysics Data System (ADS)
Qur’ania, A.; Sarinah, I.
2018-03-01
People often wrong in knowing the type of jasmine by just looking at the white color of the jasmine, while not all white flowers including jasmine and not all jasmine flowers have white. There is a jasmine that is yellow and there is a jasmine that is white and purple.The aim of this research is to identify Jasmine flower (Jasminum sp.) based on the shape of the flower image-based using Sobel edge detection and k-Nearest Neighbor. Edge detection is used to detect the type of flower from the flower shape. Edge detection aims to improve the appearance of the border of a digital image. While k-Nearest Neighbor method is used to classify the classification of test objects into classes that have neighbouring properties closest to the object of training. The data used in this study are three types of jasmine namely jasmine white (Jasminum sambac), jasmine gambir (Jasminum pubescens), and jasmine japan (Pseuderanthemum reticulatum). Testing of jasmine flower image resized 50 × 50 pixels, 100 × 100 pixels, 150 × 150 pixels yields an accuracy of 84%. Tests on distance values of the k-NN method with spacing 5, 10 and 15 resulted in different accuracy rates for 5 and 10 closest distances yielding the same accuracy rate of 84%, for the 15 shortest distance resulted in a small accuracy of 65.2%.
Quantitative diagnosis of bladder cancer by morphometric analysis of HE images
NASA Astrophysics Data System (ADS)
Wu, Binlin; Nebylitsa, Samantha V.; Mukherjee, Sushmita; Jain, Manu
2015-02-01
In clinical practice, histopathological analysis of biopsied tissue is the main method for bladder cancer diagnosis and prognosis. The diagnosis is performed by a pathologist based on the morphological features in the image of a hematoxylin and eosin (HE) stained tissue sample. This manuscript proposes algorithms to perform morphometric analysis on the HE images, quantify the features in the images, and discriminate bladder cancers with different grades, i.e. high grade and low grade. The nuclei are separated from the background and other types of cells such as red blood cells (RBCs) and immune cells using manual outlining, color deconvolution and image segmentation. A mask of nuclei is generated for each image for quantitative morphometric analysis. The features of the nuclei in the mask image including size, shape, orientation, and their spatial distributions are measured. To quantify local clustering and alignment of nuclei, we propose a 1-nearest-neighbor (1-NN) algorithm which measures nearest neighbor distance and nearest neighbor parallelism. The global distributions of the features are measured using statistics of the proposed parameters. A linear support vector machine (SVM) algorithm is used to classify the high grade and low grade bladder cancers. The results show using a particular group of nuclei such as large ones, and combining multiple parameters can achieve better discrimination. This study shows the proposed approach can potentially help expedite pathological diagnosis by triaging potentially suspicious biopsies.
Velazquez-Pupo, Roxana; Sierra-Romero, Alberto; Torres-Roman, Deni; Shkvarko, Yuriy V.; Romero-Delgado, Misael
2018-01-01
This paper presents a high performance vision-based system with a single static camera for traffic surveillance, for moving vehicle detection with occlusion handling, tracking, counting, and One Class Support Vector Machine (OC-SVM) classification. In this approach, moving objects are first segmented from the background using the adaptive Gaussian Mixture Model (GMM). After that, several geometric features are extracted, such as vehicle area, height, width, centroid, and bounding box. As occlusion is present, an algorithm was implemented to reduce it. The tracking is performed with adaptive Kalman filter. Finally, the selected geometric features: estimated area, height, and width are used by different classifiers in order to sort vehicles into three classes: small, midsize, and large. Extensive experimental results in eight real traffic videos with more than 4000 ground truth vehicles have shown that the improved system can run in real time under an occlusion index of 0.312 and classify vehicles with a global detection rate or recall, precision, and F-measure of up to 98.190%, and an F-measure of up to 99.051% for midsize vehicles. PMID:29382078
Rai, Shesh N; Trainor, Patrick J; Khosravi, Farhad; Kloecker, Goetz; Panchapakesan, Balaji
2016-01-01
The development of biosensors that produce time series data will facilitate improvements in biomedical diagnostics and in personalized medicine. The time series produced by these devices often contains characteristic features arising from biochemical interactions between the sample and the sensor. To use such characteristic features for determining sample class, similarity-based classifiers can be utilized. However, the construction of such classifiers is complicated by the variability in the time domains of such series that renders the traditional distance metrics such as Euclidean distance ineffective in distinguishing between biological variance and time domain variance. The dynamic time warping (DTW) algorithm is a sequence alignment algorithm that can be used to align two or more series to facilitate quantifying similarity. In this article, we evaluated the performance of DTW distance-based similarity classifiers for classifying time series that mimics electrical signals produced by nanotube biosensors. Simulation studies demonstrated the positive performance of such classifiers in discriminating between time series containing characteristic features that are obscured by noise in the intensity and time domains. We then applied a DTW distance-based k -nearest neighbors classifier to distinguish the presence/absence of mesenchymal biomarker in cancer cells in buffy coats in a blinded test. Using a train-test approach, we find that the classifier had high sensitivity (90.9%) and specificity (81.8%) in differentiating between EpCAM-positive MCF7 cells spiked in buffy coats and those in plain buffy coats.
Harry V., Jr. Wiant; Michael L. Spangler; John E. Baumgras
2002-01-01
Various taper systems and the centroid method were compared to unbiased volume estimates made by importance sampling for 720 hardwood trees selected throughout the state of West Virginia. Only the centroid method consistently gave volumes estimates that did not differ significantly from those made by importance sampling, although some taper equations did well for most...
Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf
2018-05-01
Dysmorphic syndromes have different facial malformations. These malformations are significant to an early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define the certain features of each syndrome by considering facial malformations and classify Fragile X, Hurler, Prader Willi, Down, Wolf Hirschhorn syndromes and healthy groups automatically. The reference points are marked on the face images and ratios between the points' distances are taken into consideration as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with k-NN, ANN and hierarchical decision tree methods, respectively. Then, the same images are shown to a clinical expert who achieve a recognition rate of 46.7%. We develop an efficient system to recognize different syndrome types automatically in a simple, non-invasive imaging data, which is independent from the patient's age, sex and race at high accuracy. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.
Identification of Anisomerous Motor Imagery EEG Signals Based on Complex Algorithms
Zhang, Zhiwen; Duan, Feng; Zhou, Xin; Meng, Zixuan
2017-01-01
Motor imagery (MI) electroencephalograph (EEG) signals are widely applied in brain-computer interface (BCI). However, classified MI states are limited, and their classification accuracy rates are low because of the characteristics of nonlinearity and nonstationarity. This study proposes a novel MI pattern recognition system that is based on complex algorithms for classifying MI EEG signals. In electrooculogram (EOG) artifact preprocessing, band-pass filtering is performed to obtain the frequency band of MI-related signals, and then, canonical correlation analysis (CCA) combined with wavelet threshold denoising (WTD) is used for EOG artifact preprocessing. We propose a regularized common spatial pattern (R-CSP) algorithm for EEG feature extraction by incorporating the principle of generic learning. A new classifier combining the K-nearest neighbor (KNN) and support vector machine (SVM) approaches is used to classify four anisomerous states, namely, imaginary movements with the left hand, right foot, and right shoulder and the resting state. The highest classification accuracy rate is 92.5%, and the average classification accuracy rate is 87%. The proposed complex algorithm identification method can significantly improve the identification rate of the minority samples and the overall classification performance. PMID:28874909
Gap Shape Classification using Landscape Indices and Multivariate Statistics
Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung
2016-01-01
This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks’ lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap. PMID:27901127
Software platform for managing the classification of error- related potentials of observers
NASA Astrophysics Data System (ADS)
Asvestas, P.; Ventouras, E.-C.; Kostopoulos, S.; Sidiropoulos, K.; Korfiatis, V.; Korda, A.; Uzunolglu, A.; Karanasiou, I.; Kalatzis, I.; Matsopoulos, G.
2015-09-01
Human learning is partly based on observation. Electroencephalographic recordings of subjects who perform acts (actors) or observe actors (observers), contain a negative waveform in the Evoked Potentials (EPs) of the actors that commit errors and of observers who observe the error-committing actors. This waveform is called the Error-Related Negativity (ERN). Its detection has applications in the context of Brain-Computer Interfaces. The present work describes a software system developed for managing EPs of observers, with the aim of classifying them into observations of either correct or incorrect actions. It consists of an integrated platform for the storage, management, processing and classification of EPs recorded during error-observation experiments. The system was developed using C# and the following development tools and frameworks: MySQL, .NET Framework, Entity Framework and Emgu CV, for interfacing with the machine learning library of OpenCV. Up to six features can be computed per EP recording per electrode. The user can select among various feature selection algorithms and then proceed to train one of three types of classifiers: Artificial Neural Networks, Support Vector Machines, k-nearest neighbour. Next the classifier can be used for classifying any EP curve that has been inputted to the database.
Gap Shape Classification using Landscape Indices and Multivariate Statistics.
Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung
2016-11-30
This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks' lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap.
Doppler centroid estimation ambiguity for synthetic aperture radars
NASA Technical Reports Server (NTRS)
Chang, C. Y.; Curlander, J. C.
1989-01-01
A technique for estimation of the Doppler centroid of an SAR in the presence of large uncertainty in antenna boresight pointing is described. Also investigated is the image degradation resulting from data processing that uses an ambiguous centroid. Two approaches for resolving ambiguities in Doppler centroid estimation (DCE) are presented: the range cross-correlation technique and the multiple-PRF (pulse repetition frequency) technique. Because other design factors control the PRF selection for SAR, a generalized algorithm is derived for PRFs not containing a common divisor. An example using the SIR-C parameters illustrates that this algorithm is capable of resolving the C-band DCE ambiguities for antenna pointing uncertainties of about 2-3 deg.
Characterization of trabecular bone using the backscattered spectral centroid shift.
Wear, Keith A
2003-04-01
Ultrasonic attenuation in bone in vivo is generally measured using a through-transmission method at the calcaneus. Although attenuation in calcaneus has been demonstrated to be a useful predictor for osteoporotic fracture risk, measurements at other clinically important sites, such as hip and spine, could potentially contain additional useful diagnostic information. Through-transmission measurements may not be feasible at these sites due to complex bone shapes and the increased amount of intervening soft tissue. Centroid shift from the backscattered signal is an index of attenuation slope and has been used previously to characterize soft tissues. In this paper, centroid shift from signals backscattered from 30 trabecular bone samples in vitro were measured. Attenuation slope also was measured using a through-transmission method. The correlation coefficient between centroid shift and attenuation slope was -0.71. The 95% confidence interval was (-0.86, -0.47). These results suggest that the backscattered spectral centroid shift may contain useful diagnostic information potentially applicable to hip and spine.
Garra, Brian S; Locher, Melanie; Felker, Steven; Wear, Keith A
2009-01-01
Ultrasonic backscatter measurements from vertebral bodies (L3 and L4) in nine women were performed using a clinical ultrasonic imaging system. Measurements were made through the abdomen. The location of a vertebra was identified from the bright specular reflection from the vertebral anterior surface. Backscattered signals were gated to isolate signal emanating from the cancellous interiors of vertebrae. The spectral centroid shift of the backscattered signal, which has previously been shown to correlate highly with bone mineral density (BMD) in human calcaneus in vitro, was measured. BMD was also measured in the nine subjects' vertebrae using a clinical bone densitometer. The correlation coefficient between centroid shift and BMD was r = -0.61. The slope of the linear fit was -160 kHz / (g/cm(2)). The negative slope was expected because the attenuation coefficient (and therefore magnitude of the centroid downshift) is known from previous studies to increase with BMD. The centroid shift may be a useful parameter for characterizing bone in vivo.
Real-time object-to-features vectorisation via Siamese neural networks
NASA Astrophysics Data System (ADS)
Fedorenko, Fedor; Usilin, Sergey
2017-03-01
Object-to-features vectorisation is a hard problem to solve for objects that can be hard to distinguish. Siamese and Triplet neural networks are one of the more recent tools used for such task. However, most networks used are very deep networks that prove to be hard to compute in the Internet of Things setting. In this paper, a computationally efficient neural network is proposed for real-time object-to-features vectorisation into a Euclidean metric space. We use L2 distance to reflect feature vector similarity during both training and testing. In this way, feature vectors we develop can be easily classified using K-Nearest Neighbours classifier. Such approach can be used to train networks to vectorise such "problematic" objects like images of human faces, keypoint image patches, like keypoints on Arctic maps and surrounding marine areas.
ISE-based sensor array system for classification of foodstuffs
NASA Astrophysics Data System (ADS)
Ciosek, Patrycja; Sobanski, Tomasz; Augustyniak, Ewa; Wróblewski, Wojciech
2006-01-01
A system composed of an array of polymeric membrane ion-selective electrodes and a pattern recognition block—a so-called 'electronic tongue'—was used for the classification of liquid samples: milk, fruit juice and tonic. The task of this system was to automatically recognize a brand of the product. To analyze the measurement set-up responses various non-parametric classifiers such as k-nearest neighbours, a feedforward neural network and a probabilistic neural network were used. In order to enhance the classification ability of the system, standard model solutions of salts were measured (in order to take into account any variation in time of the working parameters of the sensors). This system was capable of recognizing the brand of the products with accuracy ranging from 68% to 100% (in the case of the best classifier).
Just-in-time adaptive classifiers-part II: designing the classifier.
Alippi, Cesare; Roveri, Manuel
2008-12-01
Aging effects, environmental changes, thermal drifts, and soft and hard faults affect physical systems by changing their nature and behavior over time. To cope with a process evolution adaptive solutions must be envisaged to track its dynamics; in this direction, adaptive classifiers are generally designed by assuming the stationary hypothesis for the process generating the data with very few results addressing nonstationary environments. This paper proposes a methodology based on k-nearest neighbor (NN) classifiers for designing adaptive classification systems able to react to changing conditions just-in-time (JIT), i.e., exactly when it is needed. k-NN classifiers have been selected for their computational-free training phase, the possibility to easily estimate the model complexity k and keep under control the computational complexity of the classifier through suitable data reduction mechanisms. A JIT classifier requires a temporal detection of a (possible) process deviation (aspect tackled in a companion paper) followed by an adaptive management of the knowledge base (KB) of the classifier to cope with the process change. The novelty of the proposed approach resides in the general framework supporting the real-time update of the KB of the classification system in response to novel information coming from the process both in stationary conditions (accuracy improvement) and in nonstationary ones (process tracking) and in providing a suitable estimate of k. It is shown that the classification system grants consistency once the change targets the process generating the data in a new stationary state, as it is the case in many real applications.
Consistent latent position estimation and vertex classification for random dot product graphs.
Sussman, Daniel L; Tang, Minh; Priebe, Carey E
2014-01-01
In this work, we show that using the eigen-decomposition of the adjacency matrix, we can consistently estimate latent positions for random dot product graphs provided the latent positions are i.i.d. from some distribution. If class labels are observed for a number of vertices tending to infinity, then we show that the remaining vertices can be classified with error converging to Bayes optimal using the $(k)$-nearest-neighbors classification rule. We evaluate the proposed methods on simulated data and a graph derived from Wikipedia.
Mannil, Manoj; von Spiczak, Jochen; Manka, Robert; Alkadhi, Hatem
2018-06-01
The aim of this study was to test whether texture analysis and machine learning enable the detection of myocardial infarction (MI) on non-contrast-enhanced low radiation dose cardiac computed tomography (CCT) images. In this institutional review board-approved retrospective study, we included non-contrast-enhanced electrocardiography-gated low radiation dose CCT image data (effective dose, 0.5 mSv) acquired for the purpose of calcium scoring of 27 patients with acute MI (9 female patients; mean age, 60 ± 12 years), 30 patients with chronic MI (8 female patients; mean age, 68 ± 13 years), and in 30 subjects (9 female patients; mean age, 44 ± 6 years) without cardiac abnormality, hereafter termed controls. Texture analysis of the left ventricle was performed using free-hand regions of interest, and texture features were classified twice (Model I: controls versus acute MI versus chronic MI; Model II: controls versus acute and chronic MI). For both classifications, 6 commonly used machine learning classifiers were used: decision tree C4.5 (J48), k-nearest neighbors, locally weighted learning, RandomForest, sequential minimal optimization, and an artificial neural network employing deep learning. In addition, 2 blinded, independent readers visually assessed noncontrast CCT images for the presence or absence of MI. In Model I, best classification results were obtained using the k-nearest neighbors classifier (sensitivity, 69%; specificity, 85%; false-positive rate, 0.15). In Model II, the best classification results were found with the locally weighted learning classification (sensitivity, 86%; specificity, 81%; false-positive rate, 0.19) with an area under the curve from receiver operating characteristics analysis of 0.78. In comparison, both readers were not able to identify MI in any of the noncontrast, low radiation dose CCT images. This study indicates the ability of texture analysis and machine learning in detecting MI on noncontrast low radiation dose CCT images being not visible for the radiologists' eye.
Stephens, David; Diesing, Markus
2014-01-01
Detailed seabed substrate maps are increasingly in demand for effective planning and management of marine ecosystems and resources. It has become common to use remotely sensed multibeam echosounder data in the form of bathymetry and acoustic backscatter in conjunction with ground-truth sampling data to inform the mapping of seabed substrates. Whilst, until recently, such data sets have typically been classified by expert interpretation, it is now obvious that more objective, faster and repeatable methods of seabed classification are required. This study compares the performances of a range of supervised classification techniques for predicting substrate type from multibeam echosounder data. The study area is located in the North Sea, off the north-east coast of England. A total of 258 ground-truth samples were classified into four substrate classes. Multibeam bathymetry and backscatter data, and a range of secondary features derived from these datasets were used in this study. Six supervised classification techniques were tested: Classification Trees, Support Vector Machines, k-Nearest Neighbour, Neural Networks, Random Forest and Naive Bayes. Each classifier was trained multiple times using different input features, including i) the two primary features of bathymetry and backscatter, ii) a subset of the features chosen by a feature selection process and iii) all of the input features. The predictive performances of the models were validated using a separate test set of ground-truth samples. The statistical significance of model performances relative to a simple baseline model (Nearest Neighbour predictions on bathymetry and backscatter) were tested to assess the benefits of using more sophisticated approaches. The best performing models were tree based methods and Naive Bayes which achieved accuracies of around 0.8 and kappa coefficients of up to 0.5 on the test set. The models that used all input features didn't generally perform well, highlighting the need for some means of feature selection.
Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients
NASA Astrophysics Data System (ADS)
Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad
2010-12-01
There is extensive overlap of clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate. It is therefore desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEGs have been recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to reduce the input dimension in a more separable space to make it more appropriate for the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, which was proportional to the genotypic content of best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining the diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. It is shown that the results of the proposed technique are more robust as compared to conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.
Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients.
Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad
2010-12-01
There is extensive overlap of clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate. It is therefore desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEGs have been recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to reduce the input dimension in a more separable space to make it more appropriate for the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, which was proportional to the genotypic content of best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining the diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. It is shown that the results of the proposed technique are more robust as compared to conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.
Travel by public transit to mammography facilities in 6 US urban areas
Graham, S; Lewis, B; Flanagan, B; Watson, M; Peipins, L
2017-01-01
We examined lack of private vehicle access and 30 minutes or longer public transportation travel time to mammography facilities for women 40 years of age or older in the urban areas of Boston, Philadelphia, San Antonio, San Diego, Denver, and Seattle to identify transit marginalized populations - women for whom these travel characteristics may jointly present a barrier to clinic access. This ecological study used sex and race/ethnicity data from the 2010 US Census and household vehicle availability data from the American Community Survey 2008–2012, all at Census tract level. Using the public transportation option on Google Trip Planner we obtained the travel time from the centroid of each census tract to all local mammography facilities to determine the nearest mammography facility in each urban area. Median travel times by public transportation to the nearest facility for women with no household access to a private vehicle were obtained by ranking travel time by population group across all U.S. census tracts in each urban area and across the entire study area. The overall median travel times for each urban area for women without household access to a private vehicle ranged from a low of 15 minutes in Boston and Philadelphia to 27 minutes in San Diego. The numbers and percentages of transit marginalized women were then calculated for all urban areas by population group. While black women were less likely to have private vehicle access, and both Hispanic and black women were more likely to be transit marginalized, this outcome varied by urban area. White women constituted the largest number of transit marginalized. Our results indicate that mammography facilities are favorably located for the large majority of women, although there are still substantial numbers for whom travel may likely present a barrier to mammography facility access. PMID:29285434
NASA Astrophysics Data System (ADS)
Li, Xiaoliang; Luo, Lei; Li, Pengwei; Yu, Qingkui
2018-03-01
The image sensor in satellite optical communication system may generate noise due to space irradiation damage, leading to deviation for the determination of the light spot centroid. Based on the irradiation test data of CMOS devices, simulated defect spots in different sizes have been used for calculating the centroid deviation value by grey-level centroid algorithm. The impact on tracking & pointing accuracy of the system has been analyzed. The results show that both the amount and the position of irradiation-induced defect pixels contribute to spot centroid deviation. And the larger spot has less deviation. At last, considering the space radiation damage, suggestions are made for the constraints of spot size selection.
NASA Astrophysics Data System (ADS)
Han, Cheongho; Jeong, Youngjin; Kim, Ho-Il
1998-11-01
Recently Alard, Mao, & Guibert and Alard proposed to detect the shift of a star's image centroid, δx, as a method to identify the lensed source among blended stars. Goldberg & Woźniak actually applied this method to the OGLE-1 database and found that seven of 15 events showed significant centroid shifts of δx >~ 0.2". The amount of centroid shift has been estimated theoretically by Goldberg; however, he treated the problem in general and did not apply it to a particular survey or field and therefore based his estimate on simple toy model luminosity functions (i.e., power laws). In this paper, we construct the expected distribution of δx for Galactic bulge events based on the precise stellar luminosity function observed by Holtzman et al. using the Hubble Space Telescope. Their luminosity function is complete up to MI ~ 9.0 (MV ~ 12), which corresponds to faint M-type stars. In our analysis we find that regular blending cannot produce a large fraction of events with measurable centroid shifts. By contrast, a significant fraction of events would have measurable centroid shifts if they are affected by amplification-bias blending. Therefore, the measurements of large centroid shifts for an important fraction of microlensing events of Goldberg & Woźniak confirm the prediction of Han & Alard that a large fraction of Galactic bulge events are affected by amplification-bias blending.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
Chen, Shizhi; Yang, Xiaodong; Tian, Yingli
2015-09-01
A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement.
Speaker gender identification based on majority vote classifiers
NASA Astrophysics Data System (ADS)
Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri
2017-03-01
Speaker gender identification is considered among the most important tools in several multimedia applications namely in automatic speech recognition, interactive voice response systems and audio browsing systems. Gender identification systems performance is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching optimum tuning of one classifier parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models including decision tree, discriminant analysis, nave Bayes, support vector machine and k-nearest neighbor was experimented. The three best perming classifiers among the five ones will contribute by majority voting between their scores. Experimentations were performed on three different datasets spoken in three languages: English, German and Arabic in order to validate language independency of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
Automated diagnosis of epilepsy using CWT, HOS and texture parameters.
Acharya, U Rajendra; Yanti, Ratna; Zheng, Jia Wei; Krishnan, M Muthu Rama; Tan, Jen Hong; Martis, Roshan Joy; Lim, Choo Min
2013-06-01
Epilepsy is a chronic brain disorder which manifests as recurrent seizures. Electroencephalogram (EEG) signals are generally analyzed to study the characteristics of epileptic seizures. In this work, we propose a method for the automated classification of EEG signals into normal, interictal and ictal classes using Continuous Wavelet Transform (CWT), Higher Order Spectra (HOS) and textures. First the CWT plot was obtained for the EEG signals and then the HOS and texture features were extracted from these plots. Then the statistically significant features were fed to four classifiers namely Decision Tree (DT), K-Nearest Neighbor (KNN), Probabilistic Neural Network (PNN) and Support Vector Machine (SVM) to select the best classifier. We observed that the SVM classifier with Radial Basis Function (RBF) kernel function yielded the best results with an average accuracy of 96%, average sensitivity of 96.9% and average specificity of 97% for 23.6 s duration of EEG data. Our proposed technique can be used as an automatic seizure monitoring software. It can also assist the doctors to cross check the efficacy of their prescribed drugs.
Ozcift, Akin
2012-08-01
Parkinson disease (PD) is an age-related deterioration of certain nerve systems, which affects movement, balance, and muscle control of clients. PD is one of the common diseases which affect 1% of people older than 60 years. A new classification scheme based on support vector machine (SVM) selected features to train rotation forest (RF) ensemble classifiers is presented for improving diagnosis of PD. The dataset contains records of voice measurements from 31 people, 23 with PD and each record in the dataset is defined with 22 features. The diagnosis model first makes use of a linear SVM to select ten most relevant features from 22. As a second step of the classification model, six different classifiers are trained with the subset of features. Subsequently, at the third step, the accuracies of classifiers are improved by the utilization of RF ensemble classification strategy. The results of the experiments are evaluated using three metrics; classification accuracy (ACC), Kappa Error (KE) and Area under the Receiver Operating Characteristic (ROC) Curve (AUC). Performance measures of two base classifiers, i.e. KStar and IBk, demonstrated an apparent increase in PD diagnosis accuracy compared to similar studies in literature. After all, application of RF ensemble classification scheme improved PD diagnosis in 5 of 6 classifiers significantly. We, numerically, obtained about 97% accuracy in RF ensemble of IBk (a K-Nearest Neighbor variant) algorithm, which is a quite high performance for Parkinson disease diagnosis.
NASA Astrophysics Data System (ADS)
Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Tseng, Derek; Benien, Parul; Ozcan, Aydogan
2017-03-01
Giardia lamblia causes a disease known as giardiasis, which results in diarrhea, abdominal cramps, and bloating. Although conventional pathogen detection methods used in water analysis laboratories offer high sensitivity and specificity, they are time consuming, and need experts to operate bulky equipment and analyze the samples. Here we present a field-portable and cost-effective smartphone-based waterborne pathogen detection platform that can automatically classify Giardia cysts using machine learning. Our platform enables the detection and quantification of Giardia cysts in one hour, including sample collection, labeling, filtration, and automated counting steps. We evaluated the performance of three prototypes using Giardia-spiked water samples from different sources (e.g., reagent-grade, tap, non-potable, and pond water samples). We populated a training database with >30,000 cysts and estimated our detection sensitivity and specificity using 20 different classifier models, including decision trees, nearest neighbor classifiers, support vector machines (SVMs), and ensemble classifiers, and compared their speed of training and classification, as well as predicted accuracies. Among them, cubic SVM, medium Gaussian SVM, and bagged-trees were the most promising classifier types with accuracies of 94.1%, 94.2%, and 95%, respectively; we selected the latter as our preferred classifier for the detection and enumeration of Giardia cysts that are imaged using our mobile-phone fluorescence microscope. Without the need for any experts or microbiologists, this field-portable pathogen detection platform can present a useful tool for water quality monitoring in resource-limited-settings.
Dissimilarity representations in lung parenchyma classification
NASA Astrophysics Data System (ADS)
Sørensen, Lauge; de Bruijne, Marleen
2009-02-01
A good problem representation is important for a pattern recognition system to be successful. The traditional approach to statistical pattern recognition is feature representation. More specifically, objects are represented by a number of features in a feature vector space, and classifiers are built in this representation. This is also the general trend in lung parenchyma classification in computed tomography (CT) images, where the features often are measures on feature histograms. Instead, we propose to build normal density based classifiers in dissimilarity representations for lung parenchyma classification. This allows for the classifiers to work on dissimilarities between objects, which might be a more natural way of representing lung parenchyma. In this context, dissimilarity is defined between CT regions of interest (ROI)s. ROIs are represented by their CT attenuation histogram and ROI dissimilarity is defined as a histogram dissimilarity measure between the attenuation histograms. In this setting, the full histograms are utilized according to the chosen histogram dissimilarity measure. We apply this idea to classification of different emphysema patterns as well as normal, healthy tissue. Two dissimilarity representation approaches as well as different histogram dissimilarity measures are considered. The approaches are evaluated on a set of 168 CT ROIs using normal density based classifiers all showing good performance. Compared to using histogram dissimilarity directly as distance in a emph{k} nearest neighbor classifier, which achieves a classification accuracy of 92.9%, the best dissimilarity representation based classifier is significantly better with a classification accuracy of 97.0% (text{emph{p" border="0" class="imgtopleft"> = 0.046).
Acharya, U Rajendra; Sree, S Vinitha; Chattopadhyay, Subhagata; Yu, Wenwei; Ang, Peng Chuan Alvin
2011-06-01
Epilepsy is a common neurological disorder that is characterized by the recurrence of seizures. Electroencephalogram (EEG) signals are widely used to diagnose seizures. Because of the non-linear and dynamic nature of the EEG signals, it is difficult to effectively decipher the subtle changes in these signals by visual inspection and by using linear techniques. Therefore, non-linear methods are being researched to analyze the EEG signals. In this work, we use the recorded EEG signals in Recurrence Plots (RP), and extract Recurrence Quantification Analysis (RQA) parameters from the RP in order to classify the EEG signals into normal, ictal, and interictal classes. Recurrence Plot (RP) is a graph that shows all the times at which a state of the dynamical system recurs. Studies have reported significantly different RQA parameters for the three classes. However, more studies are needed to develop classifiers that use these promising features and present good classification accuracy in differentiating the three types of EEG segments. Therefore, in this work, we have used ten RQA parameters to quantify the important features in the EEG signals.These features were fed to seven different classifiers: Support vector machine (SVM), Gaussian Mixture Model (GMM), Fuzzy Sugeno Classifier, K-Nearest Neighbor (KNN), Naive Bayes Classifier (NBC), Decision Tree (DT), and Radial Basis Probabilistic Neural Network (RBPNN). Our results show that the SVM classifier was able to identify the EEG class with an average efficiency of 95.6%, sensitivity and specificity of 98.9% and 97.8%, respectively.
Detecting falls with wearable sensors using machine learning techniques.
Özdemir, Ahmet Turan; Barshan, Billur
2014-06-18
Falls are a serious public health problem and possibly life threatening for people in fall risk groups. We develop an automated fall detection system with wearable motion sensor units fitted to the subjects' body at six different positions. Each unit comprises three tri-axial devices (accelerometer, gyroscope, and magnetometer/compass). Fourteen volunteers perform a standardized set of movements including 20 voluntary falls and 16 activities of daily living (ADLs), resulting in a large dataset with 2520 trials. To reduce the computational complexity of training and testing the classifiers, we focus on the raw data for each sensor in a 4 s time window around the point of peak total acceleration of the waist sensor, and then perform feature extraction and reduction. Most earlier studies on fall detection employ rule-based approaches that rely on simple thresholding of the sensor outputs. We successfully distinguish falls from ADLs using six machine learning techniques (classifiers): the k-nearest neighbor (k-NN) classifier, least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), dynamic time warping (DTW), and artificial neural networks (ANNs). We compare the performance and the computational complexity of the classifiers and achieve the best results with the k-NN classifier and LSM, with sensitivity, specificity, and accuracy all above 99%. These classifiers also have acceptable computational requirements for training and testing. Our approach would be applicable in real-world scenarios where data records of indeterminate length, containing multiple activities in sequence, are recorded.
Shack-Hartmann wavefront sensor with large dynamic range.
Xia, Mingliang; Li, Chao; Hu, Lifa; Cao, Zhaoliang; Mu, Quanquan; Xuan, Li
2010-01-01
A new spot centroid detection algorithm for a Shack-Hartmann wavefront sensor (SHWFS) is experimentally investigated. The algorithm is a kind of dynamic tracking algorithm that tracks and calculates the corresponding spot centroid of the current spot map based on the spot centroid of the previous spot map, according to the strong correlation of the wavefront slope and the centroid of the corresponding spot between temporally adjacent SHWFS measurements. That is, for adjacent measurements, the spot centroid movement will usually fall within some range. Using the algorithm, the dynamic range of an SHWFS can be expanded by a factor of three in the measurement of tilt aberration compared with the conventional algorithm, more than 1.3 times in the measurement of defocus aberration, and more than 2 times in the measurement of the mixture of spherical aberration plus coma aberration. The algorithm is applied in our SHWFS to measure the distorted wavefront of the human eye. The experimental results of the adaptive optics (AO) system for retina imaging are presented to prove its feasibility for highly aberrated eyes.
Hosseinifard, Behshad; Moradi, Mohammad Hassan; Rostami, Reza
2013-03-01
Diagnosing depression in the early curable stages is very important and may even save the life of a patient. In this paper, we study nonlinear analysis of EEG signal for discriminating depression patients and normal controls. Forty-five unmedicated depressed patients and 45 normal subjects were participated in this study. Power of four EEG bands and four nonlinear features including detrended fluctuation analysis (DFA), higuchi fractal, correlation dimension and lyapunov exponent were extracted from EEG signal. For discriminating the two groups, k-nearest neighbor, linear discriminant analysis and logistic regression as the classifiers are then used. Highest classification accuracy of 83.3% is obtained by correlation dimension and LR classifier among other nonlinear features. For further improvement, all nonlinear features are combined and applied to classifiers. A classification accuracy of 90% is achieved by all nonlinear features and LR classifier. In all experiments, genetic algorithm is employed to select the most important features. The proposed technique is compared and contrasted with the other reported methods and it is demonstrated that by combining nonlinear features, the performance is enhanced. This study shows that nonlinear analysis of EEG can be a useful method for discriminating depressed patients and normal subjects. It is suggested that this analysis may be a complementary tool to help psychiatrists for diagnosing depressed patients. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Getting to know the nearest stars: Intermittent radio bursts from Ross 614
NASA Astrophysics Data System (ADS)
Winterhalter, Daniel; Knapp, Mary; Bastian, Tim
2017-04-01
Radio observations have been used as a search tool for exoplanets since before the confirmed discovery of the first extrasolar planet. To date, there have been no definitive detections of exoplanets in the radio regime. We are engaged in an ongoing blind radio survey of the nearest star systems for exoplanetary radio emission. The goal of this survey is to obtain meaningful upper limits on radio emission from (or modulated by) sub-stellar companions of the nearest stars. Nearby stars are strongly preferred because they suffer the least from the dilution of potential radio signals by distance. Targets are selected by distance and observability (both LOFAR and VLA) only. Other properties of target stars, such as stellar type, are not considered to avoid biasing the search. Five survey targets, Procyon, GJ 1111, GJ 725, Ross 614, and UGPSJ072227.51, have been observed with the VLA telescope L- and S-band receivers. P-band observations are ongoing. Of particular interest are, at this time, our observation of the Ross 614 System. Ross 614 is an M-dwarf binary system at a distance of about 13 Ly, with an orbital period of 16.6 years. The binary companions are classified as flare stars because strong radio emission has been detected from the location of the system in previous work. Analyses are in progress to determine if the intermittent burst are similar to solar-type burst, and/or if there is any evidence for emissions from sub-stellar companions.
Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection
NASA Astrophysics Data System (ADS)
Safi’ie, M. A.; Utami, E.; Fatta, H. A.
2018-03-01
Universitas Sebelas Maret has a teaching staff more than 1500 people, and one of its tasks is to carry out research. In the other side, the funding support for research and service is limited, so there is need to be evaluated to determine the Research proposal submission and devotion on society (P2M). At the selection stage, research proposal documents are collected as unstructured data and the data stored is very large. To extract information contained in the documents therein required text mining technology. This technology applied to gain knowledge to the documents by automating the information extraction. In this articles we use Latent Dirichlet Allocation (LDA) to the documents as a model in feature extraction process, to get terms that represent its documents. Hereafter we use k-Nearest Neighbour (kNN) algorithm to classify the documents based on its terms.
Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R; Guerra-Hernandez, Erick I; Almanza-Ojeda, Dora L; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J; Ibarra-Manzano, Mario A
2016-03-05
Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users where shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states.
Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R.; Guerra-Hernandez, Erick I.; Almanza-Ojeda, Dora L.; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J.; Ibarra-Manzano, Mario A.
2016-01-01
Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users where shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states. PMID:26959029
Cheng, Sara Y.; Duong, Hai V.; Compton, Campbell; Vaughn, Mark W.; Nguyen, Hoa; Cheng, Kwan H.
2015-01-01
Quantifying protein-induced lipid disruptions at the atomistic level is a challenging problem in membrane biophysics. Here we propose a novel 3D Voronoi tessellation nearest-atom-neighbor shell method to classify and characterize lipid domains into discrete concentric lipid shells surrounding membrane proteins in structurally heterogeneous lipid membranes. This method needs only the coordinates of the system and is independent of force fields and simulation conditions. As a proof-of-principle, we use this multiple lipid shell method to analyze the lipid disruption profiles of three simulated membrane systems: phosphatidylcholine, phosphatidylcholine/cholesterol, and beta-amyloid/phosphatidylcholine/cholesterol. We observed different atomic volume disruption mechanisms due to cholesterol and beta-amyloid Additionally, several lipid fractional groups and lipid-interfacial water did not converge to their control values with increasing distance or shell order from the protein. This volume divergent behavior was confirmed by bilayer thickness and chain orientational order calculations. Our method can also be used to analyze high-resolution structural experimental data. PMID:25637891
Biclustering Learning of Trading Rules.
Huang, Qinghua; Wang, Ting; Tao, Dacheng; Li, Xuelong
2015-10-01
Technical analysis with numerous indicators and patterns has been regarded as important evidence for making trading decisions in financial markets. However, it is extremely difficult for investors to find useful trading rules based on numerous technical indicators. This paper innovatively proposes the use of biclustering mining to discover effective technical trading patterns that contain a combination of indicators from historical financial data series. This is the first attempt to use biclustering algorithm on trading data. The mined patterns are regarded as trading rules and can be classified as three trading actions (i.e., the buy, the sell, and no-action signals) with respect to the maximum support. A modified K nearest neighborhood ( K -NN) method is applied to classification of trading days in the testing period. The proposed method [called biclustering algorithm and the K nearest neighbor (BIC- K -NN)] was implemented on four historical datasets and the average performance was compared with the conventional buy-and-hold strategy and three previously reported intelligent trading systems. Experimental results demonstrate that the proposed trading system outperforms its counterparts and will be useful for investment in various financial markets.
Bonet, Isis; Franco-Montero, Pedro; Rivero, Virginia; Teijeira, Marta; Borges, Fernanda; Uriarte, Eugenio; Morales Helguera, Aliuska
2013-12-23
A(2B) adenosine receptor antagonists may be beneficial in treating diseases like asthma, diabetes, diabetic retinopathy, and certain cancers. This has stimulated research for the development of potent ligands for this subtype, based on quantitative structure-affinity relationships. In this work, a new ensemble machine learning algorithm is proposed for classification and prediction of the ligand-binding affinity of A(2B) adenosine receptor antagonists. This algorithm is based on the training of different classifier models with multiple training sets (composed of the same compounds but represented by diverse features). The k-nearest neighbor, decision trees, neural networks, and support vector machines were used as single classifiers. To select the base classifiers for combining into the ensemble, several diversity measures were employed. The final multiclassifier prediction results were computed from the output obtained by using a combination of selected base classifiers output, by utilizing different mathematical functions including the following: majority vote, maximum and average probability. In this work, 10-fold cross- and external validation were used. The strategy led to the following results: i) the single classifiers, together with previous features selections, resulted in good overall accuracy, ii) a comparison between single classifiers, and their combinations in the multiclassifier model, showed that using our ensemble gave a better performance than the single classifier model, and iii) our multiclassifier model performed better than the most widely used multiclassifier models in the literature. The results and statistical analysis demonstrated the supremacy of our multiclassifier approach for predicting the affinity of A(2B) adenosine receptor antagonists, and it can be used to develop other QSAR models.
Consensus Classification Using Non-Optimized Classifiers.
Brownfield, Brett; Lemos, Tony; Kalivas, John H
2018-04-03
Classifying samples into categories is a common problem in analytical chemistry and other fields. Classification is usually based on only one method, but numerous classifiers are available with some being complex, such as neural networks, and others are simple, such as k nearest neighbors. Regardless, most classification schemes require optimization of one or more tuning parameters for best classification accuracy, sensitivity, and specificity. A process not requiring exact selection of tuning parameter values would be useful. To improve classification, several ensemble approaches have been used in past work to combine classification results from multiple optimized single classifiers. The collection of classifications for a particular sample are then combined by a fusion process such as majority vote to form the final classification. Presented in this Article is a method to classify a sample by combining multiple classification methods without specifically classifying the sample by each method, that is, the classification methods are not optimized. The approach is demonstrated on three analytical data sets. The first is a beer authentication set with samples measured on five instruments, allowing fusion of multiple instruments by three ways. The second data set is composed of textile samples from three classes based on Raman spectra. This data set is used to demonstrate the ability to classify simultaneously with different data preprocessing strategies, thereby reducing the need to determine the ideal preprocessing method, a common prerequisite for accurate classification. The third data set contains three wine cultivars for three classes measured at 13 unique chemical and physical variables. In all cases, fusion of nonoptimized classifiers improves classification. Also presented are atypical uses of Procrustes analysis and extended inverted signal correction (EISC) for distinguishing sample similarities to respective classes.
Basavanhally, Ajay; Viswanath, Satish; Madabhushi, Anant
2015-01-01
Clinical trials increasingly employ medical imaging data in conjunction with supervised classifiers, where the latter require large amounts of training data to accurately model the system. Yet, a classifier selected at the start of the trial based on smaller and more accessible datasets may yield inaccurate and unstable classification performance. In this paper, we aim to address two common concerns in classifier selection for clinical trials: (1) predicting expected classifier performance for large datasets based on error rates calculated from smaller datasets and (2) the selection of appropriate classifiers based on expected performance for larger datasets. We present a framework for comparative evaluation of classifiers using only limited amounts of training data by using random repeated sampling (RRS) in conjunction with a cross-validation sampling strategy. Extrapolated error rates are subsequently validated via comparison with leave-one-out cross-validation performed on a larger dataset. The ability to predict error rates as dataset size increases is demonstrated on both synthetic data as well as three different computational imaging tasks: detecting cancerous image regions in prostate histopathology, differentiating high and low grade cancer in breast histopathology, and detecting cancerous metavoxels in prostate magnetic resonance spectroscopy. For each task, the relationships between 3 distinct classifiers (k-nearest neighbor, naive Bayes, Support Vector Machine) are explored. Further quantitative evaluation in terms of interquartile range (IQR) suggests that our approach consistently yields error rates with lower variability (mean IQRs of 0.0070, 0.0127, and 0.0140) than a traditional RRS approach (mean IQRs of 0.0297, 0.0779, and 0.305) that does not employ cross-validation sampling for all three datasets. PMID:25993029
Concurrent Timbres in Orchestration: a Perceptual Study of Factors Determining "blend"
NASA Astrophysics Data System (ADS)
Sandell, Gregory John
Orchestration often involves selecting instruments for concurrent presentation, as in melodic doubling or chords. One evaluation of the aural outcome of such choices is along the continuum of "blend": whether the instruments fuse into a single composite timbre, segregate into distinct timbral entities, or fall somewhere in between the two extremes. This study investigates, through perceptual experimentation, the acoustical correlates of blend for 15 natural-sounding orchestral instruments presented in concurrently-sounding pairs (e.g. flute-cello, trumpet -oboe, etc.). Ratings of blend showed primary effects for centroid (the location of the midpoint of the spectral energy distribution) and duration of the onset for the tones. Lower average values of both centroid and onset duration for a pair of tones led to increased blends, as did closeness in value for the two factors. Blend decreased (instruments segregated) with higher average values or increased difference in value for the two factors. The musical interval of presentation slightly affected the relative importance of these two mechanisms, with unison intervals determined more by lower average centroid, and minor thirds determined more by closeness in centroid. The contribution of onset in general was slightly more pronounced in the unison conditions than in the minor third condition. Additional factors contributing to blend were correlation of amplitude and centroid envelopes (blend increased as temporal patterns rose and fell in synchrony) and similarity in the overall amount of fundamental frequency perturbation (decreased blend with increasing jitter from both tones). To confirm the importance of centroid as an independent factor determining blend, pairs of tones including instruments with artificially changed centroids were rated for blend. Judgments for several versions of the same instrument pair showed that blend decreased as the altered instrument increased in centroid, corroborating the earlier experiments. Other factors manipulated were amplitude level and the degree of inharmonicity. A survey of orchestration manuals showed many illustrations of "blending" combinations of instruments that were consistent with the results of these experiments. This study's acoustically-based guidelines for blend augment instance-based methods of traditional orchestration teaching, providing underlying abstractions helpful for evaluating the blend of arbitrary combinations of instruments.
Fast Query-Optimized Kernel-Machine Classification
NASA Technical Reports Server (NTRS)
Mazzoni, Dominic; DeCoste, Dennis
2004-01-01
A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data. SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic curse of dimensionality. However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives. The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only small fraction of the nearest support vectors (SVs) of a query. The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs. Before query time, there are gathered statistics about how misleading the output of the k-NN model can be, relative to the outputs of the exact KM for a representative set of examples, for each possible k from 1 to the total number of SVs. From these statistics, there are derived upper and lower thresholds for each step k. These thresholds identify output levels for which the particular variant of the k-NN model already leans so strongly positively or negatively that a reversal in sign is unlikely, given the weaker SV neighbors still remaining. At query time, the partial output of each query is incrementally updated, stopping as soon as it exceeds the predetermined statistical thresholds of the current step. For an easy query, stopping can occur as early as step k = 1. For more difficult queries, stopping might not occur until nearly all SVs are touched. A key empirical observation is that this approach can tolerate very approximate nearest-neighbor orderings. In experiments, SVs and queries were projected to a subspace comprising the top few principal- component dimensions and neighbor orderings were computed in that subspace. This approach ensured that the overhead of the nearest-neighbor computations was insignificant, relative to that of the exact KM computation.
Predicting hepatotoxicity using ToxCast in vitro bioactivity and ...
Background: The U.S. EPA ToxCastTM program is screening thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors then used supervised machine learning to predict their hepatotoxic effects.Results: A set of 677 chemicals were represented by 711 in vitro bioactivity descriptors (from ToxCast assays), 4,376 chemical structure descriptors (from QikProp, OpenBabel, PADEL, and PubChem), and three hepatotoxicity categories (from animal studies). Hepatotoxicants were defined by rat liver histopathology observed after chronic chemical testing and grouped into hypertrophy (161), injury (101) and proliferative lesions (99). Classifiers were built using six machine learning algorithms: linear discriminant analysis (LDA), Naïve Bayes (NB), support vector classification (SVM), classification and regression trees (CART), k-nearest neighbors (KNN) and an ensemble of classifiers (ENSMB). Classifiers of hepatotoxicity were built using chemical structure, ToxCast bioactivity, and a hybrid representation. Predictive performance was evaluated using 10-fold cross-validation testing and in-loop, filter-based, feature subset selection. Hybrid classifiers had the best balanced accuracy for predicting hypertrophy (0.78±0.08), injury (0.73±0.10) and proliferative lesions (0.72±0.09). Though chemical and bioactivity class
Accelerometer and Camera-Based Strategy for Improved Human Fall Detection.
Zerrouki, Nabil; Harrou, Fouzi; Sun, Ying; Houacine, Amrane
2016-12-01
In this paper, we address the problem of detecting human falls using anomaly detection. Detection and classification of falls are based on accelerometric data and variations in human silhouette shape. First, we use the exponentially weighted moving average (EWMA) monitoring scheme to detect a potential fall in the accelerometric data. We used an EWMA to identify features that correspond with a particular type of fall allowing us to classify falls. Only features corresponding with detected falls were used in the classification phase. A benefit of using a subset of the original data to design classification models minimizes training time and simplifies models. Based on features corresponding to detected falls, we used the support vector machine (SVM) algorithm to distinguish between true falls and fall-like events. We apply this strategy to the publicly available fall detection databases from the university of Rzeszow's. Results indicated that our strategy accurately detected and classified fall events, suggesting its potential application to early alert mechanisms in the event of fall situations and its capability for classification of detected falls. Comparison of the classification results using the EWMA-based SVM classifier method with those achieved using three commonly used machine learning classifiers, neural network, K-nearest neighbor and naïve Bayes, proved our model superior.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalkwarf, D.R.
1980-05-01
Airborne uranium products were collected at the perimeter of the uranium-conversion plant operated by the Allied Chemical Corporation at Metropolis, Illinois, and the dissolution rates of these products were classified in terms of the ICRP Task Group Lung Model. Assignments were based on measurements of the dissolution half-times exhibited by uranium components of the dust samples as they dissolved in simulated lung fluid at 37/sup 0/C. Based on three trials, the dissolution behavior of dust with aerodynamic equivalent diameter (AED) less than 5.5 ..mu..m and collected nearest the closest residence to the plant was classified 0.40 D, 0.60 Y. Basedmore » on two trials, the dissolution behavior of dust with AED greater than 5.5 ..mu..m and collected at this location was classified 0.37 D, 0.63 Y. Based on one trial, the dissolution behavior of dust with AED less than 5.5 ..mu..m and collected at a location on the opposite side of the plant was classified 0.68 D, 0.32 Y. There was some evidence for adsorption of dissolved uranium onto other dust components during dissolution, and preliminary dissolution trials are recommended for future samples in order to optimize the fluid replacement schedule.« less
Wheeler, Stephanie B; Kuo, Tzy-Mey; Durham, Danielle; Frizzelle, Brian; Reeder-Hayes, Katherine; Meyer, Anne-Marie
2014-01-01
Distance to oncology service providers and rurality may affect receipt of guideline-recommended radiation therapy (RT), but the extent to which these factors affect the care of Medicare-insured patients is unknown. Using cancer registry data linked to Medicare claims from the Integrated Cancer Information and Surveillance System (ICISS), we identified all women aged 65 years or older who were diagnosed with stage I, II, or III breast cancer from 2003 through 2005, who had Medicare claims through 2006, and who were clinically eligible for RT. We geocoded the address of each RT service provider's practice location and calculated the travel distance from each patient's residential address to the nearest RT provider. We used ZIP codes to classify each patient's residence as rural or urban according to rural- urban commuting area codes. We used generalized estimating equations models with county-level clustering and interaction terms between distance categories and rural-urban status to estimate the effect of distance to care and rural-urban status on receipt of RT. In urban areas, increasing distance to the nearest RT provider was associated with a lower likelihood of receiving RT (odds ratio [OR] = 0.54; 95% confidence interval [CI], 0.30-0.97) for those living more than 20 miles from the nearest RT provider compared with those living less than 10 miles away. In rural areas, those living within 10-20 miles of the nearest RT provider were more likely to receive RT than those living less than 10 miles away (OR = 1.73; 95% CI, 1.08-2.76). Results may not be generalizable to areas outside North Carolina or to non-Medicare populations. Coordinated outreach programs targeted differently to rural and urban patients may be necessary to improve the quality of oncology care.
Analysis of the hand vein pattern for people recognition
NASA Astrophysics Data System (ADS)
Castro-Ortega, R.; Toxqui-Quitl, C.; Cristóbal, G.; Marcos, J. Victor; Padilla-Vivanco, A.; Hurtado Pérez, R.
2015-09-01
The shape of the hand vascular pattern contains useful and unique features that can be used for identifying and authenticating people, with applications in access control, medicine and financial services. In this work, an optical system for the image acquisition of the hand vascular pattern is implemented. It consists of a CCD camera with sensitivity in the IR and a light source with emission in the 880 nm. The IR radiation interacts with the desoxyhemoglobin, hemoglobin and water present in the blood of the veins, making possible to see the vein pattern underneath skin. The segmentation of the Region Of Interest (ROI) is achieved using geometrical moments locating the centroid of an image. For enhancement of the vein pattern we use the technique of Histogram Equalization and Contrast Limited Adaptive Histogram Equalization (CLAHE). In order to remove unnecessary information such as body hair and skinfolds, a low pass filter is implemented. A method based on geometric moments is used to obtain the invariant descriptors of the input images. The classification task is achieved using Artificial Neural Networks (ANN) and K-Nearest Neighbors (K-nn) algorithms. Experimental results using our database show a percentage of correct classification, higher of 86.36% with ANN for 912 images of 38 people with 12 versions each one.
Zhang, Xu; Zhao, Yufeng; Xu, Jia; Xue, Zhengsheng; Zhang, Menghui; Pang, Xiaoyan; Zhang, Xiaojun; Zhao, Liping
2015-09-23
Accumulating evidence suggests that the gut microbiota is an important factor in mediating the development of obesity-related metabolic disorders, including type 2 diabetes. Metformin and berberine, two clinically effective drugs for treating diabetes, have recently been shown to exert their actions through modulating the gut microbiota. In this study, we demonstrated that metformin and berberine similarly shifted the overall structure of the gut microbiota in rats. Both drugs showed reverting effects on the high-fat diet-induced structural changes of gut microbiota. The diversity of gut microbiota was significantly reduced by both berberine- and metformin-treatments. Nearest shrunken centroids analysis identified 134 operational taxonomic units (OTUs) responding to the treatments, which showed close associations with the changes of obese phenotypes. Sixty out of the 134 OTUs were decreased by both drugs, while those belonging to putative short-chain fatty acids (SCFA)-producing bacteria, including Allobaculum, Bacteriodes, Blautia, Butyricoccus, and Phascolarctobacterium, were markedly increased by both berberine and, to a lesser extent, metformin. Taken together, our findings suggest that berberine and metformin showed similarity in modulating the gut microbiota, including the enrichment of SCFA-producing bacteria and reduction of microbial diversity, which may contribute to their beneficial effects to the host.
Yard flooding by irrigation canals increased the risk of West Nile disease in El Paso, Texas
Cardenas, Victor M.; Jaime, Javier; Ford, Paula B.; Gonzalez, Fernando J.; Carrillo, Irma; Gallegos, Jorge E.; Watts, Douglas M.
2011-01-01
Purpose To investigate the effects of use of water from irrigation canals to flood residential yards on the risk of West Nile disease in El Paso, Texas. Methods West Nile disease confirmed cases in 2009–2010 were compared with a random sample of 50 residents of the county according to access to and use of water from irrigation canals by subjects or their neighbors, as well as geo-referenced closest distance between their home address and the nearest irrigation canal. A windshield survey of 600 meters around the study subjects’ home address recorded the presence of irrigation canals. The distance from the residence of 182 confirmed cases of West Nile disease reported in 2003–2010 to canals was compared to that of the centroids of 182 blocks selected at random. Results Cases were more likely than controls to report their neighbors flooded their yards with water from canals. Irrigation canals were more often observed in neighborhoods of cases than of controls. Using the set of addresses of 182 confirmed cases and 182 hypothetic controls the authors found a statistically significant inverse relation with risk of West Nile disease. Conclusions Flooding of yards with water from canals increased the risk of West Nile disease. PMID:21943648
Yard flooding by irrigation canals increased the risk of West Nile disease in El Paso, Texas.
Cardenas, Victor M; Jaime, Javier; Ford, Paula B; Gonzalez, Fernando J; Carrillo, Irma; Gallegos, Jorge E; Watts, Douglas M
2011-12-01
To investigate the effects of use of water from irrigation canals to flood residential yards on the risk of West Nile disease in El Paso, Texas. West Nile disease confirmed cases in 2009 through 2010 were compared with a random sample of 50 residents of the county according to access to and use of water from irrigation canals by subjects or their neighbors, as well as geo-referenced closest distance between their home address and the nearest irrigation canal. A windshield survey of 600 m around the study subjects' home address recorded the presence of irrigation canals. The distance from the residence of 182 confirmed cases of West Nile disease reported in 2003 through 2010 to canals was compared with that of the centroids of 182 blocks selected at random. Cases were more likely than controls to report their neighbors flooded their yards with water from canals. Irrigation canals were more often observed in neighborhoods of cases than of controls. Using the set of addresses of 182 confirmed cases and 182 hypothetical controls the authors found a significant, inverse relation with risk of West Nile disease. Flooding of yards with water from canals increased the risk of West Nile disease. Copyright © 2011 Elsevier Inc. All rights reserved.
[Schizophrenia or spiritual crisis? On "raising the kundalini" and its diagnostic classification].
Hansen, G
1995-07-31
Two patients are described who had been diagnosed as schizophrenic, but had actually instead been going through spiritual crises, which in Eastern spiritual tradition are called raising the kundalini. Perhaps this experience is not a disease, but many--especially if not understood by oneself, the nearest relations and the medical profession--cause mental illness. In WHO ICD-10 the experience could be classified as F48.8, disordines neurotici specificati alii. The process falls outside the categories of both normal and psychotic. When allowed to progress to completion this process culminates in deep psychological balance, strength, and maturity.
Robotic situational awareness of actions in human teaming
NASA Astrophysics Data System (ADS)
Tahmoush, Dave
2015-06-01
When robots can sense and interpret the activities of the people they are working with, they become more of a team member and less of just a piece of equipment. This has motivated work on recognizing human actions using existing robotic sensors like short-range ladar imagers. These produce three-dimensional point cloud movies which can be analyzed for structure and motion information. We skeletonize the human point cloud and apply a physics-based velocity correlation scheme to the resulting joint motions. The twenty actions are then recognized using a nearest-neighbors classifier that achieves good accuracy.
Generic Design Procedures for the Repair of Acoustically Damaged Panels
2008-12-01
plate for component 1 h2 Thickness of plate for component 2 h3 Thickness of plate for component 3 h13 Distance from centroid of component 1 to centroid...E1 View AA Simply supported/clamped plate h13 Ly Lx y x d3 d1 y 2a Figure 4: Geometry for constrained layer damping of a simply...dimensions, properties and parameters Physical dimensions (Figure 4) Material properties Key parameters h1, h2 , h3 , h13 , Lx , Ly , 2a E1 , E3 , G2
Ambiguity Of Doppler Centroid In Synthetic-Aperture Radar
NASA Technical Reports Server (NTRS)
Chang, Chi-Yung; Curlander, John C.
1991-01-01
Paper discusses performances of two algorithms for resolution of ambiguity in estimated Doppler centroid frequency of echoes in synthetic-aperture radar. One based on range-cross-correlation technique, other based on multiple-pulse-repetition-frequency technique.
Shaffer, Franklin D.
2013-03-12
The application relates to particle trajectory recognition from a Centroid Population comprised of Centroids having an (x, y, t) or (x, y, f) coordinate. The method is applicable to visualization and measurement of particle flow fields of high particle. In one embodiment, the centroids are generated from particle images recorded on camera frames. The application encompasses digital computer systems and distribution mediums implementing the method disclosed and is particularly applicable to recognizing trajectories of particles in particle flows of high particle concentration. The method accomplishes trajectory recognition by forming Candidate Trajectory Trees and repeated searches at varying Search Velocities, such that initial search areas are set to a minimum size in order to recognize only the slowest, least accelerating particles which produce higher local concentrations. When a trajectory is recognized, the centroids in that trajectory are removed from consideration in future searches.
Centroid estimation for a Shack-Hartmann wavefront sensor based on stream processing.
Kong, Fanpeng; Polo, Manuel Cegarra; Lambert, Andrew
2017-08-10
Using center of gravity to estimate the centroid of the spot in a Shack-Hartmann wavefront sensor, the measurement corrupts with photon and detector noise. Parameters, like window size, often require careful optimization to balance the noise error, dynamic range, and linearity of the response coefficient under different photon flux. It also needs to be substituted by the correlation method for extended sources. We propose a centroid estimator based on stream processing, where the center of gravity calculation window floats with the incoming pixel from the detector. In comparison with conventional methods, we show that the proposed estimator simplifies the choice of optimized parameters, provides a unit linear coefficient response, and reduces the influence of background and noise. It is shown that the stream-based centroid estimator also works well for limited size extended sources. A hardware implementation of the proposed estimator is discussed.
Effects of window size and shape on accuracy of subpixel centroid estimation of target images
NASA Technical Reports Server (NTRS)
Welch, Sharon S.
1993-01-01
A new algorithm is presented for increasing the accuracy of subpixel centroid estimation of (nearly) point target images in cases where the signal-to-noise ratio is low and the signal amplitude and shape vary from frame to frame. In the algorithm, the centroid is calculated over a data window that is matched in width to the image distribution. Fourier analysis is used to explain the dependency of the centroid estimate on the size of the data window, and simulation and experimental results are presented which demonstrate the effects of window size for two different noise models. The effects of window shape were also investigated for uniform and Gaussian-shaped windows. The new algorithm was developed to improve the dynamic range of a close-range photogrammetric tracking system that provides feedback for control of a large gap magnetic suspension system (LGMSS).
Photometric analysis in the Kepler Science Operations Center pipeline
NASA Astrophysics Data System (ADS)
Twicken, Joseph D.; Clarke, Bruce D.; Bryson, Stephen T.; Tenenbaum, Peter; Wu, Hayley; Jenkins, Jon M.; Girouard, Forrest; Klaus, Todd C.
2010-07-01
We describe the Photometric Analysis (PA) software component and its context in the Kepler Science Operations Center (SOC) Science Processing Pipeline. The primary tasks of this module are to compute the photometric flux and photocenters (centroids) for over 160,000 long cadence (~thirty minute) and 512 short cadence (~one minute) stellar targets from the calibrated pixels in their respective apertures. We discuss science algorithms for long and short cadence PA: cosmic ray cleaning; background estimation and removal; aperture photometry; and flux-weighted centroiding. We discuss the end-to-end propagation of uncertainties for the science algorithms. Finally, we present examples of photometric apertures, raw flux light curves, and centroid time series from Kepler flight data. PA light curves, centroid time series, and barycentric timestamp corrections are exported to the Multi-mission Archive at Space Telescope [Science Institute] (MAST) and are made available to the general public in accordance with the NASA/Kepler data release policy.
Photometric Analysis in the Kepler Science Operations Center Pipeline
NASA Technical Reports Server (NTRS)
Twicken, Joseph D.; Clarke, Bruce D.; Bryson, Stephen T.; Tenenbaum, Peter; Wu, Hayley; Jenkins, Jon M.; Girouard, Forrest; Klaus, Todd C.
2010-01-01
We describe the Photometric Analysis (PA) software component and its context in the Kepler Science Operations Center (SOC) pipeline. The primary tasks of this module are to compute the photometric flux and photocenters (centroids) for over 160,000 long cadence (thirty minute) and 512 short cadence (one minute) stellar targets from the calibrated pixels in their respective apertures. We discuss the science algorithms for long and short cadence PA: cosmic ray cleaning; background estimation and removal; aperture photometry; and flux-weighted centroiding. We discuss the end-to-end propagation of uncertainties for the science algorithms. Finally, we present examples of photometric apertures, raw flux light curves, and centroid time series from Kepler flight data. PA light curves, centroid time series, and barycentric timestamp corrections are exported to the Multi-mission Archive at Space Telescope [Science Institute] (MAST) and are made available to the general public in accordance with the NASA/Kepler data release policy.
NASA Astrophysics Data System (ADS)
Dotani, T.
1989-11-01
Strong Quasi-Periodic Oscillations (QPO) in type 2 bursts from the rapid burster with Ginga were detected. The QPD have centroid frequency of approximately 5 and 2 Hz during bursts which lasted for approximately 10 and 30 sec, respectively. The QPO observations were analyzed and the following results were obtained: QPO centroid frequencies have some correlation with burst duration and peak count rate, however the correlations are complicated and the burst parameters do not uniquely determine the QPO centroid frequency; the appearance of the QPO is closely related to the so-called timescale-invariant profile of the bursts; the QPO are significant only in the even numbered peaks of the profile and not in the odd numbered peaks; in most cases the QPO centroid frequency decreases up to approximately 25 percent during a burst. The energy spectra at the QPO peaks and valleys were investigated and the QPO peaks were found to have significantely higher blackbody temperature than the QPD valleys.
Amaral, Jorge L M; Lopes, Agnaldo J; Jansen, José M; Faria, Alvaro C D; Melo, Pedro L
2013-12-01
The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers, 28 healthy and 28 smokers with low tobacco consumption. Many supervised learning techniques were investigated, including logistic linear classifiers, k nearest neighbor (KNN), neural networks and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate parameter was established as baseline. To determine the best input features and classifier parameters, we used genetic algorithms and a 10-fold cross-validation using the average area under the ROC curve (AUC). In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN=0.89 and SVM=0.87) compared with the baseline (0.77). The second experiment performed a feature selection on the original FOT parameters. This selection did not cause any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed a feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN=SVM=0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Ali, Safdar; Majid, Abdul
2015-04-01
The diagnostic of human breast cancer is an intricate process and specific indicators may produce negative results. In order to avoid misleading results, accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches are proposed for prediction of breast cancer. To this end, we developed a novel classifier stacking based evolutionary ensemble system "Can-Evo-Ens" for predicting amino acid sequences associated with breast cancer. In this paper, first, we selected four diverse-type of ML algorithms of Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimal combine the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of Can-Evo-Ens system for independent validation dataset. The proposed system has achieved the highest value of Area Under Curve (AUC) of ROC Curve of 99.95% for cancer prediction. The comparative results revealed that proposed approach is better than individual ML approaches and conventional ensemble approaches of AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. Copyright © 2015 Elsevier Inc. All rights reserved.
2014-01-01
Background Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosis respiratory pathologies using respiratory sounds from R.A.L.E database. Results The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Conclusion Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database. PMID:24970564
Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian
2014-06-27
Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosis respiratory pathologies using respiratory sounds from R.A.L.E database. The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database.
Anam, Khairul; Al-Jumaily, Adel
2017-01-01
The success of myoelectric pattern recognition (M-PR) mostly relies on the features extracted and classifier employed. This paper proposes and evaluates a fast classifier, extreme learning machine (ELM), to classify individual and combined finger movements on amputees and non-amputees. ELM is a single hidden layer feed-forward network (SLFN) that avoids iterative learning by determining input weights randomly and output weights analytically. Therefore, it can accelerate the training time of SLFNs. In addition to the classifier evaluation, this paper evaluates various feature combinations to improve the performance of M-PR and investigate some feature projections to improve the class separability of the features. Different from other studies on the implementation of ELM in the myoelectric controller, this paper presents a complete and thorough investigation of various types of ELMs including the node-based and kernel-based ELM. Furthermore, this paper provides comparisons of ELMs and other well-known classifiers such as linear discriminant analysis (LDA), k-nearest neighbour (kNN), support vector machine (SVM) and least-square SVM (LS-SVM). The experimental results show the most accurate ELM classifier is radial basis function ELM (RBF-ELM). The comparison of RBF-ELM and other well-known classifiers shows that RBF-ELM is as accurate as SVM and LS-SVM but faster than the SVM family; it is superior to LDA and kNN. The experimental results also indicate that the accuracy gap of the M-PR on the amputees and non-amputees is not too much with the accuracy of 98.55% on amputees and 99.5% on the non-amputees using six electromyography (EMG) channels. Copyright © 2016 Elsevier Ltd. All rights reserved.
Online breakage detection of multitooth tools using classifier ensembles for imbalanced data
NASA Astrophysics Data System (ADS)
Bustillo, Andrés; Rodríguez, Juan J.
2014-12-01
Cutting tool breakage detection is an important task, due to its economic impact on mass production lines in the automobile industry. This task presents a central limitation: real data-sets are extremely imbalanced because breakage occurs in very few cases compared with normal operation of the cutting process. In this paper, we present an analysis of different data-mining techniques applied to the detection of insert breakage in multitooth tools. The analysis applies only one experimental variable: the electrical power consumption of the tool drive. This restriction profiles real industrial conditions more accurately than other physical variables, such as acoustic or vibration signals, which are not so easily measured. Many efforts have been made to design a method that is able to identify breakages with a high degree of reliability within a short period of time. The solution is based on classifier ensembles for imbalanced data-sets. Classifier ensembles are combinations of classifiers, which in many situations are more accurate than individual classifiers. Six different base classifiers are tested: Decision Trees, Rules, Naïve Bayes, Nearest Neighbour, Multilayer Perceptrons and Logistic Regression. Three different balancing strategies are tested with each of the classifier ensembles and compared to their performance with the original data-set: Synthetic Minority Over-Sampling Technique (SMOTE), undersampling and a combination of SMOTE and undersampling. To identify the most suitable data-mining solution, Receiver Operating Characteristics (ROC) graph and Recall-precision graph are generated and discussed. The performance of logistic regression ensembles on the balanced data-set using the combination of SMOTE and undersampling turned out to be the most suitable technique. Finally a comparison using industrial performance measures is presented, which concludes that this technique is also more suited to this industrial problem than the other techniques presented in the bibliography.
NASA Astrophysics Data System (ADS)
Wang, Yujie; Wang, Zhen; Wang, Yanli; Liu, Taigang; Zhang, Wenbing
2018-01-01
The thermodynamic and kinetic parameters of an RNA base pair with different nearest and next nearest neighbors were obtained through long-time molecular dynamics simulation of the opening-closing switch process of the base pair near its melting temperature. The results indicate that thermodynamic parameters of GC base pair are dependent on the nearest neighbor base pair, and the next nearest neighbor base pair has little effect, which validated the nearest-neighbor model. The closing and opening rates of the GC base pair also showed nearest neighbor dependences. At certain temperature, the closing and opening rates of the GC pair with nearest neighbor AU is larger than that with the nearest neighbor GC, and the next nearest neighbor plays little role. The free energy landscape of the GC base pair with the nearest neighbor GC is rougher than that with nearest neighbor AU.
A multiple-point spatially weighted k-NN method for object-based classification
NASA Astrophysics Data System (ADS)
Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.
2016-10-01
Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak.
Donoho, David; Jin, Jiashun
2008-09-30
In important application fields today-genomics and proteomics are examples-selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2, ..., p, let pi(i) denote the two-sided P-value associated with the ith feature Z-score and pi((i)) denote the ith order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective (i/p - pi((i)))/sqrt{i/p(1-i/p)}. We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT.
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
Donoho, David; Jin, Jiashun
2008-01-01
In important application fields today—genomics and proteomics are examples—selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2, …, p, let πi denote the two-sided P-value associated with the ith feature Z-score and π(i) denote the ith order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective (i/p − π(i))/i/p(1−i/p). We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT. PMID:18815365
Centroids evaluation of the images obtained with the conical null-screen corneal topographer
NASA Astrophysics Data System (ADS)
Osorio-Infante, Arturo I.; Armengol-Cruz, Victor de Emanuel; Campos-García, Manuel; Cossio-Guerrero, Cesar; Marquez-Flores, Jorge; Díaz-Uribe, José Rufino
2016-09-01
In this work, we propose some algorithms to recover the centroids of the resultant image obtained by a conical nullscreen based corneal topographer. With these algorithms, we obtain the region of interest (roi) of the original image and using an image-processing algorithm, we calculate the geometric centroid of each roi. In order to improve our algorithm performance, we use different settings of null-screen targets, changing their size and number. We also improved the illumination system to avoid inhomogeneous zones in the corneal images. Finally, we report some corneal topographic measurements with the best setting we found.
An Accurate Centroiding Algorithm for PSF Reconstruction
NASA Astrophysics Data System (ADS)
Lu, Tianhuan; Luo, Wentao; Zhang, Jun; Zhang, Jiajun; Li, Hekun; Dong, Fuyu; Li, Yingke; Liu, Dezi; Fu, Liping; Li, Guoliang; Fan, Zuhui
2018-07-01
In this work, we present a novel centroiding method based on Fourier space Phase Fitting (FPF) for Point Spread Function (PSF) reconstruction. We generate two sets of simulations to test our method. The first set is generated by GalSim with an elliptical Moffat profile and strong anisotropy that shifts the center of the PSF. The second set of simulations is drawn from CFHT i band stellar imaging data. We find non-negligible anisotropy from CFHT stellar images, which leads to ∼0.08 scatter in units of pixels using a polynomial fitting method (Vakili & Hogg). When we apply the FPF method to estimate the centroid in real space, the scatter reduces to ∼0.04 in S/N = 200 CFHT-like sample. In low signal-to-noise ratio (S/N; 50 and 100) CFHT-like samples, the background noise dominates the shifting of the centroid; therefore, the scatter estimated from different methods is similar. We compare polynomial fitting and FPF using GalSim simulation with optical anisotropy. We find that in all S/N (50, 100, and 200) samples, FPF performs better than polynomial fitting by a factor of ∼3. In general, we suggest that in real observations there exists anisotropy that shifts the centroid, and thus, the FPF method provides a better way to accurately locate it.
Accuracy of Shack-Hartmann wavefront sensor using a coherent wound fibre image bundle
NASA Astrophysics Data System (ADS)
Zheng, Jessica R.; Goodwin, Michael; Lawrence, Jon
2018-03-01
Shack-Hartmannwavefront sensors using wound fibre image bundles are desired for multi-object adaptive optical systems to provide large multiplex positioned by Starbugs. The use of a large-sized wound fibre image bundle provides the flexibility to use more sub-apertures wavefront sensor for ELTs. These compact wavefront sensors take advantage of large focal surfaces such as the Giant Magellan Telescope. The focus of this paper is to study the wound fibre image bundle structure defects effect on the centroid measurement accuracy of a Shack-Hartmann wavefront sensor. We use the first moment centroid method to estimate the centroid of a focused Gaussian beam sampled by a simulated bundle. Spot estimation accuracy with wound fibre image bundle and its structure impact on wavefront measurement accuracy statistics are addressed. Our results show that when the measurement signal-to-noise ratio is high, the centroid measurement accuracy is dominated by the wound fibre image bundle structure, e.g. tile angle and gap spacing. For the measurement with low signal-to-noise ratio, its accuracy is influenced by the read noise of the detector instead of the wound fibre image bundle structure defects. We demonstrate this both with simulation and experimentally. We provide a statistical model of the centroid and wavefront error of a wound fibre image bundle found through experiment.
Webcam mouse using face and eye tracking in various illumination environments.
Lin, Yuan-Pin; Chao, Yi-Ping; Lin, Chung-Chih; Chen, Jyh-Horng
2005-01-01
Nowadays, due to enhancement of computer performance and popular usage of webcam devices, it has become possible to acquire users' gestures for the human-computer-interface with PC via webcam. However, the effects of illumination variation would dramatically decrease the stability and accuracy of skin-based face tracking system; especially for a notebook or portable platform. In this study we present an effective illumination recognition technique, combining K-Nearest Neighbor classifier and adaptive skin model, to realize the real-time tracking system. We have demonstrated that the accuracy of face detection based on the KNN classifier is higher than 92% in various illumination environments. In real-time implementation, the system successfully tracks user face and eyes features at 15 fps under standard notebook platforms. Although KNN classifier only initiates five environments at preliminary stage, the system permits users to define and add their favorite environments to KNN for computer access. Eventually, based on this efficient tracking algorithm, we have developed a "Webcam Mouse" system to control the PC cursor using face and eye tracking. Preliminary studies in "point and click" style PC web games also shows promising applications in consumer electronic markets in the future.
Automated 3D Phenotype Analysis Using Data Mining
Plyusnin, Ilya; Evans, Alistair R.; Karme, Aleksis; Gionis, Aristides; Jernvall, Jukka
2008-01-01
The ability to analyze and classify three-dimensional (3D) biological morphology has lagged behind the analysis of other biological data types such as gene sequences. Here, we introduce the techniques of data mining to the study of 3D biological shapes to bring the analyses of phenomes closer to the efficiency of studying genomes. We compiled five training sets of highly variable morphologies of mammalian teeth from the MorphoBrowser database. Samples were labeled either by dietary class or by conventional dental types (e.g. carnassial, selenodont). We automatically extracted a multitude of topological attributes using Geographic Information Systems (GIS)-like procedures that were then used in several combinations of feature selection schemes and probabilistic classification models to build and optimize classifiers for predicting the labels of the training sets. In terms of classification accuracy, computational time and size of the feature sets used, non-repeated best-first search combined with 1-nearest neighbor classifier was the best approach. However, several other classification models combined with the same searching scheme proved practical. The current study represents a first step in the automatic analysis of 3D phenotypes, which will be increasingly valuable with the future increase in 3D morphology and phenomics databases. PMID:18320060
An Efficient Statistical Computation Technique for Health Care Big Data using R
NASA Astrophysics Data System (ADS)
Sushma Rani, N.; Srinivasa Rao, P., Dr; Parimala, P.
2017-08-01
Due to the changes in living conditions and other factors many critical health related problems are arising. The diagnosis of the problem at earlier stages will increase the chances of survival and fast recovery. This reduces the time of recovery and the cost associated for the treatment. One such medical related issue is cancer and breast cancer has been identified as the second leading cause of cancer death. If detected in the early stage it can be cured. Once a patient is detected with breast cancer tumor, it should be classified whether it is cancerous or non-cancerous. So the paper uses k-nearest neighbors(KNN) algorithm which is one of the simplest machine learning algorithms and is an instance-based learning algorithm to classify the data. Day-to -day new records are added which leds to increase in the data to be classified and this tends to be big data problem. The algorithm is implemented in R whichis the most popular platform applied to machine learning algorithms for statistical computing. Experimentation is conducted by using various classification evaluation metric onvarious values of k. The results show that the KNN algorithm out performes better than existing models.
Spatial Statistics for Tumor Cell Counting and Classification
NASA Astrophysics Data System (ADS)
Wirjadi, Oliver; Kim, Yoo-Jin; Breuel, Thomas
To count and classify cells in histological sections is a standard task in histology. One example is the grading of meningiomas, benign tumors of the meninges, which requires to assess the fraction of proliferating cells in an image. As this process is very time consuming when performed manually, automation is required. To address such problems, we propose a novel application of Markov point process methods in computer vision, leading to algorithms for computing the locations of circular objects in images. In contrast to previous algorithms using such spatial statistics methods in image analysis, the present one is fully trainable. This is achieved by combining point process methods with statistical classifiers. Using simulated data, the method proposed in this paper will be shown to be more accurate and more robust to noise than standard image processing methods. On the publicly available SIMCEP benchmark for cell image analysis algorithms, the cell count performance of the present paper is significantly more accurate than results published elsewhere, especially when cells form dense clusters. Furthermore, the proposed system performs as well as a state-of-the-art algorithm for the computer-aided histological grading of meningiomas when combined with a simple k-nearest neighbor classifier for identifying proliferating cells.
Knee X-ray image analysis method for automated detection of Osteoarthritis
Shamir, Lior; Ling, Shari M.; Scott, William W.; Bos, Angelo; Orlov, Nikita; Macura, Tomasz; Eckley, D. Mark; Ferrucci, Luigi; Goldberg, Ilya G.
2008-01-01
We describe a method for automated detection of radiographic Osteoarthritis (OA) in knee X-ray images. The detection is based on the Kellgren-Lawrence classification grades, which correspond to the different stages of OA severity. The classifier was built using manually classified X-rays, representing the first four KL grades (normal, doubtful, minimal and moderate). Image analysis is performed by first identifying a set of image content descriptors and image transforms that are informative for the detection of OA in the X-rays, and assigning weights to these image features using Fisher scores. Then, a simple weighted nearest neighbor rule is used in order to predict the KL grade to which a given test X-ray sample belongs. The dataset used in the experiment contained 350 X-ray images classified manually by their KL grades. Experimental results show that moderate OA (KL grade 3) and minimal OA (KL grade 2) can be differentiated from normal cases with accuracy of 91.5% and 80.4%, respectively. Doubtful OA (KL grade 1) was detected automatically with a much lower accuracy of 57%. The source code developed and used in this study is available for free download at www.openmicroscopy.org. PMID:19342330
2012-01-01
Background Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of biomedical sciences. Many such classifiers discovered thus far lack vigorous statistical and experimental validations. A combination of genetic algorithm/support vector machines and genetic algorithm/K nearest neighbors was used in this study to search for classifiers of endocrine-disrupting chemicals (EDCs) in zebrafish. Searches were conducted on both tissue-specific and tissue-combined datasets, either across the entire transcriptome or within individual transcription factor (TF) networks previously linked to EDC effects. Candidate classifiers were evaluated by gene set enrichment analysis (GSEA) on both the original training data and a dedicated validation dataset. Results Multi-tissue dataset yielded no classifiers. Among the 19 chemical-tissue conditions evaluated, the transcriptome-wide searches yielded classifiers for six of them, each having approximately 20 to 30 gene features unique to a condition. Searches within individual TF networks produced classifiers for 15 chemical-tissue conditions, each containing 100 or fewer top-ranked gene features pooled from those of multiple TF networks and also unique to each condition. For the training dataset, 10 out of 11 classifiers successfully identified the gene expression profiles (GEPs) of their targeted chemical-tissue conditions by GSEA. For the validation dataset, classifiers for prochloraz-ovary and flutamide-ovary also correctly identified the GEPs of corresponding conditions while no classifier could predict the GEP from prochloraz-brain. Conclusions The discrepancies in the performance of these classifiers were attributed in part to varying data complexity among the conditions, as measured to some degree by Fisher’s discriminant ratio statistic. This variation in data complexity could likely be compensated by adjusting sample size for individual chemical-tissue conditions, thus suggesting a need for a preliminary survey of transcriptomic responses before launching a full scale classifier discovery effort. Classifier discovery based on individual TF networks could yield more mechanistically-oriented biomarkers. GSEA proved to be a flexible and effective tool for application of gene classifiers but a similar and more refined algorithm, connectivity mapping, should also be explored. The distribution characteristics of classifiers across tissues, chemicals, and TF networks suggested a differential biological impact among the EDCs on zebrafish transcriptome involving some basic cellular functions. PMID:22849515
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, M; Craft, D
Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchicalmore » clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various ‘signatures’ for conventional and functional cancer classification. Results: PICS successfully regularized large dimension gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically-significant survival groups. For a suboptimally-debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and panthothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods. Conclusion: This work validates the PICS method to improve cancer classification using biological pathways. Patients are classified with greater specificity and physiological relevance as compared to current gene-specific approaches. Focus now moves to utilizing PICS for pan-cancer patient-specific treatment response prediction.« less
Intraoperative cyclorotation and pupil centroid shift during LASIK and PRK.
Narváez, Julio; Brucks, Matthew; Zimmerman, Grenith; Bekendam, Peter; Bacon, Gregory; Schmid, Kristin
2012-05-01
To determine the degree of cyclorotation and centroid shift in the x and y axis that occurs intraoperatively during LASIK and photorefractive keratectomy (PRK). Intraoperative cyclorotation and centroid shift were measured in 63 eyes from 34 patients with a mean age of 34 years (range: 20 to 56 years) undergoing either LASIK or PRK. Preoperatively, an iris image of each eye was obtained with the VISX WaveScan Wavefront System (Abbott Medical Optics Inc) with iris registration. A VISX Star S4 (Abbott Medical Optics Inc) laser was later used to measure cyclotorsion and pupil centroid shift at the beginning of the refractive procedure and after flap creation or epithelial removal. The mean change in intraoperative cyclorotation was 1.48±1.11° in LASIK eyes and 2.02±2.63° in PRK eyes. Cyclorotation direction changed by >2° in 21% of eyes after flap creation in LASIK and in 32% of eyes after epithelial removal in PRK. The respective mean intraoperative shift in the x axis and y axis was 0.13±0.15 mm and 0.17±0.14 mm, respectively, in LASIK eyes, and 0.09±0.07 mm and 0.10±0.13 mm, respectively, in PRK eyes. Intraoperative centroid shifts >100 μm in either the x axis or y axis occurred in 71% of LASIK eyes and 55% of PRK eyes. Significant changes in cyclotorsion and centroid shifts were noted prior to surgery as well as intraoperatively with both LASIK and PRK. It may be advantageous to engage iris registration immediately prior to ablation to provide a reference point representative of eye position at the initiation of laser delivery. Copyright 2012, SLACK Incorporated.
Vitola, Jaime; Pozo, Francesc; Tibaduiza, Diego A.; Anaya, Maribel
2017-01-01
Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated a great versatility and benefit since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally; (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed. PMID:28230796
Vitola, Jaime; Pozo, Francesc; Tibaduiza, Diego A; Anaya, Maribel
2017-02-21
Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated a great versatility and benefit since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally; (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed.
Acetobacter fabarum sp. nov., an acetic acid bacterium from a Ghanaian cocoa bean heap fermentation.
Cleenwerck, Ilse; Gonzalez, Angel; Camu, Nicholas; Engelbeen, Katrien; De Vos, Paul; De Vuyst, Luc
2008-09-01
Six acetic acid bacterial isolates, obtained during a study of the microbial diversity of spontaneous fermentations of Ghanaian cocoa beans, were subjected to a polyphasic taxonomic study. (GTG)(5)-PCR fingerprinting grouped the isolates together, but they could not be identified using this method. Phylogenetic analysis based on 16S rRNA gene sequences allocated the isolates to the genus Acetobacter and revealed Acetobacter lovaniensis, Acetobacter ghanensis and Acetobacter syzygii to be nearest neighbours. DNA-DNA hybridizations demonstrated that the isolates belonged to a single novel genospecies that could be differentiated from its phylogenetically nearest neighbours by the following phenotypic characteristics: no production of 2-keto-D-gluconic acid from D-glucose; growth on methanol and D-xylose, but not on maltose, as sole carbon sources; no growth on yeast extract with 30% D-glucose; and weak growth at 37 degrees C. The DNA G+C contents of four selected strains were 56.8-58.0 mol%. The results obtained prove that the isolates should be classified as representatives of a novel Acetobacter species, for which the name Acetobacter fabarum sp. nov. is proposed. The type strain is strain 985(T) (=R-36330(T) =LMG 24244(T) =DSM 19596(T)).
Evaluation of centroiding algorithm error for Nano-JASMINE
NASA Astrophysics Data System (ADS)
Hara, Takuji; Gouda, Naoteru; Yano, Taihei; Yamada, Yoshiyuki
2014-08-01
The Nano-JASMINE mission has been designed to perform absolute astrometric measurements with unprecedented accuracy; the end-of-mission parallax standard error is required to be of the order of 3 milli arc seconds for stars brighter than 7.5 mag in the zw-band(0.6μm-1.0μm) .These requirements set a stringent constraint on the accuracy of the estimation of the location of the stellar image on the CCD for each observation. However each stellar images have individual shape depend on the spectral energy distribution of the star, the CCD properties, and the optics and its associated wave front errors. So it is necessity that the centroiding algorithm performs a high accuracy in any observables. Referring to the study of Gaia, we use LSF fitting method for centroiding algorithm, and investigate systematic error of the algorithm for Nano-JASMINE. Furthermore, we found to improve the algorithm by restricting sample LSF when we use a Principle Component Analysis. We show that centroiding algorithm error decrease after adapted the method.
NASA Astrophysics Data System (ADS)
Asoka-Kumar, P.; Leung, T. C.; Lynn, K. G.; Nielsen, B.; Forcier, M. P.; Weinberg, Z. A.; Rubloff, G. W.
1992-06-01
The centroid shifts of positron annihilation spectra are reported from the depletion regions of metal-oxide-semiconductor (MOS) capacitors at room temperature and at 35 K. The centroid shift measurement can be explained using the variation of the electric field strength and depletion layer thickness as a function of the applied gate bias. An estimate for the relevant MOS quantities is obtained by fitting the centroid shift versus beam energy data with a steady-state diffusion-annihilation equation and a derivative-gaussian positron implantation profile. Inadequacy of the present analysis scheme is evident from the derived quantities and alternate methods are required for better predictions.
Multi-Band Received Signal Strength Fingerprinting Based Indoor Location System
NASA Astrophysics Data System (ADS)
Sertthin, Chinnapat; Fujii, Takeo; Ohtsuki, Tomoaki; Nakagawa, Masao
This paper proposes a new multi-band received signal strength (MRSS) fingerprinting based indoor location system, which employs the frequency diversity on the conventional single-band received signal strength (RSS) fingerprinting based indoor location system. In the proposed system, the impacts of frequency diversity on the enhancements of positioning accuracy are analyzed. Effectiveness of the proposed system is proved by experimental approach, which was conducted in non line-of-sight (NLOS) environment under the area of 103m2 at Yagami Campus, Keio University. WLAN access points, which simultaneously transmit dual-band signal of 2.4 and 5.2GHz, are utilized as transmitters. Likewise, a dual-band WLAN receiver is utilized as a receiver. Signal distances calculated by both Manhattan and Euclidean were classified by K-Nearest Neighbor (KNN) classifier to illustrate the performance of the proposed system. The results confirmed that Frequency diversity attributions of multi-band signal provide accuracy improvement over 50% of the conventional single-band.
Chemical data as markers of the geographical origins of sugarcane spirits.
Serafim, F A T; Pereira-Filho, Edenir R; Franco, D W
2016-04-01
In an attempt to classify sugarcane spirits according to their geographic region of origin, chemical data for 24 analytes were evaluated in 50 cachaças produced using a similar procedure in selected regions of Brazil: São Paulo - SP (15), Minas Gerais - MG (11), Rio de Janeiro - RJ (11), Paraiba -PB (9), and Ceará - CE (4). Multivariate analysis was applied to the analytical results, and the predictive abilities of different classification methods were evaluated. Principal component analysis identified five groups, and chemical similarities were observed between MG and SP samples and between RJ and PB samples. CE samples presented a distinct chemical profile. Among the samples, partial linear square discriminant analysis (PLS-DA) classified 50.2% of the samples correctly, K-nearest neighbor (KNN) 86%, and soft independent modeling of class analogy (SIMCA) 56.2%. Therefore, in this proof of concept demonstration, the proposed approach based on chemical data satisfactorily predicted the cachaças' geographic origins. Copyright © 2015 Elsevier Ltd. All rights reserved.
White matter lesion extension to automatic brain tissue segmentation on MRI.
de Boer, Renske; Vrooman, Henri A; van der Lijn, Fedde; Vernooij, Meike W; Ikram, M Arfan; van der Lugt, Aad; Breteler, Monique M B; Niessen, Wiro J
2009-05-01
A fully automated brain tissue segmentation method is optimized and extended with white matter lesion segmentation. Cerebrospinal fluid (CSF), gray matter (GM) and white matter (WM) are segmented by an atlas-based k-nearest neighbor classifier on multi-modal magnetic resonance imaging data. This classifier is trained by registering brain atlases to the subject. The resulting GM segmentation is used to automatically find a white matter lesion (WML) threshold in a fluid-attenuated inversion recovery scan. False positive lesions are removed by ensuring that the lesions are within the white matter. The method was visually validated on a set of 209 subjects. No segmentation errors were found in 98% of the brain tissue segmentations and 97% of the WML segmentations. A quantitative evaluation using manual segmentations was performed on a subset of 6 subjects for CSF, GM and WM segmentation and an additional 14 for the WML segmentations. The results indicated that the automatic segmentation accuracy is close to the interobserver variability of manual segmentations.
Method of Menu Selection by Gaze Movement Using AC EOG Signals
NASA Astrophysics Data System (ADS)
Kanoh, Shin'ichiro; Futami, Ryoko; Yoshinobu, Tatsuo; Hoshimiya, Nozomu
A method to detect the direction and the distance of voluntary eye gaze movement from EOG (electrooculogram) signals was proposed and tested. In this method, AC-amplified vertical and horizontal transient EOG signals were classified into 8-class directions and 2-class distances of voluntary eye gaze movements. A horizontal and a vertical EOGs during eye gaze movement at each sampling time were treated as a two-dimensional vector, and the center of gravity of the sample vectors whose norms were more than 80% of the maximum norm was used as a feature vector to be classified. By the classification using the k-nearest neighbor algorithm, it was shown that the averaged correct detection rates on each subject were 98.9%, 98.7%, 94.4%, respectively. This method can avoid strict EOG-based eye tracking which requires DC amplification of very small signal. It would be useful to develop robust human interfacing systems based on menu selection for severely paralyzed patients.
Silva Filho, Telmo M; Souza, Renata M C R; Prudêncio, Ricardo B C
2016-08-01
Some complex data types are capable of modeling data variability and imprecision. These data types are studied in the symbolic data analysis field. One such data type is interval data, which represents ranges of values and is more versatile than classic point data for many domains. This paper proposes a new prototype-based classifier for interval data, trained by a swarm optimization method. Our work has two main contributions: a swarm method which is capable of performing both automatic selection of features and pruning of unused prototypes and a generalized weighted squared Euclidean distance for interval data. By discarding unnecessary features and prototypes, the proposed algorithm deals with typical limitations of prototype-based methods, such as the problem of prototype initialization. The proposed distance is useful for learning classes in interval datasets with different shapes, sizes and structures. When compared to other prototype-based methods, the proposed method achieves lower error rates in both synthetic and real interval datasets. Copyright © 2016 Elsevier Ltd. All rights reserved.
Classification of older adults with/without a fall history using machine learning methods.
Lin Zhang; Ou Ma; Fabre, Jennifer M; Wood, Robert H; Garcia, Stephanie U; Ivey, Kayla M; McCann, Evan D
2015-01-01
Falling is a serious problem in an aged society such that assessment of the risk of falls for individuals is imperative for the research and practice of falls prevention. This paper introduces an application of several machine learning methods for training a classifier which is capable of classifying individual older adults into a high risk group and a low risk group (distinguished by whether or not the members of the group have a recent history of falls). Using a 3D motion capture system, significant gait features related to falls risk are extracted. By training these features, classification hypotheses are obtained based on machine learning techniques (K Nearest-neighbour, Naive Bayes, Logistic Regression, Neural Network, and Support Vector Machine). Training and test accuracies with sensitivity and specificity of each of these techniques are assessed. The feature adjustment and tuning of the machine learning algorithms are discussed. The outcome of the study will benefit the prediction and prevention of falls.
Lagrangian methods of cosmic web classification
NASA Astrophysics Data System (ADS)
Fisher, J. D.; Faltenbacher, A.; Johnson, M. S. T.
2016-05-01
The cosmic web defines the large-scale distribution of matter we see in the Universe today. Classifying the cosmic web into voids, sheets, filaments and nodes allows one to explore structure formation and the role environmental factors have on halo and galaxy properties. While existing studies of cosmic web classification concentrate on grid-based methods, this work explores a Lagrangian approach where the V-web algorithm proposed by Hoffman et al. is implemented with techniques borrowed from smoothed particle hydrodynamics. The Lagrangian approach allows one to classify individual objects (e.g. particles or haloes) based on properties of their nearest neighbours in an adaptive manner. It can be applied directly to a halo sample which dramatically reduces computational cost and potentially allows an application of this classification scheme to observed galaxy samples. Finally, the Lagrangian nature admits a straightforward inclusion of the Hubble flow negating the necessity of a visually defined threshold value which is commonly employed by grid-based classification methods.
NASA Astrophysics Data System (ADS)
He, Y.; He, Y.
2018-04-01
Urban shanty towns are communities that has contiguous old and dilapidated houses with more than 2000 square meters built-up area or more than 50 households. This study makes attempts to extract shanty towns in Nanning City using the product of Census and TripleSat satellite images. With 0.8-meter high-resolution remote sensing images, five texture characteristics (energy, contrast, maximum probability, and inverse difference moment) of shanty towns are trained and analyzed through GLCM. In this study, samples of shanty town are well classified with 98.2 % producer accuracy of unsupervised classification and 73.2 % supervised classification correctness. Low-rise and mid-rise residential blocks in Nanning City are classified into 4 different types by using k-means clustering and nearest neighbour classification respectively. This study initially establish texture feature descriptions of different types of residential areas, especially low-rise and mid-rise buildings, which would help city administrator evaluate residential blocks and reconstruction shanty towns.
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.
Automated robot-assisted surgical skill evaluation: Predictive analytics approach.
Fard, Mahtab J; Ameri, Sattar; Darin Ellis, R; Chinnam, Ratna B; Pandya, Abhilash K; Klein, Michael D
2018-02-01
Surgical skill assessment has predominantly been a subjective task. Recently, technological advances such as robot-assisted surgery have created great opportunities for objective surgical evaluation. In this paper, we introduce a predictive framework for objective skill assessment based on movement trajectory data. Our aim is to build a classification framework to automatically evaluate the performance of surgeons with different levels of expertise. Eight global movement features are extracted from movement trajectory data captured by a da Vinci robot for surgeons with two levels of expertise - novice and expert. Three classification methods - k-nearest neighbours, logistic regression and support vector machines - are applied. The result shows that the proposed framework can classify surgeons' expertise as novice or expert with an accuracy of 82.3% for knot tying and 89.9% for a suturing task. This study demonstrates and evaluates the ability of machine learning methods to automatically classify expert and novice surgeons using global movement features. Copyright © 2017 John Wiley & Sons, Ltd.
Chen, Ting; Rangarajan, Anand; Vemuri, Baba C.
2010-01-01
This paper presents a novel classification via aggregated regression algorithm – dubbed CAVIAR – and its application to the OASIS MRI brain image database. The CAVIAR algorithm simultaneously combines a set of weak learners based on the assumption that the weight combination for the final strong hypothesis in CAVIAR depends on both the weak learners and the training data. A regularization scheme using the nearest neighbor method is imposed in the testing stage to avoid overfitting. A closed form solution to the cost function is derived for this algorithm. We use a novel feature – the histogram of the deformation field between the MRI brain scan and the atlas which captures the structural changes in the scan with respect to the atlas brain – and this allows us to automatically discriminate between various classes within OASIS [1] using CAVIAR. We empirically show that CAVIAR significantly increases the performance of the weak classifiers by showcasing the performance of our technique on OASIS. PMID:21151847
Chen, Ting; Rangarajan, Anand; Vemuri, Baba C
2010-04-14
This paper presents a novel classification via aggregated regression algorithm - dubbed CAVIAR - and its application to the OASIS MRI brain image database. The CAVIAR algorithm simultaneously combines a set of weak learners based on the assumption that the weight combination for the final strong hypothesis in CAVIAR depends on both the weak learners and the training data. A regularization scheme using the nearest neighbor method is imposed in the testing stage to avoid overfitting. A closed form solution to the cost function is derived for this algorithm. We use a novel feature - the histogram of the deformation field between the MRI brain scan and the atlas which captures the structural changes in the scan with respect to the atlas brain - and this allows us to automatically discriminate between various classes within OASIS [1] using CAVIAR. We empirically show that CAVIAR significantly increases the performance of the weak classifiers by showcasing the performance of our technique on OASIS.
Classification of speech dysfluencies using LPC based parameterization techniques.
Hariharan, M; Chee, Lim Sin; Ai, Ooi Chia; Yaacob, Sazali
2012-06-01
The goal of this paper is to discuss and compare three feature extraction methods: Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Weighted Linear Prediction Cepstral Coefficients (WLPCC) for recognizing the stuttered events. Speech samples from the University College London Archive of Stuttered Speech (UCLASS) were used for our analysis. The stuttered events were identified through manual segmentation and were used for feature extraction. Two simple classifiers namely, k-nearest neighbour (kNN) and Linear Discriminant Analysis (LDA) were employed for speech dysfluencies classification. Conventional validation method was used for testing the reliability of the classifier results. The study on the effect of different frame length, percentage of overlapping, value of ã in a first order pre-emphasizer and different order p were discussed. The speech dysfluencies classification accuracy was found to be improved by applying statistical normalization before feature extraction. The experimental investigation elucidated LPC, LPCC and WLPCC features can be used for identifying the stuttered events and WLPCC features slightly outperforms LPCC features and LPC features.
Breast Cancer Detection with Reduced Feature Set.
Mert, Ahmet; Kılıç, Niyazi; Bilgili, Erdem; Akan, Aydin
2015-01-01
This paper explores feature reduction properties of independent component analysis (ICA) on breast cancer decision support system. Wisconsin diagnostic breast cancer (WDBC) dataset is reduced to one-dimensional feature vector computing an independent component (IC). The original data with 30 features and reduced one feature (IC) are used to evaluate diagnostic accuracy of the classifiers such as k-nearest neighbor (k-NN), artificial neural network (ANN), radial basis function neural network (RBFNN), and support vector machine (SVM). The comparison of the proposed classification using the IC with original feature set is also tested on different validation (5/10-fold cross-validations) and partitioning (20%-40%) methods. These classifiers are evaluated how to effectively categorize tumors as benign and malignant in terms of specificity, sensitivity, accuracy, F-score, Youden's index, discriminant power, and the receiver operating characteristic (ROC) curve with its criterion values including area under curve (AUC) and 95% confidential interval (CI). This represents an improvement in diagnostic decision support system, while reducing computational complexity.
Attallah, Omneya; Karthikesalingam, Alan; Holt, Peter Je; Thompson, Matthew M; Sayers, Rob; Bown, Matthew J; Choke, Eddie C; Ma, Xianghong
2017-11-01
Feature selection is essential in medical area; however, its process becomes complicated with the presence of censoring which is the unique character of survival analysis. Most survival feature selection methods are based on Cox's proportional hazard model, though machine learning classifiers are preferred. They are less employed in survival analysis due to censoring which prevents them from directly being used to survival data. Among the few work that employed machine learning classifiers, partial logistic artificial neural network with auto-relevance determination is a well-known method that deals with censoring and perform feature selection for survival data. However, it depends on data replication to handle censoring which leads to unbalanced and biased prediction results especially in highly censored data. Other methods cannot deal with high censoring. Therefore, in this article, a new hybrid feature selection method is proposed which presents a solution to high level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on survival metric to construct a multiple classifier system. The new hybrid feature selection process uses multiple classifier system as a wrapper method and merges it with iterated feature ranking filter method to further reduce features. Two endovascular aortic repair datasets containing 91% censored patients collected from two centers were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed the proposed technique outperformed individual classifiers and variable selection methods based on Cox's model such as Akaike and Bayesian information criterions and least absolute shrinkage and selector operator in p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention enabling doctor in selecting patients' future follow-up plan.
Bearak, Jonathan M; Burke, Kristen Lagasse; Jones, Rachel K
2017-11-01
Abortion can help women to control their fertility and is an important component of health care for women. Although women in the USA who live further from an abortion clinic are less likely to obtain an abortion than women who live closer to an abortion clinic, no national study has examined inequality in access to abortion and whether inequality has increased as the number of abortion clinics has declined. For this analysis, we obtained data on abortion clinics for 2000, 2011, and 2014 from the Guttmacher Institute's Abortion Provider Census. Block groups and the percentage of women aged 15-44 years by census tract were obtained from the US Census Bureau. Distance to the nearest clinic was calculated for the population-weighted centroid of every block group. We calculated the median distance to an abortion clinic for women in each county and the median and 80th percentile distances for each state by weighting block groups by the number of women of reproductive age (15-44 years). In 2014, women in the USA would have had to travel a median distance of 10·79 miles (17·36 km) to reach the nearest abortion clinic, although 20% of women would have had to travel 42·54 miles (68·46 km) or more. We found substantially greater variation within than between states because, even in mostly rural states, women and clinics were concentrated in urban areas. We identified spatial disparities in abortion access, which were broadly unchanged, at least as far back as 2000. We showed substantial and persistent spatial disparities in access to abortion in the USA. These results contribute to an emerging literature documenting similar disparities in other high-income countries. An anonymous grant to the Guttmacher Institute. Copyright © 2017 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY-NC-ND 4.0 license. Published by Elsevier Ltd.. All rights reserved.
ERIC Educational Resources Information Center
Ferrarello, Daniela; Mammana, Maria Flavia; Pennisi, Mario
2018-01-01
In this paper, we show some properties of centroids of geometric figures, such as triangles, quadrilaterals and tetrahedra. In particular, we will prove the properties by means of geometric transformations and by introducing extensions of triangles and quadrilaterals, i.e. by adding one, two or three new vertices to the figure. The study of these…
Centroid and Theoretical Rotation: Justification for Their Use in Q Methodology Research
ERIC Educational Resources Information Center
Ramlo, Sue
2016-01-01
This manuscript's purpose is to introduce Q as a methodology before providing clarification about the preferred factor analytical choices of centroid and theoretical (hand) rotation. Stephenson, the creator of Q, designated that only these choices allowed for scientific exploration of subjectivity while not violating assumptions associated with…
A fresh look at functional link neural network for motor imagery-based brain-computer interface.
Hettiarachchi, Imali T; Babaei, Toktam; Nguyen, Thanh; Lim, Chee P; Nahavandi, Saeid
2018-05-04
Artificial neural networks (ANNs) are one of the widely used classifiers in the brain-computer interface (BCI) systems-based on noninvasive electroencephalography (EEG) signals. Among the different ANN architectures, the most commonly applied for BCI classifiers is the multilayer perceptron (MLP). When appropriately designed with optimal number of neuron layers and number of neurons per layer, the ANN can act as a universal approximator. However, due to the low signal-to-noise ratio of EEG signal data, overtraining problem may become an inherent issue, causing these universal approximators to fail in real-time applications. In this study we introduce a higher order neural network, namely the functional link neural network (FLNN) as a classifier for motor imagery (MI)-based BCI systems, to remedy the drawbacks in MLP. We compare the proposed method with competing classifiers such as linear decomposition analysis, naïve Bayes, k-nearest neighbours, support vector machine and three MLP architectures. Two multi-class benchmark datasets from the BCI competitions are used. Common spatial pattern algorithm is utilized for feature extraction to build classification models. FLNN reports the highest average Kappa value over multiple subjects for both the BCI competition datasets, under similarly preprocessed data and extracted features. Further, statistical comparison results over multiple subjects show that the proposed FLNN classification method yields the best performance among the competing classifiers. Findings from this study imply that the proposed method, which has less computational complexity compared to the MLP, can be implemented effectively in practical MI-based BCI systems. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Wigdahl, J.; Agurto, C.; Murray, V.; Barriga, S.; Soliz, P.
2013-03-01
Diabetic retinopathy (DR) affects more than 4.4 million Americans age 40 and over. Automatic screening for DR has shown to be an efficient and cost-effective way to lower the burden on the healthcare system, by triaging diabetic patients and ensuring timely care for those presenting with DR. Several supervised algorithms have been developed to detect pathologies related to DR, but little work has been done in determining the size of the training set that optimizes an algorithm's performance. In this paper we analyze the effect of the training sample size on the performance of a top-down DR screening algorithm for different types of statistical classifiers. Results are based on partial least squares (PLS), support vector machines (SVM), k-nearest neighbor (kNN), and Naïve Bayes classifiers. Our dataset consisted of digital retinal images collected from a total of 745 cases (595 controls, 150 with DR). We varied the number of normal controls in the training set, while keeping the number of DR samples constant, and repeated the procedure 10 times using randomized training sets to avoid bias. Results show increasing performance in terms of area under the ROC curve (AUC) when the number of DR subjects in the training set increased, with similar trends for each of the classifiers. Of these, PLS and k-NN had the highest average AUC. Lower standard deviation and a flattening of the AUC curve gives evidence that there is a limit to the learning ability of the classifiers and an optimal number of cases to train on.
[Terahertz Spectroscopic Identification with Deep Belief Network].
Ma, Shuai; Shen, Tao; Wang, Rui-qi; Lai, Hua; Yu, Zheng-tao
2015-12-01
Feature extraction and classification are the key issues of terahertz spectroscopy identification. Because many materials have no apparent absorption peaks in the terahertz band, it is difficult to extract theirs terahertz spectroscopy feature and identify. To this end, a novel of identify terahertz spectroscopy approach with Deep Belief Network (DBN) was studied in this paper, which combines the advantages of DBN and K-Nearest Neighbors (KNN) classifier. Firstly, cubic spline interpolation and S-G filter were used to normalize the eight kinds of substances (ATP, Acetylcholine Bromide, Bifenthrin, Buprofezin, Carbazole, Bleomycin, Buckminster and Cylotriphosphazene) terahertz transmission spectra in the range of 0.9-6 THz. Secondly, the DBN model was built by two restricted Boltzmann machine (RBM) and then trained layer by layer using unsupervised approach. Instead of using handmade features, the DBN was employed to learn suitable features automatically with raw input data. Finally, a KNN classifier was applied to identify the terahertz spectrum. Experimental results show that using the feature learned by DBN can identify the terahertz spectrum of different substances with the recognition rate of over 90%, which demonstrates that the proposed method can automatically extract the effective features of terahertz spectrum. Furthermore, this KNN classifier was compared with others (BP neural network, SOM neural network and RBF neural network). Comparisons showed that the recognition rate of KNN classifier is better than the other three classifiers. Using the approach that automatic extract terahertz spectrum features by DBN can greatly reduce the workload of feature extraction. This proposed method shows a promising future in the application of identifying the mass terahertz spectroscopy.
Do unpaved, low-traffic roads affect bird communities?
NASA Astrophysics Data System (ADS)
Mammides, Christos; Kounnamas, Constantinos; Goodale, Eben; Kadis, Costas
2016-02-01
Unpaved, low traffic roads are often assumed to have minimal effects on biodiversity. To explore this assertion, we sampled the bird communities in fifteen randomly selected sites in Pafos Forest, Cyprus and used multiple regression to quantify the effects of such roads on the total species richness. Moreover, we classified birds according to their migratory status and their global population trends, and tested each category separately. Besides the total length of unpaved roads, we also tested: a. the site's habitat diversity, b. the coefficient of variation in habitat (patch) size, c. the distance to the nearest agricultural field, and d. the human population size of the nearest village. We measured our variables at six different distances from the bird point-count locations. We found a strong negative relationship between the total bird richness and the total length of unpaved roads. The human population size of the nearest village also had a negative effect. Habitat diversity was positively related to species richness. When the categories were tested, we found that the passage migrants were influenced more by the road network while resident breeders were influenced by habitat diversity. Species with increasing and stable populations were only marginally affected by the variables tested, but the effect of road networks on species with decreasing populations was large. We conclude that unpaved and sporadically used roads can have detrimental effects on the bird communities, especially on vulnerable species. We propose that actions are taken to limit the extent of road networks within protected areas, especially in sites designated for their rich avifauna, such as Pafos Forest, where several of the affected species are species of European and global importance.
The Surprising Absence of Absorption in the Far-ultraviolet Spectrum of Mrk 231
NASA Technical Reports Server (NTRS)
Veilleux, S.; Trippe, M.; Hamann, F.; Rupke, D. S. N.; Tripp, T. M.; Netzer, H.; Lutz, D.; Sembach, K. R.; Krug, H.; Teng, Stacy H.;
2013-01-01
Mrk 231, the nearest (z = 0.0422) quasar, hosts both a galactic-scale wind and a nuclear-scale iron low-ionization broad absorption line (FeLoBAL) outflow. We recently obtained a far-ultraviolet (FUV) spectrum of this object covering approx. 1150-1470A with the Cosmic Origins Spectrograph on board the Hubble Space Telescope. This spectrum is highly peculiar, highlighted by the presence of faint (< or approx.2% of predictions based on H(alpha)), broad (> or approx.10,000 km/s at the base), and highly blueshifted (centroid at approx. 3500 km/s) Ly(aplpha) emission. The FUV continuum emission is slightly declining at shorter wavelengths (consistent with F(sub lambda) Alpha Lambda(sup 1.7)) and does not show the presence of any obvious photospheric or wind stellar features. Surprisingly, the FUV spectrum also does not show any unambiguous broad absorption features. It thus appears to be dominated by the AGN, rather than hot stars, and virtually unfiltered by the dusty FeLoBAL screen. The observed Ly(alpha) emission is best explained if it is produced in the outflowing BAL cloud system, while the Balmer lines arise primarily from the standard broad emission line region seen through the dusty (Av approx. 7 mag) broad absorption line region. Two possible geometric models are discussed in the context of these new results.
Fusion-based multi-target tracking and localization for intelligent surveillance systems
NASA Astrophysics Data System (ADS)
Rababaah, Haroun; Shirkhodaie, Amir
2008-04-01
In this paper, we have presented two approaches addressing visual target tracking and localization in complex urban environment. The two techniques presented in this paper are: fusion-based multi-target visual tracking, and multi-target localization via camera calibration. For multi-target tracking, the data fusion concepts of hypothesis generation/evaluation/selection, target-to-target registration, and association are employed. An association matrix is implemented using RGB histograms for associated tracking of multi-targets of interests. Motion segmentation of targets of interest (TOI) from the background was achieved by a Gaussian Mixture Model. Foreground segmentation, on other hand, was achieved by the Connected Components Analysis (CCA) technique. The tracking of individual targets was estimated by fusing two sources of information, the centroid with the spatial gating, and the RGB histogram association matrix. The localization problem is addressed through an effective camera calibration technique using edge modeling for grid mapping (EMGM). A two-stage image pixel to world coordinates mapping technique is introduced that performs coarse and fine location estimation of moving TOIs. In coarse estimation, an approximate neighborhood of the target position is estimated based on nearest 4-neighbor method, and in fine estimation, we use Euclidean interpolation to localize the position within the estimated four neighbors. Both techniques were tested and shown reliable results for tracking and localization of Targets of interests in complex urban environment.
NASA Technical Reports Server (NTRS)
Antreasian, Peter G.
1988-01-01
Two orbit simulations, one representing the actual Geopotential Research Mission (GRM) orbit and the other representing the orbit estimated from orbit determination techniques, are presented. A computer algorithm was created to simulate GRM's drag compensation mechanism so the fuel expenditure and proof mass trajectories relative to the spacecraft centroid could be calculated for the mission. The results of the GRM DISCOS simulation demonstrated that the spacecraft can essentially be drag-free. The results showed that the centroid of the spacecraft can be controlled so that it will not deviate more than 1.0 mm in any direction from the centroid of the proof mass.
Dynamic imaging model and parameter optimization for a star tracker.
Yan, Jinyun; Jiang, Jie; Zhang, Guangjun
2016-03-21
Under dynamic conditions, star spots move across the image plane of a star tracker and form a smeared star image. This smearing effect increases errors in star position estimation and degrades attitude accuracy. First, an analytical energy distribution model of a smeared star spot is established based on a line segment spread function because the dynamic imaging process of a star tracker is equivalent to the static imaging process of linear light sources. The proposed model, which has a clear physical meaning, explicitly reflects the key parameters of the imaging process, including incident flux, exposure time, velocity of a star spot in an image plane, and Gaussian radius. Furthermore, an analytical expression of the centroiding error of the smeared star spot is derived using the proposed model. An accurate and comprehensive evaluation of centroiding accuracy is obtained based on the expression. Moreover, analytical solutions of the optimal parameters are derived to achieve the best performance in centroid estimation. Finally, we perform numerical simulations and a night sky experiment to validate the correctness of the dynamic imaging model, the centroiding error expression, and the optimal parameters.
(E)-2-[(2,4,6-Tri-meth-oxy-benzyl-idene)amino]-phenol.
Kaewmanee, Narissara; Chantrapromma, Suchada; Boonnak, Nawong; Quah, Ching Kheng; Fun, Hoong-Kun
2014-01-01
There are two independent mol-ecules in the asymmetric unit of the title compound, C16H17NO4, with similar conformations but some differences in their bond angles. Each mol-ecule adopts a trans configuration with respect to the methyl-idene C=N bond and is twisted with a dihedral angle between the two substituted benzene rings of 80.52 (7)° in one mol-ecule and 83.53 (7)° in the other. All meth-oxy groups are approximately coplanar with the attached benzene rings, with Cmeth-yl-O-C-C torsion angles ranging from -6.7 (2) to 5.07 (19)°. In the crystal, independent mol-ecules are linked together by O-H⋯N and O-H⋯O hydrogen bonds and a π-π inter-action [centroid-centroid distance of 3.6030 (9) Å], forming a dimer. The dimers are further linked by weak C-H⋯O inter-actions and another π-π inter-action [centroid-centroid distance of 3.9452 (9) Å] into layers lying parallel to the ab plane.
Two-dimensional shape recognition using oriented-polar representation
NASA Astrophysics Data System (ADS)
Hu, Neng-Chung; Yu, Kuo-Kan; Hsu, Yung-Li
1997-10-01
To deal with such a problem as object recognition of position, scale, and rotation invariance (PSRI), we utilize some PSRI properties of images obtained from objects, for example, the centroid of the image. The corresponding position of the centroid to the boundary of the image is invariant in spite of rotation, scale, and translation of the image. To obtain the information of the image, we use the technique similar to Radon transform, called the oriented-polar representation of a 2D image. In this representation, two specific points, the centroid and the weighted mean point, are selected to form an initial ray, then the image is sampled with N angularly equispaced rays departing from the initial rays. Each ray contains a number of intersections and the distance information obtained from the centroid to the intersections. The shape recognition algorithm is based on the least total error of these two items of information. Together with a simple noise removal and a typical backpropagation neural network, this algorithm is simple, but the PSRI is achieved with a high recognition rate.
NASA Astrophysics Data System (ADS)
Li, Xinji; Hui, Mei; Zhao, Zhu; Liu, Ming; Dong, Liquan; Kong, Lingqin; Zhao, Yuejin
2018-05-01
A differential computation method is presented to improve the precision of calibration for coaxial reverse Hartmann test (RHT). In the calibration, the accuracy of the distance measurement greatly influences the surface shape test, as demonstrated in the mathematical analyses. However, high-precision absolute distance measurement is difficult in the calibration. Thus, a differential computation method that only requires the relative distance was developed. In the proposed method, a liquid crystal display screen successively displayed two regular dot matrix patterns with different dot spacing. In a special case, images on the detector exhibited similar centroid distributions during the reflector translation. Thus, the critical value of the relative displacement distance and the centroid distributions of the dots on the detector were utilized to establish the relationship between the rays at certain angles and the detector coordinates. Experiments revealed the approximately linear behavior of the centroid variation with the relative displacement distance. With the differential computation method, we increased the precision of traditional calibration 10-5 rad root mean square. The precision of the RHT was increased by approximately 100 nm.
360-degrees profilometry using strip-light projection coupled to Fourier phase-demodulation.
Servin, Manuel; Padilla, Moises; Garnica, Guillermo
2016-01-11
360 degrees (360°) digitalization of three dimensional (3D) solids using a projected light-strip is a well-established technique in academic and commercial profilometers. These profilometers project a light-strip over the digitizing solid while the solid is rotated a full revolution or 360-degrees. Then, a computer program typically extracts the centroid of this light-strip, and by triangulation one obtains the shape of the solid. Here instead of using intensity-based light-strip centroid estimation, we propose to use Fourier phase-demodulation for 360° solid digitalization. The advantage of Fourier demodulation over strip-centroid estimation is that the accuracy of phase-demodulation linearly-increases with the fringe density, while in strip-light the centroid-estimation errors are independent. Here we proposed first to construct a carrier-frequency fringe-pattern by closely adding the individual light-strip images recorded while the solid is being rotated. Next, this high-density fringe-pattern is phase-demodulated using the standard Fourier technique. To test the feasibility of this Fourier demodulation approach, we have digitized two solids with increasing topographic complexity: a Rubik's cube and a plastic model of a human-skull. According to our results, phase demodulation based on the Fourier technique is less noisy than triangulation based on centroid light-strip estimation. Moreover, Fourier demodulation also provides the amplitude of the analytic signal which is a valuable information for the visualization of surface details.
Online myoelectric control of a dexterous hand prosthesis by transradial amputees.
Cipriani, Christian; Antfolk, Christian; Controzzi, Marco; Lundborg, Göran; Rosen, Birgitta; Carrozza, Maria Chiara; Sebelius, Fredrik
2011-06-01
A real-time pattern recognition algorithm based on k-nearest neighbors and lazy learning was used to classify, voluntary electromyography (EMG) signals and to simultaneously control movements of a dexterous artificial hand. EMG signals were superficially recorded by eight pairs of electrodes from the stumps of five transradial amputees and forearms of five able-bodied participants and used online to control a robot hand. Seven finger movements (not involving the wrist) were investigated in this study. The first objective was to understand whether and to which extent it is possible to control continuously and in real-time, the finger postures of a prosthetic hand, using superficial EMG, and a practical classifier, also taking advantage of the direct visual feedback of the moving hand. The second objective was to calculate statistical differences in the performance between participants and groups, thereby assessing the general applicability of the proposed method. The average accuracy of the classifier was 79% for amputees and 89% for able-bodied participants. Statistical analysis of the data revealed a difference in control accuracy based on the aetiology of amputation, type of prostheses regularly used and also between able-bodied participants and amputees. These results are encouraging for the development of noninvasive EMG interfaces for the control of dexterous prostheses.
Feature extraction and classification of clouds in high resolution panchromatic satellite imagery
NASA Astrophysics Data System (ADS)
Sharghi, Elan
The development of sophisticated remote sensing sensors is rapidly increasing, and the vast amount of satellite imagery collected is too much to be analyzed manually by a human image analyst. It has become necessary for a tool to be developed to automate the job of an image analyst. This tool would need to intelligently detect and classify objects of interest through computer vision algorithms. Existing software called the Rapid Image Exploitation Resource (RAPIER®) was designed by engineers at Space and Naval Warfare Systems Center Pacific (SSC PAC) to perform exactly this function. This software automatically searches for anomalies in the ocean and reports the detections as a possible ship object. However, if the image contains a high percentage of cloud coverage, a high number of false positives are triggered by the clouds. The focus of this thesis is to explore various feature extraction and classification methods to accurately distinguish clouds from ship objects. An examination of a texture analysis method, line detection using the Hough transform, and edge detection using wavelets are explored as possible feature extraction methods. The features are then supplied to a K-Nearest Neighbors (KNN) or Support Vector Machine (SVM) classifier. Parameter options for these classifiers are explored and the optimal parameters are determined.
Retinopathy of Prematurity-assist: Novel Software for Detecting Plus Disease
Pour, Elias Khalili; Pourreza, Hamidreza; Zamani, Kambiz Ameli; Mahmoudi, Alireza; Sadeghi, Arash Mir Mohammad; Shadravan, Mahla; Karkhaneh, Reza; Pour, Ramak Rouhi
2017-01-01
Purpose To design software with a novel algorithm, which analyzes the tortuosity and vascular dilatation in fundal images of retinopathy of prematurity (ROP) patients with an acceptable accuracy for detecting plus disease. Methods Eighty-seven well-focused fundal images taken with RetCam were classified to three groups of plus, non-plus, and pre-plus by agreement between three ROP experts. Automated algorithms in this study were designed based on two methods: the curvature measure and distance transform for assessment of tortuosity and vascular dilatation, respectively as two major parameters of plus disease detection. Results Thirty-eight plus, 12 pre-plus, and 37 non-plus images, which were classified by three experts, were tested by an automated algorithm and software evaluated the correct grouping of images in comparison to expert voting with three different classifiers, k-nearest neighbor, support vector machine and multilayer perceptron network. The plus, pre-plus, and non-plus images were analyzed with 72.3%, 83.7%, and 84.4% accuracy, respectively. Conclusions The new automated algorithm used in this pilot scheme for diagnosis and screening of patients with plus ROP has acceptable accuracy. With more improvements, it may become particularly useful, especially in centers without a skilled person in the ROP field. PMID:29022295
Aesthetic preference recognition of 3D shapes using EEG.
Chew, Lin Hou; Teo, Jason; Mountstephens, James
2016-04-01
Recognition and identification of aesthetic preference is indispensable in industrial design. Humans tend to pursue products with aesthetic values and make buying decisions based on their aesthetic preferences. The existence of neuromarketing is to understand consumer responses toward marketing stimuli by using imaging techniques and recognition of physiological parameters. Numerous studies have been done to understand the relationship between human, art and aesthetics. In this paper, we present a novel preference-based measurement of user aesthetics using electroencephalogram (EEG) signals for virtual 3D shapes with motion. The 3D shapes are designed to appear like bracelets, which is generated by using the Gielis superformula. EEG signals were collected by using a medical grade device, the B-Alert X10 from advance brain monitoring, with a sampling frequency of 256 Hz and resolution of 16 bits. The signals obtained when viewing 3D bracelet shapes were decomposed into alpha, beta, theta, gamma and delta rhythm by using time-frequency analysis, then classified into two classes, namely like and dislike by using support vector machines and K-nearest neighbors (KNN) classifiers respectively. Classification accuracy of up to 80 % was obtained by using KNN with the alpha, theta and delta rhythms as the features extracted from frontal channels, Fz, F3 and F4 to classify two classes, like and dislike.
A three-parameter model for classifying anurans into four genera based on advertisement calls.
Gingras, Bruno; Fitch, William Tecumseh
2013-01-01
The vocalizations of anurans are innate in structure and may therefore contain indicators of phylogenetic history. Thus, advertisement calls of species which are more closely related phylogenetically are predicted to be more similar than those of distant species. This hypothesis was evaluated by comparing several widely used machine-learning algorithms. Recordings of advertisement calls from 142 species belonging to four genera were analyzed. A logistic regression model, using mean values for dominant frequency, coefficient of variation of root-mean square energy, and spectral flux, correctly classified advertisement calls with regard to genus with an accuracy above 70%. Similar accuracy rates were obtained using these parameters with a support vector machine model, a K-nearest neighbor algorithm, and a multivariate Gaussian distribution classifier, whereas a Gaussian mixture model performed slightly worse. In contrast, models based on mel-frequency cepstral coefficients did not fare as well. Comparable accuracy levels were obtained on out-of-sample recordings from 52 of the 142 original species. The results suggest that a combination of low-level acoustic attributes is sufficient to discriminate efficiently between the vocalizations of these four genera, thus supporting the initial premise and validating the use of high-throughput algorithms on animal vocalizations to evaluate phylogenetic hypotheses.
Classification of sodium MRI data of cartilage using machine learning.
Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R
2015-11-01
To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interests in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each regions of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interests for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
The Performance of Short-Term Heart Rate Variability in the Detection of Congestive Heart Failure
Barros, Allan Kardec; Ohnishi, Noboru
2016-01-01
Congestive heart failure (CHF) is a cardiac disease associated with the decreasing capacity of the cardiac output. It has been shown that the CHF is the main cause of the cardiac death around the world. Some works proposed to discriminate CHF subjects from healthy subjects using either electrocardiogram (ECG) or heart rate variability (HRV) from long-term recordings. In this work, we propose an alternative framework to discriminate CHF from healthy subjects by using HRV short-term intervals based on 256 RR continuous samples. Our framework uses a matching pursuit algorithm based on Gabor functions. From the selected Gabor functions, we derived a set of features that are inputted into a hybrid framework which uses a genetic algorithm and k-nearest neighbour classifier to select a subset of features that has the best classification performance. The performance of the framework is analyzed using both Fantasia and CHF database from Physionet archives which are, respectively, composed of 40 healthy volunteers and 29 subjects. From a set of nonstandard 16 features, the proposed framework reaches an overall accuracy of 100% with five features. Our results suggest that the application of hybrid frameworks whose classifier algorithms are based on genetic algorithms has outperformed well-known classifier methods. PMID:27891509
Classification of CT examinations for COPD visual severity analysis
NASA Astrophysics Data System (ADS)
Tan, Jun; Zheng, Bin; Wang, Xingwei; Pu, Jiantao; Gur, David; Sciurba, Frank C.; Leader, J. Ken
2012-03-01
In this study we present a computational method of CT examination classification into visual assessed emphysema severity. The visual severity categories ranged from 0 to 5 and were rated by an experienced radiologist. The six categories were none, trace, mild, moderate, severe and very severe. Lung segmentation was performed for every input image and all image features are extracted from the segmented lung only. We adopted a two-level feature representation method for the classification. Five gray level distribution statistics, six gray level co-occurrence matrix (GLCM), and eleven gray level run-length (GLRL) features were computed for each CT image depicted segment lung. Then we used wavelets decomposition to obtain the low- and high-frequency components of the input image, and again extract from the lung region six GLCM features and eleven GLRL features. Therefore our feature vector length is 56. The CT examinations were classified using the support vector machine (SVM) and k-nearest neighbors (KNN) and the traditional threshold (density mask) approach. The SVM classifier had the highest classification performance of all the methods with an overall sensitivity of 54.4% and a 69.6% sensitivity to discriminate "no" and "trace visually assessed emphysema. We believe this work may lead to an automated, objective method to categorically classify emphysema severity on CT exam.
Tahir, Fahima; Fahiem, Muhammad Abuzar
2014-01-01
The quality of pharmaceutical products plays an important role in pharmaceutical industry as well as in our lives. Usage of defective tablets can be harmful for patients. In this research we proposed a nondestructive method to identify defective and nondefective tablets using their surface morphology. Three different environmental factors temperature, humidity and moisture are analyzed to evaluate the performance of the proposed method. Multiple textural features are extracted from the surface of the defective and nondefective tablets. These textural features are gray level cooccurrence matrix, run length matrix, histogram, autoregressive model and HAAR wavelet. Total textural features extracted from images are 281. We performed an analysis on all those 281, top 15, and top 2 features. Top 15 features are extracted using three different feature reduction techniques: chi-square, gain ratio and relief-F. In this research we have used three different classifiers: support vector machine, K-nearest neighbors and naïve Bayes to calculate the accuracies against proposed method using two experiments, that is, leave-one-out cross-validation technique and train test models. We tested each classifier against all selected features and then performed the comparison of their results. The experimental work resulted in that in most of the cases SVM performed better than the other two classifiers.
Construction accident narrative classification: An evaluation of text mining techniques.
Goh, Yang Miang; Ubeynarayana, C U
2017-11-01
Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM were also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with uni-gram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.
MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence.
Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng
2015-06-15
Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using 'learning to rank'. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. The software is available upon request. © The Author 2015. Published by Oxford University Press.
MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng
2015-01-01
Motivation: Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. Methods: We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using ‘learning to rank’. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. Results: MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. Availability and implementation: The software is available upon request. Contact: zhusf@fudan.edu.cn PMID:26072501
Diagnosis of multiple sclerosis from EEG signals using nonlinear methods.
Torabi, Ali; Daliri, Mohammad Reza; Sabzposhan, Seyyed Hojjat
2017-12-01
EEG signals have essential and important information about the brain and neural diseases. The main purpose of this study is classifying two groups of healthy volunteers and Multiple Sclerosis (MS) patients using nonlinear features of EEG signals while performing cognitive tasks. EEG signals were recorded when users were doing two different attentional tasks. One of the tasks was based on detecting a desired change in color luminance and the other task was based on detecting a desired change in direction of motion. EEG signals were analyzed in two ways: EEG signals analysis without rhythms decomposition and EEG sub-bands analysis. After recording and preprocessing, time delay embedding method was used for state space reconstruction; embedding parameters were determined for original signals and their sub-bands. Afterwards nonlinear methods were used in feature extraction phase. To reduce the feature dimension, scalar feature selections were done by using T-test and Bhattacharyya criteria. Then, the data were classified using linear support vector machines (SVM) and k-nearest neighbor (KNN) method. The best combination of the criteria and classifiers was determined for each task by comparing performances. For both tasks, the best results were achieved by using T-test criterion and SVM classifier. For the direction-based and the color-luminance-based tasks, maximum classification performances were 93.08 and 79.79% respectively which were reached by using optimal set of features. Our results show that the nonlinear dynamic features of EEG signals seem to be useful and effective in MS diseases diagnosis.
Anomaly detection in forward looking infrared imaging using one-class classifiers
NASA Astrophysics Data System (ADS)
Popescu, Mihail; Stone, Kevin; Havens, Timothy; Ho, Dominic; Keller, James
2010-04-01
In this paper we describe a method for generating cues of possible abnormal objects present in the field of view of an infrared (IR) camera installed on a moving vehicle. The proposed method has two steps. In the first step, for each frame, we generate a set of possible points of interest using a corner detection algorithm. In the second step, the points related to the background are discarded from the point set using an one class classifier (OCC) trained on features extracted from a local neighborhood of each point. The advantage of using an OCC is that we do not need examples from the "abnormal object" class to train the classifier. Instead, OCC is trained using corner points from images known to be abnormal object free, i.e., that contain only background scenes. To further reduce the number of false alarms we use a temporal fusion procedure: a region has to be detected as "interesting" in m out of n, m
Trainor, Patrick J; DeFilippis, Andrew P; Rai, Shesh N
2017-06-21
Statistical classification is a critical component of utilizing metabolomics data for examining the molecular determinants of phenotypes. Despite this, a comprehensive and rigorous evaluation of the accuracy of classification techniques for phenotype discrimination given metabolomics data has not been conducted. We conducted such an evaluation using both simulated and real metabolomics datasets, comparing Partial Least Squares-Discriminant Analysis (PLS-DA), Sparse PLS-DA, Random Forests, Support Vector Machines (SVM), Artificial Neural Network, k -Nearest Neighbors ( k -NN), and Naïve Bayes classification techniques for discrimination. We evaluated the techniques on simulated data generated to mimic global untargeted metabolomics data by incorporating realistic block-wise correlation and partial correlation structures for mimicking the correlations and metabolite clustering generated by biological processes. Over the simulation studies, covariance structures, means, and effect sizes were stochastically varied to provide consistent estimates of classifier performance over a wide range of possible scenarios. The effects of the presence of non-normal error distributions, the introduction of biological and technical outliers, unbalanced phenotype allocation, missing values due to abundances below a limit of detection, and the effect of prior-significance filtering (dimension reduction) were evaluated via simulation. In each simulation, classifier parameters, such as the number of hidden nodes in a Neural Network, were optimized by cross-validation to minimize the probability of detecting spurious results due to poorly tuned classifiers. Classifier performance was then evaluated using real metabolomics datasets of varying sample medium, sample size, and experimental design. We report that in the most realistic simulation studies that incorporated non-normal error distributions, unbalanced phenotype allocation, outliers, missing values, and dimension reduction, classifier performance (least to greatest error) was ranked as follows: SVM, Random Forest, Naïve Bayes, sPLS-DA, Neural Networks, PLS-DA and k -NN classifiers. When non-normal error distributions were introduced, the performance of PLS-DA and k -NN classifiers deteriorated further relative to the remaining techniques. Over the real datasets, a trend of better performance of SVM and Random Forest classifier performance was observed.
(2-{[2-(diphenyl-phosphino)phen-yl]thio}-phen-yl)diphenyl-phosphine sulfide.
Alvarez-Larena, Angel; Martinez-Cuevas, Francisco J; Flor, Teresa; Real, Juli
2012-11-01
In the title compound, C(36)H(28)P(2)S(2), the dihedral angle between the central benzene rings is 66.95 (13)°. In the crystal, molecules are linked via C(ar)-H⋯π and π-π inter-actions [shortest centroid-centroid distance between benzene rings = 3.897 (2) Å].
Method of wavefront tilt correction for optical heterodyne detection systems under strong turbulence
NASA Astrophysics Data System (ADS)
Xiang, Jing-song; Tian, Xin; Pan, Le-chun
2014-07-01
Atmospheric turbulence decreases the heterodyne mixing efficiency of the optical heterodyne detection systems. Wavefront tilt correction is often used to improve the optical heterodyne mixing efficiency. But the performance of traditional centroid tracking tilt correction is poor under strong turbulence conditions. In this paper, a tilt correction method which tracking the peak value of laser spot on focal plane is proposed. Simulation results show that, under strong turbulence conditions, the performance of peak value tracking tilt correction is distinctly better than that of traditional centroid tracking tilt correction method, and the phenomenon of large antenna's performance inferior to small antenna's performance which may be occurred in centroid tracking tilt correction method can also be avoid in peak value tracking tilt correction method.
Kireeva, Natalia V; Ovchinnikova, Svetlana I; Kuznetsov, Sergey L; Kazennov, Andrey M; Tsivadze, Aslan Yu
2014-02-01
This study concerns large margin nearest neighbors classifier and its multi-metric extension as the efficient approaches for metric learning which aimed to learn an appropriate distance/similarity function for considered case studies. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve the performance in classification, clustering and retrieval tasks. The paper describes application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in drug discovery process, in silico assessment of chemical liabilities is an important step aimed to reduce costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, a distance-based metric learning procedures have been applied for in silico assessment of chemical liabilities, the impact of metric learning on structure-activity landscapes and predictive performance of developed models has been analyzed, the learned metric was used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest neighbors relations and descriptor space.
NASA Astrophysics Data System (ADS)
Kireeva, Natalia V.; Ovchinnikova, Svetlana I.; Kuznetsov, Sergey L.; Kazennov, Andrey M.; Tsivadze, Aslan Yu.
2014-02-01
This study concerns large margin nearest neighbors classifier and its multi-metric extension as the efficient approaches for metric learning which aimed to learn an appropriate distance/similarity function for considered case studies. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve the performance in classification, clustering and retrieval tasks. The paper describes application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in drug discovery process, in silico assessment of chemical liabilities is an important step aimed to reduce costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, a distance-based metric learning procedures have been applied for in silico assessment of chemical liabilities, the impact of metric learning on structure-activity landscapes and predictive performance of developed models has been analyzed, the learned metric was used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest neighbors relations and descriptor space.
Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun
2016-04-01
This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR) classifier, k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB) as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. The six substances considered, although have similar optical properties, their complex insertion loss at the THz part of the spectrum is significantly different because of differences in both their frequency dependent THz extinction coefficient as well as differences in their refractive index and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms, these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. Firstly the measurements related to samples that had thicknesses of 2mm were classified, then samples at thicknesses of 4mm, and after that 3mm were classified and the success rate and consistency of each classifier was recorded. In addition, mixtures having thicknesses of 2 and 4mm as well as mixtures of 2, 3 and 4mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies. The work establishes a general methodology for assessing the performance of other hyperspectral dataset classifiers on the basis of 2-D cross-correlations in far-infrared spectroscopy or other parts of the electromagnetic spectrum. It also advances the wider proliferation of automated THz imaging systems across new application areas e.g., biomedical imaging, industrial processing and quality control where interpretation of hyperspectral images is still under development. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Pollyea, R.; Mohammadi, N.; Taylor, J. E.
2017-12-01
The annual earthquake rate in Oklahoma increased dramatically between 2009 and 2016, owing in large part to the rapid proliferation of salt water disposal wells associated with unconventional oil and gas recovery. This study presents a geospatial analysis of earthquake occurrence and SWD injection volume within a 68,420 km2 area in north-central Oklahoma between 2011 and 2016. The spatial co-variability of earthquake occurrence and SWD injection volume is analyzed for each year of the study by calculating the geographic centroid for both earthquake epicenter and volume-weighted well location. In addition, the spatial cross correlation between earthquake occurrence and SWD volume is quantified by calculating the cross semivariogram annually for a 9.6 km × 9.6 km (6 mi × 6 mi) grid over the study area. Results from these analyses suggest that the relationship between volume-weighted well centroids and earthquake centroids generally follow pressure diffusion space-time scaling, and the volume-weighted well centroid predicts the geographic earthquake centroid within a 1σ radius of gyration. The cross semivariogram calculations show that SWD injection volume and earthquake occurrence are spatially cross correlated between 2014 and 2016. These results also show that the strength of cross correlation decreased from 2015 to 2016; however, the cross correlation length scale remains unchanged at 125 km. This suggests that earthquake mitigation efforts have been moderately successful in decreasing the strength of cross correlation between SWD volume and earthquake occurrence near-field, but the far-field contribution of SWD injection volume to earthquake occurrence remains unaffected.
NASA Astrophysics Data System (ADS)
Yuan, Ying; Li, Jicun; Li, Xin-Zheng; Wang, Feng
2018-05-01
The development of effective centroid potentials (ECPs) is explored with both the constrained-centroid and quasi-adiabatic force matching using liquid water as a test system. A trajectory integrated with the ECP is free of statistical noises that would be introduced when the centroid potential is approximated on the fly with a finite number of beads. With the reduced cost of ECP, challenging experimental properties can be studied in the spirit of centroid molecular dynamics. The experimental number density of H2O is 0.38% higher than that of D2O. With the ECP, the H2O number density is predicted to be 0.42% higher, when the dispersion term is not refit. After correction of finite size effects, the diffusion constant of H2O is found to be 21% higher than that of D2O, which is in good agreement with the 29.9% higher diffusivity for H2O observed experimentally. Although the ECP is also able to capture the redshifts of both the OH and OD stretching modes in liquid water, there are a number of properties that a classical simulation with the ECP will not be able to recover. For example, the heat capacities of H2O and D2O are predicted to be almost identical and higher than the experimental values. Such a failure is simply a result of not properly treating quantized vibrational energy levels when the trajectory is propagated with classical mechanics. Several limitations of the ECP based approach without bead population reconstruction are discussed.
Classification of EEG Signals Based on Pattern Recognition Approach.
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
False alarm reduction by the And-ing of multiple multivariate Gaussian classifiers
NASA Astrophysics Data System (ADS)
Dobeck, Gerald J.; Cobb, J. Tory
2003-09-01
The high-resolution sonar is one of the principal sensors used by the Navy to detect and classify sea mines in minehunting operations. For such sonar systems, substantial effort has been devoted to the development of automated detection and classification (D/C) algorithms. These have been spurred by several factors including (1) aids for operators to reduce work overload, (2) more optimal use of all available data, and (3) the introduction of unmanned minehunting systems. The environments where sea mines are typically laid (harbor areas, shipping lanes, and the littorals) give rise to many false alarms caused by natural, biologic, and man-made clutter. The objective of the automated D/C algorithms is to eliminate most of these false alarms while still maintaining a very high probability of mine detection and classification (PdPc). In recent years, the benefits of fusing the outputs of multiple D/C algorithms have been studied. We refer to this as Algorithm Fusion. The results have been remarkable, including reliable robustness to new environments. This paper describes a method for training several multivariate Gaussian classifiers such that their And-ing dramatically reduces false alarms while maintaining a high probability of classification. This training approach is referred to as the Focused- Training method. This work extends our 2001-2002 work where the Focused-Training method was used with three other types of classifiers: the Attractor-based K-Nearest Neighbor Neural Network (a type of radial-basis, probabilistic neural network), the Optimal Discrimination Filter Classifier (based linear discrimination theory), and the Quadratic Penalty Function Support Vector Machine (QPFSVM). Although our experience has been gained in the area of sea mine detection and classification, the principles described herein are general and can be applied to a wide range of pattern recognition and automatic target recognition (ATR) problems.
Optimal Doppler centroid estimation for SAR data from a quasi-homogeneous source
NASA Technical Reports Server (NTRS)
Jin, M. Y.
1986-01-01
This correspondence briefly describes two Doppler centroid estimation (DCE) algorithms, provides a performance summary for these algorithms, and presents the experimental results. These algorithms include that of Li et al. (1985) and a newly developed one that is optimized for quasi-homogeneous sources. The performance enhancement achieved by the optimal DCE algorithm is clearly demonstrated by the experimental results.
Temporal variations in the position of the heliospheric equator
NASA Astrophysics Data System (ADS)
Obridko, V. N.; Shelting, B. D.
2008-08-01
It is shown that the centroid of the heliospheric equator undergoes quasi-periodic oscillations. During the minimum of the 11-year cycle, the centroid shifts southwards (the so-called bashful-ballerina effect). The direction of the shift reverses during the solar maximum. The solar quadrupole is responsible for this effect. The shift is compared with the tilt of the heliospheric current sheet.
Tan, Li Kuo; Liew, Yih Miin; Lim, Einly; Abdul Aziz, Yang Faridah; Chee, Kok Han; McLaughlin, Robert A
2018-06-01
In this paper, we develop and validate an open source, fully automatic algorithm to localize the left ventricular (LV) blood pool centroid in short axis cardiac cine MR images, enabling follow-on automated LV segmentation algorithms. The algorithm comprises four steps: (i) quantify motion to determine an initial region of interest surrounding the heart, (ii) identify potential 2D objects of interest using an intensity-based segmentation, (iii) assess contraction/expansion, circularity, and proximity to lung tissue to score all objects of interest in terms of their likelihood of constituting part of the LV, and (iv) aggregate the objects into connected groups and construct the final LV blood pool volume and centroid. This algorithm was tested against 1140 datasets from the Kaggle Second Annual Data Science Bowl, as well as 45 datasets from the STACOM 2009 Cardiac MR Left Ventricle Segmentation Challenge. Correct LV localization was confirmed in 97.3% of the datasets. The mean absolute error between the gold standard and localization centroids was 2.8 to 4.7 mm, or 12 to 22% of the average endocardial radius. Graphical abstract Fully automated localization of the left ventricular blood pool in short axis cardiac cine MR images.
Event Centroiding Applied to Energy-Resolved Neutron Imaging at LANSCE
Borges, Nicholas; Losko, Adrian; Vogel, Sven
2018-02-13
The energy-dependence of the neutron cross section provides vastly different contrast mechanisms than polychromatic neutron radiography if neutron energies can be selected for imaging applications. In recent years, energy-resolved neutron imaging (ERNI) with epi-thermal neutrons, utilizing neutron absorption resonances for contrast as well as for quantitative density measurements, was pioneered at the Flight Path 5 beam line at LANSCE and continues to be refined. In this work, we present event centroiding, i.e., the determination of the center-of-gravity of a detection event on an imaging detector to allow sub-pixel spatial resolution and apply it to the many frames collected for energy-resolvedmore » neutron imaging at a pulsed neutron source. While event centroiding was demonstrated at thermal neutron sources, it has not been applied to energy-resolved neutron imaging, where the energy resolution requires to be preserved, and we present a quantification of the possible resolution as a function of neutron energy. For the 55 μm pixel size of the detector used for this study, we found a resolution improvement from ~80 μm to ~22 μm using pixel centroiding while fully preserving the energy resolution.« less
Event Centroiding Applied to Energy-Resolved Neutron Imaging at LANSCE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Borges, Nicholas; Losko, Adrian; Vogel, Sven
The energy-dependence of the neutron cross section provides vastly different contrast mechanisms than polychromatic neutron radiography if neutron energies can be selected for imaging applications. In recent years, energy-resolved neutron imaging (ERNI) with epi-thermal neutrons, utilizing neutron absorption resonances for contrast as well as for quantitative density measurements, was pioneered at the Flight Path 5 beam line at LANSCE and continues to be refined. In this work, we present event centroiding, i.e., the determination of the center-of-gravity of a detection event on an imaging detector to allow sub-pixel spatial resolution and apply it to the many frames collected for energy-resolvedmore » neutron imaging at a pulsed neutron source. While event centroiding was demonstrated at thermal neutron sources, it has not been applied to energy-resolved neutron imaging, where the energy resolution requires to be preserved, and we present a quantification of the possible resolution as a function of neutron energy. For the 55 μm pixel size of the detector used for this study, we found a resolution improvement from ~80 μm to ~22 μm using pixel centroiding while fully preserving the energy resolution.« less
Automatic detection and quantitative analysis of cells in the mouse primary motor cortex
NASA Astrophysics Data System (ADS)
Meng, Yunlong; He, Yong; Wu, Jingpeng; Chen, Shangbin; Li, Anan; Gong, Hui
2014-09-01
Neuronal cells play very important role on metabolism regulation and mechanism control, so cell number is a fundamental determinant of brain function. Combined suitable cell-labeling approaches with recently proposed three-dimensional optical imaging techniques, whole mouse brain coronal sections can be acquired with 1-μm voxel resolution. We have developed a completely automatic pipeline to perform cell centroids detection, and provided three-dimensional quantitative information of cells in the primary motor cortex of C57BL/6 mouse. It involves four principal steps: i) preprocessing; ii) image binarization; iii) cell centroids extraction and contour segmentation; iv) laminar density estimation. Investigations on the presented method reveal promising detection accuracy in terms of recall and precision, with average recall rate 92.1% and average precision rate 86.2%. We also analyze laminar density distribution of cells from pial surface to corpus callosum from the output vectorizations of detected cell centroids in mouse primary motor cortex, and find significant cellular density distribution variations in different layers. This automatic cell centroids detection approach will be beneficial for fast cell-counting and accurate density estimation, as time-consuming and error-prone manual identification is avoided.
Fusing Image Data for Calculating Position of an Object
NASA Technical Reports Server (NTRS)
Huntsberger, Terrance; Cheng, Yang; Liebersbach, Robert; Trebi-Ollenu, Ashitey
2007-01-01
A computer program has been written for use in maintaining the calibration, with respect to the positions of imaged objects, of a stereoscopic pair of cameras on each of the Mars Explorer Rovers Spirit and Opportunity. The program identifies and locates a known object in the images. The object in question is part of a Moessbauer spectrometer located at the tip of a robot arm, the kinematics of which are known. In the program, the images are processed through a module that extracts edges, combines the edges into line segments, and then derives ellipse centroids from the line segments. The images are also processed by a feature-extraction algorithm that performs a wavelet analysis, then performs a pattern-recognition operation in the wavelet-coefficient space to determine matches to a texture feature measure derived from the horizontal, vertical, and diagonal coefficients. The centroids from the ellipse finder and the wavelet feature matcher are then fused to determine co-location. In the event that a match is found, the centroid (or centroids if multiple matches are present) is reported. If no match is found, the process reports the results of the analyses for further examination by human experts.
Wang, Jian-Gang; Sung, Eric; Yau, Wei-Yun
2011-07-01
Facial age classification is an approach to classify face images into one of several predefined age groups. One of the difficulties in applying learning techniques to the age classification problem is the large amount of labeled training data required. Acquiring such training data is very costly in terms of age progress, privacy, human time, and effort. Although unlabeled face images can be obtained easily, it would be expensive to manually label them on a large scale and getting the ground truth. The frugal selection of the unlabeled data for labeling to quickly reach high classification performance with minimal labeling efforts is a challenging problem. In this paper, we present an active learning approach based on an online incremental bilateral two-dimension linear discriminant analysis (IB2DLDA) which initially learns from a small pool of labeled data and then iteratively selects the most informative samples from the unlabeled set to increasingly improve the classifier. Specifically, we propose a novel data selection criterion called the furthest nearest-neighbor (FNN) that generalizes the margin-based uncertainty to the multiclass case and which is easy to compute, so that the proposed active learning algorithm can handle a large number of classes and large data sizes efficiently. Empirical experiments on FG-NET and Morph databases together with a large unlabeled data set for age categorization problems show that the proposed approach can achieve results comparable or even outperform a conventionally trained active classifier that requires much more labeling effort. Our IB2DLDA-FNN algorithm can achieve similar results much faster than random selection and with fewer samples for age categorization. It also can achieve comparable results with active SVM but is much faster than active SVM in terms of training because kernel methods are not needed. The results on the face recognition database and palmprint/palm vein database showed that our approach can handle problems with large number of classes. Our contributions in this paper are twofold. First, we proposed the IB2DLDA-FNN, the FNN being our novel idea, as a generic on-line or active learning paradigm. Second, we showed that it can be another viable tool for active learning of facial age range classification.
NASA Astrophysics Data System (ADS)
Li, Xuxu; Li, Xinyang; wang, Caixia
2018-03-01
This paper proposes an efficient approach to decrease the computational costs of correlation-based centroiding methods used for point source Shack-Hartmann wavefront sensors. Four typical similarity functions have been compared, i.e. the absolute difference function (ADF), ADF square (ADF2), square difference function (SDF), and cross-correlation function (CCF) using the Gaussian spot model. By combining them with fast search algorithms, such as three-step search (TSS), two-dimensional logarithmic search (TDL), cross search (CS), and orthogonal search (OS), computational costs can be reduced drastically without affecting the accuracy of centroid detection. Specifically, OS reduces calculation consumption by 90%. A comprehensive simulation indicates that CCF exhibits a better performance than other functions under various light-level conditions. Besides, the effectiveness of fast search algorithms has been verified.
Research of centroiding algorithms for extended and elongated spot of sodium laser guide star
NASA Astrophysics Data System (ADS)
Shao, Yayun; Zhang, Yudong; Wei, Kai
2016-10-01
Laser guide stars (LGSs) increase the sky coverage of astronomical adaptive optics systems. But spot array obtained by Shack-Hartmann wave front sensors (WFSs) turns extended and elongated, due to the thickness and size limitation of sodium LGS, which affects the accuracy of the wave front reconstruction algorithm. In this paper, we compared three different centroiding algorithms , the Center-of-Gravity (CoG), weighted CoG (WCoG) and Intensity Weighted Centroid (IWC), as well as those accuracies for various extended and elongated spots. In addition, we compared the reconstructed image data from those three algorithms with theoretical results, and proved that WCoG and IWC are the best wave front reconstruction algorithms for extended and elongated spot among all the algorithms.
Trajectory data privacy protection based on differential privacy mechanism
NASA Astrophysics Data System (ADS)
Gu, Ke; Yang, Lihao; Liu, Yongzhi; Liao, Niandong
2018-05-01
In this paper, we propose a trajectory data privacy protection scheme based on differential privacy mechanism. In the proposed scheme, the algorithm first selects the protected points from the user’s trajectory data; secondly, the algorithm forms the polygon according to the protected points and the adjacent and high frequent accessed points that are selected from the accessing point database, then the algorithm calculates the polygon centroids; finally, the noises are added to the polygon centroids by the differential privacy method, and the polygon centroids replace the protected points, and then the algorithm constructs and issues the new trajectory data. The experiments show that the running time of the proposed algorithms is fast, the privacy protection of the scheme is effective and the data usability of the scheme is higher.
3-Phenyl-6-(2-pyrid-yl)-1,2,4,5-tetra-zine.
Chartrand, Daniel; Laverdière, François; Hanan, Garry
2007-12-06
The title compound, C(13)H(9)N(5), is the first asymmetric diaryl-1,2,4,5-tetra-zine to be crystallographically characterized. We have been inter-ested in this motif for incorporation into supra-molecular assemblies based on coordination chemistry. The solid state structure shows a centrosymmetric mol-ecule, forcing a positional disorder of the terminal phenyl and pyridyl rings. The mol-ecule is completely planar, unusual for aromatic rings with N atoms in adjacent ortho positions. The stacking observed is very common in diaryl-tetra-zines and is dominated by π stacking [centroid-to-centroid distance between the tetrazine ring and the aromatic ring of an adjacent molecule is 3.6 Å, perpendicular (centroid-to-plane) distance of about 3.3 Å].
Vrooman, Henri A; Cocosco, Chris A; van der Lijn, Fedde; Stokking, Rik; Ikram, M Arfan; Vernooij, Meike W; Breteler, Monique M B; Niessen, Wiro J
2007-08-01
Conventional k-Nearest-Neighbor (kNN) classification, which has been successfully applied to classify brain tissue in MR data, requires training on manually labeled subjects. This manual labeling is a laborious and time-consuming procedure. In this work, a new fully automated brain tissue classification procedure is presented, in which kNN training is automated. This is achieved by non-rigidly registering the MR data with a tissue probability atlas to automatically select training samples, followed by a post-processing step to keep the most reliable samples. The accuracy of the new method was compared to rigid registration-based training and to conventional kNN-based segmentation using training on manually labeled subjects for segmenting gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in 12 data sets. Furthermore, for all classification methods, the performance was assessed when varying the free parameters. Finally, the robustness of the fully automated procedure was evaluated on 59 subjects. The automated training method using non-rigid registration with a tissue probability atlas was significantly more accurate than rigid registration. For both automated training using non-rigid registration and for the manually trained kNN classifier, the difference with the manual labeling by observers was not significantly larger than inter-observer variability for all tissue types. From the robustness study, it was clear that, given an appropriate brain atlas and optimal parameters, our new fully automated, non-rigid registration-based method gives accurate and robust segmentation results. A similarity index was used for comparison with manually trained kNN. The similarity indices were 0.93, 0.92 and 0.92, for CSF, GM and WM, respectively. It can be concluded that our fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.
Applying data fusion techniques for benthic habitat mapping and monitoring in a coral reef ecosystem
NASA Astrophysics Data System (ADS)
Zhang, Caiyun
2015-06-01
Accurate mapping and effective monitoring of benthic habitat in the Florida Keys are critical in developing management strategies for this valuable coral reef ecosystem. For this study, a framework was designed for automated benthic habitat mapping by combining multiple data sources (hyperspectral, aerial photography, and bathymetry data) and four contemporary imagery processing techniques (data fusion, Object-based Image Analysis (OBIA), machine learning, and ensemble analysis). In the framework, 1-m digital aerial photograph was first merged with 17-m hyperspectral imagery and 10-m bathymetry data using a pixel/feature-level fusion strategy. The fused dataset was then preclassified by three machine learning algorithms (Random Forest, Support Vector Machines, and k-Nearest Neighbor). Final object-based habitat maps were produced through ensemble analysis of outcomes from three classifiers. The framework was tested for classifying a group-level (3-class) and code-level (9-class) habitats in a portion of the Florida Keys. Informative and accurate habitat maps were achieved with an overall accuracy of 88.5% and 83.5% for the group-level and code-level classifications, respectively.
NASA Astrophysics Data System (ADS)
Sokolov, Anton; Gengembre, Cyril; Dmitriev, Egor; Delbarre, Hervé
2017-04-01
The problem is considered of classification of local atmospheric meteorological events in the coastal area such as sea breezes, fogs and storms. The in-situ meteorological data as wind speed and direction, temperature, humidity and turbulence are used as predictors. Local atmospheric events of 2013-2014 were analysed manually to train classification algorithms in the coastal area of English Channel in Dunkirk (France). Then, ultrasonic anemometer data and LIDAR wind profiler data were used as predictors. A few algorithms were applied to determine meteorological events by local data such as a decision tree, the nearest neighbour classifier, a support vector machine. The comparison of classification algorithms was carried out, the most important predictors for each event type were determined. It was shown that in more than 80 percent of the cases machine learning algorithms detect the meteorological class correctly. We expect that this methodology could be applied also to classify events by climatological in-situ data or by modelling data. It allows estimating frequencies of each event in perspective of climate change.
Ensemble LUT classification for degraded document enhancement
NASA Astrophysics Data System (ADS)
Obafemi-Ajayi, Tayo; Agam, Gady; Frieder, Ophir
2008-01-01
The fast evolution of scanning and computing technologies have led to the creation of large collections of scanned paper documents. Examples of such collections include historical collections, legal depositories, medical archives, and business archives. Moreover, in many situations such as legal litigation and security investigations scanned collections are being used to facilitate systematic exploration of the data. It is almost always the case that scanned documents suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large, global degradation models do not perform well. In contrast, we propose to estimate local degradation models and use them in enhancing degraded document images. Using a semi-automated enhancement system we have labeled a subset of the Frieder diaries collection.1 This labeled subset was then used to train an ensemble classifier. The component classifiers are based on lookup tables (LUT) in conjunction with the approximated nearest neighbor algorithm. The resulting algorithm is highly effcient. Experimental evaluation results are provided using the Frieder diaries collection.1
Nagarajan, R; Hariharan, M; Satiyan, M
2012-08-01
Developing tools to assist physically disabled and immobilized people through facial expression is a challenging area of research and has attracted many researchers recently. In this paper, luminance stickers based facial expression recognition is proposed. Recognition of facial expression is carried out by employing Discrete Wavelet Transform (DWT) as a feature extraction method. Different wavelet families with their different orders (db1 to db20, Coif1 to Coif 5 and Sym2 to Sym8) are utilized to investigate their performance in recognizing facial expression and to evaluate their computational time. Standard deviation is computed for the coefficients of first level of wavelet decomposition for every order of wavelet family. This standard deviation is used to form a set of feature vectors for classification. In this study, conventional validation and cross validation are performed to evaluate the efficiency of the suggested feature vectors. Three different classifiers namely Artificial Neural Network (ANN), k-Nearest Neighborhood (kNN) and Linear Discriminant Analysis (LDA) are used to classify a set of eight facial expressions. The experimental results demonstrate that the proposed method gives very promising classification accuracies.
Gunavathi, Chellamuthu; Premalatha, Kandasamy
2014-01-01
Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL.
Kharbach, Mourad; Kamal, Rabie; Mansouri, Mohammed Alaoui; Marmouzi, Ilias; Viaene, Johan; Cherrah, Yahia; Alaoui, Katim; Vercammen, Joeri; Bouklouze, Abdelaziz; Vander Heyden, Yvan
2018-10-15
This study investigated the effectiveness of SIFT-MS versus chemical profiling, both coupled to multivariate data analysis, to classify 95 Extra Virgin Argan Oils (EVAO), originating from five Moroccan Argan forest locations. The full scan option of SIFT-MS, is suitable to indicate the geographic origin of EVAO based on the fingerprints obtained using the three chemical ionization precursors (H 3 O + , NO + and O 2 + ). The chemical profiling (including acidity, peroxide value, spectrophotometric indices, fatty acids, tocopherols- and sterols composition) was also used for classification. Partial least squares discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), K-nearest neighbors (KNN), and support vector machines (SVM), were compared. The SIFT-MS data were therefore fed to variable-selection methods to find potential biomarkers for classification. The classification models based either on chemical profiling or SIFT-MS data were able to classify the samples with high accuracy. SIFT-MS was found to be advantageous for rapid geographic classification. Copyright © 2018 Elsevier Ltd. All rights reserved.
Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin
2015-08-01
Improving accuracy of supervised classification algorithms in biomedical applications is one of active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). Repeated five-fold cross-validation method was used to justify the performance of classifiers. Experimental results show that our proposed method not only improve the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.
Link prediction in multiplex online social networks
NASA Astrophysics Data System (ADS)
Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž
2017-02-01
Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.
Link prediction in multiplex online social networks.
Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž
2017-02-01
Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.
Classification of X-ray sources in the direction of M31
NASA Astrophysics Data System (ADS)
Vasilopoulos, G.; Hatzidimitriou, D.; Pietsch, W.
2012-01-01
M31 is our nearest spiral galaxy, at a distance of 780 kpc. Identification of X-ray sources in nearby galaxies is important for interpreting the properties of more distant ones, mainly because we can classify nearby sources using both X-ray and optical data, while more distant ones via X-rays alone. The XMM-Newton Large Project for M31 has produced an abundant sample of about 1900 X-ray sources in the direction of M31. Most of them remain elusive, giving us little signs of their origin. Our goal is to classify these sources using criteria based on properties of already identified ones. In particular we construct candidate lists of high mass X-ray binaries, low mass X-ray binaries, X-ray binaries correlated with globular clusters and AGN based on their X-ray emission and the properties of their optical counterparts, if any. Our main methodology consists of identifying particular loci of X-ray sources on X-ray hardness ratio diagrams and the color magnitude diagrams of their optical counterparts. Finally, we examined the X-ray luminosity function of the X-ray binaries populations.
NASA Astrophysics Data System (ADS)
Karsi, Redouane; Zaim, Mounia; El Alami, Jamila
2017-07-01
Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. Today, it becomes clearer that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” is born to address the problem of automatically determining the polarity (Positive, negative, neutral,…) of textual opinions. People expressing themselves in a particular domain often use specific domain language expressions, thus, building a classifier, which performs well in different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain for sentiment classification when using machine learning techniques. In our study three popular machine learning techniques: Support Vector Machines (SVM), Naive Bayes and K nearest neighbors(KNN) were applied on datasets collected from different domains. Experimental results show that Support Vector Machines outperforms other classifiers in all domains, since it achieved at least 74.75% accuracy with a standard deviation of 4,08.
Shermeyer, Jacob S.; Haack, Barry N.
2015-01-01
Two forestry-change detection methods are described, compared, and contrasted for estimating deforestation and growth in threatened forests in southern Peru from 2000 to 2010. The methods used in this study rely on freely available data, including atmospherically corrected Landsat 5 Thematic Mapper and Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation continuous fields (VCF). The two methods include a conventional supervised signature extraction method and a unique self-calibrating method called MODIS VCF guided forest/nonforest (FNF) masking. The process chain for each of these methods includes a threshold classification of MODIS VCF, training data or signature extraction, signature evaluation, k-nearest neighbor classification, analyst-guided reclassification, and postclassification image differencing to generate forest change maps. Comparisons of all methods were based on an accuracy assessment using 500 validation pixels. Results of this accuracy assessment indicate that FNF masking had a 5% higher overall accuracy and was superior to conventional supervised classification when estimating forest change. Both methods succeeded in classifying persistently forested and nonforested areas, and both had limitations when classifying forest change.
Automated Slicing for a Multi-Axis Metal Deposition System (Preprint)
2006-09-01
experimented with different materials like H13 tool steel to build the part. Following the same slicing and scanning toolpath result, there is a geometric...and analysis tool -centroidal axis. Similar to medial axis, it contains geometry and topological information but is significantly computationally...geometry reasoning and analysis tool -centroidal axis. Similar to medial axis, it contains geometry and topological information but is significantly
Video image position determination
Christensen, Wynn; Anderson, Forrest L.; Kortegaard, Birchard L.
1991-01-01
An optical beam position controller in which a video camera captures an image of the beam in its video frames, and conveys those images to a processing board which calculates the centroid coordinates for the image. The image coordinates are used by motor controllers and stepper motors to position the beam in a predetermined alignment. In one embodiment, system noise, used in conjunction with Bernoulli trials, yields higher resolution centroid coordinates.
NASA Astrophysics Data System (ADS)
Ferrarello, Daniela; Mammana, Maria Flavia; Pennisi, Mario
2018-05-01
In this paper, we show some properties of centroids of geometric figures, such as triangles, quadrilaterals and tetrahedra. In particular, we will prove the properties by means of geometric transformations and by introducing extensions of triangles and quadrilaterals, i.e. by adding one, two or three new vertices to the figure. The study of these properties can be used, with profit, in a classroom activity supported by a dynamic geometry system.
Davila, Juan Carlos; Cretu, Ana-Maria; Zaremba, Marek
2017-06-07
The design of multiple human activity recognition applications in areas such as healthcare, sports and safety relies on wearable sensor technologies. However, when making decisions based on the data acquired by such sensors in practical situations, several factors related to sensor data alignment, data losses, and noise, among other experimental constraints, deteriorate data quality and model accuracy. To tackle these issues, this paper presents a data-driven iterative learning framework to classify human locomotion activities such as walk, stand, lie, and sit, extracted from the Opportunity dataset. Data acquired by twelve 3-axial acceleration sensors and seven inertial measurement units are initially de-noised using a two-stage consecutive filtering approach combining a band-pass Finite Impulse Response (FIR) and a wavelet filter. A series of statistical parameters are extracted from the kinematical features, including the principal components and singular value decomposition of roll, pitch, yaw and the norm of the axial components. The novel interactive learning procedure is then applied in order to minimize the number of samples required to classify human locomotion activities. Only those samples that are most distant from the centroids of data clusters, according to a measure presented in the paper, are selected as candidates for the training dataset. The newly built dataset is then used to train an SVM multi-class classifier. The latter will produce the lowest prediction error. The proposed learning framework ensures a high level of robustness to variations in the quality of input data, while only using a much lower number of training samples and therefore a much shorter training time, which is an important consideration given the large size of the dataset.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Denney, K. D.; Peterson, B. M.; Horne, Keith
We use the coadded spectra of 32 epochs of Sloan Digital Sky Survey (SDSS) Reverberation Mapping Project observations of 482 quasars with z > 1.46 to highlight systematic biases in the SDSS- and Baryon Oscillation Spectroscopic Survey (BOSS)-pipeline redshifts due to the natural diversity of quasar properties. We investigate the characteristics of this bias by comparing the BOSS-pipeline redshifts to an estimate from the centroid of He ii λ 1640. He ii has a low equivalent width but is often well-defined in high-S/N spectra, does not suffer from self-absorption, and has a narrow component which, when present (the case for aboutmore » half of our sources), produces a redshift estimate that, on average, is consistent with that determined from [O ii] to within the He ii and [O ii] centroid measurement uncertainties. The large redshift differences of ∼1000 km s{sup −1}, on average, between the BOSS-pipeline and He ii-centroid redshifts, suggest there are significant biases in a portion of BOSS quasar redshift measurements. Adopting the He ii-based redshifts shows that C iv does not exhibit a ubiquitous blueshift for all quasars, given the precision probed by our measurements. Instead, we find a distribution of C iv-centroid blueshifts across our sample, with a dynamic range that (i) is wider than that previously reported for this line, and (ii) spans C iv centroids from those consistent with the systemic redshift to those with significant blueshifts of thousands of kilometers per second. These results have significant implications for measurement and use of high-redshift quasar properties and redshifts, and studies based thereon.« less
NASA Astrophysics Data System (ADS)
Celenk, Mehmet; Song, Yinglei; Ma, Limin; Zhou, Min
2003-05-01
A new algorithm that can be used to automatically recognize and classify malignant lymphomas and lukemia is proposed in this paper. The algorithm utilizes the morphological watershed to extract boundaries of cells from their grey-level images. It generates a sequence of Euclidean distances by selecting pixels in clockwise direction on the boundary of the cell and calculating the Euclidean distances of the selected pixels from the centroid of the cell. A feature vector associated with each cell is then obtained by applying the auto-regressive moving-average (ARMA) model to the generated sequence of Euclidean distances. The clustering measure J3=trace{inverse(Sw-1)Sm} involving the within (Sw) and mixed (Sm) class-scattering matrices is computed for both cell classes to provide an insight into the extent to which different cell classes in the training data are separated. Our test results suggest that the algorithm is highly accurate for the development of an interactive, computer-assisted diagnosis (CAD) tool.
NASA Astrophysics Data System (ADS)
Pratiher, Sawon; Patra, Sayantani; Pratiher, Souvik
2017-06-01
A novel analytical methodology for segregating healthy and neurological disorders from gait patterns is proposed by employing a set of oscillating components called intrinsic mode functions (IMF's). These IMF's are generated by the Empirical Mode Decomposition of the gait time series and the Hilbert transformed analytic signal representation forms the complex plane trace of the elliptical shaped analytic IMFs. The area measure and the relative change in the centroid position of the polygon formed by the Convex Hull of these analytic IMF's are taken as the discriminative features. Classification accuracy of 79.31% with Ensemble learning based Adaboost classifier validates the adequacy of the proposed methodology for a computer aided diagnostic (CAD) system for gait pattern identification. Also, the efficacy of several potential biomarkers like Bandwidth of Amplitude Modulation and Frequency Modulation IMF's and it's Mean Frequency from the Fourier-Bessel expansion from each of these analytic IMF's has been discussed for its potency in diagnosis of gait pattern identification and classification.
Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G
2014-09-30
This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.
Spotting stellar activity cycles in Gaia astrometry
NASA Astrophysics Data System (ADS)
Morris, Brett M.; Agol, Eric; Davenport, James R. A.; Hawley, Suzanne L.
2018-06-01
Astrometry from Gaia will measure the positions of stellar photometric centroids to unprecedented precision. We show that the precision of Gaia astrometry is sufficient to detect starspot-induced centroid jitter for nearby stars in the Tycho-Gaia Astrometric Solution (TGAS) sample with magnetic activity similar to the young G-star KIC 7174505 or the active M4 dwarf GJ 1243, but is insufficient to measure centroid jitter for stars with Sun-like spot distributions. We simulate Gaia observations of stars with 10 year activity cycles to search for evidence of activity cycles, and find that Gaia astrometry alone likely cannot detect activity cycles for stars in the TGAS sample, even if they have spot distributions like KIC 7174505. We review the activity of the nearby low-mass stars in the TGAS sample for which we anticipate significant detections of spot-induced jitter.
3-Phenyl-6-(2-pyridyl)-1,2,4,5-tetrazine
Chartrand, Daniel; Laverdière, François; Hanan, Garry
2008-01-01
The title compound, C13H9N5, is the first asymmetric diaryl-1,2,4,5-tetrazine to be crystallographically characterized. We have been interested in this motif for incorporation into supramolecular assemblies based on coordination chemistry. The solid state structure shows a centrosymmetric molecule, forcing a positional disorder of the terminal phenyl and pyridyl rings. The molecule is completely planar, unusual for aromatic rings with N atoms in adjacent ortho positions. The stacking observed is very common in diaryltetrazines and is dominated by π stacking [centroid-to-centroid distance between the tetrazine ring and the aromatic ring of an adjacent molecule is 3.6 Å, perpendicular (centroid-to-plane) distance of about 3.3 Å]. PMID:21200916
Douglas, David R [Newport News, VA; Benson, Stephen V [Yorktown, VA
2007-01-23
A method of energy recovery for RF-base linear charged particle accelerators that allows energy recovery without large relative momentum spread of the particle beam involving first accelerating a waveform particle beam having a crest and a centroid with an injection energy E.sub.o with the centroid of the particle beam at a phase offset f.sub.o from the crest of the accelerating waveform to an energy E.sub.full and then recovering the beam energy centroid a phase f.sub.o+Df relative to the crest of the waveform particle beam such that (E.sub.full-E.sub.o)(1+cos(f.sub.o+Df))>dE/2 wherein dE=the full energy spread, dE/2=the full energy half spread and Df=the wave form phase distance.
[A graph cuts-based interactive method for segmentation of magnetic resonance images of meningioma].
Li, Shuan-qiang; Feng, Qian-jin; Chen, Wu-fan; Lin, Ya-zhong
2011-06-01
For accurate segmentation of the magnetic resonance (MR) images of meningioma, we propose a novel interactive segmentation method based on graph cuts. The high dimensional image features was extracted, and for each pixel, the probabilities of its origin, either the tumor or the background regions, were estimated by exploiting the weighted K-nearest neighborhood classifier. Based on these probabilities, a new energy function was proposed. Finally, a graph cut optimal framework was used for the solution of the energy function. The proposed method was evaluated by application in the segmentation of MR images of meningioma, and the results showed that the method significantly improved the segmentation accuracy compared with the gray level information-based graph cut method.
Hamiltonian identifiability assisted by single-probe measurement
NASA Astrophysics Data System (ADS)
Sone, Akira; Cappellaro, Paola; Quantum Engineering Group Team
2017-04-01
We study the Hamiltonian identifiability of a many-body spin- 1 / 2 system assisted by the measurement on a single quantum probe based on the eigensystem realization algorithm (ERA) approach employed in. We demonstrate a potential application of Gröbner basis to the identifiability test of the Hamiltonian, and provide the necessary experimental resources, such as the lower bound in the number of the required sampling points, the upper bound in total required evolution time, and thus the total measurement time. Focusing on the examples of the identifiability in the spin chain model with nearest-neighbor interaction, we classify the spin-chain Hamiltonian based on its identifiability, and provide the control protocols to engineer the non-identifiable Hamiltonian to be an identifiable Hamiltonian.
Harmonic wavelet packet transform for on-line system health diagnosis
NASA Astrophysics Data System (ADS)
Yan, Ruqiang; Gao, Robert X.
2004-07-01
This paper presents a new approach to on-line health diagnosis of mechanical systems, based on the wavelet packet transform. Specifically, signals acquired from vibration sensors are decomposed into sub-bands by means of the discrete harmonic wavelet packet transform (DHWPT). Based on the Fisher linear discriminant criterion, features in the selected sub-bands are then used as inputs to three classifiers (Nearest Neighbor rule-based and two Neural Network-based), for system health condition assessment. Experimental results have confirmed that, comparing to the conventional approach where statistical parameters from raw signals are used, the presented approach enabled higher signal-to-noise ratio for more effective and intelligent use of the sensory information, thus leading to more accurate system health diagnosis.
NASA Astrophysics Data System (ADS)
Li, Manchun; Ma, Lei; Blaschke, Thomas; Cheng, Liang; Tiede, Dirk
2016-07-01
Geographic Object-Based Image Analysis (GEOBIA) is becoming more prevalent in remote sensing classification, especially for high-resolution imagery. Many supervised classification approaches are applied to objects rather than pixels, and several studies have been conducted to evaluate the performance of such supervised classification techniques in GEOBIA. However, these studies did not systematically investigate all relevant factors affecting the classification (segmentation scale, training set size, feature selection and mixed objects). In this study, statistical methods and visual inspection were used to compare these factors systematically in two agricultural case studies in China. The results indicate that Random Forest (RF) and Support Vector Machines (SVM) are highly suitable for GEOBIA classifications in agricultural areas and confirm the expected general tendency, namely that the overall accuracies decline with increasing segmentation scale. All other investigated methods except for RF and SVM are more prone to obtain a lower accuracy due to the broken objects at fine scales. In contrast to some previous studies, the RF classifiers yielded the best results and the k-nearest neighbor classifier were the worst results, in most cases. Likewise, the RF and Decision Tree classifiers are the most robust with or without feature selection. The results of training sample analyses indicated that the RF and adaboost. M1 possess a superior generalization capability, except when dealing with small training sample sizes. Furthermore, the classification accuracies were directly related to the homogeneity/heterogeneity of the segmented objects for all classifiers. Finally, it was suggested that RF should be considered in most cases for agricultural mapping.
Driving behavior recognition using EEG data from a simulated car-following experiment.
Yang, Liu; Ma, Rui; Zhang, H Michael; Guan, Wei; Jiang, Shixiong
2018-07-01
Driving behavior recognition is the foundation of driver assistance systems, with potential applications in automated driving systems. Most prevailing studies have used subjective questionnaire data and objective driving data to classify driving behaviors, while few studies have used physiological signals such as electroencephalography (EEG) to gather data. To bridge this gap, this paper proposes a two-layer learning method for driving behavior recognition using EEG data. A simulated car-following driving experiment was designed and conducted to simultaneously collect data on the driving behaviors and EEG data of drivers. The proposed learning method consists of two layers. In Layer I, two-dimensional driving behavior features representing driving style and stability were selected and extracted from raw driving behavior data using K-means and support vector machine recursive feature elimination. Five groups of driving behaviors were classified based on these two-dimensional driving behavior features. In Layer II, the classification results from Layer I were utilized as inputs to generate a k-Nearest-Neighbor classifier identifying driving behavior groups using EEG data. Using independent component analysis, a fast Fourier transformation, and linear discriminant analysis sequentially, the raw EEG signals were processed to extract two core EEG features. Classifier performance was enhanced using the adaptive synthetic sampling approach. A leave-one-subject-out cross validation was conducted. The results showed that the average classification accuracy for all tested traffic states was 69.5% and the highest accuracy reached 83.5%, suggesting a significant correlation between EEG patterns and car-following behavior. Copyright © 2017 Elsevier Ltd. All rights reserved.
Multi-feature classifiers for burst detection in single EEG channels from preterm infants
NASA Astrophysics Data System (ADS)
Navarro, X.; Porée, F.; Kuchenbuch, M.; Chavez, M.; Beuchée, Alain; Carrault, G.
2017-08-01
Objective. The study of electroencephalographic (EEG) bursts in preterm infants provides valuable information about maturation or prognostication after perinatal asphyxia. Over the last two decades, a number of works proposed algorithms to automatically detect EEG bursts in preterm infants, but they were designed for populations under 35 weeks of post menstrual age (PMA). However, as the brain activity evolves rapidly during postnatal life, these solutions might be under-performing with increasing PMA. In this work we focused on preterm infants reaching term ages (PMA ⩾36 weeks) using multi-feature classification on a single EEG channel. Approach. Five EEG burst detectors relying on different machine learning approaches were compared: logistic regression (LR), linear discriminant analysis (LDA), k-nearest neighbors (kNN), support vector machines (SVM) and thresholding (Th). Classifiers were trained by visually labeled EEG recordings from 14 very preterm infants (born after 28 weeks of gestation) with 36-41 weeks PMA. Main results. The most performing classifiers reached about 95% accuracy (kNN, SVM and LR) whereas Th obtained 84%. Compared to human-automatic agreements, LR provided the highest scores (Cohen’s kappa = 0.71) using only three EEG features. Applying this classifier in an unlabeled database of 21 infants ⩾36 weeks PMA, we found that long EEG bursts and short inter-burst periods are characteristic of infants with the highest PMA and weights. Significance. In view of these results, LR-based burst detection could be a suitable tool to study maturation in monitoring or portable devices using a single EEG channel.
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.
Estimating local scaling properties for the classification of interstitial lung disease patterns
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel
2011-03-01
Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
NASA Astrophysics Data System (ADS)
Alfano, R.; Soetemans, D.; Bauman, G. S.; Gibson, E.; Gaed, M.; Moussa, M.; Gomez, J. A.; Chin, J. L.; Pautler, S.; Ward, A. D.
2018-02-01
Multi-parametric MRI (mp-MRI) is becoming a standard in contemporary prostate cancer screening and diagnosis, and has shown to aid physicians in cancer detection. It offers many advantages over traditional systematic biopsy, which has shown to have very high clinical false-negative rates of up to 23% at all stages of the disease. However beneficial, mp-MRI is relatively complex to interpret and suffers from inter-observer variability in lesion localization and grading. Computer-aided diagnosis (CAD) systems have been developed as a solution as they have the power to perform deterministic quantitative image analysis. We measured the accuracy of such a system validated using accurately co-registered whole-mount digitized histology. We trained a logistic linear classifier (LOGLC), support vector machine (SVC), k-nearest neighbour (KNN) and random forest classifier (RFC) in a four part ROI based experiment against: 1) cancer vs. non-cancer, 2) high-grade (Gleason score ≥4+3) vs. low-grade cancer (Gleason score <4+3), 3) high-grade vs. other tissue components and 4) high-grade vs. benign tissue by selecting the classifier with the highest AUC using 1-10 features from forward feature selection. The CAD model was able to classify malignant vs. benign tissue and detect high-grade cancer with high accuracy. Once fully validated, this work will form the basis for a tool that enhances the radiologist's ability to detect malignancies, potentially improving biopsy guidance, treatment selection, and focal therapy for prostate cancer patients, maximizing the potential for cure and increasing quality of life.
Xiao, Xuan; Wang, Pu; Lin, Wei-Zhong; Jia, Jian-Hua; Chou, Kuo-Chen
2013-05-15
Antimicrobial peptides (AMPs), also called host defense peptides, are an evolutionarily conserved component of the innate immune response and are found among all classes of life. According to their special functions, AMPs are generally classified into ten categories: Antibacterial Peptides, Anticancer/tumor Peptides, Antifungal Peptides, Anti-HIV Peptides, Antiviral Peptides, Antiparasital Peptides, Anti-protist Peptides, AMPs with Chemotactic Activity, Insecticidal Peptides, and Spermicidal Peptides. Given a query peptide, how can we identify whether it is an AMP or non-AMP? If it is, can we identify which functional type or types it belong to? Particularly, how can we deal with the multi-type problem since an AMP may belong to two or more functional types? To address these problems, which are obviously very important to both basic research and drug development, a multi-label classifier was developed based on the pseudo amino acid composition (PseAAC) and fuzzy K-nearest neighbor (FKNN) algorithm, where the components of PseAAC were featured by incorporating five physicochemical properties. The novel classifier is called iAMP-2L, where "2L" means that it is a 2-level predictor. The 1st-level is to answer the 1st question above, while the 2nd-level is to answer the 2nd and 3rd questions that are beyond the reach of any existing methods in this area. For the conveniences of users, a user-friendly web-server for iAMP-2L was established at http://www.jci-bioinfo.cn/iAMP-2L. Copyright © 2013 Elsevier Inc. All rights reserved.
Parsec's astrometry direct approaches .
NASA Astrophysics Data System (ADS)
Andrei, A. H.
Parallaxes - and hence the fundamental establishment of stellar distances - rank among the oldest, keyest, and hardest of astronomical determinations. Arguably amongst the most essential too. The direct approach to obtain trigonometric parallaxes, using a constrained set of equations to derive positions, proper motions, and parallaxes, has been labeled as risky. Properly so, because the axis of the parallactic apparent ellipse is smaller than one arcsec even for the nearest stars, and just a fraction of its perimeter can be followed. Thus the classical approach is of linearizing the description by locking the solution to a set of precise positions of the Earth at the instants of observation, rather than to the dynamics of its orbit, and of adopting a close examination of the never many points available. In the PARSEC program the parallaxes of 143 brown dwarfs were aimed at. Five years of observation of the fields were taken with the WIFI camera at the ESO 2.2m telescope, in Chile. The goal is to provide a statistically significant number of trigonometric parallaxes to BD sub-classes from L0 to T7. Taking advantage of the large, regularly spaced, quantity of observations, here we take the risky approach to fit an ellipse in ecliptical observed coordinates and derive the parallaxes. We also combine the solutions from different centroiding methods, widely proven in prior astrometric investigations. As each of those methods assess diverse properties of the PSFs, they are taken as independent measurements, and combined into a weighted least-square general solution.
Cotter, Christopher R.; Schüttler, Heinz-Bernd; Igoshin, Oleg A.; Shimkets, Lawrence J.
2017-01-01
Collective cell movement is critical to the emergent properties of many multicellular systems, including microbial self-organization in biofilms, embryogenesis, wound healing, and cancer metastasis. However, even the best-studied systems lack a complete picture of how diverse physical and chemical cues act upon individual cells to ensure coordinated multicellular behavior. Known for its social developmental cycle, the bacterium Myxococcus xanthus uses coordinated movement to generate three-dimensional aggregates called fruiting bodies. Despite extensive progress in identifying genes controlling fruiting body development, cell behaviors and cell–cell communication mechanisms that mediate aggregation are largely unknown. We developed an approach to examine emergent behaviors that couples fluorescent cell tracking with data-driven models. A unique feature of this approach is the ability to identify cell behaviors affecting the observed aggregation dynamics without full knowledge of the underlying biological mechanisms. The fluorescent cell tracking revealed large deviations in the behavior of individual cells. Our modeling method indicated that decreased cell motility inside the aggregates, a biased walk toward aggregate centroids, and alignment among neighboring cells in a radial direction to the nearest aggregate are behaviors that enhance aggregation dynamics. Our modeling method also revealed that aggregation is generally robust to perturbations in these behaviors and identified possible compensatory mechanisms. The resulting approach of directly combining behavior quantification with data-driven simulations can be applied to more complex systems of collective cell movement without prior knowledge of the cellular machinery and behavioral cues. PMID:28533367
Geographical accessibility to community pharmacies by the elderly in metropolitan Lisbon.
Padeiro, Miguel
2018-07-01
In ageing societies, community pharmacies play an important role in delivering medicines, responsible advising, and other targeted services. Elderly people are among their main consumers, as they use more prescription drugs, need more specific health care, and experience more mobility issues than other age groups. This makes geographical accessibility a relevant concern for them. To measure geographical pedestrian accessibility to community pharmacies by elderly people in the Lisbon Metropolitan Area (LMA). The number of elderly people living within a 10- and 15-min walk was estimated based on the exploitation of population census data, the address-based location of 801 community pharmacies, and a Google Maps Application Programming Interface (API) method for calculating distances between pharmacies and the centroids of census statistical subsections. Results were compared to figures attained via traditional methods. In the LMA, 61.2% of the elderly live less than a 10 min walk from the nearest pharmacy and 76.9% live less than 15 min away. This opposes the common view that pharmacies are highly accessible in urban areas. In addition, results show a high spatial variability of proximity to pharmacies. Despite the illusion of good coverage suggested at the metropolitan scale, accessibility measures demonstrate the existence of pharmaceutical deprivation areas for the elderly. The findings indicate the need for more accuracy in both access measurements and redistribution policies. Measurement methods and population targets should be reconsidered. Copyright © 2017 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Martinez-Torteya, Antonio; Treviño-Alvarado, Víctor; Tamez-Peña, José
2013-02-01
The accurate diagnosis of Alzheimer's disease (AD) and mild cognitive impairment (MCI) confers many clinical research and patient care benefits. Studies have shown that multimodal biomarkers provide better diagnosis accuracy of AD and MCI than unimodal biomarkers, but their construction has been based on traditional statistical approaches. The objective of this work was the creation of accurate AD and MCI diagnostic multimodal biomarkers using advanced bioinformatics tools. The biomarkers were created by exploring multimodal combinations of features using machine learning techniques. Data was obtained from the ADNI database. The baseline information (e.g. MRI analyses, PET analyses and laboratory essays) from AD, MCI and healthy control (HC) subjects with available diagnosis up to June 2012 was mined for case/controls candidates. The data mining yielded 47 HC, 83 MCI and 43 AD subjects for biomarker creation. Each subject was characterized by at least 980 ADNI features. A genetic algorithm feature selection strategy was used to obtain compact and accurate cross-validated nearest centroid biomarkers. The biomarkers achieved training classification accuracies of 0.983, 0.871 and 0.917 for HC vs. AD, HC vs. MCI and MCI vs. AD respectively. The constructed biomarkers were relatively compact: from 5 to 11 features. Those multimodal biomarkers included several widely accepted univariate biomarkers and novel image and biochemical features. Multimodal biomarkers constructed from previously and non-previously AD associated features showed improved diagnostic performance when compared to those based solely on previously AD associated features.
Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data
NASA Astrophysics Data System (ADS)
Elhag, Mohamed; Boteva, Silvena
2016-10-01
Landscape fragmentation is noticeably practiced in Mediterranean regions and imposes substantial complications in several satellite image classification methods. To some extent, high spatial resolution data were able to overcome such complications. For better classification performances in Land Use Land Cover (LULC) mapping, the current research adopts different classification methods comparison for LULC mapping using Sentinel-2 satellite as a source of high spatial resolution. Both of pixel-based and an object-based classification algorithms were assessed; the pixel-based approach employs Maximum Likelihood (ML), Artificial Neural Network (ANN) algorithms, Support Vector Machine (SVM), and, the object-based classification uses the Nearest Neighbour (NN) classifier. Stratified Masking Process (SMP) that integrates a ranking process within the classes based on spectral fluctuation of the sum of the training and testing sites was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall by distinguishing most of the classes with the highest accuracy. NN succeeded to deal with artificial surface classes in general while agriculture area classes, and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy results than the SMP method overall with both classifiers used, ML and SVM.
Image-based Analysis of Emotional Facial Expressions in Full Face Transplants.
Bedeloglu, Merve; Topcu, Çagdas; Akgul, Arzu; Döger, Ela Naz; Sever, Refik; Ozkan, Ozlenen; Ozkan, Omer; Uysal, Hilmi; Polat, Ovunc; Çolak, Omer Halil
2018-01-20
In this study, it is aimed to determine the degree of the development in emotional expression of full face transplant patients from photographs. Hence, a rehabilitation process can be planned according to the determination of degrees as a later work. As envisaged, in full face transplant cases, the determination of expressions can be confused or cannot be achieved as the healthy control group. In order to perform image-based analysis, a control group consist of 9 healthy males and 2 full-face transplant patients participated in the study. Appearance-based Gabor Wavelet Transform (GWT) and Local Binary Pattern (LBP) methods are adopted for recognizing neutral and 6 emotional expressions which consist of angry, scared, happy, hate, confused and sad. Feature extraction was carried out by using both methods and combination of these methods serially. In the performed expressions, the extracted features of the most distinct zones in the facial area where the eye and mouth region, have been used to classify the emotions. Also, the combination of these region features has been used to improve classifier performance. Control subjects and transplant patients' ability to perform emotional expressions have been determined with K-nearest neighbor (KNN) classifier with region-specific and method-specific decision stages. The results have been compared with healthy group. It has been observed that transplant patients don't reflect some emotional expressions. Also, there were confusions among expressions.
An Event-Triggered Machine Learning Approach for Accelerometer-Based Fall Detection.
Putra, I Putu Edy Suardiyana; Brusey, James; Gaura, Elena; Vesilo, Rein
2017-12-22
The fixed-size non-overlapping sliding window (FNSW) and fixed-size overlapping sliding window (FOSW) approaches are the most commonly used data-segmentation techniques in machine learning-based fall detection using accelerometer sensors. However, these techniques do not segment by fall stages (pre-impact, impact, and post-impact) and thus useful information is lost, which may reduce the detection rate of the classifier. Aligning the segment with the fall stage is difficult, as the segment size varies. We propose an event-triggered machine learning (EvenT-ML) approach that aligns each fall stage so that the characteristic features of the fall stages are more easily recognized. To evaluate our approach, two publicly accessible datasets were used. Classification and regression tree (CART), k -nearest neighbor ( k -NN), logistic regression (LR), and the support vector machine (SVM) were used to train the classifiers. EvenT-ML gives classifier F-scores of 98% for a chest-worn sensor and 92% for a waist-worn sensor, and significantly reduces the computational cost compared with the FNSW- and FOSW-based approaches, with reductions of up to 8-fold and 78-fold, respectively. EvenT-ML achieves a significantly better F-score than existing fall detection approaches. These results indicate that aligning feature segments with fall stages significantly increases the detection rate and reduces the computational cost.
Using Bayesian neural networks to classify forest scenes
NASA Astrophysics Data System (ADS)
Vehtari, Aki; Heikkonen, Jukka; Lampinen, Jouko; Juujarvi, Jouni
1998-10-01
We present results that compare the performance of Bayesian learning methods for neural networks on the task of classifying forest scenes into trees and background. Classification task is demanding due to the texture richness of the trees, occlusions of the forest scene objects and diverse lighting conditions under operation. This makes it difficult to determine which are optimal image features for the classification. A natural way to proceed is to extract many different types of potentially suitable features, and to evaluate their usefulness in later processing stages. One approach to cope with large number of features is to use Bayesian methods to control the model complexity. Bayesian learning uses a prior on model parameters, combines this with evidence from a training data, and the integrates over the resulting posterior to make predictions. With this method, we can use large networks and many features without fear of overfitting. For this classification task we compare two Bayesian learning methods for multi-layer perceptron (MLP) neural networks: (1) The evidence framework of MacKay uses a Gaussian approximation to the posterior weight distribution and maximizes with respect to hyperparameters. (2) In a Markov Chain Monte Carlo (MCMC) method due to Neal, the posterior distribution of the network parameters is numerically integrated using the MCMC method. As baseline classifiers for comparison we use (3) MLP early stop committee, (4) K-nearest-neighbor and (5) Classification And Regression Tree.
Balouchestani, Mohammadreza; Krishnan, Sridhar
2014-01-01
Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for diagnostic and treatment purposes of heart diseases. Clustering and classification of collecting data are essential parts for detecting concealed information of P-QRS-T waves in the long-term ECG recording. Currently used algorithms do have their share of drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from huge energy consumption and load of sampling. These drawbacks motivated us in developing novel optimized clustering algorithm which could easily scan large ECG datasets for establishing low power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods: Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC) followed by sorting the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers are applied to the proposed algorithm. We show our algorithm based on PCA features in combination with K-NN classifier shows better performance than other methods. The proposed algorithm outperforms existing algorithms by increasing 11% classification accuracy. In addition, the proposed algorithm illustrates classification accuracy for K-NN and PNN classifiers, and a Receiver Operating Characteristics (ROC) area of 99.98%, 99.83%, and 99.75% respectively.
Chen, Peng; Li, Jinyan; Wong, Limsoon; Kuwahara, Hiroyuki; Huang, Jianhua Z; Gao, Xin
2013-08-01
Hot spot residues of proteins are fundamental interface residues that help proteins perform their functions. Detecting hot spots by experimental methods is costly and time-consuming. Sequential and structural information has been widely used in the computational prediction of hot spots. However, structural information is not always available. In this article, we investigated the problem of identifying hot spots using only physicochemical characteristics extracted from amino acid sequences. We first extracted 132 relatively independent physicochemical features from a set of the 544 properties in AAindex1, an amino acid index database. Each feature was utilized to train a classification model with a novel encoding schema for hot spot prediction by the IBk algorithm, an extension of the K-nearest neighbor algorithm. The combinations of the individual classifiers were explored and the classifiers that appeared frequently in the top performing combinations were selected. The hot spot predictor was built based on an ensemble of these classifiers and to work in a voting manner. Experimental results demonstrated that our method effectively exploited the feature space and allowed flexible weights of features for different queries. On the commonly used hot spot benchmark sets, our method significantly outperformed other machine learning algorithms and state-of-the-art hot spot predictors. The program is available at http://sfb.kaust.edu.sa/pages/software.aspx. Copyright © 2013 Wiley Periodicals, Inc.
Sertel, O.; Kong, J.; Shimada, H.; Catalyurek, U.V.; Saltz, J.H.; Gurcan, M.N.
2009-01-01
We are developing a computer-aided prognosis system for neuroblastoma (NB), a cancer of the nervous system and one of the most malignant tumors affecting children. Histopathological examination is an important stage for further treatment planning in routine clinical diagnosis of NB. According to the International Neuroblastoma Pathology Classification (the Shimada system), NB patients are classified into favorable and unfavorable histology based on the tissue morphology. In this study, we propose an image analysis system that operates on digitized H&E stained whole-slide NB tissue samples and classifies each slide as either stroma-rich or stroma-poor based on the degree of Schwannian stromal development. Our statistical framework performs the classification based on texture features extracted using co-occurrence statistics and local binary patterns. Due to the high resolution of digitized whole-slide images, we propose a multi-resolution approach that mimics the evaluation of a pathologist such that the image analysis starts from the lowest resolution and switches to higher resolutions when necessary. We employ an offine feature selection step, which determines the most discriminative features at each resolution level during the training step. A modified k-nearest neighbor classifier is used to determine the confidence level of the classification to make the decision at a particular resolution level. The proposed approach was independently tested on 43 whole-slide samples and provided an overall classification accuracy of 88.4%. PMID:20161324
NASA Astrophysics Data System (ADS)
Navratil, Peter; Wilps, Hans
2013-01-01
Three different object-based image classification techniques are applied to high-resolution satellite data for the mapping of the habitats of Asian migratory locust (Locusta migratoria migratoria) in the southern Aral Sea basin, Uzbekistan. A set of panchromatic and multispectral Système Pour l'Observation de la Terre-5 satellite images was spectrally enhanced by normalized difference vegetation index and tasseled cap transformation and segmented into image objects, which were then classified by three different classification approaches: a rule-based hierarchical fuzzy threshold (HFT) classification method was compared to a supervised nearest neighbor classifier and classification tree analysis by the quick, unbiased, efficient statistical trees algorithm. Special emphasis was laid on the discrimination of locust feeding and breeding habitats due to the significance of this discrimination for practical locust control. Field data on vegetation and land cover, collected at the time of satellite image acquisition, was used to evaluate classification accuracy. The results show that a robust HFT classifier outperformed the two automated procedures by 13% overall accuracy. The classification method allowed a reliable discrimination of locust feeding and breeding habitats, which is of significant importance for the application of the resulting data for an economically and environmentally sound control of locust pests because exact spatial knowledge on the habitat types allows a more effective surveying and use of pesticides.
NASA Astrophysics Data System (ADS)
Lesniak, J. M.; Hupse, R.; Blanc, R.; Karssemeijer, N.; Székely, G.
2012-08-01
False positive (FP) marks represent an obstacle for effective use of computer-aided detection (CADe) of breast masses in mammography. Typically, the problem can be approached either by developing more discriminative features or by employing different classifier designs. In this paper, the usage of support vector machine (SVM) classification for FP reduction in CADe is investigated, presenting a systematic quantitative evaluation against neural networks, k-nearest neighbor classification, linear discriminant analysis and random forests. A large database of 2516 film mammography examinations and 73 input features was used to train the classifiers and evaluate for their performance on correctly diagnosed exams as well as false negatives. Further, classifier robustness was investigated using varying training data and feature sets as input. The evaluation was based on the mean exam sensitivity in 0.05-1 FPs on normals on the free-response receiver operating characteristic curve (FROC), incorporated into a tenfold cross validation framework. It was found that SVM classification using a Gaussian kernel offered significantly increased detection performance (P = 0.0002) compared to the reference methods. Varying training data and input features, SVMs showed improved exploitation of large feature sets. It is concluded that with the SVM-based CADe a significant reduction of FPs is possible outperforming other state-of-the-art approaches for breast mass CADe.
NASA Astrophysics Data System (ADS)
Oung, Qi Wei; Nisha Basah, Shafriza; Muthusamy, Hariharan; Vijean, Vikneswaran; Lee, Hoileong
2018-03-01
Parkinson’s disease (PD) is one type of progressive neurodegenerative disease known as motor system syndrome, which is due to the death of dopamine-generating cells, a region of the human midbrain. PD normally affects people over 60 years of age, which at present has influenced a huge part of worldwide population. Lately, many researches have shown interest into the connection between PD and speech disorders. Researches have revealed that speech signals may be a suitable biomarker for distinguishing between people with Parkinson’s (PWP) from healthy subjects. Therefore, early diagnosis of PD through the speech signals can be considered for this aim. In this research, the speech data are acquired based on speech behaviour as the biomarker for differentiating PD severity levels (mild and moderate) from healthy subjects. Feature extraction algorithms applied are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), and Weighted Linear Prediction Cepstral Coefficients (WLPCC). For classification, two types of classifiers are used: k-Nearest Neighbour (KNN) and Probabilistic Neural Network (PNN). The experimental results demonstrated that PNN classifier and KNN classifier achieve the best average classification performance of 92.63% and 88.56% respectively through 10-fold cross-validation measures. Favourably, the suggested techniques have the possibilities of becoming a new choice of promising tools for the PD detection with tremendous performance.
Prediction of in vivo hepatotoxicity effects using in vitro ...
High-throughput in vitro transcriptomics data support molecular understanding of chemical-induced toxicity. Here, we evaluated the utility of such data to predict liver toxicity. First, in vitro gene expression data for 93 genes was generated following exposure of metabolically competent HepaRG cells to 1060 environmental chemicals from the US EPA ToxCast library. The empirical relationship between these data and rat chronic liver endpoints from animal studies in the Toxicity Reference Database (ToxRefDB) was then evaluated using machine learning techniques. Chemicals were classified as positive (242) or negative (135) based on observed hepatic histopathologic effects, and divided into three categories: hypertrophy (183), injury (112) and proliferative lesions (101). Hepatotoxicants were classified on the basis of the bioactivity of 93 genes (descriptors) using six machine learning algorithms: linear discriminant analysis, naïve Bayes, support vector classification, classification and regression trees, k-nearest neighbors, and an ensemble of classifiers. Classification performance was evaluated using 10-fold cross-validation testing, and in-loop, filter-based, feature subset selection. The best balanced accuracy for prediction of hypertrophy, injury and proliferative lesions were 0.81 ± 0.07, 0.79 ± 0.08 and 0.77 ± 0.09, respectively. Gene specific perturbation of xenobiotic metabolism enzymes (CYP7A1/2E1/4A11/1A1/4A22) and transporters (ABCG2, ABCB11, SLC22
Ipsen, Andreas
2017-02-03
Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroidsmore » of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.« less
Optimum threshold selection method of centroid computation for Gaussian spot
NASA Astrophysics Data System (ADS)
Li, Xuxu; Li, Xinyang; Wang, Caixia
2015-10-01
Centroid computation of Gaussian spot is often conducted to get the exact position of a target or to measure wave-front slopes in the fields of target tracking and wave-front sensing. Center of Gravity (CoG) is the most traditional method of centroid computation, known as its low algorithmic complexity. However both electronic noise from the detector and photonic noise from the environment reduces its accuracy. In order to improve the accuracy, thresholding is unavoidable before centroid computation, and optimum threshold need to be selected. In this paper, the model of Gaussian spot is established to analyze the performance of optimum threshold under different Signal-to-Noise Ratio (SNR) conditions. Besides, two optimum threshold selection methods are introduced: TmCoG (using m % of the maximum intensity of spot as threshold), and TkCoG ( usingμn +κσ n as the threshold), μn and σn are the mean value and deviation of back noise. Firstly, their impact on the detection error under various SNR conditions is simulated respectively to find the way to decide the value of k or m. Then, a comparison between them is made. According to the simulation result, TmCoG is superior over TkCoG for the accuracy of selected threshold, and detection error is also lower.
Structure and seasonal variations of the nocturnal mesospheric K layer at Arecibo
NASA Astrophysics Data System (ADS)
Yue, Xianchang; Friedman, Jonathan S.; Wu, Xiongbin; Zhou, Qihou H.
2017-07-01
We present the seasonal variations of the nocturnal mesospheric potassium (K) layer at Arecibo, Puerto Rico (18.35°N, 66.75°W) from 160 nights of K Doppler lidar observations between December 2003 and January 2010, during which the solar activity is mostly low. The background temperature is also measured simultaneously by the lidar and shows a strong semiannual oscillation with maxima occurring during equinoxes at all altitudes. The annual mean K density profile is approximately Gaussian with a peak altitude of 91.7 km. The K column abundance and the centroid height have strong semiannual variations, with maxima at the solstices. Both parameters are negatively correlated to the mean background temperature with a correlation coefficient < -0.5. The root-mean-square (RMS) width has a distinct annual oscillation with the largest width occurring in May. The seasonal variation of the centroid height is similar to that of the Fe layer at the same site. The seasonal temperature variation indicates significant enhanced wave-induced downward transport for both species during spring and autumn. This explains the metal layer centroid height and column abundance variations at Arecibo and provides a general mechanism to account for the seasonal variations in the centroid height of all metal species measured at low-latitude and midlatitude sites.
Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine
2010-08-01
The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for some conserved residue as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated allowing for millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.
Statistical Properties of Line Centroid Velocity Increments in the rho Ophiuchi Cloud
NASA Technical Reports Server (NTRS)
Lis, D. C.; Keene, Jocelyn; Li, Y.; Phillips, T. G.; Pety, J.
1998-01-01
We present a comparison of histograms of CO (2-1) line centroid velocity increments in the rho Ophiuchi molecular cloud with those computed for spectra synthesized from a three-dimensional, compressible, but non-starforming and non-gravitating hydrodynamic simulation. Histograms of centroid velocity increments in the rho Ophiuchi cloud show clearly non-Gaussian wings, similar to those found in histograms of velocity increments and derivatives in experimental studies of laboratory and atmospheric flows, as well as numerical simulations of turbulence. The magnitude of these wings increases monotonically with decreasing separation, down to the angular resolution of the data. This behavior is consistent with that found in the phase of the simulation which has most of the properties of incompressible turbulence. The time evolution of the magnitude of the non-Gaussian wings in the histograms of centroid velocity increments in the simulation is consistent with the evolution of the vorticity in the flow. However, we cannot exclude the possibility that the wings are associated with the shock interaction regions. Moreover, in an active starforming region like the rho Ophiuchi cloud, the effects of shocks may be more important than in the simulation. However, being able to identify shock interaction regions in the interstellar medium is also important, since numerical simulations show that vorticity is generated in shock interactions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ipsen, Andreas
Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroidsmore » of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.« less
Precision targeting in guided munition using IR sensor and MmW radar
NASA Astrophysics Data System (ADS)
Sreeja, S.; Hablani, H. B.; Arya, H.
2015-10-01
Conventional munitions are not guided with sensors and therefore miss the target, particularly if the target is mobile. The miss distance of these munitions can be decreased by incorporating sensors to detect the target and guide the munition during flight. This paper is concerned with a Precision Guided Munition(PGM) equipped with an infrared sensor and a millimeter wave radar [IR and MmW, for short]. Three-dimensional flight of the munition and its pitch and yaw motion models are developed and simulated. The forward and lateral motion of a target tank on the ground is modeled as two independent second-order Gauss-Markov process. To estimate the target location on the ground and the line-of-sight rate to intercept it an Extended Kalman Filter is composed whose state vector consists of cascaded state vectors of missile dynamics and target dynamics. The line-of-sight angle measurement from the infrared seeker is by centroiding the target image in 40 Hz. The centroid estimation of the images in the focal plane is at a frequency of 10 Hz. Every 10 Hz, centroids of four consecutive images are averaged, yielding a time-averaged centroid, implying some measurement delay. The miss distance achieved by including by image processing delays is 1:45m.
Precision targeting in guided munition using infrared sensor and millimeter wave radar
NASA Astrophysics Data System (ADS)
Sulochana, Sreeja; Hablani, Hari B.; Arya, Hemendra
2016-07-01
Conventional munitions are not guided with sensors and therefore miss the target, particularly if the target is mobile. The miss distance of these munitions can be decreased by incorporating sensors to detect the target and guide the munition during flight. This paper is concerned with a precision guided munition equipped with an infrared (IR) sensor and a millimeter wave radar (MmW). Three-dimensional flight of the munition and its pitch and yaw motion models are developed and simulated. The forward and lateral motion of a target tank on the ground is modeled as two independent second-order Gauss-Markov processes. To estimate the target location on the ground and the line-of-sight (LOS) rate to intercept it, an extended Kalman filter is composed whose state vector consists of cascaded state vectors of missile dynamics and target dynamics. The LOS angle measurement from the IR seeker is by centroiding the target image in 40 Hz. The centroid estimation of the images in the focal plane is at a frequency of 10 Hz. Every 10 Hz, centroids of four consecutive images are averaged, yielding a time-averaged centroid, implying some measurement delay. The miss distance achieved by including image processing delays is 1.45 m.
Diagnostic tools for nearest neighbors techniques when used with satellite imagery
Ronald E. McRoberts
2009-01-01
Nearest neighbors techniques are non-parametric approaches to multivariate prediction that are useful for predicting both continuous and categorical forest attribute variables. Although some assumptions underlying nearest neighbor techniques are common to other prediction techniques such as regression, other assumptions are unique to nearest neighbor techniques....
Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy
2014-01-01
Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered as the most common and effective methods in classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above mentioned methods are presented. The purpose is benefitting from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results that included arrays of the top-ranked features were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model was benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive to the existing state-of-the-art classification models. PMID:25419659
Ising lattices with +/-J second-nearest-neighbor interactions
NASA Astrophysics Data System (ADS)
Ramírez-Pastor, A. J.; Nieto, F.; Vogel, E. E.
1997-06-01
Second-nearest-neighbor interactions are added to the usual nearest-neighbor Ising Hamiltonian for square lattices in different ways. The starting point is a square lattice where half the nearest-neighbor interactions are ferromagnetic and the other half of the bonds are antiferromagnetic. Then, second-nearest-neighbor interactions can also be assigned randomly or in a variety of causal manners determined by the nearest-neighbor interactions. In the present paper we consider three causal and three random ways of assigning second-nearest-neighbor exchange interactions. Several ground-state properties are then calculated for each of these lattices:energy per bond ɛg, site correlation parameter pg, maximal magnetization μg, and fraction of unfrustrated bonds hg. A set of 500 samples is considered for each size N (number of spins) and array (way of distributing the N spins). The properties of the original lattices with only nearest-neighbor interactions are already known, which allows realizing the effect of the additional interactions. We also include cubic lattices to discuss the distinction between coordination number and dimensionality. Comparison with results for triangular and honeycomb lattices is done at specific points.
PyCCF: Python Cross Correlation Function for reverberation mapping studies
NASA Astrophysics Data System (ADS)
Sun, Mouyuan; Grier, C. J.; Peterson, B. M.
2018-05-01
PyCCF emulates a Fortran program written by B. Peterson for use with reverberation mapping. The code cross correlates two light curves that are unevenly sampled using linear interpolation and measures the peak and centroid of the cross-correlation function. In addition, it is possible to run Monto Carlo iterations using flux randomization and random subset selection (RSS) to produce cross-correlation centroid distributions to estimate the uncertainties in the cross correlation results.
Immune Centroids Over-Sampling Method for Multi-Class Classification
2015-05-22
recognize to specific antigens . The response of a receptor to an antigen can activate its hosting B-cell. Activated B-cell then proliferates and...modifying N.K. Jerne’s theory. The theory states that in a pre-existing group of lympho- cytes ( specifically B cells), a specific antigen only...the clusters of each small class, which have high data density, called global immune centroids over-sampling (denoted as Global-IC). Specifically
A motion detection system for AXAF X-ray ground testing
NASA Technical Reports Server (NTRS)
Arenberg, Jonathan W.; Texter, Scott C.
1993-01-01
The concept, implementation, and performance of the motion detection system (MDS) designed as a diagnostic for X-ray ground testing for AXAF are described. The purpose of the MDS is to measure the magnitude of a relative rigid body motion among the AXAF test optic, the X-ray source, and X-ray focal plane detector. The MDS consists of a point source, lens, centroid detector, transimpedance amplifier, and computer system. Measurement of the centroid position of the image of the optical point source provides a direct measure of the motions of the X-ray optical system. The outputs from the detector and filter/amplifier are digitized and processed using the calibration with a 50 Hz bandwidth to give the centroid's location on the detector. Resolution of 0.008 arcsec has been achieved by this system. Data illustrating the performance of the motion detection system are also presented.
Assessment of auditory impression of the coolness and warmness of automotive HVAC noise.
Nakagawa, Seiji; Hotehama, Takuya; Kamiya, Masaru
2017-07-01
Noise induced by a heating, ventilation and air conditioning (HVAC) system in a vehicle is an important factor that affects the comfort of the interior of a car cabin. Much effort has been devoted to reduce noise levels, however, there is a need for a new sound design that addresses the noise problem from a different point of view. In this study, focusing on the auditory impression of automotive HVAC noise concerning coolness and warmness, psychoacoustical listening tests were performed using a paired comparison technique under various conditions of room temperature. Five stimuli were synthesized by stretching the spectral envelopes of recorded automotive HVAC noise to assess the effect of the spectral centroid, and were presented to normal-hearing subjects. Results show that the spectral centroid significantly affects the auditory impression concerning coolness and warmness; a higher spectral centroid induces a cooler auditory impression regardless of the room temperature.
NASA Astrophysics Data System (ADS)
Bansal, A. R.; Anand, S. P.; Rajaram, Mita; Rao, V. K.; Dimri, V. P.
2013-09-01
The depth to the bottom of the magnetic sources (DBMS) has been estimated from the aeromagnetic data of Central India. The conventional centroid method of DBMS estimation assumes random uniform uncorrelated distribution of sources and to overcome this limitation a modified centroid method based on scaling distribution has been proposed. Shallower values of the DBMS are found for the south western region. The DBMS values are found as low as 22 km in the south west Deccan trap covered regions and as deep as 43 km in the Chhattisgarh Basin. In most of the places DBMS are much shallower than the Moho depth, earlier found from the seismic study and may be representing the thermal/compositional/petrological boundaries. The large variation in the DBMS indicates the complex nature of the Indian crust.
Mackenbach, Joreintje D.; Lakerveld, Jeroen; Forouhi, Nita G.; Griffin, Simon J.; Brage, Søren; Wareham, Nicholas J.; Monsivais, Pablo
2017-01-01
U.S. policy initiatives have sought to improve health through attracting neighborhood supermarket investment. Little evidence exists to suggest that these policies will be effective, in particular where there are socioeconomic barriers to healthy eating. We measured the independent associations and combined interplay of supermarket access and socioeconomic status with obesity. Using data on 9702 UK adults, we employed adjusted regression analyses to estimate measured BMI (kg/m2), overweight (25 ≥ BMI < 30) and obesity (≥30), across participants’ highest educational attainment (three groups) and tertiles of street network distance (km) from home location to nearest supermarket. Jointly-classified models estimated combined associations of education and supermarket distance, and relative excess risk due to interaction (RERI). Participants farthest away from their nearest supermarket had higher odds of obesity (OR 1.33, 95% CI: 1.11, 1.58), relative to those living closest. Lower education was also associated with higher odds of obesity. Those least-educated and living farthest away had 3.39 (2.46–4.65) times the odds of being obese, compared to those highest-educated and living closest, with an excess obesity risk (RERI = 0.09); results were similar for overweight. Our results suggest that public health can be improved through planning better access to supermarkets, in combination with interventions to address socioeconomic barriers. PMID:29068365
Burgoine, Thomas; Mackenbach, Joreintje D; Lakerveld, Jeroen; Forouhi, Nita G; Griffin, Simon J; Brage, Søren; Wareham, Nicholas J; Monsivais, Pablo
2017-10-25
U.S. policy initiatives have sought to improve health through attracting neighborhood supermarket investment. Little evidence exists to suggest that these policies will be effective, in particular where there are socioeconomic barriers to healthy eating. We measured the independent associations and combined interplay of supermarket access and socioeconomic status with obesity. Using data on 9702 UK adults, we employed adjusted regression analyses to estimate measured BMI (kg/m²), overweight (25 ≥ BMI < 30) and obesity (≥30), across participants' highest educational attainment (three groups) and tertiles of street network distance (km) from home location to nearest supermarket. Jointly-classified models estimated combined associations of education and supermarket distance, and relative excess risk due to interaction (RERI). Participants farthest away from their nearest supermarket had higher odds of obesity (OR 1.33, 95% CI: 1.11, 1.58), relative to those living closest. Lower education was also associated with higher odds of obesity. Those least-educated and living farthest away had 3.39 (2.46-4.65) times the odds of being obese, compared to those highest-educated and living closest, with an excess obesity risk (RERI = 0.09); results were similar for overweight. Our results suggest that public health can be improved through planning better access to supermarkets, in combination with interventions to address socioeconomic barriers.
NASA Astrophysics Data System (ADS)
Taha, Zahari; Muazu Musa, Rabiu; Majeed, Anwar P. P. Abdul; Razali Abdullah, Mohamad; Muaz Alim, Muhammad; Nasir, Ahmad Fakhri Ab
2018-04-01
The present study aims at classifying and predicting high and low potential archers from a collection of psychological coping skills variables trained on different k-Nearest Neighbour (k-NN) kernels. 50 youth archers with the average age and standard deviation of (17.0 ±.056) gathered from various archery programmes completed a one end shooting score test. Psychological coping skills inventory which evaluates the archers level of related coping skills were filled out by the archers prior to their shooting tests. k-means cluster analysis was applied to cluster the archers based on their scores on variables assessed k-NN models, i.e. fine, medium, coarse, cosine, cubic and weighted kernel functions, were trained on the psychological variables. The k-means clustered the archers into high psychologically prepared archers (HPPA) and low psychologically prepared archers (LPPA), respectively. It was demonstrated that the cosine k-NN model exhibited good accuracy and precision throughout the exercise with an accuracy of 94% and considerably fewer error rate for the prediction of the HPPA and the LPPA as compared to the rest of the models. The findings of this investigation can be valuable to coaches and sports managers to recognise high potential athletes from the selected psychological coping skills variables examined which would consequently save time and energy during talent identification and development programme.
Quantum nuclear effects in water using centroid molecular dynamics
NASA Astrophysics Data System (ADS)
Kondratyuk, N. D.; Norman, G. E.; Stegailov, V. V.
2018-01-01
The quantum nuclear effects are studied in water using the method of centroid molecular dynamics (CMD). The aim is the calibration of CMD implementation in LAMMPS. The calculated intramolecular energy, atoms gyration radii and radial distribution functions are shown in comparison with previous works. The work is assumed to be the step toward to solution of the discrepancy between the simulation results and the experimental data of liquid n-alkane properties in our previous works.
Improving experimental phases for strong reflections prior to density modification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uervirojnangkoorn, Monarin; Hilgenfeld, Rolf; Terwilliger, Thomas C.
Experimental phasing of diffraction data from macromolecular crystals involves deriving phase probability distributions. These distributions are often bimodal, making their weighted average, the centroid phase, improbable, so that electron-density maps computed using centroid phases are often non-interpretable. Density modification brings in information about the characteristics of electron density in protein crystals. In successful cases, this allows a choice between the modes in the phase probability distributions, and the maps can cross the borderline between non-interpretable and interpretable. Based on the suggestions by Vekhter [Vekhter (2005), Acta Cryst. D 61, 899–902], the impact of identifying optimized phases for a small numbermore » of strong reflections prior to the density-modification process was investigated while using the centroid phase as a starting point for the remaining reflections. A genetic algorithm was developed that optimizes the quality of such phases using the skewness of the density map as a target function. Phases optimized in this way are then used in density modification. In most of the tests, the resulting maps were of higher quality than maps generated from the original centroid phases. In one of the test cases, the new method sufficiently improved a marginal set of experimental SAD phases to enable successful map interpretation. Lastly, a computer program, SISA, has been developed to apply this method for phase improvement in macromolecular crystallography.« less
Improving experimental phases for strong reflections prior to density modification
Uervirojnangkoorn, Monarin; Hilgenfeld, Rolf; Terwilliger, Thomas C.; ...
2013-09-20
Experimental phasing of diffraction data from macromolecular crystals involves deriving phase probability distributions. These distributions are often bimodal, making their weighted average, the centroid phase, improbable, so that electron-density maps computed using centroid phases are often non-interpretable. Density modification brings in information about the characteristics of electron density in protein crystals. In successful cases, this allows a choice between the modes in the phase probability distributions, and the maps can cross the borderline between non-interpretable and interpretable. Based on the suggestions by Vekhter [Vekhter (2005), Acta Cryst. D 61, 899–902], the impact of identifying optimized phases for a small numbermore » of strong reflections prior to the density-modification process was investigated while using the centroid phase as a starting point for the remaining reflections. A genetic algorithm was developed that optimizes the quality of such phases using the skewness of the density map as a target function. Phases optimized in this way are then used in density modification. In most of the tests, the resulting maps were of higher quality than maps generated from the original centroid phases. In one of the test cases, the new method sufficiently improved a marginal set of experimental SAD phases to enable successful map interpretation. Lastly, a computer program, SISA, has been developed to apply this method for phase improvement in macromolecular crystallography.« less
An adaptive tracker for ShipIR/NTCS
NASA Astrophysics Data System (ADS)
Ramaswamy, Srinivasan; Vaitekunas, David A.
2015-05-01
A key component in any image-based tracking system is the adaptive tracking algorithm used to segment the image into potential targets, rank-and-select the best candidate target, and the gating of the selected target to further improve tracker performance. This paper will describe a new adaptive tracker algorithm added to the naval threat countermeasure simulator (NTCS) of the NATO-standard ship signature model (ShipIR). The new adaptive tracking algorithm is an optional feature used with any of the existing internal NTCS or user-defined seeker algorithms (e.g., binary centroid, intensity centroid, and threshold intensity centroid). The algorithm segments the detected pixels into clusters, and the smallest set of clusters that meet the detection criterion is obtained by using a knapsack algorithm to identify the set of clusters that should not be used. The rectangular area containing the chosen clusters defines an inner boundary, from which a weighted centroid is calculated as the aim-point. A track-gate is then positioned around the clusters, taking into account the rate of change of the bounding area and compensating for any gimbal displacement. A sequence of scenarios is used to test the new tracking algorithm on a generic unclassified DDG ShipIR model, with and without flares, and demonstrate how some of the key seeker signals are impacted by both the ship and flare intrinsic signatures.
NASA Astrophysics Data System (ADS)
Bansal, A. R.; Anand, S.; Rajaram, M.; Rao, V.; Dimri, V. P.
2012-12-01
The depth to the bottom of the magnetic sources (DBMS) may be used as an estimate of the Curie - point depth. The DBMSs can also be interpreted in term of thermal structure of the crust. The thermal structure of the crust is a sensitive parameter and depends on the many properties of crust e.g. modes of deformation, depths of brittle and ductile deformation zones, regional heat flow variations, seismicity, subsidence/uplift patterns and maturity of organic matter in sedimentary basins. The conventional centroid method of DBMS estimation assumes random uniform uncorrelated distribution of sources and to overcome this limitation a modified centroid method based on fractal distribution has been proposed. We applied this modified centroid method to the aeromagnetic data of the central Indian region and selected 29 half overlapping blocks of dimension 200 km x 200 km covering different parts of the central India. Shallower values of the DBMS are found for the western and southern portion of Indian shield. The DBMSs values are found as low as close to middle crust in the south west Deccan trap and probably deeper than Moho in the Chhatisgarh basin. In few places DBMS are close to the Moho depth found from the seismic study and others places shallower than the Moho. The DBMS indicate complex nature of the Indian crust.
NASA Astrophysics Data System (ADS)
Zocchi, Fabio E.
2017-10-01
One of the approaches that is being tested for the integration of the mirror modules of the advanced telescope for high-energy astrophysics x-ray mission of the European Space Agency consists in aligning each module on an optical bench operated at an ultraviolet wavelength. The mirror module is illuminated by a plane wave and, in order to overcome diffraction effects, the centroid of the image produced by the module is used as a reference to assess the accuracy of the optical alignment of the mirror module itself. Among other sources of uncertainty, the wave-front error of the plane wave also introduces an error in the position of the centroid, thus affecting the quality of the mirror module alignment. The power spectral density of the position of the point spread function centroid is here derived from the power spectral density of the wave-front error of the plane wave in the framework of the scalar theory of Fourier diffraction. This allows the defining of a specification on the collimator quality used for generating the plane wave starting from the contribution to the error budget allocated for the uncertainty of the centroid position. The theory generally applies whenever Fourier diffraction is a valid approximation, in which case the obtained result is identical to that derived by geometrical optics considerations.
Data mining neocortical high-frequency oscillations in epilepsy and controls.
Blanco, Justin A; Stead, Matt; Krieger, Abba; Stacey, William; Maus, Douglas; Marsh, Eric; Viventi, Jonathan; Lee, Kendall H; Marsh, Richard; Litt, Brian; Worrell, Gregory A
2011-10-01
Transient high-frequency (100-500 Hz) oscillations of the local field potential have been studied extensively in human mesial temporal lobe. Previous studies report that both ripple (100-250 Hz) and fast ripple (250-500 Hz) oscillations are increased in the seizure-onset zone of patients with mesial temporal lobe epilepsy. Comparatively little is known, however, about their spatial distribution with respect to seizure-onset zone in neocortical epilepsy, or their prevalence in normal brain. We present a quantitative analysis of high-frequency oscillations and their rates of occurrence in a group of nine patients with neocortical epilepsy and two control patients with no history of seizures. Oscillations were automatically detected and classified using an unsupervised approach in a data set of unprecedented volume in epilepsy research, over 12 terabytes of continuous long-term micro- and macro-electrode intracranial recordings, without human preprocessing, enabling selection-bias-free estimates of oscillation rates. There are three main results: (i) a cluster of ripple frequency oscillations with median spectral centroid = 137 Hz is increased in the seizure-onset zone more frequently than a cluster of fast ripple frequency oscillations (median spectral centroid = 305 Hz); (ii) we found no difference in the rates of high frequency oscillations in control neocortex and the non-seizure-onset zone neocortex of patients with epilepsy, despite the possibility of different underlying mechanisms of generation; and (iii) while previous studies have demonstrated that oscillations recorded by parenchyma-penetrating micro-electrodes have higher peak 100-500 Hz frequencies than penetrating macro-electrodes, this was not found for the epipial electrodes used here to record from the neocortical surface. We conclude that the relative rate of ripple frequency oscillations is a potential biomarker for epileptic neocortex, but that larger prospective studies correlating high-frequency oscillations rates with seizure-onset zone, resected tissue and surgical outcome are required to determine the true predictive value.
Machine-learning approaches to select Wolf-Rayet candidates
NASA Astrophysics Data System (ADS)
Marston, A. P.; Morello, G.; Morris, P.; van Dyk, S.; Mauerhan, J.
2017-11-01
The WR stellar population can be distinguished, at least partially, from other stellar populations by broad-band IR colour selection. We present the use of a machine learning classifier to quantitatively improve the selection of Galactic Wolf-Rayet (WR) candidates. These methods are used to separate the other stellar populations which have similar IR colours. We show the results of the classifications obtained by using the 2MASS J, H and K photometric bands, and the Spitzer/IRAC bands at 3.6, 4.5, 5.8 and 8.0μm. The k-Nearest Neighbour method has been used to select Galactic WR candidates for observational follow-up. A few candidates have been spectroscopically observed. Preliminary observations suggest that a detection rate of 50% can easily be achieved.
Compact localized states and flat-band generators in one dimension
NASA Astrophysics Data System (ADS)
Maimaiti, Wulayimu; Andreanov, Alexei; Park, Hee Chul; Gendelman, Oleg; Flach, Sergej
2017-03-01
Flat bands (FB) are strictly dispersionless bands in the Bloch spectrum of a periodic lattice Hamiltonian, recently observed in a variety of photonic and dissipative condensate networks. FB Hamiltonians are fine-tuned networks, still lacking a comprehensive generating principle. We introduce a FB generator based on local network properties. We classify FB networks through the properties of compact localized states (CLS) which are exact FB eigenstates and occupy U unit cells. We obtain the complete two-parameter FB family of two-band d =1 networks with nearest unit cell interaction and U =2 . We discover a novel high symmetry sawtooth chain with identical hoppings in a transverse dc field, easily accessible in experiments. Our results pave the way towards a complete description of FBs in networks with more bands and in higher dimensions.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.
Transportation Modes Classification Using Sensors on Smartphones
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-01-01
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182
Nonlinear features for product inspection
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.
1999-03-01
Classification of real-time X-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work, the MRDF is applied to standard features (rather than iconic data). The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC (receiver operating characteristic) data.
Song, Wen; Ade, Carl; Broxterman, Ryan; Barstow, Thomas; Nelson, Thomas; Warren, Steve
2012-01-01
Accelerometer data provide useful information about subject activity in many different application scenarios. For this study, single-accelerometer data were acquired from subjects participating in field tests that mimic tasks that astronauts might encounter in reduced gravity environments. The primary goal of this effort was to apply classification algorithms that could identify these tasks based on features present in their corresponding accelerometer data, where the end goal is to establish methods to unobtrusively gauge subject well-being based on sensors that reside in their local environment. In this initial analysis, six different activities that involve leg movement are classified. The k-Nearest Neighbors (kNN) algorithm was found to be the most effective, with an overall classification success rate of 90.8%.
Turbulent chimeras in large semiconductor laser arrays
Shena, J.; Hizanidis, J.; Kovanis, V.; Tsironis, G. P.
2017-01-01
Semiconductor laser arrays have been investigated experimentally and theoretically from the viewpoint of temporal and spatial coherence for the past forty years. In this work, we are focusing on a rather novel complex collective behavior, namely chimera states, where synchronized clusters of emitters coexist with unsynchronized ones. For the first time, we find such states exist in large diode arrays based on quantum well gain media with nearest-neighbor interactions. The crucial parameters are the evanescent coupling strength and the relative optical frequency detuning between the emitters of the array. By employing a recently proposed figure of merit for classifying chimera states, we provide quantitative and qualitative evidence for the observed dynamics. The corresponding chimeras are identified as turbulent according to the irregular temporal behavior of the classification measure. PMID:28165053
Turbulent chimeras in large semiconductor laser arrays
NASA Astrophysics Data System (ADS)
Shena, J.; Hizanidis, J.; Kovanis, V.; Tsironis, G. P.
2017-02-01
Semiconductor laser arrays have been investigated experimentally and theoretically from the viewpoint of temporal and spatial coherence for the past forty years. In this work, we are focusing on a rather novel complex collective behavior, namely chimera states, where synchronized clusters of emitters coexist with unsynchronized ones. For the first time, we find such states exist in large diode arrays based on quantum well gain media with nearest-neighbor interactions. The crucial parameters are the evanescent coupling strength and the relative optical frequency detuning between the emitters of the array. By employing a recently proposed figure of merit for classifying chimera states, we provide quantitative and qualitative evidence for the observed dynamics. The corresponding chimeras are identified as turbulent according to the irregular temporal behavior of the classification measure.
Motion data classification on the basis of dynamic time warping with a cloud point distance measure
NASA Astrophysics Data System (ADS)
Switonski, Adam; Josinski, Henryk; Zghidi, Hafedh; Wojciechowski, Konrad
2016-06-01
The paper deals with the problem of classification of model free motion data. The nearest neighbors classifier which is based on comparison performed by Dynamic Time Warping transform with cloud point distance measure is proposed. The classification utilizes both specific gait features reflected by a movements of subsequent skeleton joints and anthropometric data. To validate proposed approach human gait identification challenge problem is taken into consideration. The motion capture database containing data of 30 different humans collected in Human Motion Laboratory of Polish-Japanese Academy of Information Technology is used. The achieved results are satisfactory, the obtained accuracy of human recognition exceeds 90%. What is more, the applied cloud point distance measure does not depend on calibration process of motion capture system which results in reliable validation.
Anomalous quantum heat transport in a one-dimensional harmonic chain with random couplings.
Yan, Yonghong; Zhao, Hui
2012-07-11
We investigate quantum heat transport in a one-dimensional harmonic system with random couplings. In the presence of randomness, phonon modes may normally be classified as ballistic, diffusive or localized. We show that these modes can roughly be characterized by the local nearest-neighbor level spacing distribution, similarly to their electronic counterparts. We also show that the thermal conductance G(th) through the system decays rapidly with the system size (G(th) ∼ L(-α)). The exponent α strongly depends on the system size and can change from α < 1 to α > 1 with increasing system size, indicating that the system undergoes a transition from a heat conductor to a heat insulator. This result could be useful in thermal control of low-dimensional systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Avkshtol, V; Tanny, S; Reddy, K
Purpose: Stereotactic radiation therapy (SRT) provides an excellent alternative to embolization and surgical excision for the management of appropriately selected cerebral arteriovenous malformations (AVMs). The currently accepted standard for delineating AVMs is planar digital subtraction angiography (DSA). DSA can be used to acquire a 3D data set that preserves osseous structures (3D-DA) at the time of the angiography for SRT planning. Magnetic resonance imaging (MRI) provides an alternative noninvasive method of visualizing the AVM nidus with comparable spatial resolution. We utilized 3D-DA and T1 post-contrast MRI data to evaluate the differences in SRT target volumes. Methods: Four patients underwent 3D-DAmore » and high-resolution MRI. 3D T1 post-contrast images were obtained in all three reconstruction planes. A planning CT was fused with MRI and 3D-DA data sets. The AVMs were contoured utilizing one of the image sets at a time. Target volume, centroid, and maximum and minimum dimensions were analyzed for each patient. Results: Targets delineated using post-contrast MRI demonstrated a larger mean volume. AVMs >2 cc were found to have a larger difference between MRI and 3D-DA volumes. Larger AVMs also demonstrated a smaller relative uncertainty in contour centroid position (1 mm). AVM targets <2 cc had smaller absolute differences in volume, but larger differences in contour centroid position (2.5 mm). MRI targets demonstrated a more irregular shape compared to 3D-DA targets. Conclusions: Our preliminary data supports the use of MRI alone to delineate AVM targets >2 cc. The greater centroid stability for AVMs >2 cc ensures accurate target localization during image fusion. The larger MRI target volumes did not result in prohibitively greater volumes of normal brain tissue receiving the prescription dose. The larger centroid instability for AVMs <2 cc precludes the use of MRI alone for target delineation. We recommend incorporating a 3D-DA for these patients.« less
NASA Astrophysics Data System (ADS)
Salman, S. S.; Abbas, W. A.
2018-05-01
The goal of the study is to support analysis Enhancement of Resolution and study effect on classification methods on bands spectral information of specific and quantitative approaches. In this study introduce a method to enhancement resolution Landsat 8 of combining the bands spectral of 30 meters resolution with panchromatic band 8 of 15 meters resolution, because of importance multispectral imagery to extracting land - cover. Classification methods used in this study to classify several lands -covers recorded from OLI- 8 imagery. Two methods of Data mining can be classified as either supervised or unsupervised. In supervised methods, there is a particular predefined target, that means the algorithm learn which values of the target are associated with which values of the predictor sample. K-nearest neighbors and maximum likelihood algorithms examine in this work as supervised methods. In other hand, no sample identified as target in unsupervised methods, the algorithm of data extraction searches for structure and patterns between all the variables, represented by Fuzzy C-mean clustering method as one of the unsupervised methods, NDVI vegetation index used to compare the results of classification method, the percent of dense vegetation in maximum likelihood method give a best results.
Comparison of GOES Cloud Classification Algorithms Employing Explicit and Implicit Physics
NASA Technical Reports Server (NTRS)
Bankert, Richard L.; Mitrescu, Cristian; Miller, Steven D.; Wade, Robert H.
2009-01-01
Cloud-type classification based on multispectral satellite imagery data has been widely researched and demonstrated to be useful for distinguishing a variety of classes using a wide range of methods. The research described here is a comparison of the classifier output from two very different algorithms applied to Geostationary Operational Environmental Satellite (GOES) data over the course of one year. The first algorithm employs spectral channel thresholding and additional physically based tests. The second algorithm was developed through a supervised learning method with characteristic features of expertly labeled image samples used as training data for a 1-nearest-neighbor classification. The latter's ability to identify classes is also based in physics, but those relationships are embedded implicitly within the algorithm. A pixel-to-pixel comparison analysis was done for hourly daytime scenes within a region in the northeastern Pacific Ocean. Considerable agreement was found in this analysis, with many of the mismatches or disagreements providing insight to the strengths and limitations of each classifier. Depending upon user needs, a rule-based or other postprocessing system that combines the output from the two algorithms could provide the most reliable cloud-type classification.
Super-resolution method for face recognition using nonlinear mappings on coherent features.
Huang, Hua; He, Huiting
2011-01-01
Low-resolution (LR) of face images significantly decreases the performance of face recognition. To address this problem, we present a super-resolution method that uses nonlinear mappings to infer coherent features that favor higher recognition of the nearest neighbor (NN) classifiers for recognition of single LR face image. Canonical correlation analysis is applied to establish the coherent subspaces between the principal component analysis (PCA) based features of high-resolution (HR) and LR face images. Then, a nonlinear mapping between HR/LR features can be built by radial basis functions (RBFs) with lower regression errors in the coherent feature space than in the PCA feature space. Thus, we can compute super-resolved coherent features corresponding to an input LR image according to the trained RBF model efficiently and accurately. And, face identity can be obtained by feeding these super-resolved features to a simple NN classifier. Extensive experiments on the Facial Recognition Technology, University of Manchester Institute of Science and Technology, and Olivetti Research Laboratory databases show that the proposed method outperforms the state-of-the-art face recognition algorithms for single LR image in terms of both recognition rate and robustness to facial variations of pose and expression.
Deep feature extraction and combination for synthetic aperture radar target classification
NASA Astrophysics Data System (ADS)
Amrani, Moussa; Jiang, Feng
2017-10-01
Feature extraction has always been a difficult problem in the classification performance of synthetic aperture radar automatic target recognition (SAR-ATR). It is very important to select discriminative features to train a classifier, which is a prerequisite. Inspired by the great success of convolutional neural network (CNN), we address the problem of SAR target classification by proposing a feature extraction method, which takes advantage of exploiting the extracted deep features from CNNs on SAR images to introduce more powerful discriminative features and robust representation ability for them. First, the pretrained VGG-S net is fine-tuned on moving and stationary target acquisition and recognition (MSTAR) public release database. Second, after a simple preprocessing is performed, the fine-tuned network is used as a fixed feature extractor to extract deep features from the processed SAR images. Third, the extracted deep features are fused by using a traditional concatenation and a discriminant correlation analysis algorithm. Finally, for target classification, K-nearest neighbors algorithm based on LogDet divergence-based metric learning triplet constraints is adopted as a baseline classifier. Experiments on MSTAR are conducted, and the classification accuracy results demonstrate that the proposed method outperforms the state-of-the-art methods.
Fergus, Paul; Hignett, David; Hussain, Abir; Al-Jumeily, Dhiya; Abdel-Aziz, Khaled
2015-01-01
The epilepsies are a heterogeneous group of neurological disorders and syndromes characterised by recurrent, involuntary, paroxysmal seizure activity, which is often associated with a clinicoelectrical correlate on the electroencephalogram. The diagnosis of epilepsy is usually made by a neurologist but can be difficult to be made in the early stages. Supporting paraclinical evidence obtained from magnetic resonance imaging and electroencephalography may enable clinicians to make a diagnosis of epilepsy and investigate treatment earlier. However, electroencephalogram capture and interpretation are time consuming and can be expensive due to the need for trained specialists to perform the interpretation. Automated detection of correlates of seizure activity may be a solution. In this paper, we present a supervised machine learning approach that classifies seizure and nonseizure records using an open dataset containing 342 records. Our results show an improvement on existing studies by as much as 10% in most cases with a sensitivity of 93%, specificity of 94%, and area under the curve of 98% with a 6% global error using a k-class nearest neighbour classifier. We propose that such an approach could have clinical applications in the investigation of patients with suspected seizure disorders.
A Step Towards EEG-based Brain Computer Interface for Autism Intervention*
Fan, Jing; Wade, Joshua W.; Bian, Dayi; Key, Alexandra P.; Warren, Zachary E.; Mion, Lorraine C.; Sarkar, Nilanjan
2017-01-01
Autism Spectrum Disorder (ASD) is a prevalent and costly neurodevelopmental disorder. Individuals with ASD often have deficits in social communication skills as well as adaptive behavior skills related to daily activities. We have recently designed a novel virtual reality (VR) based driving simulator for driving skill training for individuals with ASD. In this paper, we explored the feasibility of detecting engagement level, emotional states, and mental workload during VR-based driving using EEG as a first step towards a potential EEG-based Brain Computer Interface (BCI) for assisting autism intervention. We used spectral features of EEG signals from a 14-channel EEG neuroheadset, together with therapist ratings of behavioral engagement, enjoyment, frustration, boredom, and difficulty to train a group of classification models. Seven classification methods were applied and compared including Bayes network, naïve Bayes, Support Vector Machine (SVM), multilayer perceptron, K-nearest neighbors (KNN), random forest, and J48. The classification results were promising, with over 80% accuracy in classifying engagement and mental workload, and over 75% accuracy in classifying emotional states. Such results may lead to an adaptive closed-loop VR-based skill training system for use in autism intervention. PMID:26737113
Classification of Malaysia aromatic rice using multivariate statistical analysis
NASA Astrophysics Data System (ADS)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC-MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy
Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing
2016-01-01
Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have potential therapies for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthew’s Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. PMID:27662651
NASA Astrophysics Data System (ADS)
Hanson-Hedgecock, S.; Bursik, M.; Rogova, G.
2008-12-01
We are developing an intelligent system to correlate tephra layers by using the lithologic and geochemical characteristics of field samples, to aid geologists in interpreting eruption patterns in volcanic fields. Understanding the eruption history of a volcanic field from stratigraphic studies is important for forecasting future eruptive behavior and hazards. The intelligent system is used to define groups of tephra source vents and to correlate tephra layers based on a combination of geochemical data and lithostratigraphic characteristics. The tephra beds of the Mono-Inyo Craters, California, are used to test the ability of the intelligent system for tephra layer correlation. The data processing is performed by a suite of both unsupervised and supervised classifiers, built and combined within the framework of the Dempster-Shafer theory of evidence. We have developed algorithms to calculate isopleth maps of thickness, lithic and pumice size that are used in the processing of the lithostratigraphic data. This spatial information is important in the determination of eruption patterns and is used by an evidential nearest neighbor classifier to correlate tephra layers. Integrating a better isopleth approximation function and expert knowledge about stratigraphic order of the tephra layers into the classifier improves the lithostratigraphic correlation from 56% to 87% of layers correctly identified. Geochemical data for defining groups of tephra sources are processed by a suit of fuzzy k-means classifiers. Improved clustering results of geochemical data are achieved by the fusion of individual clustering results with an evidential combination method. The intelligent system aids correlation by showing matches and disparities between data patterns from different outcrops that may have been overlooked. The intelligent system produces a useful recognition result, while dealing with the uncertainty from sparse data and the imprecise description of layer characteristics.
Hoff, Michael H.
2004-01-01
The lake herring (Coregonus artedi) was one of the most commercially and ecologically valuable Lake Superior fishes, but declined in the second half of the 20th century as the result of overharvest of putatively discrete stocks. No tools were previously available that described lake herring stock structure and accurately classified lake herring to their spawning stocks. The accuracy of discriminating among spawning aggregations was evaluated using whole-body morphometrics based on a truss network. Lake herring were collected from 11 spawning aggregations in Lake Superior and two inland Wisconsin lakes to evaluate morphometrics as a stock discrimination tool. Discriminant function analysis correctly classified 53% of all fish from all spawning aggregations, and fish from all but one aggregation were classified at greater rates than were possible by chance. Discriminant analysis also correctly classified 66% of fish to nearest neighbor groups, which were groups that accounted for the possibility of mixing among the aggregations. Stepwise discriminant analysis showed that posterior body length and depth measurements were among the best discriminators of spawning aggregations. These findings support other evidence that discrete stocks of lake herring exist in Lake Superior, and fishery managers should consider all but one of the spawning aggregations as discrete stocks. Abundance, annual harvest, total annual mortality rate, and exploitation data should be collected from each stock, and surplus production of each stock should be estimated. Prudent management of stock surplus production and exploitation rates will aid in restoration of stocks and will prevent a repeat of the stock collapses that occurred in the middle of the 20th century, when the species was nearly extirpated from the lake.
NASA Astrophysics Data System (ADS)
van Rikxoort, E. M.; de Jong, P. A.; Mets, O. M.; van Ginneken, B.
2012-03-01
Chronic Obstructive Pulmonary Disease (COPD) is a chronic lung disease that is characterized by airflow limitation. COPD is clinically diagnosed and monitored using pulmonary function testing (PFT), which measures global inspiration and expiration capabilities of patients and is time-consuming and labor-intensive. It is becoming standard practice to obtain paired inspiration-expiration CT scans of COPD patients. Predicting the PFT results from the CT scans would alleviate the need for PFT testing. It is hypothesized that the change of the trachea during breathing might be an indicator of tracheomalacia in COPD patients and correlate with COPD severity. In this paper, we propose to automatically measure morphological changes in the trachea from paired inspiration and expiration CT scans and investigate the influence on COPD GOLD stage classification. The trachea is automatically segmented and the trachea shape is encoded using the lengths of rays cast from the center of gravity of the trachea. These features are used in a classifier, combined with emphysema scoring, to attempt to classify subjects into their COPD stage. A database of 187 subjects, well distributed over the COPD GOLD stages 0 through 4 was used for this study. The data was randomly divided into training and test set. Using the training scans, a nearest mean classifier was trained to classify the subjects into their correct GOLD stage using either emphysema score, tracheal shape features, or a combination. Combining the proposed trachea shape features with emphysema score, the classification performance into GOLD stages improved with 11% to 51%. In addition, an 80% accuracy was achieved in distinguishing healthy subjects from COPD patients.
Classification of interstitial lung disease patterns with topological texture features
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel
2010-03-01
Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honey-combing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
Hafiz, Pegah; Nematollahi, Mohtaram; Boostani, Reza; Namavar Jahromi, Bahia
2017-10-01
In vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI) are two important subsets of the assisted reproductive techniques, used for the treatment of infertility. Predicting implantation outcome of IVF/ICSI or the chance of pregnancy is essential for infertile couples, since these treatments are complex and expensive with a low probability of conception. In this cross-sectional study, the data of 486 patients were collected using census method. The IVF/ICSI dataset contains 29 variables along with an identifier for each patient that is either negative or positive. Mean accuracy and mean area under the receiver operating characteristic (ROC) curve are calculated for the classifiers. Sensitivity, specificity, positive and negative predictive values, and likelihood ratios of classifiers are employed as indicators of performance. The state-of-art classifiers which are candidates for this study include support vector machines, recursive partitioning (RPART), random forest (RF), adaptive boosting, and one-nearest neighbor. RF and RPART outperform the other comparable methods. The results revealed the areas under the ROC curve (AUC) as 84.23 and 82.05%, respectively. The importance of IVF/ICSI features was extracted from the output of RPART. Our findings demonstrate that the probability of pregnancy is low for women aged above 38. Classifiers RF and RPART are better at predicting IVF/ICSI cases compared to other decision makers that were tested in our study. Elicited decision rules of RPART determine useful predictive features of IVF/ICSI. Out of 20 factors, the age of woman, number of developed embryos, and serum estradiol level on the day of human chorionic gonadotropin administration are the three best features for such prediction. Copyright© by Royan Institute. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy trainingmore » time, and prone to fatigue as the number of sample increased and inconsistent. The GC–MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.« less
Introducing two Random Forest based methods for cloud detection in remote sensing images
NASA Astrophysics Data System (ADS)
Ghasemian, Nafiseh; Akhoondzadeh, Mehdi
2018-07-01
Cloud detection is a necessary phase in satellite images processing to retrieve the atmospheric and lithospheric parameters. Currently, some cloud detection methods based on Random Forest (RF) model have been proposed but they do not consider both spectral and textural characteristics of the image. Furthermore, they have not been tested in the presence of snow/ice. In this paper, we introduce two RF based algorithms, Feature Level Fusion Random Forest (FLFRF) and Decision Level Fusion Random Forest (DLFRF) to incorporate visible, infrared (IR) and thermal spectral and textural features (FLFRF) including Gray Level Co-occurrence Matrix (GLCM) and Robust Extended Local Binary Pattern (RELBP_CI) or visible, IR and thermal classifiers (DLFRF) for highly accurate cloud detection on remote sensing images. FLFRF first fuses visible, IR and thermal features. Thereafter, it uses the RF model to classify pixels to cloud, snow/ice and background or thick cloud, thin cloud and background. DLFRF considers visible, IR and thermal features (both spectral and textural) separately and inserts each set of features to RF model. Then, it holds vote matrix of each run of the model. Finally, it fuses the classifiers using the majority vote method. To demonstrate the effectiveness of the proposed algorithms, 10 Terra MODIS and 15 Landsat 8 OLI/TIRS images with different spatial resolutions are used in this paper. Quantitative analyses are based on manually selected ground truth data. Results show that after adding RELBP_CI to input feature set cloud detection accuracy improves. Also, the average cloud kappa values of FLFRF and DLFRF on MODIS images (1 and 0.99) are higher than other machine learning methods, Linear Discriminate Analysis (LDA), Classification And Regression Tree (CART), K Nearest Neighbor (KNN) and Support Vector Machine (SVM) (0.96). The average snow/ice kappa values of FLFRF and DLFRF on MODIS images (1 and 0.85) are higher than other traditional methods. The quantitative values on Landsat 8 images show similar trend. Consequently, while SVM and K-nearest neighbor show overestimation in predicting cloud and snow/ice pixels, our Random Forest (RF) based models can achieve higher cloud, snow/ice kappa values on MODIS and thin cloud, thick cloud and snow/ice kappa values on Landsat 8 images. Our algorithms predict both thin and thick cloud on Landsat 8 images while the existing cloud detection algorithm, Fmask cannot discriminate them. Compared to the state-of-the-art methods, our algorithms have acquired higher average cloud and snow/ice kappa values for different spatial resolutions.
JASMINE project Instrument design and centroiding experiment
NASA Astrophysics Data System (ADS)
Yano, Taihei; Gouda, Naoteru; Kobayashi, Yukiyasu; Yamada, Yoshiyuki
JASMINE will study the fundamental structure and evolution of the Milky Way Galaxy. To accomplish these objectives, JASMINE will measure trigonometric parallaxes, positions and proper motions of about 10 million stars with a precision of 10 μarcsec at z = 14 mag. In this paper the instrument design (optics, detectors, etc.) of JASMINE is presented. We also show a CCD centroiding experiment for estimating positions of star images. The experimental result shows that the accuracy of estimated distances has a variance of less than 0.01 pixel.
Self-aligning biaxial load frame
Ward, M.B.; Epstein, J.S.; Lloyd, W.R.
1994-01-18
An self-aligning biaxial loading apparatus for use in testing the strength of specimens while maintaining a constant specimen centroid during the loading operation. The self-aligning biaxial loading apparatus consists of a load frame and two load assemblies for imparting two independent perpendicular forces upon a test specimen. The constant test specimen centroid is maintained by providing elements for linear motion of the load frame relative to a fixed cross head, and by alignment and linear motion elements of one load assembly relative to the load frame. 3 figures.
Self-aligning biaxial load frame
Ward, Michael B.; Epstein, Jonathan S.; Lloyd, W. Randolph
1994-01-01
An self-aligning biaxial loading apparatus for use in testing the strength of specimens while maintaining a constant specimen centroid during the loading operation. The self-aligning biaxial loading apparatus consists of a load frame and two load assemblies for imparting two independent perpendicular forces upon a test specimen. The constant test specimen centroid is maintained by providing elements for linear motion of the load frame relative to a fixed crosshead, and by alignment and linear motion elements of one load assembly relative to the load frame.
Centroid — moment tensor solutions for July-September 2000
NASA Astrophysics Data System (ADS)
Dziewonski, A. M.; Ekström, G.; Maternovskaya, N. N.
2001-06-01
Centroid-moment tensor (CMT) solutions are presented for 308 earthquakes that occurred during the third quarter of 2000. The solutions are obtained using corrections for aspherical earth structure represented by a whole mantle shear velocity model SH8/U4L8 of Dziewonski and Woodward [Acoustical Imaging, Vol. 19, Plenum Press, New York, 1992, p. 785]. A model of anelastic attenuation of Durek and Ekström [Bull. Seism. Soc. Am. 86 (1996) 144] is used to predict the decay of the wave forms.
Picric acid-2,4,6-trichloro-aniline (1/1).
Wang, Wan-Qiang
2011-04-01
In the title adduct, C(6)H(4)Cl(3)N·C(6)H(3)N(3)O(7), the two benzene rings are almost coplanar, with a dihedral angle of 1.19 (1)° and an inter-ring centroid-centroid separation of 4.816 (2) Å. The crystal structure is stabilized by inter-molecular N-H⋯O(nitro) hydrogen bonds, giving a chain structure. In addition, there are phenol-nitro O-H⋯O inter-actions.
4-[(1E)-3-(2,6-Dichloro-3-fluoro-phen-yl)-3-oxoprop-1-en-1-yl]benzonitrile.
Praveen, Aletti S; Yathirajan, Hemmige S; Narayana, Badiadka; Gerber, Thomas; Hosten, Eric; Betz, Richard
2012-05-01
In the title mol-ecule, C(16)H(8)Cl(2)FNO, the benzene rings form a dihedral angle of 78.69 (8)°. The F atom is disordered over two positions in a 0.530 (3):0.470 (3) ratio. The crystal packing exhibits π-π inter-actions between dichloro-substituted rings [centroid-centroid distance = 3.6671 (10) Å] and weak inter-molecular C-H⋯F contacts.
Micro-XANES Determination Fe Speciation in Natural Basalts at Mantle-Relevant fO2
NASA Astrophysics Data System (ADS)
Fischer, R.; Cottrell, E.; Lanzirotti, A.; Kelley, K. A.
2007-12-01
We demonstrate that the oxidation state of iron (Fe3+/ΣFe) can be determined with a precision of ±0.02 (10% relative) on natural basalt glasses at mantle-relevant fO2 using Fe K-edge X-ray absorption near edge structure (XANES) spectroscopy. This is equivalent to ±0.25 log unit resolution relative to the QFM buffer. Precise determination of the oxidation state over this narrow range (Fe3+/ΣFe=0.06-0.30) and at low fO2 (down to QFM-2) relies on appropriate standards, high spectral resolution, and highly reproducible methods for extracting the pre-edge centroid position. We equilibrated natural tholeiite powder in a CO/CO2 gas mixing furnace at 1350°C from QFM-3 to QFM+2 to create six glasses of known Fe3+/ΣFe, independently determined by Mössbauer spectroscopy. XANES spectra were collected at station X26A at NSLS, Brookhaven Natl. Lab, in fluorescence mode (9 element Ge array detector) using both Si(111) and Si(311) monochromators. Generally, the energy position of the 1s→3d (pre-edge) transition centroid is the most sensitive monitor of Fe oxidation state using XANES. For the mixture of Fe oxidation states in these glasses and the resulting coordination geometries, the pre-edge spectra are best defined by two multiple 3d crystal field transitions. The Si(311) monochromator, with higher energy resolution, substantially improved spectral resolution for the 1s→3d transition. Dwell times of 5s at 0.1eV intervals across the pre-edge region yielded spectra with the 1s→3d transition peaks clearly resolved. The pre-edge centroid position is highly sensitive to the background subtraction and peak fitting procedures. Differences in fitting models result in small but significant differences in the calculated peak area of each pre-edge multiplet, and the relative contribution of each peak to the calculated centroid. We assessed several schemes and obtained robust centroid positions by simultaneously fitting the background with a damped harmonic oscillator (DHO) function and pre-edge features with two Gaussians over a sub-sample of the pre-edge region (7110-7120 eV). We found that the relation between Fe3+/ΣFe and the centroid energy is non-linear over this fO2 range, which is expected if the coordination environment changes with oxidation state. ΔQFM is linearly related (R2=0.99) to the centroid position. This new calibration allows the oxidation states of natural mantle melts to be discriminated with high spatial resolution (9μm). We apply the new calibration to determination of Fe3+/ΣFe in natural basaltic glasses and olivine-hosted glass inclusions (Cottrell et al. & Kelley et al., this meeting).
NASA Astrophysics Data System (ADS)
Tosun, Akif Burak; Yergiyev, Oleksandr; Kolouri, Soheil; Silverman, Jan F.; Rohde, Gustavo K.
2014-03-01
diagnostic standard is a pleural biopsy with subsequent histologic examination of the tissue demonstrating invasion by the tumor. The diagnostic tissue is obtained through thoracoscopy or open thoracotomy, both being highly invasive procedures. Thoracocenthesis, or removal of effusion fluid from the pleural space, is a far less invasive procedure that can provide material for cytological examination. However, it is insufficient to definitively confirm or exclude the diagnosis of malignant mesothelioma, since tissue invasion cannot be determined. In this study, we present a computerized method to detect and classify malignant mesothelioma based on the nuclear chromatin distribution from digital images of mesothelial cells in effusion cytology specimens. Our method aims at determining whether a set of nuclei belonging to a patient, obtained from effusion fluid images using image segmentation, is benign or malignant, and has a potential to eliminate the need for tissue biopsy. This method is performed by quantifying chromatin morphology of cells using the optimal transportation (Kantorovich-Wasserstein) metric in combination with the modified Fisher discriminant analysis, a k-nearest neighborhood classification, and a simple voting strategy. Our results show that we can classify the data of 10 different human cases with 100% accuracy after blind cross validation. We conclude that nuclear structure alone contains enough information to classify the malignant mesothelioma. We also conclude that the distribution of chromatin seems to be a discriminating feature between nuclei of benign and malignant mesothelioma cells.
Physical Human Activity Recognition Using Wearable Sensors.
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-12-11
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.
Local binary pattern texture-based classification of solid masses in ultrasound breast images
NASA Astrophysics Data System (ADS)
Matsumoto, Monica M. S.; Sehgal, Chandra M.; Udupa, Jayaram K.
2012-03-01
Breast cancer is one of the leading causes of cancer mortality among women. Ultrasound examination can be used to assess breast masses, complementarily to mammography. Ultrasound images reveal tissue information in its echoic patterns. Therefore, pattern recognition techniques can facilitate classification of lesions and thereby reduce the number of unnecessary biopsies. Our hypothesis was that image texture features on the boundary of a lesion and its vicinity can be used to classify masses. We have used intensity-independent and rotation-invariant texture features, known as Local Binary Patterns (LBP). The classifier selected was K-nearest neighbors. Our breast ultrasound image database consisted of 100 patient images (50 benign and 50 malignant cases). The determination of whether the mass was benign or malignant was done through biopsy and pathology assessment. The training set consisted of sixty images, randomly chosen from the database of 100 patients. The testing set consisted of forty images to be classified. The results with a multi-fold cross validation of 100 iterations produced a robust evaluation. The highest performance was observed for feature LBP with 24 symmetrically distributed neighbors over a circle of radius 3 (LBP24,3) with an accuracy rate of 81.0%. We also investigated an approach with a score of malignancy assigned to the images in the test set. This approach provided an ROC curve with Az of 0.803. The analysis of texture features over the boundary of solid masses showed promise for malignancy classification in ultrasound breast images.
Khondoker, Mizanur R; Bachmann, Till T; Mewissen, Muriel; Dickinson, Paul; Dobrzelecki, Bartosz; Campbell, Colin J; Mount, Andrew R; Walton, Anthony J; Crain, Jason; Schulze, Holger; Giraud, Gerard; Ross, Alan J; Ciani, Ilenia; Ember, Stuart W J; Tlili, Chaker; Terry, Jonathan G; Grant, Eilidh; McDonnell, Nicola; Ghazal, Peter
2010-12-01
Machine learning and statistical model based classifiers have increasingly been used with more complex and high dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive, under investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray based data characteristics on the predictive performance for various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. Optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble that of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network (http://www.cran.r-project.org/web/packages/optBiomarker/).
Classification of postural profiles among mouth-breathing children by learning vector quantization.
Mancini, F; Sousa, F S; Hummel, A D; Falcão, A E J; Yi, L C; Ortolani, C F; Sigulem, D; Pisa, I T
2011-01-01
Mouth breathing is a chronic syndrome that may bring about postural changes. Finding characteristic patterns of changes occurring in the complex musculoskeletal system of mouth-breathing children has been a challenge. Learning vector quantization (LVQ) is an artificial neural network model that can be applied for this purpose. The aim of the present study was to apply LVQ to determine the characteristic postural profiles shown by mouth-breathing children, in order to further understand abnormal posture among mouth breathers. Postural training data on 52 children (30 mouth breathers and 22 nose breathers) and postural validation data on 32 children (22 mouth breathers and 10 nose breathers) were used. The performance of LVQ and other classification models was compared in relation to self-organizing maps, back-propagation applied to multilayer perceptrons, Bayesian networks, naive Bayes, J48 decision trees, k, and k-nearest-neighbor classifiers. Classifier accuracy was assessed by means of leave-one-out cross-validation, area under ROC curve (AUC), and inter-rater agreement (Kappa statistics). By using the LVQ model, five postural profiles for mouth-breathing children could be determined. LVQ showed satisfactory results for mouth-breathing and nose-breathing classification: sensitivity and specificity rates of 0.90 and 0.95, respectively, when using the training dataset, and 0.95 and 0.90, respectively, when using the validation dataset. The five postural profiles for mouth-breathing children suggested by LVQ were incorporated into application software for classifying the severity of mouth breathers' abnormal posture.
Physical Human Activity Recognition Using Wearable Sensors
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-01-01
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450
Lim, Dong Kyu; Long, Nguyen Phuoc; Mo, Changyeun; Dong, Ziyuan; Cui, Lingmei; Kim, Giyoung; Kwon, Sung Won
2017-10-01
The mixing of extraneous ingredients with original products is a common adulteration practice in food and herbal medicines. In particular, authenticity of white rice and its corresponding blended products has become a key issue in food industry. Accordingly, our current study aimed to develop and evaluate a novel discrimination method by combining targeted lipidomics with powerful supervised learning methods, and eventually introduce a platform to verify the authenticity of white rice. A total of 30 cultivars were collected, and 330 representative samples of white rice from Korea and China as well as seven mixing ratios were examined. Random forests (RF), support vector machines (SVM) with a radial basis function kernel, C5.0, model averaged neural network, and k-nearest neighbor classifiers were used for the classification. We achieved desired results, and the classifiers effectively differentiated white rice from Korea to blended samples with high prediction accuracy for the contamination ratio as low as five percent. In addition, RF and SVM classifiers were generally superior to and more robust than the other techniques. Our approach demonstrated that the relative differences in lysoGPLs can be successfully utilized to detect the adulterated mixing of white rice originating from different countries. In conclusion, the present study introduces a novel and high-throughput platform that can be applied to authenticate adulterated admixtures from original white rice samples. Copyright © 2017 Elsevier Ltd. All rights reserved.
Hartman Testing of X-Ray Telescopes
NASA Technical Reports Server (NTRS)
Saha, Timo T.; Biskasch, Michael; Zhang, William W.
2013-01-01
Hartmann testing of x-ray telescopes is a simple test method to retrieve and analyze alignment errors and low-order circumferential errors of x-ray telescopes and their components. A narrow slit is scanned along the circumference of the telescope in front of the mirror and the centroids of the images are calculated. From the centroid data, alignment errors, radius variation errors, and cone-angle variation errors can be calculated. Mean cone angle, mean radial height (average radius), and the focal length of the telescope can also be estimated if the centroid data is measured at multiple focal plane locations. In this paper we present the basic equations that are used in the analysis process. These equations can be applied to full circumference or segmented x-ray telescopes. We use the Optical Surface Analysis Code (OSAC) to model a segmented x-ray telescope and show that the derived equations and accompanying analysis retrieves the alignment errors and low order circumferential errors accurately.
Centroid-moment tensor inversions using high-rate GPS waveforms
NASA Astrophysics Data System (ADS)
O'Toole, Thomas B.; Valentine, Andrew P.; Woodhouse, John H.
2012-10-01
Displacement time-series recorded by Global Positioning System (GPS) receivers are a new type of near-field waveform observation of the seismic source. We have developed an inversion method which enables the recovery of an earthquake's mechanism and centroid coordinates from such data. Our approach is identical to that of the 'classical' Centroid-Moment Tensor (CMT) algorithm, except that we forward model the seismic wavefield using a method that is amenable to the efficient computation of synthetic GPS seismograms and their partial derivatives. We demonstrate the validity of our approach by calculating CMT solutions using 1 Hz GPS data for two recent earthquakes in Japan. These results are in good agreement with independently determined source models of these events. With wider availability of data, we envisage the CMT algorithm providing a tool for the systematic inversion of GPS waveforms, as is already the case for teleseismic data. Furthermore, this general inversion method could equally be applied to other near-field earthquake observations such as those made using accelerometers.
NASA Technical Reports Server (NTRS)
Chang, C. Y.; Curlander, J. C.
1992-01-01
Estimation of the Doppler centroid ambiguity is a necessary element of the signal processing for SAR systems with large antenna pointing errors. Without proper resolution of the Doppler centroid estimation (DCE) ambiguity, the image quality will be degraded in the system impulse response function and the geometric fidelity. Two techniques for resolution of DCE ambiguity for the spaceborne SAR are presented; they include a brief review of the range cross-correlation technique and presentation of a new technique using multiple pulse repetition frequencies (PRFs). For SAR systems, where other performance factors control selection of the PRF's, an algorithm is devised to resolve the ambiguity that uses PRF's of arbitrary numerical values. The performance of this multiple PRF technique is analyzed based on a statistical error model. An example is presented that demonstrates for the Shuttle Imaging Radar-C (SIR-C) C-band SAR, the probability of correct ambiguity resolution is higher than 95 percent for antenna attitude errors as large as 3 deg.
Yoshiara, Luciane Yuri; Madeira, Tiago Bervelieri; Delaroza, Fernanda; da Silva, Josemeyre Bonifácio; Ida, Elza Iouko
2012-12-01
The objective of this study was to optimize the extraction of different isoflavone forms (glycosidic, malonyl-glycosidic, aglycone and total) from defatted cotyledon soy flour using the simplex-centroid experimental design with four solvents of varying polarity (water, acetone, ethanol and acetonitrile). The obtained extracts were then analysed by high-performance liquid chromatography. The profile of the different soy isoflavones forms varied with different extractions solvents. Varying the solvent or mixture used, the extraction of different isoflavones was optimized using the centroid-simplex mixture design. The special cubic model best fitted to the four solvents and its combination for soy isoflavones extraction. For glycosidic isoflavones extraction, the polar ternary mixture (water, acetone and acetonitrile) achieved the best extraction; malonyl-glycosidic forms were better extracted with mixtures of water, acetone and ethanol. Aglycone isoflavones, water and acetone mixture were best extracted and total isoflavones, the best solvents were ternary mixture of water, acetone and ethanol.
Quantum Correlation in the XY Spin Model with Anisotropic Three-Site Interaction
NASA Astrophysics Data System (ADS)
Wang, Yao; Chai, Bing-Bing; Guo, Jin-Liang
2018-05-01
We investigate pairwise entanglement and quantum discord (QD) in the XY spin model with anisotropic three-site interaction at zero and finite temperatures. For both the nearest-neighbor spins and the next nearest-neighbor spins, special attention is paid to the dependence of entanglement and QD on the anisotropic parameter δ induced by the next nearest-neighbor spins. We show that the behavior of QD differs in many ways from entanglement under the influences of the anisotropic three-site interaction at finite temperatures. More important, comparing the effects of δ on the entanglement and QD, we find the anisotropic three-site interaction plays an important role in the quantum correlations at zero and finite temperatures. It is found that δ can strengthen the quantum correlation for both the nearest-neighbor spins and the next nearest-neighbor spins, especially for the nearest-neighbor spins at low temperature.
Performing a scatterv operation on a hierarchical tree network optimized for collective operations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D
Performing a scatterv operation on a hierarchical tree network optimized for collective operations including receiving, by the scatterv module installed on the node, from a nearest neighbor parent above the node a chunk of data having at least a portion of data for the node; maintaining, by the scatterv module installed on the node, the portion of the data for the node; determining, by the scatterv module installed on the node, whether any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child; andmore » sending, by the scatterv module installed on the node, those portions of data to the nearest neighbor child if any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child.« less
Novel method of detecting movement of the interference fringes using one-dimensional PSD.
Wang, Qi; Xia, Ji; Liu, Xu; Zhao, Yong
2015-06-02
In this paper, a method of using a one-dimensional position-sensitive detector (PSD) by replacing charge-coupled device (CCD) to measure the movement of the interference fringes is presented first, and its feasibility is demonstrated through an experimental setup based on the principle of centroid detection. Firstly, the centroid position of the interference fringes in a fiber Mach-Zehnder (M-Z) interferometer is solved in theory, showing it has a higher resolution and sensitivity. According to the physical characteristics and principles of PSD, a simulation of the interference fringe's phase difference in fiber M-Z interferometers and PSD output is carried out. Comparing the simulation results with the relationship between phase differences and centroid positions in fiber M-Z interferometers, the conclusion that the output of interference fringes by PSD is still the centroid position is obtained. Based on massive measurements, the best resolution of the system is achieved with 5.15, 625 μm. Finally, the detection system is evaluated through setup error analysis and an ultra-narrow-band filter structure. The filter structure is configured with a one-dimensional photonic crystal containing positive and negative refraction material, which can eliminate background light in the PSD detection experiment. This detection system has a simple structure, good stability, high precision and easily performs remote measurements, which makes it potentially useful in material small deformation tests, refractivity measurements of optical media and optical wave front detection.
Human attention filters for single colors.
Sun, Peng; Chubb, Charles; Wright, Charles E; Sperling, George
2016-10-25
The visual images in the eyes contain much more information than the brain can process. An important selection mechanism is feature-based attention (FBA). FBA is best described by attention filters that specify precisely the extent to which items containing attended features are selectively processed and the extent to which items that do not contain the attended features are attenuated. The centroid-judgment paradigm enables quick, precise measurements of such human perceptual attention filters, analogous to transmission measurements of photographic color filters. Subjects use a mouse to locate the centroid-the center of gravity-of a briefly displayed cloud of dots and receive precise feedback. A subset of dots is distinguished by some characteristic, such as a different color, and subjects judge the centroid of only the distinguished subset (e.g., dots of a particular color). The analysis efficiently determines the precise weight in the judged centroid of dots of every color in the display (i.e., the attention filter for the particular attended color in that context). We report 32 attention filters for single colors. Attention filters that discriminate one saturated hue from among seven other equiluminant distractor hues are extraordinarily selective, achieving attended/unattended weight ratios >20:1. Attention filters for selecting a color that differs in saturation or lightness from distractors are much less selective than attention filters for hue (given equal discriminability of the colors), and their filter selectivities are proportional to the discriminability distance of neighboring colors, whereas in the same range hue attention-filter selectivity is virtually independent of discriminabilty.
NASA Astrophysics Data System (ADS)
Régis, J.-M.; Jolie, J.; Mach, H.; Simpson, G. S.; Blazhev, A.; Pascovici, G.; Pfeiffer, M.; Rudigier, M.; Saed-Samii, N.; Warr, N.; Blanc, A.; de France, G.; Jentschel, M.; Köster, U.; Mutti, P.; Soldner, T.; Ur, C. A.; Urban, W.; Bruce, A. M.; Drouet, F.; Fraile, L. M.; Ilieva, S.; Korten, W.; Kröll, T.; Lalkovski, S.; Mărginean, S.; Paziy, V.; Podolyák, Zs.; Regan, P. H.; Stezowski, O.; Vancraeyenest, A.
2015-05-01
A novel method for direct electronic "fast-timing" lifetime measurements of nuclear excited states via γ-γ coincidences using an array equipped with N very fast high-resolution LaBr3(Ce) scintillator detectors is presented. The generalized centroid difference method provides two independent "start" and "stop" time spectra obtained without any correction by a superposition of the N(N - 1)/2 calibrated γ-γ time difference spectra of the N detector fast-timing system. The two fast-timing array time spectra correspond to a forward and reverse gating of a specific γ-γ cascade and the centroid difference as the time shift between the centroids of the two time spectra provides a picosecond-sensitive mirror-symmetric observable of the set-up. The energydependent mean prompt response difference between the start and stop events is calibrated and used as a single correction for lifetime determination. These combined fast-timing array mean γ-γ zero-time responses can be determined for 40 keV < Eγ < 1.4 MeV with a precision better than 10 ps using a 152Eu γ-ray source. The new method is described with examples of (n,γ) and (n,f,γ) experiments performed at the intense cold-neutron beam facility PF1B of the Institut Laue-Langevin in Grenoble, France, using 16 LaBr3(Ce) detectors within the EXILL&FATIMA campaign in 2013. The results are discussed with respect to possible systematic errors induced by background contributions.
Empirical Model of Precipitating Ion Oval
NASA Astrophysics Data System (ADS)
Goldstein, Jerry
2017-10-01
In this brief technical report published maps of ion integral flux are used to constrain an empirical model of the precipitating ion oval. The ion oval is modeled as a Gaussian function of ionospheric latitude that depends on local time and the Kp geomagnetic index. The three parameters defining this function are the centroid latitude, width, and amplitude. The local time dependences of these three parameters are approximated by Fourier series expansions whose coefficients are constrained by the published ion maps. The Kp dependence of each coefficient is modeled by a linear fit. Optimization of the number of terms in the expansion is achieved via minimization of the global standard deviation between the model and the published ion map at each Kp. The empirical model is valid near the peak flux of the auroral oval; inside its centroid region the model reproduces the published ion maps with standard deviations of less than 5% of the peak integral flux. On the subglobal scale, average local errors (measured as a fraction of the point-to-point integral flux) are below 30% in the centroid region. Outside its centroid region the model deviates significantly from the H89 integral flux maps. The model's performance is assessed by comparing it with both local and global data from a 17 April 2002 substorm event. The model can reproduce important features of the macroscale auroral region but none of its subglobal structure, and not immediately following a substorm.
Evidence against global attention filters selective for absolute bar-orientation in human vision.
Inverso, Matthew; Sun, Peng; Chubb, Charles; Wright, Charles E; Sperling, George
2016-01-01
The finding that an item of type A pops out from an array of distractors of type B typically is taken to support the inference that human vision contains a neural mechanism that is activated by items of type A but not by items of type B. Such a mechanism might be expected to yield a neural image in which items of type A produce high activation and items of type B low (or zero) activation. Access to such a neural image might further be expected to enable accurate estimation of the centroid of an ensemble of items of type A intermixed with to-be-ignored items of type B. Here, it is shown that as the number of items in stimulus displays is increased, performance in estimating the centroids of horizontal (vertical) items amid vertical (horizontal) distractors degrades much more quickly and dramatically than does performance in estimating the centroids of white (black) items among black (white) distractors. Together with previous findings, these results suggest that, although human vision does possess bottom-up neural mechanisms sensitive to abrupt local changes in bar-orientation, and although human vision does possess and utilize top-down global attention filters capable of selecting multiple items of one brightness or of one color from among others, it cannot use a top-down global attention filter capable of selecting multiple bars of a given absolute orientation and filtering bars of the opposite orientation in a centroid task.
NASA Astrophysics Data System (ADS)
Basoglu, Burak; Halicioglu, Kerem; Albayrak, Muge; Ulug, Rasit; Tevfik Ozludemir, M.; Deniz, Rasim
2017-04-01
In the last decade, the importance of high-precise geoid determination at local or national level has been pointed out by Turkish National Geodesy Commission. The Commission has also put objective of modernization of national height system of Turkey to the agenda. Meanwhile several projects have been realized in recent years. In Istanbul city, a GNSS/Levelling geoid was defined in 2005 for the metropolitan area of the city with an accuracy of ±3.5cm. In order to achieve a better accuracy in this area, "Local Geoid Determination with Integration of GNSS/Levelling and Astro-Geodetic Data" project has been conducted in Istanbul Technical University and Bogazici University KOERI since January 2016. The project is funded by The Scientific and Technological Research Council of Turkey. With the scope of the project, modernization studies of Digital Zenith Camera System are being carried on in terms of hardware components and software development. Accentuated subjects are the star catalogues, and centroiding algorithm used to identify the stars on the zenithal star field. During the test observations of Digital Zenith Camera System performed between 2013-2016, final results were calculated using the PSF method for star centroiding, and the second USNO CCD Astrograph Catalogue (UCAC2) for the reference star positions. This study aims to investigate the position accuracy of the star images by comparing different centroiding algorithms and available star catalogs used in astro-geodetic observations conducted with the digital zenith camera system.
Statistical Approaches Used to Assess the Equity of Access to Food Outlets: A Systematic Review
Lamb, Karen E.; Thornton, Lukar E.; Cerin, Ester; Ball, Kylie
2015-01-01
Background Inequalities in eating behaviours are often linked to the types of food retailers accessible in neighbourhood environments. Numerous studies have aimed to identify if access to healthy and unhealthy food retailers is socioeconomically patterned across neighbourhoods, and thus a potential risk factor for dietary inequalities. Existing reviews have examined differences between methodologies, particularly focussing on neighbourhood and food outlet access measure definitions. However, no review has informatively discussed the suitability of the statistical methodologies employed; a key issue determining the validity of study findings. Our aim was to examine the suitability of statistical approaches adopted in these analyses. Methods Searches were conducted for articles published from 2000–2014. Eligible studies included objective measures of the neighbourhood food environment and neighbourhood-level socio-economic status, with a statistical analysis of the association between food outlet access and socio-economic status. Results Fifty-four papers were included. Outlet accessibility was typically defined as the distance to the nearest outlet from the neighbourhood centroid, or as the number of food outlets within a neighbourhood (or buffer). To assess if these measures were linked to neighbourhood disadvantage, common statistical methods included ANOVA, correlation, and Poisson or negative binomial regression. Although all studies involved spatial data, few considered spatial analysis techniques or spatial autocorrelation. Conclusions With advances in GIS software, sophisticated measures of neighbourhood outlet accessibility can be considered. However, approaches to statistical analysis often appear less sophisticated. Care should be taken to consider assumptions underlying the analysis and the possibility of spatially correlated residuals which could affect the results. PMID:29546115
PARSEC's Astrometry - The Risky Approach
NASA Astrophysics Data System (ADS)
Andrei, A. H.
2015-10-01
Parallaxes - and hence the fundamental establishment of stellar distances - rank among the oldest, most direct, and hardest of astronomical determinations. Arguably amongst the most essential too. The direct approach to obtain trigonometric parallaxes, using a constrained set of equations to derive positions, proper motions, and parallaxes, has been labelled as risky. Properly so, because the axis of the parallactic apparent ellipse is smaller than one arcsec even for the nearest stars, and just a fraction of its perimeter can be followed. Thus the classical approach is of linearizing the description by locking the solution to a set of precise positions of the Earth at the instants of observation, rather than to the dynamics of its orbit, and of adopting a close examination of the few observations available. In the PARSEC program the parallaxes of 143 brown dwarfs were planned. Five years of observation of the fields were taken with the WFI camera at the ESO 2.2m telescope in Chile. The goal is to provide a statistically significant number of trigonometric parallaxes for BD sub-classes from L0 to T7. Taking advantage of the large, regularly spaced, quantity of observations, here we take the risky approach to fit an ellipse to the observed ecliptic coordinates and derive the parallaxes. We also combine the solutions from different centroiding methods, widely proven in prior astrometric investigations. As each of those methods assess diverse properties of the PSFs, they are taken as independent measurements, and combined into a weighted least-squares general solution. The results obtained compare well with the literature and with the classical approach.