Electromechanical properties of engineered lead free potassium sodium niobate based materials =
NASA Astrophysics Data System (ADS)
Rafiq, Muhammad Asif
K0.5Na0.5NbO3 (KNN), is the most promising lead free material for substituting lead zirconate titanate (PZT) which is still the market leader used for sensors and actuators. To make KNN a real competitor, it is necessary to understand and to improve its properties. This goal is pursued in the present work via different approaches aiming to study KNN intrinsic properties and then to identify appropriate strategies like doping and texturing for designing better KNN materials for an intended application. Hence, polycrystalline KNN ceramics (undoped, non-stoichiometric; NST and doped), high-quality KNN single crystals and textured KNN based ceramics were successfully synthesized and characterized in this work. Polycrystalline undoped, non-stoichiometric (NST) and Mn doped KNN ceramics were prepared by conventional ceramic processing. Structure, microstructure and electrical properties were measured. It was observed that the window for mono-phasic compositions was very narrow for both NST ceramics and Mn doped ceramics. For NST ceramics the variation of A/B ratio influenced the polarization (P-E) hysteresis loop and better piezoelectric and dielectric responses could be found for small stoichiometry deviations (A/B = 0.97). Regarding Mn doping, as compared to undoped KNN which showed leaky polarization (P-E) hysteresis loops, B-site Mn doped ceramics showed a well saturated, less-leaky hysteresis loop and a significant properties improvement. Impedance spectroscopy was used to assess the role of Mn and a relation between charge transport - defects and ferroelectric response in K0.5Na0.5NbO3 (KNN) and Mn doped KNN ceramics could be established. At room temperature the conduction in KNN which is associated with holes transport is suppressed by Mn doping. Hence Mn addition increases the resistivity of the ceramic, which proved to be very helpful for improving the saturation of the P-E loop. At high temperatures the conduction is dominated by the motion of ionized oxygen vacancies whose concentration increases with Mn doping. Single crystals of potassium sodium niobate (KNN) were grown by a modified high temperature flux method. A boron-modified flux was used to obtain the crystals at a relatively low temperature. XRD, EDS and ICP analysis proved the chemical and crystallographic quality of the crystals. The grown KNN crystals exhibit higher dielectric permittivity (29,100) at the tetragonal-to-cubic phase transition temperature, higher remnant polarization (19.4 ?C/cm2) and piezoelectric coefficient (160 pC/N) when compared with the standard KNN ceramics. KNN single crystals domain structure was characterized for the first time by piezoforce response microscopy. It could be observed that - oriented potassium sodium niobate (KNN) single crystals reveal a long range ordered domain pattern of parallel 180° domains with zig-zag 90° domains. From the comparison of KNN Single crystals to ceramics, It is argued that the presence in KNN single crystal (and absence in KNN ceramics) of such a long range order specific domain pattern that is its fingerprint accounts for the improved properties of single crystals. These results have broad implications for the expanded use of KNN materials, by establishing a relation between the domain patterns and the dielectric and ferroelectric response of single crystals and ceramics and by indicating ways of achieving maximised properties in KNN materials. (Abstract shortened by ProQuest.).
Nearest neighbor-density-based clustering methods for large hyperspectral images
NASA Astrophysics Data System (ADS)
Cariou, Claude; Chehdi, Kacem
2017-10-01
We address the problem of hyperspectral image (HSI) pixel partitioning using nearest neighbor - density-based (NN-DB) clustering methods. NN-DB methods are able to cluster objects without specifying the number of clusters to be found. Within the NN-DB approach, we focus on deterministic methods, e.g. ModeSeek, knnClust, and GWENN (standing for Graph WatershEd using Nearest Neighbors). These methods only require the availability of a k-nearest neighbor (kNN) graph based on a given distance metric. Recently, a new DB clustering method, called Density Peak Clustering (DPC), has received much attention, and kNN versions of it have quickly followed and showed their efficiency. However, NN-DB methods still suffer from the difficulty of obtaining the kNN graph due to the quadratic complexity with respect to the number of pixels. This is why GWENN was embedded into a multiresolution (MR) scheme to bypass the computation of the full kNN graph over the image pixels. In this communication, we propose to extent the MR-GWENN scheme on three aspects. Firstly, similarly to knnClust, the original labeling rule of GWENN is modified to account for local density values, in addition to the labels of previously processed objects. Secondly, we set up a modified NN search procedure within the MR scheme, in order to stabilize of the number of clusters found from the coarsest to the finest spatial resolution. Finally, we show that these extensions can be easily adapted to the three other NN-DB methods (ModeSeek, knnClust, knnDPC) for pixel clustering in large HSIs. Experiments are conducted to compare the four NN-DB methods for pixel clustering in HSIs. We show that NN-DB methods can outperform a classical clustering method such as fuzzy c-means (FCM), in terms of classification accuracy, relevance of found clusters, and clustering speed. Finally, we demonstrate the feasibility and evaluate the performances of NN-DB methods on a very large image acquired by our AISA Eagle hyperspectral imaging sensor.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chitralekha, C. S.; Rasi, Mohammed; Nair, Swapna S., E-mail: swapna.s.nair@gmail.com
A modified sol-gel method was introduced by employing a cost effective novel template to synthesize coaxial one dimensional (1-D) composite nanostructures based on CoFe{sub 2}O{sub 4} (CFO) - K{sub 0.5}Na{sub 0.5}NbO{sub 3} (KNN) and magnetic nanostructures based on CoFe{sub 2}O{sub 4} (CFO). The studies with scanning electron microscopy (SEM) and atomic force microscopy (AFM) revealed that the composite material is characterized by the 1-D tubular structure. The absorption edge is blue shifted for both KNN and CFO nanotubes due to the lattice strain effect.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Du Hongliang; Zhou Wancheng; Luo Fa
The (1-x)(K{sub 0.5}Na{sub 0.5})NbO{sub 3}-x(Ba{sub 0.5}Sr{sub 0.5})TiO{sub 3} (KNN-BST) solid solution has been synthesized by conventional solid-state sintering in order to search for the new lead-free relaxor ferroelectrics for high temperature applications. The phase structure, dielectric properties, and relaxor behavior of the (1-x)KNN-xBST solid solution are systematically investigated. The phase structure of the (1-x)KNN-xBST solid solution gradually changes from pure perovskite phase with an orthorhombic symmetry to the tetragonal symmetry, then to the pseudocubic phase, and to the cubic phase with increasing addition of BST. The 0.90KNN-0.10BST solid solution shows a broad dielectric peak with permittivity maximum near 2500 andmore » low dielectric loss (<4%) in the temperature range of 100-250 deg. C. The result indicates that this material may have great potential for a variety of high temperature applications. The diffuse phase transition and the temperature of the maximum dielectric permittivity shifting toward higher temperature with increasing frequency, which are two typical characteristics for relaxor ferroelectrics, are observed in the (1-x)KNN-xBST solid solution. The dielectric relaxor behavior obeys a modified Curie-Weiss law and a Vogel-Fulcher relationship. The relaxor nature is attributed to the appearance of polar nanoregions owing to the formation of randon fields including local electric fields and elastic fields. These results confirm that the KNN-based relaxor ferroelectrics can be regarded as an alternative direction for the development of high temperature lead-free relaxor ferroelectrics.« less
A Comparative Evaluation of Anomaly Detection Algorithms for Maritime Video Surveillance
2011-01-01
of k-means clustering and the k- NN Localized p-value Estimator ( KNN -LPE). K-means is a popular distance-based clustering algorithm while KNN -LPE...implemented the sparse cluster identification rule we described in Section 3.1. 2. k-NN Localized p-value Estimator ( KNN -LPE): We implemented this using...Average Density ( KNN -NAD): This was implemented as described in Section 3.4. Algorithm Parameter Settings The global and local density-based anomaly
Analysis of miRNA expression profile based on SVM algorithm
NASA Astrophysics Data System (ADS)
Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian
2018-05-01
Based on mirna expression spectrum data set, a new data mining algorithm - tSVM - KNN (t statistic with support vector machine - k nearest neighbor) is proposed. the idea of the algorithm is: firstly, the feature selection of the data set is carried out by the unified measurement method; Secondly, SVM - KNN algorithm, which combines support vector machine (SVM) and k - nearest neighbor (k - nearest neighbor) is used as classifier. Simulation results show that SVM - KNN algorithm has better classification ability than SVM and KNN alone. Tsvm - KNN algorithm only needs 5 mirnas to obtain 96.08 % classification accuracy in terms of the number of mirna " tags" and recognition accuracy. compared with similar algorithms, tsvm - KNN algorithm has obvious advantages.
Eiras, José A; Gerbasi, Rosimeire B Z; Rosso, Jaciele M; Silva, Daniel M; Cótica, Luiz F; Santos, Ivair A; Souza, Camila A; Lente, Manuel H
2016-03-08
Lead free piezoelectric materials are being intensively investigated in order to substitute lead based ones, commonly used in many different applications. Among the most promising lead-free materials are those with modified NaNbO₃, such as (K, Na)NbO₃ (KNN) and (Ba, Na)(Ti, Nb)O₃ (BTNN) families. From a ceramic processing point of view, high density single phase KNN and BTNN ceramics are very difficult to sinter due to the volatility of the alkaline elements, the narrow sintering temperature range and the anomalous grain growth. In this work, Spark Plasma Sintering (SPS) and high-energy ball milling (HEBM), following heat treatments (calcining and sintering), in oxidative (O₂) atmosphere have been used to prepare single phase highly densified KNN ("pure" and Cu 2+ or Li 1+ doped), with theoretical densities ρ th > 97% and BTNN ceramics (ρ th - 90%), respectively. Using BTTN ceramics with a P 4 mm perovskite-like structure, we showed that by increasing the NaNbO₃ content, the ferroelectric properties change from having a relaxor effect to an almost "normal" ferroelectric character, while the tetragonality and grain size increase and the shear piezoelectric coefficients ( k 15 , g 15 and d 15 ) improve. For KNN ceramics, the results reveal that the values for remanent polarization as well as for most of the coercive field are quite similar among all compositions. These facts evidenced that Cu 2+ may be incorporated into the A and/or B sites of the perovskite structure, having both hardening and softening effects.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yongsiri, Ploypailin; Sirisoonthorn, Somnuk; Pengpat, Kamonpan, E-mail: kamonpan.p@cmu.ac.th
Highlights: • The KNN–SiO{sub 2} doped Er{sub 2}O{sub 3} glass-ceramics was prepared by incorporation method. • High dielectric constant (458.41 at 100 kHz) and low loss (0.0005) could be obtained. • TEM and SEM confirmed the existence of KNN crystals embedded in glass matrix. • The Er{sub 2}O{sub 3} dopant causes insignificant effect on modifying E{sub g} value. - Abstract: In this study, transparent glass-ceramics from potassium sodium niobate (KNN)-silicate glass system doped with erbium oxide (Er{sub 2}O{sub 3}) were successfully prepared by incorporation method. KNN was added in glass batches as heterogeneous nucleating agent. The KNN powder was mixedmore » with SiO{sub 2} and Er{sub 2}O{sub 3} dopant with KNN and Er{sub 2}O{sub 3} content varied between 70–80 and 0.5–1.0 mol%, respectively. Each batch was subsequently melted at 1300 °C for 15 min in a platinum crucible using an electric furnace. The quenched glasses were then subjected to heat treatment at various temperatures for 4 h. XRD results showed that the prepared glass ceramics contained crystals of KNN solid solution. In contrary, dielectric constant (ϵ{sub r}) and dielectric loss (tan δ) were found to increase with increasing heat treatment temperature. Additionally, optical properties such as absorbance and energy band gap have been investigated.« less
Distributed Computation of the knn Graph for Large High-Dimensional Point Sets
Plaku, Erion; Kavraki, Lydia E.
2009-01-01
High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over one hundred processors and indicate that similar speedup can be obtained with several hundred processors. PMID:19847318
Characterization of Hard Piezoelectric Lead-Free Ceramics
Zhang, Shujun; Lim, Jong Bong; Lee, Hyeong Jae; Shrout, Thomas R.
2010-01-01
K4CuNb8O23 doped K0.45Na0.55NbO3 (KNN-KCN) ferroelectric ceramics were found to exhibit asymmetrical polarization hysteresis loops, related to the development of an internal bias field. The internal bias field is believed to be the result of defect dipoles of acceptor ions and oxygen vacancies, which lead to piezoelectric “hardening” effect, by stabilizing and pinning of the domain wall motion. The dielectric loss for the hard lead-free piezoelectric ceramic was found to be 0.6%, with mechanical quality factors Q on the order of >1500. Furthermore, the piezoelectric properties were found to decrease and the coercive field increased, when compared with the undoped material, exhibiting a typical characteristic of “hard” behavior. The temperature usage range was limited by the polymorphic phase transition temperature, being 188°C. The full set of material constants was determined for the KNN-KCN materials. Compared with conventional hard PZT ceramics, the lead-free possessed lower dielectric and piezoelectric properties; however, comparable values of mechanical Q, dielectric loss, and coercive fields were obtained, making acceptor modified KNN based lead-free piezoelectric material promising for high-power applications, where lead-free materials are desirable. PMID:19686966
Eiras, José A.; Gerbasi, Rosimeire B. Z.; Rosso, Jaciele M.; Silva, Daniel M.; Cótica, Luiz F.; Santos, Ivair A.; Souza, Camila A.; Lente, Manuel H.
2016-01-01
Lead free piezoelectric materials are being intensively investigated in order to substitute lead based ones, commonly used in many different applications. Among the most promising lead-free materials are those with modified NaNbO3, such as (K, Na)NbO3 (KNN) and (Ba, Na)(Ti, Nb)O3 (BTNN) families. From a ceramic processing point of view, high density single phase KNN and BTNN ceramics are very difficult to sinter due to the volatility of the alkaline elements, the narrow sintering temperature range and the anomalous grain growth. In this work, Spark Plasma Sintering (SPS) and high-energy ball milling (HEBM), following heat treatments (calcining and sintering), in oxidative (O2) atmosphere have been used to prepare single phase highly densified KNN (“pure” and Cu2+ or Li1+ doped), with theoretical densities ρth > 97% and BTNN ceramics (ρth ~ 90%), respectively. Using BTTN ceramics with a P4mm perovskite-like structure, we showed that by increasing the NaNbO3 content, the ferroelectric properties change from having a relaxor effect to an almost “normal” ferroelectric character, while the tetragonality and grain size increase and the shear piezoelectric coefficients (k15, g15 and d15) improve. For KNN ceramics, the results reveal that the values for remanent polarization as well as for most of the coercive field are quite similar among all compositions. These facts evidenced that Cu2+ may be incorporated into the A and/or B sites of the perovskite structure, having both hardening and softening effects. PMID:28773304
The distance function effect on k-nearest neighbor classification for medical datasets.
Hu, Li-Yu; Huang, Min-Wei; Ke, Shih-Wen; Tsai, Chih-Fong
2016-01-01
K-nearest neighbor (k-NN) classification is conventional non-parametric classifier, which has been used as the baseline classifier in many pattern classification problems. It is based on measuring the distances between the test data and each of the training data to decide the final classification output. Since the Euclidean distance function is the most widely used distance metric in k-NN, no study examines the classification performance of k-NN by different distance functions, especially for various medical domain problems. Therefore, the aim of this paper is to investigate whether the distance function can affect the k-NN performance over different medical datasets. Our experiments are based on three different types of medical datasets containing categorical, numerical, and mixed types of data and four different distance functions including Euclidean, cosine, Chi square, and Minkowsky are used during k-NN classification individually. The experimental results show that using the Chi square distance function is the best choice for the three different types of datasets. However, using the cosine and Euclidean (and Minkowsky) distance function perform the worst over the mixed type of datasets. In this paper, we demonstrate that the chosen distance function can affect the classification accuracy of the k-NN classifier. For the medical domain datasets including the categorical, numerical, and mixed types of data, K-NN based on the Chi square distance function performs the best.
Generative Models for Similarity-based Classification
2007-01-01
NC), local nearest centroid (local NC), k-nearest neighbors ( kNN ), and condensed nearest neighbors (CNN) are all similarity-based classifiers which...vector machine to the k nearest neighbors of the test sample [80]. The SVM- KNN method was developed to address the robustness and dimensionality...concerns that afflict nearest neighbors and SVMs. Similarly to the nearest-means classifier, the SVM- KNN is a hybrid local and global classifier developed
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang Yanxia; Ma He; Peng Nanbo
We apply one of the lazy learning methods, the k-nearest neighbor (kNN) algorithm, to estimate the photometric redshifts of quasars based on various data sets from the Sloan Digital Sky Survey (SDSS), the UKIRT Infrared Deep Sky Survey (UKIDSS), and the Wide-field Infrared Survey Explorer (WISE; the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample, and the SDSS-UKIDSS-WISE sample). The influence of the k value and different input patterns on the performance of kNN is discussed. kNN performs best when k is different with a special input pattern for a special data set. The best result belongs to the SDSS-UKIDSS-WISEmore » sample. The experimental results generally show that the more information from more bands, the better performance of photometric redshift estimation with kNN. The results also demonstrate that kNN using multiband data can effectively solve the catastrophic failure of photometric redshift estimation, which is met by many machine learning methods. Compared with the performance of various other methods of estimating the photometric redshifts of quasars, kNN based on KD-Tree shows superiority, exhibiting the best accuracy.« less
Mutual proximity graphs for improved reachability in music recommendation.
Flexer, Arthur; Stevens, Jeff
2018-01-01
This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness.
Mutual proximity graphs for improved reachability in music recommendation
Flexer, Arthur; Stevens, Jeff
2018-01-01
This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness. PMID:29348779
Electromagnetic Induction Spectroscopy for the Detection of Subsurface Targets
2012-12-01
curves of the proposed method and that of Fails et al.. For the kNN ROC curve, k = 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81...et al. [6] and Ramachandran et al. [7] both demonstrated success in detecting mines using the k-nearest-neighbor ( kNN ) algorithm based on the EMI...error is also included in the feature vector. The kNN labels an unknown target based on the closest targets in a training set. Collins et al. [2] and
Development of a Lead-free Piezoelectric (K,Na)NbO3 Thin Film Deposited on Nickel-based Electrodes
NASA Astrophysics Data System (ADS)
Bani Milhim, Alaeddin
It is desirable to replace noble metals used as electrode materials for piezoelectric thin film with base metals. This will reduce the piezoelectric thin film fabrication cost. A nickel?based layer in conjunction with other protective layers is proposed as a bottom electrode for lead-free piezoelectric KNN thin film. The obtained results do not indicate the oxidation of the nickel?based bottom electrode after the deposition of KNN at 600 °C for 10 hours in the presence of oxygen and/or after annealing the sample at 400 °C for an hour in air. The fabricated KNN thin film was fully characterized in this work. The effective piezoelectric coefficients d33 and d31 were estimated to be 37 pm/V and 17.2 pm/V, respectively, at 100 kV/cm. The piezoelectric properties of the fabricated KNN/Ni/Ti/SiO2/Si are affected by the crystal orientation of the KNN layer, which was preferentially oriented in the (110) direction. Optimization of the deposition parameters of the fabricated KNN/Ni/Ti/SiO2/Si film is expected to further enhance the piezoelectric properties. Two novel systems utilizing the developed KNN piezoelectric thin film are proposed and their performance simulated based on the achieved KNN thin film parameters. The first is a precision automated nanomanipulation system using an AFM as a sensor and piezo-actuated manipulators. Real-time feedback of the particle being manipulated can be achieved using the proposed system. The length of the manipulators needs to be at least 2 mm to be incorporated with a commercial AFM system. To fabricate the required manipulators, a three-step electrochemical etching technique was developed. Tungsten tips combining well-defined conical shape, a length as large as 2 mm, and sharpness with a radius of curvature of around 20 nm were fabricated using the proposed technique. By depositing the KNN thin film on the fabricated manipulator, nanomanipulators with out-of-plane actuation can be produced. Ultrasonic piezoelectric fan array, the second system, is proposed for GPU cooling applications. The developed KNN thin film is proposed as the piezo layer in the piezo fan structure. The novel solution can offer large air flow rate and low power consumption. Since the operating frequency is beyond the human audible frequencies, non?audible noise fans are expected by using the proposed ultrasonic piezo fan system. Moreover, fabrication of these ultrasonic piezo fans can be part of the GPU fabrication process itself.
The Korean Neonatal Network: An Overview
Chang, Yun Sil; Park, Hyun-Young
2015-01-01
Currently, in the Republic of Korea, despite the very-low-birth rate, the birth rate and number of preterm infants are markedly increasing. Neonatal deaths and major complications mostly occur in premature infants, especially very-low-birth-weight infants (VLBWIs). VLBWIs weigh less than 1,500 g at birth and require intensive treatment in a neonatal intensive care unit (NICU). The operation of the Korean Neonatal Network (KNN) officially started on April 15, 2013, by the Korean Society of Neonatology with support from the Korea Centers for Disease Control and Prevention. The KNN is a national multicenter neonatal network based on a prospective web-based registry for VLBWIs. About 2,000 VLBWIs from 60 participating hospital NICUs are registered annually in the KNN. The KNN has built unique systems such as a web-based real-time data display on the web site and a site-visit monitoring system for data quality surveillance. The KNN should be maintained and developed further in order to generate appropriate, population-based, data-driven, health-care policies; facilitate active multicenter neonatal research, including quality improvement of neonatal care; and ultimately lead to improvement in the prognosis of high-risk newborns and subsequent reduction in health-care costs through the development of evidence-based neonatal medicine in Korea. PMID:26566355
The Korean Neonatal Network: An Overview.
Chang, Yun Sil; Park, Hyun-Young; Park, Won Soon
2015-10-01
Currently, in the Republic of Korea, despite the very-low-birth rate, the birth rate and number of preterm infants are markedly increasing. Neonatal deaths and major complications mostly occur in premature infants, especially very-low-birth-weight infants (VLBWIs). VLBWIs weigh less than 1,500 g at birth and require intensive treatment in a neonatal intensive care unit (NICU). The operation of the Korean Neonatal Network (KNN) officially started on April 15, 2013, by the Korean Society of Neonatology with support from the Korea Centers for Disease Control and Prevention. The KNN is a national multicenter neonatal network based on a prospective web-based registry for VLBWIs. About 2,000 VLBWIs from 60 participating hospital NICUs are registered annually in the KNN. The KNN has built unique systems such as a web-based real-time data display on the web site and a site-visit monitoring system for data quality surveillance. The KNN should be maintained and developed further in order to generate appropriate, population-based, data-driven, health-care policies; facilitate active multicenter neonatal research, including quality improvement of neonatal care; and ultimately lead to improvement in the prognosis of high-risk newborns and subsequent reduction in health-care costs through the development of evidence-based neonatal medicine in Korea.
Kalegowda, Yogesh; Harmer, Sarah L
2012-03-20
Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.
Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting
NASA Astrophysics Data System (ADS)
Zhang, Ningning; Lin, Aijing; Shang, Pengjian
2017-07-01
In this paper, we propose a new two-stage methodology that combines the ensemble empirical mode decomposition (EEMD) with multidimensional k-nearest neighbor model (MKNN) in order to forecast the closing price and high price of the stocks simultaneously. The modified algorithm of k-nearest neighbors (KNN) has an increasingly wide application in the prediction of all fields. Empirical mode decomposition (EMD) decomposes a nonlinear and non-stationary signal into a series of intrinsic mode functions (IMFs), however, it cannot reveal characteristic information of the signal with much accuracy as a result of mode mixing. So ensemble empirical mode decomposition (EEMD), an improved method of EMD, is presented to resolve the weaknesses of EMD by adding white noise to the original data. With EEMD, the components with true physical meaning can be extracted from the time series. Utilizing the advantage of EEMD and MKNN, the new proposed ensemble empirical mode decomposition combined with multidimensional k-nearest neighbor model (EEMD-MKNN) has high predictive precision for short-term forecasting. Moreover, we extend this methodology to the case of two-dimensions to forecast the closing price and high price of the four stocks (NAS, S&P500, DJI and STI stock indices) at the same time. The results indicate that the proposed EEMD-MKNN model has a higher forecast precision than EMD-KNN, KNN method and ARIMA.
Wang, Xueyi
2012-02-08
The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses the k-means clustering and the triangle inequality to accelerate the searching for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-tree, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds nearest training objects starting from the nearest cluster to the query object and uses the triangle inequality to reduce the distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10(6) records and 10(4) dimensions, kMkNN shows a 2-to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significant better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces.
A multiple-point spatially weighted k-NN method for object-based classification
NASA Astrophysics Data System (ADS)
Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.
2016-10-01
Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hao, Jigong; Xu, Zhijun, E-mail: zhjxu@lcu.edu.cn; Chu, Ruiqing
Reddish orange-emitting 0.948(K{sub 0.5}Na{sub 0.5})NbO{sub 3}-0.052LiSbO{sub 3}-xmol%Sm{sub 2}O{sub 3} (KNN-5.2LS-xSm{sub 2}O{sub 3}) lead-free piezoelectric ceramics with good piezoelectric properties were fabricated in this study, and the photoluminescence and electrical properties of the ceramics were systematically studied. Results showed that Sm{sub 2}O{sub 3} substitution into KNN-5.2LS induces a phase transition from the coexistence of orthorhombic and tetragonal phases to a pseudocubic phase and shifts the polymorphic phase transition (PPT) to below room temperature. The temperature stability and fatigue resistance of the modified ceramics were significantly improved by Sm{sub 2}O{sub 3} substitution. The KNN-5.2LS ceramic with 0.4 mol. % Sm{sub 2}O{sub 3} exhibitedmore » temperature-independent properties (25–150 °C), fatigue-free behavior (up to 10{sup 6} cycles), and good piezoelectric properties (d{sub 33}{sup * }= 230 pm/V, d{sub 33} = 176 pC/N, k{sub p} = 35%). Studies on the photoluminescence properties of the samples showed strong reddish-orange emission upon blue light excitation; these emission intensities were strongly dependent on the doping concentration and sintering temperature. The 0.4 mol. % Sm{sub 2}O{sub 3}-modified sample exhibited temperature responses over a wide temperature range of 10–443 K. The maximum sensing sensitivity of the sample was 7.5 × 10{sup −4} K at 293 K, at which point PPT occurred. A relatively long decay lifetime τ of 1.27–1.40 ms and a large quantum yield η of 0.17–0.19 were obtained from the Sm-modified samples. These results suggest that the KNN-5.2LS-xSm{sub 2}O{sub 3} system presents multifunctional properties and significant technological potential in novel multifunctional devices.« less
Comparative Analysis of Document level Text Classification Algorithms using R
NASA Astrophysics Data System (ADS)
Syamala, Maganti; Nalini, N. J., Dr; Maguluri, Lakshamanaphaneendra; Ragupathy, R., Dr.
2017-08-01
From the past few decades there has been tremendous volumes of data available in Internet either in structured or unstructured form. Also, there is an exponential growth of information on Internet, so there is an emergent need of text classifiers. Text mining is an interdisciplinary field which draws attention on information retrieval, data mining, machine learning, statistics and computational linguistics. And to handle this situation, a wide range of supervised learning algorithms has been introduced. Among all these K-Nearest Neighbor(KNN) is efficient and simplest classifier in text classification family. But KNN suffers from imbalanced class distribution and noisy term features. So, to cope up with this challenge we use document based centroid dimensionality reduction(CentroidDR) using R Programming. By combining these two text classification techniques, KNN and Centroid classifiers, we propose a scalable and effective flat classifier, called MCenKNN which works well substantially better than CenKNN.
Heterogeneous Multi-Metric Learning for Multi-Sensor Fusion
2011-07-01
distance”. One of the most widely used methods is the k-nearest neighbor ( KNN ) method [4], which labels an input data sample to be the class with majority...despite of its simplicity, it can be an effective candidate and can be easily extended to handle multiple sensors. Distance based method such as KNN relies...Neighbor (LMNN) method [21] which will be briefly reviewed in the sequel. LMNN method tries to learn an optimal metric specifically for KNN classifier. The
Geometric k-nearest neighbor estimation of entropy and mutual information
NASA Astrophysics Data System (ADS)
Lord, Warren M.; Sun, Jie; Bollt, Erik M.
2018-03-01
Nonparametric estimation of mutual information is used in a wide range of scientific problems to quantify dependence between variables. The k-nearest neighbor (knn) methods are consistent, and therefore expected to work well for a large sample size. These methods use geometrically regular local volume elements. This practice allows maximum localization of the volume elements, but can also induce a bias due to a poor description of the local geometry of the underlying probability measure. We introduce a new class of knn estimators that we call geometric knn estimators (g-knn), which use more complex local volume elements to better model the local geometry of the probability measures. As an example of this class of estimators, we develop a g-knn estimator of entropy and mutual information based on elliptical volume elements, capturing the local stretching and compression common to a wide range of dynamical system attractors. A series of numerical examples in which the thickness of the underlying distribution and the sample sizes are varied suggest that local geometry is a source of problems for knn methods such as the Kraskov-Stögbauer-Grassberger estimator when local geometric effects cannot be removed by global preprocessing of the data. The g-knn method performs well despite the manipulation of the local geometry. In addition, the examples suggest that the g-knn estimators can be of particular relevance to applications in which the system is large, but the data size is limited.
An Examination of Diameter Density Prediction with k-NN and Airborne Lidar
Strunk, Jacob L.; Gould, Peter J.; Packalen, Petteri; ...
2017-11-16
While lidar-based forest inventory methods have been widely demonstrated, performances of methods to predict tree diameters with airborne lidar (lidar) are not well understood. One cause for this is that the performance metrics typically used in studies for prediction of diameters can be difficult to interpret, and may not support comparative inferences between sampling designs and study areas. To help with this problem we propose two indices and use them to evaluate a variety of lidar and k nearest neighbor (k-NN) strategies for prediction of tree diameter distributions. The indices are based on the coefficient of determination ( R 2),more » and root mean square deviation (RMSD). Both of the indices are highly interpretable, and the RMSD-based index facilitates comparisons with alternative (non-lidar) inventory strategies, and with projects in other regions. K-NN diameter distribution prediction strategies were examined using auxiliary lidar for 190 training plots distribute across the 800 km 2 Savannah River Site in South Carolina, USA. In conclusion, we evaluate the performance of k-NN with respect to distance metrics, number of neighbors, predictor sets, and response sets. K-NN and lidar explained 80% of variability in diameters, and Mahalanobis distance with k = 3 neighbors performed best according to a number of criteria.« less
An Examination of Diameter Density Prediction with k-NN and Airborne Lidar
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strunk, Jacob L.; Gould, Peter J.; Packalen, Petteri
While lidar-based forest inventory methods have been widely demonstrated, performances of methods to predict tree diameters with airborne lidar (lidar) are not well understood. One cause for this is that the performance metrics typically used in studies for prediction of diameters can be difficult to interpret, and may not support comparative inferences between sampling designs and study areas. To help with this problem we propose two indices and use them to evaluate a variety of lidar and k nearest neighbor (k-NN) strategies for prediction of tree diameter distributions. The indices are based on the coefficient of determination ( R 2),more » and root mean square deviation (RMSD). Both of the indices are highly interpretable, and the RMSD-based index facilitates comparisons with alternative (non-lidar) inventory strategies, and with projects in other regions. K-NN diameter distribution prediction strategies were examined using auxiliary lidar for 190 training plots distribute across the 800 km 2 Savannah River Site in South Carolina, USA. In conclusion, we evaluate the performance of k-NN with respect to distance metrics, number of neighbors, predictor sets, and response sets. K-NN and lidar explained 80% of variability in diameters, and Mahalanobis distance with k = 3 neighbors performed best according to a number of criteria.« less
NASA Astrophysics Data System (ADS)
Adeniyi, D. A.; Wei, Z.; Yang, Y.
2017-10-01
Recommendation problem has been extensively studied by researchers in the field of data mining, database and information retrieval. This study presents the design and realisation of an automated, personalised news recommendations system based on Chi-square statistics-based K-nearest neighbour (χ2SB-KNN) model. The proposed χ2SB-KNN model has the potential to overcome computational complexity and information overloading problems, reduces runtime and speeds up execution process through the use of critical value of χ2 distribution. The proposed recommendation engine can alleviate scalability challenges through combined online pattern discovery and pattern matching for real-time recommendations. This work also showcases the development of a novel method of feature selection referred to as Data Discretisation-Based feature selection method. This is used for selecting the best features for the proposed χ2SB-KNN algorithm at the preprocessing stage of the classification procedures. The implementation of the proposed χ2SB-KNN model is achieved through the use of a developed in-house Java program on an experimental website called OUC newsreaders' website. Finally, we compared the performance of our system with two baseline methods which are traditional Euclidean distance K-nearest neighbour and Naive Bayesian techniques. The result shows a significant improvement of our method over the baseline methods studied.
Sintering of Lead-Free Piezoelectric Sodium Potassium Niobate Ceramics
Malič, Barbara; Koruza, Jurij; Hreščak, Jitka; Bernard, Janez; Wang, Ke; Fisher, John G.; Benčan, Andreja
2015-01-01
The potassium sodium niobate, K0.5Na0.5NbO3, solid solution (KNN) is considered as one of the most promising, environment-friendly, lead-free candidates to replace highly efficient, lead-based piezoelectrics. Since the first reports of KNN, it has been recognized that obtaining phase-pure materials with a high density and a uniform, fine-grained microstructure is a major challenge. For this reason the present paper reviews the different methods for consolidating KNN ceramics. The difficulties involved in the solid-state synthesis of KNN powder, i.e., obtaining phase purity, the stoichiometry of the perovskite phase, and the chemical homogeneity, are discussed. The solid-state sintering of stoichiometric KNN is characterized by poor densification and an extremely narrow sintering-temperature range, which is close to the solidus temperature. A study of the initial sintering stage revealed that coarsening of the microstructure without densification contributes to a reduction of the driving force for sintering. The influences of the (K + Na)/Nb molar ratio, the presence of a liquid phase, chemical modifications (doping, complex solid solutions) and different atmospheres (i.e., defect chemistry) on the sintering are discussed. Special sintering techniques, such as pressure-assisted sintering and spark-plasma sintering, can be effective methods for enhancing the density of KNN ceramics. The sintering behavior of KNN is compared to that of a representative piezoelectric lead zirconate titanate (PZT). PMID:28793702
On prognostic models, artificial intelligence and censored observations.
Anand, S S; Hamilton, P W; Hughes, J G; Bell, D A
2001-03-01
The development of prognostic models for assisting medical practitioners with decision making is not a trivial task. Models need to possess a number of desirable characteristics and few, if any, current modelling approaches based on statistical or artificial intelligence can produce models that display all these characteristics. The inability of modelling techniques to provide truly useful models has led to interest in these models being purely academic in nature. This in turn has resulted in only a very small percentage of models that have been developed being deployed in practice. On the other hand, new modelling paradigms are being proposed continuously within the machine learning and statistical community and claims, often based on inadequate evaluation, being made on their superiority over traditional modelling methods. We believe that for new modelling approaches to deliver true net benefits over traditional techniques, an evaluation centric approach to their development is essential. In this paper we present such an evaluation centric approach to developing extensions to the basic k-nearest neighbour (k-NN) paradigm. We use standard statistical techniques to enhance the distance metric used and a framework based on evidence theory to obtain a prediction for the target example from the outcome of the retrieved exemplars. We refer to this new k-NN algorithm as Censored k-NN (Ck-NN). This reflects the enhancements made to k-NN that are aimed at providing a means for handling censored observations within k-NN.
Multisite rainfall downscaling and disaggregation in a tropical urban area
NASA Astrophysics Data System (ADS)
Lu, Y.; Qin, X. S.
2014-02-01
A systematic downscaling-disaggregation study was conducted over Singapore Island, with an aim to generate high spatial and temporal resolution rainfall data under future climate-change conditions. The study consisted of two major components. The first part was to perform an inter-comparison of various alternatives of downscaling and disaggregation methods based on observed data. This included (i) single-site generalized linear model (GLM) plus K-nearest neighbor (KNN) (S-G-K) vs. multisite GLM (M-G) for spatial downscaling, (ii) HYETOS vs. KNN for single-site disaggregation, and (iii) KNN vs. MuDRain (Multivariate Rainfall Disaggregation tool) for multisite disaggregation. The results revealed that, for multisite downscaling, M-G performs better than S-G-K in covering the observed data with a lower RMSE value; for single-site disaggregation, KNN could better keep the basic statistics (i.e. standard deviation, lag-1 autocorrelation and probability of wet hour) than HYETOS; for multisite disaggregation, MuDRain outperformed KNN in fitting interstation correlations. In the second part of the study, an integrated downscaling-disaggregation framework based on M-G, KNN, and MuDRain was used to generate hourly rainfall at multiple sites. The results indicated that the downscaled and disaggregated rainfall data based on multiple ensembles from HadCM3 for the period from 1980 to 2010 could well cover the observed mean rainfall amount and extreme data, and also reasonably keep the spatial correlations both at daily and hourly timescales. The framework was also used to project future rainfall conditions under HadCM3 SRES A2 and B2 scenarios. It was indicated that the annual rainfall amount could reduce up to 5% at the end of this century, but the rainfall of wet season and extreme hourly rainfall could notably increase.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ikeda, Y.; Sato, T.
The resonance energies of strange dibaryons are investigated with the use of the KNN-{pi}YN coupled-channels Faddeev equation. It is found that the pole positions of the predicted three-body amplitudes are significantly modified when the three-body coupled-channels dynamics is approximated, as is done in the literature, by the effective two-body KN interactions.
A study on (K, Na) NbO3 based multilayer piezoelectric ceramics micro speaker
NASA Astrophysics Data System (ADS)
Gao, Renlong; Chu, Xiangcheng; Huan, Yu; Sun, Yiming; Liu, Jiayi; Wang, Xiaohui; Li, Longtu
2014-10-01
A flat panel micro speaker was fabricated from (K, Na) NbO3 (KNN)-based multilayer piezoelectric ceramics by a tape casting and cofiring process using Ag-Pd alloys as an inner electrode. The interface between ceramic and electrode was investigated by scanning electron microscope (SEM) and transmission electron microscope (TEM). The acoustic response was characterized by a standard audio test system. We found that the micro speaker with dimensions of 23 × 27 × 0.6 mm3, using three layers of 30 μm thickness KNN-based ceramic, has a high average sound pressure level (SPL) of 87 dB, between 100 Hz-20 kHz under five voltage. This result was even better than that of lead zirconate titanate (PZT)-based ceramics under the same conditions. The experimental results show that the KNN-based multilayer ceramics could be used as lead free piezoelectric micro speakers.
Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Jian; Hamidouche, Khaled; Zheng, Jie
2015-08-05
Machine Learning algorithms are benefiting from the continuous improvement of programming models, including MPI, MapReduce and PGAS. k-Nearest Neighbors (k-NN) algorithm is a widely used machine learning algorithm, applied to supervised learning tasks such as classification. Several parallel implementations of k-NN have been proposed in the literature and practice. However, on high-performance computing systems with high-speed interconnects, it is important to further accelerate existing designs of the k-NN algorithm through taking advantage of scalable programming models. To improve the performance of k-NN on large-scale environment with InfiniBand network, this paper proposes several alternative hybrid MPI+OpenSHMEM designs and performs a systemicmore » evaluation and analysis on typical workloads. The hybrid designs leverage the one-sided memory access to better overlap communication with computation than the existing pure MPI design, and propose better schemes for efficient buffer management. The implementation based on k-NN program from MaTEx with MVAPICH2-X (Unified MPI+PGAS Communication Runtime over InfiniBand) shows up to 9.0% time reduction for training KDD Cup 2010 workload over 512 cores, and 27.6% time reduction for small workload with balanced communication and computation. Experiments of running with varied number of cores show that our design can maintain good scalability.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Qi; Zhu, Fang-Yuan; Cheng, Li-Qian
Crystallographic structure of sol-gel-processed lead-free (K,Na)NbO{sub 3} (KNN) epitaxial films on [100]-cut SrTiO{sub 3} single-crystalline substrates was investigated for a deeper understanding of its piezoelectric response. Lattice parameter measurement by high-resolution X-ray diffraction and transmission electron microscopy revealed that the orthorhombic KNN films on SrTiO{sub 3} (100) surfaces are [010] oriented (b-axis-oriented) rather than commonly identified c-axis orientation. Based on the crystallographic orientation and corresponding ferroelectric domain structure investigated by piezoresponse force microscopy, the superior piezoelectric property along b-axis of epitaxial KNN films than other orientations can be explained.
Attention Recognition in EEG-Based Affective Learning Research Using CFS+KNN Algorithm.
Hu, Bin; Li, Xiaowei; Sun, Shuting; Ratcliffe, Martyn
2018-01-01
The research detailed in this paper focuses on the processing of Electroencephalography (EEG) data to identify attention during the learning process. The identification of affect using our procedures is integrated into a simulated distance learning system that provides feedback to the user with respect to attention and concentration. The authors propose a classification procedure that combines correlation-based feature selection (CFS) and a k-nearest-neighbor (KNN) data mining algorithm. To evaluate the CFS+KNN algorithm, it was test against CFS+C4.5 algorithm and other classification algorithms. The classification performance was measured 10 times with different 3-fold cross validation data. The data was derived from 10 subjects while they were attempting to learn material in a simulated distance learning environment. A self-assessment model of self-report was used with a single valence to evaluate attention on 3 levels (high, neutral, low). It was found that CFS+KNN had a much better performance, giving the highest correct classification rate (CCR) of % for the valence dimension divided into three classes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Atamanik, E.; Thangadurai, V.
We report a comparative study of the dielectric properties of solid-state (ceramic method) synthesized NaNbO{sub 3} (NN), Na{sub 0.75}K{sub 0.25}NbO{sub 3} (K25NN), K{sub 0.5}Na{sub 0.5}NbO{sub 3} (KNN) and some composite materials containing In{sub 2}O{sub 3} and NN or KNN using an AC impedance method. Powder X-ray diffraction (PXRD) was employed to investigate the phase purity. No significant amount of impurity phase was observed for NN, K25NN, and KNN. Substitutions of 10, 15 and 25 mol% In{sup 3+} for Nb{sup 5+} in KNN and NN using solid-state reactions at 1150 deg. C resulted in composite materials. AC impedance studies of NN,more » KNN and K25NN in the temperature range of 500-800 deg. C showed a single semicircle (attributed to the bulk property) in the high-frequency range of 10{sup 3} to 10{sup 6} Hz. The individual contributions from the bulk and grain boundary on the dielectric properties were resolved and quantified from the impedance data. The calculated dielectric values for NN were consistent with previously reported in the literature. 10% Indium based KNN composite materials had the lowest dielectric loss 0.585 and the dielectric constant of 233 at 100 kHz at the temperature of 650 deg. C.« less
AVNM: A Voting based Novel Mathematical Rule for Image Classification.
Vidyarthi, Ankit; Mittal, Namita
2016-12-01
In machine learning, the accuracy of the system depends upon classification result. Classification accuracy plays an imperative role in various domains. Non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Besides its easiness, simplicity and effectiveness characteristics, the main problem associated with KNN classifier is the selection of a number of nearest neighbors i.e. "k" for computation. At present, it is hard to find the optimal value of "k" using any statistical algorithm, which gives perfect accuracy in terms of low misclassification error rate. Motivated by the prescribed problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature like KNN. AVNM uses the weighted voting mechanism with sample space reduction to learn and examine the predicted class label for unidentified sample. AVNM is free from any initial selection of predefined variable and neighbor selection as found in KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments are made on 10 standard datasets taken from UCI database and one manually created dataset. The experimental result shows that the proposed AVNM rule outperforms the KNN classifier and its variants. Experimentation results based on confusion matrix accuracy parameter proves higher accuracy value with AVNM rule. The proposed AVNM rule is based on sample space reduction mechanism for identification of an optimal number of nearest neighbor selections. AVNM results in better classification accuracy and minimum error rate as compared with the state-of-art algorithm, KNN, and its variants. The proposed rule automates the selection of nearest neighbor selection and improves classification rate for UCI dataset and manually created dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhu, W. L.; Zhu, J. L.; Meng, Y.; Wang, M. S.; Zhu, B.; Zhu, X. H.; Zhu, J. G.; Xiao, D. Q.; Pezzotti, G.
2011-12-01
This paper presents a Raman spectroscopic study of compositional-change-induced structure variation and of the related mechanism of Mg doping in LiSbO3 (LS)-modified (K0.5Na0.5)NbO3 (KNN) ceramics. With increasing LS content from 0 to 0.06, a discontinuous shift towards higher wavenumbers was found for the band position of the A1g(v1) stretching mode of KNN, accompanied by a clearly nonlinear broadening of this band and a decrease in its intensity. Such morphological changes in the Raman spectrum result from two factors: (i) changes in polarizability/binding strength of the O-Nb-O vibration upon incorporation of Li ions in the KNN perovskitic structure and (ii) a polymorphic phase transition (PPT) from orthorhombic to tetragonal (O → T) phase at x > 0.04. Upon increasing the amount, w, of Mg dopant incorporated into the (1-x)KNN-xLS ceramic structure, the intensity of the Raman bands are enhanced, while the peak position and the full width at half maximum of the A1g(v1) mode was found to experience a clear dependence on both w and x. Raman characterization revealed that the mechanism of Mg doping is strongly correlated with the concentration of Li in the perovskite structure: Mg2+ ions will preferentially replace Li+ ions for low Mg doping while replace K/Na ions for higher doping of Mg. The PPT O → T was also found to be altered by the introduction of Mg and the critical value of LS concentration, xO-T, for incipient O → T transition in the KNN-xLS-wMT system was strongly dependent on Mg content, with xO → T being roughly equal to 0.04 + 2w, for the case of dilute Mg alloying.
Improving GPU-accelerated adaptive IDW interpolation algorithm using fast kNN search.
Mei, Gang; Xu, Nengxiong; Xu, Liangliang
2016-01-01
This paper presents an efficient parallel Adaptive Inverse Distance Weighting (AIDW) interpolation algorithm on modern Graphics Processing Unit (GPU). The presented algorithm is an improvement of our previous GPU-accelerated AIDW algorithm by adopting fast k-nearest neighbors (kNN) search. In AIDW, it needs to find several nearest neighboring data points for each interpolated point to adaptively determine the power parameter; and then the desired prediction value of the interpolated point is obtained by weighted interpolating using the power parameter. In this work, we develop a fast kNN search approach based on the space-partitioning data structure, even grid, to improve the previous GPU-accelerated AIDW algorithm. The improved algorithm is composed of the stages of kNN search and weighted interpolating. To evaluate the performance of the improved algorithm, we perform five groups of experimental tests. The experimental results indicate: (1) the improved algorithm can achieve a speedup of up to 1017 over the corresponding serial algorithm; (2) the improved algorithm is at least two times faster than our previous GPU-accelerated AIDW algorithm; and (3) the utilization of fast kNN search can significantly improve the computational efficiency of the entire GPU-accelerated AIDW algorithm.
Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio
NASA Astrophysics Data System (ADS)
Nababan, A. A.; Sitompul, O. S.; Tulus
2018-04-01
K- Nearest Neighbor (KNN) is a good classifier, but from several studies, the result performance accuracy of KNN still lower than other methods. One of the causes of the low accuracy produced, because each attribute has the same effect on the classification process, while some less relevant characteristics lead to miss-classification of the class assignment for new data. In this research, we proposed Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio as a parameter to see the correlation between each attribute in the data and the Gain Ratio also will be used as the basis for weighting each attribute of the dataset. The accuracy of results is compared to the accuracy acquired from the original KNN method using 10-fold Cross-Validation with several datasets from the UCI Machine Learning repository and KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the result of the test, the proposed method was able to increase the classification accuracy of KNN, where the highest difference of accuracy obtained hayes-roth dataset is worth 12.73%, and the lowest difference of accuracy obtained in the abalone dataset of 0.07%. The average result of the accuracy of all dataset increases the accuracy by 5.33%.
Piezoelectric Properties of LiSbO3-Modified (K0.48Na0.52)NbO3 Lead-Free Ceramics
NASA Astrophysics Data System (ADS)
Wu, Jiagang; Wang, Yuanyu; Xiao, Dingquan; Zhu, Jianguo; Yu, Ping; Wu, Lang; Wu, Wenjuan
2007-11-01
Lead-free piezoelectric (1-x)(K0.48Na0.52)NbO3-xLiSbO3 [(1-x)KNN-xLS] ceramics were prepared by conventional sintering. A morphotropic phase boundary (MPB) between the orthorhombic and tetragonal phases was identified in the composition range of 0.04
NASA Astrophysics Data System (ADS)
Li, Cuiling; Jiang, Kai; Zhao, Xueguan; Fan, Pengfei; Wang, Xiu; Liu, Chuan
2017-10-01
Impurity of melon seeds variety will cause reductions of melon production and economic benefits of farmers, this research aimed to adopt spectral technology combined with chemometrics methods to identify melon seeds variety. Melon seeds whose varieties were "Yi Te Bai", "Yi Te Jin", "Jing Mi NO.7", "Jing Mi NO.11" and " Yi Li Sha Bai "were used as research samples. A simple spectral system was developed to collect reflective spectral data of melon seeds, including a light source unit, a spectral data acquisition unit and a data processing unit, the detection wavelength range of this system was 200-1100nm with spectral resolution of 0.14 7.7nm. The original reflective spectral data was pre-treated with de-trend (DT), multiple scattering correction (MSC), first derivative (FD), normalization (NOR) and Savitzky-Golay (SG) convolution smoothing methods. Principal Component Analysis (PCA) method was adopted to reduce the dimensions of reflective spectral data and extract principal components. K-nearest neighbour (KNN) and Fisher discriminant analysis (FDA) methods were used to develop discriminant models of melon seeds variety based on PCA. Spectral data pretreatments improved the discriminant effects of KNN and FDA, FDA generated better discriminant results than KNN, both KNN and FDA methods produced discriminant accuracies reaching to 90.0% for validation set. Research results showed that using spectral technology in combination with KNN and FDA modelling methods to identify melon seeds variety was feasible.
NASA Astrophysics Data System (ADS)
Wani, Omar; Beckers, Joost V. L.; Weerts, Albrecht H.; Solomatine, Dimitri P.
2017-08-01
A non-parametric method is applied to quantify residual uncertainty in hydrologic streamflow forecasting. This method acts as a post-processor on deterministic model forecasts and generates a residual uncertainty distribution. Based on instance-based learning, it uses a k nearest-neighbour search for similar historical hydrometeorological conditions to determine uncertainty intervals from a set of historical errors, i.e. discrepancies between past forecast and observation. The performance of this method is assessed using test cases of hydrologic forecasting in two UK rivers: the Severn and Brue. Forecasts in retrospect were made and their uncertainties were estimated using kNN resampling and two alternative uncertainty estimators: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Results show that kNN uncertainty estimation produces accurate and narrow uncertainty intervals with good probability coverage. Analysis also shows that the performance of this technique depends on the choice of search space. Nevertheless, the accuracy and reliability of uncertainty intervals generated using kNN resampling are at least comparable to those produced by QR and UNEEC. It is concluded that kNN uncertainty estimation is an interesting alternative to other post-processors, like QR and UNEEC, for estimating forecast uncertainty. Apart from its concept being simple and well understood, an advantage of this method is that it is relatively easy to implement.
An Improvement To The k-Nearest Neighbor Classifier For ECG Database
NASA Astrophysics Data System (ADS)
Jaafar, Haryati; Hidayah Ramli, Nur; Nasir, Aimi Salihah Abdul
2018-03-01
The k nearest neighbor (kNN) is a non-parametric classifier and has been widely used for pattern classification. However, in practice, the performance of kNN often tends to fail due to the lack of information on how the samples are distributed among them. Moreover, kNN is no longer optimal when the training samples are limited. Another problem observed in kNN is regarding the weighting issues in assigning the class label before classification. Thus, to solve these limitations, a new classifier called Mahalanobis fuzzy k-nearest centroid neighbor (MFkNCN) is proposed in this study. Here, a Mahalanobis distance is applied to avoid the imbalance of samples distribition. Then, a surrounding rule is employed to obtain the nearest centroid neighbor based on the distributions of training samples and its distance to the query point. Consequently, the fuzzy membership function is employed to assign the query point to the class label which is frequently represented by the nearest centroid neighbor Experimental studies from electrocardiogram (ECG) signal is applied in this study. The classification performances are evaluated in two experimental steps i.e. different values of k and different sizes of feature dimensions. Subsequently, a comparative study of kNN, kNCN, FkNN and MFkCNN classifier is conducted to evaluate the performances of the proposed classifier. The results show that the performance of MFkNCN consistently exceeds the kNN, kNCN and FkNN with the best classification rates of 96.5%.
A Proposed Methodology to Classify Frontier Capital Markets
2011-07-31
but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project involves basic...machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ), ensemble...Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F. Horizontal
A Proposed Methodology to Classify Frontier Capital Markets
2011-07-31
out of charity, but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project...identification, and machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ...Support Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F
Research on cardiovascular disease prediction based on distance metric learning
NASA Astrophysics Data System (ADS)
Ni, Zhuang; Liu, Kui; Kang, Guixia
2018-04-01
Distance metric learning algorithm has been widely applied to medical diagnosis and exhibited its strengths in classification problems. The k-nearest neighbour (KNN) is an efficient method which treats each feature equally. The large margin nearest neighbour classification (LMNN) improves the accuracy of KNN by learning a global distance metric, which did not consider the locality of data distributions. In this paper, we propose a new distance metric algorithm adopting cosine metric and LMNN named COS-SUBLMNN which takes more care about local feature of data to overcome the shortage of LMNN and improve the classification accuracy. The proposed methodology is verified on CVDs patient vector derived from real-world medical data. The Experimental results show that our method provides higher accuracy than KNN and LMNN did, which demonstrates the effectiveness of the Risk predictive model of CVDs based on COS-SUBLMNN.
Short-term Power Load Forecasting Based on Balanced KNN
NASA Astrophysics Data System (ADS)
Lv, Xianlong; Cheng, Xingong; YanShuang; Tang, Yan-mei
2018-03-01
To improve the accuracy of load forecasting, a short-term load forecasting model based on balanced KNN algorithm is proposed; According to the load characteristics, the historical data of massive power load are divided into scenes by the K-means algorithm; In view of unbalanced load scenes, the balanced KNN algorithm is proposed to classify the scene accurately; The local weighted linear regression algorithm is used to fitting and predict the load; Adopting the Apache Hadoop programming framework of cloud computing, the proposed algorithm model is parallelized and improved to enhance its ability of dealing with massive and high-dimension data. The analysis of the household electricity consumption data for a residential district is done by 23-nodes cloud computing cluster, and experimental results show that the load forecasting accuracy and execution time by the proposed model are the better than those of traditional forecasting algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Majidpour, Mostafa; Qiu, Charlie; Chu, Peter
Three algorithms for the forecasting of energy consumption at individual EV charging outlets have been applied to real world data from the UCLA campus. Out of these three algorithms, namely k-Nearest Neighbor (kNN), ARIMA, and Pattern Sequence Forecasting (PSF), kNN with k=1, was the best and PSF was the worst performing algorithm with respect to the SMAPE measure. The advantage of PSF is its increased robustness to noise by substituting the real valued time series with an integer valued one, and the advantage of NN is having the least SMAPE for our data. We propose a Modified PSF algorithm (MPSF)more » which is a combination of PSF and NN; it could be interpreted as NN on integer valued data or as PSF with considering only the most recent neighbor to produce the output. Some other shortcomings of PSF are also addressed in the MPSF. Results show that MPSF has improved the forecast performance.« less
Na, X D; Zang, S Y; Wu, C S; Li, W L
2015-11-01
Knowledge of the spatial extent of forested wetlands is essential to many studies including wetland functioning assessment, greenhouse gas flux estimation, and wildlife suitable habitat identification. For discriminating forested wetlands from their adjacent land cover types, researchers have resorted to image analysis techniques applied to numerous remotely sensed data. While with some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied these two approaches to the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the objected-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy and the difference of their kappa coefficients are statistically significant (p<0.01). There were noticeably omissions for forested and herbaceous wetlands based on the per-pixel classifications using the RF algorithm. As for the object-based image analysis, there were also statistically significant differences (p<0.01) of Kappa coefficient between results performed based on RF and KNN algorithms. The object-based classification using RF provided a more visually adequate distribution of interested land cover types, while the object classifications based on the KNN algorithm showed noticeably commissions for forested wetlands and omissions for agriculture land. This research proves that the object-based classification with RF using optical, radar, and topographical data improved the mapping accuracy of land covers and provided a feasible approach to discriminate the forested wetlands from the other land cover types in forestry area.
Balouchestani, Mohammadreza; Krishnan, Sridhar
2014-01-01
Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for diagnostic and treatment purposes of heart diseases. Clustering and classification of collecting data are essential parts for detecting concealed information of P-QRS-T waves in the long-term ECG recording. Currently used algorithms do have their share of drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from huge energy consumption and load of sampling. These drawbacks motivated us in developing novel optimized clustering algorithm which could easily scan large ECG datasets for establishing low power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods: Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC) followed by sorting the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers are applied to the proposed algorithm. We show our algorithm based on PCA features in combination with K-NN classifier shows better performance than other methods. The proposed algorithm outperforms existing algorithms by increasing 11% classification accuracy. In addition, the proposed algorithm illustrates classification accuracy for K-NN and PNN classifiers, and a Receiver Operating Characteristics (ROC) area of 99.98%, 99.83%, and 99.75% respectively.
Fusing Bluetooth Beacon Data with Wi-Fi Radiomaps for Improved Indoor Localization
Kanaris, Loizos; Kokkinis, Akis; Liotta, Antonio; Stavrou, Stavros
2017-01-01
Indoor user localization and tracking are instrumental to a broad range of services and applications in the Internet of Things (IoT) and particularly in Body Sensor Networks (BSN) and Ambient Assisted Living (AAL) scenarios. Due to the widespread availability of IEEE 802.11, many localization platforms have been proposed, based on the Wi-Fi Received Signal Strength (RSS) indicator, using algorithms such as K-Nearest Neighbour (KNN), Maximum A Posteriori (MAP) and Minimum Mean Square Error (MMSE). In this paper, we introduce a hybrid method that combines the simplicity (and low cost) of Bluetooth Low Energy (BLE) and the popular 802.11 infrastructure, to improve the accuracy of indoor localization platforms. Building on KNN, we propose a new positioning algorithm (dubbed i-KNN) which is able to filter the initial fingerprint dataset (i.e., the radiomap), after considering the proximity of RSS fingerprints with respect to the BLE devices. In this way, i-KNN provides an optimised small subset of possible user locations, based on which it finally estimates the user position. The proposed methodology achieves fast positioning estimation due to the utilization of a fragment of the initial fingerprint dataset, while at the same time improves positioning accuracy by minimizing any calculation errors. PMID:28394268
Fusing Bluetooth Beacon Data with Wi-Fi Radiomaps for Improved Indoor Localization.
Kanaris, Loizos; Kokkinis, Akis; Liotta, Antonio; Stavrou, Stavros
2017-04-10
Indoor user localization and tracking are instrumental to a broad range of services and applications in the Internet of Things (IoT) and particularly in Body Sensor Networks (BSN) and Ambient Assisted Living (AAL) scenarios. Due to the widespread availability of IEEE 802.11, many localization platforms have been proposed, based on the Wi-Fi Received Signal Strength (RSS) indicator, using algorithms such as K -Nearest Neighbour (KNN), Maximum A Posteriori (MAP) and Minimum Mean Square Error (MMSE). In this paper, we introduce a hybrid method that combines the simplicity (and low cost) of Bluetooth Low Energy (BLE) and the popular 802.11 infrastructure, to improve the accuracy of indoor localization platforms. Building on KNN, we propose a new positioning algorithm (dubbed i-KNN) which is able to filter the initial fingerprint dataset (i.e., the radiomap), after considering the proximity of RSS fingerprints with respect to the BLE devices. In this way, i-KNN provides an optimised small subset of possible user locations, based on which it finally estimates the user position. The proposed methodology achieves fast positioning estimation due to the utilization of a fragment of the initial fingerprint dataset, while at the same time improves positioning accuracy by minimizing any calculation errors.
Integrated Sensing Processor, Phase 2
2005-12-01
performance analysis for several baseline classifiers including neural nets, linear classifiers, and kNN classifiers. Use of CCDR as a preprocessing step...below the level of the benchmark non-linear classifier for this problem ( kNN ). Furthermore, the CCDR preconditioned kNN achieved a 10% improvement over...the benchmark kNN without CCDR. Finally, we found an important connection between intrinsic dimension estimation via entropic graphs and the optimal
Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis
NASA Astrophysics Data System (ADS)
Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan
2017-10-01
This research paper aims to propose a hybrid of ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as feature selections for selecting and choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper will also discuss the significance test, which was used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. This study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using the parametric statistical significance tests. The evaluation process has statistically proven that this ACO-KNN algorithm has been significantly improved compared to the baseline algorithms. The evaluation process has statistically proven that this ACO-KNN algorithm has been significantly improved compared to the baseline algorithms. In addition, the experimental results have proven that the ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain quality, optimal feature subset that can represent the actual data in customer review data.
Latifoğlu, Fatma; Polat, Kemal; Kara, Sadik; Güneş, Salih
2008-02-01
In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and Artificial Immune Recognition System (AIRS) for diagnosis of atherosclerosis from Carotid Artery Doppler Signals. The suggested system consists of four stages. First, in the feature extraction stage, we have obtained the features related with atherosclerosis disease using Fast Fourier Transformation (FFT) modeling and by calculating of maximum frequency envelope of sonograms. Second, in the dimensionality reduction stage, the 61 features of atherosclerosis disease have been reduced to 4 features using PCA. Third, in the pre-processing stage, we have weighted these 4 features using different values of k in a new weighting scheme based on k-NN based weighting pre-processing. Finally, in the classification stage, AIRS classifier has been used to classify subjects as healthy or having atherosclerosis. Hundred percent of classification accuracy has been obtained by the proposed system using 10-fold cross validation. This success shows that the proposed system is a robust and effective system in diagnosis of atherosclerosis disease.
Detecting the Difficulty Level of Foreign Language Texts
2010-02-01
continuous tenses), as well as part- of- speech labels for words. The authors used a k-Nearest Neighbor ( kNN ) classifier (Cover and Hart, 1967; Mitchell, 1997...anticipate, and influence these situations and to operate in them is found in foreign language speech and text. For this reason, military linguists are...the language model system, LGR is the prediction of one of the grammar-based classifiers, and CkNN is a confidence value of the kNN prediction for the
Evaluation of Data Processing Techniques for Unobtrusive Gait Authentication
2014-03-01
scatter plot depicting the performance of kNN , by TER, on all experimental mixtures...30 Table 9. Mean TER of SVM and kNN performance with different voting parameters...performance on XYZ-axis data. ...........................................................51 Table 19. kNN and SVM results in back pocket carrying
Fabrication of transparent lead-free KNN glass ceramics by incorporation method
2012-01-01
The incorporation method was employed to produce potassium sodium niobate [KNN] (K0.5Na0.5NbO3) glass ceramics from the KNN-SiO2 system. This incorporation method combines a simple mixed-oxide technique for producing KNN powder and a conventional melt-quenching technique to form the resulting glass. KNN was calcined at 800°C and subsequently mixed with SiO2 in the KNN:SiO2 ratio of 75:25 (mol%). The successfully produced optically transparent glass was then subjected to a heat treatment schedule at temperatures ranging from 525°C -575°C for crystallization. All glass ceramics of more than 40% transmittance crystallized into KNN nanocrystals that were rectangular in shape and dispersed well throughout the glass matrix. The crystal size and crystallinity were found to increase with increasing heat treatment temperature, which in turn plays an important role in controlling the properties of the glass ceramics, including physical, optical, and dielectric properties. The transparency of the glass samples decreased with increasing crystal size. The maximum room temperature dielectric constant (εr) was as high as 474 at 10 kHz with an acceptable low loss (tanδ) around 0.02 at 10 kHz. PMID:22340426
Chen, Wei; Yu, Zunxiong; Pang, Jinshan; Yu, Peng; Tan, Guoxin; Ning, Chengyun
2017-01-01
The discovery of piezoelectricity in natural bone has attracted extensive research in emulating biological electricity for various tissue regeneration. Here, we carried out experiments to build biocompatible potassium sodium niobate (KNN) ceramics. Then, influence substrate surface charges on bovine serum albumin (BSA) protein adsorption and cell proliferation on KNN ceramics surfaces was investigated. KNN ceramics with piezoelectric constant of ~93 pC/N and relative density of ~93% were fabricated. The adsorption of protein on the positive surfaces (Ps) and negative surfaces (Ns) of KNN ceramics with piezoelectric constant of ~93 pC/N showed greater protein adsorption capacity than that on non-polarized surfaces (NPs). Biocompatibility of KNN ceramics was verified through cell culturing and live/dead cell staining of MC3T3. The cells experiment showed enhanced cell growth on the positive surfaces (Ps) and negative surfaces (Ns) compared to non-polarized surfaces (NPs). These results revealed that KNN ceramics had great potential to be used to understand the effect of surface potential on cells processes and would benefit future research in designing piezoelectric materials for tissue regeneration. PMID:28772704
Chen, Wei; Yu, Zunxiong; Pang, Jinshan; Yu, Peng; Tan, Guoxin; Ning, Chengyun
2017-03-26
The discovery of piezoelectricity in natural bone has attracted extensive research in emulating biological electricity for various tissue regeneration. Here, we carried out experiments to build biocompatible potassium sodium niobate (KNN) ceramics. Then, influence substrate surface charges on bovine serum albumin (BSA) protein adsorption and cell proliferation on KNN ceramics surfaces was investigated. KNN ceramics with piezoelectric constant of ~93 pC/N and relative density of ~93% were fabricated. The adsorption of protein on the positive surfaces (Ps) and negative surfaces (Ns) of KNN ceramics with piezoelectric constant of ~93 pC/N showed greater protein adsorption capacity than that on non-polarized surfaces (NPs). Biocompatibility of KNN ceramics was verified through cell culturing and live/dead cell staining of MC3T3. The cells experiment showed enhanced cell growth on the positive surfaces (Ps) and negative surfaces (Ns) compared to non-polarized surfaces (NPs). These results revealed that KNN ceramics had great potential to be used to understand the effect of surface potential on cells processes and would benefit future research in designing piezoelectric materials for tissue regeneration.
GPU based cloud system for high-performance arrhythmia detection with parallel k-NN algorithm.
Tae Joon Jun; Hyun Ji Park; Hyuk Yoo; Young-Hak Kim; Daeyoung Kim
2016-08-01
In this paper, we propose an GPU based Cloud system for high-performance arrhythmia detection. Pan-Tompkins algorithm is used for QRS detection and we optimized beat classification algorithm with K-Nearest Neighbor (K-NN). To support high performance beat classification on the system, we parallelized beat classification algorithm with CUDA to execute the algorithm on virtualized GPU devices on the Cloud system. MIT-BIH Arrhythmia database is used for validation of the algorithm. The system achieved about 93.5% of detection rate which is comparable to previous researches while our algorithm shows 2.5 times faster execution time compared to CPU only detection algorithm.
Webcam mouse using face and eye tracking in various illumination environments.
Lin, Yuan-Pin; Chao, Yi-Ping; Lin, Chung-Chih; Chen, Jyh-Horng
2005-01-01
Nowadays, due to enhancement of computer performance and popular usage of webcam devices, it has become possible to acquire users' gestures for the human-computer-interface with PC via webcam. However, the effects of illumination variation would dramatically decrease the stability and accuracy of skin-based face tracking system; especially for a notebook or portable platform. In this study we present an effective illumination recognition technique, combining K-Nearest Neighbor classifier and adaptive skin model, to realize the real-time tracking system. We have demonstrated that the accuracy of face detection based on the KNN classifier is higher than 92% in various illumination environments. In real-time implementation, the system successfully tracks user face and eyes features at 15 fps under standard notebook platforms. Although KNN classifier only initiates five environments at preliminary stage, the system permits users to define and add their favorite environments to KNN for computer access. Eventually, based on this efficient tracking algorithm, we have developed a "Webcam Mouse" system to control the PC cursor using face and eye tracking. Preliminary studies in "point and click" style PC web games also shows promising applications in consumer electronic markets in the future.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Lingyan, E-mail: l.y.wang@mail.xjtu.edu.cn, E-mail: wren@mail.xjtu.edu.cn; Ren, Wei, E-mail: l.y.wang@mail.xjtu.edu.cn, E-mail: wren@mail.xjtu.edu.cn; Shi, Peng
Lead-free ferroelectric un-doped and doped K{sub 0.5}Na{sub 0.5}NbO{sub 3} (KNN) films with different amounts of manganese (Mn) were prepared by a chemical solution deposition method. The thicknesses of all films are about 1.6 μm. Their phase, microstructure, leakage current behavior, and electrical properties were investigated. With increasing the amounts of Mn, the crystallinity became worse. Fortunately, the electrical properties were improved due to the decreased leakage current density after Mn-doping. The study on leakage behaviors shows that the dominant conduction mechanism at low electric field in the un-doped KNN film is ohmic mode and that at high electric field is space-charge-limitedmore » and Pool-Frenkel emission. After Mn doping, the dominant conduction mechanism at high electric field of KNN films changed single space-charge-limited. However, the introduction of higher amount of Mn into the KNN film would lead to a changed conduction mechanism from space-charge-limited to ohmic mode. Consequently, there exists an optimal amount of Mn doping of 2.0 mol. %. The 2.0 mol. % Mn doped KNN film shows the lowest leakage current density and the best electrical properties. With the secondary ion mass spectroscopies and x-ray photoelectron spectroscopy analyses, the homogeneous distribution in the KNN films and entrance of Mn element in the lattice of KNN perovskite structure were also confirmed.« less
Mahajan, Amit; Pinho, Rui; Dolhen, Morgane; Costa, M Elisabete; Vilarinho, Paula M
2016-05-31
A current challenge for the fabrication of functional oxide-based devices is related with the need of environmental and sustainable materials and processes. By considering both lead-free ferroelectrics of potassium sodium niobate (K0.5Na0.5NbO3, KNN) and aqueous-based electrophoretic deposition here we demonstrate that an eco-friendly aqueous solution-based process can be used to produce KNN thick coatings with improved electromechanical performance. KNN thick films on platinum substrates with thickness varying between 10 and 15 μm have a dielectric permittivity of 495, dielectric losses of 0.08 at 1 MHz, and a piezoelectric coefficient d33 of ∼70 pC/N. At TC these films display a relative permittivity of 2166 and loss tangent of 0.11 at 1 MHz. A comparison of the physical properties between these films and their bulk ceramics counterparts demonstrates the impact of the aqueous-based electrophoretic deposition (EPD) technique for the preparation of lead-free ferroelectric thick films. This opens the door to the possible development of high-performance, lead-free piezoelectric thick films by a sustainable low-cost process, expanding the applicability of lead-free piezoelectrics.
A software package for interactive motor unit potential classification using fuzzy k-NN classifier.
Rasheed, Sarbast; Stashuk, Daniel; Kamel, Mohamed
2008-01-01
We present an interactive software package for implementing the supervised classification task during electromyographic (EMG) signal decomposition process using a fuzzy k-NN classifier and utilizing the MATLAB high-level programming language and its interactive environment. The method employs an assertion-based classification that takes into account a combination of motor unit potential (MUP) shapes and two modes of use of motor unit firing pattern information: the passive and the active modes. The developed package consists of several graphical user interfaces used to detect individual MUP waveforms from a raw EMG signal, extract relevant features, and classify the MUPs into motor unit potential trains (MUPTs) using assertion-based classifiers.
1991-12-01
user using the ’: knn ’ option in the do-scenario command line). An instance of the K-Nearest Neighbor object is first created and initialized before...Navigation Computer HF High Frequency ILS Instrument Landing System KNN K - Nearest Neighbor LRU Line Replaceable Unit MC Mission Computer MTCA...approaches have been investigated here, K-nearest Neighbors ( KNN ) and neural networks (NN). Both approaches require that previously classified examples of
Minimum Expected Risk Estimation for Near-neighbor Classification
2006-04-01
We consider the problems of class probability estimation and classification when using near-neighbor classifiers, such as k-nearest neighbors ( kNN ...estimate for weighted kNN classifiers with different prior information, for a broad class of risk functions. Theory and simulations show how significant...the difference is compared to the standard maximum likelihood weighted kNN estimates. Comparisons are made with uniform weights, symmetric weights
Portable Language-Independent Adaptive Translation from OCR. Phase 1
2009-04-01
including brute-force k-Nearest Neighbors ( kNN ), fast approximate kNN using hashed k-d trees, classification and regression trees, and locality...achieved by refinements in ground-truthing protocols. Recent algorithmic improvements to our approximate kNN classifier using hashed k-D trees allows...recent years discriminative training has been shown to outperform phonetic HMMs estimated using ML for speech recognition. Standard ML estimation
Secure and Efficient k-NN Queries⋆
Asif, Hafiz; Vaidya, Jaideep; Shafiq, Basit; Adam, Nabil
2017-01-01
Given the morass of available data, ranking and best match queries are often used to find records of interest. As such, k-NN queries, which give the k closest matches to a query point, are of particular interest, and have many applications. We study this problem in the context of the financial sector, wherein an investment portfolio database is queried for matching portfolios. Given the sensitivity of the information involved, our key contribution is to develop a secure k-NN computation protocol that can enable the computation k-NN queries in a distributed multi-party environment while taking domain semantics into account. The experimental results show that the proposed protocols are extremely efficient. PMID:29218333
Lip reading using neural networks
NASA Astrophysics Data System (ADS)
Kalbande, Dhananjay; Mishra, Akassh A.; Patil, Sanjivani; Nirgudkar, Sneha; Patel, Prashant
2011-10-01
Computerized lip reading, or speech reading, is concerned with the difficult task of converting a video signal of a speaking person to written text. It has several applications like teaching deaf and dumb to speak and communicate effectively with the other people, its crime fighting potential and invariance to acoustic environment. We convert the video of the subject speaking vowels into images and then images are further selected manually for processing. However, several factors like fast speech, bad pronunciation, and poor illumination, movement of face, moustaches and beards make lip reading difficult. Contour tracking methods and Template matching are used for the extraction of lips from the face. K Nearest Neighbor algorithm is then used to classify the 'speaking' images and the 'silent' images. The sequence of images is then transformed into segments of utterances. Feature vector is calculated on each frame for all the segments and is stored in the database with properly labeled class. Character recognition is performed using modified KNN algorithm which assigns more weight to nearer neighbors. This paper reports the recognition of vowels using KNN algorithms
Potential of near-infrared spectroscopy for quality evaluation of cattle leather.
Braz, Carlos Eduardo M; Jacinto, Manuel Antonio C; Pereira-Filho, Edenir R; Souza, Gilberto B; Nogueira, Ana Rita A
2018-05-09
Models using near-infrared spectroscopy (NIRS) were constructed based on physical-mechanical tests to determine the quality of cattle leather. The following official parameters were used, considering the industry requirements: tensile strength (TS), percentage elongation (%E), tear strength (TT), and double hole tear strength (DHS). Classification models were constructed with the use of k-nearest neighbor (kNN), soft independent modeling of class analogy (SIMCA), and partial least squares-discriminant analysis (PLS-DA). The evaluated figures of merit, accuracy, sensitivity, and specificity presented results between 85% and 93%, and the false alarm rates from 9% to 14%. The model with lowest validation percentage (92%) was kNN, and the highest was PLS-DA (100%). For TS, lower values were obtained, from 52% for kNN and 74% for SIMCA. The other parameters %E, TT, and DHS presented hit rates between 87 and 100%. The abilities of the models were similar, showing they can be used to predict the quality of cattle leather. Copyright © 2018 Elsevier B.V. All rights reserved.
Ni, Li-Jun; Luan, Shao-Rong; Zhang, Li-Guo
2016-10-01
Because of the numerous varieties of herbal species and active ingredients in the traditional Chinese medicine(TCM),the traditional methods employed could hardly satisfy the current determination requirements of TCM.The present work proposed an idea to realize rapid determination of the quality of TCM based on near infrared(NIR)spectroscopy and internet sharing mode. Low cost and portable multi-source composite spectrometer was invented by our group for in-site fast measurement of spectra of TCM samples. The database could be set up by sharing spectra and quality detection data of TCM samples among TCM enterprises based on the internet platform.A novel method called as keeping same relationship between X and Y space based on K nearest neighbors(KNN-KSR for short)was applied to predict the contents of effective compounds of the samples. In addition,a comparative study between KNN-KSR and partial least squares(PLS)was conducted. Two datasets were applied to validate above idea:one was about 58 Ginkgo Folium samples samples measured with four near-infrared spectroscopy instruments and two multi-source composite spectrometers,another one was about 80 corn samples available online measured with three NIR instruments. The results show that the KNN-KSR method could obtain more reliable outcomes without correcting spectrum.However transforming the PLS models to other instruments could hardly acquire better predictive results until spectral calibration is performed. Meanwhile,the similar analysis results of total flavonoids and total lactones of Ginkgo Folium samples are achieved on the multi-source composite spectrometers and near-infrared spectroscopy instruments,and the prediction results of KNN-KSR are better than PLS. The idea proposed in present study is in urgent need of more samples spectra, and then to be verified by more case studies. Copyright© by the Chinese Pharmaceutical Association.
Three Dimensional Object Recognition Using a Complex Autoregressive Model
1993-12-01
3.4.2 Template Matching Algorithm ...................... 3-16 3.4.3 K-Nearest-Neighbor ( KNN ) Techniques ................. 3-25 3.4.4 Hidden Markov Model...Neighbor ( KNN ) Test Results ...................... 4-13 4.2.1 Single-Look 1-NN Testing .......................... 4-14 4.2.2 Multiple-Look 1-NN Testing...4-15 4.2.3 Discussion of KNN Test Results ...................... 4-15 4.3 Hidden Markov Model (HMM) Test Results
On Algorithms for Generating Computationally Simple Piecewise Linear Classifiers
1989-05-01
suffers. - Waveform classification, e.g. speech recognition, seismic analysis (i.e. discrimination between earthquakes and nuclear explosions), target...assuming Gaussian distributions (B-G) d) Bayes classifier with probability densities estimated with the k-N-N method (B- kNN ) e) The -arest neighbour...range of classifiers are chosen including a fast, easy computable and often used classifier (B-G), reliable and complex classifiers (B- kNN and NNR
The Use of Fuzzy Set Classification for Pattern Recognition of the Polygraph
1993-12-01
actual feature extraction was done, It was decided to use the K-nearest neighbor ( KNN ) the data was preprocessed. The electrocardiogram classifier in...showing heart pulse, and a low frequency not known beforehand, and the KNN classifier does not component showing blood volume. The derivative of...the characteristics of the conventional KNN these six derived signals were detrended and filtered, classification method is that it assigns each
Huan, Yu; Wang, Xiaohui; Koruza, Jurij; Wang, Ke; Webber, Kyle G.; Hao, Yanan; Li, Longtu
2016-01-01
Miniaturization of domains to the nanometer scale has been previously reported in many piezoelectrics with two-phase coexistence. Despite the observation of nanoscale domain configuration near the polymorphic phase transition (PPT) regionin virgin (K0.5Na0.5)NbO3 (KNN) based ceramics, it remains unclear how this domain state responds to external loads and influences the macroscopic electro-mechanical properties. To this end, the electric-field-induced and stress-induced strain curves of KNN-based ceramics over a wide compositional range across PPT were characterized. It was found that the coercive field of the virgin samples was highest in PPT region, which was related to the inhibited domain wall motion due to the presence of nanodomains. However, the coercive field was found to be the lowest in the PPT region after electrical poling. This was related to the irreversible transformation of the nanodomains into micron-sized domains during the poling process. With the similar micron-sized domain configuration for all poled ceramics, the domains in the PPT region move more easily due to the additional polarization vectors. The results demonstrate that the poling process can give rise to the irreversible domain configuration transformation and then account for the inverted macroscopic piezoelectricity in the PPT region of KNN-based ceramics. PMID:26915972
Rajagopal, Rekha; Ranganathan, Vidhyapriya
2018-06-05
Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.
Weighted Parzen Windows for Pattern Classification
1994-05-01
Nearest-Neighbor Rule The k-Nearest-Neighbor ( kNN ) technique is nonparametric, assuming nothing about the distribution of the data. Stated succinctly...probabilities P(wj I x) from samples." Raudys and Jain [20:255] advance this interpretation by pointing out that the kNN technique can be viewed as the...34Parzen window classifier with a hyper- rectangular window function." As with the Parzen-window technique, the kNN classifier is more accurate as the
NASA Astrophysics Data System (ADS)
Yamada, H.; Matsuoka, T.; Kozuka, H.; Yamazaki, M.; Ohbayashi, K.; Ida, T.
2015-06-01
Two phases of (K,Na)NbO3 (KNN) co-exist in a KNN-based composite lead-free piezoelectric ceramic 0.910(K1-xNax)0.86Ca0.04Li0.02Nb0.85O3-δ-0.042K0.85Ti0.85Nb1.15O5-0.036BaZrO3-0.0016Co3O4- 0.0025Fe2O3-0.0069ZnO system, over a wide range of Na fractions, where 0.56 ≤ x ≤ 0.75. The crystal systems of the two KNN phases are identified to tetragonal and orthorhombic by analyzing the synchrotron powder X-ray diffraction (XRD) data, high-resolution transmission electron microscopy (HR-TEM), and selected-area electron diffraction (SAD). In the range 0.33 ≤ x ≤ 0.50, the main component of the composite system is found to be single-phase KNN with a tetragonal structure. Granular nanodomains of the orthorhombic phase dispersed in the tetragonal matrix have been identified by HR-TEM and SAD for 0.56 ≤ x ≤ 0.75. Only a trace amount of the orthorhombic phase has been found in the SAD patterns at the composition x = 0.56. However, the number of orthorhombic nanodomains gradually increases with increasing Na content up to x < 0.75, as observed from the HR-TEM images. An abrupt increase and agglomeration of the nanodomains are observed at x = 0.75, where weak diffraction peaks of the orthorhombic phase have also become detectable from the XRD data. The maximum value of the electromechanical coupling coefficient, kp = 0.56, has been observed at the composition x = 0.56.
NASA Astrophysics Data System (ADS)
Jin, Wenchao; Wang, Zhao; Li, Meng; He, Yahua; Hu, Xiaokang; Li, Luying; Gao, Yihua; Hu, Yongming; Gu, Haoshuang; Wang, Xiaolin
2018-04-01
Lead-free (K,Na)NbO3 (KNN) nanorod arrays were synthesized with the assistance of a Nb: SrTiO3 single-crystal substrate through the hydrothermal process. The evolutions of the morphology, composition, and structure of the as-synthesized KNN nanorods with the increase in reaction time were investigated. The results confirmed that the increase in reaction time up to 3 h led to the increase in the length and aspect ratio of the well-aligned KNN nanorods. All samples have K-rich orthorhombic crystal structures, while the diffraction peaks shifted towards a higher degree. The peak shifts should be attributed to the increase in the Na content in the KNN lattice, which could decrease the lattice parameters owing to the small ionic radius of Na+ than that of K+. Moreover, the increase in reaction time also resulted in the suppression of oxygen vacancies on the surface of the KNN nanorods. These evolutions of the composition and crystal structure, as well as the decrease in the defect content, lead to great enhancement of the nanorod's piezoelectric response, as their d33 value was increased from 19 to 64 pm/V. These results demonstrated the significant impact of reaction time on the hydrothermal growth of high-performance lead-free KNN one-dimensional nanomaterials.
Optimal Detection Range of RFID Tag for RFID-based Positioning System Using the k-NN Algorithm.
Han, Soohee; Kim, Junghwan; Park, Choung-Hwan; Yoon, Hee-Cheon; Heo, Joon
2009-01-01
Positioning technology to track a moving object is an important and essential component of ubiquitous computing environments and applications. An RFID-based positioning system using the k-nearest neighbor (k-NN) algorithm can determine the position of a moving reader from observed reference data. In this study, the optimal detection range of an RFID-based positioning system was determined on the principle that tag spacing can be derived from the detection range. It was assumed that reference tags without signal strength information are regularly distributed in 1-, 2- and 3-dimensional spaces. The optimal detection range was determined, through analytical and numerical approaches, to be 125% of the tag-spacing distance in 1-dimensional space. Through numerical approaches, the range was 134% in 2-dimensional space, 143% in 3-dimensional space.
Vrooman, Henri A; Cocosco, Chris A; van der Lijn, Fedde; Stokking, Rik; Ikram, M Arfan; Vernooij, Meike W; Breteler, Monique M B; Niessen, Wiro J
2007-08-01
Conventional k-Nearest-Neighbor (kNN) classification, which has been successfully applied to classify brain tissue in MR data, requires training on manually labeled subjects. This manual labeling is a laborious and time-consuming procedure. In this work, a new fully automated brain tissue classification procedure is presented, in which kNN training is automated. This is achieved by non-rigidly registering the MR data with a tissue probability atlas to automatically select training samples, followed by a post-processing step to keep the most reliable samples. The accuracy of the new method was compared to rigid registration-based training and to conventional kNN-based segmentation using training on manually labeled subjects for segmenting gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in 12 data sets. Furthermore, for all classification methods, the performance was assessed when varying the free parameters. Finally, the robustness of the fully automated procedure was evaluated on 59 subjects. The automated training method using non-rigid registration with a tissue probability atlas was significantly more accurate than rigid registration. For both automated training using non-rigid registration and for the manually trained kNN classifier, the difference with the manual labeling by observers was not significantly larger than inter-observer variability for all tissue types. From the robustness study, it was clear that, given an appropriate brain atlas and optimal parameters, our new fully automated, non-rigid registration-based method gives accurate and robust segmentation results. A similarity index was used for comparison with manually trained kNN. The similarity indices were 0.93, 0.92 and 0.92, for CSF, GM and WM, respectively. It can be concluded that our fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.
Improving the accuracy of k-nearest neighbor using local mean based and distance weight
NASA Astrophysics Data System (ADS)
Syaliman, K. U.; Nababan, E. B.; Sitompul, O. S.
2018-03-01
In k-nearest neighbor (kNN), the determination of classes for new data is normally performed by a simple majority vote system, which may ignore the similarities among data, as well as allowing the occurrence of a double majority class that can lead to misclassification. In this research, we propose an approach to resolve the majority vote issues by calculating the distance weight using a combination of local mean based k-nearest neighbor (LMKNN) and distance weight k-nearest neighbor (DWKNN). The accuracy of results is compared to the accuracy acquired from the original k-NN method using several datasets from the UCI Machine Learning repository, Kaggle and Keel, such as ionosphare, iris, voice genre, lower back pain, and thyroid. In addition, the proposed method is also tested using real data from a public senior high school in city of Tualang, Indonesia. Results shows that the combination of LMKNN and DWKNN was able to increase the classification accuracy of kNN, whereby the average accuracy on test data is 2.45% with the highest increase in accuracy of 3.71% occurring on the lower back pain symptoms dataset. For the real data, the increase in accuracy is obtained as high as 5.16%.
Multi-site Stochastic Simulation of Daily Streamflow with Markov Chain and KNN Algorithm
NASA Astrophysics Data System (ADS)
Mathai, J.; Mujumdar, P.
2017-12-01
A key focus of this study is to develop a method which is physically consistent with the hydrologic processes that can capture short-term characteristics of daily hydrograph as well as the correlation of streamflow in temporal and spatial domains. In complex water resource systems, flow fluctuations at small time intervals require that discretisation be done at small time scales such as daily scales. Also, simultaneous generation of synthetic flows at different sites in the same basin are required. We propose a method to equip water managers with a streamflow generator within a stochastic streamflow simulation framework. The motivation for the proposed method is to generate sequences that extend beyond the variability represented in the historical record of streamflow time series. The method has two steps: In step 1, daily flow is generated independently at each station by a two-state Markov chain, with rising limb increments randomly sampled from a Gamma distribution and the falling limb modelled as exponential recession and in step 2, the streamflow generated in step 1 is input to a nonparametric K-nearest neighbor (KNN) time series bootstrap resampler. The KNN model, being data driven, does not require assumptions on the dependence structure of the time series. A major limitation of KNN based streamflow generators is that they do not produce new values, but merely reshuffle the historical data to generate realistic streamflow sequences. However, daily flow generated using the Markov chain approach is capable of generating a rich variety of streamflow sequences. Furthermore, the rising and falling limbs of daily hydrograph represent different physical processes, and hence they need to be modelled individually. Thus, our method combines the strengths of the two approaches. We show the utility of the method and improvement over the traditional KNN by simulating daily streamflow sequences at 7 locations in the Godavari River basin in India.
Handwritten digits recognition based on immune network
NASA Astrophysics Data System (ADS)
Li, Yangyang; Wu, Yunhui; Jiao, Lc; Wu, Jianshe
2011-11-01
With the development of society, handwritten digits recognition technique has been widely applied to production and daily life. It is a very difficult task to solve these problems in the field of pattern recognition. In this paper, a new method is presented for handwritten digit recognition. The digit samples firstly are processed and features extraction. Based on these features, a novel immune network classification algorithm is designed and implemented to the handwritten digits recognition. The proposed algorithm is developed by Jerne's immune network model for feature selection and KNN method for classification. Its characteristic is the novel network with parallel commutating and learning. The performance of the proposed method is experimented to the handwritten number datasets MNIST and compared with some other recognition algorithms-KNN, ANN and SVM algorithm. The result shows that the novel classification algorithm based on immune network gives promising performance and stable behavior for handwritten digits recognition.
NASA Astrophysics Data System (ADS)
Ahn, C. W.; Y Lee, S.; Lee, H. J.; Ullah, A.; Bae, J. S.; Jeong, E. D.; Choi, J. S.; Park, B. H.; Kim, I. W.
2009-11-01
We have fabricated K0.5Na0.5NbO3 (KNN) thin films on Pt substrates by a chemical solution deposition method and investigated the effect of K and Na excess (0-30 mol%) on ferroelectric and piezoelectric properties of KNN thin film. It was found that with increasing K and Na excess in a precursor solution from 0 to 30 mol%, the leakage current and ferroelectric properties were strongly affected. KNN thin film synthesized by using 20 mol% K and Na excess precursor solution exhibited a low leakage current density and well saturated ferroelectric P-E hysteresis loops. Moreover, the optimized KNN thin film had good fatigue resistance and a piezoelectric constant of 40 pm V-1, which is comparable to that of polycrystalline PZT thin films.
Zhu, Benpeng; Zhang, Zhiqiang; Ma, Teng; Yang, Xiaofei; Li, Yongxiang; Shung, K. Kirk; Zhou, Qifa
2015-01-01
Using tape-casting technology, 35 μm free-standing (100)-textured Li doped KNN (KNLN) thick film was prepared by employing NaNbO3 (NN) as template. It exhibited similar piezoelectric behavior to lead containing materials: a longitudinal piezoelectric coefficient (d33) of ∼150 pm/V and an electromechanical coupling coefficient (kt) of 0.44. Based on this thick film, a 52 MHz side-looking miniature transducer with a bandwidth of 61.5% at −6 dB was built for Intravascular ultrasound (IVUS) imaging. In comparison with 40 MHz PMN-PT single crystal transducer, the rabbit aorta image had better resolution and higher noise-to-signal ratio, indicating that lead-free (100)-textured KNLN thick film may be suitable for IVUS (>50 MHz) imaging. PMID:25991874
Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance
NASA Astrophysics Data System (ADS)
Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi
2017-11-01
K-nearest neighbors (KNN) algorithm is a common algorithm used for classification, and also a sub-routine in various complicated machine learning tasks. In this paper, we presented a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing Hamming distance between testing sample and each feature vector in the training set. Taking advantage of this method, we realized a good analog for classical KNN algorithm by setting a distance threshold value t to select k - n e a r e s t neighbors. As a result, QKNN achieves O( n 3) performance which is only relevant to the dimension of feature vectors and high classification accuracy, outperforms Llyod's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
NASA Astrophysics Data System (ADS)
Naghibi, Seyed Amir; Moradi Dashtpagerdi, Mostafa
2017-01-01
One important tool for water resources management in arid and semi-arid areas is groundwater potential mapping. In this study, four data-mining models including K-nearest neighbor (KNN), linear discriminant analysis (LDA), multivariate adaptive regression splines (MARS), and quadric discriminant analysis (QDA) were used for groundwater potential mapping to get better and more accurate groundwater potential maps (GPMs). For this purpose, 14 groundwater influence factors were considered, such as altitude, slope angle, slope aspect, plan curvature, profile curvature, slope length, topographic wetness index (TWI), stream power index, distance from rivers, river density, distance from faults, fault density, land use, and lithology. From 842 springs in the study area, in the Khalkhal region of Iran, 70 % (589 springs) were considered for training and 30 % (253 springs) were used as a validation dataset. Then, KNN, LDA, MARS, and QDA models were applied in the R statistical software and the results were mapped as GPMs. Finally, the receiver operating characteristics (ROC) curve was implemented to evaluate the performance of the models. According to the results, the area under the curve of ROCs were calculated as 81.4, 80.5, 79.6, and 79.2 % for MARS, QDA, KNN, and LDA, respectively. So, it can be concluded that the performances of KNN and LDA were acceptable and the performances of MARS and QDA were excellent. Also, the results depicted high contribution of altitude, TWI, slope angle, and fault density, while plan curvature and land use were seen to be the least important factors.
Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification
1999-05-17
Experimental Results In this section, we compare kNN -mut which uses the weight vector obtained using mutual information as the fi- nal weight vector and...WAKNN against kNN , C4.5 [Qui93], RIPPER [Coh95], PEBLS [CS93], Rainbow [McC96], VSM [Low95] on several synthetic and real data sets. VSM is another k...obtained without this option. 3 C4.5 RIPPER PEBLS Rainbow kNN WAKNN Syn-1 100.0 100.0 100.0 100.0 77.3 100.0 Syn-2 67.5 69.5 62.0 50.0 66.0 68.8 Syn
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Yajing; Chen, Yan; Chen, Kepi
In this study, the effects of doping with GeO 2 on the synthesis temperature, phase structure and morphology of (K 0.5Na 0.5)NbO 3 (KNN) ceramic powders were studied using XRD and SEM. The results show that KNN powders with good crystallinity and compositional homogeneity can be obtained after calcination at up to 900°C for 2 h. Introducing 0.5 mol.% GeO 2 into the starting mixture improved the synthesis of the KNN powders and allowed the calcination temperature to be decreased to 800°C, which can be ascribed to the formation of the liquid phase during the synthesis.
An automated algorithm for determining photometric redshifts of quasars
NASA Astrophysics Data System (ADS)
Wang, Dan; Zhang, Yanxia; Zhao, Yongheng
2010-07-01
We employ k-nearest neighbor algorithm (KNN) for photometric redshift measurement of quasars with the Fifth Data Release (DR5) of the Sloan Digital Sky Survey (SDSS). KNN is an instance learning algorithm where the result of new instance query is predicted based on the closest training samples. The regressor do not use any model to fit and only based on memory. Given a query quasar, we find the known quasars or (training points) closest to the query point, whose redshift value is simply assigned to be the average of the values of its k nearest neighbors. Three kinds of different colors (PSF, Model or Fiber) and spectral redshifts are used as input parameters, separatively. The combination of the three kinds of colors is also taken as input. The experimental results indicate that the best input pattern is PSF + Model + Fiber colors in all experiments. With this pattern, 59.24%, 77.34% and 84.68% of photometric redshifts are obtained within ▵z < 0.1, 0.2 and 0.3, respectively. If only using one kind of colors as input, the model colors achieve the best performance. However, when using two kinds of colors, the best result is achieved by PSF + Fiber colors. In addition, nearest neighbor method (k = 1) shows its superiority compared to KNN (k ≠ 1) for the given sample.
NASA Astrophysics Data System (ADS)
Taha, Zahari; Muazu Musa, Rabiu; Majeed, Anwar P. P. Abdul; Razali Abdullah, Mohamad; Muaz Alim, Muhammad; Nasir, Ahmad Fakhri Ab
2018-04-01
The present study aims at classifying and predicting high and low potential archers from a collection of psychological coping skills variables trained on different k-Nearest Neighbour (k-NN) kernels. 50 youth archers with the average age and standard deviation of (17.0 ±.056) gathered from various archery programmes completed a one end shooting score test. Psychological coping skills inventory which evaluates the archers level of related coping skills were filled out by the archers prior to their shooting tests. k-means cluster analysis was applied to cluster the archers based on their scores on variables assessed k-NN models, i.e. fine, medium, coarse, cosine, cubic and weighted kernel functions, were trained on the psychological variables. The k-means clustered the archers into high psychologically prepared archers (HPPA) and low psychologically prepared archers (LPPA), respectively. It was demonstrated that the cosine k-NN model exhibited good accuracy and precision throughout the exercise with an accuracy of 94% and considerably fewer error rate for the prediction of the HPPA and the LPPA as compared to the rest of the models. The findings of this investigation can be valuable to coaches and sports managers to recognise high potential athletes from the selected psychological coping skills variables examined which would consequently save time and energy during talent identification and development programme.
CHAI, Lian En; LAW, Chow Kuan; MOHAMAD, Mohd Saberi; CHONG, Chuii Khim; CHOON, Yee Wen; DERIS, Safaai; ILLIAS, Rosli Md
2014-01-01
Background: Gene expression data often contain missing expression values. Therefore, several imputation methods have been applied to solve the missing values, which include k-nearest neighbour (kNN), local least squares (LLS), and Bayesian principal component analysis (BPCA). However, the effects of these imputation methods on the modelling of gene regulatory networks from gene expression data have rarely been investigated and analysed using a dynamic Bayesian network (DBN). Methods: In the present study, we separately imputed datasets of the Escherichia coli S.O.S. DNA repair pathway and the Saccharomyces cerevisiae cell cycle pathway with kNN, LLS, and BPCA, and subsequently used these to generate gene regulatory networks (GRNs) using a discrete DBN. We made comparisons on the basis of previous studies in order to select the gene network with the least error. Results: We found that BPCA and LLS performed better on larger networks (based on the S. cerevisiae dataset), whereas kNN performed better on smaller networks (based on the E. coli dataset). Conclusion: The results suggest that the performance of each imputation method is dependent on the size of the dataset, and this subsequently affects the modelling of the resultant GRNs using a DBN. In addition, on the basis of these results, a DBN has the capacity to discover potential edges, as well as display interactions, between genes. PMID:24876803
NASA Astrophysics Data System (ADS)
Georganos, Stefanos; Grippa, Tais; Vanhuysse, Sabine; Lennert, Moritz; Shimoni, Michal; Wolff, Eléonore
2017-10-01
This study evaluates the impact of three Feature Selection (FS) algorithms in an Object Based Image Analysis (OBIA) framework for Very-High-Resolution (VHR) Land Use-Land Cover (LULC) classification. The three selected FS algorithms, Correlation Based Selection (CFS), Mean Decrease in Accuracy (MDA) and Random Forest (RF) based Recursive Feature Elimination (RFE), were tested on Support Vector Machine (SVM), K-Nearest Neighbor, and Random Forest (RF) classifiers. The results demonstrate that the accuracy of SVM and KNN classifiers are the most sensitive to FS. The RF appeared to be more robust to high dimensionality, although a significant increase in accuracy was found by using the RFE method. In terms of classification accuracy, SVM performed the best using FS, followed by RF and KNN. Finally, only a small number of features is needed to achieve the highest performance using each classifier. This study emphasizes the benefits of rigorous FS for maximizing performance, as well as for minimizing model complexity and interpretation.
Ultrahigh Piezoelectric Properties in Textured (K,Na)NbO3 -Based Lead-Free Ceramics.
Li, Peng; Zhai, Jiwei; Shen, Bo; Zhang, Shujun; Li, Xiaolong; Zhu, Fangyuan; Zhang, Xingmin
2018-02-01
High-performance lead-free piezoelectric materials are in great demand for next-generation electronic devices to meet the requirement of environmentally sustainable society. Here, ultrahigh piezoelectric properties with piezoelectric coefficients (d 33 ≈700 pC N -1 , d 33 * ≈980 pm V -1 ) and planar electromechanical coupling factor (k p ≈76%) are achieved in highly textured (K,Na)NbO 3 (KNN)-based ceramics. The excellent piezoelectric properties can be explained by the strong anisotropic feature, optimized engineered domain configuration in the textured ceramics, and facilitated polarization rotation induced by the intermediate phase. In addition, the nanodomain structures with decreased domain wall energy and increased domain wall mobility also contribute to the ultrahigh piezoelectric properties. This work not only demonstrates the tremendous potential of KNN-based ceramics to replace lead-based piezoelectrics but also provides a good strategy to design high-performance piezoelectrics by controlling appropriate phase and crystallographic orientation. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.
2016-03-01
The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decisionmaking regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically-relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracy of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care, this technology may improve the standard of burn care for patients without access to specialized facilities.
A comparison between skeleton and bounding box models for falling direction recognition
NASA Astrophysics Data System (ADS)
Narupiyakul, Lalita; Srisrisawang, Nitikorn
2017-12-01
Falling is an injury that can lead to a serious medical condition in every range of the age of people. However, in the case of elderly, the risk of serious injury is much higher. Due to the fact that one way of preventing serious injury is to treat the fallen person as soon as possible, several works attempted to implement different algorithms to recognize the fall. Our work compares the performance of two models based on features extraction: (i) Body joint data (Skeleton Data) which are the joint's positions in 3 axes and (ii) Bounding box (Box-size Data) covering all body joints. Machine learning algorithms that were chosen are Decision Tree (DT), Naïve Bayes (NB), K-nearest neighbors (KNN), Linear discriminant analysis (LDA), Voting Classification (VC), and Gradient boosting (GB). The results illustrate that the models trained with Skeleton data are performed far better than those trained with Box-size data (with an average accuracy of 94-81% and 80-75%, respectively). KNN shows the best performance in both Body joint model and Bounding box model. In conclusion, KNN with Body joint model performs the best among the others.
Busarakam, Kanungnid; Bull, Alan T; Trujillo, Martha E; Riesco, Raul; Sangal, Vartul; van Wezel, Gilles P; Goodfellow, Michael
2016-06-01
A polyphasic study was designed to determine the taxonomic provenance of three Modestobacter strains isolated from an extreme hyper-arid Atacama Desert soil. The strains, isolates KNN 45-1a, KNN 45-2b(T) and KNN 45-3b, were shown to have chemotaxonomic and morphological properties in line with their classification in the genus Modestobacter. The isolates had identical 16S rRNA gene sequences and formed a branch in the Modestobacter gene tree that was most closely related to the type strain of Modestobacter marinus (99.6% similarity). All three isolates were distinguished readily from Modestobacter type strains by a broad range of phenotypic properties, by qualitative and quantitative differences in fatty acid profiles and by BOX fingerprint patterns. The whole genome sequence of isolate KNN 45-2b(T) showed 89.3% average nucleotide identity, 90.1% (SD: 10.97%) average amino acid identity and a digital DNA-DNA hybridization value of 42.4±3.1 against the genome sequence of M. marinus DSM 45201(T), values consistent with its assignment to a separate species. On the basis of all of these data, it is proposed that the isolates be assigned to the genus Modestobacter as Modestobacter caceresii sp. nov. with isolate KNN 45-2b(T) (CECT 9023(T)=DSM 101691(T)) as the type strain. Analysis of the whole-genome sequence of M. caceresii KNN 45-2b(T), with 4683 open reading frames and a genome size of ∽4.96Mb, revealed the presence of genes and gene-clusters that encode for properties relevant to its adaptability to harsh environmental conditions prevalent in extreme hyper arid Atacama Desert soils. Copyright © 2016. Published by Elsevier GmbH.
NASA Astrophysics Data System (ADS)
Patterson, Eric Andrew
Much recent research has focused on the development lead-free perovskite piezoelectrics as environmentally compatible alternatives to lead zirconate titanate (PZT). Two main categories of lead free perovskite piezoelectric ceramic systems were investigated as potential replacements to lead zirconate titanate (PZT) for actuator devices. First, solid solutions based on Li, Ta, and Sb modified (K0.5Na0.5)NbO3 (KNN) lead-free perovskite systems were created using standard solid state methods. Secondly, Bi-based materials a variety of compositions were explored for (1-x)(Bi 0.5Na0.5)TiO3-xBi(Zn0.5Ti0.5)O 3 (BNT-BZT) and Bi(Zn0.5Ti0.5)O3-(Bi 0.5K0.5)TiO3-(Bi0.5Na0.5)TiO 3 (BZT-BKT-BNT). It was shown that when BNT-BKT is combined with increasing concentrations of Bi(Zn1/2i1/2)O3 (BZT), a transition from normal ferroelectric behavior to a material with large electric field induced strains was observed. The higher BZT containing compositions are characterized by large hysteretic strains(> 0.3%) with no negative strains that might indicate domain switching. This work summarizes and analyzes the fatigue behavior of the new generation of Pb-free piezoelectric materials. In piezoelectric materials, fatigue is observed as a degradation in the electromechanical properties under the application of a bipolar or unipolar cyclic electrical load. In Pb-based materials such as lead zirconate titanate (PZT), fatigue has been studied in great depth for both bulk and thin film applications. In PZT, fatigue can result from microcracking or electrode effects (especially in thin films). Ultimately, however, it is electronic and ionic point defects that are the most influential mechanism. Therefore, this work also analyzes the fatigue characteristics of bulk polycrystalline ceramics of the modified-KNN and BNT-BKT-BZT compositions developed. The defect chemistry that underpins the fatigue behavior will be examined and the results will be compared to the existing body of work on PZT. It will be demonstrated that while some Pb-free materials show severe property degradation under cyclic loading, other materials such as BNT-BKT-BZT essentially exhibit fatigue- free piezoelectric properties with chemical doping or other modifications. Based on these results, these new Pb-free materials have great potential for use in piezoelectric applications requiring a large number of drive cycles such as MEMS devices or high frequency actuators.
Nanoscale characterization and local piezoelectric properties of lead-free KNN-LT-LS thin films
NASA Astrophysics Data System (ADS)
Abazari, M.; Choi, T.; Cheong, S.-W.; Safari, A.
2010-01-01
We report the observation of domain structure and piezoelectric properties of pure and Mn-doped (K0.44,Na0.52,Li0.04)(Nb0.84,Ta0.1,Sb0.06)O3 (KNN-LT-LS) thin films on SrTiO3 substrates. It is revealed that, using piezoresponse force microscopy, ferroelectric domain structure in such 500 nm thin films comprised of primarily 180° domains. This was in accordance with the tetragonal structure of the films, confirmed by relative permittivity measurements and x-ray diffraction patterns. Effective piezoelectric coefficient (d33) of the films were calculated using piezoelectric displacement curves and shown to be ~53 pm V-1 for pure KNN-LT-LS thin films. This value is among the highest values reported for an epitaxial lead-free thin film and shows a great potential for KNN-LT-LS to serve as an alternative to PZT thin films in future applications.
Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel
2016-10-01
Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply so on an average or on a population level and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. We compare the classification performance of a number of important and widely used machine learning algorithms, namely the Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. For smaller number of correlated features, number of features not exceeding approximately half the sample size, LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and outplays that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provide more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study. © The Author(s) 2013.
K-Nearest Neighbor Algorithm Optimization in Text Categorization
NASA Astrophysics Data System (ADS)
Chen, Shufeng
2018-01-01
K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods of data mining. It has been widely used in classification, regression and pattern recognition. The traditional KNN method has some shortcomings such as large amount of sample computation and strong dependence on the sample library capacity. In this paper, a method of representative sample optimization based on CURE algorithm is proposed. On the basis of this, presenting a quick algorithm QKNN (Quick k-nearest neighbor) to find the nearest k neighbor samples, which greatly reduces the similarity calculation. The experimental results show that this algorithm can effectively reduce the number of samples and speed up the search for the k nearest neighbor samples to improve the performance of the algorithm.
Promotion Assistance Tool for Mobile Phone Users
NASA Astrophysics Data System (ADS)
Intraprasert, P.; Jatikul, N.; Chantrapornchai, C.
In this paper, we propose an application tool to help analyze the usage of a mobile phone for a typical user. From the past usage, the tool can analyze the promotion that is suitable for the user which may save the total expense. The application consists of both client and server side. On the server side, the information for each promotion package for a phone operator is stored as well as the usage database for each client. The client side is a user interface for both phone operators and users to enter their information. The analysis engine are based on KNN, ANN, decision tree and Naïve Bayes models. For comparison, it is shown that KNN and decision outperforms the others.
Optimization of internet content filtering-Combined with KNN and OCAT algorithms
NASA Astrophysics Data System (ADS)
Guo, Tianze; Wu, Lingjing; Liu, Jiaming
2018-04-01
The face of the status quo that rampant illegal content in the Internet, the result of traditional way to filter information, keyword recognition and manual screening, is getting worse. Based on this, this paper uses OCAT algorithm nested by KNN classification algorithm to construct a corpus training library that can dynamically learn and update, which can be improved on the filter corpus for constantly updated illegal content of the network, including text and pictures, and thus can better filter and investigate illegal content and its source. After that, the research direction will focus on the simplified updating of recognition and comparison algorithms and the optimization of the corpus learning ability in order to improve the efficiency of filtering, save time and resources.
NASA Astrophysics Data System (ADS)
Abazari, M.; Safari, A.
2009-05-01
We report the effects of Ba, Ti, and Mn dopants on ferroelectric polarization and leakage current of (K0.44Na0.52Li0.04)(Nb0.84Ta0.1Sb0.06)O3 (KNN-LT-LS) thin films deposited by pulsed laser deposition. It is shown that donor dopants such as Ba2+, which increased the resistivity in bulk KNN-LT-LS, had an opposite effect in the thin film. Ti4+ as an acceptor B-site dopant reduces the leakage current by an order of magnitude, while the polarization values showed a slight degradation. Mn4+, however, was found to effectively suppress the leakage current by over two orders of magnitude while enhancing the polarization, with 15 and 23 μC/cm2 remanent and saturated polarization, whose values are ˜70% and 82% of the reported values for bulk composition. This phenomenon has been associated with the dual effect of Mn4+ in KNN-LT-LS thin film, by substituting both A- and B-site cations. A detailed description on how each dopant affects the concentrations of vacancies in the lattice is presented. Mn-doped KNN-LT-LS thin films are shown to be a promising candidate for lead-free thin films and applications.
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Li, Deng; Lu, Xiaoming; Cheng, Xinyi; Wang, Liwei
2014-10-01
Continuous crystal-based positron emission tomography (PET) detectors could be an ideal alternative for current high-resolution pixelated PET detectors if the issues of high performance γ interaction position estimation and its real-time implementation are solved. Unfortunately, existing position estimators are not very feasible for implementation on field-programmable gate array (FPGA). In this paper, we propose a new self-organizing map neural network-based nearest neighbor (SOM-NN) positioning scheme aiming not only at providing high performance, but also at being realistic for FPGA implementation. Benefitting from the SOM feature mapping mechanism, the large set of input reference events at each calibration position is approximated by a small set of prototypes, and the computation of the nearest neighbor searching for unknown events is largely reduced. Using our experimental data, the scheme was evaluated, optimized and compared with the smoothed k-NN method. The spatial resolutions of full-width-at-half-maximum (FWHM) of both methods averaged over the center axis of the detector were obtained as 1.87 ±0.17 mm and 1.92 ±0.09 mm, respectively. The test results show that the SOM-NN scheme has an equivalent positioning performance with the smoothed k-NN method, but the amount of computation is only about one-tenth of the smoothed k-NN method. In addition, the algorithm structure of the SOM-NN scheme is more feasible for implementation on FPGA. It has the potential to realize real-time position estimation on an FPGA with a high-event processing throughput.
NASA Astrophysics Data System (ADS)
Fernandez Rojas, Raul; Huang, Xu; Ou, Keng-Liang
2017-10-01
Pain diagnosis for nonverbal patients represents a challenge in clinical settings. Neuroimaging methods, such as functional magnetic resonance imaging and functional near-infrared spectroscopy (fNIRS), have shown promising results to assess neuronal function in response to nociception and pain. Recent studies suggest that neuroimaging in conjunction with machine learning models can be used to predict different cognitive tasks. The aim of this study is to expand previous studies by exploring the classification of fNIRS signals (oxyhaemoglobin) according to temperature level (cold and hot) and corresponding pain intensity (low and high) using machine learning models. Toward this aim, we used the quantitative sensory testing to determine pain threshold and pain tolerance to cold and heat in 18 healthy subjects (three females), mean age±standard deviation (31.9±5.5). The classification model is based on the bag-of-words approach, a histogram representation used in document classification based on the frequencies of extracted words and adapted for time series; two learning algorithms were used separately, K-nearest neighbor (K-NN) and support vector machines (SVM). A comparison between two sets of fNIRS channels was also made in the classification task, all 24 channels and 8 channels from the somatosensory region defined as our region of interest (RoI). The results showed that K-NN obtained slightly better results (92.08%) than SVM (91.25%) using the 24 channels; however, the performance slightly dropped using only channels from the RoI with K-NN (91.53%) and SVM (90.83%). These results indicate potential applications of fNIRS in the development of a physiologically based diagnosis of human pain that would benefit vulnerable patients who cannot self-report pain.
NASA Astrophysics Data System (ADS)
Jeong, Chang Kyu; Han, Jae Hyun; Palneedi, Haribabu; Park, Hyewon; Hwang, Geon-Tae; Joung, Boyoung; Kim, Seong-Gon; Shin, Hong Ju; Kang, Il-Suk; Ryu, Jungho; Lee, Keon Jae
2017-07-01
Flexible piezoelectric energy harvesters have been regarded as an overarching candidate for achieving self-powered electronic systems for environmental sensors and biomedical devices using the self-sufficient electrical energy. In this research, we realize a flexible high-output and lead-free piezoelectric energy harvester by using the aerosol deposition method and the laser lift-off process. We also investigated the comprehensive biocompatibility of the lead-free piezoceramic device using ex-vivo ionic elusion and in vivo bioimplantation, as well as in vitro cell proliferation and histologic inspection. The fabricated LiNbO3-doped (K,Na)NbO3 (KNN) thin film-based flexible energy harvester exhibited an outstanding piezoresponse, and average output performance of an open-circuit voltage of ˜130 V and a short-circuit current of ˜1.3 μ A under normal bending and release deformation, which is the best record among previously reported flexible lead-free piezoelectric energy harvesters. Although both the KNN and Pb(Zr,Ti)O3 (PZT) devices showed short-term biocompatibility in cellular and histological studies, excessive Pb toxic ions were eluted from the PZT in human serum and tap water. Moreover, the KNN-based flexible energy harvester was implanted into a porcine chest and generated up to ˜5 V and 700 nA from the heartbeat motion, comparable to the output of previously reported lead-based flexible energy harvesters. This work can compellingly serve to advance the development of piezoelectric energy harvesting for actual and practical biocompatible self-powered biomedical applications beyond restrictions of lead-based materials in long-term physiological and clinical aspects.
Missing portion sizes in FFQ--alternatives to use of standard portions.
Køster-Rasmussen, Rasmus; Siersma, Volkert; Halldorsson, Thorhallur I; de Fine Olivarius, Niels; Henriksen, Jan E; Heitmann, Berit L
2015-08-01
Standard portions or substitution of missing portion sizes with medians may generate bias when quantifying the dietary intake from FFQ. The present study compared four different methods to include portion sizes in FFQ. We evaluated three stochastic methods for imputation of portion sizes based on information about anthropometry, sex, physical activity and age. Energy intakes computed with standard portion sizes, defined as sex-specific medians (median), or with portion sizes estimated with multinomial logistic regression (MLR), 'comparable categories' (Coca) or k-nearest neighbours (KNN) were compared with a reference based on self-reported portion sizes (quantified by a photographic food atlas embedded in the FFQ). The Danish Health Examination Survey 2007-2008. The study included 3728 adults with complete portion size data. Compared with the reference, the root-mean-square errors of the mean daily total energy intake (in kJ) computed with portion sizes estimated by the four methods were (men; women): median (1118; 1061), MLR (1060; 1051), Coca (1230; 1146), KNN (1281; 1181). The equivalent biases (mean error) were (in kJ): median (579; 469), MLR (248; 178), Coca (234; 188), KNN (-340; 218). The methods MLR and Coca provided the best agreement with the reference. The stochastic methods allowed for estimation of meaningful portion sizes by conditioning on information about physiology and they were suitable for multiple imputation. We propose to use MLR or Coca to substitute missing portion size values or when portion sizes needs to be included in FFQ without portion size data.
Improvement in synthesis of (K 0.5Na 0.5)NbO 3 powders by Ge 4+ acceptor doping
Zhao, Yajing; Chen, Yan; Chen, Kepi
2016-11-17
In this study, the effects of doping with GeO 2 on the synthesis temperature, phase structure and morphology of (K 0.5Na 0.5)NbO 3 (KNN) ceramic powders were studied using XRD and SEM. The results show that KNN powders with good crystallinity and compositional homogeneity can be obtained after calcination at up to 900°C for 2 h. Introducing 0.5 mol.% GeO 2 into the starting mixture improved the synthesis of the KNN powders and allowed the calcination temperature to be decreased to 800°C, which can be ascribed to the formation of the liquid phase during the synthesis.
Compositional inhomogeneityand segregation in (K 0.5Na 0.5)NbO 3 ceramics
Chen, Kepi; Tang, Jing; Chen, Yan
2016-03-11
The effects of the calcination temperature of (K 0.5Na 0.5)NbO 3 (KNN) powder on the sintering and piezoelectric properties of KNN ceramics have been investigated in this report. KNN powders are synthesized via the solid-state approach. Scanning electron microscopy and X-ray diffraction characterizations indicate that the incomplete reaction at 700 °C and 750 °C calcination results in the compositional inhomogeneity of the K-rich and Na-rich phases while the orthorhombic single phase is obtained after calcination at 900 °C. During the sintering, the presence of the liquid K-rich phase due to the lower melting point has a significant impact on themore » densification, the abnormal grain growth and the deteriorated piezoelectric properties. From the standpoint of piezoelectric properties, the optimal calcination temperature obtained for KNN ceramics calcined at this temperature is determined to be 800 °C, with piezoelectric constant d 33=128.3 pC/N, planar electromechanical coupling coefficient k p=32.2%, mechanical quality factor Q m=88, and dielectric loss tan δ=2.1%.« less
Shao, Xiaolong; Li, Hui; Wang, Nan; Zhang, Qiang
2015-10-21
An electronic nose (e-nose) was used to characterize sesame oils processed by three different methods (hot-pressed, cold-pressed, and refined), as well as blends of the sesame oils and soybean oil. Seven classification and prediction methods, namely PCA, LDA, PLS, KNN, SVM, LASSO and RF, were used to analyze the e-nose data. The classification accuracy and MAUC were employed to evaluate the performance of these methods. The results indicated that sesame oils processed with different methods resulted in different sensor responses, with cold-pressed sesame oil producing the strongest sensor signals, followed by the hot-pressed sesame oil. The blends of pressed sesame oils with refined sesame oil were more difficult to be distinguished than the blends of pressed sesame oils and refined soybean oil. LDA, KNN, and SVM outperformed the other classification methods in distinguishing sesame oil blends. KNN, LASSO, PLS, and SVM (with linear kernel), and RF models could adequately predict the adulteration level (% of added soybean oil) in the sesame oil blends. Among the prediction models, KNN with k = 1 and 2 yielded the best prediction results.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Kepi; Tang, Jing; Chen, Yan
The effects of the calcination temperature of (K 0.5Na 0.5)NbO 3 (KNN) powder on the sintering and piezoelectric properties of KNN ceramics have been investigated in this report. KNN powders are synthesized via the solid-state approach. Scanning electron microscopy and X-ray diffraction characterizations indicate that the incomplete reaction at 700 °C and 750 °C calcination results in the compositional inhomogeneity of the K-rich and Na-rich phases while the orthorhombic single phase is obtained after calcination at 900 °C. During the sintering, the presence of the liquid K-rich phase due to the lower melting point has a significant impact on themore » densification, the abnormal grain growth and the deteriorated piezoelectric properties. From the standpoint of piezoelectric properties, the optimal calcination temperature obtained for KNN ceramics calcined at this temperature is determined to be 800 °C, with piezoelectric constant d 33=128.3 pC/N, planar electromechanical coupling coefficient k p=32.2%, mechanical quality factor Q m=88, and dielectric loss tan δ=2.1%.« less
Real-time Data Display System of the Korean Neonatal Network
Lee, Byong Sop; Moon, Wi Hwan
2015-01-01
Real-time data reporting in clinical research networks can provide network members through interim analyses of the registered data, which can facilitate further studies and quality improvement activities. The aim of this report was to describe the building process of the data display system (DDS) of the Korean Neonatal Network (KNN) and its basic structure. After member verification at the KNN member's site, users can choose a variable of interest that is listed in the in-hospital data statistics (for 90 variables) or in the follow-up data statistics (for 54 variables). The statistical results of the outcome variables are displayed on the HyperText Markup Language 5-based chart graphs and tables. Participating hospitals can compare their performance to those of KNN as a whole and identify the trends over time. Ranking of each participating hospital is also displayed in terms of key outcome variables such as mortality and major neonatal morbidities with the names of other centers blinded. The most powerful function of the DDS is the ability to perform 'conditional filtering' which allows users to exclusively review the records of interest. Further collaboration is needed to upgrade the DDS to a more sophisticated analytical system and to provide a more user-friendly interface. PMID:26566352
NASA Astrophysics Data System (ADS)
Juniati, D.; Khotimah, C.; Wardani, D. E. K.; Budayasa, K.
2018-01-01
The heart abnormalities can be detected from heart sound. A heart sound can be heard directly with a stethoscope or indirectly by a phonocardiograph, a machine of the heart sound recording. This paper presents the implementation of fractal dimension theory to make a classification of phonocardiograms into a normal heart sound, a murmur, or an extrasystole. The main algorithm used to calculate the fractal dimension was Higuchi’s Algorithm. There were two steps to make a classification of phonocardiograms, feature extraction, and classification. For feature extraction, we used Discrete Wavelet Transform to decompose the signal of heart sound into several sub-bands depending on the selected level. After the decomposition process, the signal was processed using Fast Fourier Transform (FFT) to determine the spectral frequency. The fractal dimension of the FFT output was calculated using Higuchi Algorithm. The classification of fractal dimension of all phonocardiograms was done with KNN and Fuzzy c-mean clustering methods. Based on the research results, the best accuracy obtained was 86.17%, the feature extraction by DWT decomposition level 3 with the value of kmax 50, using 5-fold cross validation and the number of neighbors was 5 at K-NN algorithm. Meanwhile, for fuzzy c-mean clustering, the accuracy was 78.56%.
Sengur, Abdulkadir
2008-03-01
In the last two decades, the use of artificial intelligence methods in medical analysis is increasing. This is mainly because the effectiveness of classification and detection systems have improved a great deal to help the medical experts in diagnosing. In this work, we investigate the use of principal component analysis (PCA), artificial immune system (AIS) and fuzzy k-NN to determine the normal and abnormal heart valves from the Doppler heart sounds. The proposed heart valve disorder detection system is composed of three stages. The first stage is the pre-processing stage. Filtering, normalization and white de-noising are the processes that were used in this stage. The feature extraction is the second stage. During feature extraction stage, wavelet packet decomposition was used. As a next step, wavelet entropy was considered as features. For reducing the complexity of the system, PCA was used for feature reduction. In the classification stage, AIS and fuzzy k-NN were used. To evaluate the performance of the proposed methodology, a comparative study is realized by using a data set containing 215 samples. The validation of the proposed method is measured by using the sensitivity and specificity parameters; 95.9% sensitivity and 96% specificity rate was obtained.
Sun, Xiyang; Miao, Jiacheng; Wang, You; Luo, Zhiyuan; Li, Guang
2017-01-01
An estimate on the reliability of prediction in the applications of electronic nose is essential, which has not been paid enough attention. An algorithm framework called conformal prediction is introduced in this work for discriminating different kinds of ginsengs with a home-made electronic nose instrument. Nonconformity measure based on k-nearest neighbors (KNN) is implemented separately as underlying algorithm of conformal prediction. In offline mode, the conformal predictor achieves a classification rate of 84.44% based on 1NN and 80.63% based on 3NN, which is better than that of simple KNN. In addition, it provides an estimate of reliability for each prediction. In online mode, the validity of predictions is guaranteed, which means that the error rate of region predictions never exceeds the significance level set by a user. The potential of this framework for detecting borderline examples and outliers in the application of E-nose is also investigated. The result shows that conformal prediction is a promising framework for the application of electronic nose to make predictions with reliability and validity. PMID:28805721
Fall Detection System for the Elderly Based on the Classification of Shimmer Sensor Prototype Data
Ahmed, Moiz; Mehmood, Nadeem; Mehmood, Amir; Rizwan, Kashif
2017-01-01
Objectives Falling in the elderly is considered a major cause of death. In recent years, ambient and wireless sensor platforms have been extensively used in developed countries for the detection of falls in the elderly. However, we believe extra efforts are required to address this issue in developing countries, such as Pakistan, where most deaths due to falls are not even reported. Considering this, in this paper, we propose a fall detection system prototype that s based on the classification on real time shimmer sensor data. Methods We first developed a data set, ‘SMotion’ of certain postures that could lead to falls in the elderly by using a body area network of Shimmer sensors and categorized the items in this data set into age and weight groups. We developed a feature selection and classification system using three classifiers, namely, support vector machine (SVM), K-nearest neighbor (KNN), and neural network (NN). Finally, a prototype was fabricated to generate alerts to caregivers, health experts, or emergency services in case of fall. Results To evaluate the proposed system, SVM, KNN, and NN were used. The results of this study identified KNN as the most accurate classifier with maximum accuracy of 96% for age groups and 93% for weight groups. Conclusions In this paper, a classification-based fall detection system is proposed. For this purpose, the SMotion data set was developed and categorized into two groups (age and weight groups). The proposed fall detection system for the elderly is implemented through a body area sensor network using third-generation sensors. The evaluation results demonstrate the reasonable performance of the proposed fall detection prototype system in the tested scenarios. PMID:28875049
Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andonov, Rumen; Djidjev, Hristo Nikolov; Klau, Gunnar W.
In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to no-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifiesmore » up to 224 out of 236 queries correctly and on a larger, extended version of the benchmark with 60; 850 additional structures, up to 1361 out of 1369 queries. Finally, our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.« less
Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric
Andonov, Rumen; Djidjev, Hristo Nikolov; Klau, Gunnar W.; ...
2015-10-09
In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to no-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifiesmore » up to 224 out of 236 queries correctly and on a larger, extended version of the benchmark with 60; 850 additional structures, up to 1361 out of 1369 queries. Finally, our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.« less
NASA Astrophysics Data System (ADS)
Ahmed, Mustafa Wasir; Baishya, Manash Jyoti; Sharma, Sasanka Sekhor; Hazarika, Manash
2018-04-01
This paper presents a detecting system on power transformer in transformer winding, core and on load tap changer (OLTC). Accuracy of winding deformation is determined using kNN based classifier. Winding deformation in power transformer can be measured using sweep frequency response analysis (SFRA), which can enhance the diagnosis accuracy to a large degree. It is suggested that in the results minor deformation faults can be detected at frequency range of 1 mHz to 2 MHz. The values of RCL parameters are changed when faults occur and hence frequency response of the winding will change accordingly. The SFRA data of tested transformer is compared with reference trace. The difference between two graphs indicate faults in the transformer. The deformation between 1 mHz to 1kHz gives winding deformation, 1 kHz to 100 kHz gives core deformation and 100 kHz to 2 MHz gives OLTC deformation.
Steen Magnussen; Ronald E. McRoberts; Erkki O. Tomppo
2009-01-01
New model-based estimators of the uncertainty of pixel-level and areal k-nearest neighbour (knn) predictions of attribute Y from remotely-sensed ancillary data X are presented. Non-parametric functions predict Y from scalar 'Single Index Model' transformations of X. Variance functions generated...
Jay M. Ver Hoef; Hailemariam Temesgen; Sergio Gómez
2013-01-01
Forest surveys provide critical information for many diverse interests. Data are often collected from samples, and from these samples, maps of resources and estimates of aerial totals or averages are required. In this paper, two approaches for mapping and estimating totals; the spatial linear model (SLM) and k-NN (k-Nearest Neighbor) are compared, theoretically,...
K-nearest neighbor imputation of forest inventory variables in New Hampshire
Andrew Lister; Michael Hoppus; Raymond L. Czaplewski
2005-01-01
The k-nearest neighbor (kNN) method was used to map stand volume for a mosaic of 4 Landsat scenes covering the state of New Hampshire. Data for gross cubic foot volume and trees per acre were summarized from USDA Forest Service Forest Inventory and Analysis (FIA) plots and used as training for kNN. Six bands of...
Shao, Xiaolong; Li, Hui; Wang, Nan; Zhang, Qiang
2015-01-01
An electronic nose (e-nose) was used to characterize sesame oils processed by three different methods (hot-pressed, cold-pressed, and refined), as well as blends of the sesame oils and soybean oil. Seven classification and prediction methods, namely PCA, LDA, PLS, KNN, SVM, LASSO and RF, were used to analyze the e-nose data. The classification accuracy and MAUC were employed to evaluate the performance of these methods. The results indicated that sesame oils processed with different methods resulted in different sensor responses, with cold-pressed sesame oil producing the strongest sensor signals, followed by the hot-pressed sesame oil. The blends of pressed sesame oils with refined sesame oil were more difficult to be distinguished than the blends of pressed sesame oils and refined soybean oil. LDA, KNN, and SVM outperformed the other classification methods in distinguishing sesame oil blends. KNN, LASSO, PLS, and SVM (with linear kernel), and RF models could adequately predict the adulteration level (% of added soybean oil) in the sesame oil blends. Among the prediction models, KNN with k = 1 and 2 yielded the best prediction results. PMID:26506350
Optical and Piezoelectric Study of KNN Solid Solutions Co-Doped with La-Mn and Eu-Fe.
Peña-Jiménez, Jesús-Alejandro; González, Federico; López-Juárez, Rigoberto; Hernández-Alcántara, José-Manuel; Camarillo, Enrique; Murrieta-Sánchez, Héctor; Pardo, Lorena; Villafuerte-Castrejón, María-Elena
2016-09-28
The solid-state method was used to synthesize single phase potassium-sodium niobate (KNN) co-doped with the La 3+ -Mn 4+ and Eu 3+ -Fe 3+ ion pairs. Structural determination of all studied solid solutions was accomplished by XRD and Rietveld refinement method. Electron paramagnetic resonance (EPR) studies were performed to determine the oxidation state of paramagnetic centers. Optical spectroscopy measurements, excitation, emission and decay lifetime were carried out for each solid solution. The present study reveals that doping KNN with La 3+ -Mn 4+ and Eu 3+ -Fe 3+ at concentrations of 0.5 mol % and 1 mol %, respectively, improves the ferroelectric and piezoelectric behavior and induce the generation of optical properties in the material for potential applications.
Optical and Piezoelectric Study of KNN Solid Solutions Co-Doped with La-Mn and Eu-Fe
Peña-Jiménez, Jesús-Alejandro; González, Federico; López-Juárez, Rigoberto; Hernández-Alcántara, José-Manuel; Camarillo, Enrique; Murrieta-Sánchez, Héctor; Pardo, Lorena; Villafuerte-Castrejón, María-Elena
2016-01-01
The solid-state method was used to synthesize single phase potassium-sodium niobate (KNN) co-doped with the La3+–Mn4+ and Eu3+–Fe3+ ion pairs. Structural determination of all studied solid solutions was accomplished by XRD and Rietveld refinement method. Electron paramagnetic resonance (EPR) studies were performed to determine the oxidation state of paramagnetic centers. Optical spectroscopy measurements, excitation, emission and decay lifetime were carried out for each solid solution. The present study reveals that doping KNN with La3+–Mn4+ and Eu3+–Fe3+ at concentrations of 0.5 mol % and 1 mol %, respectively, improves the ferroelectric and piezoelectric behavior and induce the generation of optical properties in the material for potential applications. PMID:28773925
NASA Astrophysics Data System (ADS)
Chen, Yi-Ying; Chu, Chia-Ren; Li, Ming-Hsu
2012-10-01
SummaryIn this paper we present a semi-parametric multivariate gap-filling model for tower-based measurement of latent heat flux (LE). Two statistical techniques, the principal component analysis (PCA) and a nonlinear interpolation approach were integrated into this LE gap-filling model. The PCA was first used to resolve the multicollinearity relationships among various environmental variables, including radiation, soil moisture deficit, leaf area index, wind speed, etc. Two nonlinear interpolation methods, multiple regressions (MRS) and the K-nearest neighbors (KNNs) were examined with random selected flux gaps for both clear sky and nighttime/cloudy data to incorporate into this LE gap-filling model. Experimental results indicated that the KNN interpolation approach is able to provide consistent LE estimations while MRS presents over estimations during nighttime/cloudy. Rather than using empirical regression parameters, the KNN approach resolves the nonlinear relationship between the gap-filled LE flux and principal components with adaptive K values under different atmospheric states. The developed LE gap-filling model (PCA with KNN) works with a RMSE of 2.4 W m-2 (˜0.09 mm day-1) at a weekly time scale by adding 40% artificial flux gaps into original dataset. Annual evapotranspiration at this study site were estimated at 736 mm (1803 MJ) and 728 mm (1785 MJ) for year 2008 and 2009, respectively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matsuoka, T., E-mail: ta-matsuoka@mg.ngkntk.co.jp; Kozuka, H.; Kitamura, K.
A (K,Na)NbO₃-based lead-free piezoelectric ceramic was successfully densified. It exhibited an enhanced electromechanical coupling factor of kₚ=0.52, a piezoelectric constant d₃₃=252 pC/N, and a frequency constant Nₚ=3170 Hz m because of the incorporation of an elaborate secondary phase composed primarily of KTiNbO₅. The ceramic's nominal composition was 0.92K₀.₄₂Na₀.₄₄Ca₀.₀₄Li₀.₀₂Nb₀.₈₅O₃–0.047K₀.₈₅Ti₀.₈₅Nb₁.₁₅O₅–0.023BaZrO₃ –0.0017Co₃O₄–0.002Fe₂O₃–0.005ZnO, abbreviated herein as KNN–NTK composite. The KNN–NTK ceramic exhibited a dense microstructure with few microvoids which significantly degraded its piezoelectric properties. Elemental maps recorded using transmission electron microscopy with energy-dispersive X-ray spectroscopy (TEM–EDS) revealed regions of high concentrations of Co and Zn inside the NTK phase. In addition, X-ray diffraction patternsmore » confirmed that a small portion of the NTK phase was converted into K₂(Ti,Nb,Co,Zn)₆O₁₃ or CoZnTiO₄ by a possible reaction between Co and Zn solutes and the NTK phase during a programmed sintering schedule. TEM studies also clarified a distortion around the KNN/NTK interfaces. Such an NTK phase filled voids between KNN particles, resulting in an improved chemical stability of the KNN ceramic. The manufacturing process was subsequently scaled to 100 kg per batch for granulated ceramic powder using a spray-drying technique. The properties of the KNN–NTK composite ceramic produced using the scaled-up method were confirmed to be identical to those of the ceramic prepared by conventional solid-state reaction sintering. Consequently, slight changes in the NTK phase composition and the distortion around the KNN/NTK interfaces affected the KNN–NTK composite ceramic's piezoelectric characteristics.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yamada, H., E-mail: hide-yamada@mg.ngkntk.co.jp; Matsuoka, T.; Kozuka, H.
Two phases of (K,Na)NbO{sub 3} (KNN) co-exist in a KNN-based composite lead-free piezoelectric ceramic 0.910(K{sub 1−x}Na{sub x}){sub 0.86}Ca{sub 0.04}Li{sub 0.02}Nb{sub 0.85}O{sub 3−δ}–0.042K{sub 0.85}Ti{sub 0.85}Nb{sub 1.15}O{sub 5} –0.036BaZrO{sub 3}–0.0016Co{sub 3}O{sub 4}– 0.0025Fe{sub 2}O{sub 3}–0.0069ZnO system, over a wide range of Na fractions, where 0.56 ≤ x ≤ 0.75. The crystal systems of the two KNN phases are identified to tetragonal and orthorhombic by analyzing the synchrotron powder X-ray diffraction (XRD) data, high-resolution transmission electron microscopy (HR-TEM), and selected-area electron diffraction (SAD). In the range 0.33 ≤ x ≤ 0.50, the main component of the composite system is found to be single-phase KNN with a tetragonal structure. Granular nanodomains ofmore » the orthorhombic phase dispersed in the tetragonal matrix have been identified by HR-TEM and SAD for 0.56 ≤ x ≤ 0.75. Only a trace amount of the orthorhombic phase has been found in the SAD patterns at the composition x = 0.56. However, the number of orthorhombic nanodomains gradually increases with increasing Na content up to x < 0.75, as observed from the HR-TEM images. An abrupt increase and agglomeration of the nanodomains are observed at x = 0.75, where weak diffraction peaks of the orthorhombic phase have also become detectable from the XRD data. The maximum value of the electromechanical coupling coefficient, k{sub p} = 0.56, has been observed at the composition x = 0.56.« less
Mapping ionospheric observations using combined techniques for Europe region
NASA Astrophysics Data System (ADS)
Tomasik, Lukasz; Gulyaeva, Tamara; Stanislawska, Iwona; Swiatek, Anna; Pozoga, Mariusz; Dziak-Jankowska, Beata
An k nearest neighbours algorithm (KNN) was used for filling the gaps of the missing F2-layer critical frequency is proposed and applied. This method uses TEC data calculated from EGNOS Vertical Delay Estimate (VDE ≈0.78 TECU) and several GNSS stations and its spatial correlation whit data from selected ionosondes. For mapping purposes two-dimensional similarity function in KNN method was proposed.
Exploitation of RF-DNA for Device Classification and Verification Using GRLVQI Processing
2012-12-01
5 FLD Fisher’s Linear Discriminant . . . . . . . . . . . . . . . . . . . 6 kNN K-Nearest Neighbor...Neighbor ( kNN ), Support Vector Machine (SVM), and simple cross-correlation techniques [40, 57, 82, 88, 94, 95]. The RF-DNA fingerprinting research in...Expansion and the Dis- crete Gabor Transform on a Non-Separable Lattice”. 2000 IEEE Int’l Conf on Acoustics, Speech , and Signal Processing (ICASSP00
Hatamikia, Sepideh; Maghooli, Keivan; Nasrabadi, Ali Motie
2014-01-01
Electroencephalogram (EEG) is one of the useful biological signals to distinguish different brain diseases and mental states. In recent years, detecting different emotional states from biological signals has been merged more attention by researchers and several feature extraction methods and classifiers are suggested to recognize emotions from EEG signals. In this research, we introduce an emotion recognition system using autoregressive (AR) model, sequential forward feature selection (SFS) and K-nearest neighbor (KNN) classifier using EEG signals during emotional audio-visual inductions. The main purpose of this paper is to investigate the performance of AR features in the classification of emotional states. To achieve this goal, a distinguished AR method (Burg's method) based on Levinson-Durbin's recursive algorithm is used and AR coefficients are extracted as feature vectors. In the next step, two different feature selection methods based on SFS algorithm and Davies–Bouldin index are used in order to decrease the complexity of computing and redundancy of features; then, three different classifiers include KNN, quadratic discriminant analysis and linear discriminant analysis are used to discriminate two and three different classes of valence and arousal levels. The proposed method is evaluated with EEG signals of available database for emotion analysis using physiological signals, which are recorded from 32 participants during 40 1 min audio visual inductions. According to the results, AR features are efficient to recognize emotional states from EEG signals, and KNN performs better than two other classifiers in discriminating of both two and three valence/arousal classes. The results also show that SFS method improves accuracies by almost 10-15% as compared to Davies–Bouldin based feature selection. The best accuracies are %72.33 and %74.20 for two classes of valence and arousal and %61.10 and %65.16 for three classes, respectively. PMID:25298928
2014-01-01
Background Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosis respiratory pathologies using respiratory sounds from R.A.L.E database. Results The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Conclusion Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database. PMID:24970564
Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian
2014-06-27
Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosis respiratory pathologies using respiratory sounds from R.A.L.E database. The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database.
Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection
NASA Astrophysics Data System (ADS)
Safi’ie, M. A.; Utami, E.; Fatta, H. A.
2018-03-01
Universitas Sebelas Maret has a teaching staff more than 1500 people, and one of its tasks is to carry out research. In the other side, the funding support for research and service is limited, so there is need to be evaluated to determine the Research proposal submission and devotion on society (P2M). At the selection stage, research proposal documents are collected as unstructured data and the data stored is very large. To extract information contained in the documents therein required text mining technology. This technology applied to gain knowledge to the documents by automating the information extraction. In this articles we use Latent Dirichlet Allocation (LDA) to the documents as a model in feature extraction process, to get terms that represent its documents. Hereafter we use k-Nearest Neighbour (kNN) algorithm to classify the documents based on its terms.
Multiwavelet grading of prostate pathological images
NASA Astrophysics Data System (ADS)
Soltanian-Zadeh, Hamid; Jafari-Khouzani, Kourosh
2002-05-01
We have developed image analysis methods to automatically grade pathological images of prostate. The proposed method generates Gleason grades to images, where each image is assigned a grade between 1 and 5. This is done using features extracted from multiwavelet transformations. We extract energy and entropy features from submatrices obtained in the decomposition. Next, we apply a k-NN classifier to grade the image. To find optimal multiwavelet basis, preprocessing, and classifier, we use features extracted by different multiwavelets with either critically sampled preprocessing or repeated row preprocessing and different k-NN classifiers and compare their performances, evaluated by total misclassification rate (TMR). To evaluate sensitivity to noise, we add white Gaussian noise to images and compare the results (TMR's). We applied proposed methods to 100 images. We evaluated the first and second levels of decomposition using Geronimo, Hardin, and Massopust (GHM), Chui and Lian (CL), and Shen (SA4) multiwavelets. We also evaluated k-NN classifier for k=1,2,3,4,5. Experimental results illustrate that first level of decomposition is quite noisy. They also show that critically sampled preprocessing outperforms repeated row preprocessing and has less sensitivity to noise. Finally, comparison studies indicate that SA4 multiwavelet and k-NN classifier (k=1) generates optimal results (with smallest TMR of 3%).
Classification of EEG Signals Based on Pattern Recognition Approach.
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
Classification of EEG Signals Based on Pattern Recognition Approach
Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed
2017-01-01
Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID:29209190
Reija Haapanen; Kimmo Lehtinen; Jukka Miettinen; Marvin E. Bauer; Alan R. Ek
2002-01-01
The k-nearest neighbor (k-NN) method has been undergoing development and testing for applications with USDA Forest Service Forest Inventory and Analysis (FIA) data in Minnesota since 1997. Research began using the 1987-1990 FIA inventory of the state, the then standard 10-point cluster plots, and Landsat TM imagery. In the past year, research has moved to examine...
Source Camera Identification and Blind Tamper Detections for Images
2007-04-24
measures and image quality measures in camera identification problem was studied using conjunction with a KNN classifier to identify the feature sets...shots varying from nature scenes .-.. motorala to close-ups of people. We experimented with the KNN *~. * ny classifier (K=5) as well SVM algorithm of...on Acoustic, Speech and Signal Processing (ICASSP), France, May 2006, vol. 5, pp. 401-404. [9] H. Farid and S. Lyu, "Higher-order wavelet statistics
Performance-driven Multimodality Sensor Fusion
2012-01-23
in IEEE Intl Conf. on Acoust., Speech , Signal Processing, (Dallas), Mar. 2010. [10] K. Sricharan, R. Raich, and A. Hero III, “Boundary compensated knn ...nearest neighbor ( kNN ) plug-in estima- tors, we have developed a generally applicable theory that gives analytical closed-form expressions for asymptotic...Co-PI’s Raich and Hero and was published in the IEEE Proc. of 2011 Intl Conf. on Acoustics, Speech , and Signal Processing. 2.4 Dimension estimation in
Phase transitions and electrical behavior of lead-free (K0.50Na0.50)NbO3 thin film
NASA Astrophysics Data System (ADS)
Wu, Jiagang; Wang, John
2009-09-01
Lead-free (K0.50Na0.50)NbO3 (KNN) thin films with a high degree of (100) preferred orientation were deposited on the SrRuO3-buffered SrTiO3(100) substrate by off-axis radio frequency magnetron sputtering. They possess lower phase transition temperatures (To-t˜120 °C and Tc˜310 °C), as compared to those of KNN bulk ceramic (To-t˜190 °C and Tc˜400 °C). They also demonstrate enhanced ferroelectric behavior (e.g., 2Pr=24.1 μc/cm2) and fatigue endurance, together with a lower dielectric loss (tan δ ˜0.017) and a lower leakage current, as compared to the bulk ceramic counterpart. Oxygen vacancies are shown to be involved in the conduction of the KNN thin film.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bayar, M.; Yamagata-Sekihara, J.; Oset, E.
We have performed a calculation of the scattering amplitude for the three-body system KNN assuming K scattering against a NN cluster using the fixed center approximation to the Faddeev equations. The KN amplitudes, which we take from chiral unitary dynamics, govern the reaction and we find a KNN amplitude that peaks around 40 MeV below the KNN threshold, with a width in |T|{sup 2} of the order of 50 MeV for spin 0 and has another peak around 27 MeV with similar width for spin 1. The results are in line with those obtained using different methods but implementing chiralmore » dynamics. The simplicity of the approach allows one to see the important ingredients responsible for the results. In particular, we show the effects from the reduction of the size of the NN cluster due to the interaction with the K and those from the explicit consideration of the {pi}{Sigma}N channel in the three-body equations.« less
A Robust False Matching Points Detection Method for Remote Sensing Image Registration
NASA Astrophysics Data System (ADS)
Shan, X. J.; Tang, P.
2015-04-01
Given the influences of illumination, imaging angle, and geometric distortion, among others, false matching points still occur in all image registration algorithms. Therefore, false matching points detection is an important step in remote sensing image registration. Random Sample Consensus (RANSAC) is typically used to detect false matching points. However, RANSAC method cannot detect all false matching points in some remote sensing images. Therefore, a robust false matching points detection method based on Knearest- neighbour (K-NN) graph (KGD) is proposed in this method to obtain robust and high accuracy result. The KGD method starts with the construction of the K-NN graph in one image. K-NN graph can be first generated for each matching points and its K nearest matching points. Local transformation model for each matching point is then obtained by using its K nearest matching points. The error of each matching point is computed by using its transformation model. Last, L matching points with largest error are identified false matching points and removed. This process is iterative until all errors are smaller than the given threshold. In addition, KGD method can be used in combination with other methods, such as RANSAC. Several remote sensing images with different resolutions and terrains are used in the experiment. We evaluate the performance of KGD method, RANSAC + KGD method, RANSAC, and Graph Transformation Matching (GTM). The experimental results demonstrate the superior performance of the KGD and RANSAC + KGD methods.
On the classification techniques in data mining for microarray data classification
NASA Astrophysics Data System (ADS)
Aydadenta, Husna; Adiwijaya
2018-03-01
Cancer is one of the deadly diseases, according to data from WHO by 2015 there are 8.8 million more deaths caused by cancer, and this will increase every year if not resolved earlier. Microarray data has become one of the most popular cancer-identification studies in the field of health, since microarray data can be used to look at levels of gene expression in certain cell samples that serve to analyze thousands of genes simultaneously. By using data mining technique, we can classify the sample of microarray data thus it can be identified with cancer or not. In this paper we will discuss some research using some data mining techniques using microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, and simulation of Random Forest algorithm with technique of reduction dimension using Relief. The result of this paper show performance measure (accuracy) from classification algorithm (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forets).The results in this paper show the accuracy of Random Forest algorithm higher than other classification algorithms (Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost generated from each Data Mining Classification Technique based on microarray data.
Comparative study of 2mol% Li- and Mn-substituted lead-free potassium sodium niobate ceramics
NASA Astrophysics Data System (ADS)
Dahiya, Asha; Thakur, O. P.; Juneja, J. K.; Singh, Sangeeta; Dipti
2014-12-01
The effect of Li and Mn substitution on the dielectric, ferroelectric and piezoelectric properties of lead free K0.5Na0.5NbO3 (KNN) was investigated. Samples were prepared using a conventional solid state reaction method. The sintering temperature for all the samples was 1050°C. The optimum doping concentration for the enhancement of different properties without the introduction of any other co-dopants such as Ti, Sb, and La was investigated. X-ray diffraction analysis confirmed that all the samples crystallize in a single phase perovskite structure. The dielectric properties were investigated as a function of temperature and applied electric field frequency. Compared with Li-substituted KNN (KLNN), Mn-substituted KNN (KMNN) exhibited a higher dielectric constant ɛ max (i.e., 4840) at its critical transition temperature T c (i.e., 421°C) along with a lower value of tangent loss at 10 kHz and greater values of saturation polarisation P s (i.e., 20.14 μC/cm2) and remnant polarisation P r (i.e., 15.48 μC/cm2). The piezoelectric constant ( d 33) of KMNN was 178 pC/N, which is comparable to that of lead-based hard ceramics. The results presented herein suggest that B-site or Mn substitution at the optimum concentration results in good enhancement of different properties required for materials used in memory devices and other applications.
Frequency Diverse Array Radar: Signal Characterization and Measurement Accuracy
2010-03-25
W knN (C.14) and f [n] = N−1∑ k=0 F [k]W− knN (C.15) where f [n] = f(t)|t=nTs F [k] = F (ω)|ω=k∆ω WN = exp(−j2π/N) Ts = f −1 s ∆ω = 2π NTs , fs is the...Properties of the MIMO radar ambiguity function”. Proceedings 2008 International Conference on Acoustics, Speech and Signal Processing, 2309–2312. April 2008
RKNNMDA: Ranking-based KNN for MiRNA-Disease Association prediction.
Chen, Xing; Wu, Qiao-Feng; Yan, Gui-Ying
2017-07-03
Cumulative verified experimental studies have demonstrated that microRNAs (miRNAs) could be closely related with the development and progression of human complex diseases. Based on the assumption that functional similar miRNAs may have a strong correlation with phenotypically similar diseases and vice versa, researchers developed various effective computational models which combine heterogeneous biologic data sets including disease similarity network, miRNA similarity network, and known disease-miRNA association network to identify potential relationships between miRNAs and diseases in biomedical research. Considering the limitations in previous computational study, we introduced a novel computational method of Ranking-based KNN for miRNA-Disease Association prediction (RKNNMDA) to predict potential related miRNAs for diseases, and our method obtained an AUC of 0.8221 based on leave-one-out cross validation. In addition, RKNNMDA was applied to 3 kinds of important human cancers for further performance evaluation. The results showed that 96%, 80% and 94% of predicted top 50 potential related miRNAs for Colon Neoplasms, Esophageal Neoplasms, and Prostate Neoplasms have been confirmed by experimental literatures, respectively. Moreover, RKNNMDA could be used to predict potential miRNAs for diseases without any known miRNAs, and it is anticipated that RKNNMDA would be of great use for novel miRNA-disease association identification.
RKNNMDA: Ranking-based KNN for MiRNA-Disease Association prediction
Chen, Xing; Yan, Gui-Ying
2017-01-01
ABSTRACT Cumulative verified experimental studies have demonstrated that microRNAs (miRNAs) could be closely related with the development and progression of human complex diseases. Based on the assumption that functional similar miRNAs may have a strong correlation with phenotypically similar diseases and vice versa, researchers developed various effective computational models which combine heterogeneous biologic data sets including disease similarity network, miRNA similarity network, and known disease-miRNA association network to identify potential relationships between miRNAs and diseases in biomedical research. Considering the limitations in previous computational study, we introduced a novel computational method of Ranking-based KNN for miRNA-Disease Association prediction (RKNNMDA) to predict potential related miRNAs for diseases, and our method obtained an AUC of 0.8221 based on leave-one-out cross validation. In addition, RKNNMDA was applied to 3 kinds of important human cancers for further performance evaluation. The results showed that 96%, 80% and 94% of predicted top 50 potential related miRNAs for Colon Neoplasms, Esophageal Neoplasms, and Prostate Neoplasms have been confirmed by experimental literatures, respectively. Moreover, RKNNMDA could be used to predict potential miRNAs for diseases without any known miRNAs, and it is anticipated that RKNNMDA would be of great use for novel miRNA-disease association identification. PMID:28421868
Improving cluster-based missing value estimation of DNA microarray data.
Brás, Lígia P; Menezes, José C
2007-06-01
We present a modification of the weighted K-nearest neighbours imputation method (KNNimpute) for missing values (MVs) estimation in microarray data based on the reuse of estimated data. The method was called iterative KNN imputation (IKNNimpute) as the estimation is performed iteratively using the recently estimated values. The estimation efficiency of IKNNimpute was assessed under different conditions (data type, fraction and structure of missing data) by the normalized root mean squared error (NRMSE) and the correlation coefficients between estimated and true values, and compared with that of other cluster-based estimation methods (KNNimpute and sequential KNN). We further investigated the influence of imputation on the detection of differentially expressed genes using SAM by examining the differentially expressed genes that are lost after MV estimation. The performance measures give consistent results, indicating that the iterative procedure of IKNNimpute can enhance the prediction ability of cluster-based methods in the presence of high missing rates, in non-time series experiments and in data sets comprising both time series and non-time series data, because the information of the genes having MVs is used more efficiently and the iterative procedure allows refining the MV estimates. More importantly, IKNN has a smaller detrimental effect on the detection of differentially expressed genes.
yaImpute: An R package for kNN imputation
Nicholas L. Crookston; Andrew O. Finley
2008-01-01
This article introduces yaImpute, an R package for nearest neighbor search and imputation. Although nearest neighbor imputation is used in a host of disciplines, the methods implemented in the yaImpute package are tailored to imputation-based forest attribute estimation and mapping. The impetus to writing the yaImpute is a growing interest in nearest neighbor...
Wang, Huazhen; Liu, Xin; Lv, Bing; Yang, Fan; Hong, Yanzhu
2014-01-01
Objective Chronic Fatigue (CF) still remains unclear about its etiology, pathophysiology, nomenclature and diagnostic criteria in the medical community. Traditional Chinese medicine (TCM) adopts a unique diagnostic method, namely ‘bian zheng lun zhi’ or syndrome differentiation, to diagnose the CF with a set of syndrome factors, which can be regarded as the Multi-Label Learning (MLL) problem in the machine learning literature. To obtain an effective and reliable diagnostic tool, we use Conformal Predictor (CP), Random Forest (RF) and Problem Transformation method (PT) for the syndrome differentiation of CF. Methods and Materials In this work, using PT method, CP-RF is extended to handle MLL problem. CP-RF applies RF to measure the confidence level (p-value) of each label being the true label, and then selects multiple labels whose p-values are larger than the pre-defined significance level as the region prediction. In this paper, we compare the proposed CP-RF with typical CP-NBC(Naïve Bayes Classifier), CP-KNN(K-Nearest Neighbors) and ML-KNN on CF dataset, which consists of 736 cases. Specifically, 95 symptoms are used to identify CF, and four syndrome factors are employed in the syndrome differentiation, including ‘spleen deficiency’, ‘heart deficiency’, ‘liver stagnation’ and ‘qi deficiency’. The Results CP-RF demonstrates an outstanding performance beyond CP-NBC, CP-KNN and ML-KNN under the general metrics of subset accuracy, hamming loss, one-error, coverage, ranking loss and average precision. Furthermore, the performance of CP-RF remains steady at the large scale of confidence levels from 80% to 100%, which indicates its robustness to the threshold determination. In addition, the confidence evaluation provided by CP is valid and well-calibrated. Conclusion CP-RF not only offers outstanding performance but also provides valid confidence evaluation for the CF syndrome differentiation. It would be well applicable to TCM practitioners and facilitate the utilities of objective, effective and reliable computer-based diagnosis tool. PMID:24918430
Nation-Building Modeling and Resource Allocation Via Dynamic Programming
2014-09-01
Figure 2. RAND Study Models[59:98,115] (WMA) and used both the k-Nearest Neighbor ( KNN ) and Nearest Centroid (NC) algorithms to classify future features...The study found that KNN performed bet- ter than NC with 85% or greater accuracy in all test cases. The methodology was adopted for use under the...analysis feature of the model. 3.7.1 The No Surge Alternative. On the 10th of January 2007, President George W. Bush delivered a speech to the American
Belekar, Vilas; Lingineni, Karthik; Garg, Prabha
2015-01-01
The breast cancer resistant protein (BCRP) is an important transporter and its inhibitors play an important role in cancer treatment by improving the oral bioavailability as well as blood brain barrier (BBB) permeability of anticancer drugs. In this work, a computational model was developed to predict the compounds as BCRP inhibitors or non-inhibitors. Various machine learning approaches like, support vector machine (SVM), k-nearest neighbor (k-NN) and artificial neural network (ANN) were used to develop the models. The Matthews correlation coefficients (MCC) of developed models using ANN, k-NN and SVM are 0.67, 0.71 and 0.77, and prediction accuracies are 85.2%, 88.3% and 90.8% respectively. The developed models were tested with a test set of 99 compounds and further validated with external set of 98 compounds. Distribution plot analysis and various machine learning models were also developed based on druglikeness descriptors. Applicability domain is used to check the prediction reliability of the new molecules.
Applied algorithm in the liner inspection of solid rocket motors
NASA Astrophysics Data System (ADS)
Hoffmann, Luiz Felipe Simões; Bizarria, Francisco Carlos Parquet; Bizarria, José Walter Parquet
2018-03-01
In rocket motors, the bonding between the solid propellant and thermal insulation is accomplished by a thin adhesive layer, known as liner. The liner application method involves a complex sequence of tasks, which includes in its final stage, the surface integrity inspection. Nowadays in Brazil, an expert carries out a thorough visual inspection to detect defects on the liner surface that may compromise the propellant interface bonding. Therefore, this paper proposes an algorithm that uses the photometric stereo technique and the K-nearest neighbor (KNN) classifier to assist the expert in the surface inspection. Photometric stereo allows the surface information recovery of the test images, while the KNN method enables image pixels classification into two classes: non-defect and defect. Tests performed on a computer vision based prototype validate the algorithm. The positive results suggest that the algorithm is feasible and when implemented in a real scenario, will be able to help the expert in detecting defective areas on the liner surface.
Zhang, Y N
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed.
2017-01-01
Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed. PMID:29075547
Activity Recognition in Egocentric video using SVM, kNN and Combined SVMkNN Classifiers
NASA Astrophysics Data System (ADS)
Sanal Kumar, K. P.; Bhavani, R., Dr.
2017-08-01
Egocentric vision is a unique perspective in computer vision which is human centric. The recognition of egocentric actions is a challenging task which helps in assisting elderly people, disabled patients and so on. In this work, life logging activity videos are taken as input. There are 2 categories, first one is the top level and second one is second level. Here, the recognition is done using the features like Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH) and Trajectory. The features are fused together and it acts as a single feature. The extracted features are reduced using Principal Component Analysis (PCA). The features that are reduced are provided as input to the classifiers like Support Vector Machine (SVM), k nearest neighbor (kNN) and combined Support Vector Machine (SVM) and k Nearest Neighbor (kNN) (combined SVMkNN). These classifiers are evaluated and the combined SVMkNN provided better results than other classifiers in the literature.
A Bio-Inspired Herbal Tea Flavour Assessment Technique
Zakaria, Nur Zawatil Isqi; Masnan, Maz Jamilah; Zakaria, Ammar; Shakaff, Ali Yeon Md
2014-01-01
Herbal-based products are becoming a widespread production trend among manufacturers for the domestic and international markets. As the production increases to meet the market demand, it is very crucial for the manufacturer to ensure that their products have met specific criteria and fulfil the intended quality determined by the quality controller. One famous herbal-based product is herbal tea. This paper investigates bio-inspired flavour assessments in a data fusion framework involving an e-nose and e-tongue. The objectives are to attain good classification of different types and brands of herbal tea, classification of different flavour masking effects and finally classification of different concentrations of herbal tea. Two data fusion levels were employed in this research, low level data fusion and intermediate level data fusion. Four classification approaches; LDA, SVM, KNN and PNN were examined in search of the best classifier to achieve the research objectives. In order to evaluate the classifiers' performance, an error estimator based on k-fold cross validation and leave-one-out were applied. Classification based on GC-MS TIC data was also included as a comparison to the classification performance using fusion approaches. Generally, KNN outperformed the other classification techniques for the three flavour assessments in the low level data fusion and intermediate level data fusion. However, the classification results based on GC-MS TIC data are varied. PMID:25010697
Floor Identification with Commercial Smartphones in Wifi-Based Indoor Localization System
NASA Astrophysics Data System (ADS)
Ai, H. J.; Liu, M. Y.; Shi, Y. M.; Zhao, J. Q.
2016-06-01
In this paper, we utilize novel sensors built-in commercial smart devices to propose a schema which can identify floors with high accuracy and efficiency. This schema can be divided into two modules: floor identifying and floor change detection. Floor identifying module starts at initial phase of positioning, and responsible for determining which floor the positioning start. We have estimated two methods to identify initial floor based on K-Nearest Neighbors (KNN) and BP Neural Network, respectively. In order to improve performance of KNN algorithm, we proposed a novel method based on weighting signal strength, which can identify floors robust and quickly. Floor change detection module turns on after entering into continues positioning procedure. In this module, sensors (such as accelerometer and barometer) of smart devices are used to determine whether the user is going up and down stairs or taking an elevator. This method has fused different kinds of sensor data and can adapt various motion pattern of users. We conduct our experiment with mobile client on Android Phone (Nexus 5) at a four-floors building with an open area between the second and third floor. The results demonstrate that our scheme can achieve an accuracy of 99% to identify floor and 97% to detecting floor changes as a whole.
Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf
2018-05-01
Dysmorphic syndromes have different facial malformations. These malformations are significant to an early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define the certain features of each syndrome by considering facial malformations and classify Fragile X, Hurler, Prader Willi, Down, Wolf Hirschhorn syndromes and healthy groups automatically. The reference points are marked on the face images and ratios between the points' distances are taken into consideration as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with k-NN, ANN and hierarchical decision tree methods, respectively. Then, the same images are shown to a clinical expert who achieve a recognition rate of 46.7%. We develop an efficient system to recognize different syndrome types automatically in a simple, non-invasive imaging data, which is independent from the patient's age, sex and race at high accuracy. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.
Exploring Sampling in the Detection of Multicategory EEG Signals
Siuly, Siuly; Kabir, Enamul; Wang, Hua; Zhang, Yanchun
2015-01-01
The paper presents a structure based on samplings and machine leaning techniques for the detection of multicategory EEG signals where random sampling (RS) and optimal allocation sampling (OS) are explored. In the proposed framework, before using the RS and OS scheme, the entire EEG signals of each class are partitioned into several groups based on a particular time period. The RS and OS schemes are used in order to have representative observations from each group of each category of EEG data. Then all of the selected samples by the RS from the groups of each category are combined in a one set named RS set. In the similar way, for the OS scheme, an OS set is obtained. Then eleven statistical features are extracted from the RS and OS set, separately. Finally this study employs three well-known classifiers: k-nearest neighbor (k-NN), multinomial logistic regression with a ridge estimator (MLR), and support vector machine (SVM) to evaluate the performance for the RS and OS feature set. The experimental outcomes demonstrate that the RS scheme well represents the EEG signals and the k-NN with the RS is the optimum choice for detection of multicategory EEG signals. PMID:25977705
Kangas, Michael J; Burks, Raychelle M; Atwater, Jordyn; Lukowicz, Rachel M; Garver, Billy; Holmes, Andrea E
2018-02-01
With the increasing availability of digital imaging devices, colorimetric sensor arrays are rapidly becoming a simple, yet effective tool for the identification and quantification of various analytes. Colorimetric arrays utilize colorimetric data from many colorimetric sensors, with the multidimensional nature of the resulting data necessitating the use of chemometric analysis. Herein, an 8 sensor colorimetric array was used to analyze select acid and basic samples (0.5 - 10 M) to determine which chemometric methods are best suited for classification quantification of analytes within clusters. PCA, HCA, and LDA were used to visualize the data set. All three methods showed well-separated clusters for each of the acid or base analytes and moderate separation between analyte concentrations, indicating that the sensor array can be used to identify and quantify samples. Furthermore, PCA could be used to determine which sensors showed the most effective analyte identification. LDA, KNN, and HQI were used for identification of analyte and concentration. HQI and KNN could be used to correctly identify the analytes in all cases, while LDA correctly identified 95 of 96 analytes correctly. Additional studies demonstrated that controlling for solvent and image effects was unnecessary for all chemometric methods utilized in this study.
Probabilistic brain tissue segmentation in neonatal magnetic resonance imaging.
Anbeek, Petronella; Vincken, Koen L; Groenendaal, Floris; Koeman, Annemieke; van Osch, Matthias J P; van der Grond, Jeroen
2008-02-01
A fully automated method has been developed for segmentation of four different structures in the neonatal brain: white matter (WM), central gray matter (CEGM), cortical gray matter (COGM), and cerebrospinal fluid (CSF). The segmentation algorithm is based on information from T2-weighted (T2-w) and inversion recovery (IR) scans. The method uses a K nearest neighbor (KNN) classification technique with features derived from spatial information and voxel intensities. Probabilistic segmentations of each tissue type were generated. By applying thresholds on these probability maps, binary segmentations were obtained. These final segmentations were evaluated by comparison with a gold standard. The sensitivity, specificity, and Dice similarity index (SI) were calculated for quantitative validation of the results. High sensitivity and specificity with respect to the gold standard were reached: sensitivity >0.82 and specificity >0.9 for all tissue types. Tissue volumes were calculated from the binary and probabilistic segmentations. The probabilistic segmentation volumes of all tissue types accurately estimated the gold standard volumes. The KNN approach offers valuable ways for neonatal brain segmentation. The probabilistic outcomes provide a useful tool for accurate volume measurements. The described method is based on routine diagnostic magnetic resonance imaging (MRI) and is suitable for large population studies.
Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry
2018-01-01
The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet’s effects on fish skin. PMID:29596375
1980-12-05
classification procedures that are common in speech processing. The anesthesia level classification by EEG time series population screening problem example is in...formance. The use of the KL number type metric in NN rule classification, in a delete-one subj ect ’s EE-at-a-time KL-NN and KL- kNN classification of the...17 individual labeled EEG sample population using KL-NN and KL- kNN rules. The results obtained are shown in Table 1. The entries in the table indicate
2012-03-01
WMA) and used both the k-Nearest Neighbor ( KNN ) and Nearest Centroid 27 (a) Coalition and Regional (b) Indigenous Figure 3. RAND Study Models[32:98,115...NC) algorithms to classify future features. The study found that KNN performed better than NC with 85% or greater accuracy in all test cases. The...the model. 4.2.1 No Surge. On the 10th of January 2007, President George W. Bush delivered a speech to the American Public outlining a new strategy in
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-01-01
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer’s and user’s accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas. PMID:27879661
Differential diagnosis of pleural mesothelioma using Logic Learning Machine.
Parodi, Stefano; Filiberti, Rosa; Marroni, Paola; Libener, Roberta; Ivaldi, Giovanni Paolo; Mussap, Michele; Ferrari, Enrico; Manneschi, Chiara; Montani, Erika; Muselli, Marco
2015-01-01
Tumour markers are standard tools for the differential diagnosis of cancer. However, the occurrence of nonspecific symptoms and different malignancies involving the same cancer site may lead to a high proportion of misclassifications. Classification accuracy can be improved by combining information from different markers using standard data mining techniques, like Decision Tree (DT), Artificial Neural Network (ANN), and k-Nearest Neighbour (KNN) classifier. Unfortunately, each method suffers from some unavoidable limitations. DT, in general, tends to show a low classification performance, whereas ANN and KNN produce a "black-box" classification that does not provide biological information useful for clinical purposes. Logic Learning Machine (LLM) is an innovative method of supervised data analysis capable of building classifiers described by a set of intelligible rules including simple conditions in their antecedent part. It is essentially an efficient implementation of the Switching Neural Network model and reaches excellent classification accuracy while keeping low the computational demand. LLM was applied to data from a consecutive cohort of 169 patients admitted for diagnosis to two pulmonary departments in Northern Italy from 2009 to 2011. Patients included 52 malignant pleural mesotheliomas (MPM), 62 pleural metastases (MTX) from other tumours and 55 benign diseases (BD) associated with pleurisies. Concentration of three tumour markers (CEA, CYFRA 21-1 and SMRP) was measured in the pleural fluid of each patient and a cytological examination was also carried out. The performance of LLM and that of three competing methods (DT, KNN and ANN) was assessed by leave-one-out cross-validation. LLM outperformed all other considered methods. Global accuracy was 77.5% for LLM, 72.8% for DT, 54.4% for KNN, and 63.9% for ANN, respectively. In more details, LLM correctly classified 79% of MPM, 66% of MTX and 89% of BD. The corresponding figures for DT were: MPM = 83%, MTX = 55% and BD = 84%; for KNN: MPM = 58%, MTX = 45%, BD = 62%; for ANN: MPM = 71%, MTX = 47%, BD = 76%. Finally, LLM provided classification rules in a very good agreement with a priori knowledge about the biological role of the considered tumour markers. LLM is a new flexible tool potentially useful for the differential diagnosis of pleural mesothelioma.
David M. Bell; Matthew J. Gregory; Janet L. Ohmann
2015-01-01
Imputation provides a useful method for mapping forest attributes across broad geographic areas based on field plot measurements and Landsat multi-spectral data, but the resulting map products may be of limited use without corresponding analyses of uncertainties in predictions. In the case of k-nearest neighbor (kNN) imputation with k = 1, such as the Gradient Nearest...
NASA Astrophysics Data System (ADS)
Hsieh, Jui-Hua; Wang, Xiang S.; Teotico, Denise; Golbraikh, Alexander; Tropsha, Alexander
2008-09-01
The use of inaccurate scoring functions in docking algorithms may result in the selection of compounds with high predicted binding affinity that nevertheless are known experimentally not to bind to the target receptor. Such falsely predicted binders have been termed `binding decoys'. We posed a question as to whether true binders and decoys could be distinguished based only on their structural chemical descriptors using approaches commonly used in ligand based drug design. We have applied the k-Nearest Neighbor ( kNN) classification QSAR approach to a dataset of compounds characterized as binders or binding decoys of AmpC beta-lactamase. Models were subjected to rigorous internal and external validation as part of our standard workflow and a special QSAR modeling scheme was employed that took into account the imbalanced ratio of inhibitors to non-binders (1:4) in this dataset. 342 predictive models were obtained with correct classification rate (CCR) for both training and test sets as high as 0.90 or higher. The prediction accuracy was as high as 100% (CCR = 1.00) for the external validation set composed of 10 compounds (5 true binders and 5 decoys) selected randomly from the original dataset. For an additional external set of 50 known non-binders, we have achieved the CCR of 0.87 using very conservative model applicability domain threshold. The validated binary kNN QSAR models were further employed for mining the NCGC AmpC screening dataset (69653 compounds). The consensus prediction of 64 compounds identified as screening hits in the AmpC PubChem assay disagreed with their annotation in PubChem but was in agreement with the results of secondary assays. At the same time, 15 compounds were identified as potential binders contrary to their annotation in PubChem. Five of them were tested experimentally and showed inhibitory activities in millimolar range with the highest binding constant Ki of 135 μM. Our studies suggest that validated QSAR models could complement structure based docking and scoring approaches in identifying promising hits by virtual screening of molecular libraries.
[Terahertz Spectroscopic Identification with Deep Belief Network].
Ma, Shuai; Shen, Tao; Wang, Rui-qi; Lai, Hua; Yu, Zheng-tao
2015-12-01
Feature extraction and classification are the key issues of terahertz spectroscopy identification. Because many materials have no apparent absorption peaks in the terahertz band, it is difficult to extract theirs terahertz spectroscopy feature and identify. To this end, a novel of identify terahertz spectroscopy approach with Deep Belief Network (DBN) was studied in this paper, which combines the advantages of DBN and K-Nearest Neighbors (KNN) classifier. Firstly, cubic spline interpolation and S-G filter were used to normalize the eight kinds of substances (ATP, Acetylcholine Bromide, Bifenthrin, Buprofezin, Carbazole, Bleomycin, Buckminster and Cylotriphosphazene) terahertz transmission spectra in the range of 0.9-6 THz. Secondly, the DBN model was built by two restricted Boltzmann machine (RBM) and then trained layer by layer using unsupervised approach. Instead of using handmade features, the DBN was employed to learn suitable features automatically with raw input data. Finally, a KNN classifier was applied to identify the terahertz spectrum. Experimental results show that using the feature learned by DBN can identify the terahertz spectrum of different substances with the recognition rate of over 90%, which demonstrates that the proposed method can automatically extract the effective features of terahertz spectrum. Furthermore, this KNN classifier was compared with others (BP neural network, SOM neural network and RBF neural network). Comparisons showed that the recognition rate of KNN classifier is better than the other three classifiers. Using the approach that automatic extract terahertz spectrum features by DBN can greatly reduce the workload of feature extraction. This proposed method shows a promising future in the application of identifying the mass terahertz spectroscopy.
Majid, Abdul; Ali, Safdar; Iqbal, Mubashar; Kausar, Nabeela
2014-03-01
This study proposes a novel prediction approach for human breast and colon cancers using different feature spaces. The proposed scheme consists of two stages: the preprocessor and the predictor. In the preprocessor stage, the mega-trend diffusion (MTD) technique is employed to increase the samples of the minority class, thereby balancing the dataset. In the predictor stage, machine-learning approaches of K-nearest neighbor (KNN) and support vector machines (SVM) are used to develop hybrid MTD-SVM and MTD-KNN prediction models. MTD-SVM model has provided the best values of accuracy, G-mean and Matthew's correlation coefficient of 96.71%, 96.70% and 71.98% for cancer/non-cancer dataset, breast/non-breast cancer dataset and colon/non-colon cancer dataset, respectively. We found that hybrid MTD-SVM is the best with respect to prediction performance and computational cost. MTD-KNN model has achieved moderately better prediction as compared to hybrid MTD-NB (Naïve Bayes) but at the expense of higher computing cost. MTD-KNN model is faster than MTD-RF (random forest) but its prediction is not better than MTD-RF. To the best of our knowledge, the reported results are the best results, so far, for these datasets. The proposed scheme indicates that the developed models can be used as a tool for the prediction of cancer. This scheme may be useful for study of any sequential information such as protein sequence or any nucleic acid sequence. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Automated classification of neurological disorders of gait using spatio-temporal gait parameters.
Pradhan, Cauchy; Wuehr, Max; Akrami, Farhoud; Neuhaeusser, Maximilian; Huth, Sabrina; Brandt, Thomas; Jahn, Klaus; Schniepp, Roman
2015-04-01
Automated pattern recognition systems have been used for accurate identification of neurological conditions as well as the evaluation of the treatment outcomes. This study aims to determine the accuracy of diagnoses of (oto-)neurological gait disorders using different types of automated pattern recognition techniques. Clinically confirmed cases of phobic postural vertigo (N = 30), cerebellar ataxia (N = 30), progressive supranuclear palsy (N = 30), bilateral vestibulopathy (N = 30), as well as healthy subjects (N = 30) were recruited for the study. 8 measurements with 136 variables using a GAITRite(®) sensor carpet were obtained from each subject. Subjects were randomly divided into two groups (training cases and validation cases). Sensitivity and specificity of k-nearest neighbor (KNN), naive-bayes classifier (NB), artificial neural network (ANN), and support vector machine (SVM) in classifying the validation cases were calculated. ANN and SVM had the highest overall sensitivity with 90.6% and 92.0% respectively, followed by NB (76.0%) and KNN (73.3%). SVM and ANN showed high false negative rates for bilateral vestibulopathy cases (20.0% and 26.0%); while KNN and NB had high false negative rates for progressive supranuclear palsy cases (76.7% and 40.0%). Automated pattern recognition systems are able to identify pathological gait patterns and establish clinical diagnosis with good accuracy. SVM and ANN in particular differentiate gait patterns of several distinct oto-neurological disorders of gait with high sensitivity and specificity compared to KNN and NB. Both SVM and ANN appear to be a reliable diagnostic and management tool for disorders of gait. Copyright © 2015 Elsevier Ltd. All rights reserved.
EEG-based mild depressive detection using feature selection methods and classifiers.
Li, Xiaowei; Hu, Bin; Sun, Shuting; Cai, Hanshu
2016-11-01
Depression has become a major health burden worldwide, and effectively detection of such disorder is a great challenge which requires latest technological tool, such as Electroencephalography (EEG). This EEG-based research seeks to find prominent frequency band and brain regions that are most related to mild depression, as well as an optimal combination of classification algorithms and feature selection methods which can be used in future mild depression detection. An experiment based on facial expression viewing task (Emo_block and Neu_block) was conducted, and EEG data of 37 university students were collected using a 128 channel HydroCel Geodesic Sensor Net (HCGSN). For discriminating mild depressive patients and normal controls, BayesNet (BN), Support Vector Machine (SVM), Logistic Regression (LR), k-nearest neighbor (KNN) and RandomForest (RF) classifiers were used. And BestFirst (BF), GreedyStepwise (GSW), GeneticSearch (GS), LinearForwordSelection (LFS) and RankSearch (RS) based on Correlation Features Selection (CFS) were applied for linear and non-linear EEG features selection. Independent Samples T-test with Bonferroni correction was used to find the significantly discriminant electrodes and features. Data mining results indicate that optimal performance is achieved using a combination of feature selection method GSW based on CFS and classifier KNN for beta frequency band. Accuracies achieved 92.00% and 98.00%, and AUC achieved 0.957 and 0.997, for Emo_block and Neu_block beta band data respectively. T-test results validate the effectiveness of selected features by search method GSW. Simplified EEG system with only FP1, FP2, F3, O2, T3 electrodes was also explored with linear features, which yielded accuracies of 91.70% and 96.00%, AUC of 0.952 and 0.972, for Emo_block and Neu_block respectively. Classification results obtained by GSW + KNN are encouraging and better than previously published results. In the spatial distribution of features, we find that left parietotemporal lobe in beta EEG frequency band has greater effect on mild depression detection. And fewer EEG channels (FP1, FP2, F3, O2 and T3) combined with linear features may be good candidates for usage in portable systems for mild depression detection. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Classification of Malaysia aromatic rice using multivariate statistical analysis
NASA Astrophysics Data System (ADS)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy training time, and prone to fatigue as the number of sample increased and inconsistent. The GC-MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
Classification of diabetic retinopathy using fractal dimension analysis of eye fundus image
NASA Astrophysics Data System (ADS)
Safitri, Diah Wahyu; Juniati, Dwi
2017-08-01
Diabetes Mellitus (DM) is a metabolic disorder when pancreas produce inadequate insulin or a condition when body resist insulin action, so the blood glucose level is high. One of the most common complications of diabetes mellitus is diabetic retinopathy which can lead to a vision problem. Diabetic retinopathy can be recognized by an abnormality in eye fundus. Those abnormalities are characterized by microaneurysms, hemorrhage, hard exudate, cotton wool spots, and venous's changes. The diabetic retinopathy is classified depends on the conditions of abnormality in eye fundus, that is grade 1 if there is a microaneurysm only in the eye fundus; grade 2, if there are a microaneurysm and a hemorrhage in eye fundus; and grade 3: if there are microaneurysm, hemorrhage, and neovascularization in the eye fundus. This study proposed a method and a process of eye fundus image to classify of diabetic retinopathy using fractal analysis and K-Nearest Neighbor (KNN). The first phase was image segmentation process using green channel, CLAHE, morphological opening, matched filter, masking, and morphological opening binary image. After segmentation process, its fractal dimension was calculated using box-counting method and the values of fractal dimension were analyzed to make a classification of diabetic retinopathy. Tests carried out by used k-fold cross validation method with k=5. In each test used 10 different grade K of KNN. The accuracy of the result of this method is 89,17% with K=3 or K=4, it was the best results than others K value. Based on this results, it can be concluded that the classification of diabetic retinopathy using fractal analysis and KNN had a good performance.
NASA Astrophysics Data System (ADS)
Lazcano, R.; Madroñal, D.; Fabelo, H.; Ortega, S.; Salvador, R.; Callicó, G. M.; Juárez, E.; Sanz, C.
2017-10-01
Hyperspectral Imaging (HI) assembles high resolution spectral information from hundreds of narrow bands across the electromagnetic spectrum, thus generating 3D data cubes in which each pixel gathers the spectral information of the reflectance of every spatial pixel. As a result, each image is composed of large volumes of data, which turns its processing into a challenge, as performance requirements have been continuously tightened. For instance, new HI applications demand real-time responses. Hence, parallel processing becomes a necessity to achieve this requirement, so the intrinsic parallelism of the algorithms must be exploited. In this paper, a spatial-spectral classification approach has been implemented using a dataflow language known as RVCCAL. This language represents a system as a set of functional units, and its main advantage is that it simplifies the parallelization process by mapping the different blocks over different processing units. The spatial-spectral classification approach aims at refining the classification results previously obtained by using a K-Nearest Neighbors (KNN) filtering process, in which both the pixel spectral value and the spatial coordinates are considered. To do so, KNN needs two inputs: a one-band representation of the hyperspectral image and the classification results provided by a pixel-wise classifier. Thus, spatial-spectral classification algorithm is divided into three different stages: a Principal Component Analysis (PCA) algorithm for computing the one-band representation of the image, a Support Vector Machine (SVM) classifier, and the KNN-based filtering algorithm. The parallelization of these algorithms shows promising results in terms of computational time, as the mapping of them over different cores presents a speedup of 2.69x when using 3 cores. Consequently, experimental results demonstrate that real-time processing of hyperspectral images is achievable.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md
Aromatic rice (Oryza sativa L.) is considered as the best quality premium rice. The varieties are preferred by consumers because of its preference criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than ordinary rice due to its special needed growth condition for instance specific climate and soil. Presently, the aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, the uses of human sensory panels have significant drawbacks such as lengthy trainingmore » time, and prone to fatigue as the number of sample increased and inconsistent. The GC–MS analysis techniques on the other hand, require detailed procedures, lengthy analysis and quite costly. This paper presents the application of in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN) to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. The visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.« less
NASA Astrophysics Data System (ADS)
Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying
2018-06-01
In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gupta, Shashaank; Belianinov, Alex; Okatan, Mahmut B
(001)pc textured K0.5Na0.5NbO3 (KNN) ceramic was found to exhibit a 65% improvement in the longitudinal piezoelectric response as compared to its random counterpart. Piezoresponse force microscopy study revealed the existence of larger 180 and non-180 domains for textured ceramic as compared to that of the random ceramic. Improvement in piezoresponse by the development of (001)pc texture is discussed in terms of the crystallographic nature of KNN and domain morphology. A comparative analysis performed with a rhombohedral composition suggested that the improvement in longitudinal piezoresponse of polycrystalline ceramics by the development of (001)pc texture is limited by the crystal structure.
Optimum Multisensor, Multitarget Localization and Tracking.
1983-06-07
parameter vector t is given by (see Equation (3.5.1-7)’ the simul- taneous solution of A(e) N B G --1 ae &j’ -4n-in (fk’ ijn k3jn ~ k )kjn kjn - knn =1 k...the coefficient of mutual dependence given by M = 12 -(K-2) 121M12 :(3l 11J12 ) (K-2 and Jij is given by (see Equation (6.4.1-2)) - - (_ I R knN kn K...Transactions on Acoustic, Speech and Signal Processing, Vol ASSP-29, No. 3, June 1981. 17. B. Friedlander, "An ARMA Modeling Approach to Multitarget Tracking
Gunavathi, Chellamuthu; Premalatha, Kandasamy
2014-01-01
Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL.
Chen, Xue; Li, Xiaohui; Yang, Sibo; Yu, Xin; Liu, Aichun
2018-01-01
Lymphoma is a significant cancer that affects the human lymphatic and hematopoietic systems. In this work, discrimination of lymphoma using laser-induced breakdown spectroscopy (LIBS) conducted on whole blood samples is presented. The whole blood samples collected from lymphoma patients and healthy controls are deposited onto standard quantitative filter papers and ablated with a 1064 nm Q-switched Nd:YAG laser. 16 atomic and ionic emission lines of calcium (Ca), iron (Fe), magnesium (Mg), potassium (K) and sodium (Na) are selected to discriminate the cancer disease. Chemometric methods, including principal component analysis (PCA), linear discriminant analysis (LDA) classification, and k nearest neighbor (kNN) classification are used to build the discrimination models. Both LDA and kNN models have achieved very good discrimination performances for lymphoma, with an accuracy of over 99.7%, a sensitivity of over 0.996, and a specificity of over 0.997. These results demonstrate that the whole-blood-based LIBS technique in combination with chemometric methods can serve as a fast, less invasive, and accurate method for detection and discrimination of human malignancies. PMID:29541503
Preserved Network Metrics across Translated Texts
NASA Astrophysics Data System (ADS)
Cabatbat, Josephine Jill T.; Monsanto, Jica P.; Tapang, Giovanni A.
2014-09-01
Co-occurrence language networks based on Bible translations and the Universal Declaration of Human Rights (UDHR) translations in different languages were constructed and compared with random text networks. Among the considered network metrics, the network size, N, the normalized betweenness centrality (BC), and the average k-nearest neighbors, knn, were found to be the most preserved across translations. Moreover, similar frequency distributions of co-occurring network motifs were observed for translated texts networks.
Lekha, C S Chitra; Kumar, Ajith S; Vivek, S; Rasi, U P Mohammed; Saravanan, K Venkata; Nandakumar, K; Nair, Swapna S
2017-02-03
Harvesting energy from surrounding vibrations and developing self-powered portable devices for wireless and mobile electronics have recently become popular. Here the authors demonstrate the synthesis of piezoelectric energy harvesters based on nanotube arrays by a wet chemical route, which requires no sophisticated instruments. The energy harvester gives an output voltage of 400 mV. Harvesting energy from a sinusoidal magnetic field is another interesting phenomenon for which the authors fabricated a magnetoelectric energy harvester based on piezoelectric-magnetostrictive coaxial nanotube arrays. Piezoelectric K 0.5 Na 0.5 NbO 3 (KNN) is fabricated as the shell and magnetostrictive CoFe 2 O 4 (CFO) as the core of the composite coaxial nanotubes. The delivered voltages are as high as 300 mV at 500 Hz and at a weak ac magnetic field of 100 Oe. Further tailoring of the thickness of the piezoelectric and magnetic layers can enhance the output voltage by several orders. Easy, single-step wet chemical synthesis enhances the industrial upscaling potential of these nanotubes as energy harvesters. In view of the excellent properties reported here, the lead-free piezoelectric component (KNN) in this nanocomposite should be explored for eco-friendly piezoelectric as well as magnetoelectric power generators in nanoelectromechanical systems (NEMS).
NASA Astrophysics Data System (ADS)
Lekha, C. S. Chitra; Kumar, Ajith S.; Vivek, S.; Rasi, U. P. Mohammed; Venkata Saravanan, K.; Nandakumar, K.; Nair, Swapna S.
2017-02-01
Harvesting energy from surrounding vibrations and developing self-powered portable devices for wireless and mobile electronics have recently become popular. Here the authors demonstrate the synthesis of piezoelectric energy harvesters based on nanotube arrays by a wet chemical route, which requires no sophisticated instruments. The energy harvester gives an output voltage of 400 mV. Harvesting energy from a sinusoidal magnetic field is another interesting phenomenon for which the authors fabricated a magnetoelectric energy harvester based on piezoelectric-magnetostrictive coaxial nanotube arrays. Piezoelectric K0.5Na0.5NbO3 (KNN) is fabricated as the shell and magnetostrictive CoFe2O4 (CFO) as the core of the composite coaxial nanotubes. The delivered voltages are as high as 300 mV at 500 Hz and at a weak ac magnetic field of 100 Oe. Further tailoring of the thickness of the piezoelectric and magnetic layers can enhance the output voltage by several orders. Easy, single-step wet chemical synthesis enhances the industrial upscaling potential of these nanotubes as energy harvesters. In view of the excellent properties reported here, the lead-free piezoelectric component (KNN) in this nanocomposite should be explored for eco-friendly piezoelectric as well as magnetoelectric power generators in nanoelectromechanical systems (NEMS).
Deng, Changjian; Lv, Kun; Shi, Debo; Yang, Bo; Yu, Song; He, Zhiyi; Yan, Jia
2018-06-12
In this paper, a novel feature selection and fusion framework is proposed to enhance the discrimination ability of gas sensor arrays for odor identification. Firstly, we put forward an efficient feature selection method based on the separability and the dissimilarity to determine the feature selection order for each type of feature when increasing the dimension of selected feature subsets. Secondly, the K-nearest neighbor (KNN) classifier is applied to determine the dimensions of the optimal feature subsets for different types of features. Finally, in the process of establishing features fusion, we come up with a classification dominance feature fusion strategy which conducts an effective basic feature. Experimental results on two datasets show that the recognition rates of Database I and Database II achieve 97.5% and 80.11%, respectively, when k = 1 for KNN classifier and the distance metric is correlation distance (COR), which demonstrates the superiority of the proposed feature selection and fusion framework in representing signal features. The novel feature selection method proposed in this paper can effectively select feature subsets that are conducive to the classification, while the feature fusion framework can fuse various features which describe the different characteristics of sensor signals, for enhancing the discrimination ability of gas sensors and, to a certain extent, suppressing drift effect.
Aesthetic preference recognition of 3D shapes using EEG.
Chew, Lin Hou; Teo, Jason; Mountstephens, James
2016-04-01
Recognition and identification of aesthetic preference is indispensable in industrial design. Humans tend to pursue products with aesthetic values and make buying decisions based on their aesthetic preferences. The existence of neuromarketing is to understand consumer responses toward marketing stimuli by using imaging techniques and recognition of physiological parameters. Numerous studies have been done to understand the relationship between human, art and aesthetics. In this paper, we present a novel preference-based measurement of user aesthetics using electroencephalogram (EEG) signals for virtual 3D shapes with motion. The 3D shapes are designed to appear like bracelets, which is generated by using the Gielis superformula. EEG signals were collected by using a medical grade device, the B-Alert X10 from advance brain monitoring, with a sampling frequency of 256 Hz and resolution of 16 bits. The signals obtained when viewing 3D bracelet shapes were decomposed into alpha, beta, theta, gamma and delta rhythm by using time-frequency analysis, then classified into two classes, namely like and dislike by using support vector machines and K-nearest neighbors (KNN) classifiers respectively. Classification accuracy of up to 80 % was obtained by using KNN with the alpha, theta and delta rhythms as the features extracted from frontal channels, Fz, F3 and F4 to classify two classes, like and dislike.
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael J.; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.
An Approach to Biometric Verification Based on Human Body Communication in Wearable Devices
Li, Jingzhen; Liu, Yuhang; Nie, Zedong; Qin, Wenjian; Pang, Zengyao; Wang, Lei
2017-01-01
In this paper, an approach to biometric verification based on human body communication (HBC) is presented for wearable devices. For this purpose, the transmission gain S21 of volunteer’s forearm is measured by vector network analyzer (VNA). Specifically, in order to determine the chosen frequency for biometric verification, 1800 groups of data are acquired from 10 volunteers in the frequency range 0.3 MHz to 1500 MHz, and each group includes 1601 sample data. In addition, to achieve the rapid verification, 30 groups of data for each volunteer are acquired at the chosen frequency, and each group contains only 21 sample data. Furthermore, a threshold-adaptive template matching (TATM) algorithm based on weighted Euclidean distance is proposed for rapid verification in this work. The results indicate that the chosen frequency for biometric verification is from 650 MHz to 750 MHz. The false acceptance rate (FAR) and false rejection rate (FRR) based on TATM are approximately 5.79% and 6.74%, respectively. In contrast, the FAR and FRR were 4.17% and 37.5%, 3.37% and 33.33%, and 3.80% and 34.17% using K-nearest neighbor (KNN) classification, support vector machines (SVM), and naive Bayesian method (NBM) classification, respectively. In addition, the running time of TATM is 0.019 s, whereas the running times of KNN, SVM and NBM are 0.310 s, 0.0385 s, and 0.168 s, respectively. Therefore, TATM is suggested to be appropriate for rapid verification use in wearable devices. PMID:28075375
An Approach to Biometric Verification Based on Human Body Communication in Wearable Devices.
Li, Jingzhen; Liu, Yuhang; Nie, Zedong; Qin, Wenjian; Pang, Zengyao; Wang, Lei
2017-01-10
In this paper, an approach to biometric verification based on human body communication (HBC) is presented for wearable devices. For this purpose, the transmission gain S21 of volunteer's forearm is measured by vector network analyzer (VNA). Specifically, in order to determine the chosen frequency for biometric verification, 1800 groups of data are acquired from 10 volunteers in the frequency range 0.3 MHz to 1500 MHz, and each group includes 1601 sample data. In addition, to achieve the rapid verification, 30 groups of data for each volunteer are acquired at the chosen frequency, and each group contains only 21 sample data. Furthermore, a threshold-adaptive template matching (TATM) algorithm based on weighted Euclidean distance is proposed for rapid verification in this work. The results indicate that the chosen frequency for biometric verification is from 650 MHz to 750 MHz. The false acceptance rate (FAR) and false rejection rate (FRR) based on TATM are approximately 5.79% and 6.74%, respectively. In contrast, the FAR and FRR were 4.17% and 37.5%, 3.37% and 33.33%, and 3.80% and 34.17% using K-nearest neighbor (KNN) classification, support vector machines (SVM), and naive Bayesian method (NBM) classification, respectively. In addition, the running time of TATM is 0.019 s, whereas the running times of KNN, SVM and NBM are 0.310 s, 0.0385 s, and 0.168 s, respectively. Therefore, TATM is suggested to be appropriate for rapid verification use in wearable devices.
Jakes, Peter; Kungl, Hans; Schierholz, Roland; Eichel, Rüdiger-A
2014-09-01
The defect structure for copper-doped sodium potassium niobate (KNN) ferroelectrics has been analyzed with respect to its defect structure. In particular, the interplay between the mutually compensating dimeric (Cu(Nb)'''-V(O)··) and trimeric (V(O)··-Cu(Nb)'''-V(O)··)· defect complexes with 180° and non-180° domain walls has been analyzed and compared to the effects from (Cu'' - V(O)··)(x)× dipoles in CuO-doped lead zirconate titanate (PZT). Attempts are made to relate the rearrangement of defect complexes to macroscopic electromechanical properties.
NASA Astrophysics Data System (ADS)
Yan, Tianxiang; Han, Feifei; Ren, Shaokai; Ma, Xing; Fang, Liang; Liu, Laijun; Kuang, Xiaojun; Elouadi, Brahim
2018-04-01
(1 - x)K0.5Na0.5NbO3- x(Bi0.5Li0.5)ZrO3 (labeled as (1 - x)KNN- xBLZ) lead-free ceramics were fabricated by a solid-state reaction method. A research was conducted on the effects of BLZ content on structure, dielectric properties and relaxation behavior of KNN ceramics. By combining the X-ray diffraction patterns with the temperature dependence of dielectric properties, an orthorhombic-tetragonal phase coexistence was identified for x = 0.03, a tetragonal phase was determined for x = 0.05, and a single rhombohedral structure occurred at x = 0.08. The 0.92KNN-0.08BLZ ceramic exhibits a high and stable permittivity ( 1317, ± 15% variation) from 55 to 445 °C and low dielectric loss (≤ 6%) from 120 to 400 °C, which is hugely attractive for high-temperature capacitors. Activation energies of both high-temperature dielectric relaxation and dc conductivity first increase and then decline with the increase of BLZ, which might be attributed to the lattice distortion and concentration of oxygen vacancies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wei, Yongbin; Jia, Yanmin, E-mail: wuzheng@zjnu.cn, E-mail: ymjia@zjnu.edu.cn; Wu, Jiang
A mutual enhancement action between the ferro-/piezoelectric polarization and the photoluminescent performance of rare earth Pr{sup 3+} doped (K{sub 0.5}Na{sub 0.5})NbO{sub 3} (KNN) lead-free ceramics is reported. After Pr{sup 3+} doping, the KNN ceramics exhibit the maximum enhancement of ∼1.2 times in the ferroelectric remanent polarization strength and ∼1.25 times in the piezoelectric coefficient d{sub 33}, respectively. Furthermore, after undergoing a ferro-/piezoelectric polarization treatment, the maximum enhancement of ∼1.3 times in photoluminescence (PL) was observed in the poled 0.3% Pr{sup 3+} doped sample. After the trivalent Pr{sup 3+} unequivalently substituting the univalent (K{sub 0.5}Na{sub 0.5}){sup +}, A-sites ionic vacancies willmore » occur to maintain charge neutrality, which may reduce the inner stress and ease the domain wall motions, yielding to the enhancement in ferro-/piezoelectric performance. The polarization-induced enhancement in PL is attributed to the decrease of crystal symmetry abound the Pr{sup 3+} ions after polarization. The dual-enhancement of the ferro-/piezoelectric and photoluminescent performance makes the Pr{sup 3+} doped KNN ceramic hopeful for piezoelectric/luminescent multifunctional devices.« less
Wind data mining by Kohonen Neural Networks.
Fayos, José; Fayos, Carolina
2007-02-14
Time series of Circulation Weather Type (CWT), including daily averaged wind direction and vorticity, are self-classified by similarity using Kohonen Neural Networks (KNN). It is shown that KNN is able to map by similarity all 7300 five-day CWT sequences during the period of 1975-94, in London, United Kingdom. It gives, as a first result, the most probable wind sequences preceding each one of the 27 CWT Lamb classes in that period. Inversely, as a second result, the observed diffuse correlation between both five-day CWT sequences and the CWT of the 6(th) day, in the long 20-year period, can be generalized to predict the last from the previous CWT sequence in a different test period, like 1995, as both time series are similar. Although the average prediction error is comparable to that obtained by forecasting standard methods, the KNN approach gives complementary results, as they depend only on an objective classification of observed CWT data, without any model assumption. The 27 CWT of the Lamb Catalogue were coded with binary three-dimensional vectors, pointing to faces, edges and vertex of a "wind-cube," so that similar CWT vectors were close.
Sequential Adaptive Multi-Modality Target Detection and Classification Using Physics Based Models
2006-09-01
estimation," R. Raghuram, R. Raich and A.O. Hero, IEEE Intl. Conf. on Acoustics, Speech , and Signal Processing, Toulouse France, June 2006, <http...can then be solved using off-the-shelf classifiers such as radial basis functions, SVM, or kNN classifier structures. When applied to mine detection we...stage waveform selection for adaptive resource constrained state estimation," 2006 IEEE Intl. Conf. on Acoustics, Speech , and Signal Processing
NASA Astrophysics Data System (ADS)
Qi, Yong; Lei, Kai; Zhang, Lizeqing; Xing, Ximing; Gou, Wenyue
2018-06-01
This paper introduced the development of a self-serving medical data assisted diagnosis software of cervical cancer on the basis of artificial neural network (SVN, FNN, KNN). The system is developed based on the idea of self-service platform, supported by the application and innovation of neural network algorithm in medical data identification. Furthermore, it combined the advanced methods in various fields to effectively solve the complicated and inaccurate problem of cervical canceration data in the traditional manual treatment.
Fast Query-Optimized Kernel-Machine Classification
NASA Technical Reports Server (NTRS)
Mazzoni, Dominic; DeCoste, Dennis
2004-01-01
A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data. SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic curse of dimensionality. However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives. The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only small fraction of the nearest support vectors (SVs) of a query. The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs. Before query time, there are gathered statistics about how misleading the output of the k-NN model can be, relative to the outputs of the exact KM for a representative set of examples, for each possible k from 1 to the total number of SVs. From these statistics, there are derived upper and lower thresholds for each step k. These thresholds identify output levels for which the particular variant of the k-NN model already leans so strongly positively or negatively that a reversal in sign is unlikely, given the weaker SV neighbors still remaining. At query time, the partial output of each query is incrementally updated, stopping as soon as it exceeds the predetermined statistical thresholds of the current step. For an easy query, stopping can occur as early as step k = 1. For more difficult queries, stopping might not occur until nearly all SVs are touched. A key empirical observation is that this approach can tolerate very approximate nearest-neighbor orderings. In experiments, SVs and queries were projected to a subspace comprising the top few principal- component dimensions and neighbor orderings were computed in that subspace. This approach ensured that the overhead of the nearest-neighbor computations was insignificant, relative to that of the exact KM computation.
An Efficient Statistical Computation Technique for Health Care Big Data using R
NASA Astrophysics Data System (ADS)
Sushma Rani, N.; Srinivasa Rao, P., Dr; Parimala, P.
2017-08-01
Due to the changes in living conditions and other factors many critical health related problems are arising. The diagnosis of the problem at earlier stages will increase the chances of survival and fast recovery. This reduces the time of recovery and the cost associated for the treatment. One such medical related issue is cancer and breast cancer has been identified as the second leading cause of cancer death. If detected in the early stage it can be cured. Once a patient is detected with breast cancer tumor, it should be classified whether it is cancerous or non-cancerous. So the paper uses k-nearest neighbors(KNN) algorithm which is one of the simplest machine learning algorithms and is an instance-based learning algorithm to classify the data. Day-to -day new records are added which leds to increase in the data to be classified and this tends to be big data problem. The algorithm is implemented in R whichis the most popular platform applied to machine learning algorithms for statistical computing. Experimentation is conducted by using various classification evaluation metric onvarious values of k. The results show that the KNN algorithm out performes better than existing models.
Machine Learning and Computer Vision System for Phenotype Data Acquisition and Analysis in Plants.
Navarro, Pedro J; Pérez, Fernando; Weiss, Julia; Egea-Cortines, Marcos
2016-05-05
Phenomics is a technology-driven approach with promising future to obtain unbiased data of biological systems. Image acquisition is relatively simple. However data handling and analysis are not as developed compared to the sampling capacities. We present a system based on machine learning (ML) algorithms and computer vision intended to solve the automatic phenotype data analysis in plant material. We developed a growth-chamber able to accommodate species of various sizes. Night image acquisition requires near infrared lightning. For the ML process, we tested three different algorithms: k-nearest neighbour (kNN), Naive Bayes Classifier (NBC), and Support Vector Machine. Each ML algorithm was executed with different kernel functions and they were trained with raw data and two types of data normalisation. Different metrics were computed to determine the optimal configuration of the machine learning algorithms. We obtained a performance of 99.31% in kNN for RGB images and a 99.34% in SVM for NIR. Our results show that ML techniques can speed up phenomic data analysis. Furthermore, both RGB and NIR images can be segmented successfully but may require different ML algorithms for segmentation.
Automated Identification of Abnormal Adult EEGs
López, S.; Suarez, G.; Jungreis, D.; Obeid, I.; Picone, J.
2016-01-01
The interpretation of electroencephalograms (EEGs) is a process that is still dependent on the subjective analysis of the examiners. Though interrater agreement on critical events such as seizures is high, it is much lower on subtler events (e.g., when there are benign variants). The process used by an expert to interpret an EEG is quite subjective and hard to replicate by machine. The performance of machine learning technology is far from human performance. We have been developing an interpretation system, AutoEEG, with a goal of exceeding human performance on this task. In this work, we are focusing on one of the early decisions made in this process – whether an EEG is normal or abnormal. We explore two baseline classification algorithms: k-Nearest Neighbor (kNN) and Random Forest Ensemble Learning (RF). A subset of the TUH EEG Corpus was used to evaluate performance. Principal Components Analysis (PCA) was used to reduce the dimensionality of the data. kNN achieved a 41.8% detection error rate while RF achieved an error rate of 31.7%. These error rates are significantly lower than those obtained by random guessing based on priors (49.5%). The majority of the errors were related to misclassification of normal EEGs. PMID:27195311
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shevchenko, N. V.; Mares, J.; Gal, A.
Coupled-channels three-body calculations of an I=1/2,J{sup {pi}}=0{sup -} KNN quasibound state in the KNN-{pi}{sigma}N system were performed and the dependence of the resulting three-body energy on the two-body KN-{pi}{sigma} interaction was investigated. Earlier results of binding energy B{sub K{sup -}}{sub pp}{approx}50-70 MeV and width {gamma}{sub K{sup -}}{sub pp}{approx}100 MeV are confirmed [N. V. Shevchenko et al., Phys. Rev. Lett. 98, 082301 (2007)]. It is shown that a suitably constructed energy-independent complex KN potential gives a considerably shallower and narrower three-body quasibound state than the full coupled-channels calculation. Comparison with other calculations is made.
The Development of Mobile Application to Introduce Historical Monuments in Manado
NASA Astrophysics Data System (ADS)
Rupilu, Moshe Markhasi; Suyoto; Santoso, Albertus Joko
2018-02-01
Learning the historical value of a monument is important because it preserves cultural and historical values, as well as expanding our personal insight. In Indonesia, particularly in Manado, North Sulawesi, there are many monuments. The monuments are erected for history, religion, culture and past war, however these aren't written in detail in the monuments. To get information on specific monument, manual search was required, i.e. asking related people or sources. Based on the problem, the development of an application which can utilize LBS (Location Based Service) method and some algorithmic methods specifically designed for mobile devices such as Smartphone, was required so that information on every monument in Manado can be displayed in detail using GPS coordinate. The application was developed by KNN method with K-means algorithm and collaborative filtering to recommend monument information to tourist. Tourists will get recommended options filtered by distance. Then, this method was also used to look for the closest monument from user. KNN algorithm determines the closest location by making comparisons according to calculation of longitude and latitude of several monuments tourist wants to visit. With this application, tourists who want to know and find information on monuments in Manado can do them easily and quickly because monument information is recommended directly to user without having to make selection. Moreover, tourist can see recommended monument information and search several monuments in Manado in real time.
Frog sound identification using extended k-nearest neighbor classifier
NASA Astrophysics Data System (ADS)
Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati
2017-09-01
Frog sound identification based on the vocalization becomes important for biological research and environmental monitoring. As a result, different types of feature extractions and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents a frog sound identification with Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and mutual sharing of neighborhood concepts, with the aims of improving the classification performance. It makes a prediction based on who are the nearest neighbors of the testing sample and who consider the testing sample as their nearest neighbors. In order to evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifier, k -Nearest Neighbor (KNN), Fuzzy k -Nearest Neighbor (FKNN) k - General Nearest Neighbor (KGNN)and Mutual k -Nearest Neighbor (MKNN) on the recorded sounds of 15 frog species obtained in Malaysia forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR) while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results have shown that the EKNCN classifier exhibits the best performance in terms of accuracy compared to the competing classifiers, KNN, FKNN, GKNN and MKNN for all cases.
NASA Astrophysics Data System (ADS)
Oung, Qi Wei; Nisha Basah, Shafriza; Muthusamy, Hariharan; Vijean, Vikneswaran; Lee, Hoileong
2018-03-01
Parkinson’s disease (PD) is one type of progressive neurodegenerative disease known as motor system syndrome, which is due to the death of dopamine-generating cells, a region of the human midbrain. PD normally affects people over 60 years of age, which at present has influenced a huge part of worldwide population. Lately, many researches have shown interest into the connection between PD and speech disorders. Researches have revealed that speech signals may be a suitable biomarker for distinguishing between people with Parkinson’s (PWP) from healthy subjects. Therefore, early diagnosis of PD through the speech signals can be considered for this aim. In this research, the speech data are acquired based on speech behaviour as the biomarker for differentiating PD severity levels (mild and moderate) from healthy subjects. Feature extraction algorithms applied are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), and Weighted Linear Prediction Cepstral Coefficients (WLPCC). For classification, two types of classifiers are used: k-Nearest Neighbour (KNN) and Probabilistic Neural Network (PNN). The experimental results demonstrated that PNN classifier and KNN classifier achieve the best average classification performance of 92.63% and 88.56% respectively through 10-fold cross-validation measures. Favourably, the suggested techniques have the possibilities of becoming a new choice of promising tools for the PD detection with tremendous performance.
Hussain, Lal
2018-06-01
Epilepsy is a neurological disorder produced due to abnormal excitability of neurons in the brain. The research reveals that brain activity is monitored through electroencephalogram (EEG) of patients suffered from seizure to detect the epileptic seizure. The performance of EEG detection based epilepsy require feature extracting strategies. In this research, we have extracted varying features extracting strategies based on time and frequency domain characteristics, nonlinear, wavelet based entropy and few statistical features. A deeper study was undertaken using novel machine learning classifiers by considering multiple factors. The support vector machine kernels are evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we computed the different distance metrics, Neighbor weights and Neighbors. Similarly, the decision trees we tuned the paramours based on maximum splits and split criteria and ensemble classifiers are evaluated based on different ensemble methods and learning rate. For training/testing tenfold Cross validation was employed and performance was evaluated in form of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis approach was performed using diverse features extracting strategies using robust machine learning classifiers with more advanced optimal options. Support Vector Machine linear kernel and KNN with City block distance metric give the overall highest accuracy of 99.5% which was higher than using the default parameters for these classifiers. Moreover, highest separation (AUC = 0.9991, 0.9990) were obtained at different kernel scales using SVM. Additionally, the K-nearest neighbors with inverse squared distance weight give higher performance at different Neighbors. Moreover, to distinguish the postictal heart rate oscillations from epileptic ictal subjects, and highest performance of 100% was obtained using different machine learning classifiers.
Automated cloud classification using a ground based infra-red camera and texture analysis techniques
NASA Astrophysics Data System (ADS)
Rumi, Emal; Kerr, David; Coupland, Jeremy M.; Sandford, Andrew P.; Brettle, Mike J.
2013-10-01
Clouds play an important role in influencing the dynamics of local and global weather and climate conditions. Continuous monitoring of clouds is vital for weather forecasting and for air-traffic control. Convective clouds such as Towering Cumulus (TCU) and Cumulonimbus clouds (CB) are associated with thunderstorms, turbulence and atmospheric instability. Human observers periodically report the presence of CB and TCU clouds during operational hours at airports and observatories; however such observations are expensive and time limited. Robust, automatic classification of cloud type using infrared ground-based instrumentation offers the advantage of continuous, real-time (24/7) data capture and the representation of cloud structure in the form of a thermal map, which can greatly help to characterise certain cloud formations. The work presented here utilised a ground based infrared (8-14 μm) imaging device mounted on a pan/tilt unit for capturing high spatial resolution sky images. These images were processed to extract 45 separate textural features using statistical and spatial frequency based analytical techniques. These features were used to train a weighted k-nearest neighbour (KNN) classifier in order to determine cloud type. Ground truth data were obtained by inspection of images captured simultaneously from a visible wavelength colour camera at the same installation, with approximately the same field of view as the infrared device. These images were classified by a trained cloud observer. Results from the KNN classifier gave an encouraging success rate. A Probability of Detection (POD) of up to 90% with a Probability of False Alarm (POFA) as low as 16% was achieved.
Yousef, Malik; Khalifa, Waleed; AbedAllah, Loai
2016-12-22
The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that ECkNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.
Yousef, Malik; Khalifa, Waleed; AbdAllah, Loai
2016-12-01
The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.
Mapping growing stock volume and forest live biomass: a case study of the Polissya region of Ukraine
NASA Astrophysics Data System (ADS)
Bilous, Andrii; Myroniuk, Viktor; Holiaka, Dmytrii; Bilous, Svitlana; See, Linda; Schepaschenko, Dmitry
2017-10-01
Forest inventory and biomass mapping are important tasks that require inputs from multiple data sources. In this paper we implement two methods for the Ukrainian region of Polissya: random forest (RF) for tree species prediction and k-nearest neighbors (k-NN) for growing stock volume and biomass mapping. We examined the suitability of the five-band RapidEye satellite image to predict the distribution of six tree species. The accuracy of RF is quite high: ~99% for forest/non-forest mask and 89% for tree species prediction. Our results demonstrate that inclusion of elevation as a predictor variable in the RF model improved the performance of tree species classification. We evaluated different distance metrics for the k-NN method, including Euclidean or Mahalanobis distance, most similar neighbor (MSN), gradient nearest neighbor, and independent component analysis. The MSN with the four nearest neighbors (k = 4) is the most precise (according to the root-mean-square deviation) for predicting forest attributes across the study area. The k-NN method allowed us to estimate growing stock volume with an accuracy of 3 m3 ha-1 and for live biomass of about 2 t ha-1 over the study area.
NASA Astrophysics Data System (ADS)
Huang, Jian; Liu, Gui-xiong
2016-09-01
The identification of targets varies in different surge tests. A multi-color space threshold segmentation and self-learning k-nearest neighbor algorithm ( k-NN) for equipment under test status identification was proposed after using feature matching to identify equipment status had to train new patterns every time before testing. First, color space (L*a*b*, hue saturation lightness (HSL), hue saturation value (HSV)) to segment was selected according to the high luminance points ratio and white luminance points ratio of the image. Second, the unknown class sample S r was classified by the k-NN algorithm with training set T z according to the feature vector, which was formed from number of pixels, eccentricity ratio, compactness ratio, and Euler's numbers. Last, while the classification confidence coefficient equaled k, made S r as one sample of pre-training set T z '. The training set T z increased to T z+1 by T z ' if T z ' was saturated. In nine series of illuminant, indicator light, screen, and disturbances samples (a total of 21600 frames), the algorithm had a 98.65%identification accuracy, also selected five groups of samples to enlarge the training set from T 0 to T 5 by itself.
Using machine learning algorithms to guide rehabilitation planning for home care clients.
Zhu, Mu; Zhang, Zhanyang; Hirdes, John P; Stolee, Paul
2007-12-20
Targeting older clients for rehabilitation is a clinical challenge and a research priority. We investigate the potential of machine learning algorithms - Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) - to guide rehabilitation planning for home care clients. This study is a secondary analysis of data on 24,724 longer-term clients from eight home care programs in Ontario. Data were collected with the RAI-HC assessment system, in which the Activities of Daily Living Clinical Assessment Protocol (ADLCAP) is used to identify clients with rehabilitation potential. For study purposes, a client is defined as having rehabilitation potential if there was: i) improvement in ADL functioning, or ii) discharge home. SVM and KNN results are compared with those obtained using the ADLCAP. For comparison, the machine learning algorithms use the same functional and health status indicators as the ADLCAP. The KNN and SVM algorithms achieved similar substantially improved performance over the ADLCAP, although false positive and false negative rates were still fairly high (FP > .18, FN > .34 versus FP > .29, FN. > .58 for ADLCAP). Results are used to suggest potential revisions to the ADLCAP. Machine learning algorithms achieved superior predictions than the current protocol. Machine learning results are less readily interpretable, but can also be used to guide development of improved clinical protocols.
NASA Astrophysics Data System (ADS)
Heidari, Morteza; Zargari Khuzani, Abolfazl; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qian, Wei; Zheng, Bin
2018-03-01
Both conventional and deep machine learning has been used to develop decision-support tools applied in medical imaging informatics. In order to take advantages of both conventional and deep learning approach, this study aims to investigate feasibility of applying a locally preserving projection (LPP) based feature regeneration algorithm to build a new machine learning classifier model to predict short-term breast cancer risk. First, a computer-aided image processing scheme was used to segment and quantify breast fibro-glandular tissue volume. Next, initially computed 44 image features related to the bilateral mammographic tissue density asymmetry were extracted. Then, an LLP-based feature combination method was applied to regenerate a new operational feature vector using a maximal variance approach. Last, a k-nearest neighborhood (KNN) algorithm based machine learning classifier using the LPP-generated new feature vectors was developed to predict breast cancer risk. A testing dataset involving negative mammograms acquired from 500 women was used. Among them, 250 were positive and 250 remained negative in the next subsequent mammography screening. Applying to this dataset, LLP-generated feature vector reduced the number of features from 44 to 4. Using a leave-onecase-out validation method, area under ROC curve produced by the KNN classifier significantly increased from 0.62 to 0.68 (p < 0.05) and odds ratio was 4.60 with a 95% confidence interval of [3.16, 6.70]. Study demonstrated that this new LPP-based feature regeneration approach enabled to produce an optimal feature vector and yield improved performance in assisting to predict risk of women having breast cancer detected in the next subsequent mammography screening.
Wen, Tingxi; Zhang, Zhongnan; Qiu, Ming; Zeng, Ming; Luo, Weizhen
2017-01-01
The computer mouse is an important human-computer interaction device. But patients with physical finger disability are unable to operate this device. Surface EMG (sEMG) can be monitored by electrodes on the skin surface and is a reflection of the neuromuscular activities. Therefore, we can control limbs auxiliary equipment by utilizing sEMG classification in order to help the physically disabled patients to operate the mouse. To develop a new a method to extract sEMG generated by finger motion and apply novel features to classify sEMG. A window-based data acquisition method was presented to extract signal samples from sEMG electordes. Afterwards, a two-dimensional matrix image based feature extraction method, which differs from the classical methods based on time domain or frequency domain, was employed to transform signal samples to feature maps used for classification. In the experiments, sEMG data samples produced by the index and middle fingers at the click of a mouse button were separately acquired. Then, characteristics of the samples were analyzed to generate a feature map for each sample. Finally, the machine learning classification algorithms (SVM, KNN, RBF-NN) were employed to classify these feature maps on a GPU. The study demonstrated that all classifiers can identify and classify sEMG samples effectively. In particular, the accuracy of the SVM classifier reached up to 100%. The signal separation method is a convenient, efficient and quick method, which can effectively extract the sEMG samples produced by fingers. In addition, unlike the classical methods, the new method enables to extract features by enlarging sample signals' energy appropriately. The classical machine learning classifiers all performed well by using these features.
Reuse of imputed data in microarray analysis increases imputation efficiency
Kim, Ki-Yeol; Kim, Byoung-Jin; Yi, Gwan-Su
2004-01-01
Background The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analysis require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked. Results We developed a new cluster-based imputation method called sequential K-nearest neighbor (SKNN) method. This imputes the missing values sequentially from the gene having least missing values, and uses the imputed values for the later imputation. Although it uses the imputed values, the efficiency of this new method is greatly improved in its accuracy and computational complexity over the conventional KNN-based method and other methods based on maximum likelihood estimation. The performance of SKNN was in particular higher than other imputation methods for the data with high missing rates and large number of experiments. Application of Expectation Maximization (EM) to the SKNN method improved the accuracy, but increased computational time proportional to the number of iterations. The Multiple Imputation (MI) method, which is well known but not applied previously to microarray data, showed a similarly high accuracy as the SKNN method, with slightly higher dependency on the types of data sets. Conclusions Sequential reuse of imputed data in KNN-based imputation greatly increases the efficiency of imputation. The SKNN method should be practically useful to save the data of some microarray experiments which have high amounts of missing entries. The SKNN method generates reliable imputed values which can be used for further cluster-based analysis of microarray data. PMID:15504240
Text Content Pushing Technology Research Based on Location and Topic
NASA Astrophysics Data System (ADS)
Wei, Dongqi; Wei, Jianxin; Wumuti, Naheman; Jiang, Baode
2016-11-01
In the field, geological workers usually want to obtain related geological background information in the working area quickly and accurately. This information exists in the massive geological data, text data is described in natural language accounted for a large proportion. This paper studied location information extracting method in the mass text data; proposed a geographic location—geological content—geological content related algorithm based on Spark and Mapreduce2, finally classified content by using KNN, and built the content pushing system based on location and topic. It is running in the geological survey cloud, and we have gained a good effect in testing by using real geological data.
PAKDD Data Mining Competition 2009: New Ways of Using Known Methods
NASA Astrophysics Data System (ADS)
Linhart, Chaim; Harari, Guy; Abramovich, Sharon; Buchris, Altina
The PAKDD 2009 competition focuses on the problem of credit risk assessment. As required, we had to confront the problem of the robustness of the credit-scoring model against performance degradation caused by gradual market changes along a few years of business operation. We utilized the following standard models: logistic regression, KNN, SVM, GBM and decision tree. The novelty of our approach is two-fold: the integration of existing models, namely feeding the results of KNN as an input variable to the logistic regression, and re-coding categorical variables as numerical values that represent each category's statistical impact on the target label. The best solution we obtained reached 3rd place in the competition, with an AUC score of 0.655.
Chen, Zewei; Zhang, Xin; Zhang, Zhuoyong
2016-12-01
Timely risk assessment of chronic kidney disease (CKD) and proper community-based CKD monitoring are important to prevent patients with potential risk from further kidney injuries. As many symptoms are associated with the progressive development of CKD, evaluating risk of CKD through a set of clinical data of symptoms coupled with multivariate models can be considered as an available method for prevention of CKD and would be useful for community-based CKD monitoring. Three common used multivariate models, i.e., K-nearest neighbor (KNN), support vector machine (SVM), and soft independent modeling of class analogy (SIMCA), were used to evaluate risk of 386 patients based on a series of clinical data taken from UCI machine learning repository. Different types of composite data, in which proportional disturbances were added to simulate measurement deviations caused by environment and instrument noises, were also utilized to evaluate the feasibility and robustness of these models in risk assessment of CKD. For the original data set, three mentioned multivariate models can differentiate patients with CKD and non-CKD with the overall accuracies over 93 %. KNN and SVM have better performances than SIMCA has in this study. For the composite data set, SVM model has the best ability to tolerate noise disturbance and thus are more robust than the other two models. Using clinical data set on symptoms coupled with multivariate models has been proved to be feasible approach for assessment of patient with potential CKD risk. SVM model can be used as useful and robust tool in this study.
Yu, Huan; Caldwell, Curtis; Mah, Katherine; Mozeg, Daniel
2009-03-01
Coregistered fluoro-deoxy-glucose (FDG) positron emission tomography/computed tomography (PET/CT) has shown potential to improve the accuracy of radiation targeting of head and neck cancer (HNC) when compared to the use of CT simulation alone. The objective of this study was to identify textural features useful in distinguishing tumor from normal tissue in head and neck via quantitative texture analysis of coregistered 18F-FDG PET and CT images. Abnormal and typical normal tissues were manually segmented from PET/CT images of 20 patients with HNC and 20 patients with lung cancer. Texture features including some derived from spatial grey-level dependence matrices (SGLDM) and neighborhood gray-tone-difference matrices (NGTDM) were selected for characterization of these segmented regions of interest (ROIs). Both K nearest neighbors (KNNs) and decision tree (DT)-based KNN classifiers were employed to discriminate images of abnormal and normal tissues. The area under the curve (AZ) of receiver operating characteristics (ROC) was used to evaluate the discrimination performance of features in comparison to an expert observer. The leave-one-out and bootstrap techniques were used to validate the results. The AZ of DT-based KNN classifier was 0.95. Sensitivity and specificity for normal and abnormal tissue classification were 89% and 99%, respectively. In summary, NGTDM features such as PET Coarseness, PET Contrast, and CT Coarseness extracted from FDG PET/CT images provided good discrimination performance. The clinical use of such features may lead to improvement in the accuracy of radiation targeting of HNC.
Acharya, U. Rajendra; Sree, S. Vinitha; Kulshreshtha, Sanjeev; Molinari, Filippo; Koh, Joel En Wei; Saba, Luca; Suri, Jasjit S.
2014-01-01
Ovarian cancer is the fifth highest cause of cancer in women and the leading cause of death from gynecological cancers. Accurate diagnosis of ovarian cancer from acquired images is dependent on the expertise and experience of ultrasonographers or physicians, and is therefore, associated with inter observer variabilities. Computer Aided Diagnostic (CAD) techniques use a number of different data mining techniques to automatically predict the presence or absence of cancer, and therefore, are more reliable and accurate. A review of published literature in the field of CAD based ovarian cancer detection indicates that many studies use ultrasound images as the base for analysis. The key objective of this work is to propose an effective adjunct CAD technique called GyneScan for ovarian tumor detection in ultrasound images. In our proposed data mining framework, we extract several texture features based on first order statistics, Gray Level Co-occurrence Matrix and run length matrix. The significant features selected using t-test are then used to train and test several supervised learning based classifiers such as Probabilistic Neural Networks (PNN), Support Vector Machine (SVM), Decision Tree (DT), k-Nearest Neighbor (KNN), and Naïve Bayes (NB). We evaluated the developed framework using 1300 benign and 1300 malignant images. Using 11 significant features in KNN/PNN classifiers, we were able to achieve 100% classification accuracy, sensitivity, specificity, and positive predictive value in detecting ovarian tumor. Even though more validation using larger databases would better establish the robustness of our technique, the preliminary results are promising. This technique could be used as a reliable adjunct method to existing imaging modalities to provide a more confident second opinion on the presence/absence of ovarian tumor. PMID:24325128
Physical Human Activity Recognition Using Wearable Sensors.
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-12-11
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.
Physical Human Activity Recognition Using Wearable Sensors
Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine
2015-01-01
This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450
Model-based segmentation of abdominal aortic aneurysms in CTA images
NASA Astrophysics Data System (ADS)
de Bruijne, Marleen; van Ginneken, Bram; Niessen, Wiro J.; Loog, Marco; Viergever, Max A.
2003-05-01
Segmentation of thrombus in abdominal aortic aneurysms is complicated by regions of low boundary contrast and by the presence of many neighboring structures in close proximity to the aneurysm wall. We present an automated method that is similar to the well known Active Shape Models (ASM), combining a three-dimensional shape model with a one-dimensional boundary appearance model. Our contribution is twofold: we developed a non-parametric appearance modeling scheme that effectively deals with a highly varying background, and we propose a way of generalizing models of curvilinear structures from small training sets. In contrast with the conventional ASM approach, the new appearance model trains on both true and false examples of boundary profiles. The probability that a given image profile belongs to the boundary is obtained using k nearest neighbor (kNN) probability density estimation. The performance of this scheme is compared to that of original ASMs, which minimize the Mahalanobis distance to the average true profile in the training set. The generalizability of the shape model is improved by modeling the objects axis deformation independent of its cross-sectional deformation. A leave-one-out experiment was performed on 23 datasets. Segmentation using the kNN appearance model significantly outperformed the original ASM scheme; average volume errors were 5.9% and 46% respectively.
Human action recognition based on kinematic similarity in real time
Chen, Longting; Luo, Ailing; Zhang, Sicong
2017-01-01
Human action recognition using 3D pose data has gained a growing interest in the field of computer robotic interfaces and pattern recognition since the availability of hardware to capture human pose. In this paper, we propose a fast, simple, and powerful method of human action recognition based on human kinematic similarity. The key to this method is that the action descriptor consists of joints position, angular velocity and angular acceleration, which can meet the different individual sizes and eliminate the complex normalization. The angular parameters of joints within a short sliding time window (approximately 5 frames) around the current frame are used to express each pose frame of human action sequence. Moreover, three modified KNN (k-nearest-neighbors algorithm) classifiers are employed in our method: one for achieving the confidence of every frame in the training step, one for estimating the frame label of each descriptor, and one for classifying actions. Additional estimating of the frame’s time label makes it possible to address single input frames. This approach can be used on difficult, unsegmented sequences. The proposed method is efficient and can be run in real time. The research shows that many public datasets are irregularly segmented, and a simple method is provided to regularize the datasets. The approach is tested on some challenging datasets such as MSR-Action3D, MSRDailyActivity3D, and UTD-MHAD. The results indicate our method achieves a higher accuracy. PMID:29073131
Research on Classification of Chinese Text Data Based on SVM
NASA Astrophysics Data System (ADS)
Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao
2017-09-01
Data Mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. Support Vector Machine’ (SVM) classification method is a good classifier in machine learning research. This paper will study the classification effect based on the SVM method in the Chinese text data, and use the support vector machine method in the chinese text to achieve the classify chinese text, and to able to combination of academia and practical application.
Parametric classification of handvein patterns based on texture features
NASA Astrophysics Data System (ADS)
Al Mahafzah, Harbi; Imran, Mohammad; Supreetha Gowda H., D.
2018-04-01
In this paper, we have developed Biometric recognition system adopting hand based modality Handvein,which has the unique pattern for each individual and it is impossible to counterfeit and fabricate as it is an internal feature. We have opted in choosing feature extraction algorithms such as LBP-visual descriptor, LPQ-blur insensitive texture operator, Log-Gabor-Texture descriptor. We have chosen well known classifiers such as KNN and SVM for classification. We have experimented and tabulated results of single algorithm recognition rate for Handvein under different distance measures and kernel options. The feature level fusion is carried out which increased the performance level.
Al-Qazzaz, Noor Kamal; Ali, Sawal Hamid Bin Mohd; Ahmad, Siti Anom; Islam, Mohd Shabiul; Escudero, Javier
2018-01-01
Stroke survivors are more prone to developing cognitive impairment and dementia. Dementia detection is a challenge for supporting personalized healthcare. This study analyzes the electroencephalogram (EEG) background activity of 5 vascular dementia (VaD) patients, 15 stroke-related patients with mild cognitive impairment (MCI), and 15 control healthy subjects during a working memory (WM) task. The objective of this study is twofold. First, it aims to enhance the discrimination of VaD, stroke-related MCI patients, and control subjects using fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR); second, it aims to extract and investigate the spectral features that characterize the post-stroke dementia patients compared to the control subjects. Nineteen channels were recorded and analyzed using the independent component analysis and wavelet analysis (ICA-WT) denoising technique. Using ANOVA, linear spectral power including relative powers (RP) and power ratio were calculated to test whether the EEG dominant frequencies were slowed down in VaD and stroke-related MCI patients. Non-linear features including permutation entropy (PerEn) and fractal dimension (FD) were used to test the degree of irregularity and complexity, which was significantly lower in patients with VaD and stroke-related MCI than that in control subjects (ANOVA; p ˂ 0.05). This study is the first to use fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR) dimensionality reduction technique with EEG background activity of dementia patients. The impairment of post-stroke patients was detected using support vector machine (SVM) and k-nearest neighbors (kNN) classifiers. A comparative study has been performed to check the effectiveness of using FNPAQR dimensionality reduction technique with the SVM and kNN classifiers. FNPAQR with SVM and kNN obtained 91.48 and 89.63% accuracy, respectively, whereas without using the FNPAQR exhibited 70 and 67.78% accuracy for SVM and kNN, respectively, in classifying VaD, stroke-related MCI, and control patients, respectively. Therefore, EEG could be a reliable index for inspecting concise markers that are sensitive to VaD and stroke-related MCI patients compared to control healthy subjects.
About neighborhood counting measure metric and minimum risk metric.
Argentini, Andrea; Blanzieri, Enrico
2010-04-01
In a 2006 TPAMI paper, Wang proposed the Neighborhood Counting Measure, a similarity measure for the k-NN algorithm. In his paper, Wang mentioned the Minimum Risk Metric (MRM), an early distance measure based on the minimization of the risk of misclassification. Wang did not compare NCM to MRM because of its allegedly excessive computational load. In this comment paper, we complete the comparison that was missing in Wang's paper and, from our empirical evaluation, we show that MRM outperforms NCM and that its running time is not prohibitive as Wang suggested.
Fast Nonparametric Machine Learning Algorithms for High-Dimensional Massive Data and Applications
2006-03-01
know the probability of that from Lemma 2. Using the union bound, we know that for any query q, the probability that i-am-feeling-lucky search algorithm...and each point in a d-dimensional space, a naive k-NN search needs to do a linear scan of T for every single query q, and thus the computational time...algorithm based on partition trees with priority search , and give an expected query time O((1/)d log n). But the constant in the O((1/)d log n
Texture- and deformability-based surface recognition by tactile image analysis.
Khasnobish, Anwesha; Pal, Monalisa; Tibarewala, D N; Konar, Amit; Pal, Kunal
2016-08-01
Deformability and texture are two unique object characteristics which are essential for appropriate surface recognition by tactile exploration. Tactile sensation is required to be incorporated in artificial arms for rehabilitative and other human-computer interface applications to achieve efficient and human-like manoeuvring. To accomplish the same, surface recognition by tactile data analysis is one of the prerequisites. The aim of this work is to develop effective technique for identification of various surfaces based on deformability and texture by analysing tactile images which are obtained during dynamic exploration of the item by artificial arms whose gripper is fitted with tactile sensors. Tactile data have been acquired, while human beings as well as a robot hand fitted with tactile sensors explored the objects. The tactile images are pre-processed, and relevant features are extracted from the tactile images. These features are provided as input to the variants of support vector machine (SVM), linear discriminant analysis and k-nearest neighbour (kNN) for classification. Based on deformability, six household surfaces are recognized from their corresponding tactile images. Moreover, based on texture five surfaces of daily use are classified. The method adopted in the former two cases has also been applied for deformability- and texture-based recognition of four biomembranes, i.e. membranes prepared from biomaterials which can be used for various applications such as drug delivery and implants. Linear SVM performed best for recognizing surface deformability with an accuracy of 83 % in 82.60 ms, whereas kNN classifier recognizes surfaces of daily use having different textures with an accuracy of 89 % in 54.25 ms and SVM with radial basis function kernel recognizes biomembranes with an accuracy of 78 % in 53.35 ms. The classifiers are observed to generalize well on the unseen test datasets with very high performance to achieve efficient material recognition based on its deformability and texture.
Yang, Jiaojiao; Guo, Qian; Li, Wenjie; Wang, Suhong; Zou, Ling
2016-04-01
This paper aims to assist the individual clinical diagnosis of children with attention-deficit/hyperactivity disorder using electroencephalogram signal detection method.Firstly,in our experiments,we obtained and studied the electroencephalogram signals from fourteen attention-deficit/hyperactivity disorder children and sixteen typically developing children during the classic interference control task of Simon-spatial Stroop,and we completed electroencephalogram data preprocessing including filtering,segmentation,removal of artifacts and so on.Secondly,we selected the subset electroencephalogram electrodes using principal component analysis(PCA)method,and we collected the common channels of the optimal electrodes which occurrence rates were more than 90%in each kind of stimulation.We then extracted the latency(200~450ms)mean amplitude features of the common electrodes.Finally,we used the k-nearest neighbor(KNN)classifier based on Euclidean distance and the support vector machine(SVM)classifier based on radial basis kernel function to classify.From the experiment,at the same kind of interference control task,the attention-deficit/hyperactivity disorder children showed lower correct response rates and longer reaction time.The N2 emerged in prefrontal cortex while P2 presented in the inferior parietal area when all kinds of stimuli demonstrated.Meanwhile,the children with attention-deficit/hyperactivity disorder exhibited markedly reduced N2 and P2amplitude compared to typically developing children.KNN resulted in better classification accuracy than SVM classifier,and the best classification rate was 89.29%in StI task.The results showed that the electroencephalogram signals were different in the brain regions of prefrontal cortex and inferior parietal cortex between attention-deficit/hyperactivity disorder and typically developing children during the interference control task,which provided a scientific basis for the clinical diagnosis of attention-deficit/hyperactivity disorder individuals.
Does rational selection of training and test sets improve the outcome of QSAR modeling?
Martin, Todd M; Harten, Paul; Young, Douglas M; Muratov, Eugene N; Golbraikh, Alexander; Zhu, Hao; Tropsha, Alexander
2012-10-22
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external data set, the best way to validate the predictive ability of a model is to perform its statistical external validation. In statistical external validation, the overall data set is divided into training and test sets. Commonly, this splitting is performed using random division. Rational splitting methods can divide data sets into training and test sets in an intelligent fashion. The purpose of this study was to determine whether rational division methods lead to more predictive models compared to random division. A special data splitting procedure was used to facilitate the comparison between random and rational division methods. For each toxicity end point, the overall data set was divided into a modeling set (80% of the overall set) and an external evaluation set (20% of the overall set) using random division. The modeling set was then subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using rational division methods and by using random division. The Kennard-Stone, minimal test set dissimilarity, and sphere exclusion algorithms were used as the rational division methods. The hierarchical clustering, random forest, and k-nearest neighbor (kNN) methods were used to develop QSAR models based on the training sets. For kNN QSAR, multiple training and test sets were generated, and multiple QSAR models were built. The results of this study indicate that models based on rational division methods generate better statistical results for the test sets than models based on random division, but the predictive power of both types of models are comparable.
Suyundikov, Anvar; Stevens, John R.; Corcoran, Christopher; Herrick, Jennifer; Wolff, Roger K.; Slattery, Martha L.
2015-01-01
Missing data can arise in bioinformatics applications for a variety of reasons, and imputation methods are frequently applied to such data. We are motivated by a colorectal cancer study where miRNA expression was measured in paired tumor-normal samples of hundreds of patients, but data for many normal samples were missing due to lack of tissue availability. We compare the precision and power performance of several imputation methods, and draw attention to the statistical dependence induced by K-Nearest Neighbors (KNN) imputation. This imputation-induced dependence has not previously been addressed in the literature. We demonstrate how to account for this dependence, and show through simulation how the choice to ignore or account for this dependence affects both power and type I error rate control. PMID:25849489
Categorizing document by fuzzy C-Means and K-nearest neighbors approach
NASA Astrophysics Data System (ADS)
Priandini, Novita; Zaman, Badrus; Purwanti, Endah
2017-08-01
Increasing of technology had made categorizing documents become important. It caused by increasing of number of documents itself. Managing some documents by categorizing is one of Information Retrieval application, because it involve text mining on its process. Whereas, categorization technique could be done both Fuzzy C-Means (FCM) and K-Nearest Neighbors (KNN) method. This experiment would consolidate both methods. The aim of the experiment is increasing performance of document categorize. First, FCM is in order to clustering training documents. Second, KNN is in order to categorize testing document until the output of categorization is shown. Result of the experiment is 14 testing documents retrieve relevantly to its category. Meanwhile 6 of 20 testing documents retrieve irrelevant to its category. Result of system evaluation shows that both precision and recall are 0,7.
NASA Astrophysics Data System (ADS)
Ryu, Jungho; Choi, Jong-Jin; Hahn, Byung-Dong; Park, Dong-Soo; Yoon, Woon-Ha; Kim, Ki-Hoon
2007-04-01
Lead-free piezoelectric thick films of (K0.5Na0.5)NbO3 were fabricated by aerosol-deposition method. The thickness of KNN film was 7.1μm and fully dense films were obtained. The dielectric constants ɛ3T/ɛ0 of the as-deposited and annealed films at 1kHz were 116 and 545, respectively, which are higher than any previously reported values for lead-free piezoelectric thin/thick films, either without or with heat treatment. The ferroelectric properties were improved after annealing and the maximum values of Pr=8.1μC/cm3 and Ec=100kV/cm were achieved. These values are markedly superior to those of sintered KNN ceramic counterparts.
Thanh Noi, Phan; Kappas, Martin
2017-01-01
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km2 within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909
Thanh Noi, Phan; Kappas, Martin
2017-12-22
In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets.
van Diest, Mike; Stegenga, Jan; Wörtche, Heinrich J.; Roerdink, Jos B. T. M; Verkerke, Gijsbertus J.; Lamoth, Claudine J. C.
2015-01-01
Background Exergames are becoming an increasingly popular tool for training balance ability, thereby preventing falls in older adults. Automatic, real time, assessment of the user’s balance control offers opportunities in terms of providing targeted feedback and dynamically adjusting the gameplay to the individual user, yet algorithms for quantification of balance control remain to be developed. The aim of the present study was to identify movement patterns, and variability therein, of young and older adults playing a custom-made weight-shifting (ice-skating) exergame. Methods Twenty older adults and twenty young adults played a weight-shifting exergame under five conditions of varying complexity, while multi-segmental whole-body movement data were captured using Kinect. Movement coordination patterns expressed during gameplay were identified using Self Organizing Maps (SOM), an artificial neural network, and variability in these patterns was quantified by computing Total Trajectory Variability (TTvar). Additionally a k Nearest Neighbor (kNN) classifier was trained to discriminate between young and older adults based on the SOM features. Results Results showed that TTvar was significantly higher in older adults than in young adults, when playing the exergame under complex task conditions. The kNN classifier showed a classification accuracy of 65.8%. Conclusions Older adults display more variable sway behavior than young adults, when playing the exergame under complex task conditions. The SOM features characterizing movement patterns expressed during exergaming allow for discriminating between young and older adults with limited accuracy. Our findings contribute to the development of algorithms for quantification of balance ability during home-based exergaming for balance training. PMID:26230655
Intelligent data analysis: the best approach for chronic heart failure (CHF) follow up management.
Mohammadzadeh, Niloofar; Safdari, Reza; Baraani, Alireza; Mohammadzadeh, Farshid
2014-08-01
Intelligent data analysis has ability to prepare and present complex relations between symptoms and diseases, medical and treatment consequences and definitely has significant role in improving follow-up management of chronic heart failure (CHF) patients, increasing speed and accuracy in diagnosis and treatments; reducing costs, designing and implementation of clinical guidelines. The aim of this article is to describe intelligent data analysis methods in order to improve patient monitoring in follow and treatment of chronic heart failure patients as the best approach for CHF follow up management. Minimum data set (MDS) requirements for monitoring and follow up of CHF patient designed in checklist with six main parts. All CHF patients that discharged in 2013 from Tehran heart center have been selected. The MDS for monitoring CHF patient status were collected during 5 months in three different times of follow up. Gathered data was imported in RAPIDMINER 5 software. Modeling was based on decision trees methods such as C4.5, CHAID, ID3 and k-Nearest Neighbors algorithm (K-NN) with k=1. Final analysis was based on voting method. Decision trees and K-NN evaluate according to Cross-Validation. Creating and using standard terminologies and databases consistent with these terminologies help to meet the challenges related to data collection from various places and data application in intelligent data analysis. It should be noted that intelligent analysis of health data and intelligent system can never replace cardiologists. It can only act as a helpful tool for the cardiologist's decisions making.
van Diest, Mike; Stegenga, Jan; Wörtche, Heinrich J; Roerdink, Jos B T M; Verkerke, Gijsbertus J; Lamoth, Claudine J C
2015-01-01
Exergames are becoming an increasingly popular tool for training balance ability, thereby preventing falls in older adults. Automatic, real time, assessment of the user's balance control offers opportunities in terms of providing targeted feedback and dynamically adjusting the gameplay to the individual user, yet algorithms for quantification of balance control remain to be developed. The aim of the present study was to identify movement patterns, and variability therein, of young and older adults playing a custom-made weight-shifting (ice-skating) exergame. Twenty older adults and twenty young adults played a weight-shifting exergame under five conditions of varying complexity, while multi-segmental whole-body movement data were captured using Kinect. Movement coordination patterns expressed during gameplay were identified using Self Organizing Maps (SOM), an artificial neural network, and variability in these patterns was quantified by computing Total Trajectory Variability (TTvar). Additionally a k Nearest Neighbor (kNN) classifier was trained to discriminate between young and older adults based on the SOM features. Results showed that TTvar was significantly higher in older adults than in young adults, when playing the exergame under complex task conditions. The kNN classifier showed a classification accuracy of 65.8%. Older adults display more variable sway behavior than young adults, when playing the exergame under complex task conditions. The SOM features characterizing movement patterns expressed during exergaming allow for discriminating between young and older adults with limited accuracy. Our findings contribute to the development of algorithms for quantification of balance ability during home-based exergaming for balance training.
NASA Astrophysics Data System (ADS)
Díaz-Ayil, G.; Amouroux, M.; Blondel, W. C. P. M.; Bourg-Heckly, G.; Leroux, A.; Guillemin, F.; Granjon, Y.
2009-07-01
This paper deals with the development and application of in vivo spatially-resolved bimodal spectroscopy (AutoFluorescence AF and Diffuse Reflectance DR), to discriminate various stages of skin precancer in a preclinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A programmable instrumentation was developed for acquiring AF emission spectra using 7 excitation wavelengths: 360, 368, 390, 400, 410, 420 and 430 nm, and DR spectra in the 390-720 nm wavelength range. After various steps of intensity spectra preprocessing (filtering, spectral correction and intensity normalization), several sets of spectral characteristics were extracted and selected based on their discrimination power statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM), in order to compare diagnostic performance of each method. Diagnostic performance was studied and assessed in terms of sensitivity (Se) and specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibers distances and of the numbers of principal components, such that: Se and Sp ≈ 100% when discriminating CH vs. others; Sp ≈ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ≈ 74% and Se ≈ 63%for AH vs. D.
Diagnosis of Tempromandibular Disorders Using Local Binary Patterns.
Haghnegahdar, A A; Kolahi, S; Khojastepour, L; Tajeripour, F
2018-03-01
Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cut prepared from each condyle, although images were limited to head of mandibular condyle. In order to extract features of images, first we use LBP and then histogram of oriented gradients. To reduce dimensionality, the linear algebra Singular Value Decomposition (SVD) is applied to the feature vectors matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. K nearest neighbor classifier achieves a very good accuracy (0.9242), moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, when other classifiers have lower accuracy, sensitivity and specificity. We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN has been the best classifier for our experiments in detecting patients from healthy individuals, by 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages.
Predictive modelling for startup and investor relationship based on crowdfunding platform data
NASA Astrophysics Data System (ADS)
Alamsyah, Andry; Buono Asto Nugroho, Tri
2018-03-01
Crowdfunding platform is a place where startup shows off publicly their idea for the purpose to get their project funded. Crowdfunding platform such as Kickstarter are becoming popular today, it provides the efficient way for startup to get funded without liabilities, it also provides variety project category that can be participated. There is an available safety procedure to ensure achievable low-risk environment. The startup promoted project must accomplish their funded goal target. If they fail to reach the target, then there is no investment activity take place. It motivates startup to be more active to promote or disseminate their project idea and it also protect investor from losing money. The study objective is to predict the successfulness of proposed project and mapping investor trend using data mining framework. To achieve the objective, we proposed 3 models. First model is to predict whether a project is going to be successful or failed using K-Nearest Neighbour (KNN). Second model is to predict the number of successful project using Artificial Neural Network (ANN). Third model is to map the trend of investor in investing the project using K-Means clustering algorithm. KNN gives 99.04% model accuracy, while ANN best configuration gives 16-14-1 neuron layers and 0.2 learning rate, and K-Means gives 6 best separation clusters. The results of those models can help startup or investor to make decision regarding startup investment.
Forecasting municipal solid waste generation using artificial intelligence modelling approaches.
Abbasi, Maryam; El Hanandeh, Ali
2016-10-01
Municipal solid waste (MSW) management is a major concern to local governments to protect human health, the environment and to preserve natural resources. The design and operation of an effective MSW management system requires accurate estimation of future waste generation quantities. The main objective of this study was to develop a model for accurate forecasting of MSW generation that helps waste related organizations to better design and operate effective MSW management systems. Four intelligent system algorithms including support vector machine (SVM), adaptive neuro-fuzzy inference system (ANFIS), artificial neural network (ANN) and k-nearest neighbours (kNN) were tested for their ability to predict monthly waste generation in the Logan City Council region in Queensland, Australia. Results showed artificial intelligence models have good prediction performance and could be successfully applied to establish municipal solid waste forecasting models. Using machine learning algorithms can reliably predict monthly MSW generation by training with waste generation time series. In addition, results suggest that ANFIS system produced the most accurate forecasts of the peaks while kNN was successful in predicting the monthly averages of waste quantities. Based on the results, the total annual MSW generated in Logan City will reach 9.4×10(7)kg by 2020 while the peak monthly waste will reach 9.37×10(6)kg. Copyright © 2016 Elsevier Ltd. All rights reserved.
Emotional State Classification in Virtual Reality Using Wearable Electroencephalography
NASA Astrophysics Data System (ADS)
Suhaimi, N. S.; Teo, J.; Mountstephens, J.
2018-03-01
This paper presents the classification of emotions on EEG signals. One of the key issues in this research is the lack of mental classification using VR as the medium to stimulate emotion. The approach towards this research is by using K-nearest neighbor (KNN) and Support Vector Machine (SVM). Firstly, each of the participant will be required to wear the EEG headset and recording their brainwaves when they are immersed inside the VR. The data points are then marked if they showed any physical signs of emotion or by observing the brainwave pattern. Secondly, the data will then be tested and trained with KNN and SVM algorithms. The accuracy achieved from both methods were approximately 82% throughout the brainwave spectrum (α, β, γ, δ, θ). These methods showed promising results and will be further enhanced using other machine learning approaches in VR stimulus.
Leakage current behavior in lead-free ferroelectric (K,Na)NbO3-LiTaO3-LiSbO3 thin films
NASA Astrophysics Data System (ADS)
Abazari, M.; Safari, A.
2010-12-01
Conduction mechanisms in epitaxial (001)-oriented pure and 1 mol % Mn-doped (K0.44,Na0.52,Li0.04)(Nb0.84,Ta0.1,Sb0.06)O3 (KNN-LT-LS) thin films on SrTiO3 substrate were investigated. Temperature dependence of leakage current density was measured as a function of applied electric field in the range of 200-380 K. It was shown that the different transport mechanisms dominate in pure and Mn-doped thin films. In pure (KNN-LT-LS) thin films, Poole-Frenkel emission was found to be responsible for the leakage, while Schottky emission was the dominant mechanism in Mn-doped thin films at higher electric fields. This is a remarkable yet clear indication of effect of 1 mol % Mn on the resistive behavior of such thin films.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ikeda, Y.; Sato, T.
Three-body resonances in the KNN system have been studied within a framework of the KNN-{pi}YN coupled-channel Faddeev equation. By solving the three-body equation, the energy dependence of the resonant KN amplitude is fully taken into account. The S-matrix pole has been investigated from the eigenvalue of the kernel with the analytic continuation of the scattering amplitude on the unphysical Riemann sheet. The KN interaction is constructed from the leading order term of the chiral Lagrangian using relativistic kinematics. The {lambda}(1405) resonance is dynamically generated in this model, where the KN interaction parameters are fitted to the data of scattering length.more » As a result we find a three-body resonance of the strange dibaryon system with binding energy B{approx}79 MeV and width {gamma}{approx}74 MeV. The energy of the three-body resonance is found to be sensitive to the model of the I=0 KN interaction.« less
Assessment of various supervised learning algorithms using different performance metrics
NASA Astrophysics Data System (ADS)
Susheel Kumar, S. M.; Laxkar, Deepak; Adhikari, Sourav; Vijayarajan, V.
2017-11-01
Our work brings out comparison based on the performance of supervised machine learning algorithms on a binary classification task. The supervised machine learning algorithms which are taken into consideration in the following work are namely Support Vector Machine(SVM), Decision Tree(DT), K Nearest Neighbour (KNN), Naïve Bayes(NB) and Random Forest(RF). This paper mostly focuses on comparing the performance of above mentioned algorithms on one binary classification task by analysing the Metrics such as Accuracy, F-Measure, G-Measure, Precision, Misclassification Rate, False Positive Rate, True Positive Rate, Specificity, Prevalence.
NASA Astrophysics Data System (ADS)
Runyan, T. E.; Wood, W. T.; Palmsten, M. L.; Zhang, R.
2016-12-01
Gas hydrates, specifically methane hydrates, are sparsely sampled on a global scale, and their accumulation is difficult to predict geospatially. Several attempts have been made at estimating global inventories, and to some extent geospatial distribution, using geospatial extrapoltions guided with geophysical and geochemical methods. Our objective is to quantitatively predict the geospatial likelihood of encountering methane hydrates, with uncertainty. Predictions could be incorporated into analyses of drilling hazards as well as climate change. We use global data sets (including water depth, temperature, pressure, TOC, sediment thickness, and heat flow) as parameters to train a k-nearest neighbor (KNN) machine learning technique. The KNN is unsupervised and non-parametric, we do not provide any interpretive influence on prior probability distribution, so our results are strictly data driven. We have selected as test sites several locations where gas hydrates have been well studied, each with significantly different geologic settings.These include: The Blake Ridge (U.S. East Coast), Hydrate Ridge (U.S. West Coast), and the Gulf of Mexico. We then use KNN to quantify similarities between these sites, and determine, via the distance in parameter space, what is the likelihood and uncertainty of encountering gas hydrate anywhere in the world. Here we are operating under the assumption that the distance in parameter space is proportional to the probability of the occurrence of gas hydrate. We then compare these global similarity maps made from our several test sites to identify the geologic (geophyisical, bio-geochemical) parameters best suited for predicting gas hydrate occurrence.
Amaral, Jorge L M; Lopes, Agnaldo J; Jansen, José M; Faria, Alvaro C D; Melo, Pedro L
2013-12-01
The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers, 28 healthy and 28 smokers with low tobacco consumption. Many supervised learning techniques were investigated, including logistic linear classifiers, k nearest neighbor (KNN), neural networks and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate parameter was established as baseline. To determine the best input features and classifier parameters, we used genetic algorithms and a 10-fold cross-validation using the average area under the ROC curve (AUC). In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN=0.89 and SVM=0.87) compared with the baseline (0.77). The second experiment performed a feature selection on the original FOT parameters. This selection did not cause any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed a feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN=SVM=0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Detection of Cardiac Abnormalities from Multilead ECG using Multiscale Phase Alternation Features.
Tripathy, R K; Dandapat, S
2016-06-01
The cardiac activities such as the depolarization and the relaxation of atria and ventricles are observed in electrocardiogram (ECG). The changes in the morphological features of ECG are the symptoms of particular heart pathology. It is a cumbersome task for medical experts to visually identify any subtle changes in the morphological features during 24 hours of ECG recording. Therefore, the automated analysis of ECG signal is a need for accurate detection of cardiac abnormalities. In this paper, a novel method for automated detection of cardiac abnormalities from multilead ECG is proposed. The method uses multiscale phase alternation (PA) features of multilead ECG and two classifiers, k-nearest neighbor (KNN) and fuzzy KNN for classification of bundle branch block (BBB), myocardial infarction (MI), heart muscle defect (HMD) and healthy control (HC). The dual tree complex wavelet transform (DTCWT) is used to decompose the ECG signal of each lead into complex wavelet coefficients at different scales. The phase of the complex wavelet coefficients is computed and the PA values at each wavelet scale are used as features for detection and classification of cardiac abnormalities. A publicly available multilead ECG database (PTB database) is used for testing of the proposed method. The experimental results show that, the proposed multiscale PA features and the fuzzy KNN classifier have better performance for detection of cardiac abnormalities with sensitivity values of 78.12 %, 80.90 % and 94.31 % for BBB, HMD and MI classes. The sensitivity value of proposed method for MI class is compared with the state-of-art techniques from multilead ECG.
Evaluation of normalization methods for cDNA microarray data by k-NN classification
Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J
2005-01-01
Background Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Conclusion Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics. PMID:16045803
Evaluation of normalization methods for cDNA microarray data by k-NN classification.
Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J
2005-07-26
Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.
Griffanti, Ludovica; Zamboni, Giovanna; Khan, Aamira; Li, Linxin; Bonifacio, Guendalina; Sundaresan, Vaanathi; Schulz, Ursula G; Kuker, Wilhelm; Battaglini, Marco; Rothwell, Peter M; Jenkinson, Mark
2016-11-01
Reliable quantification of white matter hyperintensities of presumed vascular origin (WMHs) is increasingly needed, given the presence of these MRI findings in patients with several neurological and vascular disorders, as well as in elderly healthy subjects. We present BIANCA (Brain Intensity AbNormality Classification Algorithm), a fully automated, supervised method for WMH detection, based on the k-nearest neighbour (k-NN) algorithm. Relative to previous k-NN based segmentation methods, BIANCA offers different options for weighting the spatial information, local spatial intensity averaging, and different options for the choice of the number and location of the training points. BIANCA is multimodal and highly flexible so that the user can adapt the tool to their protocol and specific needs. We optimised and validated BIANCA on two datasets with different MRI protocols and patient populations (a "predominantly neurodegenerative" and a "predominantly vascular" cohort). BIANCA was first optimised on a subset of images for each dataset in terms of overlap and volumetric agreement with a manually segmented WMH mask. The correlation between the volumes extracted with BIANCA (using the optimised set of options), the volumes extracted from the manual masks and visual ratings showed that BIANCA is a valid alternative to manual segmentation. The optimised set of options was then applied to the whole cohorts and the resulting WMH volume estimates showed good correlations with visual ratings and with age. Finally, we performed a reproducibility test, to evaluate the robustness of BIANCA, and compared BIANCA performance against existing methods. Our findings suggest that BIANCA, which will be freely available as part of the FSL package, is a reliable method for automated WMH segmentation in large cross-sectional cohort studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Anam, Khairul; Al-Jumaily, Adel
2017-01-01
The success of myoelectric pattern recognition (M-PR) mostly relies on the features extracted and classifier employed. This paper proposes and evaluates a fast classifier, extreme learning machine (ELM), to classify individual and combined finger movements on amputees and non-amputees. ELM is a single hidden layer feed-forward network (SLFN) that avoids iterative learning by determining input weights randomly and output weights analytically. Therefore, it can accelerate the training time of SLFNs. In addition to the classifier evaluation, this paper evaluates various feature combinations to improve the performance of M-PR and investigate some feature projections to improve the class separability of the features. Different from other studies on the implementation of ELM in the myoelectric controller, this paper presents a complete and thorough investigation of various types of ELMs including the node-based and kernel-based ELM. Furthermore, this paper provides comparisons of ELMs and other well-known classifiers such as linear discriminant analysis (LDA), k-nearest neighbour (kNN), support vector machine (SVM) and least-square SVM (LS-SVM). The experimental results show the most accurate ELM classifier is radial basis function ELM (RBF-ELM). The comparison of RBF-ELM and other well-known classifiers shows that RBF-ELM is as accurate as SVM and LS-SVM but faster than the SVM family; it is superior to LDA and kNN. The experimental results also indicate that the accuracy gap of the M-PR on the amputees and non-amputees is not too much with the accuracy of 98.55% on amputees and 99.5% on the non-amputees using six electromyography (EMG) channels. Copyright © 2016 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu Xiaoying; Ho, Shirley; Trac, Hy
We investigate machine learning (ML) techniques for predicting the number of galaxies (N{sub gal}) that occupy a halo, given the halo's properties. These types of mappings are crucial for constructing the mock galaxy catalogs necessary for analyses of large-scale structure. The ML techniques proposed here distinguish themselves from traditional halo occupation distribution (HOD) modeling as they do not assume a prescribed relationship between halo properties and N{sub gal}. In addition, our ML approaches are only dependent on parent halo properties (like HOD methods), which are advantageous over subhalo-based approaches as identifying subhalos correctly is difficult. We test two algorithms: supportmore » vector machines (SVM) and k-nearest-neighbor (kNN) regression. We take galaxies and halos from the Millennium simulation and predict N{sub gal} by training our algorithms on the following six halo properties: number of particles, M{sub 200}, {sigma}{sub v}, v{sub max}, half-mass radius, and spin. For Millennium, our predicted N{sub gal} values have a mean-squared error (MSE) of {approx}0.16 for both SVM and kNN. Our predictions match the overall distribution of halos reasonably well and the galaxy correlation function at large scales to {approx}5%-10%. In addition, we demonstrate a feature selection algorithm to isolate the halo parameters that are most predictive, a useful technique for understanding the mapping between halo properties and N{sub gal}. Lastly, we investigate these ML-based approaches in making mock catalogs for different galaxy subpopulations (e.g., blue, red, high M{sub star}, low M{sub star}). Given its non-parametric nature as well as its powerful predictive and feature selection capabilities, ML offers an interesting alternative for creating mock catalogs.« less
METAPHOR: Probability density estimation for machine learning based photometric redshifts
NASA Astrophysics Data System (ADS)
Amaro, V.; Cavuoti, S.; Brescia, M.; Vellucci, C.; Tortora, C.; Longo, G.
2017-06-01
We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method able to provide a reliable PDF for photometric galaxy redshifts estimated through empirical techniques. METAPHOR is a modular workflow, mainly based on the MLPQNA neural network as internal engine to derive photometric galaxy redshifts, but giving the possibility to easily replace MLPQNA with any other method to predict photo-z's and their PDF. We present here the results about a validation test of the workflow on the galaxies from SDSS-DR9, showing also the universality of the method by replacing MLPQNA with KNN and Random Forest models. The validation test include also a comparison with the PDF's derived from a traditional SED template fitting method (Le Phare).
Development of a Compton camera for prompt-gamma medical imaging
NASA Astrophysics Data System (ADS)
Aldawood, S.; Thirolf, P. G.; Miani, A.; Böhmer, M.; Dedes, G.; Gernhäuser, R.; Lang, C.; Liprandi, S.; Maier, L.; Marinšek, T.; Mayerhofer, M.; Schaart, D. R.; Lozano, I. Valencia; Parodi, K.
2017-11-01
A Compton camera-based detector system for photon detection from nuclear reactions induced by proton (or heavier ion) beams is under development at LMU Munich, targeting the online range verification of the particle beam in hadron therapy via prompt-gamma imaging. The detector is designed to be capable to reconstruct the photon source origin not only from the Compton scattering kinematics of the primary photon, but also to allow for tracking of the secondary Compton-scattered electrons, thus enabling a γ-source reconstruction also from incompletely absorbed photon events. The Compton camera consists of a monolithic LaBr3:Ce scintillation crystal, read out by a multi-anode PMT acting as absorber, preceded by a stacked array of 6 double-sided silicon strip detectors as scatterers. The detector components have been characterized both under offline and online conditions. The LaBr3:Ce crystal exhibits an excellent time and energy resolution. Using intense collimated 137Cs and 60Co sources, the monolithic scintillator was scanned on a fine 2D grid to generate a reference library of light amplitude distributions that allows for reconstructing the photon interaction position using a k-Nearest Neighbour (k-NN) algorithm. Systematic studies were performed to investigate the performance of the reconstruction algorithm, revealing an improvement of the spatial resolution with increasing photon energy to an optimum value of 3.7(1)mm at 1.33 MeV, achieved with the Categorical Average Pattern (CAP) modification of the k-NN algorithm.
Kumar, Raju; Singh, Satyendra
2018-02-16
Electrocaloric (EC) refrigeration, an EC effect based technology has been accepted as an auspicious way in the development of next generation refrigeration due to high efficiency and compact size. Here, we report the results of our experimental investigations on electrocaloric response and electrical energy storage properties in lead-free nanocrystalline (1 - x)K 0.5 Na 0.5 NbO 3 -xLiSbO 3 (KNN-xLS) ceramics in the range of 0.015 ≤ x ≤ 0.06 by the indirect EC measurements. Doping of LiSbO 3 has lowered both the transitions (T C and T O-T ) of KNN to the room temperature side effectively. A maximal value of EC temperature change, ΔT = 3.33 K was obtained for the composition with x = 0.03 at 345 K under an external electric field of 40 kV/cm. The higher value of EC responsivity, ζ = 8.32 × 10 -7 K.m/V is found with COP of 8.14 and recoverable energy storage of 0.128 J/cm 3 with 46% efficiency for the composition of x = 0.03. Our investigations show that this material is a very promising candidate for electrocaloric refrigeration and energy storage near room temperature.
Detecting falls with wearable sensors using machine learning techniques.
Özdemir, Ahmet Turan; Barshan, Billur
2014-06-18
Falls are a serious public health problem and possibly life threatening for people in fall risk groups. We develop an automated fall detection system with wearable motion sensor units fitted to the subjects' body at six different positions. Each unit comprises three tri-axial devices (accelerometer, gyroscope, and magnetometer/compass). Fourteen volunteers perform a standardized set of movements including 20 voluntary falls and 16 activities of daily living (ADLs), resulting in a large dataset with 2520 trials. To reduce the computational complexity of training and testing the classifiers, we focus on the raw data for each sensor in a 4 s time window around the point of peak total acceleration of the waist sensor, and then perform feature extraction and reduction. Most earlier studies on fall detection employ rule-based approaches that rely on simple thresholding of the sensor outputs. We successfully distinguish falls from ADLs using six machine learning techniques (classifiers): the k-nearest neighbor (k-NN) classifier, least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), dynamic time warping (DTW), and artificial neural networks (ANNs). We compare the performance and the computational complexity of the classifiers and achieve the best results with the k-NN classifier and LSM, with sensitivity, specificity, and accuracy all above 99%. These classifiers also have acceptable computational requirements for training and testing. Our approach would be applicable in real-world scenarios where data records of indeterminate length, containing multiple activities in sequence, are recorded.
Diagnosis of Tempromandibular Disorders Using Local Binary Patterns
Haghnegahdar, A.A.; Kolahi, S.; Khojastepour, L.; Tajeripour, F.
2018-01-01
Background: Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. Material and Methods: CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cut prepared from each condyle, although images were limited to head of mandibular condyle. In order to extract features of images, first we use LBP and then histogram of oriented gradients. To reduce dimensionality, the linear algebra Singular Value Decomposition (SVD) is applied to the feature vectors matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. Results: K nearest neighbor classifier achieves a very good accuracy (0.9242), moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, when other classifiers have lower accuracy, sensitivity and specificity. Conclusion: We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN has been the best classifier for our experiments in detecting patients from healthy individuals, by 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages. PMID:29732343
Kruppa, Jochen; Liu, Yufeng; Biau, Gérard; Kohler, Michael; König, Inke R; Malley, James D; Ziegler, Andreas
2014-07-01
Probability estimation for binary and multicategory outcome using logistic and multinomial logistic regression has a long-standing tradition in biostatistics. However, biases may occur if the model is misspecified. In contrast, outcome probabilities for individuals can be estimated consistently with machine learning approaches, including k-nearest neighbors (k-NN), bagged nearest neighbors (b-NN), random forests (RF), and support vector machines (SVM). Because machine learning methods are rarely used by applied biostatisticians, the primary goal of this paper is to explain the concept of probability estimation with these methods and to summarize recent theoretical findings. Probability estimation in k-NN, b-NN, and RF can be embedded into the class of nonparametric regression learning machines; therefore, we start with the construction of nonparametric regression estimates and review results on consistency and rates of convergence. In SVMs, outcome probabilities for individuals are estimated consistently by repeatedly solving classification problems. For SVMs we review classification problem and then dichotomous probability estimation. Next we extend the algorithms for estimating probabilities using k-NN, b-NN, and RF to multicategory outcomes and discuss approaches for the multicategory probability estimation problem using SVM. In simulation studies for dichotomous and multicategory dependent variables we demonstrate the general validity of the machine learning methods and compare it with logistic regression. However, each method fails in at least one simulation scenario. We conclude with a discussion of the failures and give recommendations for selecting and tuning the methods. Applications to real data and example code are provided in a companion article (doi:10.1002/bimj.201300077). © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Vivekananthan, Venkateswaran; Alluri, Nagamalleswara Rao; Purusothaman, Yuvasree; Chandrasekhar, Arunkumar; Kim, Sang-Jae
2017-10-12
Flexible, planar composite piezoelectric nanogenerators (C-PNGs) were developed to harness waste mechanical energy using cost-effective composite films (CFs) prepared via a probe-sonication technique. CFs, made up of highly crystalline, randomly oriented lead free piezoelectric nanoparticles (1 - x)K 0.5 Na 0.5 NbO 3 -xBaTiO 3 , where x = 0.02, 0.04, 0.06, or 0.08 [designated as KNN-xBTO], were impregnated in a polyvinylidene fluoride (PVDF) matrix. The KNN piezoelectric properties were tuned via the substitution of BTO nanoparticles, without altering the orthorhombic phase. A C-PNG device (x ≈ 0.02) generates a maximum open circuit voltage ≈160 V, and the instantaneous area power density is ≈14 mW m -2 upon a low mechanical force ≈0.4 N. The effects of BTO concentration in the KNN lattice, electrical poling effects, the fixed weight ratio of nanoparticles in the PVDF matrix, switching polarity tests, and load resistance analysis of C-PNG devices were investigated with constant mechanical force. Furthermore, the experimentally demonstrated C-PNG device output is sufficient to drive commercial blue light emitting diodes. The C-PNG device was placed on a road side, and the maximum energy generation and stability under real time harsh conditions, such as vehicle motion (motorcycle and bicycle) and human walking, were tested. C-PNG generates a peak-to-peak output voltage ≈16 V, when motorcycle forward/backward motion acts on it. This result indicates that the C-PNG device is a potential candidate to power road side sensors, speed tachometers, light indicators, etc. on highways.
Detecting paroxysmal coughing from pertussis cases using voice recognition technology.
Parker, Danny; Picone, Joseph; Harati, Amir; Lu, Shuang; Jenkyns, Marion H; Polgreen, Philip M
2013-01-01
Pertussis is highly contagious; thus, prompt identification of cases is essential to control outbreaks. Clinicians experienced with the disease can easily identify classic cases, where patients have bursts of rapid coughing followed by gasps, and a characteristic whooping sound. However, many clinicians have never seen a case, and thus may miss initial cases during an outbreak. The purpose of this project was to use voice-recognition software to distinguish pertussis coughs from croup and other coughs. We collected a series of recordings representing pertussis, croup and miscellaneous coughing by children. We manually categorized coughs as either pertussis or non-pertussis, and extracted features for each category. We used Mel-frequency cepstral coefficients (MFCC), a sampling rate of 16 KHz, a frame Duration of 25 msec, and a frame rate of 10 msec. The coughs were filtered. Each cough was divided into 3 sections of proportion 3-4-3. The average of the 13 MFCCs for each section was computed and made into a 39-element feature vector used for the classification. We used the following machine learning algorithms: Neural Networks, K-Nearest Neighbor (KNN), and a 200 tree Random Forest (RF). Data were reserved for cross-validation of the KNN and RF. The Neural Network was trained 100 times, and the averaged results are presented. After categorization, we had 16 examples of non-pertussis coughs and 31 examples of pertussis coughs. Over 90% of all pertussis coughs were properly classified as pertussis. The error rates were: Type I errors of 7%, 12%, and 25% and Type II errors of 8%, 0%, and 0%, using the Neural Network, Random Forest, and KNN, respectively. Our results suggest that we can build a robust classifier to assist clinicians and the public to help identify pertussis cases in children presenting with typical symptoms.
An ensemble of dissimilarity based classifiers for Mackerel gender determination
NASA Astrophysics Data System (ADS)
Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.
2014-03-01
Mackerel is an infravalored fish captured by European fishing vessels. A manner to add value to this specie can be achieved by trying to classify it attending to its sex. Colour measurements were performed on Mackerel females and males (fresh and defrozen) extracted gonads to obtain differences between sexes. Several linear and non linear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can been applied to this problem. However, theyare usually based on Euclidean distances that fail to reflect accurately the sample proximities. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kind of dissimilarity based classifiers. The diversity is induced considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity.
Aided diagnosis methods of breast cancer based on machine learning
NASA Astrophysics Data System (ADS)
Zhao, Yue; Wang, Nian; Cui, Xiaoyu
2017-08-01
In the field of medicine, quickly and accurately determining whether the patient is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor, Linear Discriminant Analysis, Logistic Regression were applied to predict the classification of thyroid,Her-2,PR,ER,Ki67,metastasis and lymph nodes in breast cancer, in order to recognize the benign and malignant breast tumors and achieve the purpose of aided diagnosis of breast cancer. The results showed that the highest classification accuracy of LDA was 88.56%, while the classification effect of KNN and Logistic Regression were better than that of LDA, the best accuracy reached 96.30%.
Li, Qu; Yao, Min; Yang, Jianhua; Xu, Ning
2014-01-01
Online friend recommendation is a fast developing topic in web mining. In this paper, we used SVD matrix factorization to model user and item feature vector and used stochastic gradient descent to amend parameter and improve accuracy. To tackle cold start problem and data sparsity, we used KNN model to influence user feature vector. At the same time, we used graph theory to partition communities with fairly low time and space complexity. What is more, matrix factorization can combine online and offline recommendation. Experiments showed that the hybrid recommendation algorithm is able to recommend online friends with good accuracy.
1987-05-01
Federal Aviati6n Regulation (FAR) Part 36. PSIL Preferred Speech Interference Level as specified in AFR 161-35. dB Decibel, Base 10 logarithmic ratio...I IP 1 0 I 1- *P N p 0 - 5V S 4 S r. V 0 l- N N N & N i% OW ta b, in i in n ivWvP1*1v vv A m Am ANN N KNN #ai I I I6 I I " Ic OSI-IS 0 9A N I, fu a t
Identifying the minor set cover of dense connected bipartite graphs via random matching edge sets
NASA Astrophysics Data System (ADS)
Hamilton, Kathleen E.; Humble, Travis S.
2017-04-01
Using quantum annealing to solve an optimization problem requires minor embedding a logic graph into a known hardware graph. In an effort to reduce the complexity of the minor embedding problem, we introduce the minor set cover (MSC) of a known graph G: a subset of graph minors which contain any remaining minor of the graph as a subgraph. Any graph that can be embedded into G will be embeddable into a member of the MSC. Focusing on embedding into the hardware graph of commercially available quantum annealers, we establish the MSC for a particular known virtual hardware, which is a complete bipartite graph. We show that the complete bipartite graph K_{N,N} has a MSC of N minors, from which K_{N+1} is identified as the largest clique minor of K_{N,N}. The case of determining the largest clique minor of hardware with faults is briefly discussed but remains an open question.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mahesh, P., E-mail: pamu@iitg.ernet.in; Subhash, T., E-mail: pamu@iitg.ernet.in; Pamu, D., E-mail: pamu@iitg.ernet.in
We report the dielectric properties of (K{sub 0.5}Na{sub 0.5})NbO{sub 3} ceramics doped with x wt% of Dy{sub 2}O{sub 3} (x= 0.0-1.5 wt%) using the broadband dielectric spectroscopy. The X-ray diffraction studies showed the formation of perovskite structure signifying that Dy{sub 2}O{sub 3} diffuse into the KNN lattice. Samples doped with x > 0.5 wt% exhibit smaller grain size and lower relative densities. The dielectric properties of KNN ceramics doped with Dy{sub 2}O{sub 3} are enhanced by increasing the Dy{sup 3+} content; among the compositions studied, x = 0.5 wt% exhibited the highest dielectric constant and lowest loss at 1MHz overmore » the temperature range of 30°C to 400°C. All the samples exhibit maximum dielectric constant at the Curie temperature (∼ 326°C) and a small peak in the dielectric constant at around 165°C is due to a structural phase transition.« less
Fuzzy Temporal Logic Based Railway Passenger Flow Forecast Model
Dou, Fei; Jia, Limin; Wang, Li; Xu, Jie; Huang, Yakun
2014-01-01
Passenger flow forecast is of essential importance to the organization of railway transportation and is one of the most important basics for the decision-making on transportation pattern and train operation planning. Passenger flow of high-speed railway features the quasi-periodic variations in a short time and complex nonlinear fluctuation because of existence of many influencing factors. In this study, a fuzzy temporal logic based passenger flow forecast model (FTLPFFM) is presented based on fuzzy logic relationship recognition techniques that predicts the short-term passenger flow for high-speed railway, and the forecast accuracy is also significantly improved. An applied case that uses the real-world data illustrates the precision and accuracy of FTLPFFM. For this applied case, the proposed model performs better than the k-nearest neighbor (KNN) and autoregressive integrated moving average (ARIMA) models. PMID:25431586
Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R; Guerra-Hernandez, Erick I; Almanza-Ojeda, Dora L; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J; Ibarra-Manzano, Mario A
2016-03-05
Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users where shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states.
Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R.; Guerra-Hernandez, Erick I.; Almanza-Ojeda, Dora L.; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J.; Ibarra-Manzano, Mario A.
2016-01-01
Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users where shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states. PMID:26959029
Ding, Huijun; He, Qing; Zhou, Yongjin; Dan, Guo; Cui, Song
2017-01-01
Motion-intent-based finger gesture recognition systems are crucial for many applications such as prosthesis control, sign language recognition, wearable rehabilitation system, and human–computer interaction. In this article, a motion-intent-based finger gesture recognition system is designed to correctly identify the tapping of every finger for the first time. Two auto-event annotation algorithms are firstly applied and evaluated for detecting the finger tapping frame. Based on the truncated signals, the Wavelet packet transform (WPT) coefficients are calculated and compressed as the features, followed by a feature selection method that is able to improve the performance by optimizing the feature set. Finally, three popular classifiers including naive Bayes (NBC), K-nearest neighbor (KNN), and support vector machine (SVM) are applied and evaluated. The recognition accuracy can be achieved up to 94%. The design and the architecture of the system are presented with full system characterization results. PMID:29167655
A case study on Discrete Wavelet Transform based Hurst exponent for epilepsy detection.
Madan, Saiby; Srivastava, Kajri; Sharmila, A; Mahalakshmi, P
2018-01-01
Epileptic seizures are manifestations of epilepsy. Careful analysis of EEG records can provide valuable insight and improved understanding of the mechanism causing epileptic disorders. The detection of epileptic form discharges in EEG is an important component in the diagnosis of epilepsy. As EEG signals are non-stationary, the conventional frequency and time domain analysis does not provide better accuracy. So, in this work an attempt has been made to provide an overview of the determination of epilepsy by implementation of Hurst exponent (HE)-based discrete wavelet transform techniques for feature extraction from EEG data sets obtained during ictal and pre ictal stages of affected person and finally classifying EEG signals using SVM and KNN Classifiers. The The highest accuracy of 99% is obtained using SVM.
Song, Wen; Ade, Carl; Broxterman, Ryan; Barstow, Thomas; Nelson, Thomas; Warren, Steve
2012-01-01
Accelerometer data provide useful information about subject activity in many different application scenarios. For this study, single-accelerometer data were acquired from subjects participating in field tests that mimic tasks that astronauts might encounter in reduced gravity environments. The primary goal of this effort was to apply classification algorithms that could identify these tasks based on features present in their corresponding accelerometer data, where the end goal is to establish methods to unobtrusively gauge subject well-being based on sensors that reside in their local environment. In this initial analysis, six different activities that involve leg movement are classified. The k-Nearest Neighbors (kNN) algorithm was found to be the most effective, with an overall classification success rate of 90.8%.
Zhao, Weixiang; Davis, Cristina E.
2011-01-01
Objective This paper introduces a modified artificial immune system (AIS)-based pattern recognition method to enhance the recognition ability of the existing conventional AIS-based classification approach and demonstrates the superiority of the proposed new AIS-based method via two case studies of breast cancer diagnosis. Methods and materials Conventionally, the AIS approach is often coupled with the k nearest neighbor (k-NN) algorithm to form a classification method called AIS-kNN. In this paper we discuss the basic principle and possible problems of this conventional approach, and propose a new approach where AIS is integrated with the radial basis function – partial least square regression (AIS-RBFPLS). Additionally, both the two AIS-based approaches are compared with two classical and powerful machine learning methods, back-propagation neural network (BPNN) and orthogonal radial basis function network (Ortho-RBF network). Results The diagnosis results show that: (1) both the AIS-kNN and the AIS-RBFPLS proved to be a good machine leaning method for clinical diagnosis, but the proposed AIS-RBFPLS generated an even lower misclassification ratio, especially in the cases where the conventional AIS-kNN approach generated poor classification results because of possible improper AIS parameters. For example, based upon the AIS memory cells of “replacement threshold = 0.3”, the average misclassification ratios of two approaches for study 1 are 3.36% (AIS-RBFPLS) and 9.07% (AIS-kNN), and the misclassification ratios for study 2 are 19.18% (AIS-RBFPLS) and 28.36% (AIS-kNN); (2) the proposed AIS-RBFPLS presented its robustness in terms of the AIS-created memory cells, showing a smaller standard deviation of the results from the multiple trials than AIS-kNN. For example, using the result from the first set of AIS memory cells as an example, the standard deviations of the misclassification ratios for study 1 are 0.45% (AIS-RBFPLS) and 8.71% (AIS-kNN) and those for study 2 are 0.49% (AIS-RBFPLS) and 6.61% (AIS-kNN); and (3) the proposed AIS-RBFPLS classification approaches also yielded better diagnosis results than two classical neural network approaches of BPNN and Ortho-RBF network. Conclusion In summary, this paper proposed a new machine learning method for complex systems by integrating the AIS system with RBFPLS. This new method demonstrates its satisfactory effect on classification accuracy for clinical diagnosis, and also indicates its wide potential applications to other diagnosis and detection problems. PMID:21515033
2007-02-28
Shah, D. Waagen, H. Schmitt, S. Bellofiore, A. Spanias, and D. Cochran, 32nd International Conference on Acoustics, Speech , and Signal Processing...Information Exploitation Office kNN k-Nearest Neighbor LEAN Laplacian Eigenmap Adaptive Neighbor LIP Linear Integer Programming ISP
Chen, Zhao; Cao, Yanfeng; He, Shuaibing; Qiao, Yanjiang
2018-01-01
Action (" gongxiao " in Chinese) of traditional Chinese medicine (TCM) is the high recapitulation for therapeutic and health-preserving effects under the guidance of TCM theory. TCM-defined herbal properties (" yaoxing " in Chinese) had been used in this research. TCM herbal property (TCM-HP) is the high generalization and summary for actions, both of which come from long-term effective clinical practice in two thousands of years in China. However, the specific relationship between TCM-HP and action of TCM is complex and unclear from a scientific perspective. The research about this is conducive to expound the connotation of TCM-HP theory and is of important significance for the development of the TCM-HP theory. One hundred and thirty-three herbs including 88 heat-clearing herbs (HCHs) and 45 blood-activating stasis-resolving herbs (BAHRHs) were collected from reputable TCM literatures, and their corresponding TCM-HPs/actions information were collected from Chinese pharmacopoeia (2015 edition). The Kennard-Stone (K-S) algorithm was used to split 133 herbs into 100 calibration samples and 33 validation samples. Then, machine learning methods including supported vector machine (SVM), k-nearest neighbor (kNN) and deep learning methods including deep belief network (DBN), convolutional neutral network (CNN) were adopted to develop action classification models based on TCM-HP theory, respectively. In order to ensure robustness, these four classification methods were evaluated by using the method of tenfold cross validation and 20 external validation samples for prediction. As results, 72.7-100% of 33 validation samples including 17 HCHs and 16 BASRHs were correctly predicted by these four types of methods. Both of the DBN and CNN methods gave out the best results and their sensitivity, specificity, precision, accuracy were all 100.00%. Especially, the predicted results of external validation set showed that the performance of deep learning methods (DBN, CNN) were better than traditional machine learning methods (kNN, SVM) in terms of their sensitivity, specificity, precision, accuracy. Moreover, the distribution patterns of TCM-HPs of HCHs and BASRHs were also analyzed to detect the featured TCM-HPs of these two types of herbs. The result showed that the featured TCM-HPs of HCHs were cold, bitter, liver and stomach meridians entered, while those of BASRHs were warm, bitter and pungent, liver meridian entered. The performance on validation set and external validation set of deep learning methods (DBN, CNN) were better than machine learning models (kNN, SVM) in sensitivity, specificity, precision, accuracy when predicting the actions of heat-clearing and blood-activating stasis-resolving based on TCM-HP theory. The deep learning classification methods owned better generalization ability and accuracy when predicting the actions of heat-clearing and blood-activating stasis-resolving based on TCM-HP theory. Besides, the methods of deep learning would help us to improve our understanding about the relationship between herbal property and action, as well as to enrich and develop the theory of TCM-HP scientifically.
NASA Astrophysics Data System (ADS)
Wigdahl, J.; Agurto, C.; Murray, V.; Barriga, S.; Soliz, P.
2013-03-01
Diabetic retinopathy (DR) affects more than 4.4 million Americans age 40 and over. Automatic screening for DR has shown to be an efficient and cost-effective way to lower the burden on the healthcare system, by triaging diabetic patients and ensuring timely care for those presenting with DR. Several supervised algorithms have been developed to detect pathologies related to DR, but little work has been done in determining the size of the training set that optimizes an algorithm's performance. In this paper we analyze the effect of the training sample size on the performance of a top-down DR screening algorithm for different types of statistical classifiers. Results are based on partial least squares (PLS), support vector machines (SVM), k-nearest neighbor (kNN), and Naïve Bayes classifiers. Our dataset consisted of digital retinal images collected from a total of 745 cases (595 controls, 150 with DR). We varied the number of normal controls in the training set, while keeping the number of DR samples constant, and repeated the procedure 10 times using randomized training sets to avoid bias. Results show increasing performance in terms of area under the ROC curve (AUC) when the number of DR subjects in the training set increased, with similar trends for each of the classifiers. Of these, PLS and k-NN had the highest average AUC. Lower standard deviation and a flattening of the AUC curve gives evidence that there is a limit to the learning ability of the classifiers and an optimal number of cases to train on.
Streamlining machine learning in mobile devices for remote sensing
NASA Astrophysics Data System (ADS)
Coronel, Andrei D.; Estuar, Ma. Regina E.; Garcia, Kyle Kristopher P.; Dela Cruz, Bon Lemuel T.; Torrijos, Jose Emmanuel; Lim, Hadrian Paulo M.; Abu, Patricia Angela R.; Victorino, John Noel C.
2017-09-01
Mobile devices have been at the forefront of Intelligent Farming because of its ubiquitous nature. Applications on precision farming have been developed on smartphones to allow small farms to monitor environmental parameters surrounding crops. Mobile devices are used for most of these applications, collecting data to be sent to the cloud for storage, analysis, modeling and visualization. However, with the issue of weak and intermittent connectivity in geographically challenged areas of the Philippines, the solution is to provide analysis on the phone itself. Given this, the farmer gets a real time response after data submission. Though Machine Learning is promising, hardware constraints in mobile devices limit the computational capabilities, making model development on the phone restricted and challenging. This study discusses the development of a Machine Learning based mobile application using OpenCV libraries. The objective is to enable the detection of Fusarium oxysporum cubense (Foc) in juvenile and asymptomatic bananas using images of plant parts and microscopic samples as input. Image datasets of attached, unattached, dorsal, and ventral views of leaves were acquired through sampling protocols. Images of raw and stained specimens from soil surrounding the plant, and sap from the plant resulted to stained and unstained samples respectively. Segmentation and feature extraction techniques were applied to all images. Initial findings show no significant differences among the different feature extraction techniques. For differentiating infected from non-infected leaves, KNN yields highest average accuracy, as opposed to Naive Bayes and SVM. For microscopic images using MSER feature extraction, KNN has been tested as having a better accuracy than SVM or Naive-Bayes.
Diagnosis of response and non-response to dry eye treatment using infrared thermography images
NASA Astrophysics Data System (ADS)
Acharya, U. Rajendra; Tan, Jen Hong; Vidya, S.; Yeo, Sharon; Too, Cheah Loon; Lim, Wei Jie Eugene; Chua, Kuang Chua; Tong, Louis
2014-11-01
The dry eye treatment outcome depends on the assessment of clinical relevance of the treatment effect. The potential approach to assess the clinical relevance of the treatment is to identify the symptoms responders and non-responders to the given treatments using the responder analysis. In our work, we have performed the responder analysis to assess the clinical relevance effect of the dry eye treatments namely, hot towel, EyeGiene®, and Blephasteam® twice daily and 12 min session of Lipiflow®. Thermography is performed at week 0 (baseline), at weeks 4 and 12 after treatment. The clinical parameters such as, change in the clinical irritations scores, tear break up time (TBUT), corneal staining and Schirmer's symptoms tests values are used to obtain the responders and non-responders groups. We have obtained the infrared thermography images of dry eye symptoms responders and non-responders to the three types of warming treatments. The energy, kurtosis, skewness, mean, standard deviation, and various entropies namely Shannon, Renyi and Kapoor are extracted from responders and non-responders thermograms. The extracted features are ranked based on t-values. These ranked features are fed to the various classifiers to get the highest performance using minimum features. We have used decision tree (DT), K nearest neighbour (KNN), Naves Bayesian (NB) and support vector machine (SVM) to classify the features into responder and non-responder classes. We have obtained an average accuracy of 99.88%, sensitivity of 99.7% and specificity of 100% using KNN classifier using ten-fold cross validation.
Multi-feature classifiers for burst detection in single EEG channels from preterm infants
NASA Astrophysics Data System (ADS)
Navarro, X.; Porée, F.; Kuchenbuch, M.; Chavez, M.; Beuchée, Alain; Carrault, G.
2017-08-01
Objective. The study of electroencephalographic (EEG) bursts in preterm infants provides valuable information about maturation or prognostication after perinatal asphyxia. Over the last two decades, a number of works proposed algorithms to automatically detect EEG bursts in preterm infants, but they were designed for populations under 35 weeks of post menstrual age (PMA). However, as the brain activity evolves rapidly during postnatal life, these solutions might be under-performing with increasing PMA. In this work we focused on preterm infants reaching term ages (PMA ⩾36 weeks) using multi-feature classification on a single EEG channel. Approach. Five EEG burst detectors relying on different machine learning approaches were compared: logistic regression (LR), linear discriminant analysis (LDA), k-nearest neighbors (kNN), support vector machines (SVM) and thresholding (Th). Classifiers were trained by visually labeled EEG recordings from 14 very preterm infants (born after 28 weeks of gestation) with 36-41 weeks PMA. Main results. The most performing classifiers reached about 95% accuracy (kNN, SVM and LR) whereas Th obtained 84%. Compared to human-automatic agreements, LR provided the highest scores (Cohen’s kappa = 0.71) using only three EEG features. Applying this classifier in an unlabeled database of 21 infants ⩾36 weeks PMA, we found that long EEG bursts and short inter-burst periods are characteristic of infants with the highest PMA and weights. Significance. In view of these results, LR-based burst detection could be a suitable tool to study maturation in monitoring or portable devices using a single EEG channel.
A MapReduce approach to diminish imbalance parameters for big deoxyribonucleic acid dataset.
Kamal, Sarwar; Ripon, Shamim Hasnat; Dey, Nilanjan; Ashour, Amira S; Santhi, V
2016-07-01
In the age of information superhighway, big data play a significant role in information processing, extractions, retrieving and management. In computational biology, the continuous challenge is to manage the biological data. Data mining techniques are sometimes imperfect for new space and time requirements. Thus, it is critical to process massive amounts of data to retrieve knowledge. The existing software and automated tools to handle big data sets are not sufficient. As a result, an expandable mining technique that enfolds the large storage and processing capability of distributed or parallel processing platforms is essential. In this analysis, a contemporary distributed clustering methodology for imbalance data reduction using k-nearest neighbor (K-NN) classification approach has been introduced. The pivotal objective of this work is to illustrate real training data sets with reduced amount of elements or instances. These reduced amounts of data sets will ensure faster data classification and standard storage management with less sensitivity. However, general data reduction methods cannot manage very big data sets. To minimize these difficulties, a MapReduce-oriented framework is designed using various clusters of automated contents, comprising multiple algorithmic approaches. To test the proposed approach, a real DNA (deoxyribonucleic acid) dataset that consists of 90 million pairs has been used. The proposed model reduces the imbalance data sets from large-scale data sets without loss of its accuracy. The obtained results depict that MapReduce based K-NN classifier provided accurate results for big data of DNA. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Large-scale grain growth in the solid-state process: From "Abnormal" to "Normal"
NASA Astrophysics Data System (ADS)
Jiang, Minhong; Han, Shengnan; Zhang, Jingwei; Song, Jiageng; Hao, Chongyan; Deng, Manjiao; Ge, Lingjing; Gu, Zhengfei; Liu, Xinyu
2018-02-01
Abnormal grain growth (AGG) has been a common phenomenon during the ceramic or metallurgy processing since prehistoric times. However, usually it had been very difficult to grow big single crystal (centimeter scale over) by using the AGG method due to its so-called occasionality. Based on the AGG, a solid-state crystal growth (SSCG) method was developed. The greatest advantages of the SSCG technology are the simplicity and cost-effectiveness of the technique. But the traditional SSCG technology is still uncontrollable. This article first summarizes the history and current status of AGG, and then reports recent technical developments from AGG to SSCG, and further introduces a new seed-free, solid-state crystal growth (SFSSCG) technology. This SFSSCG method allows us to repeatedly and controllably fabricate large-scale single crystals with appreciable high quality and relatively stable chemical composition at a relatively low temperature, at least in (K0.5Na0.5)NbO3(KNN) and Cu-Al-Mn systems. In this sense, the exaggerated grain growth is no longer 'Abnormal' but 'Normal' since it is able to be artificially controllable and repeated now. This article also provides a crystal growth model to qualitatively explain the mechanism of SFSSCG for KNN system. Compared with the traditional melt and high temperature solution growth methods, the SFSSCG method has the advantages of low energy consumption, low investment, simple technique, composition homogeneity overcoming the issues with incongruent melting and high volatility. This SFSSCG could be helpful for improving the mechanical and physical properties of single crystals, which should be promising for industrial applications.
A novel feature ranking algorithm for biometric recognition with PPG signals.
Reşit Kavsaoğlu, A; Polat, Kemal; Recep Bozkurt, M
2014-06-01
This study is intended for describing the application of the Photoplethysmography (PPG) signal and the time domain features acquired from its first and second derivatives for biometric identification. For this purpose, a sum of 40 features has been extracted and a feature-ranking algorithm is proposed. This proposed algorithm calculates the contribution of each feature to biometric recognition and collocates the features, the contribution of which is from great to small. While identifying the contribution of the features, the Euclidean distance and absolute distance formulas are used. The efficiency of the proposed algorithms is demonstrated by the results of the k-NN (k-nearest neighbor) classifier applications of the features. During application, each 15-period-PPG signal belonging to two different durations from each of the thirty healthy subjects were used with a PPG data acquisition card. The first PPG signals recorded from the subjects were evaluated as the 1st configuration; the PPG signals recorded later at a different time as the 2nd configuration and the combination of both were evaluated as the 3rd configuration. When the results were evaluated for the k-NN classifier model created along with the proposed algorithm, an identification of 90.44% for the 1st configuration, 94.44% for the 2nd configuration, and 87.22% for the 3rd configuration has successfully been attained. The obtained results showed that both the proposed algorithm and the biometric identification model based on this developed PPG signal are very promising for contactless recognizing the people with the proposed method. Copyright © 2014 Elsevier Ltd. All rights reserved.
Ontology based decision system for breast cancer diagnosis
NASA Astrophysics Data System (ADS)
Trabelsi Ben Ameur, Soumaya; Cloppet, Florence; Wendling, Laurent; Sellami, Dorra
2018-04-01
In this paper, we focus on analysis and diagnosis of breast masses inspired by expert concepts and rules. Accordingly, a Bag of Words is built based on the ontology of breast cancer diagnosis, accurately described in the Breast Imaging Reporting and Data System. To fill the gap between low level knowledge and expert concepts, a semantic annotation is developed using a machine learning tool. Then, breast masses are classified into benign or malignant according to expert rules implicitly modeled with a set of classifiers (KNN, ANN, SVM and Decision Tree). This semantic context of analysis offers a frame where we can include external factors and other meta-knowledge such as patient risk factors as well as exploiting more than one modality. Based on MRI and DECEDM modalities, our developed system leads a recognition rate of 99.7% with Decision Tree where an improvement of 24.7 % is obtained owing to semantic analysis.
E-Nose Vapor Identification Based on Dempster-Shafer Fusion of Multiple Classifiers
NASA Technical Reports Server (NTRS)
Li, Winston; Leung, Henry; Kwan, Chiman; Linnell, Bruce R.
2005-01-01
Electronic nose (e-nose) vapor identification is an efficient approach to monitor air contaminants in space stations and shuttles in order to ensure the health and safety of astronauts. Data preprocessing (measurement denoising and feature extraction) and pattern classification are important components of an e-nose system. In this paper, a wavelet-based denoising method is applied to filter the noisy sensor measurements. Transient-state features are then extracted from the denoised sensor measurements, and are used to train multiple classifiers such as multi-layer perceptions (MLP), support vector machines (SVM), k nearest neighbor (KNN), and Parzen classifier. The Dempster-Shafer (DS) technique is used at the end to fuse the results of the multiple classifiers to get the final classification. Experimental analysis based on real vapor data shows that the wavelet denoising method can remove both random noise and outliers successfully, and the classification rate can be improved by using classifier fusion.
Statistical and Adaptive Signal Processing for UXO Discrimination for Next-Generation Sensor Data
2009-09-01
using the energies of all polarizations as features in a KNN classifier variant resulted in 100% probability of detection at a probability of false...International Conference on Acoustics, Speech , and Signal Processing, vol. V, 2005, pp. 885-888. [12] C. Kreucher, K. Kastella, and A. O. Hero
Implementation of a Compiler for the Functional Programming Language PHI.
1987-06-01
Chapter Three. 8 his acceptance speech for the 1977 ACM Turing Award, Backus criticized traditional programming languages and programming styles. He went... Knn "mfrn ~ i ptr ->type =type; :.f (f~ead :=NULL) { -st alreaay ex-s~ tracer f head; wnile (tracer->iink - NU:LL) rdent of >-sl tracer = : racer- Iik
Rapid lard identification with portable electronic nose
NASA Astrophysics Data System (ADS)
Latief, Marsad; Khorsidtalab, Aida; Saputra, Irwan; Akmeliawati, Rini; Nurashikin, Anis; Jaswir, Irwandi; Witjaksono, Gunawan
2017-11-01
Human sensory systems are limited in many different regards, yet they are great sources of inspiration for development of technologies that help humans to overcome their restraints. This paper signifies the capability of our developed electronic nose in rapid lard identification. The developed device, known as E-Nose, mimics human’s olfactory system’s technique to identify a particular substance. Lard is a common pig derivative which is often used as a food additive, emulsion or shortening. It’s also commonly used as an adulterant or as an alternative for cooking oils, margarine and butter. This substance is prohibited to be consumed by Muslims and Orthodox Jews for religious reasons. A portable reliable device with an ability to identify lard rapidly can be convenient to users concerned about lard adulteration. The prototype was examined using K-Nearest Neighbors algorithm (KNN), Support Vector Machine (SVM), Bagged Trees and Simple Tree, and can identify lard with the highest accuracy of 95.6% among three types of fat (lard, chicken and beef) in liquid form over a certain range of temperature using KNN.
Al-Qazzaz, Noor Kamal; Ali, Sawal; Ahmad, Siti Anom; Escudero, Javier
2017-07-01
The aim of the present study was to discriminate the electroencephalogram (EEG) of 5 patients with vascular dementia (VaD), 15 patients with stroke-related mild cognitive impairment (MCI), and 15 control normal subjects during a working memory (WM) task. We used independent component analysis (ICA) and wavelet transform (WT) as a hybrid preprocessing approach for EEG artifact removal. Three different features were extracted from the cleaned EEG signals: spectral entropy (SpecEn), permutation entropy (PerEn) and Tsallis entropy (TsEn). Two classification schemes were applied - support vector machine (SVM) and k-nearest neighbors (kNN) - with fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR) as a dimensionality reduction technique. The FNPAQR dimensionality reduction technique increased the SVM classification accuracy from 82.22% to 90.37% and from 82.6% to 86.67% for kNN. These results suggest that FNPAQR consistently improves the discrimination of VaD, MCI patients and control normal subjects and it could be a useful feature selection to help the identification of patients with VaD and MCI.
Yadav, Bechu K V; Nandy, S
2015-05-01
Mapping forest biomass is fundamental for estimating CO₂ emissions, and planning and monitoring of forests and ecosystem productivity. The present study attempted to map aboveground woody biomass (AGWB) integrating forest inventory, remote sensing and geostatistical techniques, viz., direct radiometric relationships (DRR), k-nearest neighbours (k-NN) and cokriging (CoK) and to evaluate their accuracy. A part of the Timli Forest Range of Kalsi Soil and Water Conservation Division, Uttarakhand, India was selected for the present study. Stratified random sampling was used to collect biophysical data from 36 sample plots of 0.1 ha (31.62 m × 31.62 m) size. Species-specific volumetric equations were used for calculating volume and multiplied by specific gravity to get biomass. Three forest-type density classes, viz. 10-40, 40-70 and >70% of Shorea robusta forest and four non-forest classes were delineated using on-screen visual interpretation of IRS P6 LISS-III data of December 2012. The volume in different strata of forest-type density ranged from 189.84 to 484.36 m(3) ha(-1). The total growing stock of the forest was found to be 2,024,652.88 m(3). The AGWB ranged from 143 to 421 Mgha(-1). Spectral bands and vegetation indices were used as independent variables and biomass as dependent variable for DRR, k-NN and CoK. After validation and comparison, k-NN method of Mahalanobis distance (root mean square error (RMSE) = 42.25 Mgha(-1)) was found to be the best method followed by fuzzy distance and Euclidean distance with RMSE of 44.23 and 45.13 Mgha(-1) respectively. DRR was found to be the least accurate method with RMSE of 67.17 Mgha(-1). The study highlighted the potential of integrating of forest inventory, remote sensing and geostatistical techniques for forest biomass mapping.
Ali, Safdar; Majid, Abdul; Khan, Asifullah
2014-04-01
Development of an accurate and reliable intelligent decision-making method for the construction of cancer diagnosis system is one of the fast growing research areas of health sciences. Such decision-making system can provide adequate information for cancer diagnosis and drug discovery. Descriptors derived from physicochemical properties of protein sequences are very useful for classifying cancerous proteins. Recently, several interesting research studies have been reported on breast cancer classification. To this end, we propose the exploitation of the physicochemical properties of amino acids in protein primary sequences such as hydrophobicity (Hd) and hydrophilicity (Hb) for breast cancer classification. Hd and Hb properties of amino acids, in recent literature, are reported to be quite effective in characterizing the constituent amino acids and are used to study protein foldings, interactions, structures, and sequence-order effects. Especially, using these physicochemical properties, we observed that proline, serine, tyrosine, cysteine, arginine, and asparagine amino acids offer high discrimination between cancerous and healthy proteins. In addition, unlike traditional ensemble classification approaches, the proposed 'IDM-PhyChm-Ens' method was developed by combining the decision spaces of a specific classifier trained on different feature spaces. The different feature spaces used were amino acid composition, split amino acid composition, and pseudo amino acid composition. Consequently, we have exploited different feature spaces using Hd and Hb properties of amino acids to develop an accurate method for classification of cancerous protein sequences. We developed ensemble classifiers using diverse learning algorithms such as random forest (RF), support vector machines (SVM), and K-nearest neighbor (KNN) trained on different feature spaces. We observed that ensemble-RF, in case of cancer classification, performed better than ensemble-SVM and ensemble-KNN. Our analysis demonstrates that ensemble-RF, ensemble-SVM and ensemble-KNN are more effective than their individual counterparts. The proposed 'IDM-PhyChm-Ens' method has shown improved performance compared to existing techniques.
Detecting Paroxysmal Coughing from Pertussis Cases Using Voice Recognition Technology
Parker, Danny; Picone, Joseph; Harati, Amir; Lu, Shuang; Jenkyns, Marion H.; Polgreen, Philip M.
2013-01-01
Background Pertussis is highly contagious; thus, prompt identification of cases is essential to control outbreaks. Clinicians experienced with the disease can easily identify classic cases, where patients have bursts of rapid coughing followed by gasps, and a characteristic whooping sound. However, many clinicians have never seen a case, and thus may miss initial cases during an outbreak. The purpose of this project was to use voice-recognition software to distinguish pertussis coughs from croup and other coughs. Methods We collected a series of recordings representing pertussis, croup and miscellaneous coughing by children. We manually categorized coughs as either pertussis or non-pertussis, and extracted features for each category. We used Mel-frequency cepstral coefficients (MFCC), a sampling rate of 16 KHz, a frame Duration of 25 msec, and a frame rate of 10 msec. The coughs were filtered. Each cough was divided into 3 sections of proportion 3-4-3. The average of the 13 MFCCs for each section was computed and made into a 39-element feature vector used for the classification. We used the following machine learning algorithms: Neural Networks, K-Nearest Neighbor (KNN), and a 200 tree Random Forest (RF). Data were reserved for cross-validation of the KNN and RF. The Neural Network was trained 100 times, and the averaged results are presented. Results After categorization, we had 16 examples of non-pertussis coughs and 31 examples of pertussis coughs. Over 90% of all pertussis coughs were properly classified as pertussis. The error rates were: Type I errors of 7%, 12%, and 25% and Type II errors of 8%, 0%, and 0%, using the Neural Network, Random Forest, and KNN, respectively. Conclusion Our results suggest that we can build a robust classifier to assist clinicians and the public to help identify pertussis cases in children presenting with typical symptoms. PMID:24391730
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boutilier, Justin J., E-mail: j.boutilier@mail.utoronto.ca; Lee, Taewoo; Craig, Tim
Purpose: To develop and evaluate the clinical applicability of advanced machine learning models that simultaneously predict multiple optimization objective function weights from patient geometry for intensity-modulated radiation therapy of prostate cancer. Methods: A previously developed inverse optimization method was applied retrospectively to determine optimal objective function weights for 315 treated patients. The authors used an overlap volume ratio (OV) of bladder and rectum for different PTV expansions and overlap volume histogram slopes (OVSR and OVSB for the rectum and bladder, respectively) as explanatory variables that quantify patient geometry. Using the optimal weights as ground truth, the authors trained and appliedmore » three prediction models: logistic regression (LR), multinomial logistic regression (MLR), and weighted K-nearest neighbor (KNN). The population average of the optimal objective function weights was also calculated. Results: The OV at 0.4 cm and OVSR at 0.1 cm features were found to be the most predictive of the weights. The authors observed comparable performance (i.e., no statistically significant difference) between LR, MLR, and KNN methodologies, with LR appearing to perform the best. All three machine learning models outperformed the population average by a statistically significant amount over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and dose to the bladder, rectum, CTV, and PTV. When comparing the weights directly, the LR model predicted bladder and rectum weights that had, on average, a 73% and 74% relative improvement over the population average weights, respectively. The treatment plans resulting from the LR weights had, on average, a rectum V70Gy that was 35% closer to the clinical plan and a bladder V70Gy that was 29% closer, compared to the population average weights. Similar results were observed for all other clinical metrics. Conclusions: The authors demonstrated that the KNN and MLR weight prediction methodologies perform comparably to the LR model and can produce clinical quality treatment plans by simultaneously predicting multiple weights that capture trade-offs associated with sparing multiple OARs.« less
Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse
2013-05-14
We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness in diabetic patients to develop hypertension and vice versa. We ascertain the importance of ethnicity (and natives vs expatriate migrants) and of using regional data in risk assessment. Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross validation to obtain generalisation accuracies and errors. Kuwait Health Network (KHN) that integrates data from primary health centres and hospitals in Kuwait. 270 172 hospital visitors (of which, 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives, Asian and Arab expatriates. Incident type 2 diabetes, hypertension and comorbidity. Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign 'high' risk to 75% of diabetic patients and to 94% of hypertensive patients. Only 5% of diabetic patients are seen assigned 'low' risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in hypertensive population and those of hypertension are modelled. Two-stage aggregate classification models and risk assessment tools, built combining both the component models on diabetes (or on hypertension), perform better than individual models. Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case-control models to assess risks. These tools aid in the preliminary non-intrusive assessment of the population. Ethnicity is seen significant to the predictive models. Risk assessments need to be developed using regional data as we demonstrate the applicability of the American Diabetes Association online calculator on data from Kuwait.
NASA Astrophysics Data System (ADS)
Poyatos, Rafael; Sus, Oliver; Vilà-Cabrera, Albert; Vayreda, Jordi; Badiella, Llorenç; Mencuccini, Maurizio; Martínez-Vilalta, Jordi
2016-04-01
Plant functional traits are increasingly being used in ecosystem ecology thanks to the growing availability of large ecological databases. However, these databases usually contain a large fraction of missing data because measuring plant functional traits systematically is labour-intensive and because most databases are compilations of datasets with different sampling designs. As a result, within a given database, there is an inevitable variability in the number of traits available for each data entry and/or the species coverage in a given geographical area. The presence of missing data may severely bias trait-based analyses, such as the quantification of trait covariation or trait-environment relationships and may hamper efforts towards trait-based modelling of ecosystem biogeochemical cycles. Several data imputation (i.e. gap-filling) methods have been recently tested on compiled functional trait databases, but the performance of imputation methods applied to a functional trait database with a regular spatial sampling has not been thoroughly studied. Here, we assess the effects of data imputation on five tree functional traits (leaf biomass to sapwood area ratio, foliar nitrogen, maximum height, specific leaf area and wood density) in the Ecological and Forest Inventory of Catalonia, an extensive spatial database (covering 31900 km2). We tested the performance of species mean imputation, single imputation by the k-nearest neighbors algorithm (kNN) and a multiple imputation method, Multivariate Imputation with Chained Equations (MICE) at different levels of missing data (10%, 30%, 50%, and 80%). We also assessed the changes in imputation performance when additional predictors (species identity, climate, forest structure, spatial structure) were added in kNN and MICE imputations. We evaluated the imputed datasets using a battery of indexes describing departure from the complete dataset in trait distribution, in the mean prediction error, in the correlation matrix and in selected bivariate trait relationships. MICE yielded imputations which better preserved the variability and covariance structure of the data and provided an estimate of between-imputation uncertainty. We found that adding species identity as a predictor in MICE and kNN improved imputation for all traits, but adding climate did not lead to any appreciable improvement. However, forest structure and spatial structure did reduce imputation errors in maximum height and in leaf biomass to sapwood area ratios, respectively. Although species mean imputations showed the lowest error for 3 out the 5 studied traits, dataset-averaged errors were lowest for MICE imputations with all additional predictors, when missing data levels were 50% or lower. Species mean imputations always resulted in larger errors in the correlation matrix and appreciably altered the studied bivariate trait relationships. In conclusion, MICE imputations using species identity, climate, forest structure and spatial structure as predictors emerged as the most suitable method of the ones tested here, but it was also evident that imputation performance deteriorates at high levels of missing data (80%).
Creating Profiles from User Network Behavior
2013-09-01
We varied the m-estimate in Naïve Bayes, m for pruning in Learning Tree, and how many k nearest neighbors to select from in KNN, before settling on the...N. Taft, “The cubicle vs. the coffee shop: behavioral modes in enterprise end-users,” in Proc. of the 9th Int. Conf. on Passive and Active Network
Rezaee, Kh.; Azizi, E.; Haddadnia, J.
2016-01-01
Background Epilepsy is a severe disorder of the central nervous system that predisposes the person to recurrent seizures. Fifty million people worldwide suffer from epilepsy; after Alzheimer’s and stroke, it is the third widespread nervous disorder. Objective In this paper, an algorithm to detect the onset of epileptic seizures based on the analysis of brain electrical signals (EEG) has been proposed. 844 hours of EEG were recorded form 23 pediatric patients consecutively with 163 occurrences of seizures. Signals had been collected from Children’s Hospital Boston with a sampling frequency of 256 Hz through 18 channels in order to assess epilepsy surgery. By selecting effective features from seizure and non-seizure signals of each individual and putting them into two categories, the proposed algorithm detects the onset of seizures quickly and with high sensitivity. Method In this algorithm, L-sec epochs of signals are displayed in form of a third-order tensor in spatial, spectral and temporal spaces by applying wavelet transform. Then, after applying general tensor discriminant analysis (GTDA) on tensors and calculating mapping matrix, feature vectors are extracted. GTDA increases the sensitivity of the algorithm by storing data without deleting them. Finally, K-Nearest neighbors (KNN) is used to classify the selected features. Results The results of simulating algorithm on algorithm standard dataset shows that the algorithm is capable of detecting 98 percent of seizures with an average delay of 4.7 seconds and the average error rate detection of three errors in 24 hours. Conclusion Today, the lack of an automated system to detect or predict the seizure onset is strongly felt. PMID:27672628
Development of machine learning models for diagnosis of glaucoma.
Kim, Seong Jae; Cho, Kyong Jin; Oh, Sejong
2017-01-01
The study aimed to develop machine learning models that have strong prediction power and interpretability for diagnosis of glaucoma based on retinal nerve fiber layer (RNFL) thickness and visual field (VF). We collected various candidate features from the examination of retinal nerve fiber layer (RNFL) thickness and visual field (VF). We also developed synthesized features from original features. We then selected the best features proper for classification (diagnosis) through feature evaluation. We used 100 cases of data as a test dataset and 399 cases of data as a training and validation dataset. To develop the glaucoma prediction model, we considered four machine learning algorithms: C5.0, random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN). We repeatedly composed a learning model using the training dataset and evaluated it by using the validation dataset. Finally, we got the best learning model that produces the highest validation accuracy. We analyzed quality of the models using several measures. The random forest model shows best performance and C5.0, SVM, and KNN models show similar accuracy. In the random forest model, the classification accuracy is 0.98, sensitivity is 0.983, specificity is 0.975, and AUC is 0.979. The developed prediction models show high accuracy, sensitivity, specificity, and AUC in classifying among glaucoma and healthy eyes. It will be used for predicting glaucoma against unknown examination records. Clinicians may reference the prediction results and be able to make better decisions. We may combine multiple learning models to increase prediction accuracy. The C5.0 model includes decision rules for prediction. It can be used to explain the reasons for specific predictions.
NASA Astrophysics Data System (ADS)
Mura, Matteo; Bottalico, Francesca; Giannetti, Francesca; Bertani, Remo; Giannini, Raffaello; Mancini, Marco; Orlandini, Simone; Travaglini, Davide; Chirici, Gherardo
2018-04-01
The spatial prediction of growing stock volume is one of the most frequent application of remote sensing for supporting the sustainable management of forest ecosystems. For such a purpose data from active or passive sensors are used as predictor variables in combination with measures taken in the field in sampling plots. The Sentinel-2 (S2) satellites are equipped with a Multi Spectral Instrument (MSI) capable of acquiring 13 bands in the visible and infrared domains with a spatial resolution varying between 10 and 60 m. The present study aimed at evaluating the performance of the S2-MSI imagery for estimating the growing stock volume of forest ecosystems. To do so we used 240 plots measured in two study areas in Italy. The imputation was carried out with eight k-Nearest Neighbours (k-NN) methods available in the open source YaImpute R package. In order to evaluate the S2-MSI performance we repeated the experimental protocol also with two other sets of images acquired by two well-known satellites equipped with multi spectral instruments: Landsat 8 OLI and RapidEye scanner. We found that S2 worked better than Landsat in 37.5% of the cases and in 62.5% of the cases better than RapidEye. In one study area the best performance was obtained with Landsat OLI (RMSD = 6.84%) and in the other with S2 (RMSD = 22.94%), both with the k-NN system based on a distance matrix calculated with the Random Forest algorithm. The results confirmed that S2 images are suitable for predicting growing stock volume obtaining good performances (average RMSD for both the test areas of less than 19%).
Zheng, Bin; Lu, Amy; Hardesty, Lara A; Sumkin, Jules H; Hakim, Christiane M; Ganott, Marie A; Gur, David
2006-01-01
The purpose of this study was to develop and test a method for selecting "visually similar" regions of interest depicting breast masses from a reference library to be used in an interactive computer-aided diagnosis (CAD) environment. A reference library including 1000 malignant mass regions and 2000 benign and CAD-generated false-positive regions was established. When a suspicious mass region is identified, the scheme segments the region and searches for similar regions from the reference library using a multifeature based k-nearest neighbor (KNN) algorithm. To improve selection of reference images, we added an interactive step. All actual masses in the reference library were subjectively rated on a scale from 1 to 9 as to their "visual margins speculations". When an observer identifies a suspected mass region during a case interpretation he/she first rates the margins and the computerized search is then limited only to regions rated as having similar levels of spiculation (within +/-1 scale difference). In an observer preference study including 85 test regions, two sets of the six "similar" reference regions selected by the KNN with and without the interactive step were displayed side by side with each test region. Four radiologists and five nonclinician observers selected the more appropriate ("similar") reference set in a two alternative forced choice preference experiment. All four radiologists and five nonclinician observers preferred the sets of regions selected by the interactive method with an average frequency of 76.8% and 74.6%, respectively. The overall preference for the interactive method was highly significant (p < 0.001). The study demonstrated that a simple interactive approach that includes subjectively perceived ratings of one feature alone namely, a rating of margin "spiculation," could substantially improve the selection of "visually similar" reference images.
Modeling ready biodegradability of fragrance materials.
Ceriani, Lidia; Papa, Ester; Kovarich, Simona; Boethling, Robert; Gramatica, Paola
2015-06-01
In the present study, quantitative structure activity relationships were developed for predicting ready biodegradability of approximately 200 heterogeneous fragrance materials. Two classification methods, classification and regression tree (CART) and k-nearest neighbors (kNN), were applied to perform the modeling. The models were validated with multiple external prediction sets, and the structural applicability domain was verified by the leverage approach. The best models had good sensitivity (internal ≥80%; external ≥68%), specificity (internal ≥80%; external 73%), and overall accuracy (≥75%). Results from the comparison with BIOWIN global models, based on group contribution method, show that specific models developed in the present study perform better in prediction than BIOWIN6, in particular for the correct classification of not readily biodegradable fragrance materials. © 2015 SETAC.
Accounting for one-channel depletion improves missing value imputation in 2-dye microarray data.
Ritz, Cecilia; Edén, Patrik
2008-01-19
For 2-dye microarray platforms, some missing values may arise from an un-measurably low RNA expression in one channel only. Information of such "one-channel depletion" is so far not included in algorithms for imputation of missing values. Calculating the mean deviation between imputed values and duplicate controls in five datasets, we show that KNN-based imputation gives a systematic bias of the imputed expression values of one-channel depleted spots. Evaluating the correction of this bias by cross-validation showed that the mean square deviation between imputed values and duplicates were reduced up to 51%, depending on dataset. By including more information in the imputation step, we more accurately estimate missing expression values.
Fall Detection Using Smartphone Audio Features.
Cheffena, Michael
2016-07-01
An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficents (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers: k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN) are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with ANN classifier with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirement for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
A new feature constituting approach to detection of vocal fold pathology
NASA Astrophysics Data System (ADS)
Hariharan, M.; Polat, Kemal; Yaacob, Sazali
2014-08-01
In the last two decades, non-invasive methods through acoustic analysis of voice signal have been proved to be excellent and reliable tool to diagnose vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and MAPACI speech pathology database are used. Four different supervised classifiers such as k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network are employed for testing the proposed features. The experimental results uncover that the proposed features give very promising classification accuracy of 100% for both MEEI database and MAPACI speech pathology database.
Ronald E. McRoberts; Grant M. Domke; Qi Chen; Erik Næsset; Terje Gobakken
2016-01-01
The relatively small sampling intensities used by national forest inventories are often insufficient to produce the desired precision for estimates of population parameters unless the estimation process is augmented with auxiliary information, usually in the form of remotely sensed data. The k-Nearest Neighbors (k-NN) technique is a non-parametric,multivariate approach...
ERIC Educational Resources Information Center
Pankau, Brian L.
2009-01-01
This empirical study evaluates the document category prediction effectiveness of Naive Bayes (NB) and K-Nearest Neighbor (KNN) classifier treatments built from different feature selection and machine learning settings and trained and tested against textual corpora of 2300 Gang-Of-Four (GOF) design pattern documents. Analysis of the experiment's…
Ronald E. McRoberts; Erkki O. Tomppo; Andrew O. Finley; Heikkinen Juha
2007-01-01
The k-Nearest Neighbor (k-NN) technique has become extremely popular for a variety of forest inventory mapping and estimation applications. Much of this popularity may be attributed to the non-parametric, multivariate features of the technique, its intuitiveness, and its ease of use. When used with satellite imagery and forest...
Applying an efficient K-nearest neighbor search to forest attribute imputation
Andrew O. Finley; Ronald E. McRoberts; Alan R. Ek
2006-01-01
This paper explores the utility of an efficient nearest neighbor (NN) search algorithm for applications in multi-source kNN forest attribute imputation. The search algorithm reduces the number of distance calculations between a given target vector and each reference vector, thereby, decreasing the time needed to discover the NN subset. Results of five trials show gains...
2007-04-01
input signal with the conjugate of a delayed copy of itself, i.e., )exp(2* kjAzz knn ϕ=− , has a phase argument independent of n. As a result, the...Signal Processing (Elseivier), 2005. S.M. Kay, “A Fast and Accurate Single Frequency Estimator,” IEEE Trans. Acous. Speech Signal Proc., 37(12), 1987
Detection and Jamming Low Probability of Intercept (LPI) Radars
2006-09-01
69 The resulting WVD is, therefore ( ) 2 1 ’ 0 , 2 ( ) . j knN N l n W l k f n e π− − = = ∑ (3.11) Equation (3.11) is the final...of FMCW and P-4 polyphase LPI waveforms. 2002 IEEE International Conference on Acoustics, Speech , and Signal Processing.Proceedings (Cat.no.02CH37334
Breathing Pattern Interpretation as an Alternative and Effective Voice Communication Solution.
Elsahar, Yasmin; Bouazza-Marouf, Kaddour; Kerr, David; Gaur, Atul; Kaushik, Vipul; Hu, Sijung
2018-05-15
Augmentative and alternative communication (AAC) systems tend to rely on the interpretation of purposeful gestures for interaction. Existing AAC methods could be cumbersome and limit the solutions in terms of versatility. The study aims to interpret breathing patterns (BPs) to converse with the outside world by means of a unidirectional microphone and researches breathing-pattern interpretation (BPI) to encode messages in an interactive manner with minimal training. We present BP processing work with (1) output synthesized machine-spoken words (SMSW) along with single-channel Weiner filtering (WF) for signal de-noising, and (2) k -nearest neighbor ( k-NN ) classification of BPs associated with embedded dynamic time warping (DTW). An approved protocol to collect analogue modulated BP sets belonging to 4 distinct classes with 10 training BPs per class and 5 live BPs per class was implemented with 23 healthy subjects. An 86% accuracy of k-NN classification was obtained with decreasing error rates of 17%, 14%, and 11% for the live classifications of classes 2, 3, and 4, respectively. The results express a systematic reliability of 89% with increased familiarity. The outcomes from the current AAC setup recommend a durable engineering solution directly beneficial to the sufferers.
Motion Control of Drives for Prosthetic Hand Using Continuous Myoelectric Signals
NASA Astrophysics Data System (ADS)
Purushothaman, Geethanjali; Ray, Kalyan Kumar
2016-03-01
In this paper the authors present motion control of a prosthetic hand, through continuous myoelectric signal acquisition, classification and actuation of the prosthetic drive. A four channel continuous electromyogram (EMG) signal also known as myoelectric signals (MES) are acquired from the abled-body to classify the six unique movements of hand and wrist, viz, hand open (HO), hand close (HC), wrist flexion (WF), wrist extension (WE), ulnar deviation (UD) and radial deviation (RD). The classification technique involves in extracting the features/pattern through statistical time domain (TD) parameter/autoregressive coefficients (AR), which are reduced using principal component analysis (PCA). The reduced statistical TD features and or AR coefficients are used to classify the signal patterns through k nearest neighbour (kNN) as well as neural network (NN) classifier and the performance of the classifiers are compared. Performance comparison of the above two classifiers clearly shows that kNN classifier in identifying the hidden intended motion in the myoelectric signals is better than that of NN classifier. Once the classifier identifies the intended motion, the signal is amplified to actuate the three low power DC motor to perform the above mentioned movements.
Choi, Chang Won; Park, Moon Sung
2015-10-01
The Korean Neonatal Network (KNN), a nationwide prospective registry of very-low-birth-weight (VLBW, < 1,500 g at birth) infants, was launched in April 2013. Data management (DM) and site-visit monitoring (SVM) were crucial in ensuring the quality of the data collected from 55 participating hospitals across the country on 116 clinical variables. We describe the processes and results of DM and SVM performed during the establishment stage of the registry. The DM procedure included automated proof checks, electronic data validation, query creation, query resolution, and revalidation of the corrected data. SVM included SVM team organization, identification of unregistered cases, source document verification, and post-visit report production. By March 31, 2015, 4,063 VLBW infants were registered and 1,693 queries were produced. Of these, 1,629 queries were resolved and 64 queries remain unresolved. By November 28, 2014, 52 participating hospitals were visited, with 136 site-visits completed since April 2013. Each participating hospital was visited biannually. DM and SVM were performed to ensure the quality of the data collected for the KNN registry. Our experience with DM and SVM can be applied for similar multi-center registries with large numbers of participating centers.
Iris Recognition Using Feature Extraction of Box Counting Fractal Dimension
NASA Astrophysics Data System (ADS)
Khotimah, C.; Juniati, D.
2018-01-01
Biometrics is a science that is now growing rapidly. Iris recognition is a biometric modality which captures a photo of the eye pattern. The markings of the iris are distinctive that it has been proposed to use as a means of identification, instead of fingerprints. Iris recognition was chosen for identification in this research because every human has a special feature that each individual is different and the iris is protected by the cornea so that it will have a fixed shape. This iris recognition consists of three step: pre-processing of data, feature extraction, and feature matching. Hough transformation is used in the process of pre-processing to locate the iris area and Daugman’s rubber sheet model to normalize the iris data set into rectangular blocks. To find the characteristics of the iris, it was used box counting method to get the fractal dimension value of the iris. Tests carried out by used k-fold cross method with k = 5. In each test used 10 different grade K of K-Nearest Neighbor (KNN). The result of iris recognition was obtained with the best accuracy was 92,63 % for K = 3 value on K-Nearest Neighbor (KNN) method.
Multi-Band Received Signal Strength Fingerprinting Based Indoor Location System
NASA Astrophysics Data System (ADS)
Sertthin, Chinnapat; Fujii, Takeo; Ohtsuki, Tomoaki; Nakagawa, Masao
This paper proposes a new multi-band received signal strength (MRSS) fingerprinting based indoor location system, which employs the frequency diversity on the conventional single-band received signal strength (RSS) fingerprinting based indoor location system. In the proposed system, the impacts of frequency diversity on the enhancements of positioning accuracy are analyzed. Effectiveness of the proposed system is proved by experimental approach, which was conducted in non line-of-sight (NLOS) environment under the area of 103m2 at Yagami Campus, Keio University. WLAN access points, which simultaneously transmit dual-band signal of 2.4 and 5.2GHz, are utilized as transmitters. Likewise, a dual-band WLAN receiver is utilized as a receiver. Signal distances calculated by both Manhattan and Euclidean were classified by K-Nearest Neighbor (KNN) classifier to illustrate the performance of the proposed system. The results confirmed that Frequency diversity attributions of multi-band signal provide accuracy improvement over 50% of the conventional single-band.
Voxel classification based airway tree segmentation
NASA Astrophysics Data System (ADS)
Lo, Pechin; de Bruijne, Marleen
2008-03-01
This paper presents a voxel classification based method for segmenting the human airway tree in volumetric computed tomography (CT) images. In contrast to standard methods that use only voxel intensities, our method uses a more complex appearance model based on a set of local image appearance features and Kth nearest neighbor (KNN) classification. The optimal set of features for classification is selected automatically from a large set of features describing the local image structure at several scales. The use of multiple features enables the appearance model to differentiate between airway tree voxels and other voxels of similar intensities in the lung, thus making the segmentation robust to pathologies such as emphysema. The classifier is trained on imperfect segmentations that can easily be obtained using region growing with a manual threshold selection. Experiments show that the proposed method results in a more robust segmentation that can grow into the smaller airway branches without leaking into emphysematous areas, and is able to segment many branches that are not present in the training set.
Lead-free piezoelectrics based on potassium-sodium niobate with giant d(33).
Zhang, Binyu; Wu, Jiagang; Cheng, Xiaojing; Wang, Xiaopeng; Xiao, Dingquan; Zhu, Jianguo; Wang, Xiangjian; Lou, Xiaojie
2013-08-28
High-performance lead-free piezoelectrics (d33 > 400 pC/N) based on 0.96(K0.5Na0.5)0.95Li0.05Nb1-xSbxO3-0.04BaZrO3 with the rhombohedral-tetragonal (R-T) phase boundary have been designed and prepared. The R-T phase boundary lies the composition range of 0.04 ≤ x ≤ 0.07, and the dielectric and piezoelectric properties of the ceramics with the compositions near the phase boundary are significantly enhanced. In addition, the ceramic with x = 0.07 has a giant d33 of ~425 pC/N, which is comparable to that (~416 pC/N) of textured KNN-based ceramics (Saito, Y.; Takao, H.; Tani, T.; Nonoyama, T.; Takatori, K.; Homma, T.; Nagaya, T.; Nakamura, M. Nature 2004, 432, 84). The underlying physical mechanisms for enhanced piezoelectric properties are addressed. We believe that the material system is the most promising lead-free piezoelectric candidates for the practical applications.
Feature reconstruction of LFP signals based on PLSR in the neural information decoding study.
Yonghui Dong; Zhigang Shang; Mengmeng Li; Xinyu Liu; Hong Wan
2017-07-01
To solve the problems of Signal-to-Noise Ratio (SNR) and multicollinearity when the Local Field Potential (LFP) signals is used for the decoding of animal motion intention, a feature reconstruction of LFP signals based on partial least squares regression (PLSR) in the neural information decoding study is proposed in this paper. Firstly, the feature information of LFP coding band is extracted based on wavelet transform. Then the PLSR model is constructed by the extracted LFP coding features. According to the multicollinearity characteristics among the coding features, several latent variables which contribute greatly to the steering behavior are obtained, and the new LFP coding features are reconstructed. Finally, the K-Nearest Neighbor (KNN) method is used to classify the reconstructed coding features to verify the decoding performance. The results show that the proposed method can achieve the highest accuracy compared to the other three methods and the decoding effect of the proposed method is robust.
A Modular Mixed Signal VLSI Design Approach for Digital Radar Applications
2007-03-01
convenience, denote e−j 2π N nk by WN , so equation (2.2) becomes: X(k) = N−1∑ n=0 x(n)W knN , k = 0, 1, 2, ..., N − 1 (2.3) which can be expanded into... Speech , and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on, 3, 1994. 18. Soliman, Samir S. and Mandyam D. Srinath
Unconstrained and contactless hand geometry biometrics.
de-Santos-Sierra, Alberto; Sánchez-Ávila, Carmen; Del Pozo, Gonzalo Bailador; Guerra-Casanova, Javier
2011-01-01
This paper presents a hand biometric system for contact-less, platform-free scenarios, proposing innovative methods in feature extraction, template creation and template matching. The evaluation of the proposed method considers both the use of three contact-less publicly available hand databases, and the comparison of the performance to two competitive pattern recognition techniques existing in literature: namely support vector machines (SVM) and k-nearest neighbour (k-NN). Results highlight the fact that the proposed method outcomes existing approaches in literature in terms of computational cost, accuracy in human identification, number of extracted features and number of samples for template creation. The proposed method is a suitable solution for human identification in contact-less scenarios based on hand biometrics, providing a feasible solution to devices with limited hardware requirements like mobile devices.
NASA Astrophysics Data System (ADS)
Yang, Jie; Zhang, Faqiang; Yang, Qunbao; Liu, Zhifu; Li, Yongxiang; Liu, Yun; Zhang, Qiming
2016-05-01
We report lead-free single crystals with a nominal formula of (K0.45Na0.55)0.96Li0.04NbO3 grown using a simple low-cost seed-free solid-state crystal growth method (SFSSCG). The crystals thus prepared can reach maximum dimensions of 6 mm × 5 mm × 2 mm and exhibit a large piezoelectric coefficient d33 of 689 pC/N. Moreover, the effective piezoelectric coefficient d33 * , obtained under a unipolar electric field of 30 kV/cm, can reach 967 pm/V. The large piezoelectric response plus the high Curie temperature (TC) of 432 °C indicate that SFSSCG is an effective approach to synthesize high-performance lead-free piezoelectric single crystals.
Unconstrained and Contactless Hand Geometry Biometrics
de-Santos-Sierra, Alberto; Sánchez-Ávila, Carmen; del Pozo, Gonzalo Bailador; Guerra-Casanova, Javier
2011-01-01
This paper presents a hand biometric system for contact-less, platform-free scenarios, proposing innovative methods in feature extraction, template creation and template matching. The evaluation of the proposed method considers both the use of three contact-less publicly available hand databases, and the comparison of the performance to two competitive pattern recognition techniques existing in literature: namely Support Vector Machines (SVM) and k-Nearest Neighbour (k-NN). Results highlight the fact that the proposed method outcomes existing approaches in literature in terms of computational cost, accuracy in human identification, number of extracted features and number of samples for template creation. The proposed method is a suitable solution for human identification in contact-less scenarios based on hand biometrics, providing a feasible solution to devices with limited hardware requirements like mobile devices. PMID:22346634
Borghi, Giacomo; Tabacchini, Valerio; Schaart, Dennis R
2016-07-07
Gamma-ray detectors based on thick monolithic scintillator crystals can achieve spatial resolutions <2 mm full-width-at-half-maximum (FWHM) and coincidence resolving times (CRTs) better than 200 ps FWHM. Moreover, they provide high sensitivity and depth-of-interaction (DOI) information. While these are excellent characteristics for clinical time-of-flight (TOF) positron emission tomography (PET), the application of monolithic scintillators has so far been hampered by the lengthy and complex procedures needed for position- and time-of-interaction estimation. Here, the algorithms previously developed in our group are revised to make the calibration and operation of a large number of monolithic scintillator detectors in a TOF-PET system practical. In particular, the k-nearest neighbor (k-NN) classification method for x,y-position estimation is accelerated with an algorithm that quickly preselects only the most useful reference events, reducing the computation time for position estimation by a factor of ~200 compared to the previously published k-NN 1D method. Also, the procedures for estimating the DOI and time of interaction are revised to enable full detector calibration by means of fan-beam or flood irradiations only. Moreover, a new technique is presented to allow the use of events in which some of the photosensor pixel values and/or timestamps are missing (e.g. due to dead time), so as to further increase system sensitivity. The accelerated methods were tested on a monolithic scintillator detector specifically developed for clinical PET applications, consisting of a 32 mm × 32 mm × 22 mm LYSO : Ce crystal coupled to a digital photon counter (DPC) array. This resulted in a spatial resolution of 1.7 mm FWHM, an average DOI resolution of 3.7 mm FWHM, and a CRT of 214 ps. Moreover, the possibility of using events missing the information of up to 16 out of 64 photosensor pixels is shown. This results in only a small deterioration of the detector performance.
Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse
2013-01-01
Objective We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness in diabetic patients to develop hypertension and vice versa. We ascertain the importance of ethnicity (and natives vs expatriate migrants) and of using regional data in risk assessment. Design Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross validation to obtain generalisation accuracies and errors. Setting Kuwait Health Network (KHN) that integrates data from primary health centres and hospitals in Kuwait. Participants 270 172 hospital visitors (of which, 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives, Asian and Arab expatriates. Outcome measures Incident type 2 diabetes, hypertension and comorbidity. Results Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign ‘high’ risk to 75% of diabetic patients and to 94% of hypertensive patients. Only 5% of diabetic patients are seen assigned ‘low’ risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in hypertensive population and those of hypertension are modelled. Two-stage aggregate classification models and risk assessment tools, built combining both the component models on diabetes (or on hypertension), perform better than individual models. Conclusions Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case–control models to assess risks. These tools aid in the preliminary non-intrusive assessment of the population. Ethnicity is seen significant to the predictive models. Risk assessments need to be developed using regional data as we demonstrate the applicability of the American Diabetes Association online calculator on data from Kuwait. PMID:23676796
Bivariate empirical mode decomposition for ECG-based biometric identification with emotional data.
Ferdinando, Hany; Seppanen, Tapio; Alasaarela, Esko
2017-07-01
Emotions modulate ECG signals such that they might affect ECG-based biometric identification in real life application. It motivated in finding good feature extraction methods where the emotional state of the subjects has minimum impacts. This paper evaluates feature extraction based on bivariate empirical mode decomposition (BEMD) for biometric identification when emotion is considered. Using the ECG signal from the Mahnob-HCI database for affect recognition, the features were statistical distributions of dominant frequency after applying BEMD analysis to ECG signals. The achieved accuracy was 99.5% with high consistency using kNN classifier in 10-fold cross validation to identify 26 subjects when the emotional states of the subjects were ignored. When the emotional states of the subject were considered, the proposed method also delivered high accuracy, around 99.4%. We concluded that the proposed method offers emotion-independent features for ECG-based biometric identification. The proposed method needs more evaluation related to testing with other classifier and variation in ECG signals, e.g. normal ECG vs. ECG with arrhythmias, ECG from various ages, and ECG from other affective databases.
1983-05-01
LAB) STUDY VOLUME I: MOBILITY AND TRANSPORTATION ANALYSES JAMES C EMANUEL L. ALAN DORNEY LAWRENCE E. MARTIN WAYNE E. FERGUSON JOHN P. O’MALLEY...Runs were maBe on dry, eIcha!londi?Ln0Ve^^rfaireS *? 0btain data 5n maximm attainable speeds under K?nn M? wISiI Addltional ana yses were
Classification of vegetation types in military region
NASA Astrophysics Data System (ADS)
Gonçalves, Miguel; Silva, Jose Silvestre; Bioucas-Dias, Jose
2015-10-01
In decision-making process regarding planning and execution of military operations, the terrain is a determining factor. Aerial photographs are a source of vital information for the success of an operation in hostile region, namely when the cartographic information behind enemy lines is scarce or non-existent. The objective of present work is the development of a tool capable of processing aerial photos. The methodology implemented starts with feature extraction, followed by the application of an automatic selector of features. The next step, using the k-fold cross validation technique, estimates the input parameters for the following classifiers: Sparse Multinomial Logist Regression (SMLR), K Nearest Neighbor (KNN), Linear Classifier using Principal Component Expansion on the Joint Data (PCLDC) and Multi-Class Support Vector Machine (MSVM). These classifiers were used in two different studies with distinct objectives: discrimination of vegetation's density and identification of vegetation's main components. It was found that the best classifier on the first approach is the Sparse Logistic Multinomial Regression (SMLR). On the second approach, the implemented methodology applied to high resolution images showed that the better performance was achieved by KNN classifier and PCLDC. Comparing the two approaches there is a multiscale issue, in which for different resolutions, the best solution to the problem requires different classifiers and the extraction of different features.
Comparison of Classification Methods for Detecting Emotion from Mandarin Speech
NASA Astrophysics Data System (ADS)
Pao, Tsang-Long; Chen, Yu-Te; Yeh, Jun-Heng
It is said that technology comes out from humanity. What is humanity? The very definition of humanity is emotion. Emotion is the basis for all human expression and the underlying theme behind everything that is done, said, thought or imagined. Making computers being able to perceive and respond to human emotion, the human-computer interaction will be more natural. Several classifiers are adopted for automatically assigning an emotion category, such as anger, happiness or sadness, to a speech utterance. These classifiers were designed independently and tested on various emotional speech corpora, making it difficult to compare and evaluate their performance. In this paper, we first compared several popular classification methods and evaluated their performance by applying them to a Mandarin speech corpus consisting of five basic emotions, including anger, happiness, boredom, sadness and neutral. The extracted feature streams contain MFCC, LPCC, and LPC. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% for the 5-class emotion recognition and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers by applying them to another Mandarin expressive speech corpus consisting of two emotions. The experimental results still show that the proposed WD-MKNN outperforms others.
Application of texture analysis method for mammogram density classification
NASA Astrophysics Data System (ADS)
Nithya, R.; Santhi, B.
2017-07-01
Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.
Automated diagnosis of dry eye using infrared thermography images
NASA Astrophysics Data System (ADS)
Acharya, U. Rajendra; Tan, Jen Hong; Koh, Joel E. W.; Sudarshan, Vidya K.; Yeo, Sharon; Too, Cheah Loon; Chua, Chua Kuang; Ng, E. Y. K.; Tong, Louis
2015-07-01
Dry Eye (DE) is a condition of either decreased tear production or increased tear film evaporation. Prolonged DE damages the cornea causing the corneal scarring, thinning and perforation. There is no single uniform diagnosis test available to date; combinations of diagnostic tests are to be performed to diagnose DE. The current diagnostic methods available are subjective, uncomfortable and invasive. Hence in this paper, we have developed an efficient, fast and non-invasive technique for the automated identification of normal and DE classes using infrared thermography images. The features are extracted from nonlinear method called Higher Order Spectra (HOS). Features are ranked using t-test ranking strategy. These ranked features are fed to various classifiers namely, K-Nearest Neighbor (KNN), Nave Bayesian Classifier (NBC), Decision Tree (DT), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM) to select the best classifier using minimum number of features. Our proposed system is able to identify the DE and normal classes automatically with classification accuracy of 99.8%, sensitivity of 99.8%, and specificity if 99.8% for left eye using PNN and KNN classifiers. And we have reported classification accuracy of 99.8%, sensitivity of 99.9%, and specificity if 99.4% for right eye using SVM classifier with polynomial order 2 kernel.
Aquino, Francisco W B; Pereira-Filho, Edenir R
2015-03-01
Because of their short life span and high production and consumption rates, mobile phones are one of the contributors to WEEE (waste electrical and electronic equipment) growth in many countries. If incorrectly managed, the hazardous materials used in the assembly of these devices can pollute the environment and pose dangers for workers involved in the recycling of these materials. In this study, 144 polymer fragments originating from 50 broken or obsolete mobile phones were analyzed via laser-induced breakdown spectroscopy (LIBS) without previous treatment. The coated polymers were mainly characterized by the presence of Ag, whereas the uncoated polymers were related to the presence of Al, K, Na, Si and Ti. Classification models were proposed using black and white polymers separately in order to identify the manufacturer and origin using KNN (K-nearest neighbor), SIMCA (Soft Independent Modeling of Class Analogy) and PLS-DA (Partial Least Squares for Discriminant Analysis). For the black polymers the percentage of correct predictions was, in average, 58% taking into consideration the models for manufacturer and origin identification. In the case of white polymers, the percentage of correct predictions ranged from 72.8% (PLS-DA) to 100% (KNN). Copyright © 2014 Elsevier B.V. All rights reserved.
Dielectric spectroscopy of Dy2O3 doped (K0.5Na0.5)NbO3 piezoelectric ceramics
NASA Astrophysics Data System (ADS)
Mahesh, P.; Subhash, T.; Pamu, D.
2014-06-01
We report the dielectric properties of ( K 0.5 Na 0.5 ) NbO 3 ceramics doped with x wt% of Dy 2 O 3 (x= 0.0-1.5 wt%) using the broadband dielectric spectroscopy. The X-ray diffraction studies showed the formation of perovskite structure signifying that Dy 2 O 3 diffuse into the KNN lattice. Samples doped with x > 0.5 wt% exhibit smaller grain size and lower relative densities. The dielectric properties of KNN ceramics doped with Dy 2 O 3 are enhanced by increasing the Dy 3+ content; among the compositions studied, x = 0.5 wt% exhibited the highest dielectric constant and lowest loss at 1MHz over the temperature range of 30°C to 400°C. All the samples exhibit maximum dielectric constant at the Curie temperature (˜ 326°C) and a small peak in the dielectric constant at around 165°C is due to a structural phase transition. At the request of all authors, and by agreement with the Proceedings Editors, a corrected version of this article was published on 19 June 2014. The full text of the Corrigendum is attached to the corrected article PDF file.
Primary mass discrimination of high energy cosmic rays using PNN and k-NN methods
NASA Astrophysics Data System (ADS)
Rastegarzadeh, G.; Nemati, M.
2018-02-01
Probabilistic neural network (PNN) and k-Nearest Neighbors (k-NN) methods are widely used data classification techniques. In this paper, these two methods have been used to classify the Extensive Air Shower (EAS) data sets which were simulated using the CORSIKA code for three primary cosmic rays. The primaries are proton, oxygen and iron nuclei at energies of 100 TeV-10 PeV. This study is performed in the following of the investigations into the primary cosmic ray mass sensitive observables. We propose a new approach for measuring the mass sensitive observables of EAS in order to improve the primary mass separation. In this work, the EAS observables measurement has performed locally instead of total measurements. Also the relationships between the included number of observables in the classification methods and the prediction accuracy have been investigated. We have shown that the local measurements and inclusion of more mass sensitive observables in the classification processes can improve the classifying quality and also we have shown that muons and electrons energy density can be considered as primary mass sensitive observables in primary mass classification. Also it must be noted that this study is performed for Tehran observation level without considering the details of any certain EAS detection array.
Real-time image annotation by manifold-based biased Fisher discriminant analysis
NASA Astrophysics Data System (ADS)
Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming
2008-01-01
Automatic Linguistic Annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-art annotation algorithms: 1. The Small Sample Size (3S) problem in keyword classifier/model learning; 2. Most of annotation algorithms can not extend to real-time online usage due to their low computational efficiencies. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, Co-Training based Manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Bias Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-art annotation methods (1.K-NN Expansion; 2.One-to-All SVM; 3.PWC-SVM) in both computational time and annotation accuracy with a large margin.
Si, Jian-min; Luo, A-li; Wu, Fu-zhao; Wu, Yi-hong
2015-03-01
There are many valuable rare and unusual objects in spectra dataset of Sloan Digital Sky Survey (SDSS) Data Release eight (DR8), such as special white dwarfs (DZ, DQ, DC), carbon stars, white dwarf main-sequence binaries (WDMS), cataclysmic variable (CV) stars and so on, so it is extremely significant to search for rare and unusual celestial objects from massive spectra dataset. A novel algorithm based on Kernel dense estimation and K-nearest neighborhoods (KNN) has been presented, and applied to search for rare and unusual celestial objects from 546 383 stellar spectra of SDSS DR8. Their densities are estimated using Gaussian kernel density estimation, the top 5 000 spectra in descend order by their densities are selected as rare objects, and the top 300 000 spectra in ascend order by their densities are selected as normal objects. Then, KNN were used to classify the rest objects, and simultaneously K nearest neighbors of the 5 000 rare spectra are also selected as rare objects. As a result, there are totally 21 193 spectra selected as initial rare spectra, which include error spectra caused by deletion, redden, bad calibration, spectra consisting of different physically irrelevant components, planetary nebulas, QSOs, special white dwarfs (DZ, DQ, DC), carbon stars, white dwarf main-sequence binaries (WDMS), cataclysmic variable (CV) stars and so on. By cross identification with SIMBAD, NED, ADS and major literature, it is found that three DZ white dwarfs, one WDMS, two CVs with company of G-type star, three CVs candidates, six DC white dwarfs, one DC white dwarf candidate and one BL Lacertae (BL lac) candidate are our new findings. We also have found one special DA white dwarf with emission lines of Ca II triple and Mg I, and one unknown object whose spectrum looks like a late M star with emission lines and its image looks like a galaxy or nebula.
Algorithms for Autonomous Plume Detection on Outer Planet Satellites
NASA Astrophysics Data System (ADS)
Lin, Y.; Bunte, M. K.; Saripalli, S.; Greeley, R.
2011-12-01
We investigate techniques for automated detection of geophysical events (i.e., volcanic plumes) from spacecraft images. The algorithms presented here have not been previously applied to detection of transient events on outer planet satellites. We apply Scale Invariant Feature Transform (SIFT) to raw images of Io and Enceladus from the Voyager, Galileo, Cassini, and New Horizons missions. SIFT produces distinct interest points in every image; feature descriptors are reasonably invariant to changes in illumination, image noise, rotation, scaling, and small changes in viewpoint. We classified these descriptors as plumes using the k-nearest neighbor (KNN) algorithm. In KNN, an object is classified by its similarity to examples in a training set of images based on user defined thresholds. Using the complete database of Io images and a selection of Enceladus images where 1-3 plumes were manually detected in each image, we successfully detected 74% of plumes in Galileo and New Horizons images, 95% in Voyager images, and 93% in Cassini images. Preliminary tests yielded some false positive detections; further iterations will improve performance. In images where detections fail, plumes are less than 9 pixels in size or are lost in image glare. We compared the appearance of plumes and illuminated mountain slopes to determine the potential for feature classification. We successfully differentiated features. An advantage over other methods is the ability to detect plumes in non-limb views where they appear in the shadowed part of the surface; improvements will enable detection against the illuminated background surface where gradient changes would otherwise preclude detection. This detection method has potential applications to future outer planet missions for sustained plume monitoring campaigns and onboard automated prioritization of all spacecraft data. The complementary nature of this method is such that it could be used in conjunction with edge detection algorithms to increase effectiveness. We have demonstrated an ability to detect transient events above the planetary limb and on the surface and to distinguish feature classes in spacecraft images.
Mirjankar, Nikhil S; Fraga, Carlos G; Carman, April J; Moran, James J
2016-02-02
Chemical attribution signatures (CAS) for chemical threat agents (CTAs), such as cyanides, are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. Herein, stocks of KCN and NaCN were analyzed for trace anions by high performance ion chromatography (HPIC), carbon stable isotope ratio (δ(13)C) by isotope ratio mass spectrometry (IRMS), and trace elements by inductively coupled plasma optical emission spectroscopy (ICP-OES). The collected analytical data were evaluated using hierarchical cluster analysis (HCA), Fisher-ratio (F-ratio), interval partial least-squares (iPLS), genetic algorithm-based partial least-squares (GAPLS), partial least-squares discriminant analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminant analysis (SVMDA). HCA of anion impurity profiles from multiple cyanide stocks from six reported countries of origin resulted in cyanide samples clustering into three groups, independent of the associated alkali metal (K or Na). The three groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries each having one known solid cyanide factory: Czech Republic, Germany, and United States. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability associated with using anion impurities for matching a cyanide sample to its factory using our current cyanide stocks. Variable selection methods reduced errors for those classification methods having errors greater than zero; iPLS-forward selection and F-ratio typically provided the lowest errors. Finally, using anion profiles to classify cyanides to a specific stock or stock group for a subset of United States stocks resulted in cross-validation errors ranging from 0 to 5.3%.
Nukala, Bhargava Teja; Nakano, Taro; Rodriguez, Amanda; Tsay, Jerry; Lopez, Jerry; Nguyen, Tam Q; Zupancic, Steven; Lie, Donald Y C
2016-11-29
Gait analysis using wearable wireless sensors can be an economical, convenient and effective way to provide diagnostic and clinical information for various health-related issues. In this work, our custom designed low-cost wireless gait analysis sensor that contains a basic inertial measurement unit (IMU) was used to collect the gait data for four patients diagnosed with balance disorders and additionally three normal subjects, each performing the Dynamic Gait Index (DGI) tests while wearing the custom wireless gait analysis sensor (WGAS). The small WGAS includes a tri-axial accelerometer integrated circuit (IC), two gyroscopes ICs and a Texas Instruments (TI) MSP430 microcontroller and is worn by each subject at the T4 position during the DGI tests. The raw gait data are wirelessly transmitted from the WGAS to a near-by PC for real-time gait data collection and analysis. In order to perform successful classification of patients vs. normal subjects, we used several different classification algorithms, such as the back propagation artificial neural network (BP-ANN), support vector machine (SVM), k -nearest neighbors (KNN) and binary decision trees (BDT), based on features extracted from the raw gait data of the gyroscopes and accelerometers. When the range was used as the input feature, the overall classification accuracy obtained is 100% with BP-ANN, 98% with SVM, 96% with KNN and 94% using BDT. Similar high classification accuracy results were also achieved when the standard deviation or other values were used as input features to these classifiers. These results show that gait data collected from our very low-cost wearable wireless gait sensor can effectively differentiate patients with balance disorders from normal subjects in real time using various classifiers, the success of which may eventually lead to accurate and objective diagnosis of abnormal human gaits and their underlying etiologies in the future, as more patient data are being collected.
NASA Astrophysics Data System (ADS)
Jiang, Li; Xuan, Jianping; Shi, Tielin
2013-12-01
Generally, the vibration signals of faulty machinery are non-stationary and nonlinear under complicated operating conditions. Therefore, it is a big challenge for machinery fault diagnosis to extract optimal features for improving classification accuracy. This paper proposes semi-supervised kernel Marginal Fisher analysis (SSKMFA) for feature extraction, which can discover the intrinsic manifold structure of dataset, and simultaneously consider the intra-class compactness and the inter-class separability. Based on SSKMFA, a novel approach to fault diagnosis is put forward and applied to fault recognition of rolling bearings. SSKMFA directly extracts the low-dimensional characteristics from the raw high-dimensional vibration signals, by exploiting the inherent manifold structure of both labeled and unlabeled samples. Subsequently, the optimal low-dimensional features are fed into the simplest K-nearest neighbor (KNN) classifier to recognize different fault categories and severities of bearings. The experimental results demonstrate that the proposed approach improves the fault recognition performance and outperforms the other four feature extraction methods.
NASA Astrophysics Data System (ADS)
Ochoa, Diego Alejandro; García, Jose Eduardo
2016-04-01
The Preisach model is a classical method for describing nonlinear behavior in hysteretic systems. According to this model, a hysteretic system contains a collection of simple bistable units which are characterized by an internal field and a coercive field. This set of bistable units exhibits a statistical distribution that depends on these fields as parameters. Thus, nonlinear response depends on the specific distribution function associated with the material. This model is satisfactorily used in this work to describe the temperature-dependent ferroelectric response in PZT- and KNN-based piezoceramics. A distribution function expanded in Maclaurin series considering only the first terms in the internal field and the coercive field is proposed. Changes in coefficient relations of a single distribution function allow us to explain the complex temperature dependence of hard piezoceramic behavior. A similar analysis based on the same form of the distribution function shows that the KNL-NTS properties soften around its orthorhombic to tetragonal phase transition.
NASA Astrophysics Data System (ADS)
Hu, Yifan; Han, Hao; Zhu, Wei; Li, Lihong; Pickhardt, Perry J.; Liang, Zhengrong
2016-03-01
Feature classification plays an important role in differentiation or computer-aided diagnosis (CADx) of suspicious lesions. As a widely used ensemble learning algorithm for classification, random forest (RF) has a distinguished performance for CADx. Our recent study has shown that the location index (LI), which is derived from the well-known kNN (k nearest neighbor) and wkNN (weighted k nearest neighbor) classifier [1], has also a distinguished role in the classification for CADx. Therefore, in this paper, based on the property that the LI will achieve a very high accuracy, we design an algorithm to integrate the LI into RF for improved or higher value of AUC (area under the curve of receiver operating characteristics -- ROC). Experiments were performed by the use of a database of 153 lesions (polyps), including 116 neoplastic lesions and 37 hyperplastic lesions, with comparison to the existing classifiers of RF and wkNN, respectively. A noticeable gain by the proposed integrated classifier was quantified by the AUC measure.
Chemical composition analysis and authentication of whisky.
Wiśniewska, Paulina; Dymerski, Tomasz; Wardencki, Waldemar; Namieśnik, Jacek
2015-08-30
Whisky (whiskey) is one of the most popular spirit-based drinks made from malted or saccharified grains, which should mature for at least 3 years in wooden barrels. High popularity of products usually causes a potential risk of adulteration. Thus authenticity assessment is one of the key elements of food product marketing. Authentication of whisky is based on comparing the composition of this alcohol with other spirit drinks. The present review summarizes all information about the comparison of whisky and other alcoholic beverages, the identification of type of whisky or the assessment of its quality and finally the authentication of whisky. The article also presents the various techniques used for analyzing whisky, such as gas and liquid chromatography with different types of detectors (FID, AED, UV-Vis), electronic nose, atomic absorption spectroscopy and mass spectrometry. In some cases the application of chemometric methods is also described, namely PCA, DFA, LDA, ANOVA, SIMCA, PNN, k-NN and CA, as well as preparation techniques such SPME or SPE. © 2014 Society of Chemical Industry.
Melchert, O; Katzgraber, Helmut G; Novotny, M A
2016-04-01
We estimate the critical thresholds of bond and site percolation on nonplanar, effectively two-dimensional graphs with chimeralike topology. The building blocks of these graphs are complete and symmetric bipartite subgraphs of size 2n, referred to as K_{n,n} graphs. For the numerical simulations we use an efficient union-find-based algorithm and employ a finite-size scaling analysis to obtain the critical properties for both bond and site percolation. We report the respective percolation thresholds for different sizes of the bipartite subgraph and verify that the associated universality class is that of standard two-dimensional percolation. For the canonical chimera graph used in the D-Wave Systems Inc. quantum annealer (n=4), we discuss device failure in terms of network vulnerability, i.e., we determine the critical fraction of qubits and couplers that can be absent due to random failures prior to losing large-scale connectivity throughout the device.
Unsupervised Indoor Localization Based on Smartphone Sensors, iBeacon and Wi-Fi.
Chen, Jing; Zhang, Yi; Xue, Wei
2018-04-28
In this paper, we propose UILoc, an unsupervised indoor localization scheme that uses a combination of smartphone sensors, iBeacons and Wi-Fi fingerprints for reliable and accurate indoor localization with zero labor cost. Firstly, compared with the fingerprint-based method, the UILoc system can build a fingerprint database automatically without any site survey and the database will be applied in the fingerprint localization algorithm. Secondly, since the initial position is vital to the system, UILoc will provide the basic location estimation through the pedestrian dead reckoning (PDR) method. To provide accurate initial localization, this paper proposes an initial localization module, a weighted fusion algorithm combined with a k-nearest neighbors (KNN) algorithm and a least squares algorithm. In UILoc, we have also designed a reliable model to reduce the landmark correction error. Experimental results show that the UILoc can provide accurate positioning, the average localization error is about 1.1 m in the steady state, and the maximum error is 2.77 m.
2006-09-01
Medioni, [11], estimates the local dimension using tensor voting . These recent works have clearly shown the necessity to go beyond manifold learning, into...2005. [11] P. Mordohai and G. Medioni. Unsupervised dimensionality estimation and manifold learning in high-dimensional spaces by tensor voting . In...walking, jumping, and arms waving. The whole run took 361 seconds in Matlab , while the classification time (PMM) can be neglected compared to the kNN
Geo-Coding for the Mapping of Documents and Social Media Messages
2013-08-22
O.L. (2007). UBC-ALM: Combining KNN with SVD for WSD. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague...and Yarowsky, D. (1992). One sense per discourse. In Proceedings of the 4th DARPA Speech and Natural Language Workshop. pp. 233-237, 1992. Retrieved...Part-of- Speech Tagging for Twitter: Annotation, Features, and Experiments. Proceedings of the Annual Meeting of the Association for Computational
Getting What You Want: Accurate Document Filtering in a Terabyte World
2002-11-01
models are used widely in speech recognition and have shown promise for ad-hoc information retrieval (Ponte and Croft, 1998; Lafferty and Zhai, 2001...tasks is focused on developing techniques similar to those used in speech recognition. However the differing requirements of speech recognition and...Conference on Research and Development in Information Retrieval. ACM. 6. T.Ault, and Y. Yang. (2001.) kNN at TREC-9: A failure analysis. In
Automatic Title Generation for Spoken Broadcast News
2001-01-01
degrades much less with speech -recognized transcripts. Meanwhile, even though KNN performance not as well as TF.IDF and NBL in terms of F1 metric, it...test corpus of 1006 broadcast news documents, comparing the results over manual transcription to the results over automatically recognized speech . We...use both F1 and the average number of correct title words in the correct order as metric. Overall, the results show that title generation for speech
NASA Astrophysics Data System (ADS)
Stevens, Jeffrey
The past decade has seen the emergence of many hyperspectral image (HSI) analysis algorithms based on graph theory and derived manifold-coordinates. Yet, despite the growing number of algorithms, there has been limited study of the graphs constructed from spectral data themselves. Which graphs are appropriate for various HSI analyses--and why? This research aims to begin addressing these questions as the performance of graph-based techniques is inextricably tied to the graphical model constructed from the spectral data. We begin with a literature review providing a survey of spectral graph construction techniques currently used by the hyperspectral community, starting with simple constructs demonstrating basic concepts and then incrementally adding components to derive more complex approaches. Throughout this development, we discuss algorithm advantages and disadvantages for different types of hyperspectral analysis. A focus is provided on techniques influenced by spectral density through which the concept of community structure arises. Through the use of simulated and real HSI data, we demonstrate density-based edge allocation produces more uniform nearest neighbor lists than non-density based techniques through increasing the number of intracluster edges, facilitating higher k-nearest neighbor (k-NN) classification performance. Imposing the common mutuality constraint to symmetrify adjacency matrices is demonstrated to be beneficial in most circumstances, especially in rural (less cluttered) scenes. Many complex adaptive edge-reweighting techniques are shown to slightly degrade nearest-neighbor list characteristics. Analysis suggests this condition is possibly attributable to the validity of characterizing spectral density by a single variable representing data scale for each pixel. Additionally, it is shown that imposing mutuality hurts the performance of adaptive edge-allocation techniques or any technique that aims to assign a low number of edges (<10) to any pixel. A simple k bias addresses this problem. Many of the adaptive edge-reweighting techniques are based on the concept of codensity, so we explore codensity properties as they relate to density-based edge reweighting. We find that codensity may not be the best estimator of local scale due to variations in cluster density, so we introduce and compare two inherently density-weighted graph construction techniques from the data mining literature: shared nearest neighbors (SNN) and mutual proximity (MP). MP and SNN are not reliant upon a codensity measure, hence are not susceptible to its shortcomings. Neither has been used for hyperspectral analyses, so this presents the first study of these techniques on HSI data. We demonstrate MP and SNN can offer better performance, but in general none of the reweighting techniques improve the quality of these spectral graphs in our neighborhood structure tests. As such, these complex adaptive edge-reweighting techniques may need to be modified to increase their effectiveness. During this investigation, we probe deeper into properties of high-dimensional data and introduce the concept of concentration of measure (CoM)--the degradation in the efficacy of many common distance measures with increasing dimensionality--as it relates to spectral graph construction. CoM exists in pairwise distances between HSI pixels, but not to the degree experienced in random data of the same extrinsic dimension; a characteristic we demonstrate is due to the rich correlation and cluster structure present in HSI data. CoM can lead to hubness--a condition wherein some nodes have short distances (high similarities) to an exceptionally large number of nodes. We study hub presence in 49 HSI datasets of varying resolutions, altitudes, and spectral bands to demonstrate hubness effects are negligible in a k-NN classification example (generalized counting scenarios), but we note its impact on methods that use edge weights to derive manifold coordinates or splitting clusters based on spectral graph theory requires more investigation. Many of these new graph-related quantities can be exploited to demonstrate new techniques for HSI classification and anomaly detection. We present an initial exploration into this relatively new and exciting field based on an enhanced Schroedinger Eigenmap classification example and compare results to the current state-of-the-art approach. We produce equivalent results, but demonstrate different types of misclassifications, opening the door to combine the best of both approaches to achieve truly superior performance. A separate less mature hubness-assisted anomaly detector (HAAD) is also presented.
Sun, Minglei; Yang, Shaobao; Jiang, Jinling; Wang, Qiwei
2015-01-01
Pelger-Huet anomaly (PHA) and Pseudo Pelger-Huet anomaly (PPHA) are neutrophil with abnormal morphology. They have the bilobed or unilobed nucleus and excessive clumping chromatin. Currently, detection of this kind of cell mainly depends on the manual microscopic examination by a clinician, thus, the quality of detection is limited by the efficiency and a certain subjective consciousness of the clinician. In this paper, a detection method for PHA and PPHA is proposed based on karyomorphism and chromatin distribution features. Firstly, the skeleton of the nucleus is extracted using an augmented Fast Marching Method (AFMM) and width distribution is obtained through distance transform. Then, caryoplastin in the nucleus is extracted based on Speeded Up Robust Features (SURF) and a K-nearest-neighbor (KNN) classifier is constructed to analyze the features. Experiment shows that the sensitivity and specificity of this method achieved 87.5% and 83.33%, which means that the detection accuracy of PHA is acceptable. Meanwhile, the detection method should be helpful to the automatic morphological classification of blood cells.
Automatic evaluation of hypernasality based on a cleft palate speech database.
He, Ling; Zhang, Jing; Liu, Qi; Yin, Heng; Lech, Margaret; Huang, Yunzhi
2015-05-01
The hypernasality is one of the most typical characteristics of cleft palate (CP) speech. The evaluation outcome of hypernasality grading decides the necessity of follow-up surgery. Currently, the evaluation of CP speech is carried out by experienced speech therapists. However, the result strongly depends on their clinical experience and subjective judgment. This work aims to propose an automatic evaluation system for hypernasality grading in CP speech. The database tested in this work is collected by the Hospital of Stomatology, Sichuan University, which has the largest number of CP patients in China. Based on the production process of hypernasality, source sound pulse and vocal tract filter features are presented. These features include pitch, the first and second energy amplified frequency bands, cepstrum based features, MFCC, short-time energy in the sub-bands features. These features combined with KNN classier are applied to automatically classify four grades of hypernasality: normal, mild, moderate and severe. The experiment results show that the proposed system achieves a good performance. The classification rates for four hypernasality grades reach up to 80.4%. The sensitivity of proposed features to the gender is also discussed.
Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong
2017-06-19
A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.
A Step Towards EEG-based Brain Computer Interface for Autism Intervention*
Fan, Jing; Wade, Joshua W.; Bian, Dayi; Key, Alexandra P.; Warren, Zachary E.; Mion, Lorraine C.; Sarkar, Nilanjan
2017-01-01
Autism Spectrum Disorder (ASD) is a prevalent and costly neurodevelopmental disorder. Individuals with ASD often have deficits in social communication skills as well as adaptive behavior skills related to daily activities. We have recently designed a novel virtual reality (VR) based driving simulator for driving skill training for individuals with ASD. In this paper, we explored the feasibility of detecting engagement level, emotional states, and mental workload during VR-based driving using EEG as a first step towards a potential EEG-based Brain Computer Interface (BCI) for assisting autism intervention. We used spectral features of EEG signals from a 14-channel EEG neuroheadset, together with therapist ratings of behavioral engagement, enjoyment, frustration, boredom, and difficulty to train a group of classification models. Seven classification methods were applied and compared including Bayes network, naïve Bayes, Support Vector Machine (SVM), multilayer perceptron, K-nearest neighbors (KNN), random forest, and J48. The classification results were promising, with over 80% accuracy in classifying engagement and mental workload, and over 75% accuracy in classifying emotional states. Such results may lead to an adaptive closed-loop VR-based skill training system for use in autism intervention. PMID:26737113
Hair segmentation using adaptive threshold from edge and branch length measures.
Lee, Ian; Du, Xian; Anthony, Brian
2017-10-01
Non-invasive imaging techniques allow the monitoring of skin structure and diagnosis of skin diseases in clinical applications. However, hair in skin images hampers the imaging and classification of the skin structure of interest. Although many hair segmentation methods have been proposed for digital hair removal, a major challenge in hair segmentation remains in detecting hairs that are thin, overlapping, of similar contrast or color to underlying skin, or overlaid on highly-textured skin structure. To solve the problem, we present an automatic hair segmentation method that uses edge density (ED) and mean branch length (MBL) to measure hair. First, hair is detected by the integration of top-hat transform and modified second-order Gaussian filter. Second, we employ a robust adaptive threshold of ED and MBL to generate a hair mask. Third, the hair mask is refined by k-NN classification of hair and skin pixels. The proposed algorithm was tested using two datasets of healthy skin images and lesion images respectively. These datasets were taken from different imaging platforms in various illumination levels and varying skin colors. We compared the hair detection and segmentation results from our algorithm and six other hair segmentation methods of state of the art. Our method exhibits high value of sensitivity: 75% and specificity: 95%, which indicates significantly higher accuracy and better balance between true positive and false positive detection than the other methods. Published by Elsevier Ltd.
Automatic Fibrosis Quantification By Using a k-NN Classificator
2001-10-25
Fluthrope, “Stages in fiber breakdown in duchenne muscular dystrophy ,” J. Neurol. Sci., vol. 24, pp. 179– 186, 1975. [6] F. Cornelio and I. Dones, “ Muscle ...an automatic algorithm to measure fibrosis in muscle sections of mdx mice, a mutant species used as a model of the Duchenne dystrophy . The al- gorithm...fiber degeneration and necro- sis in muscular dystrophy and other muscle diseases: cytochem- ical and immunocytochemical data,” Ann. Neurol., vol. 16
Introduction and Terminology 2-Extendability in 3-Polytopes.
1985-01-01
and D.A. Holton, On defect-d matchings in graphs, Discrete Math ., 13, 1975, 41-54. [LGH2] (-), Erratum: "On defect-d matchings, Discrete Mlath., 14...Matching Theory, Vol. 29, knn. Discrete Math ., North- Holland, Amsterdam, 1986. [Plell J. Plesnik, Connectivity of regular graphs and the existence of 1...Plu2] -- ), A theorem on mnatchings in the plane, Graph Theo~ry in Memory of G..4. Dirac, Ann. Discrete Math ., North-Holland. Amisterdarni. to appear
Information Extraction from Multiple Syntactic Sources
2004-05-01
Performance of SVM and KNN (k=3) on different kernel setups. Types are ordered in decreasing order of frequency of occurrence in the ACE corpus. For SVM, the...name. But it is not easy to recognize “A Real New York Bargain” as a company name. In other languages or transcripts of English speech where...symbolic rules for extraction of posted computer jobs. It only assumed simple syntactic preprocessing such as tokeniza- tion and Part-of- Speech tagging
Model Classes, Approximation, and Metrics for Dynamic Processing of Urban Terrain Data
2013-01-01
Sensing,” DARPA IPTO Retreat, Annapolis, 2008. R. Baraniuk, “Compressive Sensing, Wavelets, and Sparsity,” SPIE Defense + Security (acceptance speech ... Speech and Signal Processing (ICASSP). 2011/05/22 00:00:00, Prague, Czech Republic. : , 08/31/2011 33.00 Sang-Mook Lee, Jeong Joon Im, Bo-Hee Lee... KNN ) points to define a local intrinsic coordinate system using PCA and to construct the manifold and function locally using least squares. Local
Dynamic analysis environment for nuclear forensic analyses
NASA Astrophysics Data System (ADS)
Stork, C. L.; Ummel, C. C.; Stuart, D. S.; Bodily, S.; Goldblum, B. L.
2017-01-01
A Dynamic Analysis Environment (DAE) software package is introduced to facilitate group inclusion/exclusion method testing, evaluation and comparison for pre-detonation nuclear forensics applications. Employing DAE, the multivariate signatures of a questioned material can be compared to the signatures for different, known groups, enabling the linking of the questioned material to its potential process, location, or fabrication facility. Advantages of using DAE for group inclusion/exclusion include built-in query tools for retrieving data of interest from a database, the recording and documentation of all analysis steps, a clear visualization of the analysis steps intelligible to a non-expert, and the ability to integrate analysis tools developed in different programming languages. Two group inclusion/exclusion methods are implemented in DAE: principal component analysis, a parametric feature extraction method, and k nearest neighbors, a nonparametric pattern recognition method. Spent Fuel Isotopic Composition (SFCOMPO), an open source international database of isotopic compositions for spent nuclear fuels (SNF) from 14 reactors, is used to construct PCA and KNN models for known reactor groups, and 20 simulated SNF samples are utilized in evaluating the performance of these group inclusion/exclusion models. For all 20 simulated samples, PCA in conjunction with the Q statistic correctly excludes a large percentage of reactor groups and correctly includes the true reactor of origination. Employing KNN, 14 of the 20 simulated samples are classified to their true reactor of origination.
NASA Astrophysics Data System (ADS)
Wu, Jiagang; Xiao, Dingquan; Wang, Yuanyu; Zhu, Jianguo; Yu, Ping; Jiang, Yihang
2007-12-01
(1-x)(K0.42Na0.58)NbO3-xLiSbO3 [(1-x)KNN-xLS] lead-free piezoelectric ceramics were prepared by the conventional mixed oxide method. The compositional dependence of the phase structure and the electrical properties of the ceramics were studied. A morphotropic phase boundary (MPB) between the orthorhombic and tetragonal phases was identified in the composition range of 0.04
NASA Astrophysics Data System (ADS)
Abazari, M.; Akdoǧan, E. K.; Safari, A.
2008-05-01
We report the growth of single-phase (K0.44,Na0.52,Li0.04)(Nb0.84,Ta0.10,Sb0.06)O3 thin films on SrRuO3 coated ⟨001⟩ oriented SrTiO3 substrates by using pulsed laser deposition. Films grown at 600°C under low laser fluence exhibit a ⟨001⟩ textured columnar grained nanostructure, which coalesce with increasing deposition temperature, leading to a uniform fully epitaxial highly stoichiometric film at 750°C. However, films deposited at lower temperatures exhibit compositional fluctuations as verified by Rutherford backscattering spectroscopy. The epitaxial films of 400-600nm thickness have a room temperature relative permittivity of ˜750 and a loss tangent of ˜6% at 1kHz. The room temperature remnant polarization of the films is 4μC /cm2, while the saturation polarization is 7.1μC/cm2 at 24kV/cm and the coercive field is ˜7.3kV/cm. The results indicate that approximately 50% of the bulk permittivity and 20% of bulk spontaneous polarization can be retained in submicron epitaxial KNN-LT-LS thin film, respectively. The conductivity of the films remains to be a challenge as evidenced by the high loss tangent, leakage currents, and broad hysteresis loops.
NASA Astrophysics Data System (ADS)
Taha, Z.; Razman, M. A. M.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Adnan, F. A.; Sallehudin, M. F.; Mukai, Y.
2018-04-01
Fish Hunger behaviour is essential in determining the fish feeding routine, particularly for fish farmers. The inability to provide accurate feeding routines (under-feeding or over-feeding) may lead the death of the fish and consequently inhibits the quantity of the fish produced. Moreover, the excessive food that is not consumed by the fish will be dissolved in the water and accordingly reduce the water quality through the reduction of oxygen quantity. This problem also leads the death of the fish or even spur fish diseases. In the present study, a correlation of Barramundi fish-school behaviour with hunger condition through the hybrid data integration of image processing technique is established. The behaviour is clustered with respect to the position of the school size as well as the school density of the fish before feeding, during feeding and after feeding. The clustered fish behaviour is then classified through k-Nearest Neighbour (k-NN) learning algorithm. Three different variations of the algorithm namely cosine, cubic and weighted are assessed on its ability to classify the aforementioned fish hunger behaviour. It was found from the study that the weighted k-NN variation provides the best classification with an accuracy of 86.5%. Therefore, it could be concluded that the proposed integration technique may assist fish farmers in ascertaining fish feeding routine.
Consensus model for identification of novel PI3K inhibitors in large chemical library.
Liew, Chin Yee; Ma, Xiao Hua; Yap, Chun Wei
2010-02-01
Phosphoinositide 3-kinases (PI3Ks) inhibitors have treatment potential for cancer, diabetes, cardiovascular disease, chronic inflammation and asthma. A consensus model consisting of three base classifiers (AODE, kNN, and SVM) trained with 1,283 positive compounds (PI3K inhibitors), 16 negative compounds (PI3K non-inhibitors) and 64,078 generated putative negatives was developed for predicting compounds with PI3K inhibitory activity of IC(50) < or = 10 microM. The consensus model has an estimated false positive rate of 0.75%. Nine novel potential inhibitors were identified using the consensus model and several of these contain structural features that are consistent with those found to be important for PI3K inhibitory activities. An advantage of the current model is that it does not require knowledge of 3D structural information of the various PI3K isoforms, which is not readily available for all isoforms.
Consensus model for identification of novel PI3K inhibitors in large chemical library
NASA Astrophysics Data System (ADS)
Liew, Chin Yee; Ma, Xiao Hua; Yap, Chun Wei
2010-02-01
Phosphoinositide 3-kinases (PI3Ks) inhibitors have treatment potential for cancer, diabetes, cardiovascular disease, chronic inflammation and asthma. A consensus model consisting of three base classifiers (AODE, kNN, and SVM) trained with 1,283 positive compounds (PI3K inhibitors), 16 negative compounds (PI3K non-inhibitors) and 64,078 generated putative negatives was developed for predicting compounds with PI3K inhibitory activity of IC50 ≤ 10 μM. The consensus model has an estimated false positive rate of 0.75%. Nine novel potential inhibitors were identified using the consensus model and several of these contain structural features that are consistent with those found to be important for PI3K inhibitory activities. An advantage of the current model is that it does not require knowledge of 3D structural information of the various PI3K isoforms, which is not readily available for all isoforms.
Hepatitis Diagnosis Using Facial Color Image
NASA Astrophysics Data System (ADS)
Liu, Mingjia; Guo, Zhenhua
Facial color diagnosis is an important diagnostic method in traditional Chinese medicine (TCM). However, due to its qualitative, subjective and experi-ence-based nature, traditional facial color diagnosis has a very limited application in clinical medicine. To circumvent the subjective and qualitative problems of facial color diagnosis of Traditional Chinese Medicine, in this paper, we present a novel computer aided facial color diagnosis method (CAFCDM). The method has three parts: face Image Database, Image Preprocessing Module and Diagnosis Engine. Face Image Database is carried out on a group of 116 patients affected by 2 kinds of liver diseases and 29 healthy volunteers. The quantitative color feature is extracted from facial images by using popular digital image processing techni-ques. Then, KNN classifier is employed to model the relationship between the quantitative color feature and diseases. The results show that the method can properly identify three groups: healthy, severe hepatitis with jaundice and severe hepatitis without jaundice with accuracy higher than 73%.
Prediction of regulatory gene pairs using dynamic time warping and gene ontology.
Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K
2014-01-01
Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
NASA Astrophysics Data System (ADS)
Chen, Dan; Guo, Lin-yuan; Wang, Chen-hao; Ke, Xi-zheng
2017-07-01
Equalization can compensate channel distortion caused by channel multipath effects, and effectively improve convergent of modulation constellation diagram in optical wireless system. In this paper, the subspace blind equalization algorithm is used to preprocess M-ary phase shift keying (MPSK) subcarrier modulation signal in receiver. Mountain clustering is adopted to get the clustering centers of MPSK modulation constellation diagram, and the modulation order is automatically identified through the k-nearest neighbor (KNN) classifier. The experiment has been done under four different weather conditions. Experimental results show that the convergent of constellation diagram is improved effectively after using the subspace blind equalization algorithm, which means that the accuracy of modulation recognition is increased. The correct recognition rate of 16PSK can be up to 85% in any kind of weather condition which is mentioned in paper. Meanwhile, the correct recognition rate is the highest in cloudy and the lowest in heavy rain condition.
Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J
2017-01-01
Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces ‘big data’ exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics. PMID:27139154
Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J
2017-10-01
Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces 'big data' exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics.
Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.
Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong
2018-05-24
This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
Basic Technical Data on Transmission Systems and Equipment Using Communications Lines. Part 1.
1978-08-01
without noticeable degradation of the speech quality. - 219 - The maximum number of repeater sections: For the KNK-6s For the KNK-6t for multiquad...power circuit 1]; 15. Low frequency amplifier for direction B - Aj 16. Low frequency amplifier; 17. KNN [initial slope network]; 18. LVN-2...frequency Voice frequency ringing at 3,800 Hz with a level 0.4 - 0.8 Np lower than the speech channel level. The system for service
Brandão, Luiz Filipe Paiva; Braga, Jez Willian Batista; Suarez, Paulo Anselmo Ziani
2012-02-17
The current legislation requires the mandatory addition of biodiesel to all Brazilian road diesel oil A (pure diesel) marketed in the country and bans the addition of vegetable oils for this type of diesel. However, cases of irregular addition of vegetable oils directly to the diesel oil may occur, mainly due to the lower cost of these raw materials compared to the final product, biodiesel. In Brazil, the situation is even more critical once the country is one of the largest producers of oleaginous products in the world, especially soybean, and also it has an extensive road network dependent on diesel. Therefore, alternatives to control the quality of diesel have become increasingly necessary. This study proposes an analytical methodology for quality control of diesel with intention to identify and determine adulterations of oils and even fats of vegetable origin. This methodology is based on detection, identification and quantification of triacylglycerols on diesel (main constituents of vegetable oils and fats) by high performance liquid chromatography in reversed phase with UV detection at 205nm associated with multivariate methods. Six different types of oils and fats were studied (soybean, frying oil, corn, cotton, palm oil and babassu) and two methods were developed for data analysis. The first one, based on principal component analysis (PCA), nearest neighbor classification (KNN) and univariate regression, was used for samples adulterated with a single type of oil or fat. In the second method, partial least square regression (PLS) was used for the cases where the adulterants were mixtures of up to three types of oils or fats. In the first method, the techniques of PCA and KNN were correctly classified as 17 out of 18 validation samples on the type of oil or fat present. The concentrations estimated for adulterants showed good agreement with the reference values, with mean errors of prediction (RMSEP) ranging between 0.10 and 0.22% (v/v). The PLS method was efficient in the quantification of mixtures of up to three types of oils and fats, with RMSEP being obtained between 0.08 and 0.27% (v/v), mean precision between 0.07 and 0.32% (v/v) and minimum detectable concentration between 0.23 and 0.81% (v/v) depending on the type of oil or fat in the mixture determined. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Jiang, Li; Shi, Tielin; Xuan, Jianping
2012-05-01
Generally, the vibration signals of fault bearings are non-stationary and highly nonlinear under complicated operating conditions. Thus, it's a big challenge to extract optimal features for improving classification and simultaneously decreasing feature dimension. Kernel Marginal Fisher analysis (KMFA) is a novel supervised manifold learning algorithm for feature extraction and dimensionality reduction. In order to avoid the small sample size problem in KMFA, we propose regularized KMFA (RKMFA). A simple and efficient intelligent fault diagnosis method based on RKMFA is put forward and applied to fault recognition of rolling bearings. So as to directly excavate nonlinear features from the original high-dimensional vibration signals, RKMFA constructs two graphs describing the intra-class compactness and the inter-class separability, by combining traditional manifold learning algorithm with fisher criteria. Therefore, the optimal low-dimensional features are obtained for better classification and finally fed into the simplest K-nearest neighbor (KNN) classifier to recognize different fault categories of bearings. The experimental results demonstrate that the proposed approach improves the fault classification performance and outperforms the other conventional approaches.
Nagarajan, R; Hariharan, M; Satiyan, M
2012-08-01
Developing tools to assist physically disabled and immobilized people through facial expression is a challenging area of research and has attracted many researchers recently. In this paper, luminance stickers based facial expression recognition is proposed. Recognition of facial expression is carried out by employing Discrete Wavelet Transform (DWT) as a feature extraction method. Different wavelet families with their different orders (db1 to db20, Coif1 to Coif 5 and Sym2 to Sym8) are utilized to investigate their performance in recognizing facial expression and to evaluate their computational time. Standard deviation is computed for the coefficients of first level of wavelet decomposition for every order of wavelet family. This standard deviation is used to form a set of feature vectors for classification. In this study, conventional validation and cross validation are performed to evaluate the efficiency of the suggested feature vectors. Three different classifiers namely Artificial Neural Network (ANN), k-Nearest Neighborhood (kNN) and Linear Discriminant Analysis (LDA) are used to classify a set of eight facial expressions. The experimental results demonstrate that the proposed method gives very promising classification accuracies.
Kharbach, Mourad; Kamal, Rabie; Mansouri, Mohammed Alaoui; Marmouzi, Ilias; Viaene, Johan; Cherrah, Yahia; Alaoui, Katim; Vercammen, Joeri; Bouklouze, Abdelaziz; Vander Heyden, Yvan
2018-10-15
This study investigated the effectiveness of SIFT-MS versus chemical profiling, both coupled to multivariate data analysis, to classify 95 Extra Virgin Argan Oils (EVAO), originating from five Moroccan Argan forest locations. The full scan option of SIFT-MS, is suitable to indicate the geographic origin of EVAO based on the fingerprints obtained using the three chemical ionization precursors (H 3 O + , NO + and O 2 + ). The chemical profiling (including acidity, peroxide value, spectrophotometric indices, fatty acids, tocopherols- and sterols composition) was also used for classification. Partial least squares discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), K-nearest neighbors (KNN), and support vector machines (SVM), were compared. The SIFT-MS data were therefore fed to variable-selection methods to find potential biomarkers for classification. The classification models based either on chemical profiling or SIFT-MS data were able to classify the samples with high accuracy. SIFT-MS was found to be advantageous for rapid geographic classification. Copyright © 2018 Elsevier Ltd. All rights reserved.
Zhu, Fangyuan; Ward, Michael B.; Li, Jing-Feng; Milne, Steven J.
2015-01-01
Legislation arising from health and environmental concerns has intensified research into finding suitable alternatives to lead-based piezoceramics. Recently, solid solutions based on sodium potassium niobate (K,Na)NbO3 (KNN) have become one of the globally-important lead-free counterparts, due to their favourable dielectric and piezoelectric properties. This data article provides information on the ferroelectric properties and core–shell grain structures for the system, (1−y)[(1−x)Na0.5K0.5NbO3 – xLiTaO3] – yBiScO3 (x=0–0.1, y=0.02, abbreviated as KNN–xLT–2BS). We show elemental analysis with aid of TEM spot-EDX to identify three-type grain-types in the KNN–LT–BS ternary system. Melting behaviour has been assessed using a tube furnace with build-in camera. Details for the ferroelectric properties and core–shell chemical segregation are illustrated. PMID:26217758
NASA Astrophysics Data System (ADS)
Yang, Yong-sheng; Ming, An-bo; Zhang, You-yun; Zhu, Yong-sheng
2017-10-01
Diesel engines, widely used in engineering, are very important for the running of equipments and their fault diagnosis have attracted much attention. In the past several decades, the image based fault diagnosis methods have provided efficient ways for the diesel engine fault diagnosis. By introducing the class information into the traditional non-negative matrix factorization (NMF), an improved NMF algorithm named as discriminative NMF (DNMF) was developed and a novel imaged based fault diagnosis method was proposed by the combination of the DNMF and the KNN classifier. Experiments performed on the fault diagnosis of diesel engine were used to validate the efficacy of the proposed method. It is shown that the fault conditions of diesel engine can be efficiently classified by the proposed method using the coefficient matrix obtained by DNMF. Compared with the original NMF (ONMF) and principle component analysis (PCA), the DNMF can represent the class information more efficiently because the class characters of basis matrices obtained by the DNMF are more visible than those in the basis matrices obtained by the ONMF and PCA.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.
Multi-class SVM model for fMRI-based classification and grading of liver fibrosis
NASA Astrophysics Data System (ADS)
Freiman, M.; Sela, Y.; Edrei, Y.; Pappo, O.; Joskowicz, L.; Abramovitch, R.
2010-03-01
We present a novel non-invasive automatic method for the classification and grading of liver fibrosis from fMRI maps based on hepatic hemodynamic changes. This method automatically creates a model for liver fibrosis grading based on training datasets. Our supervised learning method evaluates hepatic hemodynamics from an anatomical MRI image and three T2*-W fMRI signal intensity time-course scans acquired during the breathing of air, air-carbon dioxide, and carbogen. It constructs a statistical model of liver fibrosis from these fMRI scans using a binary-based one-against-all multi class Support Vector Machine (SVM) classifier. We evaluated the resulting classification model with the leave-one out technique and compared it to both full multi-class SVM and K-Nearest Neighbor (KNN) classifications. Our experimental study analyzed 57 slice sets from 13 mice, and yielded a 98.2% separation accuracy between healthy and low grade fibrotic subjects, and an overall accuracy of 84.2% for fibrosis grading. These results are better than the existing image-based methods which can only discriminate between healthy and high grade fibrosis subjects. With appropriate extensions, our method may be used for non-invasive classification and progression monitoring of liver fibrosis in human patients instead of more invasive approaches, such as biopsy or contrast-enhanced imaging.
Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong
2017-01-01
A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification. PMID:28629202
Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar
2016-04-01
Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generates an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. They often contain a large amount of expression, but only a fraction of it comprises genes that are significantly expressed. The precise identification of genes of interest that are responsible for causing cancer are imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Le, Anh H.; Park, Young W.; Ma, Kevin; Jacobs, Colin; Liu, Brent J.
2010-03-01
Multiple Sclerosis (MS) is a progressive neurological disease affecting myelin pathways in the brain. Multiple lesions in the white matter can cause paralysis and severe motor disabilities of the affected patient. To solve the issue of inconsistency and user-dependency in manual lesion measurement of MRI, we have proposed a 3-D automated lesion quantification algorithm to enable objective and efficient lesion volume tracking. The computer-aided detection (CAD) of MS, written in MATLAB, utilizes K-Nearest Neighbors (KNN) method to compute the probability of lesions on a per-voxel basis. Despite the highly optimized algorithm of imaging processing that is used in CAD development, MS CAD integration and evaluation in clinical workflow is technically challenging due to the requirement of high computation rates and memory bandwidth in the recursive nature of the algorithm. In this paper, we present the development and evaluation of using a computing engine in the graphical processing unit (GPU) with MATLAB for segmentation of MS lesions. The paper investigates the utilization of a high-end GPU for parallel computing of KNN in the MATLAB environment to improve algorithm performance. The integration is accomplished using NVIDIA's CUDA developmental toolkit for MATLAB. The results of this study will validate the practicality and effectiveness of the prototype MS CAD in a clinical setting. The GPU method may allow MS CAD to rapidly integrate in an electronic patient record or any disease-centric health care system.
NASA Astrophysics Data System (ADS)
Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng
2013-10-01
Quick, accurate and reliable technique for discrimination of cocoa beans according to geographical origin is essential for quality control and traceability management. This current study presents the application of Near Infrared Spectroscopy technique and multivariate classification for the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data and this gave visible cluster trends. The performance of four multivariate classification methods: Linear discriminant analysis (LDA), K-nearest neighbors (KNN), Back propagation artificial neural network (BPANN) and Support vector machine (SVM) were compared. The performances of the models were optimized by cross validation. The results revealed that; SVM model was superior to all the mathematical methods with a discrimination rate of 100% in both the training and prediction set after preprocessing with Mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for prediction set. While LDA model had 96.15% and 90.63% for the training and prediction sets respectively. KNN model had 75.01% for the training set and 72.31% for prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR Spectroscopy coupled with SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.
MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence.
Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng
2015-06-15
Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using 'learning to rank'. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. The software is available upon request. © The Author 2015. Published by Oxford University Press.
MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence
Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng
2015-01-01
Motivation: Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. Methods: We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using ‘learning to rank’. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. Results: MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. Availability and implementation: The software is available upon request. Contact: zhusf@fudan.edu.cn PMID:26072501
Machine learning search for variable stars
NASA Astrophysics Data System (ADS)
Pashchenko, Ilya N.; Sokolovsky, Kirill V.; Gavras, Panagiotis
2018-04-01
Photometric variability detection is often considered as a hypothesis testing problem: an object is variable if the null hypothesis that its brightness is constant can be ruled out given the measurements and their uncertainties. The practical applicability of this approach is limited by uncorrected systematic errors. We propose a new variability detection technique sensitive to a wide range of variability types while being robust to outliers and underestimated measurement uncertainties. We consider variability detection as a classification problem that can be approached with machine learning. Logistic Regression (LR), Support Vector Machines (SVM), k Nearest Neighbours (kNN), Neural Nets (NN), Random Forests (RF), and Stochastic Gradient Boosting classifier (SGB) are applied to 18 features (variability indices) quantifying scatter and/or correlation between points in a light curve. We use a subset of Optical Gravitational Lensing Experiment phase two (OGLE-II) Large Magellanic Cloud (LMC) photometry (30 265 light curves) that was searched for variability using traditional methods (168 known variable objects) as the training set and then apply the NN to a new test set of 31 798 OGLE-II LMC light curves. Among 205 candidates selected in the test set, 178 are real variables, while 13 low-amplitude variables are new discoveries. The machine learning classifiers considered are found to be more efficient (select more variables and fewer false candidates) compared to traditional techniques using individual variability indices or their linear combination. The NN, SGB, SVM, and RF show a higher efficiency compared to LR and kNN.
Silva, Carlos Alberto; Klauberg, Carine; Hudak, Andrew T; Vierling, Lee A; Liesenberg, Veraldo; Bernett, Luiz G; Scheraiber, Clewerson F; Schoeninger, Emerson R
2018-01-01
Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominate and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and the non- k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR derived metrics were calculated at 53 regular sample plots and we used imputation models to retrieve the forest attributes at plot and landscape-levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The Imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj.R2) and a root mean squared difference (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, furthers studies need to be realized to improve the accuracy prediction of TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against the conventional inventory procedures.
Karacavus, Seyhan; Yılmaz, Bülent; Tasdemir, Arzu; Kayaaltı, Ömer; Kaya, Eser; İçer, Semra; Ayyıldız, Oguzhan
2018-04-01
We investigated the association between the textural features obtained from 18 F-FDG images, metabolic parameters (SUVmax , SUVmean, MTV, TLG), and tumor histopathological characteristics (stage and Ki-67 proliferation index) in non-small cell lung cancer (NSCLC). The FDG-PET images of 67 patients with NSCLC were evaluated. MATLAB technical computing language was employed in the extraction of 137 features by using first order statistics (FOS), gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), and Laws' texture filters. Textural features and metabolic parameters were statistically analyzed in terms of good discrimination power between tumor stages, and selected features/parameters were used in the automatic classification by k-nearest neighbors (k-NN) and support vector machines (SVM). We showed that one textural feature (gray-level nonuniformity, GLN) obtained using GLRLM approach and nine textural features using Laws' approach were successful in discriminating all tumor stages, unlike metabolic parameters. There were significant correlations between Ki-67 index and some of the textural features computed using Laws' method (r = 0.6, p = 0.013). In terms of automatic classification of tumor stage, the accuracy was approximately 84% with k-NN classifier (k = 3) and SVM, using selected five features. Texture analysis of FDG-PET images has a potential to be an objective tool to assess tumor histopathological characteristics. The textural features obtained using Laws' approach could be useful in the discrimination of tumor stage.
Quantum chemical and statistical study of megazol-derived compounds with trypanocidal activity
NASA Astrophysics Data System (ADS)
Rosselli, F. P.; Albuquerque, C. N.; da Silva, A. B. F.
In this work we performed a structure-activity relationship (SAR) study with the aim to correlate molecular properties of the megazol compound and 10 of its analogs with the biological activity against Trypanosoma cruzi (trypanocidal or antichagasic activity) presented by these molecules. The biological activity indication was obtained from in vitro tests and the molecular properties (variables or descriptors) were obtained from the optimized chemical structures by using the PM3 semiempirical method. It was calculated ˜80 molecular properties selected among steric, constitutional, electronic, and lipophilicity properties. In order to reduce dimensionality and investigate which subset of variables (descriptors) would be more effective in classifying the compounds studied, according to their degree of trypanocidal activity, we employed statistical methodologies (pattern recognition and classification techniques) such as principal component analysis (PCA), hierarchical cluster analysis (HCA), K-nearest neighbor (KNN), and discriminant function analysis (DFA). These methods showed that the descriptors molecular mass (MM), energy of the second lowest unoccupied molecular orbital (LUMO+1), charge on the first nitrogen at substituent 2 (qN'), dihedral angles (D1 and D2), bond length between atom C4 and its substituent (L4), Moriguchi octanol-partition coefficient (MLogP), and length-to-breadth ratio (L/Bw) were the variables responsible for the separation between active and inactive compounds against T. cruzi. Afterwards, the PCA, KNN, and DFA models built in this work were used to perform trypanocidal activity predictions for eight new megazol analog compounds.
Research on fatigue driving pre-warning system based on multi-information fusion
NASA Astrophysics Data System (ADS)
Zhao, Xuyang; Ye, Wenwu
2018-05-01
With the development of science and technology, transportation network has grown faster. But at the same time, the quantity of traffic accidents due to fatigue driving grows faster as well. In the meantime, fatigue driving has been one of the main causes of traffic accidents. Therefore, it is indispensable for us to study the detection of fatigue driving to help to driving safety. There are numerous approaches in discrimination method. Each type of method has its reasonable theoretical basis, but the disadvantages of traditional fatigue driving detection methods have been more and more obvious since we study the traditional physiology and psychological features of fatigue drivers. So we set up a new system based on multi-information fusion and pattern recognition theory. In the paper, the fatigue driving pre-warning system discriminates fatigue by analyzing the characteristic parameters, the parameters derived from the steering wheel angle, the driver's power of gripping and the heart rate. And the data analysis system is established based on fuzzy C-means clustering theory. Finally, KNN classifier is used to establish the relation between feature indexes and fatigue degree. It is verified that the system has the better accuracy, agility and robustness according to our confirmatory experiment.
Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.
Wei, Runmin; Wang, Jingye; Su, Mingming; Jia, Erik; Chen, Shaoqiu; Chen, Tianlu; Ni, Yan
2018-01-12
Missing values exist widely in mass-spectrometry (MS) based metabolomics data. Various methods have been applied for handling missing values, but the selection can significantly affect following data analyses. Typically, there are three types of missing values, missing not at random (MNAR), missing at random (MAR), and missing completely at random (MCAR). Our study comprehensively compared eight imputation methods (zero, half minimum (HM), mean, median, random forest (RF), singular value decomposition (SVD), k-nearest neighbors (kNN), and quantile regression imputation of left-censored data (QRILC)) for different types of missing values using four metabolomics datasets. Normalized root mean squared error (NRMSE) and NRMSE-based sum of ranks (SOR) were applied to evaluate imputation accuracy. Principal component analysis (PCA)/partial least squares (PLS)-Procrustes analysis were used to evaluate the overall sample distribution. Student's t-test followed by correlation analysis was conducted to evaluate the effects on univariate statistics. Our findings demonstrated that RF performed the best for MCAR/MAR and QRILC was the favored one for left-censored MNAR. Finally, we proposed a comprehensive strategy and developed a public-accessible web-tool for the application of missing value imputation in metabolomics ( https://metabolomics.cc.hawaii.edu/software/MetImp/ ).
A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data.
Song, Hongchao; Jiang, Zhuqing; Men, Aidong; Yang, Bo
2017-01-01
Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute the neighborhood distance between each observation and suffer from the curse of dimensionality in high-dimensional space; for example, the distances between any pair of samples are similar and each sample may perform like an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble k -nearest neighbor graphs- ( K -NNG-) based anomaly detector. Benefiting from the ability of nonlinear mapping, the DAE is first trained to learn the intrinsic features of a high-dimensional dataset to represent the high-dimensional data in a more compact subspace. Several nonparametric KNN-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset. The final prediction is made by all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves the detection accuracy and reduces the computational complexity.
A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data
Jiang, Zhuqing; Men, Aidong; Yang, Bo
2017-01-01
Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute the neighborhood distance between each observation and suffer from the curse of dimensionality in high-dimensional space; for example, the distances between any pair of samples are similar and each sample may perform like an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble k-nearest neighbor graphs- (K-NNG-) based anomaly detector. Benefiting from the ability of nonlinear mapping, the DAE is first trained to learn the intrinsic features of a high-dimensional dataset to represent the high-dimensional data in a more compact subspace. Several nonparametric KNN-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset. The final prediction is made by all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves the detection accuracy and reduces the computational complexity. PMID:29270197
Han, Te; Jiang, Dongxiang; Zhang, Xiaochen; Sun, Yankui
2017-03-27
Rotating machinery is widely used in industrial applications. With the trend towards more precise and more critical operating conditions, mechanical failures may easily occur. Condition monitoring and fault diagnosis (CMFD) technology is an effective tool to enhance the reliability and security of rotating machinery. In this paper, an intelligent fault diagnosis method based on dictionary learning and singular value decomposition (SVD) is proposed. First, the dictionary learning scheme is capable of generating an adaptive dictionary whose atoms reveal the underlying structure of raw signals. Essentially, dictionary learning is employed as an adaptive feature extraction method regardless of any prior knowledge. Second, the singular value sequence of learned dictionary matrix is served to extract feature vector. Generally, since the vector is of high dimensionality, a simple and practical principal component analysis (PCA) is applied to reduce dimensionality. Finally, the K -nearest neighbor (KNN) algorithm is adopted for identification and classification of fault patterns automatically. Two experimental case studies are investigated to corroborate the effectiveness of the proposed method in intelligent diagnosis of rotating machinery faults. The comparison analysis validates that the dictionary learning-based matrix construction approach outperforms the mode decomposition-based methods in terms of capacity and adaptability for feature extraction.
Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan
2014-01-01
Cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability in olden days. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip which stimulates the progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with the conventional classifiers like support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases which serve as a boon to the medical community. This work further reduces the misclassification of cancers which is highly not allowed in cancer detection.
Hakimzadeh, Neda; Parastar, Hadi; Fattahi, Mohammad
2014-01-24
In this study, multivariate curve resolution (MCR) and multivariate classification methods are proposed to develop a new chemometric strategy for comprehensive analysis of high-performance liquid chromatography-diode array absorbance detection (HPLC-DAD) fingerprints of sixty Salvia reuterana samples from five different geographical regions. Different chromatographic problems occurred during HPLC-DAD analysis of S. reuterana samples, such as baseline/background contribution and noise, low signal-to-noise ratio (S/N), asymmetric peaks, elution time shifts, and peak overlap are handled using the proposed strategy. In this way, chromatographic fingerprints of sixty samples are properly segmented to ten common chromatographic regions using local rank analysis and then, the corresponding segments are column-wise augmented for subsequent MCR analysis. Extended multivariate curve resolution-alternating least squares (MCR-ALS) is used to obtain pure component profiles in each segment. In general, thirty-one chemical components were resolved using MCR-ALS in sixty S. reuterana samples and the lack of fit (LOF) values of MCR-ALS models were below 10.0% in all cases. Pure spectral profiles are considered for identification of chemical components by comparing their resolved spectra with the standard ones and twenty-four components out of thirty-one components were identified. Additionally, pure elution profiles are used to obtain relative concentrations of chemical components in different samples for multivariate classification analysis by principal component analysis (PCA) and k-nearest neighbors (kNN). Inspection of the PCA score plot (explaining 76.1% of variance accounted for three PCs) showed that S. reuterana samples belong to four clusters. The degree of class separation (DCS) which quantifies the distance separating clusters in relation to the scatter within each cluster is calculated for four clusters and it was in the range of 1.6-5.8. These results are then confirmed by kNN. In addition, according to the PCA loading plot and kNN dendrogram of thirty-one variables, five chemical constituents of luteolin-7-o-glucoside, salvianolic acid D, rosmarinic acid, lithospermic acid and trijuganone A are identified as the most important variables (i.e., chemical markers) for clusters discrimination. Finally, the effect of different chemical markers on samples differentiation is investigated using counter-propagation artificial neural network (CP-ANN) method. It is concluded that the proposed strategy can be successfully applied for comprehensive analysis of chromatographic fingerprints of complex natural samples. Copyright © 2013 Elsevier B.V. All rights reserved.
Development of a sonar-based object recognition system
NASA Astrophysics Data System (ADS)
Ecemis, Mustafa Ihsan
2001-02-01
Sonars are used extensively in mobile robotics for obstacle detection, ranging and avoidance. However, these range-finding applications do not exploit the full range of information carried in sonar echoes. In addition, mobile robots need robust object recognition systems. Therefore, a simple and robust object recognition system using ultrasonic sensors may have a wide range of applications in robotics. This dissertation develops and analyzes an object recognition system that uses ultrasonic sensors of the type commonly found on mobile robots. Three principal experiments are used to test the sonar recognition system: object recognition at various distances, object recognition during unconstrained motion, and softness discrimination. The hardware setup, consisting of an inexpensive Polaroid sonar and a data acquisition board, is described first. The software for ultrasound signal generation, echo detection, data collection, and data processing is then presented. Next, the dissertation describes two methods to extract information from the echoes, one in the frequency domain and the other in the time domain. The system uses the fuzzy ARTMAP neural network to recognize objects on the basis of the information content of their echoes. In order to demonstrate that the performance of the system does not depend on the specific classification method being used, the K- Nearest Neighbors (KNN) Algorithm is also implemented. KNN yields a test accuracy similar to fuzzy ARTMAP in all experiments. Finally, the dissertation describes a method for extracting features from the envelope function in order to reduce the dimension of the input vector used by the classifiers. Decreasing the size of the input vectors reduces the memory requirements of the system and makes it run faster. It is shown that this method does not affect the performance of the system dramatically and is more appropriate for some tasks. The results of these experiments demonstrate that sonar can be used to develop a low-cost, low-computation system for real-time object recognition tasks on mobile robots. This system differs from all previous approaches in that it is relatively simple, robust, fast, and inexpensive.
Implementation of a Localization System for Sensor Networks
2006-05-18
its N-point DFT is mathematically formulated as X[k] = N−1∑ n=0 x[n] W nkN , k = 0, 1, . . . , N − 1 (7.1) W knN = e −j(2π/N)kn (7.2) There are two...distributed ad-hoc wireless sensor networks. In Int. Conf. on Acoustics, Speech , and Signal Proc. (ICASSP), pages 2037 – 2040, Salt Lake City, UT. [18] J...Stability of recursive qrd-ls algorithms using finite- precision systolic array implementation. IEEE Trans. on Acoustics, Speech , and Signal Proc., 37(5
A Feature-based Approach to Big Data Analysis of Medical Images
Toews, Matthew; Wachinger, Christian; Estepar, Raul San Jose; Wells, William M.
2015-01-01
This paper proposes an inference method well-suited to large sets of medical images. The method is based upon a framework where distinctive 3D scale-invariant features are indexed efficiently to identify approximate nearest-neighbor (NN) feature matches in O(log N) computational complexity in the number of images N. It thus scales well to large data sets, in contrast to methods based on pair-wise image registration or feature matching requiring O(N) complexity. Our theoretical contribution is a density estimator based on a generative model that generalizes kernel density estimation and K-nearest neighbor (KNN) methods. The estimator can be used for on-the-fly queries, without requiring explicit parametric models or an off-line training phase. The method is validated on a large multi-site data set of 95,000,000 features extracted from 19,000 lung CT scans. Subject-level classification identifies all images of the same subjects across the entire data set despite deformation due to breathing state, including unintentional duplicate scans. State-of-the-art performance is achieved in predicting chronic pulmonary obstructive disorder (COPD) severity across the 5-category GOLD clinical rating, with an accuracy of 89% if both exact and one-off predictions are considered correct. PMID:26221685
A Feature-Based Approach to Big Data Analysis of Medical Images.
Toews, Matthew; Wachinger, Christian; Estepar, Raul San Jose; Wells, William M
2015-01-01
This paper proposes an inference method well-suited to large sets of medical images. The method is based upon a framework where distinctive 3D scale-invariant features are indexed efficiently to identify approximate nearest-neighbor (NN) feature matches-in O (log N) computational complexity in the number of images N. It thus scales well to large data sets, in contrast to methods based on pair-wise image registration or feature matching requiring O(N) complexity. Our theoretical contribution is a density estimator based on a generative model that generalizes kernel density estimation and K-nearest neighbor (KNN) methods.. The estimator can be used for on-the-fly queries, without requiring explicit parametric models or an off-line training phase. The method is validated on a large multi-site data set of 95,000,000 features extracted from 19,000 lung CT scans. Subject-level classification identifies all images of the same subjects across the entire data set despite deformation due to breathing state, including unintentional duplicate scans. State-of-the-art performance is achieved in predicting chronic pulmonary obstructive disorder (COPD) severity across the 5-category GOLD clinical rating, with an accuracy of 89% if both exact and one-off predictions are considered correct.
GTM-Based QSAR Models and Their Applicability Domains.
Gaspar, H A; Baskin, I I; Marcou, G; Horvath, D; Varnek, A
2015-06-01
In this paper we demonstrate that Generative Topographic Mapping (GTM), a machine learning method traditionally used for data visualisation, can be efficiently applied to QSAR modelling using probability distribution functions (PDF) computed in the latent 2-dimensional space. Several different scenarios of the activity assessment were considered: (i) the "activity landscape" approach based on direct use of PDF, (ii) QSAR models involving GTM-generated on descriptors derived from PDF, and, (iii) the k-Nearest Neighbours approach in 2D latent space. Benchmarking calculations were performed on five different datasets: stability constants of metal cations Ca(2+) , Gd(3+) and Lu(3+) complexes with organic ligands in water, aqueous solubility and activity of thrombin inhibitors. It has been shown that the performance of GTM-based regression models is similar to that obtained with some popular machine-learning methods (random forest, k-NN, M5P regression tree and PLS) and ISIDA fragment descriptors. By comparing GTM activity landscapes built both on predicted and experimental activities, we may visually assess the model's performance and identify the areas in the chemical space corresponding to reliable predictions. The applicability domain used in this work is based on data likelihood. Its application has significantly improved the model performances for 4 out of 5 datasets. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Natural Language Processing Based Instrument for Classification of Free Text Medical Records
2016-01-01
According to the Ministry of Labor, Health and Social Affairs of Georgia a new health management system has to be introduced in the nearest future. In this context arises the problem of structuring and classifying documents containing all the history of medical services provided. The present work introduces the instrument for classification of medical records based on the Georgian language. It is the first attempt of such classification of the Georgian language based medical records. On the whole 24.855 examination records have been studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results obtained demonstrated that both machine learning methods performed successfully, with a little supremacy of SVM. In the process of classification a “shrink” method, based on features selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, on the second stage of classification into subclasses 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful. PMID:27668260
Pattern recognition for passive polarimetric data using nonparametric classifiers
NASA Astrophysics Data System (ADS)
Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.
2005-08-01
Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number classes and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A Probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a -nearest neighbor (KNN) classifier, is used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
Jiménez-Carvelo, Ana M; Pérez-Castaño, Estefanía; González-Casado, Antonio; Cuadros-Rodríguez, Luis
2017-04-15
A new method for differentiation of olive oil (independently of the quality category) from other vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, palm, linseed, sesame and soybean) has been developed. The analytical procedure for chromatographic fingerprinting of the methyl-transesterified fraction of each vegetable oil, using normal-phase liquid chromatography, is described and the chemometric strategies applied and discussed. Some chemometric methods, such as k-nearest neighbours (kNN), partial least squared-discriminant analysis (PLS-DA), support vector machine classification analysis (SVM-C), and soft independent modelling of class analogies (SIMCA), were applied to build classification models. Performance of the classification was evaluated and ranked using several classification quality metrics. The discriminant analysis, based on the use of one input-class, (plus a dummy class) was applied for the first time in this study. Copyright © 2016 Elsevier Ltd. All rights reserved.
A triboelectric motion sensor in wearable body sensor network for human activity recognition.
Hui Huang; Xian Li; Ye Sun
2016-08-01
The goal of this study is to design a novel triboelectric motion sensor in wearable body sensor network for human activity recognition. Physical activity recognition is widely used in well-being management, medical diagnosis and rehabilitation. Other than traditional accelerometers, we design a novel wearable sensor system based on triboelectrification. The triboelectric motion sensor can be easily attached to human body and collect motion signals caused by physical activities. The experiments are conducted to collect five common activity data: sitting and standing, walking, climbing upstairs, downstairs, and running. The k-Nearest Neighbor (kNN) clustering algorithm is adopted to recognize these activities and validate the feasibility of this new approach. The results show that our system can perform physical activity recognition with a successful rate over 80% for walking, sitting and standing. The triboelectric structure can also be used as an energy harvester for motion harvesting due to its high output voltage in random low-frequency motion.
Adaptive local linear regression with application to printer color management.
Gupta, Maya R; Garcia, Eric K; Chin, Erika
2008-06-01
Local learning methods, such as local linear regression and nearest neighbor classifiers, base estimates on nearby training samples, neighbors. Usually, the number of neighbors used in estimation is fixed to be a global "optimal" value, chosen by cross validation. This paper proposes adapting the number of neighbors used for estimation to the local geometry of the data, without need for cross validation. The term enclosing neighborhood is introduced to describe a set of neighbors whose convex hull contains the test point when possible. It is proven that enclosing neighborhoods yield bounded estimation variance under some assumptions. Three such enclosing neighborhood definitions are presented: natural neighbors, natural neighbors inclusive, and enclosing k-NN. The effectiveness of these neighborhood definitions with local linear regression is tested for estimating lookup tables for color management. Significant improvements in error metrics are shown, indicating that enclosing neighborhoods may be a promising adaptive neighborhood definition for other local learning tasks as well, depending on the density of training samples.
Multi-texture local ternary pattern for face recognition
NASA Astrophysics Data System (ADS)
Essa, Almabrok; Asari, Vijayan
2017-05-01
In imagery and pattern analysis domain a variety of descriptors have been proposed and employed for different computer vision applications like face detection and recognition. Many of them are affected under different conditions during the image acquisition process such as variations in illumination and presence of noise, because they totally rely on the image intensity values to encode the image information. To overcome these problems, a novel technique named Multi-Texture Local Ternary Pattern (MTLTP) is proposed in this paper. MTLTP combines the edges and corners based on the local ternary pattern strategy to extract the local texture features of the input image. Then returns a spatial histogram feature vector which is the descriptor for each image that we use to recognize a human being. Experimental results using a k-nearest neighbors classifier (k-NN) on two publicly available datasets justify our algorithm for efficient face recognition in the presence of extreme variations of illumination/lighting environments and slight variation of pose conditions.
Chemical data as markers of the geographical origins of sugarcane spirits.
Serafim, F A T; Pereira-Filho, Edenir R; Franco, D W
2016-04-01
In an attempt to classify sugarcane spirits according to their geographic region of origin, chemical data for 24 analytes were evaluated in 50 cachaças produced using a similar procedure in selected regions of Brazil: São Paulo - SP (15), Minas Gerais - MG (11), Rio de Janeiro - RJ (11), Paraiba -PB (9), and Ceará - CE (4). Multivariate analysis was applied to the analytical results, and the predictive abilities of different classification methods were evaluated. Principal component analysis identified five groups, and chemical similarities were observed between MG and SP samples and between RJ and PB samples. CE samples presented a distinct chemical profile. Among the samples, partial linear square discriminant analysis (PLS-DA) classified 50.2% of the samples correctly, K-nearest neighbor (KNN) 86%, and soft independent modeling of class analogy (SIMCA) 56.2%. Therefore, in this proof of concept demonstration, the proposed approach based on chemical data satisfactorily predicted the cachaças' geographic origins. Copyright © 2015 Elsevier Ltd. All rights reserved.
Navarro, Pedro J.; Fernández, Carlos; Borraz, Raúl; Alonso, Diego
2016-01-01
This article describes an automated sensor-based system to detect pedestrians in an autonomous vehicle application. Although the vehicle is equipped with a broad set of sensors, the article focuses on the processing of the information generated by a Velodyne HDL-64E LIDAR sensor. The cloud of points generated by the sensor (more than 1 million points per revolution) is processed to detect pedestrians, by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in the cube. The work relates an exhaustive analysis of the performance of three different machine learning algorithms: k-Nearest Neighbours (kNN), Naïve Bayes classifier (NBC), and Support Vector Machine (SVM). These algorithms have been trained with 1931 samples. The final performance of the method, measured a real traffic scenery, which contained 16 pedestrians and 469 samples of non-pedestrians, shows sensitivity (81.2%), accuracy (96.2%) and specificity (96.8%). PMID:28025565
Mild Depression Detection of College Students: an EEG-Based Solution with Free Viewing Tasks.
Li, Xiaowei; Hu, Bin; Shen, Ji; Xu, Tingting; Retcliffe, Martyn
2015-12-01
Depression is a common mental disorder with growing prevalence; however current diagnoses of depression face the problem of patient denial, clinical experience and subjective biases from self-report. By using a combination of linear and nonlinear EEG features in our research, we aim to develop a more accurate and objective approach to depression detection that supports the process of diagnosis and assists the monitoring of risk factors. By classifying EEG features during free viewing task, an accuracy of 99.1%, which is the highest to our knowledge by far, was achieved using kNN classifier to discriminate depressed and non-depressed subjects. Furthermore, through correlation analysis, comparisons of performance on each electrode were discussed on the availability of single channel EEG recording depression detection system. Combined with wearable EEG collecting devices, our method offers the possibility of cost effective wearable ubiquitous system for doctors to monitor their patients with depression, and for normal people to understand their mental states in time.
Privacy Preserving Nearest Neighbor Search
NASA Astrophysics Data System (ADS)
Shaneck, Mark; Kim, Yongdae; Kumar, Vipin
Data mining is frequently obstructed by privacy concerns. In many cases data is distributed, and bringing the data together in one place for analysis is not possible due to privacy laws (e.g. HIPAA) or policies. Privacy preserving data mining techniques have been developed to address this issue by providing mechanisms to mine the data while giving certain privacy guarantees. In this chapter we address the issue of privacy preserving nearest neighbor search, which forms the kernel of many data mining applications. To this end, we present a novel algorithm based on secure multiparty computation primitives to compute the nearest neighbors of records in horizontally distributed data. We show how this algorithm can be used in three important data mining algorithms, namely LOF outlier detection, SNN clustering, and kNN classification. We prove the security of these algorithms under the semi-honest adversarial model, and describe methods that can be used to optimize their performance. Keywords: Privacy Preserving Data Mining, Nearest Neighbor Search, Outlier Detection, Clustering, Classification, Secure Multiparty Computation
Georgouli, Konstantia; Martinez Del Rincon, Jesus; Koidis, Anastasios
2017-02-15
The main objective of this work was to develop a novel dimensionality reduction technique as a part of an integrated pattern recognition solution capable of identifying adulterants such as hazelnut oil in extra virgin olive oil at low percentages based on spectroscopic chemical fingerprints. A novel Continuous Locality Preserving Projections (CLPP) technique is proposed which allows the modelling of the continuous nature of the produced in-house admixtures as data series instead of discrete points. The maintenance of the continuous structure of the data manifold enables the better visualisation of this examined classification problem and facilitates the more accurate utilisation of the manifold for detecting the adulterants. The performance of the proposed technique is validated with two different spectroscopic techniques (Raman and Fourier transform infrared, FT-IR). In all cases studied, CLPP accompanied by k-Nearest Neighbors (kNN) algorithm was found to outperform any other state-of-the-art pattern recognition techniques. Copyright © 2016 Elsevier Ltd. All rights reserved.
Navarro, Pedro J; Fernández, Carlos; Borraz, Raúl; Alonso, Diego
2016-12-23
This article describes an automated sensor-based system to detect pedestrians in an autonomous vehicle application. Although the vehicle is equipped with a broad set of sensors, the article focuses on the processing of the information generated by a Velodyne HDL-64E LIDAR sensor. The cloud of points generated by the sensor (more than 1 million points per revolution) is processed to detect pedestrians, by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in the cube. The work relates an exhaustive analysis of the performance of three different machine learning algorithms: k-Nearest Neighbours (kNN), Naïve Bayes classifier (NBC), and Support Vector Machine (SVM). These algorithms have been trained with 1931 samples. The final performance of the method, measured a real traffic scenery, which contained 16 pedestrians and 469 samples of non-pedestrians, shows sensitivity (81.2%), accuracy (96.2%) and specificity (96.8%).
Classification of speech dysfluencies using LPC based parameterization techniques.
Hariharan, M; Chee, Lim Sin; Ai, Ooi Chia; Yaacob, Sazali
2012-06-01
The goal of this paper is to discuss and compare three feature extraction methods: Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Weighted Linear Prediction Cepstral Coefficients (WLPCC) for recognizing the stuttered events. Speech samples from the University College London Archive of Stuttered Speech (UCLASS) were used for our analysis. The stuttered events were identified through manual segmentation and were used for feature extraction. Two simple classifiers namely, k-nearest neighbour (kNN) and Linear Discriminant Analysis (LDA) were employed for speech dysfluencies classification. Conventional validation method was used for testing the reliability of the classifier results. The study on the effect of different frame length, percentage of overlapping, value of ã in a first order pre-emphasizer and different order p were discussed. The speech dysfluencies classification accuracy was found to be improved by applying statistical normalization before feature extraction. The experimental investigation elucidated LPC, LPCC and WLPCC features can be used for identifying the stuttered events and WLPCC features slightly outperforms LPCC features and LPC features.
NASA Astrophysics Data System (ADS)
Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.
2017-02-01
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
Özdemir, Ahmet Turan
2016-01-01
Wearable devices for fall detection have received attention in academia and industry, because falls are very dangerous, especially for elderly people, and if immediate aid is not provided, it may result in death. However, some predictive devices are not easily worn by elderly people. In this work, a huge dataset, including 2520 tests, is employed to determine the best sensor placement location on the body and to reduce the number of sensor nodes for device ergonomics. During the tests, the volunteer’s movements are recorded with six groups of sensors each with a triaxial (accelerometer, gyroscope and magnetometer) sensor, which is placed tightly on different parts of the body with special straps: head, chest, waist, right-wrist, right-thigh and right-ankle. The accuracy of individual sensor groups with their location is investigated with six machine learning techniques, namely the k-nearest neighbor (k-NN) classifier, Bayesian decision making (BDM), support vector machines (SVM), least squares method (LSM), dynamic time warping (DTW) and artificial neural networks (ANNs). Each technique is applied to single, double, triple, quadruple, quintuple and sextuple sensor configurations. These configurations create 63 different combinations, and for six machine learning techniques, a total of 63 × 6 = 378 combinations is investigated. As a result, the waist region is found to be the most suitable location for sensor placement on the body with 99.96% fall detection sensitivity by using the k-NN classifier, whereas the best sensitivity achieved by the wrist sensor is 97.37%, despite this location being highly preferred for today’s wearable applications. PMID:27463719
Como, F; Carnesecchi, E; Volani, S; Dorne, J L; Richardson, J; Bassan, A; Pavan, M; Benfenati, E
2017-01-01
Ecological risk assessment of plant protection products (PPPs) requires an understanding of both the toxicity and the extent of exposure to assess risks for a range of taxa of ecological importance including target and non-target species. Non-target species such as honey bees (Apis mellifera), solitary bees and bumble bees are of utmost importance because of their vital ecological services as pollinators of wild plants and crops. To improve risk assessment of PPPs in bee species, computational models predicting the acute and chronic toxicity of a range of PPPs and contaminants can play a major role in providing structural and physico-chemical properties for the prioritisation of compounds of concern and future risk assessments. Over the last three decades, scientific advisory bodies and the research community have developed toxicological databases and quantitative structure-activity relationship (QSAR) models that are proving invaluable to predict toxicity using historical data and reduce animal testing. This paper describes the development and validation of a k-Nearest Neighbor (k-NN) model using in-house software for the prediction of acute contact toxicity of pesticides on honey bees. Acute contact toxicity data were collected from different sources for 256 pesticides, which were divided into training and test sets. The k-NN models were validated with good prediction, with an accuracy of 70% for all compounds and of 65% for highly toxic compounds, suggesting that they might reliably predict the toxicity of structurally diverse pesticides and could be used to screen and prioritise new pesticides. Copyright © 2016 Elsevier Ltd. All rights reserved.
Murugappan, Murugappan; Murugappan, Subbulakshmi; Zheng, Bong Siao
2013-01-01
[Purpose] Intelligent emotion assessment systems have been highly successful in a variety of applications, such as e-learning, psychology, and psycho-physiology. This study aimed to assess five different human emotions (happiness, disgust, fear, sadness, and neutral) using heart rate variability (HRV) signals derived from an electrocardiogram (ECG). [Subjects] Twenty healthy university students (10 males and 10 females) with a mean age of 23 years participated in this experiment. [Methods] All five emotions were induced by audio-visual stimuli (video clips). ECG signals were acquired using 3 electrodes and were preprocessed using a Butterworth 3rd order filter to remove noise and baseline wander. The Pan-Tompkins algorithm was used to derive the HRV signals from ECG. Discrete wavelet transform (DWT) was used to extract statistical features from the HRV signals using four wavelet functions: Daubechies6 (db6), Daubechies7 (db7), Symmlet8 (sym8), and Coiflet5 (coif5). The k-nearest neighbor (KNN) and linear discriminant analysis (LDA) were used to map the statistical features into corresponding emotions. [Results] KNN provided the maximum average emotion classification rate compared to LDA for five emotions (sadness − 50.28%; happiness − 79.03%; fear − 77.78%; disgust − 88.69%; and neutral − 78.34%). [Conclusion] The results of this study indicate that HRV may be a reliable indicator of changes in the emotional state of subjects and provides an approach to the development of a real-time emotion assessment system with a higher reliability than other systems. PMID:24259846
Murugappan, Murugappan; Murugappan, Subbulakshmi; Zheng, Bong Siao
2013-07-01
[Purpose] Intelligent emotion assessment systems have been highly successful in a variety of applications, such as e-learning, psychology, and psycho-physiology. This study aimed to assess five different human emotions (happiness, disgust, fear, sadness, and neutral) using heart rate variability (HRV) signals derived from an electrocardiogram (ECG). [Subjects] Twenty healthy university students (10 males and 10 females) with a mean age of 23 years participated in this experiment. [Methods] All five emotions were induced by audio-visual stimuli (video clips). ECG signals were acquired using 3 electrodes and were preprocessed using a Butterworth 3rd order filter to remove noise and baseline wander. The Pan-Tompkins algorithm was used to derive the HRV signals from ECG. Discrete wavelet transform (DWT) was used to extract statistical features from the HRV signals using four wavelet functions: Daubechies6 (db6), Daubechies7 (db7), Symmlet8 (sym8), and Coiflet5 (coif5). The k-nearest neighbor (KNN) and linear discriminant analysis (LDA) were used to map the statistical features into corresponding emotions. [Results] KNN provided the maximum average emotion classification rate compared to LDA for five emotions (sadness - 50.28%; happiness - 79.03%; fear - 77.78%; disgust - 88.69%; and neutral - 78.34%). [Conclusion] The results of this study indicate that HRV may be a reliable indicator of changes in the emotional state of subjects and provides an approach to the development of a real-time emotion assessment system with a higher reliability than other systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nishizuka, N.; Kubo, Y.; Den, M.
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutralmore » lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.« less
NASA Astrophysics Data System (ADS)
Chen, Xue; Li, Xiaohui; Yu, Xin; Chen, Deying; Liu, Aichun
2018-01-01
Diagnosis of malignancies is a challenging clinical issue. In this work, we present quick and robust diagnosis and discrimination of lymphoma and multiple myeloma (MM) using laser-induced breakdown spectroscopy (LIBS) conducted on human serum samples, in combination with chemometric methods. The serum samples collected from lymphoma and MM cancer patients and healthy controls were deposited on filter papers and ablated with a pulsed 1064 nm Nd:YAG laser. 24 atomic lines of Ca, Na, K, H, O, and N were selected for malignancy diagnosis. Principal component analysis (PCA), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and k nearest neighbors (kNN) classification were applied to build the malignancy diagnosis and discrimination models. The performances of the models were evaluated using 10-fold cross validation. The discrimination accuracy, confusion matrix and receiver operating characteristic (ROC) curves were obtained. The values of area under the ROC curve (AUC), sensitivity and specificity at the cut-points were determined. The kNN model exhibits the best performances with overall discrimination accuracy of 96.0%. Distinct discrimination between malignancies and healthy controls has been achieved with AUC, sensitivity and specificity for healthy controls all approaching 1. For lymphoma, the best discrimination performance values are AUC = 0.990, sensitivity = 0.970 and specificity = 0.956. For MM, the corresponding values are AUC = 0.986, sensitivity = 0.892 and specificity = 0.994. The results show that the serum-LIBS technique can serve as a quick, less invasive and robust method for diagnosis and discrimination of human malignancies.
Nearest neighbors by neighborhood counting.
Wang, Hui
2006-06-01
Finding nearest neighbors is a general idea that underlies many artificial intelligence tasks, including machine learning, data mining, natural language understanding, and information retrieval. This idea is explicitly used in the k-nearest neighbors algorithm (kNN), a popular classification method. In this paper, this idea is adopted in the development of a general methodology, neighborhood counting, for devising similarity functions. We turn our focus from neighbors to neighborhoods, a region in the data space covering the data point in question. To measure the similarity between two data points, we consider all neighborhoods that cover both data points. We propose to use the number of such neighborhoods as a measure of similarity. Neighborhood can be defined for different types of data in different ways. Here, we consider one definition of neighborhood for multivariate data and derive a formula for such similarity, called neighborhood counting measure or NCM. NCM was tested experimentally in the framework of kNN. Experiments show that NCM is generally comparable to VDM and its variants, the state-of-the-art distance functions for multivariate data, and, at the same time, is consistently better for relatively large k values. Additionally, NCM consistently outperforms HEOM (a mixture of Euclidean and Hamming distances), the "standard" and most widely used distance function for multivariate data. NCM has a computational complexity in the same order as the standard Euclidean distance function and NCM is task independent and works for numerical and categorical data in a conceptually uniform way. The neighborhood counting methodology is proven sound for multivariate data experimentally. We hope it will work for other types of data.
Estimating local scaling properties for the classification of interstitial lung disease patterns
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel
2011-03-01
Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
A Novel Artificial Intelligence System for Endotracheal Intubation.
Carlson, Jestin N; Das, Samarjit; De la Torre, Fernando; Frisch, Adam; Guyette, Francis X; Hodgins, Jessica K; Yealy, Donald M
2016-01-01
Adequate visualization of the glottic opening is a key factor to successful endotracheal intubation (ETI); however, few objective tools exist to help guide providers' ETI attempts toward the glottic opening in real-time. Machine learning/artificial intelligence has helped to automate the detection of other visual structures but its utility with ETI is unknown. We sought to test the accuracy of various computer algorithms in identifying the glottic opening, creating a tool that could aid successful intubation. We collected a convenience sample of providers who each performed ETI 10 times on a mannequin using a video laryngoscope (C-MAC, Karl Storz Corp, Tuttlingen, Germany). We recorded each attempt and reviewed one-second time intervals for the presence or absence of the glottic opening. Four different machine learning/artificial intelligence algorithms analyzed each attempt and time point: k-nearest neighbor (KNN), support vector machine (SVM), decision trees, and neural networks (NN). We used half of the videos to train the algorithms and the second half to test the accuracy, sensitivity, and specificity of each algorithm. We enrolled seven providers, three Emergency Medicine attendings, and four paramedic students. From the 70 total recorded laryngoscopic video attempts, we created 2,465 time intervals. The algorithms had the following sensitivity and specificity for detecting the glottic opening: KNN (70%, 90%), SVM (70%, 90%), decision trees (68%, 80%), and NN (72%, 78%). Initial efforts at computer algorithms using artificial intelligence are able to identify the glottic opening with over 80% accuracy. With further refinements, video laryngoscopy has the potential to provide real-time, direction feedback to the provider to help guide successful ETI.
Özdemir, Ahmet Turan
2016-07-25
Wearable devices for fall detection have received attention in academia and industry, because falls are very dangerous, especially for elderly people, and if immediate aid is not provided, it may result in death. However, some predictive devices are not easily worn by elderly people. In this work, a huge dataset, including 2520 tests, is employed to determine the best sensor placement location on the body and to reduce the number of sensor nodes for device ergonomics. During the tests, the volunteer's movements are recorded with six groups of sensors each with a triaxial (accelerometer, gyroscope and magnetometer) sensor, which is placed tightly on different parts of the body with special straps: head, chest, waist, right-wrist, right-thigh and right-ankle. The accuracy of individual sensor groups with their location is investigated with six machine learning techniques, namely the k-nearest neighbor (k-NN) classifier, Bayesian decision making (BDM), support vector machines (SVM), least squares method (LSM), dynamic time warping (DTW) and artificial neural networks (ANNs). Each technique is applied to single, double, triple, quadruple, quintuple and sextuple sensor configurations. These configurations create 63 different combinations, and for six machine learning techniques, a total of 63 × 6 = 378 combinations is investigated. As a result, the waist region is found to be the most suitable location for sensor placement on the body with 99.96% fall detection sensitivity by using the k-NN classifier, whereas the best sensitivity achieved by the wrist sensor is 97.37%, despite this location being highly preferred for today's wearable applications.
Ma, Chao; Ouyang, Jihong; Chen, Hui-Ling; Zhao, Xue-Hua
2014-01-01
A novel hybrid method named SCFW-KELM, which integrates effective subtractive clustering features weighting and a fast classifier kernel-based extreme learning machine (KELM), has been introduced for the diagnosis of PD. In the proposed method, SCFW is used as a data preprocessing tool, which aims at decreasing the variance in features of the PD dataset, in order to further improve the diagnostic accuracy of the KELM classifier. The impact of the type of kernel functions on the performance of KELM has been investigated in detail. The efficiency and effectiveness of the proposed method have been rigorously evaluated against the PD dataset in terms of classification accuracy, sensitivity, specificity, area under the receiver operating characteristic (ROC) curve (AUC), f-measure, and kappa statistics value. Experimental results have demonstrated that the proposed SCFW-KELM significantly outperforms SVM-based, KNN-based, and ELM-based approaches and other methods in the literature and achieved highest classification results reported so far via 10-fold cross validation scheme, with the classification accuracy of 99.49%, the sensitivity of 100%, the specificity of 99.39%, AUC of 99.69%, the f-measure value of 0.9964, and kappa value of 0.9867. Promisingly, the proposed method might serve as a new candidate of powerful methods for the diagnosis of PD with excellent performance.
Ma, Chao; Ouyang, Jihong; Chen, Hui-Ling; Zhao, Xue-Hua
2014-01-01
A novel hybrid method named SCFW-KELM, which integrates effective subtractive clustering features weighting and a fast classifier kernel-based extreme learning machine (KELM), has been introduced for the diagnosis of PD. In the proposed method, SCFW is used as a data preprocessing tool, which aims at decreasing the variance in features of the PD dataset, in order to further improve the diagnostic accuracy of the KELM classifier. The impact of the type of kernel functions on the performance of KELM has been investigated in detail. The efficiency and effectiveness of the proposed method have been rigorously evaluated against the PD dataset in terms of classification accuracy, sensitivity, specificity, area under the receiver operating characteristic (ROC) curve (AUC), f-measure, and kappa statistics value. Experimental results have demonstrated that the proposed SCFW-KELM significantly outperforms SVM-based, KNN-based, and ELM-based approaches and other methods in the literature and achieved highest classification results reported so far via 10-fold cross validation scheme, with the classification accuracy of 99.49%, the sensitivity of 100%, the specificity of 99.39%, AUC of 99.69%, the f-measure value of 0.9964, and kappa value of 0.9867. Promisingly, the proposed method might serve as a new candidate of powerful methods for the diagnosis of PD with excellent performance. PMID:25484912
Passive RFID Rotation Dimension Reduction via Aggregation
NASA Astrophysics Data System (ADS)
Matthews, Eric
Radio Frequency IDentification (RFID) has applications in object identification, position, and orientation tracking. RFID technology can be applied in hospitals for patient and equipment tracking, stores and warehouses for product tracking, robots for self-localisation, tracking hazardous materials, or locating any other desired object. Efficient and accurate algorithms that perform localisation are required to extract meaningful data beyond simple identification. A Received Signal Strength Indicator (RSSI) is the strength of a received radio frequency signal used to localise passive and active RFID tags. Many factors affect RSSI such as reflections, tag rotation in 3D space, and obstacles blocking line-of-sight. LANDMARC is a statistical method for estimating tag location based on a target tag's similarity to surrounding reference tags. LANDMARC does not take into account the rotation of the target tag. By either aggregating multiple reference tag positions at various rotations, or by determining a rotation value for a newly read tag, we can perform an expected value calculation based on a comparison to the k-most similar training samples via an algorithm called K-Nearest Neighbours (KNN) more accurately. By choosing the average as the aggregation function, we improve the relative accuracy of single-rotation LANDMARC localisation by 10%, and any-rotation localisation by 20%.
Short text sentiment classification based on feature extension and ensemble classifier
NASA Astrophysics Data System (ADS)
Liu, Yang; Zhu, Xie
2018-05-01
With the rapid development of Internet social media, excavating the emotional tendencies of the short text information from the Internet, the acquisition of useful information has attracted the attention of researchers. At present, the commonly used can be attributed to the rule-based classification and statistical machine learning classification methods. Although micro-blog sentiment analysis has made good progress, there still exist some shortcomings such as not highly accurate enough and strong dependence from sentiment classification effect. Aiming at the characteristics of Chinese short texts, such as less information, sparse features, and diverse expressions, this paper considers expanding the original text by mining related semantic information from the reviews, forwarding and other related information. First, this paper uses Word2vec to compute word similarity to extend the feature words. And then uses an ensemble classifier composed of SVM, KNN and HMM to analyze the emotion of the short text of micro-blog. The experimental results show that the proposed method can make good use of the comment forwarding information to extend the original features. Compared with the traditional method, the accuracy, recall and F1 value obtained by this method have been improved.
Identification of Anisomerous Motor Imagery EEG Signals Based on Complex Algorithms
Zhang, Zhiwen; Duan, Feng; Zhou, Xin; Meng, Zixuan
2017-01-01
Motor imagery (MI) electroencephalograph (EEG) signals are widely applied in brain-computer interface (BCI). However, classified MI states are limited, and their classification accuracy rates are low because of the characteristics of nonlinearity and nonstationarity. This study proposes a novel MI pattern recognition system that is based on complex algorithms for classifying MI EEG signals. In electrooculogram (EOG) artifact preprocessing, band-pass filtering is performed to obtain the frequency band of MI-related signals, and then, canonical correlation analysis (CCA) combined with wavelet threshold denoising (WTD) is used for EOG artifact preprocessing. We propose a regularized common spatial pattern (R-CSP) algorithm for EEG feature extraction by incorporating the principle of generic learning. A new classifier combining the K-nearest neighbor (KNN) and support vector machine (SVM) approaches is used to classify four anisomerous states, namely, imaginary movements with the left hand, right foot, and right shoulder and the resting state. The highest classification accuracy rate is 92.5%, and the average classification accuracy rate is 87%. The proposed complex algorithm identification method can significantly improve the identification rate of the minority samples and the overall classification performance. PMID:28874909
A floor-map-aided WiFi/pseudo-odometry integration algorithm for an indoor positioning system.
Wang, Jian; Hu, Andong; Liu, Chunyan; Li, Xin
2015-03-24
This paper proposes a scheme for indoor positioning by fusing floor map, WiFi and smartphone sensor data to provide meter-level positioning without additional infrastructure. A topology-constrained K nearest neighbor (KNN) algorithm based on a floor map layout provides the coordinates required to integrate WiFi data with pseudo-odometry (P-O) measurements simulated using a pedestrian dead reckoning (PDR) approach. One method of further improving the positioning accuracy is to use a more effective multi-threshold step detection algorithm, as proposed by the authors. The "go and back" phenomenon caused by incorrect matching of the reference points (RPs) of a WiFi algorithm is eliminated using an adaptive fading-factor-based extended Kalman filter (EKF), taking WiFi positioning coordinates, P-O measurements and fused heading angles as observations. The "cross-wall" problem is solved based on the development of a floor-map-aided particle filter algorithm by weighting the particles, thereby also eliminating the gross-error effects originating from WiFi or P-O measurements. The performance observed in a field experiment performed on the fourth floor of the School of Environmental Science and Spatial Informatics (SESSI) building on the China University of Mining and Technology (CUMT) campus confirms that the proposed scheme can reliably achieve meter-level positioning.
A Novel Locally Linear KNN Method With Applications to Visual Recognition.
Liu, Qingfeng; Liu, Chengjun
2017-09-01
A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.
An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor.
Xu, He; Ding, Ye; Li, Peng; Wang, Ruchuan; Li, Yizhu
2017-08-05
The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergence navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The Traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K -Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.
2018-01-01
Hyperspectral image classification with a limited number of training samples without loss of accuracy is desirable, as collecting such data is often expensive and time-consuming. However, classifiers trained with limited samples usually end up with a large generalization error. To overcome the said problem, we propose a fuzziness-based active learning framework (FALF), in which we implement the idea of selecting optimal training samples to enhance generalization performance for two different kinds of classifiers, discriminative and generative (e.g. SVM and KNN). The optimal samples are selected by first estimating the boundary of each class and then calculating the fuzziness-based distance between each sample and the estimated class boundaries. Those samples that are at smaller distances from the boundaries and have higher fuzziness are chosen as target candidates for the training set. Through detailed experimentation on three publically available datasets, we showed that when trained with the proposed sample selection framework, both classifiers achieved higher classification accuracy and lower processing time with the small amount of training data as opposed to the case where the training samples were selected randomly. Our experiments demonstrate the effectiveness of our proposed method, which equates favorably with the state-of-the-art methods. PMID:29304512
Han, Te; Jiang, Dongxiang; Zhang, Xiaochen; Sun, Yankui
2017-01-01
Rotating machinery is widely used in industrial applications. With the trend towards more precise and more critical operating conditions, mechanical failures may easily occur. Condition monitoring and fault diagnosis (CMFD) technology is an effective tool to enhance the reliability and security of rotating machinery. In this paper, an intelligent fault diagnosis method based on dictionary learning and singular value decomposition (SVD) is proposed. First, the dictionary learning scheme is capable of generating an adaptive dictionary whose atoms reveal the underlying structure of raw signals. Essentially, dictionary learning is employed as an adaptive feature extraction method regardless of any prior knowledge. Second, the singular value sequence of learned dictionary matrix is served to extract feature vector. Generally, since the vector is of high dimensionality, a simple and practical principal component analysis (PCA) is applied to reduce dimensionality. Finally, the K-nearest neighbor (KNN) algorithm is adopted for identification and classification of fault patterns automatically. Two experimental case studies are investigated to corroborate the effectiveness of the proposed method in intelligent diagnosis of rotating machinery faults. The comparison analysis validates that the dictionary learning-based matrix construction approach outperforms the mode decomposition-based methods in terms of capacity and adaptability for feature extraction. PMID:28346385
Group discriminatory power of handwritten characters
NASA Astrophysics Data System (ADS)
Tomai, Catalin I.; Kshirsagar, Devika M.; Srihari, Sargur N.
2003-12-01
Using handwritten characters we address two questions (i) what is the group identification performance of different alphabets (upper and lower case) and (ii) what are the best characters for the verification task (same writer/different writer discrimination) knowing demographic information about the writer such as ethnicity, age or sex. The Bhattacharya distance is used to rank different characters by their group discriminatory power and the k-nn classifier to measure the individual performance of characters for group identification. Given the tasks of identifying the correct gender/age/ethnicity or handedness, the accumulated performance of characters varies between 65% and 85%.
An unbalanced spectra classification method based on entropy
NASA Astrophysics Data System (ADS)
Liu, Zhong-bao; Zhao, Wen-juan
2017-05-01
How to solve the problem of distinguishing the minority spectra from the majority of the spectra is quite important in astronomy. In view of this, an unbalanced spectra classification method based on entropy (USCM) is proposed in this paper to deal with the unbalanced spectra classification problem. USCM greatly improves the performances of the traditional classifiers on distinguishing the minority spectra as it takes the data distribution into consideration in the process of classification. However, its time complexity is exponential with the training size, and therefore, it can only deal with the problem of small- and medium-scale classification. How to solve the large-scale classification problem is quite important to USCM. It can be easily obtained by mathematical computation that the dual form of USCM is equivalent to the minimum enclosing ball (MEB), and core vector machine (CVM) is introduced, USCM based on CVM is proposed to deal with the large-scale classification problem. Several comparative experiments on the 4 subclasses of K-type spectra, 3 subclasses of F-type spectra and 3 subclasses of G-type spectra from Sloan Digital Sky Survey (SDSS) verify USCM and USCM based on CVM perform better than kNN (k nearest neighbor) and SVM (support vector machine) in dealing with the problem of rare spectra mining respectively on the small- and medium-scale datasets and the large-scale datasets.
Zhuang, Qianfen; Cao, Wei; Ni, Yongnian; Wang, Yong
2018-08-01
Most of the conventional multidimensional differential sensors currently need at least two-step fabrication, namely synthesis of probe(s) and identification of multiple analytes by mixing of analytes with probe(s), and were conducted using multiple sensing elements or several devices. In the study, we chose five different nucleobases (adenine, cytosine, guanine, thymine, and uracil) as model analytes, and found that under hydrothermal conditions, sodium citrate could react directly with various nucleobases to yield different nitrogen-doped carbon nanodots (CDs). The CDs synthesized from different nucleobases exhibited different fluorescent properties, leading to their respective characteristic fluorescence spectra. Hence, we combined the fluorescence spectra of the CDs with advanced chemometrics like principle component analysis (PCA), hierarchical cluster analysis (HCA), K-nearest neighbor (KNN) and soft independent modeling of class analogy (SIMCA), to present a conceptually novel "synthesis-identification integration" strategy to construct a multidimensional differential sensor for nucleobase discrimination. Single-wavelength excitation fluorescence spectral data, single-wavelength emission fluorescence spectral data, and fluorescence Excitation-Emission Matrices (EEMs) of the CDs were respectively used as input data of the differential sensor. The results showed that the discrimination ability of the multidimensional differential sensor with EEM data set as input data was superior to those with single-wavelength excitation/emission fluorescence data set, suggesting that increasing the number of the data input could improve the discrimination power. Two supervised pattern recognition methods, namely KNN and SIMCA, correctly identified the five nucleobases with a classification accuracy of 100%. The proposed "synthesis-identification integration" strategy together with a multidimensional array of experimental data holds great promise in the construction of differential sensors. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.
2016-03-01
Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from abnormal behavior that occurs in the case of a novelty. This will be evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determine if our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest point. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >=4 abnormal behaviour of the classifier, having a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings. The latter being predominantly vessels, cerebral falx, brain mask errors, choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.
Hollings, Tracey; Robinson, Andrew; van Andel, Mary; Jewell, Chris; Burgman, Mark
2017-01-01
In livestock industries, reliable up-to-date spatial distribution and abundance records for animals and farms are critical for governments to manage and respond to risks. Yet few, if any, countries can afford to maintain comprehensive, up-to-date agricultural census data. Statistical modelling can be used as a proxy for such data but comparative modelling studies have rarely been undertaken for livestock populations. Widespread species, including livestock, can be difficult to model effectively due to complex spatial distributions that do not respond predictably to environmental gradients. We assessed three machine learning species distribution models (SDM) for their capacity to estimate national-level farm animal population numbers within property boundaries: boosted regression trees (BRT), random forests (RF) and K-nearest neighbour (K-NN). The models were built from a commercial livestock database and environmental and socio-economic predictor data for New Zealand. We used two spatial data stratifications to test (i) support for decision making in an emergency response situation, and (ii) the ability for the models to predict to new geographic regions. The performance of the three model types varied substantially, but the best performing models showed very high accuracy. BRTs had the best performance overall, but RF performed equally well or better in many simulations; RFs were superior at predicting livestock numbers for all but very large commercial farms. K-NN performed poorly relative to both RF and BRT in all simulations. The predictions of both multi species and single species models for farms and within hypothetical quarantine zones were very close to observed data. These models are generally applicable for livestock estimation with broad applications in disease risk modelling, biosecurity, policy and planning.
Robinson, Andrew; van Andel, Mary; Jewell, Chris; Burgman, Mark
2017-01-01
In livestock industries, reliable up-to-date spatial distribution and abundance records for animals and farms are critical for governments to manage and respond to risks. Yet few, if any, countries can afford to maintain comprehensive, up-to-date agricultural census data. Statistical modelling can be used as a proxy for such data but comparative modelling studies have rarely been undertaken for livestock populations. Widespread species, including livestock, can be difficult to model effectively due to complex spatial distributions that do not respond predictably to environmental gradients. We assessed three machine learning species distribution models (SDM) for their capacity to estimate national-level farm animal population numbers within property boundaries: boosted regression trees (BRT), random forests (RF) and K-nearest neighbour (K-NN). The models were built from a commercial livestock database and environmental and socio-economic predictor data for New Zealand. We used two spatial data stratifications to test (i) support for decision making in an emergency response situation, and (ii) the ability for the models to predict to new geographic regions. The performance of the three model types varied substantially, but the best performing models showed very high accuracy. BRTs had the best performance overall, but RF performed equally well or better in many simulations; RFs were superior at predicting livestock numbers for all but very large commercial farms. K-NN performed poorly relative to both RF and BRT in all simulations. The predictions of both multi species and single species models for farms and within hypothetical quarantine zones were very close to observed data. These models are generally applicable for livestock estimation with broad applications in disease risk modelling, biosecurity, policy and planning. PMID:28837685
Classification of interstitial lung disease patterns with topological texture features
NASA Astrophysics Data System (ADS)
Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel
2010-03-01
Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honey-combing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
Votano, Joseph R; Parham, Marc; Hall, L Mark; Hall, Lowell H; Kier, Lemont B; Oloff, Scott; Tropsha, Alexander
2006-11-30
Four modeling techniques, using topological descriptors to represent molecular structure, were employed to produce models of human serum protein binding (% bound) on a data set of 1008 experimental values, carefully screened from publicly available sources. To our knowledge, this data is the largest set on human serum protein binding reported for QSAR modeling. The data was partitioned into a training set of 808 compounds and an external validation test set of 200 compounds. Partitioning was accomplished by clustering the compounds in a structure descriptor space so that random sampling of 20% of the whole data set produced an external test set that is a good representative of the training set with respect to both structure and protein binding values. The four modeling techniques include multiple linear regression (MLR), artificial neural networks (ANN), k-nearest neighbors (kNN), and support vector machines (SVM). With the exception of the MLR model, the ANN, kNN, and SVM QSARs were ensemble models. Training set correlation coefficients and mean absolute error ranged from r2=0.90 and MAE=7.6 for ANN to r2=0.61 and MAE=16.2 for MLR. Prediction results from the validation set yielded correlation coefficients and mean absolute errors which ranged from r2=0.70 and MAE=14.1 for ANN to a low of r2=0.59 and MAE=18.3 for the SVM model. Structure descriptors that contribute significantly to the models are discussed and compared with those found in other published models. For the ANN model, structure descriptor trends with respect to their affects on predicted protein binding can assist the chemist in structure modification during the drug design process.
NASA Astrophysics Data System (ADS)
Liu, Zhen; Ren, Weijun; Peng, Ping; Guo, Shaobo; Lu, Teng; Liu, Yun; Dong, Xianlin; Wang, Genshui
2018-04-01
Both high pyroelectric properties and good temperature stability of ferroelectric materials are desirable when used for applications in infrared thermal detectors. In this work, we report lead-free ternary 0.97(0.99Bi0.5Na0.5TiO3-0.01BiAlO3)-0.03K0.5Na0.5NbO3 (BNT-BA-KNN) ceramics, which not only exhibits a large pyroelectric coefficient (p ˜ 3.7 × 10-8 C cm-2 K-1) and figures of merit (Fi, Fv, and Fd) but also shows excellent thermal stable properties. At room temperature, Fi, Fv, and Fd are determined as high as 1.32 × 10-10 m/V, 2.89 × 10-2 m2/C, and 1.15 × 10-5 Pa-1/2 at 1 kHz and 1.32 × 10-10 m/V, 2.70 × 10-2 m2/C, and 1.09 × 10-5 Pa-1/2 at 20 Hz, respectively. During the temperature range of RT to 85 °C, the achieved p, Fi, Fv, and Fd do not vary too much. The high depolarization temperature and the undispersed ferroelectric-ergodic relaxor phase transition with a sharp pyroelectric coefficient peak value of ˜400 × 10-8 C cm-2 K-1 are suggested to be responsible for this thermal stability, which ensures reliable actual operation. The results reveal the BNT-BA-KNN ceramics as promising lead-free candidates for infrared thermal detector applications.
Fast quantum nD Fourier and radon transforms
NASA Astrophysics Data System (ADS)
Labunets, Valeri G.; Labunets-Rundblad, Ekaterina V.; Astola, Jaakko T.
2001-07-01
Fast Classical and quantum algorithms are introduced for a wide class of non-separable nD discrete unitary K- transforms(DKT)KNn. They require a number of 1D DKT Kn smaller than in the Cooley-Tukey radix-p FFT-type approach. The method utilizes a decomposition of the nDK- transform into a product of original nD discrete Radon Transform and of a family parallel/independ 1DK-transforms. If the nDK-transform has a separable kernel, that again in this case our approach leads to decrease of multiplicative complexity by factor of n compared to the tow/column separable Cooley-Tukey p-radix approach.
Pan, Yongke; Niu, Wenjia
2017-01-01
Semisupervised Discriminant Analysis (SDA) is a semisupervised dimensionality reduction algorithm, which can easily resolve the out-of-sample problem. Relative works usually focus on the geometric relationships of data points, which are not obvious, to enhance the performance of SDA. Different from these relative works, the regularized graph construction is researched here, which is important in the graph-based semisupervised learning methods. In this paper, we propose a novel graph for Semisupervised Discriminant Analysis, which is called combined low-rank and k-nearest neighbor (LRKNN) graph. In our LRKNN graph, we map the data to the LR feature space and then the kNN is adopted to satisfy the algorithmic requirements of SDA. Since the low-rank representation can capture the global structure and the k-nearest neighbor algorithm can maximally preserve the local geometrical structure of the data, the LRKNN graph can significantly improve the performance of SDA. Extensive experiments on several real-world databases show that the proposed LRKNN graph is an efficient graph constructor, which can largely outperform other commonly used baselines. PMID:28316616
Prediction of carbonate rock type from NMR responses using data mining techniques
NASA Astrophysics Data System (ADS)
Gonçalves, Eduardo Corrêa; da Silva, Pablo Nascimento; Silveira, Carla Semiramis; Carneiro, Giovanna; Domingues, Ana Beatriz; Moss, Adam; Pritchard, Tim; Plastino, Alexandre; Azeredo, Rodrigo Bagueira de Vasconcellos
2017-05-01
Recent studies have indicated that the accurate identification of carbonate rock types in a reservoir can be employed as a preliminary step to enhance the effectiveness of petrophysical property modeling. Furthermore, rock typing activity has been shown to be of key importance in several steps of formation evaluation, such as the study of sedimentary series, reservoir zonation and well-to-well correlation. In this paper, a methodology based exclusively on the analysis of 1H-NMR (Nuclear Magnetic Resonance) relaxation responses - using data mining algorithms - is evaluated to perform the automatic classification of carbonate samples according to their rock type. We analyze the effectiveness of six different classification algorithms (k-NN, Naïve Bayes, C4.5, Random Forest, SMO and Multilayer Perceptron) and two data preprocessing strategies (discretization and feature selection). The dataset used in this evaluation is formed by 78 1H-NMR T2 distributions of fully brine-saturated rock samples from six different rock type classes. The experiments reveal that the combination of preprocessing strategies with classification algorithms is able to achieve a prediction accuracy of 97.4%.
NASA Astrophysics Data System (ADS)
Díaz-Ayil, Gilberto; Amouroux, Marine; Clanché, Fabien; Granjon, Yves; Blondel, Walter C. P. M.
2009-07-01
Spatially-resolved bimodal spectroscopy (multiple AutoFluorescence AF excitation and Diffuse Reflectance DR), was used in vivo to discriminate various healthy and precancerous skin stages in a pre-clinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A specific data preprocessing scheme was applied to intensity spectra (filtering, spectral correction and intensity normalization), and several sets of spectral characteristics were automatically extracted and selected based on their discrimination power, statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM), in order to compare diagnostic performance of each method. Diagnostic performance was studied and assessed in terms of Sensibility (Se) and Specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibres distances and of the numbers of principal components, such that: Se and Sp ~ 100% when discriminating CH vs. others; Sp ~ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ~ 74% and Se ~ 63% for AH vs. D.
NASA Astrophysics Data System (ADS)
He, Runnan; Wang, Kuanquan; Li, Qince; Yuan, Yongfeng; Zhao, Na; Liu, Yang; Zhang, Henggui
2017-12-01
Cardiovascular diseases are associated with high morbidity and mortality. However, it is still a challenge to diagnose them accurately and efficiently. Electrocardiogram (ECG), a bioelectrical signal of the heart, provides crucial information about the dynamical functions of the heart, playing an important role in cardiac diagnosis. As the QRS complex in ECG is associated with ventricular depolarization, therefore, accurate QRS detection is vital for interpreting ECG features. In this paper, we proposed a real-time, accurate, and effective algorithm for QRS detection. In the algorithm, a proposed preprocessor with a band-pass filter was first applied to remove baseline wander and power-line interference from the signal. After denoising, a method combining K-Nearest Neighbor (KNN) and Particle Swarm Optimization (PSO) was used for accurate QRS detection in ECGs with different morphologies. The proposed algorithm was tested and validated using 48 ECG records from MIT-BIH arrhythmia database (MITDB), achieved a high averaged detection accuracy, sensitivity and positive predictivity of 99.43, 99.69, and 99.72%, respectively, indicating a notable improvement to extant algorithms as reported in literatures.
Segmentation for the enhancement of microcalcifications in digital mammograms.
Milosevic, Marina; Jankovic, Dragan; Peulic, Aleksandar
2014-01-01
Microcalcification clusters appear as groups of small, bright particles with arbitrary shapes on mammographic images. They are the earliest sign of breast carcinomas and their detection is the key for improving breast cancer prognosis. But due to the low contrast of microcalcifications and same properties as noise, it is difficult to detect microcalcification. This work is devoted to developing a system for the detection of microcalcification in digital mammograms. After removing noise from mammogram using the Discrete Wavelet Transformation (DWT), we first selected the region of interest (ROI) in order to demarcate the breast region on a mammogram. Segmenting region of interest represents one of the most important stages of mammogram processing procedure. The proposed segmentation method is based on a filtering using the Sobel filter. This process will identify the significant pixels, that belong to edges of microcalcifications. Microcalcifications were detected by increasing the contrast of the images obtained by applying Sobel operator. In order to confirm the effectiveness of this microcalcification segmentation method, the Support Vector Machine (SVM) and k-Nearest Neighborhood (k-NN) algorithm are employed for the classification task using cross-validation technique.
Wang, Shunfang; Liu, Shuhui
2015-12-19
An effective representation of a protein sequence plays a crucial role in protein sub-nuclear localization. The existing representations, such as dipeptide composition (DipC), pseudo-amino acid composition (PseAAC) and position specific scoring matrix (PSSM), are insufficient to represent protein sequence due to their single perspectives. Thus, this paper proposes two fusion feature representations of DipPSSM and PseAAPSSM to integrate PSSM with DipC and PseAAC, respectively. When constructing each fusion representation, we introduce the balance factors to value the importance of its components. The optimal values of the balance factors are sought by genetic algorithm. Due to the high dimensionality of the proposed representations, linear discriminant analysis (LDA) is used to find its important low dimensional structure, which is essential for classification and location prediction. The numerical experiments on two public datasets with KNN classifier and cross-validation tests showed that in terms of the common indexes of sensitivity, specificity, accuracy and MCC, the proposed fusing representations outperform the traditional representations in protein sub-nuclear localization, and the representation treated by LDA outperforms the untreated one.
NASA Astrophysics Data System (ADS)
Garciá-Arteaga, Juan D.; Corredor, Germán.; Wang, Xiangxue; Velcheti, Vamsidhar; Madabhushi, Anant; Romero, Eduardo
2017-11-01
Tumor-infiltrating lymphocytes occurs when various classes of white blood cells migrate from the blood stream towards the tumor, infiltrating it. The presence of TIL is predictive of the response of the patient to therapy. In this paper, we show how the automatic detection of lymphocytes in digital H and E histopathological images and the quantitative evaluation of the global lymphocyte configuration, evaluated through global features extracted from non-parametric graphs, constructed from the lymphocytes' detected positions, can be correlated to the patient's outcome in early-stage non-small cell lung cancer (NSCLC). The method was assessed on a tissue microarray cohort composed of 63 NSCLC cases. From the evaluated graphs, minimum spanning trees and K-nn showed the highest predictive ability, yielding F1 Scores of 0.75 and 0.72 and accuracies of 0.67 and 0.69, respectively. The predictive power of the proposed methodology indicates that graphs may be used to develop objective measures of the infiltration grade of tumors, which can, in turn, be used by pathologists to improve the decision making and treatment planning processes.
Joutsijoki, Henry; Haponen, Markus; Rasku, Jyrki; Aalto-Setälä, Katriina; Juhola, Martti
2016-01-01
The focus of this research is on automated identification of the quality of human induced pluripotent stem cell (iPSC) colony images. iPS cell technology is a contemporary method by which the patient's cells are reprogrammed back to stem cells and are differentiated to any cell type wanted. iPS cell technology will be used in future to patient specific drug screening, disease modeling, and tissue repairing, for instance. However, there are technical challenges before iPS cell technology can be used in practice and one of them is quality control of growing iPSC colonies which is currently done manually but is unfeasible solution in large-scale cultures. The monitoring problem returns to image analysis and classification problem. In this paper, we tackle this problem using machine learning methods such as multiclass Support Vector Machines and several baseline methods together with Scaled Invariant Feature Transformation based features. We perform over 80 test arrangements and do a thorough parameter value search. The best accuracy (62.4%) for classification was obtained by using a k-NN classifier showing improved accuracy compared to earlier studies.
Developing a radiomics framework for classifying non-small cell lung carcinoma subtypes
NASA Astrophysics Data System (ADS)
Yu, Dongdong; Zang, Yali; Dong, Di; Zhou, Mu; Gevaert, Olivier; Fang, Mengjie; Shi, Jingyun; Tian, Jie
2017-03-01
Patient-targeted treatment of non-small cell lung carcinoma (NSCLC) has been well documented according to the histologic subtypes over the past decade. In parallel, recent development of quantitative image biomarkers has recently been highlighted as important diagnostic tools to facilitate histological subtype classification. In this study, we present a radiomics analysis that classifies the adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). We extract 52-dimensional, CT-based features (7 statistical features and 45 image texture features) to represent each nodule. We evaluate our approach on a clinical dataset including 324 ADCs and 110 SqCCs patients with CT image scans. Classification of these features is performed with four different machine-learning classifiers including Support Vector Machines with Radial Basis Function kernel (RBF-SVM), Random forest (RF), K-nearest neighbor (KNN), and RUSBoost algorithms. To improve the classifiers' performance, optimal feature subset is selected from the original feature set by using an iterative forward inclusion and backward eliminating algorithm. Extensive experimental results demonstrate that radiomics features achieve encouraging classification results on both complete feature set (AUC=0.89) and optimal feature subset (AUC=0.91).
Wang, Shunfang; Liu, Shuhui
2015-01-01
An effective representation of a protein sequence plays a crucial role in protein sub-nuclear localization. The existing representations, such as dipeptide composition (DipC), pseudo-amino acid composition (PseAAC) and position specific scoring matrix (PSSM), are insufficient to represent protein sequence due to their single perspectives. Thus, this paper proposes two fusion feature representations of DipPSSM and PseAAPSSM to integrate PSSM with DipC and PseAAC, respectively. When constructing each fusion representation, we introduce the balance factors to value the importance of its components. The optimal values of the balance factors are sought by genetic algorithm. Due to the high dimensionality of the proposed representations, linear discriminant analysis (LDA) is used to find its important low dimensional structure, which is essential for classification and location prediction. The numerical experiments on two public datasets with KNN classifier and cross-validation tests showed that in terms of the common indexes of sensitivity, specificity, accuracy and MCC, the proposed fusing representations outperform the traditional representations in protein sub-nuclear localization, and the representation treated by LDA outperforms the untreated one. PMID:26703574
Khashan, Raed; Zheng, Weifan; Tropsha, Alexander
2014-03-01
We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graph. Fast Frequent Subgraph Mining (FFSM) is used to find chemical-fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with binary target property including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the model validated predictive power. The classification accuracies of models for both training and test sets for all datasets exceeded 75 %, and the accuracy for the external validation sets exceeded 72 %. The model accuracies were comparable or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Image-based Analysis of Emotional Facial Expressions in Full Face Transplants.
Bedeloglu, Merve; Topcu, Çagdas; Akgul, Arzu; Döger, Ela Naz; Sever, Refik; Ozkan, Ozlenen; Ozkan, Omer; Uysal, Hilmi; Polat, Ovunc; Çolak, Omer Halil
2018-01-20
In this study, it is aimed to determine the degree of the development in emotional expression of full face transplant patients from photographs. Hence, a rehabilitation process can be planned according to the determination of degrees as a later work. As envisaged, in full face transplant cases, the determination of expressions can be confused or cannot be achieved as the healthy control group. In order to perform image-based analysis, a control group consist of 9 healthy males and 2 full-face transplant patients participated in the study. Appearance-based Gabor Wavelet Transform (GWT) and Local Binary Pattern (LBP) methods are adopted for recognizing neutral and 6 emotional expressions which consist of angry, scared, happy, hate, confused and sad. Feature extraction was carried out by using both methods and combination of these methods serially. In the performed expressions, the extracted features of the most distinct zones in the facial area where the eye and mouth region, have been used to classify the emotions. Also, the combination of these region features has been used to improve classifier performance. Control subjects and transplant patients' ability to perform emotional expressions have been determined with K-nearest neighbor (KNN) classifier with region-specific and method-specific decision stages. The results have been compared with healthy group. It has been observed that transplant patients don't reflect some emotional expressions. Also, there were confusions among expressions.
Thermography based diagnosis of ruptured anterior cruciate ligament (ACL) in canines
NASA Astrophysics Data System (ADS)
Lama, Norsang; Umbaugh, Scott E.; Mishra, Deependra; Dahal, Rohini; Marino, Dominic J.; Sackman, Joseph
2016-09-01
Anterior cruciate ligament (ACL) rupture in canines is a common orthopedic injury in veterinary medicine. Veterinarians use both imaging and non-imaging methods to diagnose the disease. Common imaging methods such as radiography, computed tomography (CT scan) and magnetic resonance imaging (MRI) have some disadvantages: expensive setup, high dose of radiation, and time-consuming. In this paper, we present an alternative diagnostic method based on feature extraction and pattern classification (FEPC) to diagnose abnormal patterns in ACL thermograms. The proposed method was experimented with a total of 30 thermograms for each camera view (anterior, lateral and posterior) including 14 disease and 16 non-disease cases provided from Long Island Veterinary Specialists. The normal and abnormal patterns in thermograms are analyzed in two steps: feature extraction and pattern classification. Texture features based on gray level co-occurrence matrices (GLCM), histogram features and spectral features are extracted from the color normalized thermograms and the computed feature vectors are applied to Nearest Neighbor (NN) classifier, K-Nearest Neighbor (KNN) classifier and Support Vector Machine (SVM) classifier with leave-one-out validation method. The algorithm gives the best classification success rate of 86.67% with a sensitivity of 85.71% and a specificity of 87.5% in ACL rupture detection using NN classifier for the lateral view and Norm-RGB-Lum color normalization method. Our results show that the proposed method has the potential to detect ACL rupture in canines.
CFS-SMO based classification of breast density using multiple texture models.
Sharma, Vipul; Singh, Sukhwinder
2014-06-01
It is highly acknowledged in the medical profession that density of breast tissue is a major cause for the growth of breast cancer. Increased breast density was found to be linked with an increased risk of breast cancer growth, as high density makes it difficult for radiologists to see an abnormality which leads to false negative results. Therefore, there is need for the development of highly efficient techniques for breast tissue classification based on density. This paper presents a hybrid scheme for classification of fatty and dense mammograms using correlation-based feature selection (CFS) and sequential minimal optimization (SMO). In this work, texture analysis is done on a region of interest selected from the mammogram. Various texture models have been used to quantify the texture of parenchymal patterns of breast. To reduce the dimensionality and to identify the features which differentiate between breast tissue densities, CFS is used. Finally, classification is performed using SMO. The performance is evaluated using 322 images of mini-MIAS database. Highest accuracy of 96.46% is obtained for two-class problem (fatty and dense) using proposed approach. Performance of selected features by CFS is also evaluated by Naïve Bayes, Multilayer Perceptron, RBF Network, J48 and kNN classifier. The proposed CFS-SMO method outperforms all other classifiers giving a sensitivity of 100%. This makes it suitable to be taken as a second opinion in classifying breast tissue density.
A Floor-Map-Aided WiFi/Pseudo-Odometry Integration Algorithm for an Indoor Positioning System
Wang, Jian; Hu, Andong; Liu, Chunyan; Li, Xin
2015-01-01
This paper proposes a scheme for indoor positioning by fusing floor map, WiFi and smartphone sensor data to provide meter-level positioning without additional infrastructure. A topology-constrained K nearest neighbor (KNN) algorithm based on a floor map layout provides the coordinates required to integrate WiFi data with pseudo-odometry (P-O) measurements simulated using a pedestrian dead reckoning (PDR) approach. One method of further improving the positioning accuracy is to use a more effective multi-threshold step detection algorithm, as proposed by the authors. The “go and back” phenomenon caused by incorrect matching of the reference points (RPs) of a WiFi algorithm is eliminated using an adaptive fading-factor-based extended Kalman filter (EKF), taking WiFi positioning coordinates, P-O measurements and fused heading angles as observations. The “cross-wall” problem is solved based on the development of a floor-map-aided particle filter algorithm by weighting the particles, thereby also eliminating the gross-error effects originating from WiFi or P-O measurements. The performance observed in a field experiment performed on the fourth floor of the School of Environmental Science and Spatial Informatics (SESSI) building on the China University of Mining and Technology (CUMT) campus confirms that the proposed scheme can reliably achieve meter-level positioning. PMID:25811224
NASA Astrophysics Data System (ADS)
Qur’ania, A.; Sarinah, I.
2018-03-01
People often wrong in knowing the type of jasmine by just looking at the white color of the jasmine, while not all white flowers including jasmine and not all jasmine flowers have white. There is a jasmine that is yellow and there is a jasmine that is white and purple.The aim of this research is to identify Jasmine flower (Jasminum sp.) based on the shape of the flower image-based using Sobel edge detection and k-Nearest Neighbor. Edge detection is used to detect the type of flower from the flower shape. Edge detection aims to improve the appearance of the border of a digital image. While k-Nearest Neighbor method is used to classify the classification of test objects into classes that have neighbouring properties closest to the object of training. The data used in this study are three types of jasmine namely jasmine white (Jasminum sambac), jasmine gambir (Jasminum pubescens), and jasmine japan (Pseuderanthemum reticulatum). Testing of jasmine flower image resized 50 × 50 pixels, 100 × 100 pixels, 150 × 150 pixels yields an accuracy of 84%. Tests on distance values of the k-NN method with spacing 5, 10 and 15 resulted in different accuracy rates for 5 and 10 closest distances yielding the same accuracy rate of 84%, for the 15 shortest distance resulted in a small accuracy of 65.2%.
Predicting hepatotoxicity using ToxCast in vitro bioactivity and ...
Background: The U.S. EPA ToxCastTM program is screening thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors then used supervised machine learning to predict their hepatotoxic effects.Results: A set of 677 chemicals were represented by 711 in vitro bioactivity descriptors (from ToxCast assays), 4,376 chemical structure descriptors (from QikProp, OpenBabel, PADEL, and PubChem), and three hepatotoxicity categories (from animal studies). Hepatotoxicants were defined by rat liver histopathology observed after chronic chemical testing and grouped into hypertrophy (161), injury (101) and proliferative lesions (99). Classifiers were built using six machine learning algorithms: linear discriminant analysis (LDA), Naïve Bayes (NB), support vector classification (SVM), classification and regression trees (CART), k-nearest neighbors (KNN) and an ensemble of classifiers (ENSMB). Classifiers of hepatotoxicity were built using chemical structure, ToxCast bioactivity, and a hybrid representation. Predictive performance was evaluated using 10-fold cross-validation testing and in-loop, filter-based, feature subset selection. Hybrid classifiers had the best balanced accuracy for predicting hypertrophy (0.78±0.08), injury (0.73±0.10) and proliferative lesions (0.72±0.09). Though chemical and bioactivity class
Alikin, Denis; Turygin, Anton; Kholkin, Andrei; Shur, Vladimir
2017-01-01
Recent advances in the development of novel methods for the local characterization of ferroelectric domains open up new opportunities not only to image, but also to control and to create desired domain configurations (domain engineering). The morphotropic and polymorphic phase boundaries that are frequently used to increase the electromechanical and dielectric performance of ferroelectric ceramics have a tremendous effect on the domain structure, which can serve as a signature of complex polarization states and link local and macroscopic piezoelectric and dielectric responses. This is especially important for the study of lead-free ferroelectric ceramics, which is currently replacing traditional lead-containing materials, and great efforts are devoted to increasing their performance to match that of lead zirconate titanate (PZT). In this work, we provide a short overview of the recent progress in the imaging of domain structure in two major families of ceramic lead-free systems based on BiFeO3 (BFO) and (Ka0.5Na0.5)NbO3 (KNN). This can be used as a guideline for the understanding of domain processes in lead-free piezoelectric ceramics and provide further insight into the mechanisms of structure–property relationship in these technologically important material families. PMID:28772408
A False Alarm Reduction Method for a Gas Sensor Based Electronic Nose
Rahman, Mohammad Mizanur; Suksompong, Prapun; Toochinda, Pisanu; Taparugssanagorn, Attaphongse
2017-01-01
Electronic noses (E-Noses) are becoming popular for food and fruit quality assessment due to their robustness and repeated usability without fatigue, unlike human experts. An E-Nose equipped with classification algorithms and having open ended classification boundaries such as the k-nearest neighbor (k-NN), support vector machine (SVM), and multilayer perceptron neural network (MLPNN), are found to suffer from false classification errors of irrelevant odor data. To reduce false classification and misclassification errors, and to improve correct rejection performance; algorithms with a hyperspheric boundary, such as a radial basis function neural network (RBFNN) and generalized regression neural network (GRNN) with a Gaussian activation function in the hidden layer should be used. The simulation results presented in this paper show that GRNN has more correct classification efficiency and false alarm reduction capability compared to RBFNN. As the design of a GRNN and RBFNN is complex and expensive due to large numbers of neuron requirements, a simple hyperspheric classification method based on minimum, maximum, and mean (MMM) values of each class of the training dataset was presented. The MMM algorithm was simple and found to be fast and efficient in correctly classifying data of training classes, and correctly rejecting data of extraneous odors, and thereby reduced false alarms. PMID:28895910
Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar
2018-06-07
Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
Facial expression identification using 3D geometric features from Microsoft Kinect device
NASA Astrophysics Data System (ADS)
Han, Dongxu; Al Jawad, Naseer; Du, Hongbo
2016-05-01
Facial expression identification is an important part of face recognition and closely related to emotion detection from face images. Various solutions have been proposed in the past using different types of cameras and features. Microsoft Kinect device has been widely used for multimedia interactions. More recently, the device has been increasingly deployed for supporting scientific investigations. This paper explores the effectiveness of using the device in identifying emotional facial expressions such as surprise, smile, sad, etc. and evaluates the usefulness of 3D data points on a face mesh structure obtained from the Kinect device. We present a distance-based geometric feature component that is derived from the distances between points on the face mesh and selected reference points in a single frame. The feature components extracted across a sequence of frames starting and ending by neutral emotion represent a whole expression. The feature vector eliminates the need for complex face orientation correction, simplifying the feature extraction process and making it more efficient. We applied the kNN classifier that exploits a feature component based similarity measure following the principle of dynamic time warping to determine the closest neighbors. Preliminary tests on a small scale database of different facial expressions show promises of the newly developed features and the usefulness of the Kinect device in facial expression identification.
Carvalho, Luis Felipe C. S.; Nogueira, Marcelo Saito; Neto, Lázaro P. M.; Bhattacharjee, Tanmoy T.; Martin, Airton A.
2017-01-01
Most oral injuries are diagnosed by histopathological analysis of a biopsy, which is an invasive procedure and does not give immediate results. On the other hand, Raman spectroscopy is a real time and minimally invasive analytical tool with potential for the diagnosis of diseases. The potential for diagnostics can be improved by data post-processing. Hence, this study aims to evaluate the performance of preprocessing steps and multivariate analysis methods for the classification of normal tissues and pathological oral lesion spectra. A total of 80 spectra acquired from normal and abnormal tissues using optical fiber Raman-based spectroscopy (OFRS) were subjected to PCA preprocessing in the z-scored data set, and the KNN (K-nearest neighbors), J48 (unpruned C4.5 decision tree), RBF (radial basis function), RF (random forest), and MLP (multilayer perceptron) classifiers at WEKA software (Waikato environment for knowledge analysis), after area normalization or maximum intensity normalization. Our results suggest the best classification was achieved by using maximum intensity normalization followed by MLP. Based on these results, software for automated analysis can be generated and validated using larger data sets. This would aid quick comprehension of spectroscopic data and easy diagnosis by medical practitioners in clinical settings. PMID:29188115
Carvalho, Luis Felipe C S; Nogueira, Marcelo Saito; Neto, Lázaro P M; Bhattacharjee, Tanmoy T; Martin, Airton A
2017-11-01
Most oral injuries are diagnosed by histopathological analysis of a biopsy, which is an invasive procedure and does not give immediate results. On the other hand, Raman spectroscopy is a real time and minimally invasive analytical tool with potential for the diagnosis of diseases. The potential for diagnostics can be improved by data post-processing. Hence, this study aims to evaluate the performance of preprocessing steps and multivariate analysis methods for the classification of normal tissues and pathological oral lesion spectra. A total of 80 spectra acquired from normal and abnormal tissues using optical fiber Raman-based spectroscopy (OFRS) were subjected to PCA preprocessing in the z-scored data set, and the KNN (K-nearest neighbors), J48 (unpruned C4.5 decision tree), RBF (radial basis function), RF (random forest), and MLP (multilayer perceptron) classifiers at WEKA software (Waikato environment for knowledge analysis), after area normalization or maximum intensity normalization. Our results suggest the best classification was achieved by using maximum intensity normalization followed by MLP. Based on these results, software for automated analysis can be generated and validated using larger data sets. This would aid quick comprehension of spectroscopic data and easy diagnosis by medical practitioners in clinical settings.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mao, Peiyuan; Urry, C. Megan
We investigate a sample of 622 blazars with measured fluxes at 12 wavebands across the radio-to-gamma-ray spectrum but without spectroscopic or photometric redshifts. This sample includes hundreds of sources with newly analyzed X-ray spectra reported here. From the synchrotron peak frequencies, estimated by fitting the broadband spectral energy distributions (SEDs), we find that the fraction of high-synchrotron-peaked blazars in these 622 sources is roughly the same as in larger samples of blazars that do have redshifts. We characterize the no-redshift blazars using their infrared colors, which lie in the distinct locus called the WISE blazar strip, then estimate their redshiftsmore » using a KNN regression based on the redshifts of the closest blazars in the WISE color–color plot. Finally, using randomly drawn values from plausible redshift distributions, we simulate the SEDs of these blazars and compare them to known blazar SEDs. Based on all these considerations, we conclude that blazars without redshift estimates are unlikely to be high-luminosity, high-synchrotron-peaked objects, which had been suggested in order to explain the “blazar sequence”—an observed trend of SED shape with luminosity—as a selection effect. Instead, the observed properties of no-redshift blazars are compatible with a causal connection between jet power and electron cooling, i.e., a true blazar sequence.« less
Diagnosis of multiple sclerosis from EEG signals using nonlinear methods.
Torabi, Ali; Daliri, Mohammad Reza; Sabzposhan, Seyyed Hojjat
2017-12-01
EEG signals have essential and important information about the brain and neural diseases. The main purpose of this study is classifying two groups of healthy volunteers and Multiple Sclerosis (MS) patients using nonlinear features of EEG signals while performing cognitive tasks. EEG signals were recorded when users were doing two different attentional tasks. One of the tasks was based on detecting a desired change in color luminance and the other task was based on detecting a desired change in direction of motion. EEG signals were analyzed in two ways: EEG signals analysis without rhythms decomposition and EEG sub-bands analysis. After recording and preprocessing, time delay embedding method was used for state space reconstruction; embedding parameters were determined for original signals and their sub-bands. Afterwards nonlinear methods were used in feature extraction phase. To reduce the feature dimension, scalar feature selections were done by using T-test and Bhattacharyya criteria. Then, the data were classified using linear support vector machines (SVM) and k-nearest neighbor (KNN) method. The best combination of the criteria and classifiers was determined for each task by comparing performances. For both tasks, the best results were achieved by using T-test criterion and SVM classifier. For the direction-based and the color-luminance-based tasks, maximum classification performances were 93.08 and 79.79% respectively which were reached by using optimal set of features. Our results show that the nonlinear dynamic features of EEG signals seem to be useful and effective in MS diseases diagnosis.
A Comparative Study with RapidMiner and WEKA Tools over some Classification Techniques for SMS Spam
NASA Astrophysics Data System (ADS)
Foozy, Cik Feresa Mohd; Ahmad, Rabiah; Faizal Abdollah, M. A.; Chai Wen, Chuah
2017-08-01
SMS Spamming is a serious attack that can manipulate the use of the SMS by spreading the advertisement in bulk. By sending the unwanted SMS that contain advertisement can make the users feeling disturb and this against the privacy of the mobile users. To overcome these issues, many studies have proposed to detect SMS Spam by using data mining tools. This paper will do a comparative study using five machine learning techniques such as Naïve Bayes, K-NN (K-Nearest Neighbour Algorithm), Decision Tree, Random Forest and Decision Stumps to observe the accuracy result between RapidMiner and WEKA for dataset SMS Spam UCI Machine Learning repository.
Classification of fMRI resting-state maps using machine learning techniques: A comparative study
NASA Astrophysics Data System (ADS)
Gallos, Ioannis; Siettos, Constantinos
2017-11-01
We compare the efficiency of Principal Component Analysis (PCA) and nonlinear learning manifold algorithms (ISOMAP and Diffusion maps) for classifying brain maps between groups of schizophrenia patients and healthy from fMRI scans during a resting-state experiment. After a standard pre-processing pipeline, we applied spatial Independent component analysis (ICA) to reduce (a) noise and (b) spatial-temporal dimensionality of fMRI maps. On the cross-correlation matrix of the ICA components, we applied PCA, ISOMAP and Diffusion Maps to find an embedded low-dimensional space. Finally, support-vector-machines (SVM) and k-NN algorithms were used to evaluate the performance of the algorithms in classifying between the two groups.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mirjankar, Nikhil S.; Fraga, Carlos G.; Carman, April J.
Chemical attribution signatures (CAS) for chemical threat agents (CTAs) are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. In a previous study, anionic impurity profiles developed using high performance ion chromatography (HPIC) were demonstrated as CAS for matching samples from eight potassium cyanide (KCN) stocks to their reported countries of origin. Herein, a larger number of solid KCN stocks (n = 13) and, for the first time, solid sodium cyanide (NaCN) stocks (n = 15) were examined to determine what additional sourcing information can be obtained through anion, carbon stablemore » isotope, and elemental analyses of cyanide stocks by HPIC, isotope ratio mass spectrometry (IRMS), and inductively coupled plasma optical emission spectroscopy (ICP-OES), respectively. The HPIC anion data was evaluated using the variable selection methods of Fisher-ratio (F-ratio), interval partial least squares (iPLS), and genetic algorithm-based partial least squares (GAPLS) and the classification methods of partial least squares discriminate analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminate analysis (SVMDA). In summary, hierarchical cluster analysis (HCA) of anion impurity profiles from multiple cyanide stocks from six reported country of origins resulted in cyanide samples clustering into three groups: Czech Republic, Germany, and United States, independent of the associated alkali metal (K or Na). The three country groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries with known solid cyanide factories. Both the anion and elemental CAS are believed to originate from the aqueous alkali hydroxides used in cyanide manufacture. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). The carbon isotope CAS is believed to originate from the carbon source and process used to make the HCN utilized in cyanide synthesis. Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability (so far) of using anion impurities for matching a cyanide sample to its country of manufacture (i.e., factory). Variable selection reduced errors for those classification methods having errors greater than zero with iPLS-forward selection, and F-ratio typically providing the lowest errors. Finally, using anion profiles to match cyanides to a specific stock or stock group resulted in cross-validation errors ranging from zero to 5.3%.« less
Li, Yanfei; Tian, Yun
2018-01-01
The development of network technology and the popularization of image capturing devices have led to a rapid increase in the number of digital images available, and it is becoming increasingly difficult to identify a desired image from among the massive number of possible images. Images usually contain rich semantic information, and people usually understand images at a high semantic level. Therefore, achieving the ability to use advanced technology to identify the emotional semantics contained in images to enable emotional semantic image classification remains an urgent issue in various industries. To this end, this study proposes an improved OCC emotion model that integrates personality and mood factors for emotional modelling to describe the emotional semantic information contained in an image. The proposed classification system integrates the k-Nearest Neighbour (KNN) algorithm with the Support Vector Machine (SVM) algorithm. The MapReduce parallel programming model was used to adapt the KNN-SVM algorithm for parallel implementation in the Hadoop cluster environment, thereby achieving emotional semantic understanding for the classification of a massive collection of images. For training and testing, 70,000 scene images were randomly selected from the SUN Database. The experimental results indicate that users with different personalities show overall consistency in their emotional understanding of the same image. For a training sample size of 50,000, the classification accuracies for different emotional categories targeted at users with different personalities were approximately 95%, and the training time was only 1/5 of that required for the corresponding algorithm with a single-node architecture. Furthermore, the speedup of the system also showed a linearly increasing tendency. Thus, the experiments achieved a good classification effect and can lay a foundation for classification in terms of additional types of emotional image semantics, thereby demonstrating the practical significance of the proposed model. PMID:29320579
An Approach for Forest Inventory in Canada's Northern Boreal region, Northwest Territories
NASA Astrophysics Data System (ADS)
Mahoney, C.; Hopkinson, C.; Hall, R.; Filiatrault, M.
2017-12-01
The northern extent of Canada's northern boreal forest is largely inaccessible resulting in logistical, financial, and human challenges with respect to obtaining concise and accurate forest resource inventory (FRI) attributes such as stand height, aboveground biomass and forest carbon stocks. This challenge is further exacerbated by mandated government resource management and reporting of key attributes with respect to assessing impacts of natural disturbances, monitoring wildlife habitat and establishing policies to mitigate effects of climate change. This study presents a framework methodology utilized to inventory canopy height and crown closure over a 420,000 km2 area in Canada's Northwest Territories (NWT) by integrating field, LiDAR and satellite remote sensing data. Attributes are propagated from available field to coincident airborne LiDAR thru to satellite laser altimetry footprints. A quality controlled form of the latter are then submitted to a k-nearest neighbor (kNN) imputation algorithm to produce a continuous map of each attribute on a 30 m grid. The resultant kNN stand height (r=0.62, p=0.00) and crown closure (r=0.64, p=0.00) products were identified as statistically similar to a comprehensive independent airborne LiDAR source. Regional uncertainty can be produced with each attribute to identify areas of potential improvement through future strategic data acquisitions or the fine tuning of model parameters. This study's framework concept was developed to inform Natural Resources Canada - Canadian Forest Service's Multisource Vegetation Inventory and update vast regions of Canada's northern forest inventories, however, its applicability can be generalized to any environment. Not only can such a framework approach incorporate other data sources (such as Synthetic Aperture Radar) to potentially better characterize forest attributes, but it can also utilize future Earth observation mission data (for example ICESat-2) to monitor forest dynamics and the status, health and sustainability of Canada's northern boreal regions as areas where detailed inventory information is typically not available.
Zhu, Hao; Rusyn, Ivan; Richard, Ann; Tropsha, Alexander
2008-01-01
Background To develop efficient approaches for rapid evaluation of chemical toxicity and human health risk of environmental compounds, the National Toxicology Program (NTP) in collaboration with the National Center for Chemical Genomics has initiated a project on high-throughput screening (HTS) of environmental chemicals. The first HTS results for a set of 1,408 compounds tested for their effects on cell viability in six different cell lines have recently become available via PubChem. Objectives We have explored these data in terms of their utility for predicting adverse health effects of the environmental agents. Methods and results Initially, the classification k nearest neighbor (kNN) quantitative structure–activity relationship (QSAR) modeling method was applied to the HTS data only, for a curated data set of 384 compounds. The resulting models had prediction accuracies for training, test (containing 275 compounds together), and external validation (109 compounds) sets as high as 89%, 71%, and 74%, respectively. We then asked if HTS results could be of value in predicting rodent carcinogenicity. We identified 383 compounds for which data were available from both the Berkeley Carcinogenic Potency Database and NTP–HTS studies. We found that compounds classified by HTS as “actives” in at least one cell line were likely to be rodent carcinogens (sensitivity 77%); however, HTS “inactives” were far less informative (specificity 46%). Using chemical descriptors only, kNN QSAR modeling resulted in 62.3% prediction accuracy for rodent carcinogenicity applied to this data set. Importantly, the prediction accuracy of the model was significantly improved (72.7%) when chemical descriptors were augmented by HTS data, which were regarded as biological descriptors. Conclusions Our studies suggest that combining NTP–HTS profiles with conventional chemical descriptors could considerably improve the predictive power of computational approaches in toxicology. PMID:18414635
A Semi-parametric Multivariate Gap-filling Model for Eddy Covariance Latent Heat Flux
NASA Astrophysics Data System (ADS)
Li, M.; Chen, Y.
2010-12-01
Quantitative descriptions of latent heat fluxes are important to study the water and energy exchanges between terrestrial ecosystems and the atmosphere. The eddy covariance approaches have been recognized as the most reliable technique for measuring surface fluxes over time scales ranging from hours to years. However, unfavorable micrometeorological conditions, instrument failures, and applicable measurement limitations may cause inevitable flux gaps in time series data. Development and application of suitable gap-filling techniques are crucial to estimate long term fluxes. In this study, a semi-parametric multivariate gap-filling model was developed to fill latent heat flux gaps for eddy covariance measurements. Our approach combines the advantages of a multivariate statistical analysis (principal component analysis, PCA) and a nonlinear interpolation technique (K-nearest-neighbors, KNN). The PCA method was first used to resolve the multicollinearity relationships among various hydrometeorological factors, such as radiation, soil moisture deficit, LAI, and wind speed. The KNN method was then applied as a nonlinear interpolation tool to estimate the flux gaps as the weighted sum latent heat fluxes with the K-nearest distances in the PCs’ domain. Two years, 2008 and 2009, of eddy covariance and hydrometeorological data from a subtropical mixed evergreen forest (the Lien-Hua-Chih Site) were collected to calibrate and validate the proposed approach with artificial gaps after standard QC/QA procedures. The optimal K values and weighting factors were determined by the maximum likelihood test. The results of gap-filled latent heat fluxes conclude that developed model successful preserving energy balances of daily, monthly, and yearly time scales. Annual amounts of evapotranspiration from this study forest were 747 mm and 708 mm for 2008 and 2009, respectively. Nocturnal evapotranspiration was estimated with filled gaps and results are comparable with other studies. Seasonal and daily variability of latent heat fluxes were also discussed.
2013-01-01
Background Significant efforts have been made to address the problem of identifying short genes in prokaryotic genomes. However, most known methods are not effective in detecting short genes. Because of the limited information contained in short DNA sequences, it is very difficult to accurately distinguish between protein coding and non-coding sequences in prokaryotic genomes. We have developed a new Iteratively Adaptive Sparse Partial Least Squares (IASPLS) algorithm as the classifier to improve the accuracy of the identification process. Results For testing, we chose the short coding and non-coding sequences from seven prokaryotic organisms. We used seven feature sets (including GC content, Z-curve, etc.) of short genes. In comparison with GeneMarkS, Metagene, Orphelia, and Heuristic Approachs methods, our model achieved the best prediction performance in identification of short prokaryotic genes. Even when we focused on the very short length group ([60–100 nt)), our model provided sensitivity as high as 83.44% and specificity as high as 92.8%. These values are two or three times higher than three of the other methods while Metagene fails to recognize genes in this length range. The experiments also proved that the IASPLS can improve the identification accuracy in comparison with other widely used classifiers, i.e. Logistic, Random Forest (RF) and K nearest neighbors (KNN). The accuracy in using IASPLS was improved 5.90% or more in comparison with the other methods. In addition to the improvements in accuracy, IASPLS required ten times less computer time than using KNN or RF. Conclusions It is conclusive that our method is preferable for application as an automated method of short gene classification. Its linearity and easily optimized parameters make it practicable for predicting short genes of newly-sequenced or under-studied species. Reviewers This article was reviewed by Alexey Kondrashov, Rajeev Azad (nominated by Dr J.Peter Gogarten) and Yuriy Fofanov (nominated by Dr Janet Siefert). PMID:24067167
NASA Astrophysics Data System (ADS)
Shi, Y.; Gorban, A. N.; Y Yang, T.
2014-03-01
This case study tests the possibility of prediction for 'success' (or 'winner') components of four stock & shares market indices in a time period of three years from 02-Jul-2009 to 29-Jun-2012.We compare their performance ain two time frames: initial frame three months at the beginning (02/06/2009-30/09/2009) and the final three month frame (02/04/2012-29/06/2012).To label the components, average price ratio between two time frames in descending order is computed. The average price ratio is defined as the ratio between the mean prices of the beginning and final time period. The 'winner' components are referred to the top one third of total components in the same order as average price ratio it means the mean price of final time period is relatively higher than the beginning time period. The 'loser' components are referred to the last one third of total components in the same order as they have higher mean prices of beginning time period. We analyse, is there any information about the winner-looser separation in the initial fragments of the daily closing prices log-returns time series.The Leave-One-Out Cross-Validation with k-NN algorithm is applied on the daily log-return of components using a distance and proximity in the experiment. By looking at the error analysis, it shows that for HANGSENG and DAX index, there are clear signs of possibility to evaluate the probability of long-term success. The correlation distance matrix histograms and 2-D/3-D elastic maps generated from ViDaExpert show that the 'winner' components are closer to each other and 'winner'/'loser' components are separable on elastic maps for HANGSENG and DAX index while for the negative possibility indices, there is no sign of separation.
Cao, Jianfang; Li, Yanfei; Tian, Yun
2018-01-01
The development of network technology and the popularization of image capturing devices have led to a rapid increase in the number of digital images available, and it is becoming increasingly difficult to identify a desired image from among the massive number of possible images. Images usually contain rich semantic information, and people usually understand images at a high semantic level. Therefore, achieving the ability to use advanced technology to identify the emotional semantics contained in images to enable emotional semantic image classification remains an urgent issue in various industries. To this end, this study proposes an improved OCC emotion model that integrates personality and mood factors for emotional modelling to describe the emotional semantic information contained in an image. The proposed classification system integrates the k-Nearest Neighbour (KNN) algorithm with the Support Vector Machine (SVM) algorithm. The MapReduce parallel programming model was used to adapt the KNN-SVM algorithm for parallel implementation in the Hadoop cluster environment, thereby achieving emotional semantic understanding for the classification of a massive collection of images. For training and testing, 70,000 scene images were randomly selected from the SUN Database. The experimental results indicate that users with different personalities show overall consistency in their emotional understanding of the same image. For a training sample size of 50,000, the classification accuracies for different emotional categories targeted at users with different personalities were approximately 95%, and the training time was only 1/5 of that required for the corresponding algorithm with a single-node architecture. Furthermore, the speedup of the system also showed a linearly increasing tendency. Thus, the experiments achieved a good classification effect and can lay a foundation for classification in terms of additional types of emotional image semantics, thereby demonstrating the practical significance of the proposed model.
Houshyarifar, Vahid; Chehel Amirani, Mehdi
2016-08-12
In this paper we present a method to predict Sudden Cardiac Arrest (SCA) with higher order spectral (HOS) and linear (Time) features extracted from heart rate variability (HRV) signal. Predicting the occurrence of SCA is important in order to avoid the probability of Sudden Cardiac Death (SCD). This work is a challenge to predict five minutes before SCA onset. The method consists of four steps: pre-processing, feature extraction, feature reduction, and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and then the HRV signal is extracted. In second step, bispectrum features of HRV signal and time-domain features are obtained. Six features are extracted from bispectrum and two features from time-domain. In the next step, these features are reduced to one feature by the linear discriminant analysis (LDA) technique. Finally, KNN and support vector machine-based classifiers are used to classify the HRV signals. We used two database named, MIT/BIH Sudden Cardiac Death (SCD) Database and Physiobank Normal Sinus Rhythm (NSR). In this work we achieved prediction of SCD occurrence for six minutes before the SCA with the accuracy over 91%.
Drivelos, Spiros A; Higgins, Kevin; Kalivas, John H; Haroutounian, Serkos A; Georgiou, Constantinos A
2014-12-15
"Fava Santorinis", is a protected designation of origin (PDO) yellow split pea species growing only in the island of Santorini in Greece. Due to its nutritional quality and taste, it has gained a high monetary value. Thus, it is prone to adulteration with other yellow split peas. In order to discriminate "Fava Santorinis" from other yellow split peas, four classification methods utilising rare earth elements (REEs) measured through inductively coupled plasma-mass spectrometry (ICP-MS) are studied. The four classification processes are orthogonal projection analysis (OPA), Mahalanobis distance (MD), partial least squares discriminant analysis (PLS-DA) and k nearest neighbours (KNN). Since it is known that trace elements are often useful to determine geographical origin of food products, we further quantitated for trace elements using ICP-MS. Presented in this paper are results using the four classification processes based on the fusion of the REEs data with the trace element data. Overall, the OPA method was found to perform best with up to 100% accuracy using the fused data. Copyright © 2014 Elsevier Ltd. All rights reserved.
Computational intelligence techniques for biological data mining: An overview
NASA Astrophysics Data System (ADS)
Faye, Ibrahima; Iqbal, Muhammad Javed; Said, Abas Md; Samir, Brahim Belhaouari
2014-10-01
Computational techniques have been successfully utilized for a highly accurate analysis and modeling of multifaceted and raw biological data gathered from various genome sequencing projects. These techniques are proving much more effective to overcome the limitations of the traditional in-vitro experiments on the constantly increasing sequence data. However, most critical problems that caught the attention of the researchers may include, but not limited to these: accurate structure and function prediction of unknown proteins, protein subcellular localization prediction, finding protein-protein interactions, protein fold recognition, analysis of microarray gene expression data, etc. To solve these problems, various classification and clustering techniques using machine learning have been extensively used in the published literature. These techniques include neural network algorithms, genetic algorithms, fuzzy ARTMAP, K-Means, K-NN, SVM, Rough set classifiers, decision tree and HMM based algorithms. Major difficulties in applying the above algorithms include the limitations found in the previous feature encoding and selection methods while extracting the best features, increasing classification accuracy and decreasing the running time overheads of the learning algorithms. The application of this research would be potentially useful in the drug design and in the diagnosis of some diseases. This paper presents a concise overview of the well-known protein classification techniques.
Guldbrandsen, Niels; Kostidis, Sarantos; Schäfer, Hartmut; De Mieri, Maria; Spraul, Manfred; Skaltsounis, Alexios-Leandros; Mikros, Emmanuel; Hamburger, Matthias
2015-05-22
Isatis tinctoria is an ancient dye and medicinal plant with potent anti-inflammatory and antiallergic properties. Metabolic differences were investigated by NMR spectroscopy of accessions from different origins that were grown under identical conditions on experimental plots. For these accessions, metabolite profiles at different harvesting dates were analyzed, and single and repeatedly harvested plants were compared. Leaf samples were shock-frozen in liquid N2 immediately after being harvested, freeze-dried, and cryomilled prior to extraction. Extracts were prepared by pressurized liquid extraction with ethyl acetate and 70% aqueous methanol. NMR spectra were analyzed using a combination of different methods of multivariate data analysis such as principal component analysis (PCA), canonical analysis (CA), and k-nearest neighbor concept (k-NN). Accessions and harvesting dates were well separated in the PCA/CA/k-NN analysis in both extracts. Pairwise statistical total correlation spectroscopy (STOCSY) revealed unsaturated fatty acids, porphyrins, carbohydrates, indole derivatives, isoprenoids, phenylpropanoids, and minor aromatic compounds as the cause of these differences. In addition, the metabolite profile was affected by the repeated harvest regime, causing a decrease of 1,5-anhydroglucitol, sucrose, unsaturated fatty acids, porphyrins, isoprenoids, and a flavonoid.
LMD Based Features for the Automatic Seizure Detection of EEG Signals Using SVM.
Zhang, Tao; Chen, Wanzhong
2017-08-01
Achieving the goal of detecting seizure activity automatically using electroencephalogram (EEG) signals is of great importance and significance for the treatment of epileptic seizures. To realize this aim, a newly-developed time-frequency analytical algorithm, namely local mean decomposition (LMD), is employed in the presented study. LMD is able to decompose an arbitrary signal into a series of product functions (PFs). Primarily, the raw EEG signal is decomposed into several PFs, and then the temporal statistical and non-linear features of the first five PFs are calculated. The features of each PF are fed into five classifiers, including back propagation neural network (BPNN), K-nearest neighbor (KNN), linear discriminant analysis (LDA), un-optimized support vector machine (SVM) and SVM optimized by genetic algorithm (GA-SVM), for five classification cases, respectively. Confluent features of all PFs and raw EEG are further passed into the high-performance GA-SVM for the same classification tasks. Experimental results on the international public Bonn epilepsy EEG dataset show that the average classification accuracy of the presented approach are equal to or higher than 98.10% in all the five cases, and this indicates the effectiveness of the proposed approach for automated seizure detection.
Ga2O3 doping and vacancy effect in KNN—LT lead-free piezoceramics
NASA Astrophysics Data System (ADS)
Tan, Zhi; Xing, Jie; Jiang, Laiming; Zhu, Jianguo; Wu, Bo
2017-12-01
Ga2O3 was doped into 0.95(K0.48Na0.52)NbO3—0.05LiTaO3 (KNN—LT) ceramics and its influences on the sintering behavior, phase structure and electrical properties of ceramics were studied. Firstly, SEM observation exhibits that more and more glass phase appears in ceramics with the gradual addition of Ga2O3, which determines the continuous decrease in sintering temperatures. And the addition of Ga2O3 is also found to increase the orthorhombic—tetragonal transition temperature ( T O—T) of system to a higher level. Secondly, both the density and the coercive field ( E C) of ceramics increase firstly and then decrease with increasing the Ga2O3 content, and the KNN—LT— xGa sample at x = 0.004 shows a pinched P— E hysteresis loop. Finally, the impedance characteristics of KNN—LT— xGa ceramics were investigated at different temperatures, revealing a typical vacancy related conduction mechanism. This work demonstrates that Ga2O3 is a good sintering aid for KNN-based ceramics, and the vacancy plays an important role in the sintering and electrical behaviors of ceramics.
Direct Ink Writing of Three-Dimensional (K, Na)NbO3-Based Piezoelectric Ceramics
Li, Yayun; Li, Longtu; Li, Bo
2015-01-01
A kind of piezoelectric ink was prepared with Li, Ta, Sb co-doped (K, Na)NbO3 (KNN) powders. Piezoelectric scaffolds with diameters at micrometer scale were constructed from this ink by using direct ink writing method. According to the micro-morphology and density test, the samples sintered at 1100 °C for 2 h have formed ceramics completely with a high relative density of 98%. X-ray diffraction (XRD) test shows that the main phase of sintered samples is orthogonal (Na0.52K0.4425Li0.0375)(Nb0.87Sb0.07Ta0.06)O3. The piezoelectric constant d33 of 280 pC/N, dielectric constant ε of 1775, remanent polarization Pr of 18.8 μC/cm2 and coercive field Ec of 8.5 kV/cm prove that the sintered samples exhibit good electrical properties. The direct ink writing method allows one to design and rapidly fabricate piezoelectric structures in complex three-dimensional (3D) shapes without the need for any dies or lithographic masks, which will simplify the process of material preparation and offer new ideas for the design and application of piezoelectric devices. PMID:28788028