Variance approximations for assessments of classification accuracy
R. L. Czaplewski
1994-01-01
Variance approximations are derived for the weighted and unweighted kappa statistics, the conditional kappa statistic, and conditional probabilities. These statistics are useful to assess classification accuracy, such as accuracy of remotely sensed classifications in thematic maps when compared to a sample of reference classifications made in the field. Published...
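As a rough illustration of the unweighted kappa statistic mentioned above, the following minimal sketch computes kappa from a confusion matrix of counts together with a simple first-order large-sample variance. It assumes simple random sampling and is not the approximation derived in the paper; the function name `cohen_kappa` and the example matrix are hypothetical.

```python
import math

def cohen_kappa(matrix):
    """Return (kappa, approximate variance) for a square confusion matrix of counts.

    Assumption: simple random sampling and a first-order variance
    approximation, var = p_o * (1 - p_o) / (n * (1 - p_e) ** 2).
    """
    n = sum(sum(row) for row in matrix)
    k = len(matrix)
    # Observed agreement: proportion of samples on the diagonal.
    p_o = sum(matrix[i][i] for i in range(k)) / n
    # Chance agreement: sum of products of row and column marginal proportions.
    row_tot = [sum(row) / n for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) / n for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot))
    kappa = (p_o - p_e) / (1 - p_e)
    var = p_o * (1 - p_o) / (n * (1 - p_e) ** 2)
    return kappa, var

# Hypothetical two-class example (rows: map class, columns: reference class).
example = [[40, 10], [10, 40]]
kappa, var = cohen_kappa(example)
half_width = 1.96 * math.sqrt(var)  # approximate 95% interval half-width
print(f"kappa = {kappa:.3f} +/- {half_width:.3f}")
```

More refined variance formulas include additional terms; this version is only meant to show where the observed and chance agreement enter the calculation.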
ERIC Educational Resources Information Center
Wang, Wenyi; Song, Lihong; Chen, Ping; Meng, Yaru; Ding, Shuliang
2015-01-01
Classification consistency and accuracy are viewed as important indicators for evaluating the reliability and validity of classification results in cognitive diagnostic assessment (CDA). Pattern-level classification consistency and accuracy indices were introduced by Cui, Gierl, and Chang. However, the indices at the attribute level have not yet…
Nationwide forestry applications program. Analysis of forest classification accuracy
NASA Technical Reports Server (NTRS)
Congalton, R. G.; Mead, R. A.; Oderwald, R. G.; Heinen, J. (Principal Investigator)
1981-01-01
The development of LANDSAT classification accuracy assessment techniques, and of a computerized system for assessing wildlife habitat from land cover maps are considered. A literature review on accuracy assessment techniques and an explanation for the techniques development under both projects are included along with listings of the computer programs. The presentations and discussions at the National Working Conference on LANDSAT Classification Accuracy are summarized. Two symposium papers which were published on the results of this project are appended.
Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment
ERIC Educational Resources Information Center
Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua
2012-01-01
This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…
Ground Truth Sampling and LANDSAT Accuracy Assessment
NASA Technical Reports Server (NTRS)
Robinson, J. W.; Gunther, F. J.; Campbell, W. J.
1982-01-01
It is noted that the key factor in any accuracy assessment of remote sensing data is the method used for determining the ground truth, independent of the remote sensing data itself. The sampling and accuracy procedures developed for a nuclear power plant siting study are described. The purpose of the sampling procedure was to provide data for developing supervised classifications for two study sites and for assessing the accuracy of that and the other procedures used. The purpose of the accuracy assessment was to allow the comparison of the cost and accuracy of various classification procedures as applied to various data types.
NASA Technical Reports Server (NTRS)
Justice, C.; Townshend, J. (Principal Investigator)
1981-01-01
Two unsupervised classification procedures were applied to ratioed and unratioed LANDSAT multispectral scanner data of an area of spatially complex vegetation and terrain. An objective accuracy assessment was undertaken on each classification and comparison was made of the classification accuracies. The two unsupervised procedures use the same clustering algorithm. By one procedure the entire area is clustered and by the other a representative sample of the area is clustered and the resulting statistics are extrapolated to the remaining area using a maximum likelihood classifier. Explanation is given of the major steps in the classification procedures including image preprocessing; classification; interpretation of cluster classes; and accuracy assessment. Of the four classifications undertaken, the monocluster block approach on the unratioed data gave the highest accuracy of 80% for five coarse cover classes. This accuracy was increased to 84% by applying a 3 x 3 contextual filter to the classified image. A detailed description and partial explanation is provided for the major misclassification. The classification of the unratioed data produced higher percentage accuracies than for the ratioed data and the monocluster block approach gave higher accuracies than clustering the entire area. The monocluster block approach was additionally the most economical in terms of computing time.
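The abstract does not say which 3 x 3 contextual filter was applied. As one plausible reading, the sketch below implements a strict-majority filter over a 3 x 3 window on a classified grid; the function name `majority_filter_3x3`, the strict-majority rule, and the edge handling are all assumptions made for illustration.

```python
from collections import Counter

def majority_filter_3x3(grid):
    """Relabel each cell with the strict-majority class of its 3 x 3 window.

    Assumptions: `grid` is a list of equal-length lists of class labels;
    windows are clipped at the image edges; a cell keeps its label when
    no class holds more than half of the window.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(0, r - 1), min(rows, r + 2))
                      for j in range(max(0, c - 1), min(cols, c + 2))]
            label, count = Counter(window).most_common(1)[0]
            if count > len(window) / 2:
                out[r][c] = label
    return out

# Hypothetical classified image with one isolated pixel of class 2.
classified = [[1, 1, 1],
              [1, 2, 1],
              [1, 1, 1]]
print(majority_filter_3x3(classified))
```

A filter of this kind removes isolated misclassified pixels, which is one way a contextual filter can raise overall accuracy.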
NASA Technical Reports Server (NTRS)
Card, Don H.; Strong, Laurence L.
1989-01-01
An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.
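To show how a percent-correct value with an associated confidence interval can be tabulated, here is a minimal sketch using the normal (Wald) approximation to the binomial. It assumes simple random sampling within a category; the study's Stratified Plurality Sampling would call for design-specific estimators that are not reproduced here. The function name and the counts are hypothetical.

```python
import math

def percent_correct_ci(correct, total, z=1.96):
    """Return (percent correct, lower bound, upper bound), all in percent.

    Assumption: Wald interval p +/- z * sqrt(p * (1 - p) / n),
    clipped to the range [0, 100].
    """
    p = correct / total
    half = z * math.sqrt(p * (1 - p) / total)
    return 100 * p, 100 * max(0.0, p - half), 100 * min(1.0, p + half)

# Hypothetical category: 50 of 100 sample units correctly classified.
pct, lo, hi = percent_correct_ci(50, 100)
print(f"{pct:.1f}% correct (95% CI {lo:.1f}% to {hi:.1f}%)")
```

For small samples or proportions near 0 or 1, a different interval would generally be preferable to the Wald form.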
Yang, Xiaoyan; Chen, Longgao; Li, Yingkui; Xi, Wenjia; Chen, Longqian
2015-07-01
Land use/land cover (LULC) inventory provides an important dataset in regional planning and environmental assessment. To efficiently obtain the LULC inventory, we compared the LULC classifications based on single satellite imagery with a rule-based classification based on multi-seasonal imagery in Lianyungang City, a coastal city in China, using CBERS-02 (the 2nd China-Brazil Environmental Resource Satellites) images. The overall accuracies of the classification based on single imagery are 78.9, 82.8, and 82.0% in winter, early summer, and autumn, respectively. The rule-based classification improves the accuracy to 87.9% (kappa 0.85), suggesting that combining multi-seasonal images can considerably improve the classification accuracy over any single image-based classification. This method could also be used to analyze seasonal changes of LULC types, especially for those associated with tidal changes in coastal areas. The distribution and inventory of LULC types with an overall accuracy of 87.9% and a spatial resolution of 19.5 m can assist regional planning and environmental assessment efficiently in Lianyungang City. This rule-based classification provides guidance for improving accuracy in coastal areas with distinct LULC temporal spectral features.
ERIC Educational Resources Information Center
Pena, Elizabeth D.; Gillam, Ronald B.; Malek, Melynn; Ruiz-Felter, Roxanna; Resendiz, Maria; Fiestas, Christine; Sabel, Tracy
2006-01-01
Two experiments examined reliability and classification accuracy of a narration-based dynamic assessment task. Purpose: The first experiment evaluated whether parallel results were obtained from stories created in response to 2 different wordless picture books. If so, the tasks and measures would be appropriate for assessing pretest and posttest…
Tahmasian, Masoud; Jamalabadi, Hamidreza; Abedini, Mina; Ghadami, Mohammad R; Sepehry, Amir A; Knight, David C; Khazaie, Habibolah
2017-05-22
Sleep disturbance is common in chronic post-traumatic stress disorder (PTSD). However, prior work has demonstrated that there are inconsistencies between subjective and objective assessments of sleep disturbance in PTSD. Therefore, we investigated whether subjective or objective sleep assessment has greater clinical utility to differentiate PTSD patients from healthy subjects. Further, we evaluated whether the combination of subjective and objective methods improves the accuracy of classification into patient versus healthy groups, which has important diagnostic implications. We recruited 32 chronic war-induced PTSD patients and 32 age- and gender-matched healthy subjects to participate in this study. Subjective (i.e. from three self-reported sleep questionnaires) and objective sleep-related data (i.e. from actigraphy scores) were collected from each participant. Subjective, objective, and combined (subjective and objective) sleep data were then analyzed using support vector machine classification. The classification accuracy, sensitivity, and specificity for subjective variables were 89.2%, 89.3%, and 89%, respectively. The classification accuracy, sensitivity, and specificity for objective variables were 65%, 62.3%, and 67.8%, respectively. The classification accuracy, sensitivity, and specificity for the aggregate variables (combination of subjective and objective variables) were 91.6%, 93.0%, and 90.3%, respectively. Our findings indicate that classification accuracy using subjective measurements is superior to objective measurements and the combination of both assessments appears to improve the classification accuracy for differentiating PTSD patients from healthy individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
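The three figures reported for each variable set (accuracy, sensitivity, specificity) follow directly from the counts in a two-class confusion table. The sketch below shows the arithmetic; the counts are invented for a 32 versus 32 design and are not the study's data, and the function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from two-class counts.

    tp/fn: patients classified correctly/incorrectly;
    tn/fp: healthy subjects classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Illustrative counts only: 32 patients and 32 healthy subjects.
for name, value in binary_metrics(tp=30, fn=2, tn=29, fp=3).items():
    print(f"{name}: {100 * value:.1f}%")
```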
Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy
Currently, most thematic accuracy assessments of classified remotely sensed images only account for errors between the various classes employed, at particular pixels of interest, thu...
NASA Astrophysics Data System (ADS)
Gao, Yan; Marpu, Prashanth; Morales Manila, Luis M.
2014-11-01
This paper assesses the suitability of 8-band Worldview-2 (WV2) satellite data and an object-based random forest algorithm for the classification of avocado growth stages in Mexico. We tested both pixel-based classification with the minimum distance (MD) and maximum likelihood (MLC) algorithms and object-based classification with the Random Forest (RF) algorithm for this task. Training samples and verification data were selected by visually interpreting the WV2 images for seven thematic classes: fully grown, middle stage, and early stage of avocado crops, bare land, two types of natural forests, and water body. To examine the contribution of the four new spectral bands of the WV2 sensor, all the tested classifications were carried out with and without the four new spectral bands. Classification accuracy assessment results show that object-based classification with the RF algorithm obtained higher overall accuracy (93.06%) than the pixel-based MD (69.37%) and MLC (64.03%) methods. For both pixel-based and object-based methods, the classifications with the four new spectral bands obtained higher accuracy than those without (overall accuracy with vs. without: object-based RF, 93.06% vs 83.59%; pixel-based MD, 69.37% vs 67.2%; pixel-based MLC, 64.03% vs 36.05%), suggesting that the four new spectral bands of the WV2 sensor contributed to the increase in classification accuracy.
Classification Consistency and Accuracy for Complex Assessments Using Item Response Theory
ERIC Educational Resources Information Center
Lee, Won-Chan
2010-01-01
In this article, procedures are described for estimating single-administration classification consistency and accuracy indices for complex assessments using item response theory (IRT). This IRT approach was applied to real test data comprising dichotomous and polytomous items. Several different IRT model combinations were considered. Comparisons…
Byun, Wonwoo; Lee, Jung-Min; Kim, Youngwon; Brusseau, Timothy A
2018-03-26
This study examined the accuracy of the Fitbit activity tracker (FF) for quantifying sedentary behavior (SB) and varying intensities of physical activity (PA) in 3-5-year-old children. Twenty-eight healthy preschool-aged children (Girls: 46%, Mean age: 4.8 ± 1.0 years) wore the FF and were directly observed while performing a set of various unstructured and structured free-living activities from sedentary to vigorous intensity. The classification accuracy of the FF for measuring SB, light PA (LPA), moderate-to-vigorous PA (MVPA), and total PA (TPA) was examined by calculating Pearson correlation coefficients (r), mean absolute percent error (MAPE), Cohen's kappa (κ), sensitivity (Se), specificity (Sp), and area under the receiver operating curve (ROC-AUC). The classification accuracies of the FF (ROC-AUC) were 0.92, 0.63, 0.77 and 0.92 for SB, LPA, MVPA and TPA, respectively. Similarly, values of kappa, Se, Sp and percentage of correct classification were consistently high for SB and TPA, but low for LPA and MVPA. The FF demonstrated excellent classification accuracy for assessing SB and TPA, but lower accuracy for classifying LPA and MVPA. Our findings suggest that the FF should be considered as a valid instrument for assessing time spent sedentary and overall physical activity in preschool-aged children.
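Of the measures listed in this abstract, mean absolute percent error is the simplest to write out. A minimal sketch follows, assuming paired observed and device-estimated values (for example, minutes in an activity category) with no observed value equal to zero; the function name and the numbers are hypothetical.

```python
def mape(observed, estimated):
    """Mean absolute percent error between paired observed and estimated values.

    Assumption: every observed value is non-zero, since each error is
    expressed relative to the observed value.
    """
    errors = [abs(e - o) / abs(o) for o, e in zip(observed, estimated)]
    return 100 * sum(errors) / len(errors)

# Hypothetical minutes of sedentary time: direct observation vs. tracker.
print(f"MAPE = {mape([10, 20], [12, 18]):.1f}%")
```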
Madison, Matthew J; Bradshaw, Laine P
2015-06-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other multidimensional measurement models. A priori specifications of which latent characteristics or attributes are measured by each item are a core element of the diagnostic assessment design. This item-attribute alignment, expressed in a Q-matrix, precedes and supports any inference resulting from the application of the diagnostic classification model. This study investigates the effects of Q-matrix design on classification accuracy for the log-linear cognitive diagnosis model. Results indicate that classification accuracy, reliability, and convergence rates improve when the Q-matrix contains isolated information from each measured attribute.
Variance estimates and confidence intervals for the Kappa measure of classification accuracy
M. A. Kalkhan; R. M. Reich; R. L. Czaplewski
1997-01-01
The Kappa statistic is frequently used to characterize the results of an accuracy assessment used to evaluate land use and land cover classifications obtained by remotely sensed data. This statistic allows comparisons of alternative sampling designs, classification algorithms, photo-interpreters, and so forth. In order to make these comparisons, it is...
ERIC Educational Resources Information Center
Wyse, Adam E.; Babcock, Ben
2016-01-01
A common suggestion made in the psychometric literature for fixed-length classification tests is that one should design tests so that they have maximum information at the cut score. Designing tests in this way is believed to maximize the classification accuracy and consistency of the assessment. This article uses simulated examples to illustrate…
Can segmentation evaluation metric be used as an indicator of land cover classification accuracy?
NASA Astrophysics Data System (ADS)
Švab Lenarčič, Andreja; Đurić, Nataša; Čotar, Klemen; Ritlop, Klemen; Oštir, Krištof
2016-10-01
It is a broadly established belief that the segmentation result significantly affects subsequent image classification accuracy. However, the actual correlation between the two has never been evaluated. Such an evaluation would be of considerable importance for any attempts to automate the object-based classification process, as it would reduce the amount of user intervention required to fine-tune the segmentation parameters. We conducted an assessment of segmentation and classification by analyzing 100 different segmentation parameter combinations, 3 classifiers, 5 land cover classes, 20 segmentation evaluation metrics, and 7 classification accuracy measures. The reliability definition of segmentation evaluation metrics as indicators of land cover classification accuracy was based on the linear correlation between the two. All unsupervised metrics that are not based on number of segments have a very strong correlation with all classification measures and are therefore reliable as indicators of land cover classification accuracy. On the other hand, correlation at supervised metrics is dependent on so many factors that it cannot be trusted as a reliable classification quality indicator. Algorithms for land cover classification studied in this paper are widely used; therefore, presented results are applicable to a wider area.
Characterization and delineation of caribou habitat on Unimak Island using remote sensing techniques
NASA Astrophysics Data System (ADS)
Atkinson, Brain M.
The assessment of herbivore habitat quality is traditionally based on quantifying the forages available to the animal across their home range through ground-based techniques. While these methods are highly accurate, they can be time-consuming and highly expensive, especially for herbivores that occupy vast spatial landscapes. The Unimak Island caribou herd has been decreasing in the last decade at rates that have prompted discussion of management intervention. Frequent inclement weather in this region of Alaska has provided little opportunity to study the caribou forage habitat on Unimak Island. The overall objectives of this study were two-fold: 1) to assess the feasibility of using high-resolution color and near-infrared aerial imagery to map the forage distribution of caribou habitat on Unimak Island and 2) to assess the use of a new high-resolution multispectral satellite imagery platform, RapidEye, and the effect of the "red-edge" spectral band on vegetation classification accuracy. Maximum likelihood classification algorithms were used to create land cover maps from aerial and satellite imagery. Accuracy assessments and transformed divergence values were produced to assess vegetative spectral information and classification accuracy. By using RapidEye and aerial digital imagery in a hierarchical supervised classification technique, we were able to produce a high resolution land cover map of Unimak Island. We obtained an overall accuracy of 71.4 percent, which is comparable to that of other land cover maps based on RapidEye imagery. The "red-edge" spectral band included in the RapidEye imagery provides additional spectral information that allows for a more accurate overall classification, raising overall accuracy by 5.2 percent.
ERIC Educational Resources Information Center
Bramley, Tom
2010-01-01
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
ERIC Educational Resources Information Center
Decker, Dawn M.; Hixson, Michael D.; Shaw, Amber; Johnson, Gloria
2014-01-01
The purpose of this study was to examine whether using a multiple-measure framework yielded better classification accuracy than oral reading fluency (ORF) or maze alone in predicting pass/fail rates for middle-school students on a large-scale reading assessment. Participants were 178 students in Grades 7 and 8 from a Midwestern school district.…
Raymond L. Czaplewski
2000-01-01
Consider the following example of an accuracy assessment. Landsat data are used to build a thematic map of land cover for a multicounty region. The map classifier (e.g., a supervised classification algorithm) assigns each pixel into one category of land cover. The classification system includes 12 different types of forest and land cover: black spruce, balsam fir,...
Forest tree species discrimination in western Himalaya using EO-1 Hyperion
NASA Astrophysics Data System (ADS)
George, Rajee; Padalia, Hitendra; Kushwaha, S. P. S.
2014-05-01
The information acquired in the narrow bands of hyperspectral remote sensing data has potential to capture plant species spectral variability, thereby improving forest tree species mapping. This study assessed the utility of spaceborne EO-1 Hyperion data in discrimination and classification of broadleaved evergreen and conifer forest tree species in western Himalaya. The pre-processing of 242 bands of Hyperion data resulted in 160 noise-free and vertical stripe corrected reflectance bands. Of these, 29 bands were selected through step-wise exclusion of bands (Wilks' Lambda). Spectral Angle Mapper (SAM) and Support Vector Machine (SVM) algorithms were applied to the selected bands to assess their effectiveness in classification. SVM was also applied to broadband data (Landsat TM) to compare the variation in classification accuracy. All six commonly occurring gregarious tree species in western Himalaya, viz., white oak, brown oak, chir pine, blue pine, cedar and fir, could be effectively discriminated. SVM produced a better species classification (overall accuracy 82.27%, kappa statistic 0.79) than SAM (overall accuracy 74.68%, kappa statistic 0.70). It was noticed that classification accuracy achieved with Hyperion bands was significantly higher than with Landsat TM bands (overall accuracy 69.62%, kappa statistic 0.65). The study demonstrated the potential utility of narrow spectral bands of Hyperion data in discriminating tree species in a hilly terrain.
Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children
ERIC Educational Resources Information Center
Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.
2018-01-01
Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…
NASA Astrophysics Data System (ADS)
Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher
2012-10-01
Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics were used as a feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy, as well as user's and producer's accuracy, in comparison to the single classifiers.
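The abstract mentions combination rules based on posterior probability and probability entropy without spelling them out. The sketch below is therefore only one hypothetical rule of that general kind: for each pixel it keeps the prediction of whichever classifier has the lower-entropy (more decisive) posterior, and it returns that entropy as a per-pixel uncertainty value. The function names and the rule itself are assumptions, not the authors' method.

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a vector of class probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def combine(rf_probs, svm_probs):
    """Hypothetical rule: trust the classifier whose posterior has lower entropy.

    Returns (index of the chosen class, entropy of the chosen posterior).
    """
    chosen = rf_probs if entropy(rf_probs) <= entropy(svm_probs) else svm_probs
    label = max(range(len(chosen)), key=chosen.__getitem__)
    return label, entropy(chosen)

# Illustrative posteriors for one pixel and two crop classes.
label, uncertainty = combine([0.5, 0.5], [0.1, 0.9])
print(label, round(uncertainty, 3))
```

Mapping the returned entropy over all pixels gives a simple picture of where the classification is least certain.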
NASA Technical Reports Server (NTRS)
Stoner, E. R.; May, G. A.; Kalcic, M. T. (Principal Investigator)
1981-01-01
Sample segments of ground-verified land cover data collected in conjunction with the USDA/ESS June Enumerative Survey were merged with LANDSAT data and served as a focus for unsupervised spectral class development and accuracy assessment. Multitemporal data sets were created from single-date LANDSAT MSS acquisitions from a nominal scene covering an eleven-county area in north central Missouri. Classification accuracies for the four land cover types predominant in the test site showed significant improvement in going from unitemporal to multitemporal data sets. Transformed LANDSAT data sets did not significantly improve classification accuracies. Regression estimators yielded mixed results for different land covers. Misregistration of two LANDSAT data sets by as much as one half pixel did not significantly alter overall classification accuracies. Existing algorithms for scene-to-scene overlay proved adequate for multitemporal data analysis as long as statistical class development and accuracy assessment were restricted to field interior pixels.
Belgiu, Mariana; Drăguţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
ERIC Educational Resources Information Center
Zhang, Bo
2010-01-01
This article investigates how measurement models and statistical procedures can be applied to estimate the accuracy of proficiency classification in language testing. The paper starts with a concise introduction of four measurement models: the classical test theory (CTT) model, the dichotomous item response theory (IRT) model, the testlet response…
Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian; Huliraj, N; Revadi, S S
2017-06-08
Auscultation is a medical procedure used for the initial diagnosis and assessment of lung and heart diseases. From this perspective, we propose assessing the performance of the extreme learning machine (ELM) classifiers for the diagnosis of pulmonary pathology using breath sounds. Energy and entropy features were extracted from the breath sound using the wavelet packet transform. The statistical significance of the extracted features was evaluated by one-way analysis of variance (ANOVA). The extracted features were inputted into the ELM classifier. The maximum classification accuracies obtained for the conventional validation (CV) of the energy and entropy features were 97.36% and 98.37%, respectively, whereas the accuracies obtained for the cross validation (CRV) of the energy and entropy features were 96.80% and 97.91%, respectively. In addition, maximum classification accuracies of 98.25% and 99.25% were obtained for the CV and CRV of the ensemble features, respectively. The results indicate that the classification accuracy obtained with the ensemble features was higher than those obtained with the energy and entropy features.
NASA Astrophysics Data System (ADS)
Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie
2018-04-01
The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from terrestrial laser scanning (TLS) data. The geometry-based approach is one of the most widely used classification methods. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how the scale or scales used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted the single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted the classification on leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10% for both trees. The average speed-up ratio of the single-scale classifiers over the multi-scale classifier for each tree is higher than 30.
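Balanced accuracy, the measure reported here, averages the per-class recall so that a dominant class (for example, leaf points greatly outnumbering wood points) cannot mask poor performance on the smaller class. A minimal sketch follows, assuming a confusion matrix whose rows are reference classes and whose columns are predicted classes; the function name and the counts are hypothetical.

```python
def balanced_accuracy(matrix):
    """Mean per-class recall for a square confusion matrix of counts.

    Assumption: matrix[i][j] counts points of reference class i that
    were predicted as class j, and every class has at least one point.
    """
    recalls = [row[i] / sum(row) for i, row in enumerate(matrix)]
    return sum(recalls) / len(recalls)

# Hypothetical leaf/wood counts: 100 leaf points, 10 wood points.
example = [[90, 10],   # leaf: 90 correct, 10 labelled wood
           [5, 5]]     # wood: 5 labelled leaf, 5 correct
overall = (90 + 5) / 110
print(f"overall = {overall:.3f}, balanced = {balanced_accuracy(example):.3f}")
```

In this invented example overall accuracy is about 0.86 while balanced accuracy is 0.70, which illustrates why the latter is often preferred for imbalanced classes.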
Multi-site evaluation of IKONOS data for classification of tropical coral reef environments
Andrefouet, S.; Kramer, Philip; Torres-Pulliza, D.; Joyce, K.E.; Hochberg, E.J.; Garza-Perez, R.; Mumby, P.J.; Riegl, Bernhard; Yamano, H.; White, W.H.; Zubia, M.; Brock, J.C.; Phinn, S.R.; Naseer, A.; Hatcher, B.G.; Muller-Karger, F. E.
2003-01-01
Ten IKONOS images of different coral reef sites distributed around the world were processed to assess the potential of 4-m resolution multispectral data for coral reef habitat mapping. Complexity of reef environments, established by field observation, ranged from 3 to 15 classes of benthic habitats containing various combinations of sediments, carbonate pavement, seagrass, algae, and corals in different geomorphologic zones (forereef, lagoon, patch reef, reef flats). Processing included corrections for sea surface roughness and bathymetry, unsupervised or supervised classification, and accuracy assessment based on ground-truth data. IKONOS classification results were compared with classified Landsat 7 imagery for simple to moderate complexity of reef habitats (5-11 classes). For both sensors, overall accuracies of the classifications show a general linear trend of decreasing accuracy with increasing habitat complexity. The IKONOS sensor performed better, with a 15-20% improvement in accuracy compared to Landsat. For IKONOS, overall accuracy was 77% for 4-5 classes, 71% for 7-8 classes, 65% in 9-11 classes, and 53% for more than 13 classes. The Landsat classification accuracy was systematically lower, with an average of 56% for 5-10 classes. Within this general trend, inter-site comparisons and specificities demonstrate the benefits of different approaches. Pre-segmentation of the different geomorphologic zones and depth correction provided different advantages in different environments. Our results help guide scientists and managers in applying IKONOS-class data for coral reef mapping applications. © 2003 Elsevier Inc. All rights reserved.
Jeff Jenness; J. Judson Wynne
2005-01-01
In the field of spatially explicit modeling, well-developed accuracy assessment methodologies are often poorly applied. Deriving model accuracy metrics has been possible for decades, but these calculations were made by hand or with the use of a spreadsheet application. Accuracy assessments may be useful for: (1) ascertaining the quality of a model; (2) improving model...
An accuracy assessment of forest disturbance mapping in the western Great Lakes
P.L. Zimmerman; I.W. Housman; C.H. Perry; R.A. Chastain; J.B. Webb; M.V. Finco
2013-01-01
The increasing availability of satellite imagery has spurred the production of thematic land cover maps based on satellite data. These maps are more valuable to the scientific community and land managers when the accuracy of their classifications has been assessed. Here, we assessed the accuracy of a map of forest disturbance in the watersheds of Lake Superior and Lake...
Engelken, Florian; Wassilew, Georgi I; Köhlitz, Torsten; Brockhaus, Sebastian; Hamm, Bernd; Perka, Carsten; Diederichs, Gerd
2014-01-01
The purpose of this study was to quantify the performance of the Goutallier classification for assessing fatty degeneration of the gluteus muscles from magnetic resonance (MR) images and to compare its performance to a newly proposed system. Eighty-four hips with clinical signs of gluteal insufficiency and 50 hips from asymptomatic controls were analyzed using a standard classification system (Goutallier) and a new scoring system (Quartile). Interobserver reliability and intraobserver repeatability were determined, and accuracy was assessed by comparing readers' scores with quantitative estimates of the proportion of intramuscular fat based on MR signal intensities (gold standard). The existing Goutallier classification system and the new Quartile system performed equally well in assessing fatty degeneration of the gluteus muscles, both showing excellent levels of interrater and intrarater agreement. While the Goutallier classification system has the advantage of being widely known, the benefit of the Quartile system is that it is based on more clearly defined grades of fatty degeneration. Copyright © 2014 Elsevier Inc. All rights reserved.
Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data
NASA Astrophysics Data System (ADS)
Elhag, Mohamed; Boteva, Silvena
2016-10-01
Landscape fragmentation is pronounced in Mediterranean regions and imposes substantial complications on several satellite image classification methods. To some extent, high spatial resolution data can overcome such complications. To improve classification performance in Land Use Land Cover (LULC) mapping, the current research compares different classification methods using Sentinel-2 satellite data as a source of high spatial resolution imagery. Both pixel-based and object-based classification algorithms were assessed; the pixel-based approach employs the Maximum Likelihood (ML), Artificial Neural Network (ANN), and Support Vector Machine (SVM) algorithms, and the object-based classification uses the Nearest Neighbour (NN) classifier. A Stratified Masking Process (SMP) that integrates a ranking process within the classes based on spectral fluctuation of the sum of the training and testing sites was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall, distinguishing most of the classes with the highest accuracy. NN dealt successfully with artificial surface classes in general, while agriculture area classes and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy results than the SMP method overall with both classifiers used, ML and SVM.
The utility of Digital Orthophoto Quads (DOQs) in assessing the classification accuracy of land cover derived from Landsat MSS data was investigated. Initially, the suitability of DOQs in distinguishing between different land cover classes was assessed using high-resolution airbo...
Quantitative falls risk estimation through multi-sensor assessment of standing balance.
Greene, Barry R; McGrath, Denise; Walsh, Lorcan; Doheny, Emer P; McKeown, David; Garattini, Chiara; Cunningham, Clodagh; Crosby, Lisa; Caulfield, Brian; Kenny, Rose A
2012-12-01
Falls are the most common cause of injury and hospitalization and one of the principal causes of death and disability in older adults worldwide. Measures of postural stability have been associated with the incidence of falls in older adults. The aim of this study was to develop a model that accurately classifies fallers and non-fallers using novel multi-sensor quantitative balance metrics that can be easily deployed into a home or clinic setting. We compared the classification accuracy of our model with an established method for falls risk assessment, the Berg balance scale. Data were acquired using two sensor modalities--a pressure sensitive platform sensor and a body-worn inertial sensor, mounted on the lower back--from 120 community dwelling older adults (65 with a history of falls, 55 without, mean age 73.7 ± 5.8 years, 63 female) while performing a number of standing balance tasks in a geriatric research clinic. Results obtained using a support vector machine yielded a mean classification accuracy of 71.52% (95% CI: 68.82-74.28) in classifying falls history, obtained using one model classifying all data points. Considering male and female participant data separately yielded classification accuracies of 72.80% (95% CI: 68.85-77.17) and 73.33% (95% CI: 69.88-76.81) respectively, leading to a mean classification accuracy of 73.07% in identifying participants with a history of falls. Results compare favourably to those obtained using the Berg balance scale (mean classification accuracy: 59.42% (95% CI: 56.96-61.88)). Results from the present study could lead to a robust method for assessing falls risk in both supervised and unsupervised environments.
Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...
NASA Astrophysics Data System (ADS)
Erener, A.
2013-04-01
Automatic extraction of urban features from high resolution satellite images is one of the main applications in remote sensing. It is useful for wide scale applications, namely: urban planning, urban mapping, disaster management, GIS (geographic information systems) updating, and military target detection. One common approach to detecting urban features from high resolution images is to use automatic classification methods. This paper has four main objectives with respect to detecting buildings. The first objective is to compare the performance of the most notable supervised classification algorithms, including the maximum likelihood classifier (MLC) and the support vector machine (SVM). In this experiment the primary consideration is the impact of kernel configuration on the performance of the SVM. The second objective of the study is to explore the suitability of integrating additional bands, namely first principal component (1st PC) and the intensity image, for original data for multi classification approaches. The performance evaluation of classification results is done using two different accuracy assessment methods: pixel based and object based approaches, which reflect the third aim of the study. The objective here is to demonstrate the differences in the evaluation of accuracies of classification methods. Considering consistency, the same set of ground truth data which is produced by labeling the building boundaries in the GIS environment is used for accuracy assessment. Lastly, the fourth aim is to experimentally evaluate variation in the accuracy of classifiers for six different real situations in order to identify the impact of spatial and spectral diversity on results. The method is applied to Quickbird images for various urban complexity levels, extending from simple to complex urban patterns. The simple surface type includes a regular urban area with low density and systematic buildings with brick rooftops. 
The complex surface type involves almost all kinds of challenges, such as high-density built-up areas, regions with bare soil, and small and large buildings with different rooftops, such as concrete, brick, and metal. Using the pixel-based accuracy assessment, it was shown that the percent building detection (PBD) and quality percent (QP) of the MLC and SVM depend on the complexity and texture variation of the region. Generally, PBD values range between 70% and 90% for the MLC and SVM, respectively. No substantial improvements were observed when the SVM and MLC classifications were developed by the addition of more variables, instead of the use of only four bands. In the evaluation of object-based accuracy assessment, it was demonstrated that while MLC and SVM provide higher rates of correct detection, they also provide higher rates of false alarms.
NASA Astrophysics Data System (ADS)
Bangs, Corey F.; Kruse, Fred A.; Olsen, Chris R.
2013-05-01
Hyperspectral data were assessed to determine the effect of integrating spectral data and extracted texture feature data on classification accuracy. Four separate spectral ranges (hundreds of spectral bands total) were used from the Visible and Near Infrared (VNIR) and Shortwave Infrared (SWIR) portions of the electromagnetic spectrum. Haralick texture features (contrast, entropy, and correlation) were extracted from the average gray-level image for each of the four spectral ranges studied. A maximum likelihood classifier was trained using a set of ground truth regions of interest (ROIs) and applied separately to the spectral data, texture data, and a fused dataset containing both. Classification accuracy was measured by comparison of results to a separate verification set of test ROIs. Analysis indicates that the spectral range (source of the gray-level image) used to extract the texture feature data has a significant effect on the classification accuracy. This result applies to texture-only classifications as well as the classification of integrated spectral data and texture feature data sets. Overall classification improvement for the integrated data sets was near 1%. Individual improvement for integrated spectral and texture classification of the "Urban" class showed approximately 9% accuracy increase over spectral-only classification. Texture-only classification accuracy was highest for the "Dirt Path" class at approximately 92% for the spectral range from 947 to 1343nm. This research demonstrates the effectiveness of texture feature data for more accurate analysis of hyperspectral data and the importance of selecting the correct spectral range to be used for the gray-level image source to extract these features.
A PIXEL COMPOSITION-BASED REFERENCE DATA SET FOR THEMATIC ACCURACY ASSESSMENT
Developing reference data sets for accuracy assessment of land-cover classifications derived from coarse spatial resolution sensors such as MODIS can be difficult due to the large resolution differences between the image data and available reference data sources. Ideally, the spa...
EXhype: A tool for mineral classification using hyperspectral data
NASA Astrophysics Data System (ADS)
Adep, Ramesh Nityanand; Shetty, Amba; Ramesh, H.
2017-02-01
Various supervised classification algorithms have been developed to classify earth surface features using hyperspectral data. Each algorithm is modelled on different human expertise. However, the performance of conventional algorithms is not satisfactory, especially for mapping minerals, in view of their typical spectral responses. This study introduces a new expert system named 'EXhype (Expert system for hyperspectral data classification)' to map minerals. The system incorporates human expertise at several stages of its implementation: (i) to deal with intra-class variation; (ii) to identify absorption features; (iii) to discriminate spectra by considering absorption features, non-absorption features and by full spectra comparison; and (iv) to take a final decision based on learning and by emphasizing the most important features. It is developed using a knowledge base consisting of an Optimal Spectral Library, Segmented Upper Hull method, Spectral Angle Mapper (SAM) and Artificial Neural Network. The performance of EXhype is compared with the traditional, most commonly used SAM algorithm using Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data acquired over Cuprite, Nevada, USA. A virtual verification method is used to collect sample information for accuracy assessment. Further, a modified accuracy assessment method is used to obtain real user's accuracies in cases where only limited or desired classes are considered for classification. With the modified accuracy assessment method, SAM and EXhype yield overall accuracies of 60.35% and 90.75% and kappa coefficients of 0.51 and 0.89, respectively. It was also found that the virtual verification method allows the use of the preferred stratified random sampling method and eliminates all the difficulties associated with it. The experimental results show that EXhype not only produces better accuracy than traditional SAM but can also correctly classify the minerals.
It is proficient in avoiding misclassification between target classes when applied on minerals.
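The Spectral Angle Mapper used as the baseline above can be illustrated with a minimal sketch. This is not the EXhype implementation or the authors' code; the function names, the four-band spectra, and the 0.1 radian threshold are all hypothetical and chosen only to show how a pixel spectrum is compared with library spectra by angle.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)

def classify_sam(pixel, library, max_angle=0.1):
    """Return the library label with the smallest angle, or None if no
    reference falls within max_angle (a hypothetical threshold)."""
    best_label, best_angle = None, float("inf")
    for label, reference in library.items():
        angle = spectral_angle(pixel, reference)
        if angle < best_angle:
            best_label, best_angle = label, angle
    return best_label if best_angle <= max_angle else None

# Hypothetical four-band reference spectra (not real mineral data).
library = {
    "mineral_a": [0.2, 0.4, 0.6, 0.8],
    "mineral_b": [0.8, 0.6, 0.4, 0.2],
}
# A darker pixel with the same spectral shape as mineral_a.
label = classify_sam([0.1, 0.2, 0.3, 0.4], library)
```

Because the angle ignores vector length, the darker pixel still matches the reference with the same shape, which is the usual reason given for SAM's tolerance to illumination differences.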
Developing collaborative classifiers using an expert-based model
Mountrakis, G.; Watts, R.; Luo, L.; Wang, Jingyuan
2009-01-01
This paper presents a hierarchical, multi-stage adaptive strategy for image classification. We iteratively apply various classification methods (e.g., decision trees, neural networks), identify regions of parametric and geographic space where accuracy is low, and in these regions, test and apply alternate methods repeating the process until the entire image is classified. Currently, classifiers are evaluated through human input using an expert-based system; therefore, this paper acts as the proof of concept for collaborative classifiers. Because we decompose the problem into smaller, more manageable sub-tasks, our classification exhibits increased flexibility compared to existing methods since classification methods are tailored to the idiosyncrasies of specific regions. A major benefit of our approach is its scalability and collaborative support since selected low-accuracy classifiers can be easily replaced with others without affecting classification accuracy in high accuracy areas. At each stage, we develop spatially explicit accuracy metrics that provide straightforward assessment of results by non-experts and point to areas that need algorithmic improvement or ancillary data. Our approach is demonstrated in the task of detecting impervious surface areas, an important indicator for human-induced alterations to the environment, using a 2001 Landsat scene from Las Vegas, Nevada. © 2009 American Society for Photogrammetry and Remote Sensing.
Schmidt, Robert L; Walker, Brandon S; Cohen, Michael B
2015-03-01
Reliable estimates of accuracy are important for any diagnostic test. Diagnostic accuracy studies are subject to unique sources of bias. Verification bias and classification bias are 2 sources of bias that commonly occur in diagnostic accuracy studies. Statistical methods are available to estimate the impact of these sources of bias when they occur alone. The impact of interactions when these types of bias occur together has not been investigated. We developed mathematical relationships to show the combined effect of verification bias and classification bias. A wide range of case scenarios were generated to assess the impact of bias components and interactions on total bias. Interactions between verification bias and classification bias caused overestimation of sensitivity and underestimation of specificity. Interactions had more effect on sensitivity than specificity. Sensitivity was overestimated by at least 7% in approximately 6% of the tested scenarios. Specificity was underestimated by at least 7% in less than 0.1% of the scenarios. Interactions between verification bias and classification bias create distortions in accuracy estimates that are greater than would be predicted from each source of bias acting independently. © 2014 American Cancer Society.
A new self-report inventory of dyslexia for students: criterion and construct validity.
Tamboer, Peter; Vorst, Harrie C M
2015-02-01
The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. All together, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Less than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
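The classification accuracy and predictive values reported above follow from a two-by-two table of predicted against criterion status. The sketch below shows the arithmetic only; the counts are hypothetical and are not taken from the study, and the function name is an assumption.

```python
def predictive_values(tp, fp, fn, tn):
    """Positive/negative predictive value and overall accuracy from a 2x2 table.

    tp: predicted positive, criterion positive
    fp: predicted positive, criterion negative
    fn: predicted negative, criterion positive
    tn: predicted negative, criterion negative
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return ppv, npv, accuracy

# Hypothetical counts for 200 students (illustration only).
ppv, npv, accuracy = predictive_values(tp=45, fp=5, fn=10, tn=140)
```

Predictive values depend on how common the condition is in the sample, so the same inventory could give different values in a population with a different base rate.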
NASA Astrophysics Data System (ADS)
Liu, Tao; Im, Jungho; Quackenbush, Lindi J.
2015-12-01
This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features-two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM)-were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.
Marciano, Michael A; Adelman, Jonathan D
2017-03-01
The deconvolution of DNA mixtures remains one of the most critical challenges in the field of forensic DNA analysis. In addition, of all the data features required to perform such deconvolution, the number of contributors in the sample is widely considered the most important, and, if incorrectly chosen, the most likely to negatively influence the mixture interpretation of a DNA profile. Unfortunately, most current approaches to mixture deconvolution require the assumption that the number of contributors is known by the analyst, an assumption that can prove to be especially faulty when faced with increasingly complex mixtures of 3 or more contributors. In this study, we propose a probabilistic approach for estimating the number of contributors in a DNA mixture that leverages the strengths of machine learning. To assess this approach, we compare classification performances of six machine learning algorithms and evaluate the model from the top-performing algorithm against the current state of the art in the field of contributor number classification. Overall results show over 98% accuracy in identifying the number of contributors in a DNA mixture of up to 4 contributors. Comparative results showed 3-person mixtures had a classification accuracy improvement of over 6% compared to the current best-in-field methodology, and that 4-person mixtures had a classification accuracy improvement of over 20%. The Probabilistic Assessment for Contributor Estimation (PACE) also accomplishes classification of mixtures of up to 4 contributors in less than 1s using a standard laptop or desktop computer. Considering the high classification accuracy rates, as well as the significant time commitment required by the current state of the art model versus seconds required by a machine learning-derived model, the approach described herein provides a promising means of estimating the number of contributors and, subsequently, will lead to improved DNA mixture interpretation. 
Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Classification of urban features using airborne hyperspectral data
NASA Astrophysics Data System (ADS)
Ganesh Babu, Bharath
Accurate mapping and modeling of urban environments are critical for their efficient and successful management. Superior understanding of complex urban environments is made possible by using modern geospatial technologies. This research focuses on thematic classification of urban land use and land cover (LULC) using 248 bands of 2.0 meter resolution hyperspectral data acquired from an airborne imaging spectrometer (AISA+) on 24th July 2006 in and near Terre Haute, Indiana. Three distinct study areas including two commercial classes, two residential classes, and two urban parks/recreational classes were selected for classification and analysis. Four commonly used classification methods -- maximum likelihood (ML), extraction and classification of homogeneous objects (ECHO), spectral angle mapper (SAM), and iterative self organizing data analysis (ISODATA) - were applied to each data set. Accuracy assessment was conducted and overall accuracies were compared between the twenty four resulting thematic maps. With the exception of SAM and ISODATA in a complex commercial area, all methods employed classified the designated urban features with more than 80% accuracy. The thematic classification from ECHO showed the best agreement with ground reference samples. The residential area with relatively homogeneous composition was classified consistently with highest accuracy by all four of the classification methods used. The average accuracy amongst the classifiers was 93.60% for this area. When individually observed, the complex recreational area (Deming Park) was classified with the highest accuracy by ECHO, with an accuracy of 96.80% and 96.10% Kappa. The average accuracy amongst all the classifiers was 92.07%. The commercial area with relatively high complexity was classified with the least accuracy by all classifiers. The lowest accuracy was achieved by SAM at 63.90% with 59.20% Kappa. This was also the lowest accuracy in the entire analysis. 
This study demonstrates the potential for using the visible and near infrared (VNIR) bands from AISA+ hyperspectral data in urban LULC classification. Based on their performance, the need for further research using ECHO and SAM is underscored. The importance of incorporating imaging spectrometer data in high resolution urban feature mapping is emphasized.
Simulation of seagrass bed mapping by satellite images based on the radiative transfer model
NASA Astrophysics Data System (ADS)
Sagawa, Tatsuyuki; Komatsu, Teruhisa
2015-06-01
Seagrass and seaweed beds play important roles in coastal marine ecosystems. They are food sources and habitats for many marine organisms, and influence the physical, chemical, and biological environment. They are sensitive to human impacts such as reclamation and pollution. Therefore, their management and preservation are necessary for a healthy coastal environment. Satellite remote sensing is a useful tool for mapping and monitoring seagrass beds. The efficiency of seagrass mapping, seagrass bed classification in particular, has been evaluated by mapping accuracy using an error matrix. However, mapping accuracies are influenced by coastal environments such as seawater transparency, bathymetry, and substrate type. Coastal management requires sufficient accuracy and an understanding of mapping limitations for monitoring coastal habitats including seagrass beds. Previous studies are mainly based on case studies in specific regions and seasons. Extensive data are required to generalise assessments of classification accuracy from case studies, which has proven difficult. This study aims to build a simulator based on a radiative transfer model to produce modelled satellite images and assess the visual detectability of seagrass beds under different transparencies and seagrass coverages, as well as to examine mapping limitations and classification accuracy. Our simulations led to the development of a model of water transparency and the mapping of depth limits and indicated the possibility for seagrass density mapping under certain ideal conditions. The results show that modelling satellite images is useful in evaluating the accuracy of classification and that establishing seagrass bed monitoring by remote sensing is a reliable tool.
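The error matrix mentioned above cross-tabulates map classes against reference classes. A minimal sketch of the usual summary measures follows, assuming rows hold the mapped class and columns the reference class; the two-class counts are hypothetical and the function name is an assumption.

```python
def accuracy_from_error_matrix(matrix):
    """Overall, user's and producer's accuracy from a square error matrix.

    matrix[i][j] = number of samples mapped as class i whose reference
    class is j (row = map, column = reference; an assumed convention).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(k)]
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    overall = sum(diagonal) / n
    users = [diagonal[i] / row_totals[i] for i in range(k)]       # per map class
    producers = [diagonal[j] / col_totals[j] for j in range(k)]   # per reference class
    return overall, users, producers

# Hypothetical counts: class 0 = seagrass, class 1 = sand.
matrix = [[45, 5],
          [15, 35]]
overall, users, producers = accuracy_from_error_matrix(matrix)
```

User's accuracy answers how often a mapped class is correct on the ground, while producer's accuracy answers how much of a reference class the map captured; the two can differ markedly for the same class.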
EFFECTS OF LANDSCAPE CHARACTERISTICS ON LAND-COVER CLASS ACCURACY
Utilizing land-cover data gathered as part of the National Land-Cover Data (NLCD) set accuracy assessment, several logistic regression models were formulated to analyze the effects of patch size and land-cover heterogeneity on classification accuracy. Specific land-cover ...
Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, withi...
Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study.
Najdi, Shirin; Gharbali, Ali Abdollahi; Fonseca, José Manuel
2017-08-18
Nowadays, sleep quality is one of the most important measures of healthy life, especially considering the huge number of sleep-related disorders. Identifying sleep stages using polysomnographic (PSG) signals is the traditional way of assessing sleep quality. However, the manual process of sleep stage classification is time-consuming, subjective and costly. Therefore, in order to improve the accuracy and efficiency of the sleep stage classification, researchers have been trying to develop automatic classification algorithms. Automatic sleep stage classification mainly consists of three steps: pre-processing, feature extraction and classification. Since classification accuracy is deeply affected by the extracted features, a poor feature vector will adversely affect the classifier and eventually lead to low classification accuracy. Therefore, special attention should be given to the feature extraction and selection process. In this paper the performance of seven feature selection methods, as well as two feature rank aggregation methods, was compared. Pz-Oz EEG, horizontal EOG and submental chin EMG recordings of 22 healthy males and females were used. A comprehensive feature set including 49 features was extracted from these recordings. The extracted features are among the most common and effective features used in sleep stage classification from temporal, spectral, entropy-based and nonlinear categories. The feature selection methods were evaluated and compared using three criteria: classification accuracy, stability, and similarity. Simulation results show that MRMR-MID achieves the highest classification performance while the Fisher method provides the most stable ranking. In our simulations, the performance of the aggregation methods was at an average level, although they are known to generate more stable results and better accuracy. The Borda and RRA rank aggregation methods could not significantly outperform the conventional feature ranking methods.
Among the conventional methods, some performed slightly better than others, although the choice of a suitable technique depends on the computational complexity and accuracy requirements of the user.
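Borda rank aggregation, one of the two aggregation methods named above, can be sketched in a few lines. This is a generic Borda count under the assumption that every ranking lists the same features from best to worst; it is not the authors' code, and the feature names and tie-breaking rule are hypothetical.

```python
def borda_aggregate(rankings):
    """Combine several feature rankings with a Borda count.

    Each ranking lists the same features from best to worst. A feature
    in position p of a ranking of n features earns n - 1 - p points.
    Ties in total score are broken alphabetically (an arbitrary choice).
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] = scores.get(feature, 0) + (n - 1 - position)
    return sorted(scores, key=lambda feature: (-scores[feature], feature))

# Hypothetical rankings produced by three feature selection methods.
rankings = [
    ["f1", "f2", "f3"],
    ["f2", "f1", "f3"],
    ["f1", "f3", "f2"],
]
aggregated = borda_aggregate(rankings)
```

Here f1 collects 5 points, f2 collects 3 and f3 collects 1, so the aggregated order follows the feature that most methods place near the top.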
How reliable and accurate is the AO/OTA comprehensive classification for adult long-bone fractures?
Meling, Terje; Harboe, Knut; Enoksen, Cathrine H; Aarflot, Morten; Arthursson, Astvaldur J; Søreide, Kjetil
2012-07-01
Reliable classification of fractures is important for treatment allocation and study comparisons. The overall accuracy of scoring applied to a general population of fractures is little known. This study aimed to investigate the accuracy and reliability of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for adult long-bone fractures and identify factors associated with poor coding agreement. Adults (>16 years) with long-bone fractures coded in a Fracture and Dislocation Registry at the Stavanger University Hospital during the fiscal year 2008 were included. An unblinded reference code dataset was generated for the overall accuracy assessment by two experienced orthopedic trauma surgeons. Blinded analysis of intrarater reliability was performed by rescoring and of interrater reliability by recoding of a randomly selected fracture sample. Proportion of agreement (PA) and kappa (κ) statistics are presented. Uni- and multivariate logistic regression analyses of factors predicting accuracy were performed. During the study period, 949 fractures were included and coded by 26 surgeons. For the intrarater analysis, overall agreements were κ = 0.67 (95% confidence interval [CI]: 0.64-0.70) and PA 69%. For interrater assessment, κ = 0.67 (95% CI: 0.62-0.72) and PA 69%. The accuracy of surgeons' blinded recoding was κ = 0.68 (95% CI: 0.65- 0.71) and PA 68%. Fracture type, frequency of the fracture, and segment fractured significantly influenced accuracy whereas the coder's experience did not. Both the reliability and accuracy of the comprehensive Arbeitsgemeinschaft für Osteosynthesefragen/Orthopedic Trauma Association classification for long-bone fractures ranged from substantial to excellent. Variations in coding accuracy seem to be related more to the fracture itself than the surgeon. Diagnostic study, level I.
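The proportion of agreement (PA) and kappa statistics reported above can be computed from two lists of codes for the same cases. The sketch below uses the standard unweighted Cohen's kappa for two raters; the codes are hypothetical and the function name is an assumption.

```python
from collections import Counter

def agreement_and_kappa(codes_a, codes_b):
    """Proportion of agreement and unweighted Cohen's kappa for two raters."""
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical codes assigned to ten fractures by two raters.
rater_1 = ["A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]
rater_2 = ["A", "A", "A", "B", "B", "B", "B", "B", "B", "A"]
pa, kappa = agreement_and_kappa(rater_1, rater_2)
```

With these codes the raters agree on 8 of 10 cases, chance agreement is 0.52, and kappa is 0.28 / 0.48, roughly 0.58, which shows how kappa discounts agreement that the marginal frequencies alone would produce.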
Dumitru Salajanu; Dennis M. Jacobs
2007-01-01
The objective of this study was to determine how well forest/non-forest and biomass classifications obtained from Landsat-TM and MODIS satellite data modeled with FIA plots compare to each other and with forested area and biomass estimates from the national inventory data, as well as whether there is an increase in overall accuracy when pixel size (spatial resolution...
Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C
2016-09-01
The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was twofold: first, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD; second, to compare the DSM-5 and DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
Multi-Temporal Classification and Change Detection Using UAV Images
NASA Astrophysics Data System (ADS)
Makuti, S.; Nex, F.; Yang, M. Y.
2018-05-01
In this paper different methodologies for the classification and change detection of UAV image blocks are explored. UAV is not only the cheapest platform for image acquisition but it is also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state-of-the-art features has been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such a challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method. This was analysed using the overall accuracy, whereby the post-classification approach had an accuracy of up to 62.6 % and the pre-classification change detection had an accuracy of 46.5 %. These results represent a first useful indication for future works and developments.
Thematic accuracy of the National Land Cover Database (NLCD) 2001 land cover for Alaska
Selkowitz, D.J.; Stehman, S.V.
2011-01-01
The National Land Cover Database (NLCD) 2001 Alaska land cover classification is the first 30-m resolution land cover product available covering the entire state of Alaska. The accuracy assessment of the NLCD 2001 Alaska land cover classification employed a geographically stratified three-stage sampling design to select the reference sample of pixels. Reference land cover class labels were determined via fixed wing aircraft, as the high resolution imagery used for determining the reference land cover classification in the conterminous U.S. was not available for most of Alaska. Overall thematic accuracy for the Alaska NLCD was 76.2% (s.e. 2.8%) at Level II (12 classes evaluated) and 83.9% (s.e. 2.1%) at Level I (6 classes evaluated) when agreement was defined as a match between the map class and either the primary or alternate reference class label. When agreement was defined as a match between the map class and primary reference label only, overall accuracy was 59.4% at Level II and 69.3% at Level I. The majority of classification errors occurred at Level I of the classification hierarchy (i.e., misclassifications were generally to a different Level I class, not to a Level II class within the same Level I class). Classification accuracy was higher for more abundant land cover classes and for pixels located in the interior of homogeneous land cover patches. © 2011.
ERIC Educational Resources Information Center
Montoya, Isaac D.
2008-01-01
Three classification techniques (Chi-square Automatic Interaction Detection [CHAID], Classification and Regression Tree [CART], and discriminant analysis) were tested to determine their accuracy in predicting Temporary Assistance for Needy Families program recipients' future employment. Technique evaluation was based on proportion of correctly…
Karan, Shivesh Kishore; Samadder, Sukha Ranjan
2016-08-01
One objective of the present study was to evaluate the performance of support vector machine (SVM)-based image classification technique with the maximum likelihood classification (MLC) technique for a rapidly changing landscape of an open-cast mine. The other objective was to assess the change in land use pattern due to coal mining from 2006 to 2016. Assessing the change in land use pattern accurately is important for the development and monitoring of coalfields in conjunction with sustainable development. For the present study, Landsat 5 Thematic Mapper (TM) data of 2006 and Landsat 8 Operational Land Imager (OLI)/Thermal Infrared Sensor (TIRS) data of 2016 of a part of Jharia Coalfield, Dhanbad, India, were used. The SVM classification technique provided greater overall classification accuracy when compared to the MLC technique in classifying heterogeneous landscape with limited training dataset. SVM exceeded MLC in handling a difficult challenge of classifying features having near similar reflectance on the mean signature plot, an improvement of over 11 % was observed in classification of built-up area, and an improvement of 24 % was observed in classification of surface water using SVM; similarly, the SVM technique improved the overall land use classification accuracy by almost 6 and 3 % for Landsat 5 and Landsat 8 images, respectively. Results indicated that land degradation increased significantly from 2006 to 2016 in the study area. This study will help in quantifying the changes and can also serve as a basis for further decision support system studies aiding a variety of purposes such as planning and management of mines and environmental impact assessment.
NASA Astrophysics Data System (ADS)
Kamal, Muhammad; Johansen, Kasper
2017-10-01
Effective mangrove management requires spatially explicit mangrove tree crown maps as a basis for ecosystem diversity study and health assessment. Accuracy assessment is an integral part of any mapping activity to measure the effectiveness of the classification approach. In geographic object-based image analysis (GEOBIA) the assessment of the geometric accuracy (shape, symmetry and location) of the created image objects from image segmentation is required. In this study we used an explicit area-based accuracy assessment to measure the degree of similarity between the results of the classification and reference data from different aspects, including overall quality (OQ), user's accuracy (UA), producer's accuracy (PA) and overall accuracy (OA). We developed a rule set to delineate the mangrove tree crown using WorldView-2 pan-sharpened image. The reference map was obtained by visual delineation of the mangrove tree crown boundaries from a very high-spatial resolution aerial photograph (7.5 cm pixel size). Ten random points with a 10 m radius circular buffer were created to calculate the area-based accuracy assessment. The resulting circular polygons were used to clip both the classified image objects and reference map for area comparisons. In this case, the area-based accuracy assessment yielded 64% and 68% for the OQ and OA, respectively. The overall quality expresses the class-related area accuracy: the area correctly classified as tree crowns was 64% of the total area of tree crowns. On the other hand, the overall accuracy of 68% was calculated as the percentage of all correctly classified classes (tree crowns and canopy gaps) in comparison to the total class area (an entire image). Overall, the area-based accuracy assessment was simple to implement and easy to interpret. It also shows explicitly the omission and commission error variations of object boundary delineation with colour coded polygons.
Baltzer, Pascal A T; Dietzel, Matthias; Kaiser, Werner A
2013-08-01
In the face of multiple available diagnostic criteria in MR-mammography (MRM), a practical algorithm for lesion classification is needed. Such an algorithm should be as simple as possible and include only important independent lesion features to differentiate benign from malignant lesions. This investigation aimed to develop a simple classification tree for differential diagnosis in MRM. A total of 1,084 lesions in standardised MRM with subsequent histological verification (648 malignant, 436 benign) were investigated. Seventeen lesion criteria were assessed by 2 readers in consensus. Classification analysis was performed using the chi-squared automatic interaction detection (CHAID) method. Results include the probability for malignancy for every descriptor combination in the classification tree. A classification tree incorporating 5 lesion descriptors with a depth of 3 ramifications (1, root sign; 2, delayed enhancement pattern; 3, border, internal enhancement and oedema) was calculated. Of all 1,084 lesions, 262 (40.4 %) and 106 (24.3 %) could be classified as malignant and benign with an accuracy above 95 %, respectively. Overall diagnostic accuracy was 88.4 %. The classification algorithm reduced the number of categorical descriptors from 17 to 5 (29.4 %), resulting in a high classification accuracy. More than one third of all lesions could be classified with accuracy above 95 %. • A practical algorithm has been developed to classify lesions found in MR-mammography. • A simple decision tree consisting of five criteria reaches high accuracy of 88.4 %. • Unique to this approach, each classification is associated with a diagnostic certainty. • Diagnostic certainty of greater than 95 % is achieved in 34 % of all cases.
Spatial and thematic assessment of object-based forest stand delineation using an OFA-matrix
NASA Astrophysics Data System (ADS)
Hernando, A.; Tiede, D.; Albrecht, F.; Lang, S.
2012-10-01
The delineation and classification of forest stands is a crucial aspect of forest management. Object-based image analysis (OBIA) can be used to produce detailed maps of forest stands from either orthophotos or very high resolution satellite imagery. However, measures are then required for evaluating and quantifying both the spatial and thematic accuracy of the OBIA output. In this paper we present an approach for delineating forest stands and a new Object Fate Analysis (OFA) matrix for accuracy assessment. A two-level object-based orthophoto analysis was first carried out to delineate stands on the Dehesa Boyal public land in central Spain (Avila Province). Two structural features were first created for use in class modelling, enabling good differentiation between stands: a relational tree cover cluster feature, and an arithmetic ratio shadow/tree feature. We then extended the OFA comparison approach with an OFA-matrix to enable concurrent validation of thematic and spatial accuracies. Its diagonal shows the proportion of spatial and thematic coincidence between a reference data and the corresponding classification. New parameters for Spatial Thematic Loyalty (STL), Spatial Thematic Loyalty Overall (STLOVERALL) and Maximal Interfering Object (MIO) are introduced to summarise the OFA-matrix accuracy assessment. A stands map generated by OBIA (classification data) was compared with a map of the same area produced from photo interpretation and field data (reference data). In our example the OFA-matrix results indicate good spatial and thematic accuracies (>65%) for all stand classes except for the shrub stands (31.8%), and a good STLOVERALL (69.8%). The OFA-matrix has therefore been shown to be a valid tool for OBIA accuracy assessment.
ASSESSMENT OF LANDSCAPE CHARACTERISTICS ON THEMATIC IMAGE CLASSIFICATION ACCURACY
Landscape characteristics such as small patch size and land cover heterogeneity have been hypothesized to increase the likelihood of misclassifying pixels during thematic image classification. However, there has been a lack of empirical evidence to support these hypotheses. This...
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
With the rapid development of sensor technology, high spatial resolution imagery and airborne Lidar point clouds can now be captured, which makes classification, extraction, evaluation and analysis of a broad range of object features possible. High resolution imagery, Lidar datasets and parcel maps can be widely used for classification as information carriers. Therefore, refinement of object classification is made possible for urban land cover. The paper presents an approach to object based image analysis (OBIA) combining high spatial resolution imagery and airborne Lidar point clouds. The advanced workflow for urban land cover is designed with four components. Firstly, the colour-infrared TrueOrtho photo and laser point clouds were pre-processed to derive the parcel map of water bodies and the nDSM, respectively. Secondly, image objects are created via multi-resolution image segmentation integrating the scale parameter and the colour and shape properties with a compactness criterion, so that the image can be subdivided into separate object regions. Thirdly, image object classification is performed on the basis of the segmentation and a rule set of a knowledge decision tree. The image objects are classified into six classes: water bodies, low vegetation/grass, tree, low building, high building and road. Finally, in order to assess the validity of the classification results for the six classes, accuracy assessment is performed by comparing randomly distributed reference points of the TrueOrtho imagery with the classification results, forming the confusion matrix and calculating overall accuracy and Kappa coefficient. The study area focuses on test site Vaihingen/Enz and a patch of test datasets comes from the benchmark of ISPRS WG III/4 test project. The classification results show higher overall accuracy for most types of urban land cover. Overall accuracy is 89.5% and the Kappa coefficient equals 0.865. The OBIA approach provides an effective and convenient way to combine high resolution imagery and Lidar ancillary data for classification of urban land cover.
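The final step of the workflow above (confusion matrix, then overall accuracy and Kappa coefficient) can be illustrated with a minimal sketch. The two-class matrix and the function name `overall_accuracy_and_kappa` are hypothetical and are not taken from the paper; the formulas are the standard definitions of overall accuracy and Cohen's kappa.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of reference points that fall in class i on
    one data set and class j on the other; both statistics are symmetric,
    so it does not matter which axis holds the map and which the reference.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of points on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only (not the Vaihingen/Enz data).
example_matrix = [[45, 5],
                  [10, 40]]
oa, kappa = overall_accuracy_and_kappa(example_matrix)
```

For this made-up matrix the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is 0.70; kappa is always lower than overall accuracy whenever some agreement is expected by chance.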
Faber-Langendoen, D.; Aaseng, N.; Hop, K.; Lew-Smith, M.; Drake, J.
2007-01-01
Question: How can the U.S. National Vegetation Classification (USNVC) serve as an effective tool for classifying and mapping vegetation, and inform assessments and monitoring? Location: Voyageurs National Park, northern Minnesota, U.S.A. and environs. The park contains 54 243 ha of terrestrial habitat in the sub-boreal region of North America. Methods: We classified and mapped the natural vegetation using the USNVC, with 'alliance' and 'association' as base units. We compiled 259 classification plots and 1251 accuracy assessment test plots. Both plot and type ordinations were used to analyse vegetation and environmental patterns. Color infrared aerial photography (1:15840 scale) was used for mapping. Polygons were manually drawn, then transferred into digital form. Classification and mapping products are stored in publicly available databases. Past fire and logging events were used to assess distribution of forest types. Results and Discussion: Ordination and cluster analyses confirmed 49 associations and 42 alliances, with three associations ranked as globally vulnerable to extirpation. Ordination provided a useful summary of vegetation and ecological gradients. Overall map accuracy was 82.4%. Pinus banksiana - Picea mariana forests were less frequent in areas unburned since the 1930s. Conclusion: The USNVC provides a consistent ecological tool for summarizing and mapping vegetation. The products provide a baseline for assessing forests and wetlands, including fire management. The standardized classification and map units provide local to continental perspectives on park resources through linkages to state, provincial, and national classifications in the U.S. and Canada, and to NatureServe's Ecological Systems classification. © IAVS; Opulus Press.
Bredesen, Ida Marie; Bjøro, Karen; Gunningberg, Lena; Hofoss, Dag
2016-05-01
Pressure ulcers (PUs) are a problem in health care. Staff competency is paramount to PU prevention. Education is essential to increase skills in pressure ulcer classification and risk assessment. Currently, no pressure ulcer learning programs are available in Norwegian. Develop and test an e-learning program for assessment of pressure ulcer risk and pressure ulcer classification. Forty-four nurses working in acute care hospital wards or nursing homes participated and were assigned randomly into two groups: an e-learning program group (intervention) and a traditional classroom lecture group (control). Data was collected immediately before and after training, and again after three months. The study was conducted at one nursing home and two hospitals between May and December 2012. Accuracy of risk assessment (five patient cases) and pressure ulcer classification (40 photos [normal skin, pressure ulcer categories I-IV] split in two sets) were measured by comparing nurse evaluations in each of the two groups to a pre-established standard based on ratings by experts in pressure ulcer classification and risk assessment. Inter-rater reliability was measured by exact percent agreement and multi-rater Fleiss kappa. A Mann-Whitney U test was used for continuous sum score variables. An e-learning program did not improve Braden subscale scoring. For pressure ulcer classification, however, the intervention group scored significantly higher than the control group on several of the categories in post-test immediately after training. However, after three months there were no significant differences in classification skills between the groups. An e-learning program appears to have a greater effect on the accuracy of pressure ulcer classification than classroom teaching in the short term. For proficiency in Braden scoring, no significant effect of educational methods on learning results was detected. Copyright © 2016 Elsevier Ltd. All rights reserved.
Combining accuracy assessment of land-cover maps with environmental monitoring programs
Stehman, S.V.; Czaplewski, R.L.; Nusser, S.M.; Yang, L.; Zhu, Z.
2000-01-01
A scientifically valid accuracy assessment of a large-area, land-cover map is expensive. Environmental monitoring programs offer a potential source of data to partially defray the cost of accuracy assessment while still maintaining the statistical validity. In this article, three general strategies for combining accuracy assessment and environmental monitoring protocols are described. These strategies range from a fully integrated accuracy assessment and environmental monitoring protocol, to one in which the protocols operate nearly independently. For all three strategies, features critical to using monitoring data for accuracy assessment include compatibility of the land-cover classification schemes, precisely co-registered sample data, and spatial and temporal compatibility of the map and reference data. Two monitoring programs, the National Resources Inventory (NRI) and the Forest Inventory and Monitoring (FIM), are used to illustrate important features for implementing a combined protocol.
NASA Astrophysics Data System (ADS)
Karakacan Kuzucu, A.; Bektas Balcik, F.
2017-11-01
Accurate and reliable land use/land cover (LULC) information obtained by remote sensing technology is necessary in many applications such as environmental monitoring, agricultural management, urban planning, hydrological applications, soil management, vegetation condition study and suitability analysis. However, obtaining this information remains a challenge, especially in heterogeneous landscapes covering urban and rural areas, due to spectrally similar LULC features. In parallel with technological developments, supplementary data such as satellite-derived spectral indices have begun to be used as additional bands in classification to produce data with high accuracy. The aim of this research is to test the potential of combining spectral vegetation indices with supervised classification methods and to extract reliable LULC information from SPOT 7 multispectral imagery. The Normalized Difference Vegetation Index (NDVI), the Ratio Vegetation Index (RATIO) and the Soil Adjusted Vegetation Index (SAVI) were the three vegetation indices used in this study. The classical maximum likelihood classifier (MLC) and the support vector machine (SVM) algorithm were applied to classify the SPOT 7 image. The selected study region, Catalca, is located in the northwest of Istanbul, Turkey, and has a complex landscape covering artificial surface, forest and natural area, agricultural field, quarry/mining area, pasture/scrubland and water body. Accuracy assessment of all classified images was performed through overall accuracy and kappa coefficient. The results indicated that the incorporation of these three vegetation indices decreased the classification accuracy for both the MLC and SVM classifications. In addition, the maximum likelihood classification slightly outperformed the support vector machine classification approach in both overall accuracy and kappa statistics.
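For reference, the three indices named above are simple band arithmetic on red and near-infrared (NIR) reflectance. The sketch below uses the commonly cited definitions; the reflectance values, the function names, and the soil adjustment factor of 0.5 (a widely used default) are assumptions for illustration, since the abstract does not state the parameters used in the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def ratio_vi(nir, red):
    """Ratio Vegetation Index (simple ratio): NIR / Red."""
    return nir / red


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index.

    soil_factor is the soil brightness correction L; 0.5 is an assumed
    default here, not a value reported in the abstract.
    """
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical surface reflectances for a vegetated pixel.
nir_value, red_value = 0.5, 0.1
indices = {
    "NDVI": ndvi(nir_value, red_value),
    "RATIO": ratio_vi(nir_value, red_value),
    "SAVI": savi(nir_value, red_value),
}
```

All three indices are functions of the same two bands, so they are strongly correlated with each other and with the original bands; that redundancy is one plausible reason, offered here only as a conjecture, why adding them as extra bands might not improve a classifier.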
Multiple confidence estimates as indices of eyewitness memory.
Sauer, James D; Brewer, Neil; Weber, Nathan
2008-08-01
Eyewitness identification decisions are vulnerable to various influences on witnesses' decision criteria that contribute to false identifications of innocent suspects and failures to choose perpetrators. An alternative procedure using confidence estimates to assess the degree of match between novel and previously viewed faces was investigated. Classification algorithms were applied to participants' confidence data to determine when a confidence value or pattern of confidence values indicated a positive response. Experiment 1 compared confidence group classification accuracy with a binary decision control group's accuracy on a standard old-new face recognition task and found superior accuracy for the confidence group for target-absent trials but not for target-present trials. Experiment 2 used a face mini-lineup task and found reduced target-present accuracy offset by large gains in target-absent accuracy. Using a standard lineup paradigm, Experiments 3 and 4 also found improved classification accuracy for target-absent lineups and, with a more sophisticated algorithm, for target-present lineups. This demonstrates the accessibility of evidence for recognition memory decisions and points to a more sensitive index of memory quality than is afforded by binary decisions.
2013-01-01
Background and purpose: Guidelines for fracture treatment and evaluation require a valid classification. Classifications especially designed for children are available, but they might lead to reduced accuracy, considering the relative infrequency of childhood fractures in a general orthopedic department. We tested the reliability and accuracy of the Müller classification when used for long bone fractures in children. Methods: We included all long bone fractures in children aged < 16 years who were treated in 2008 at the surgical ward of Stavanger University Hospital. 20 surgeons recorded 232 fractures. Datasets were generated for intra- and inter-rater analysis, as well as a reference dataset for accuracy calculations. We present proportion of agreement (PA) and kappa (κ) statistics. Results: For intra-rater analysis, overall agreement (κ) was 0.75 (95% CI: 0.68–0.81) and PA was 79%. For inter-rater assessment, κ was 0.71 (95% CI: 0.61–0.80) and PA was 77%. Accuracy was estimated: κ = 0.72 (95% CI: 0.64–0.79) and PA = 76%. Interpretation: The Müller classification (slightly adjusted for pediatric fractures) showed substantial to excellent accuracy among general orthopedic surgeons when applied to long bone fractures in children. However, separate knowledge about the child-specific fracture pattern, the maturity of the bone, and the degree of displacement must be considered when the treatment and the prognosis of the fractures are evaluated. PMID:23245225
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.; Anderson, J. E.; Brannon, D. P.; Hill, C. L.
1982-01-01
An initial analysis of LANDSAT 4 thematic mapper (TM) data for the delineation and classification of agricultural, forested wetland, and urban land covers was conducted. A study area in Poinsett County, Arkansas was used to evaluate a classification of agricultural lands derived from multitemporal LANDSAT multispectral scanner (MSS) data in comparison with a classification of TM data for the same area. Data over Reelfoot Lake in northwestern Tennessee were utilized to evaluate the TM for delineating forested wetland species. A classification of the study area was assessed for accuracy in discriminating five forested wetland categories. Finally, the TM data were used to identify urban features within a small city. A computer generated classification of Union City, Tennessee was analyzed for accuracy in delineating urban land covers. An evaluation of digitally enhanced TM data using principal components analysis to facilitate photointerpretation of urban features was also performed.
This study applied a phenology-based land-cover classification approach across the Laurentian Great Lakes Basin (GLB) using time-series data consisting of 23 Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI) composite images (250 ...
NASA Astrophysics Data System (ADS)
Sukawattanavijit, Chanika; Srestasathiern, Panu
2017-10-01
Land Use and Land Cover (LULC) information is important for observing and evaluating environmental change. LULC classification from remotely sensed data is a technique widely employed at global and local scales, particularly in urban areas, which have diverse land cover types. These are essential components of the urban terrain and ecosystem. At present, object-based image analysis (OBIA) is becoming widely popular for land cover classification using high-resolution images. COSMO-SkyMed SAR data were fused with THAICHOTE (namely, THEOS: Thailand Earth Observation Satellite) optical data for object-based land cover classification. This paper presents a comparison between object-based and pixel-based approaches in image fusion. For the per-pixel method, support vector machines (SVM) were applied to the fused image based on Principal Component Analysis (PCA). For the object-based method, a nearest neighbor (NN) classifier was applied to the fused images to separate land cover classes. Finally, accuracy assessment was carried out by comparing the land cover classifications generated from the fused image dataset and from the THAICHOTE image. The object-based classification of fused COSMO-SkyMed and THAICHOTE images demonstrated the best classification accuracies, well over 85%. As a result, object-based data fusion provides higher land cover classification accuracy than per-pixel data fusion.
The effect of storage temperature on the accuracy of a cow-side test for ketosis
Hubbard, Jennifer; LeBlanc, Stephen; Duffield, Todd; Bagg, Randal; Dubuc, Jocelyn
2010-01-01
The objective of this study was to assess the effect of storage conditions on the accuracy of a milk test strip for ketosis. Storage at 21°C for up to 18 wk had little effect on accuracy for diagnosis and classification of subclinical ketosis. PMID:20676298
A review of supervised object-based land-cover image classification
NASA Astrophysics Data System (ADS)
Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue
2017-08-01
Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. 
In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.
Mapping Winter Wheat with Multi-Temporal SAR and Optical Images in an Urban Agricultural Region
Zhou, Tao; Pan, Jianjun; Zhang, Peiyu; Wei, Shanbao; Han, Tao
2017-01-01
Winter wheat is the second largest food crop in China. It is important to obtain reliable winter wheat acreage to guarantee the food security for the most populous country in the world. This paper focuses on assessing the feasibility of in-season winter wheat mapping and investigating potential classification improvement by using SAR (Synthetic Aperture Radar) images, optical images, and the integration of both types of data in urban agricultural regions with complex planting structures in Southern China. Both SAR (Sentinel-1A) and optical (Landsat-8) data were acquired, and classification using different combinations of Sentinel-1A-derived information and optical images was performed using a support vector machine (SVM) and a random forest (RF) method. The interference coherence and texture images were obtained and used to assess the effect of adding them to the backscatter intensity images on the classification accuracy. The results showed that the use of four Sentinel-1A images acquired before the jointing period of winter wheat can provide satisfactory winter wheat classification accuracy, with an F1 measure of 87.89%. The combination of SAR and optical images for winter wheat mapping achieved the best F1 measure–up to 98.06%. The SVM was superior to RF in terms of the overall accuracy and the kappa coefficient, and was faster than RF, while the RF classifier was slightly better than SVM in terms of the F1 measure. In addition, the classification accuracy can be effectively improved by adding the texture and coherence images to the backscatter intensity data. PMID:28587066
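The F1 measure reported above is the harmonic mean of precision and recall for the target class (here, winter wheat). A minimal sketch follows; the pixel counts and the function name `f1_measure` are hypothetical and serve only to show the arithmetic.

```python
def f1_measure(true_positives, false_positives, false_negatives):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 wheat pixels found, 20 false alarms, 20 missed.
example_f1 = f1_measure(true_positives=80, false_positives=20, false_negatives=20)
```

Because F1 ignores true negatives, a classifier can rank differently on F1 than on overall accuracy or kappa, which is consistent with the abstract reporting that SVM led on overall accuracy and kappa while RF was slightly better on F1.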
Assessment of the Thematic Accuracy of Land Cover Maps
NASA Astrophysics Data System (ADS)
Höhle, J.
2015-08-01
Several land cover maps are generated from aerial imagery and assessed by different approaches. The test site is an urban area in Europe for which six classes ('building', 'hedge and bush', 'grass', 'road and parking lot', 'tree', 'wall and car port') had to be derived. Two classification methods were applied ('Decision Tree' and 'Support Vector Machine') using only two attributes (height above ground and normalized difference vegetation index) which both are derived from the images. The assessment of the thematic accuracy applied a stratified design and was based on accuracy measures such as user's and producer's accuracy, and kappa coefficient. In addition, confidence intervals were computed for several accuracy measures. The achieved accuracies and confidence intervals are thoroughly analysed and recommendations are derived from the gained experiences. Reliable reference values are obtained using stereovision, false-colour image pairs, and positioning to the checkpoints with 3D coordinates. The influence of the training areas on the results is studied. Cross validation has been tested with a few reference points in order to derive approximate accuracy measures. The two classification methods perform equally for five classes. Trees are classified with a much better accuracy and a smaller confidence interval by means of the decision tree method. Buildings are classified by both methods with an accuracy of 99% (95% CI: 95%-100%) using independent 3D checkpoints. The average width of the confidence interval of six classes was 14% of the user's accuracy.
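As a rough illustration of the measures named above, the sketch below derives user's and producer's accuracy from a confusion matrix and attaches a 95% confidence interval to a proportion. The abstract does not say which interval method was used; the Wilson score interval here is an assumption chosen for illustration, as are the function names and the two-class counts.

```python
import math


def users_and_producers_accuracy(matrix):
    """Per-class accuracies; rows = map classes, columns = reference classes."""
    k = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # User's accuracy: correct points divided by all points mapped as the class.
    users = [matrix[i][i] / row_totals[i] for i in range(k)]
    # Producer's accuracy: correct points divided by all reference points of the class.
    producers = [matrix[j][j] / col_totals[j] for j in range(k)]
    return users, producers


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


# Hypothetical counts for illustration only (not the paper's data).
example_matrix = [[45, 5],
                  [10, 40]]
users, producers = users_and_producers_accuracy(example_matrix)
low, high = wilson_interval(successes=45, trials=50)
```

A simple binomial interval such as this one treats the checkpoints as a simple random sample within the class; a stratified design like the one described in the abstract would normally call for stratum weights, which this sketch leaves out.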
ERIC Educational Resources Information Center
Kunina-Habenicht, Olga; Rupp, Andre A.; Wilhelm, Oliver
2012-01-01
Using a complex simulation study we investigated parameter recovery, classification accuracy, and performance of two item-fit statistics for correct and misspecified diagnostic classification models within a log-linear modeling framework. The basic manipulated test design factors included the number of respondents (1,000 vs. 10,000), attributes (3…
Impacts of land use/cover classification accuracy on regional climate simulations
NASA Astrophysics Data System (ADS)
Ge, Jianjun; Qi, Jiaguo; Lofgren, Brent M.; Moore, Nathan; Torbick, Nathan; Olson, Jennifer M.
2007-03-01
Land use/cover change has been recognized as a key component in global change. Various land cover data sets, including historically reconstructed, recently observed, and future projected, have been used in numerous climate modeling studies at regional to global scales. However, little attention has been paid to the effect of land cover classification accuracy on climate simulations, though accuracy assessment has become a routine procedure in land cover production community. In this study, we analyzed the behavior of simulated precipitation in the Regional Atmospheric Modeling System (RAMS) over a range of simulated classification accuracies over a 3 month period. This study found that land cover accuracy under 80% had a strong effect on precipitation especially when the land surface had a greater control of the atmosphere. This effect became stronger as the accuracy decreased. As shown in three follow-on experiments, the effect was further influenced by model parameterizations such as convection schemes and interior nudging, which can mitigate the strength of surface boundary forcings. In reality, land cover accuracy rarely obtains the commonly recommended 85% target. Its effect on climate simulations should therefore be considered, especially when historically reconstructed and future projected land covers are employed.
Sub-pixel image classification for forest types in East Texas
NASA Astrophysics Data System (ADS)
Westbrook, Joey
Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster dataset was created comprising four raster layers, each showing the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The sub-pixel classification that used the aerial photo for both training and reference data had the highest overall accuracy (65%) of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot.
An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with a 10 percent interval each to five classes with a 20 percent interval each. When compared to the supervised classification, which has a satisfactory overall accuracy of 90%, none of the sub-pixel classifications achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy threshold of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
Disregarding population specificity: its influence on the sex assessment methods from the tibia.
Kotěrová, Anežka; Velemínská, Jana; Dupej, Ján; Brzobohatá, Hana; Pilný, Aleš; Brůžek, Jaroslav
2017-01-01
Forensic anthropology has developed classification techniques for sex estimation of unknown skeletal remains, for example population-specific discriminant function analyses. These methods were designed for populations that lived mostly in the late nineteenth and twentieth centuries. Their level of reliability or misclassification is important for practical use in today's forensic practice; it is, however, unknown. We addressed the question of what the likelihood of errors would be if population specificity of discriminant functions of the tibia were disregarded. Moreover, five classification functions in a Czech sample were proposed (accuracies 82.1-87.5 %, sex bias ranged from -1.3 to -5.4 %). We measured ten variables traditionally used for sex assessment of the tibia on a sample of 30 male and 26 female models from recent Czech population. To estimate the classification accuracy and error (misclassification) rates ignoring population specificity, we selected published classification functions of tibia for the Portuguese, south European, and the North American populations. These functions were applied on the dimensions of the Czech population. Comparing the classification success of the reference and the tested Czech sample showed that females from Czech population were significantly overestimated and mostly misclassified as males. Overall accuracy of sex assessment significantly decreased (53.6-69.7 %), sex bias -29.4-100 %, which is most probably caused by secular trend and the generally high variability of body size. Results indicate that the discriminant functions, developed for skeletal series representing geographically and chronologically diverse populations, are not applicable in current forensic investigations. Finally, implications and recommendations for future research are discussed.
Application of Sensor Fusion to Improve Uav Image Classification
NASA Astrophysics Data System (ADS)
Jabari, S.; Fathollahi, F.; Zhang, Y.
2017-08-01
Image classification is one of the most important tasks of remote sensing projects, including those based on UAV images. Improving the quality of UAV images directly affects the classification results and can save a huge amount of time and effort in this area. In this study, we show that sensor fusion can improve image quality, which in turn increases the accuracy of image classification. Here, we tested two sensor fusion configurations by using a Panchromatic (Pan) camera along with either a colour camera or a four-band multi-spectral (MS) camera. We use the Pan camera to benefit from its higher sensitivity and the colour or MS camera to benefit from its spectral properties. The resulting images are then compared to the ones acquired by a high resolution single Bayer-pattern colour camera (here referred to as HRC). We assessed the quality of the output images by performing image classification tests. The outputs show that the proposed sensor fusion configurations can achieve higher accuracies compared to the images of the single Bayer-pattern colour camera. Therefore, incorporating a Pan camera on board in UAV missions and performing image fusion can help achieve higher quality images and, accordingly, more accurate classification results.
Iliyasu, Abdullah M; Fatichah, Chastine
2017-12-19
A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset features (i.e., global best particles) that represent a pruned down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regards to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy as manifest in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.
Duvekot, Jorieke; van der Ende, Jan; Verhulst, Frank C; Greaves-Lord, Kirstin
2015-06-01
The screening accuracy of the parent and teacher-reported Social Responsiveness Scale (SRS) was compared with an autism spectrum disorder (ASD) classification according to (1) the Developmental, Dimensional, and Diagnostic Interview (3 Di), (2) the Autism Diagnostic Observation Schedule (ADOS), (3) both the 3 Di and ADOS, in 186 children referred to six mental health centers. The parent report showed excellent correspondence to an ASD classification according to the 3 Di and both the 3 Di and ADOS. The teacher report added significantly to the screening accuracy over and above the parent report when compared with the ADOS classification. Findings support the screening utility of the parent-reported SRS among clinically referred children and indicate that different informants may provide unique information relevant for ASD assessment.
Acosta-Mesa, Héctor-Gabriel; Rechy-Ramírez, Fernando; Mezura-Montes, Efrén; Cruz-Ramírez, Nicandro; Hernández Jiménez, Rodolfo
2014-06-01
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals in which the length and amplitude of the time series should be compressed, preserving the important information for classification purposes. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function which considers three criteria: the entropy regarding the classification, the complexity measured as the number of different strings needed to represent the complete data set, and the compression rate assessed as the length of the discrete representation. This discretization approach is evaluated using a time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with the well-known times series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, reducing the dimensionality of the time series length by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches. Copyright © 2014 Elsevier Inc. All rights reserved.
Validation assessment of shoreline extraction on medium resolution satellite image
NASA Astrophysics Data System (ADS)
Manaf, Syaifulnizam Abd; Mustapha, Norwati; Sulaiman, Md Nasir; Husin, Nor Azura; Shafri, Helmi Zulhaidi Mohd
2017-10-01
Monitoring coastal zones helps provide information about the conditions of the coastal zones, such as erosion or accretion. Moreover, monitoring the shorelines can help measure the severity of such conditions. Such measurement can be performed accurately by using Earth observation satellite images rather than by using traditional ground survey. To date, shorelines can be extracted from satellite images with a high degree of accuracy by using satellite image classification techniques based on machine learning to identify the land and water classes of the shorelines. In this study, the researchers validated the results of extracted shorelines of 11 classifiers using a reference shoreline provided by the local authority. Specifically, the validation assessment was performed to examine the difference between the extracted shorelines and the reference shorelines. The research findings showed that the SVM Linear was the most effective image classification technique, as evidenced from the lowest mean distance between the extracted shoreline and the reference shoreline. Furthermore, the findings showed that the accuracy of the extracted shoreline was not directly proportional to the accuracy of the image classification.
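The abstract above ranks classifiers by the mean distance between an extracted shoreline and a reference shoreline. The study's exact distance computation is not described here, so the following is only a sketch under a simple assumption: both shorelines are given as lists of (x, y) vertices in the same projected units, and each extracted vertex is matched to its nearest reference vertex. The coordinates are hypothetical.

```python
import math


def mean_nearest_distance(extracted, reference):
    """Mean distance from each extracted vertex to its nearest reference vertex."""
    nearest = [min(math.dist(p, q) for q in reference) for p in extracted]
    return sum(nearest) / len(nearest)


# Hypothetical shoreline vertices in metres (illustrative only).
reference_line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
extracted_line = [(0.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
offset = mean_nearest_distance(extracted_line, reference_line)
```

A vertex-to-vertex measure like this depends on how densely the reference line is sampled; measuring to the nearest point on each reference segment would be a more careful variant.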
Shermeyer, Jacob S.; Haack, Barry N.
2015-01-01
Two forestry-change detection methods are described, compared, and contrasted for estimating deforestation and growth in threatened forests in southern Peru from 2000 to 2010. The methods used in this study rely on freely available data, including atmospherically corrected Landsat 5 Thematic Mapper and Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation continuous fields (VCF). The two methods include a conventional supervised signature extraction method and a unique self-calibrating method called MODIS VCF guided forest/nonforest (FNF) masking. The process chain for each of these methods includes a threshold classification of MODIS VCF, training data or signature extraction, signature evaluation, k-nearest neighbor classification, analyst-guided reclassification, and postclassification image differencing to generate forest change maps. Comparisons of all methods were based on an accuracy assessment using 500 validation pixels. Results of this accuracy assessment indicate that FNF masking had a 5% higher overall accuracy and was superior to conventional supervised classification when estimating forest change. Both methods succeeded in classifying persistently forested and nonforested areas, and both had limitations when classifying forest change.
NASA Astrophysics Data System (ADS)
Roychowdhury, K.
2016-06-01
Landcover is the easiest detectable indicator of human interventions on land. Urban and peri-urban areas present a complex combination of landcover, which makes classification challenging. This paper assesses the different methods of classifying landcover using dual polarimetric Sentinel-1 data collected during monsoon (July) and winter (December) months of 2015. Four broad landcover classes such as built up areas, water bodies and wetlands, vegetation and open spaces of Kolkata and its surrounding regions were identified. Polarimetric analyses were conducted on Single Look Complex (SLC) data of the region while ground range detected (GRD) data were used for spectral and spatial classification. Unsupervised classification by means of K-Means clustering used backscatter values and was able to identify homogenous landcovers over the study area. The results produced an overall accuracy of less than 50% for both the seasons. Higher classification accuracy (around 70%) was achieved by adding texture variables as inputs along with the backscatter values. However, the accuracy of classification increased significantly with polarimetric analyses. The overall accuracy was around 80% in Wishart H-A-Alpha unsupervised classification. The method was useful in identifying urban areas due to their double-bounce scattering and vegetated areas, which have more random scattering. Normalized Difference Built-up index (NDBI) and Normalized Difference Vegetation Index (NDVI) obtained from Landsat 8 data over the study area were used to verify vegetation and urban classes. The study compares the accuracies of different methods of classifying landcover using medium resolution SAR data in a complex urban area and suggests that polarimetric analyses present the most accurate results for urban and suburban areas.
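NDVI and NDBI, used in the abstract above to verify vegetation and urban classes, are both normalized band differences: NDVI contrasts near-infrared with red reflectance, and NDBI contrasts shortwave-infrared with near-infrared. A minimal per-pixel sketch, with hypothetical reflectance values:

```python
def normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b), returning 0.0 where the denominator is zero."""
    total = a + b
    return 0.0 if total == 0 else (a - b) / total


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndbi(swir: float, nir: float) -> float:
    """Normalized Difference Built-up Index."""
    return normalized_difference(swir, nir)


# Hypothetical surface reflectances for a vegetated pixel (illustrative only).
veg_ndvi = ndvi(nir=0.5, red=0.1)
veg_ndbi = ndbi(swir=0.2, nir=0.5)
```

Dense vegetation tends to give a high NDVI and a negative NDBI, while built-up surfaces tend toward the opposite pattern, which is what makes the pair useful as a cross-check.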
An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices
ERIC Educational Resources Information Center
Wyse, Adam E.; Hao, Shiqi
2012-01-01
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.
2008-01-01
Remote sensing techniques have been shown to be effective for large-scale damage surveys after a hazardous event in both near real-time and post-event analyses. The paper aims to compare the accuracy of common image processing techniques for detecting tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and an object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on the Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy by 5 to 10%, the Object-oriented Approach performs significantly better, with 15-20% higher accuracy than the other two techniques. PMID:27879757
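The Kappa coefficient mentioned in the abstract above compares the observed agreement on the diagonal of an error matrix with the agreement expected by chance from the row and column totals. A minimal sketch for the unweighted statistic, using a hypothetical two-class matrix:

```python
def cohen_kappa(matrix):
    """Unweighted kappa from a square error (confusion) matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[k][k] for k in range(n)) / total
    # Chance agreement: sum over classes of row share times column share.
    expected = sum(
        sum(matrix[k]) * sum(row[k] for row in matrix) for k in range(n)
    ) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical damage / no-damage error matrix (illustrative only).
error_matrix = [[45, 5],
                [10, 40]]
kappa = cohen_kappa(error_matrix)
```

Here the overall accuracy is 0.85 but kappa is lower, because half of the agreement would be expected by chance given these marginal totals.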
Large Scale Crop Mapping in Ukraine Using Google Earth Engine
NASA Astrophysics Data System (ADS)
Shelestov, A.; Lavreniuk, M. S.; Kussul, N.
2016-12-01
There are no globally available high resolution satellite-derived crop specific maps at present. Only coarse-resolution imagery (> 250 m spatial resolution) has been utilized to derive global cropland extent. In 2016 we are going to carry out a country level demonstration of Sentinel-2 use for crop classification in Ukraine within the ESA Sen2-Agri project. However, optical imagery can be contaminated by cloud cover, which makes it difficult to acquire imagery in an optimal time range to discriminate certain crops. Thanks to the Copernicus program, since 2015 a large amount of Sentinel-1 SAR data at high spatial resolution has been freely available for Ukraine. This allows us to use time series of SAR data for crop classification. Our experiment for one administrative region in 2015 showed much higher crop classification accuracy with SAR data than with optical-only time series [1, 2]. Therefore, in 2016 within the Google Earth Engine Research Award we use SAR data together with optical data for large area crop mapping (the entire territory of Ukraine) using cloud computing capabilities available at Google Earth Engine (GEE). This study compares different classification methods for crop mapping for the whole territory of Ukraine using data and algorithms from GEE. Classification performance is assessed using overall classification accuracy, Kappa coefficients, and user's and producer's accuracies. Crop areas from the derived classification maps are also compared to the official statistics [3].
[1] S. Skakun et al., "Efficiency assessment of multitemporal C-band Radarsat-2 intensity and Landsat-8 surface reflectance satellite imagery for crop classification in Ukraine," IEEE Journal of Selected Topics in Applied Earth Observ. and Rem. Sens., 2015, DOI: 10.1109/JSTARS.2015.2454297.
[2] N. Kussul, S. Skakun, A. Shelestov, O. Kussul, "The use of satellite SAR imagery to crop classification in Ukraine within JECAM project," IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 1497-1500, 13-18 July 2014, Quebec City, Canada.
[3] F.J. Gallego, N. Kussul, S. Skakun, O. Kravchenko, A. Shelestov, O. Kussul, "Efficiency assessment of using satellite data for crop area estimation in Ukraine," International Journal of Applied Earth Observation and Geoinformation, vol. 29, pp. 22-30, 2014.
THEMATIC ACCURACY OF MRLC LAND COVER FOR THE EASTERN UNITED STATES
One objective of the MultiResolution Land Characteristics (MRLC) consortium is to map general land-cover categories for the conterminous United States using Landsat Thematic Mapper (TM) data. Land-cover mapping and classification accuracy assessment are complete for the e...
Monteiro-Soares, M; Martins-Mendes, D; Vaz-Carneiro, A; Sampaio, S; Dinis-Ribeiro, M
2014-10-01
We systematically review the available systems used to classify diabetic foot ulcers in order to synthesize their methodological qualitative issues and accuracy to predict lower extremity amputation, as this may represent a critical point in these patients' care. Two investigators searched, in EBSCO, ISI, PubMed and SCOPUS databases, and independently selected studies published until May 2013 and reporting prognostic accuracy and/or reliability of specific systems for patients with diabetic foot ulcer in order to predict lower extremity amputation. We included 25 studies reporting a prevalence of lower extremity amputation between 6% and 78%. Eight different diabetic foot ulcer descriptions and seven prognostic stratification classification systems were addressed with a variable (1-9) number of factors included, specially peripheral arterial disease (n = 12) or infection at the ulcer site (n = 10) or ulcer depth (n = 10). The Meggitt-Wagner, S(AD)SAD and Texas University Classification systems were the most extensively validated, whereas ten classifications were derived or validated only once. Reliability was reported in a single study, and accuracy measures were reported in five studies with another eight allowing their calculation. Pooled accuracy ranged from 0.65 (for gangrene) to 0.74 (for infection). There are numerous classification systems for diabetic foot ulcer outcome prediction, but only few studies evaluated their reliability or external validity. Studies rarely validated several systems simultaneously and only a few reported accuracy measures. Further studies assessing reliability and accuracy of the available systems and their composing variables are needed. Copyright © 2014 John Wiley & Sons, Ltd.
Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, Gretchen G.
2006-01-01
We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.
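The gap between resubstitution and cross-validation accuracy described in the abstract above can be illustrated with a toy example. The sketch below swaps in a one-nearest-neighbour rule on a single predictor instead of the classification trees used in the study, because it makes the optimism of resubstitution obvious: every training point is its own nearest neighbour. The data are hypothetical.

```python
def nearest_label(x, train):
    """Label of the training point whose predictor value is closest to x."""
    return min(train, key=lambda item: abs(item[0] - x))[1]


def resubstitution_accuracy(data):
    """Accuracy when the model is scored on the data it was fitted to."""
    hits = sum(nearest_label(x, data) == y for x, y in data)
    return hits / len(data)


def leave_one_out_accuracy(data):
    """Accuracy when each point is predicted from all the other points."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += nearest_label(x, rest) == y
    return hits / len(data)


# Hypothetical (predictor, presence) pairs for one species (illustrative only).
plots = [(0.0, "absent"), (1.0, "absent"), (2.2, "present"),
         (3.0, "absent"), (4.1, "present"), (5.0, "present")]
resub = resubstitution_accuracy(plots)
loo = leave_one_out_accuracy(plots)
```

Resubstitution reports perfect accuracy here, while leave-one-out exposes the two plots near the class boundary that the rule cannot predict from their neighbours.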
A machine learning approach to multi-level ECG signal quality classification.
Li, Qiao; Rajagopalan, Cadathur; Clifford, Gari D
2014-12-01
Current electrocardiogram (ECG) signal quality assessment studies have aimed to provide a two-level classification: clean or noisy. However, clinical usage demands more specific noise level classification for varying applications. This work outlines a five-level ECG signal quality classification algorithm. A total of 13 signal quality metrics were derived from segments of ECG waveforms, which were labeled by experts. A support vector machine (SVM) was trained to perform the classification and tested on a simulated dataset and was validated using data from the MIT-BIH arrhythmia database (MITDB). The simulated training and test datasets were created by selecting clean segments of the ECG in the 2011 PhysioNet/Computing in Cardiology Challenge database, and adding three types of real ECG noise at different signal-to-noise ratio (SNR) levels from the MIT-BIH Noise Stress Test Database (NSTDB). The MITDB was re-annotated for five levels of signal quality. Different combinations of the 13 metrics were trained and tested on the simulated datasets and the best combination that produced the highest classification accuracy was selected and validated on the MITDB. Performance was assessed using classification accuracy (Ac), and a single class overlap accuracy (OAc), which assumes that an individual type classified into an adjacent class is acceptable. An Ac of 80.26% and an OAc of 98.60% on the test set were obtained by selecting 10 metrics while 57.26% (Ac) and 94.23% (OAc) were the numbers for the unseen MITDB validation data without retraining. By performing the fivefold cross validation, an Ac of 88.07±0.32% and OAc of 99.34±0.07% were gained on the validation fold of MITDB. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
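The two scores in the abstract above differ only in how strictly a prediction is judged: classification accuracy (Ac) counts exact matches, while single class overlap accuracy (OAc) also accepts a prediction that falls into an adjacent quality level. A minimal sketch, assuming the five levels are encoded as the integers 1 to 5 (an encoding chosen here for illustration) and using hypothetical labels:

```python
def accuracy_and_overlap(true_levels, predicted_levels):
    """Return (Ac, OAc) for ordinal quality levels.

    Ac  counts exact matches.
    OAc also accepts predictions one level away from the true level.
    """
    pairs = list(zip(true_levels, predicted_levels))
    exact = sum(t == p for t, p in pairs)
    within_one = sum(abs(t - p) <= 1 for t, p in pairs)
    return exact / len(pairs), within_one / len(pairs)


# Hypothetical expert labels and classifier outputs (illustrative only).
expert = [1, 2, 3, 4, 5, 3]
model = [1, 3, 3, 2, 5, 4]
ac, oac = accuracy_and_overlap(expert, model)
```

OAc can never be lower than Ac, and a large gap between the two, as reported above, suggests that most errors are confusions between neighbouring levels.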
Domínguez, Rocio Berenice; Moreno-Barón, Laura; Muñoz, Roberto; Gutiérrez, Juan Manuel
2014-01-01
This paper describes a new method based on a voltammetric electronic tongue (ET) for the recognition of distinctive features in coffee samples. An ET was directly applied to different samples from the main Mexican coffee regions without any pretreatment before the analysis. The resulting electrochemical information was modeled with two different mathematical tools, namely Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM). Growing conditions (i.e., organic or non-organic practices and altitude of crops) were considered for a first classification. LDA results showed an average discrimination rate of 88% ± 6.53% while SVM successfully accomplished an overall accuracy of 96.4% ± 3.50% for the same task. A second classification based on geographical origin of samples was carried out. Results showed an overall accuracy of 87.5% ± 7.79% for LDA and a superior performance of 97.5% ± 3.22% for SVM. Given the complexity of coffee samples, the high accuracy percentages achieved by ET coupled with SVM in both classification problems suggested a potential applicability of ET in the assessment of selected coffee features with a simpler and faster methodology along with a null sample pretreatment. In addition, the proposed method can be applied to authentication assessment while improving cost, time and accuracy of the general procedure. PMID:25254303
Valderrama-Landeros, L; Flores-de-Santiago, F; Kovacs, J M; Flores-Verdugo, F
2017-12-14
Optimizing the classification accuracy of a mangrove forest is of utmost importance for conservation practitioners. Mangrove forest mapping using satellite-based remote sensing techniques is by far the most common method of classification currently used given the logistical difficulties of field endeavors in these forested wetlands. However, there is now an abundance of options from which to choose in regards to satellite sensors, which has led to substantially different estimations of mangrove forest location and extent with particular concern for degraded systems. The objective of this study was to assess the accuracy of mangrove forest classification using different remotely sensed data sources (i.e., Landsat-8, SPOT-5, Sentinel-2, and WorldView-2) for a system located along the Pacific coast of Mexico. Specifically, we examined a stressed semiarid mangrove forest which offers a variety of conditions such as dead areas, degraded stands, healthy mangroves, and very dense mangrove island formations. The results indicated that Landsat-8 (30 m per pixel) had the lowest overall accuracy at 64% and that WorldView-2 (1.6 m per pixel) had the highest at 93%. Moreover, the SPOT-5 and the Sentinel-2 classifications (10 m per pixel) were very similar having accuracies of 75 and 78%, respectively. In comparison to WorldView-2, the other sensors overestimated the extent of Laguncularia racemosa and underestimated the extent of Rhizophora mangle. When considering such type of sensors, the higher spatial resolution can be particularly important in mapping small mangrove islands that often occur in degraded mangrove systems.
Ralston, Barbara E.; Davis, Philip A.; Weber, Robert M.; Rundall, Jill M.
2008-01-01
A vegetation database of the riparian vegetation located within the Colorado River ecosystem (CRE), a subsection of the Colorado River between Glen Canyon Dam and the western boundary of Grand Canyon National Park, was constructed using four-band image mosaics acquired in May 2002. A digital line scanner was flown over the Colorado River corridor in Arizona by ISTAR Americas, using a Leica ADS-40 digital camera to acquire a digital surface model and four-band image mosaics (blue, green, red, and near-infrared) for vegetation mapping. The primary objective of this mapping project was to develop a digital inventory map of vegetation to enable patch- and landscape-scale change detection, and to establish randomized sampling points for ground surveys of terrestrial fauna (principally, but not exclusively, birds). The vegetation base map was constructed through a combination of ground surveys to identify vegetation classes, image processing, and automated supervised classification procedures. Analysis of the imagery and subsequent supervised classification involved multiple steps to evaluate band quality, band ratios, and vegetation texture and density. Identification of vegetation classes involved collection of cover data throughout the river corridor and subsequent analysis using two-way indicator species analysis (TWINSPAN). Vegetation was classified into six vegetation classes, following the National Vegetation Classification Standard, based on cover dominance. This analysis indicated that total area covered by all vegetation within the CRE was 3,346 ha. Considering the six vegetation classes, the sparse shrub (SS) class accounted for the greatest amount of vegetation (627 ha) followed by Pluchea (PLSE) and Tamarix (TARA) at 494 and 366 ha, respectively. The wetland (WTLD) and Prosopis-Acacia (PRGL) classes both had similar areal cover values (227 and 213 ha, respectively). Baccharis-Salix (BAXX) was the least represented at 94 ha. 
Accuracy assessment of the supervised classification determined that accuracies varied among vegetation classes from 90% to 49%. Low accuracies were caused by similar spectral signatures among vegetation classes. Fuzzy accuracy assessment improved classification accuracies such that Federal mapping standards of 80% accuracies for all classes were met. The scale used to quantify vegetation adequately meets the needs of the stakeholder group. Increasing the scale to meet the U.S. Geological Survey (USGS)-National Park Service (NPS) National Mapping Program's minimum mapping unit of 0.5 ha is unwarranted because this scale would reduce the resolution of some classes (e.g., seep willow/coyote willow would likely be combined with tamarisk). While this would undoubtedly improve classification accuracies, it would not provide the community-level information about vegetation change that would benefit stakeholders. The identification of vegetation classes should follow NPS mapping approaches to complement the national effort and should incorporate the alternative analysis for community identification that is being incorporated into newer NPS mapping efforts. National Vegetation Classification is followed in this report for association- to formation-level categories. Accuracies could be improved by including more environmental variables such as stage elevation in the classification process and incorporating object-based classification methods. Another approach that may address the heterogeneous species issue and classification is to use spectral mixing analysis to estimate the fractional cover of species within each pixel and better quantify the cover of individual species that compose a cover class. Varying flights to capture vegetation at different times of the year might also help separate some vegetation classes, though the cost may be prohibitive. Lastly, photointerpretation instead of automated mapping could be tried.
Photointerpretation would likely not improve accuracies in this case, howev
NASA Technical Reports Server (NTRS)
Hill, C. L.
1984-01-01
A computer-implemented classification has been derived from Landsat-4 Thematic Mapper data acquired over Baldwin County, Alabama on January 15, 1983. One set of spectral signatures was developed from the data by utilizing a 3x3 pixel sliding window approach. An analysis of the classification produced from this technique identified forested areas. Additional information regarding only the forested areas was extracted by employing a pixel-by-pixel signature development program which derived spectral statistics only for pixels within the forested land covers. The spectral statistics from both approaches were integrated and the data classified. This classification was evaluated by comparing the spectral classes produced from the data against corresponding ground verification polygons. This iterative data analysis technique resulted in an overall classification accuracy of 88.4 percent correct for slash pine, young pine, loblolly pine, natural pine, and mixed hardwood-pine. An accuracy assessment matrix has been produced for the classification.
Combined use of LiDAR data and multispectral earth observation imagery for wetland habitat mapping
NASA Astrophysics Data System (ADS)
Rapinel, Sébastien; Hubert-Moy, Laurence; Clément, Bernard
2015-05-01
Although wetlands play a key role in controlling flooding and nonpoint source pollution, sequestering carbon and providing an abundance of ecological services, the inventory and characterization of wetland habitats are most often limited to small areas. This explains why the understanding of their ecological functioning is still insufficient for a reliable functional assessment on areas larger than a few hectares. While LiDAR data and multispectral Earth Observation (EO) images are often used separately to map wetland habitats, their combined use is currently being assessed for different habitat types. The aim of this study is to evaluate the combination of multispectral and multiseasonal imagery and LiDAR data to precisely map the distribution of wetland habitats. The image classification was performed combining an object-based approach and decision-tree modeling. Four multispectral images with high (SPOT-5) and very high spatial resolution (Quickbird, KOMPSAT-2, aerial photographs) were classified separately. Another classification was then applied integrating summer and winter multispectral image data and three layers derived from LiDAR data: vegetation height, microtopography and intensity return. The comparison of classification results shows that some habitats are better identified on the winter image and others on the summer image (overall accuracies = 58.5 and 57.6%). They also point out that classification accuracy is highly improved (overall accuracy = 86.5%) when combining LiDAR data and multispectral images. Moreover, this study highlights the advantage of integrating vegetation height, microtopography and intensity parameters in the classification process. This article demonstrates that information provided by the synergetic use of multispectral images and LiDAR data can help in wetland functional assessment
NASA Astrophysics Data System (ADS)
Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.
2018-04-01
The RF method based on grid-search parameter optimization achieved a classification accuracy of 88.16% in the classification of images with multiple feature variables. This classification accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method performs better than SVM and ANN, and it is more capable of handling multidimensional feature variables. The RF method combined with an object-based analysis approach could improve the classification accuracy further. The multiresolution segmentation approach on the basis of ESP scale parameter optimization was used to obtain six scales for image segmentation; when the segmentation scale was 49, the classification accuracy reached its highest value of 89.58%. The classification accuracy of object-based RF classification was thus 1.42 percentage points higher than that of pixel-based classification (88.16%). Therefore, the RF classification method combined with an object-based analysis approach could achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, the interpretation of remotely sensed imagery using the proposed method could provide technical support and a theoretical reference for remotely sensed monitoring of land reclamation.
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve classification and regression problems, both linear and, through kernels, nonlinear. However, SVM has a weakness: it is difficult to determine the optimal parameter values. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are not linearly separable, SVM uses the kernel trick to map the data into a higher-dimensional feature space in which they become linearly separable. The kernel trick can use various kernel functions, such as linear, polynomial, radial basis function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve this problem, a genetic algorithm is proposed as the search algorithm for the optimal parameter values, thus increasing the classification accuracy of SVM. Data were taken from the UCI machine learning repository: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms have been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy was upgraded from the following values per kernel: linear, 85.12%; polynomial, 81.76%; RBF, 77.22%; sigmoid, 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.
Laufer, Shlomi; D'Angelo, Anne-Lise D; Kwan, Calvin; Ray, Rebbeca D; Yudkowsky, Rachel; Boulet, John R; McGaghie, William C; Pugh, Carla M
2017-12-01
Develop new performance evaluation standards for the clinical breast examination (CBE). There are several technical aspects of a proper CBE. Our recent work discovered a significant, linear relationship between palpation force and CBE accuracy. This article investigates the relationship between other technical aspects of the CBE and accuracy. This performance assessment study involved data collection from physicians (n = 553) attending 3 different clinical meetings between 2013 and 2014: American Society of Breast Surgeons, American Academy of Family Physicians, and American College of Obstetricians and Gynecologists. Four previously validated, sensor-enabled breast models were used for clinical skills assessment. Models A and B had solitary, superficial, 2 cm and 1 cm soft masses, respectively. Models C and D had solitary, deep, 2 cm hard and moderately firm masses, respectively. Finger movements (search technique) from 1137 CBE video recordings were independently classified by 2 observers. Final classifications were compared with CBE accuracy. Accuracy rates were model A = 99.6%, model B = 89.7%, model C = 75%, and model D = 60%. Final classification categories for search technique included rubbing movement, vertical movement, piano fingers, and other. Interrater reliability was k = 0.79. Rubbing movement was 4 times more likely to yield an accurate assessment (odds ratio 3.81, P < 0.001) compared with vertical movement and piano fingers. Piano fingers had the highest failure rate (36.5%). Regression analysis of search pattern, search technique, palpation force, examination time, and 6 demographic variables revealed that search technique independently and significantly affected CBE accuracy (P < 0.001). Our results support measurement and classification of CBE techniques and provide the foundation for a new paradigm in teaching and assessing hands-on clinical skills.
The newly described piano fingers palpation technique was noted to have unusually high failure rates. Medical educators should be aware of the potential differences in effectiveness for various CBE techniques.
Algorithmic Classification of Five Characteristic Types of Paraphasias.
Fergadiotis, Gerasimos; Gorman, Kyle; Bedrick, Steven
2016-12-01
This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.
ERIC Educational Resources Information Center
Madison, Matthew J.; Bradshaw, Laine P.
2015-01-01
Diagnostic classification models are psychometric models that aim to classify examinees according to their mastery or non-mastery of specified latent characteristics. These models are well-suited for providing diagnostic feedback on educational assessments because of their practical efficiency and increased reliability when compared with other…
ERIC Educational Resources Information Center
Jacoby, Larry L.; Wahlheim, Christopher N.; Coane, Jennifer H.
2010-01-01
Three experiments examined testing effects on learning of natural concepts and metacognitive assessments of such learning. Results revealed that testing enhanced recognition memory and classification accuracy for studied and novel exemplars of bird families on immediate and delayed tests. These effects depended on the balance of study and test…
Land cover mapping after the tsunami event over Nanggroe Aceh Darussalam (NAD) province, Indonesia
NASA Astrophysics Data System (ADS)
Lim, H. S.; MatJafri, M. Z.; Abdullah, K.; Alias, A. N.; Mohd. Saleh, N.; Wong, C. J.; Surbakti, M. S.
2008-03-01
Remote sensing offers an important means of detecting and analyzing temporal changes occurring in our landscape. This research used remote sensing to quantify land use/land cover changes in the Nanggroe Aceh Darussalam (NAD) province, Indonesia, on a regional scale. The objective of this paper is to assess the changes produced from the analysis of Landsat TM data. A Landsat TM image was used to develop a land cover classification map for 27 March 2005. Four supervised classification techniques (Maximum Likelihood, Minimum Distance-to-Mean, Parallelepiped, and Parallelepiped with Maximum Likelihood Classifier Tiebreaker) were applied to the satellite image. Training sites and accuracy assessment were needed for the supervised classification techniques. The training sites were established using polygons based on the colour image. High detection accuracy (>80%) and overall Kappa (>0.80) were achieved by the Parallelepiped with Maximum Likelihood Classifier Tiebreaker classifier in this study. This preliminary study has produced a promising result, which indicates that land cover mapping can be carried out using remote sensing classification of satellite digital imagery.
Ozcift, Akin; Gulten, Arif
2011-12-01
Improving accuracies of machine learning algorithms is vital in designing high performance computer-aided diagnosis (CADx) systems. Research has shown that the performance of a base classifier might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms to evaluate their classification performances using Parkinson's, diabetes and heart disease datasets from the literature. In the experiments, first the feature dimension of the three datasets is reduced using the correlation based feature selection (CFS) algorithm. Second, classification performances of 30 machine learning algorithms are calculated for the three datasets. Third, 30 classifier ensembles are constructed based on the RF algorithm to assess performances of the respective classifiers with the same disease data. All the experiments are carried out with a leave-one-out validation strategy and the performances of the 60 algorithms are evaluated using three metrics: classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). Base classifiers achieved average accuracies of 72.15%, 77.52% and 84.43% for the diabetes, heart and Parkinson's datasets, respectively. The RF classifier ensembles produced average accuracies of 74.47%, 80.49% and 87.13% for the respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve the accuracy of miscellaneous machine learning algorithms to design advanced CADx systems. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
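The leave-one-out validation strategy named in the abstract above can be sketched as follows. This is a minimal illustration only and not the authors' code: the 1-nearest-neighbour classifier, the function names and the toy data are all assumptions chosen to keep the example self-contained.

```python
def nearest_neighbor_predict(train_X, train_y, x):
    """Predict the label of x as the label of its closest training sample
    (squared Euclidean distance). Hypothetical stand-in for a base classifier."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_neighbor_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy, made-up data: two well-separated clusters.
X = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0], [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
y = [0, 0, 0, 1, 1, 1]
loo_accuracy = leave_one_out_accuracy(X, y)
```

With n samples the classifier is retrained n times, which is affordable for small clinical datasets but becomes costly as n grows.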
NASA Astrophysics Data System (ADS)
Hall-Brown, Mary
The heterogeneity of Arctic vegetation can make land cover classification very difficult when using medium to small resolution imagery (Schneider et al., 2009; Muller et al., 1999). Using high radiometric and spatial resolution imagery, such as that from the SPOT 5 and IKONOS satellites, has helped arctic land cover classification accuracies rise into the 80 and 90 percent range (Allard, 2003; Stine et al., 2010; Muller et al., 1999). However, those increases usually come at a high price. High resolution imagery is very expensive and can often add tens of thousands of dollars onto the cost of the research. The EO-1 satellite launched in 2002 carries two sensors that have high spectral and/or high spatial resolutions and can be an acceptable compromise between resolution and cost. The Hyperion is a hyperspectral sensor with the capability of collecting 242 spectral bands of information. The Advanced Land Imager (ALI) is an advanced multispectral sensor whose spatial resolution can be sharpened to 10 meters. This dissertation compares the accuracies of arctic land cover classifications produced by the Hyperion and ALI sensors to the classification accuracies produced by the Système Pour l'Observation de la Terre (SPOT), the Landsat Thematic Mapper (TM) and the Landsat Enhanced Thematic Mapper Plus (ETM+) sensors. Hyperion and ALI images from August 2004 were collected over the Upper Kuparuk River Basin, Alaska. Image processing included the stepwise discriminant analysis of pixels that were positively classified from coinciding ground control points, geometric and radiometric correction, and principal component analysis. Finally, stratified random sampling was used to perform accuracy assessments on satellite derived land cover classifications. Accuracy was estimated from an error matrix (confusion matrix) that provided the overall, producer's and user's accuracies.
This research found that while the Hyperion sensor produced classification accuracies that were equivalent to the TM and ETM+ sensors (approximately 78%), the Hyperion could not obtain the accuracy of the SPOT 5 HRV sensor. However, the land cover classifications derived from the ALI sensor exceeded most classification accuracies derived from the TM and ETM+ sensors and were even comparable to most SPOT 5 HRV classifications (87%). With the deactivation of the Landsat series satellites, the monitoring of remote locations such as the Arctic on an uninterrupted basis throughout the world is in jeopardy. The utilization of the Hyperion and ALI sensors is a way to keep that endeavor operational. By keeping the ALI sensor active at all times, uninterrupted observation of the entire Earth can be accomplished. Keeping the Hyperion sensor as a "tasked" sensor can provide scientists with additional imagery and options for their studies without overburdening storage issues.
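The overall, producer's and user's accuracies mentioned above are all read off the same error matrix. The sketch below shows one common convention (rows = mapped class, columns = reference class); the function name and the 3-class counts are made up for illustration and do not come from the dissertation.

```python
def accuracy_from_error_matrix(matrix):
    """matrix[i][j] = number of samples mapped as class i whose reference class is j.

    Returns (overall accuracy, producer's accuracies, user's accuracies).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    row_totals = [sum(matrix[i]) for i in range(n)]                          # mapped totals
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]    # reference totals

    overall = sum(diagonal) / total
    # User's accuracy: of everything mapped as class i, the fraction that is truly class i.
    users = [diagonal[i] / row_totals[i] if row_totals[i] else float("nan") for i in range(n)]
    # Producer's accuracy: of all reference samples of class j, the fraction mapped as class j.
    producers = [diagonal[j] / col_totals[j] if col_totals[j] else float("nan") for j in range(n)]
    return overall, producers, users


# Hypothetical 3-class error matrix (counts are illustrative only).
example = [
    [45, 4, 1],
    [6, 38, 6],
    [2, 5, 43],
]
overall, producers, users = accuracy_from_error_matrix(example)
```

User's accuracy reflects commission error (how far a map label can be trusted), while producer's accuracy reflects omission error (how much of a reference class the map captured).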
A Visual mining based framework for classification accuracy estimation
NASA Astrophysics Data System (ADS)
Arun, Pattathal Vijayakumar
2013-12-01
Classification techniques have been widely used in different remote sensing applications, and correct classification of mixed pixels is a tedious task. Traditional approaches adopt various statistical parameters but do not facilitate effective visualisation. Data mining tools are proving very helpful in the classification process. We propose a visual mining based framework for accuracy assessment of classification techniques using open source tools such as WEKA and PREFUSE. Used together, these tools can provide an efficient approach for getting information about improvements in the classification accuracy and help in refining the training data set. We have illustrated the framework by investigating the effects of various resampling methods on classification accuracy and found that bilinear (BL) is best suited for preserving radiometric characteristics. We have also investigated the optimal number of folds required for effective analysis of LISS-IV images.
Telephone-quality pathological speech classification using empirical mode decomposition.
Kaleem, M F; Ghoraani, B; Guergachi, A; Krishnan, S
2011-01-01
This paper presents a computationally simple and effective methodology based on empirical mode decomposition (EMD) for classification of telephone quality normal and pathological speech signals. EMD is used to decompose continuous normal and pathological speech signals into intrinsic mode functions, which are analyzed to extract physically meaningful and unique temporal and spectral features. Using continuous speech samples from a database of 51 normal and 161 pathological speakers, which has been modified to simulate telephone quality speech under different levels of noise, a linear classifier applied to the resulting feature vector achieves a high classification accuracy, thereby demonstrating the effectiveness of the methodology. The classification accuracy reported in this paper (89.7% for signal-to-noise ratio 30 dB) is a significant improvement over previously reported results for the same task, and demonstrates the utility of our methodology for cost-effective remote voice pathology assessment over telephone channels.
NASA Technical Reports Server (NTRS)
Spruce, J. P.; Smoot, James; Ellis, Jean; Hilbert, Kent; Swann, Roberta
2012-01-01
This paper discusses the development and implementation of a geospatial data processing method and multi-decadal Landsat time series for computing general coastal U.S. land-use and land-cover (LULC) classifications and change products consisting of seven classes (water, barren, upland herbaceous, non-woody wetland, woody upland, woody wetland, and urban). Use of this approach extends the observational period of the NOAA-generated Coastal Change and Analysis Program (C-CAP) products by almost two decades, assuming the availability of one cloud free Landsat scene from any season for each targeted year. The Mobile Bay region in Alabama was used as a study area to develop, demonstrate, and validate the method that was applied to derive LULC products for nine dates at approximate five year intervals across a 34-year time span, using single dates of data for each classification in which forests were either leaf-on, leaf-off, or mixed senescent conditions. Classifications were computed and refined using decision rules in conjunction with unsupervised classification of Landsat data and C-CAP value-added products. Each classification's overall accuracy was assessed by comparing stratified random locations to available reference data, including higher spatial resolution satellite and aerial imagery, field survey data, and raw Landsat RGBs. Overall classification accuracies ranged from 83 to 91% with overall Kappa statistics ranging from 0.78 to 0.89. The accuracies are comparable to those from similar, generalized LULC products derived from C-CAP data. The Landsat MSS-based LULC product accuracies are similar to those from Landsat TM or ETM+ data. Accurate classifications were computed for all nine dates, yielding effective results regardless of season. This classification method yielded products that were used to compute LULC change products via additive GIS overlay techniques.
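Overall Kappa statistics such as those reported above correct the overall accuracy for agreement expected by chance. The sketch below computes the standard unweighted Cohen's kappa from an error matrix; the function name and the 2-class counts are illustrative assumptions, not data from the study.

```python
def cohen_kappa(matrix):
    """Unweighted Cohen's kappa from a square error matrix of counts.

    matrix[i][j] = number of samples mapped as class i with reference class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of samples on the diagonal (overall accuracy).
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(matrix[r][i] for r in range(n))
        for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical 2-class error matrix: overall accuracy 0.85, chance agreement 0.50.
kappa = cohen_kappa([[40, 10], [5, 45]])
```

In this made-up case the overall accuracy is 0.85 but kappa is about 0.70, which is why kappa values in accuracy reports are typically lower than the corresponding overall accuracies.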
Mohammed A. Kalkhan; Robin M. Reich; Raymond L. Czaplewski
1996-01-01
A Monte Carlo simulation was used to evaluate the statistical properties of measures of association and the Kappa statistic under double sampling with replacement. Three error matrices representing three levels of classification accuracy of Landsat TM Data consisting of four forest cover types in North Carolina. The overall accuracy of the five indices ranged from 0.35...
Mezgec, Simon; Eftimov, Tome; Bucher, Tamara; Koroušić Seljak, Barbara
2018-04-06
The present study tested the combination of an established and a validated food-choice research method (the 'fake food buffet') with a new food-matching technology to automate the data collection and analysis. The methodology combines fake-food image recognition using deep learning and food matching and standardization based on natural language processing. The former is specific because it uses a single deep learning network to perform both the segmentation and the classification at the pixel level of the image. To assess its performance, measures based on the standard pixel accuracy and Intersection over Union were applied. Food matching firstly describes each of the recognized food items in the image and then matches the food items with their compositional data, considering both their food names and their descriptors. The final accuracy of the deep learning model trained on fake-food images acquired by 124 study participants and providing fifty-five food classes was 92·18 %, while the food matching was performed with a classification accuracy of 93 %. The present findings are a step towards automating dietary assessment and food-choice research. The methodology outperforms other approaches in pixel accuracy, and since it is the first automatic solution for recognizing the images of fake foods, the results could be used as a baseline for possible future studies. As the approach enables a semi-automatic description of recognized food items (e.g. with respect to FoodEx2), these can be linked to any food composition database that applies the same classification and description system.
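The two segmentation measures named above, pixel accuracy and Intersection over Union (IoU), can be sketched for small label maps as follows. The function names and the 3x3 label grids are hypothetical, and averaging IoU over the classes present is one common convention that the study may or may not have used.

```python
def _pairs(pred, truth):
    """Flatten two equally sized 2D label maps into (predicted, true) pairs."""
    return [(p, t) for prow, trow in zip(pred, truth) for p, t in zip(prow, trow)]


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class equals the reference class."""
    pairs = _pairs(pred, truth)
    return sum(p == t for p, t in pairs) / len(pairs)


def mean_iou(pred, truth):
    """Mean over classes of |pred ∩ truth| / |pred ∪ truth|,
    using every class that appears in either map."""
    pairs = _pairs(pred, truth)
    classes = sorted({c for pair in pairs for c in pair})
    ious = []
    for c in classes:
        intersection = sum(p == c and t == c for p, t in pairs)
        union = sum(p == c or t == c for p, t in pairs)
        ious.append(intersection / union)
    return sum(ious) / len(ious)


# Made-up 3x3 label maps with three classes (0, 1, 2).
truth = [[0, 0, 1], [0, 1, 1], [2, 2, 1]]
pred = [[0, 0, 1], [0, 0, 1], [2, 1, 1]]
acc = pixel_accuracy(pred, truth)
miou = mean_iou(pred, truth)
```

Pixel accuracy is dominated by large classes, whereas mean IoU weights every class equally, so the two measures can diverge when small food items are misclassified.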
NASA Astrophysics Data System (ADS)
Mafanya, Madodomzi; Tsele, Philemon; Botai, Joel; Manyama, Phetole; Swart, Barend; Monate, Thabang
2017-07-01
Invasive alien plants (IAPs) not only pose a serious threat to biodiversity and water resources but also have impacts on human and animal wellbeing. To support decision making in IAPs monitoring, semi-automated image classifiers which are capable of extracting valuable information in remotely sensed data are vital. This study evaluated the mapping accuracies of supervised and unsupervised image classifiers for mapping Harrisia pomanensis (a cactus plant commonly known as the Midnight Lady) using two interlinked evaluation strategies i.e. point and area based accuracy assessment. Results of the point-based accuracy assessment show that with reference to 219 ground control points, the supervised image classifiers (i.e. Maxver and Bhattacharya) mapped H. pomanensis better than the unsupervised image classifiers (i.e. K-mediuns, Euclidian Length and Isoseg). In this regard, user and producer accuracies were 82.4% and 84% respectively for the Maxver classifier. The user and producer accuracies for the Bhattacharya classifier were 90% and 95.7%, respectively. Though the Maxver produced a higher overall accuracy and Kappa estimate than the Bhattacharya classifier, the Maxver Kappa estimate of 0.8305 is not significantly (statistically) greater than the Bhattacharya Kappa estimate of 0.8088 at a 95% confidence interval. The area based accuracy assessment results show that the Bhattacharya classifier estimated the spatial extent of H. pomanensis with an average mapping accuracy of 86.1% whereas the Maxver classifier only gave an average mapping accuracy of 65.2%. Based on these results, the Bhattacharya classifier is therefore recommended for mapping H. pomanensis. These findings will aid in the algorithm choice making for the development of a semi-automated image classification system for mapping IAPs.
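The comparison of two Kappa estimates described above is commonly carried out with a Z test on two independent error matrices. The abstract does not state which test was used or report the Kappa variances, so the following is only the standard textbook form, given here as an assumption:

```latex
Z = \frac{\left|\hat{\kappa}_1 - \hat{\kappa}_2\right|}
         {\sqrt{\widehat{\operatorname{var}}(\hat{\kappa}_1) + \widehat{\operatorname{var}}(\hat{\kappa}_2)}}
```

Here $\hat{\kappa}_1$ and $\hat{\kappa}_2$ would be the Maxver and Bhattacharya estimates (0.8305 and 0.8088). Under this test the difference is significant at the 95% confidence level only if $Z > 1.96$, so a reported lack of significance corresponds to $Z \le 1.96$.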
NASA Astrophysics Data System (ADS)
Hale Topaloğlu, Raziye; Sertel, Elif; Musaoğlu, Nebiye
2016-06-01
This study aims to compare the classification accuracies of land cover/use maps created from Sentinel-2 and Landsat-8 data. The Istanbul metropolitan area of Turkey, which has a population of around 14 million and varied landscape characteristics, was selected as the study area. Water, forest, agricultural areas, grasslands, transport network, urban, airport-industrial units and barren land-mine land cover/use classes adapted from the CORINE nomenclature were used as the main land cover/use classes to identify. To fulfil the aims of this research, recently acquired Sentinel-2 (dated 08/02/2016) and Landsat-8 (dated 22/02/2016) images of Istanbul were obtained, and image pre-processing steps such as atmospheric and geometric correction were applied. Both Sentinel-2 and Landsat-8 images were resampled to a 30 m pixel size after geometric correction, and similar spectral bands for both satellites were selected to create a similar base for these multi-sensor data. Maximum Likelihood (MLC) and Support Vector Machine (SVM) supervised classification methods were applied to both data sets to identify the eight land cover/use classes. An error matrix was created using the same reference points for the Sentinel-2 and Landsat-8 classifications. After the accuracy assessment, the results were compared to find the best approach for creating a current land cover/use map of the region. The results of the MLC and SVM classification methods were compared for both images.
3D Land Cover Classification Based on Multispectral LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Zou, Xiaoliang; Zhao, Guihua; Li, Jonathan; Yang, Yuanxi; Fang, Yong
2016-06-01
A multispectral Lidar system can emit simultaneous laser pulses at different wavelengths. The reflected multispectral energy is captured by the sensor's receiver, and the return signal is recorded together with the position and orientation of the sensor. These recorded data are combined with GNSS/IMU data in post-processing to form high-density multispectral 3D point clouds. As the first commercial multispectral airborne Lidar sensor, the Optech Titan system is capable of collecting point cloud data from all three channels: 532 nm visible (green), 1064 nm near infrared (NIR) and 1550 nm intermediate infrared (IR). It has become a new source of data for 3D land cover classification. The paper presents an Object Based Image Analysis (OBIA) approach that uses only multispectral Lidar point cloud datasets for 3D land cover classification. The approach consists of three steps. Firstly, multispectral intensity images are segmented into image objects on the basis of multi-resolution segmentation integrating different scale parameters. Secondly, intensity objects are classified into nine categories by using customized classification index features and a combination of the multispectral reflectance with the vertical distribution of object features. Finally, accuracy assessment is conducted by comparing random reference sample points from Google imagery tiles with the classification results. The classification results show higher overall accuracy for most of the land cover types. An overall accuracy of over 90% is achieved using multispectral Lidar point clouds for 3D land cover classification.
Towards automated spectroscopic tissue classification in thyroid and parathyroid surgery.
Schols, Rutger M; Alic, Lejla; Wieringa, Fokko P; Bouvy, Nicole D; Stassen, Laurents P S
2017-03-01
In (para-)thyroid surgery iatrogenic parathyroid injury should be prevented. To aid the surgeons' eye, a camera system enabling parathyroid-specific image enhancement would be useful. Hyperspectral camera technology might work, provided that the spectral signature of parathyroid tissue offers enough specific features to be reliably and automatically distinguished from surrounding tissues. As a first step to investigate this, we examined the feasibility of wide band diffuse reflectance spectroscopy (DRS) for automated spectroscopic tissue classification, using silicon (Si) and indium-gallium-arsenide (InGaAs) sensors. DRS (350-1830 nm) was performed during (para-)thyroid resections. From the acquired spectra 36 features at predefined wavelengths were extracted. The best features for classification of parathyroid from adipose or thyroid were assessed by binary logistic regression for Si- and InGaAs-sensor ranges. Classification performance was evaluated by leave-one-out cross-validation. In 19 patients 299 spectra were recorded (62 tissue sites: thyroid = 23, parathyroid = 21, adipose = 18). Classification accuracy of parathyroid-adipose was, respectively, 79% (Si), 82% (InGaAs) and 97% (Si/InGaAs combined). Parathyroid-thyroid classification accuracies were 80% (Si), 75% (InGaAs), 82% (Si/InGaAs combined). Si and InGaAs sensors are fairly accurate for automated spectroscopic classification of parathyroid, adipose and thyroid tissues. Combination of both sensor technologies improves accuracy. Follow-up research, aimed towards hyperspectral imaging seems justified. Copyright © 2016 John Wiley & Sons, Ltd.
Mapping Mangrove Density from Rapideye Data in Central America
NASA Astrophysics Data System (ADS)
Son, Nguyen-Thanh; Chen, Chi-Farn; Chen, Cheng-Ru
2017-06-01
Mangrove forests provide a wide range of socioeconomic and ecological services for coastal communities. Extensive aquaculture development of mangrove waters in many developing countries has consistently ignored the services of mangrove ecosystems, leading to unintended environmental consequences. Monitoring the current status and distribution of mangrove forests is deemed important for evaluating forest management strategies. This study aims to delineate the density distribution of mangrove forests in the Gulf of Fonseca, Central America, with Rapideye data using support vector machines (SVM). The data collected in 2012 for density classification of mangrove forests were processed based on four different band combination schemes: scheme-1 (bands 1-3 and 5, excluding the red-edge band 4), scheme-2 (bands 1-5), scheme-3 (bands 1-3 and 5, together with the normalized difference vegetation index, NDVI), and scheme-4 (bands 1-3 and 5, together with the normalized difference red-edge index, NDRI). We also hypothesized that the Rapideye red-edge band could improve the classification results. Three main steps of data processing were employed: (1) data pre-processing, (2) image classification, and (3) accuracy assessment, to evaluate the contribution of the red-edge band to the accuracy of the classification results across these four schemes. Comparison of the classification maps with the ground reference data indicated slightly higher accuracy for schemes 2 and 4. The overall accuracies and Kappa coefficients were 97% and 0.95 for scheme-2 and 96.9% and 0.95 for scheme-4, respectively.
NASA Technical Reports Server (NTRS)
Myint, Soe W.; Mesev, Victor; Quattrochi, Dale; Wentz, Elizabeth A.
2013-01-01
Remote sensing methods used to generate base maps to analyze the urban environment rely predominantly on digital sensor data from space-borne platforms. This is due in part to new sources of high spatial resolution data covering the globe, a variety of multispectral and multitemporal sources, sophisticated statistical and geospatial methods, and compatibility with GIS data sources and methods. The goal of this chapter is to review the four groups of classification methods for digital sensor data from space-borne platforms: per-pixel, sub-pixel, object-based (spatial-based), and geospatial methods. Per-pixel methods are widely used methods that classify pixels into distinct categories based solely on the spectral and ancillary information within that pixel. They are used for tasks ranging from simple calculations of environmental indices (e.g., NDVI) to sophisticated expert systems that assign urban land covers. Researchers recognize, however, that even with the smallest pixel size the spectral information within a pixel is really a combination of multiple urban surfaces. Sub-pixel classification methods therefore aim to statistically quantify the mixture of surfaces to improve overall classification accuracy. While within-pixel variations exist, there is also significant evidence that groups of nearby pixels have similar spectral information and therefore belong to the same classification category. Object-oriented methods have emerged that group pixels prior to classification based on spectral similarity and spatial proximity. Classification accuracy using object-based methods shows significant success and promise for numerous urban applications. Like the object-oriented methods that recognize the importance of spatial proximity, geospatial methods for urban mapping also utilize neighboring pixels in the classification process. 
The primary difference though is that geostatistical methods (e.g., spatial autocorrelation methods) are utilized during both the pre- and post-classification steps. Within this chapter, each of the four approaches is described in terms of scale and accuracy classifying urban land use and urban land cover; and for its range of urban applications. We demonstrate the overview of four main classification groups in Figure 1 while Table 1 details the approaches with respect to classification requirements and procedures (e.g., reflectance conversion, steps before training sample selection, training samples, spatial approaches commonly used, classifiers, primary inputs for classification, output structures, number of output layers, and accuracy assessment). The chapter concludes with a brief summary of the methods reviewed and the challenges that remain in developing new classification methods for improving the efficiency and accuracy of mapping urban areas.
Multiple Hypotheses Image Segmentation and Classification With Application to Dietary Assessment
Zhu, Fengqing; Bosch, Marc; Khanna, Nitin; Boushey, Carol J.; Delp, Edward J.
2016-01-01
We propose a method for dietary assessment to automatically identify and locate food in a variety of images captured during controlled and natural eating events. Two concepts are combined to achieve this: a set of segmented objects can be partitioned into perceptually similar object classes based on global and local features; and perceptually similar object classes can be used to assess the accuracy of image segmentation. These ideas are implemented by generating multiple segmentations of an image to select stable segmentations based on the classifier’s confidence score assigned to each segmented image region. Automatic segmented regions are classified using a multichannel feature classification system. For each segmented region, multiple feature spaces are formed. Feature vectors in each of the feature spaces are individually classified. The final decision is obtained by combining class decisions from individual feature spaces using decision rules. We show improved accuracy of segmenting food images with classifier feedback. PMID:25561457
Multiple hypotheses image segmentation and classification with application to dietary assessment.
Zhu, Fengqing; Bosch, Marc; Khanna, Nitin; Boushey, Carol J; Delp, Edward J
2015-01-01
We propose a method for dietary assessment to automatically identify and locate food in a variety of images captured during controlled and natural eating events. Two concepts are combined to achieve this: a set of segmented objects can be partitioned into perceptually similar object classes based on global and local features; and perceptually similar object classes can be used to assess the accuracy of image segmentation. These ideas are implemented by generating multiple segmentations of an image to select stable segmentations based on the classifier's confidence score assigned to each segmented image region. Automatic segmented regions are classified using a multichannel feature classification system. For each segmented region, multiple feature spaces are formed. Feature vectors in each of the feature spaces are individually classified. The final decision is obtained by combining class decisions from individual feature spaces using decision rules. We show improved accuracy of segmenting food images with classifier feedback.
Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio
2016-01-01
Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.
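The evaluation on "subjects with which the algorithm was not familiar" suggests that whole subjects were held out in turn. Under that assumption, a leave-one-subject-out loop can be sketched as follows. The nearest-centroid classifier is a deliberately simple stand-in, not the Random Forest used in the study, and the feature values, subject IDs and position labels are hypothetical.

```python
from collections import defaultdict

def fit_centroids(samples):
    """samples: list of (features, label). Returns {label: mean feature vector}."""
    sums, counts = {}, defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: dist2(model[y]))

def leave_one_subject_out(data):
    """data: list of (subject_id, features, label). Returns mean per-subject accuracy."""
    subjects = sorted({s for s, _, _ in data})
    scores = []
    for held_out in subjects:
        # Train on every subject except the held-out one, then test on that subject.
        train = [(x, y) for s, x, y in data if s != held_out]
        test = [(x, y) for s, x, y in data if s == held_out]
        model = fit_centroids(train)
        hits = sum(predict(model, x) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)

# Hypothetical two-feature samples from three subjects and two sitting positions.
data = [
    ("s1", [0.0, 0.1], "upright"), ("s1", [1.0, 0.9], "slouched"),
    ("s2", [0.1, 0.0], "upright"), ("s2", [0.9, 1.1], "slouched"),
    ("s3", [0.2, 0.1], "upright"), ("s3", [1.1, 1.0], "slouched"),
]
mean_accuracy = leave_one_subject_out(data)
print(mean_accuracy)
```

Holding out subjects rather than individual samples keeps repeated sittings by the same person from appearing in both the training and test folds, which would otherwise inflate the accuracy estimate.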
Analysis of spatial distribution of land cover maps accuracy
NASA Astrophysics Data System (ADS)
Khatami, R.; Mountrakis, G.; Stehman, S. V.
2017-12-01
Land cover maps have become one of the most important products of remote sensing science. However, classification errors will exist in any classified map and affect the reliability of subsequent map usage. Moreover, classification accuracy often varies over different regions of a classified map. These variations of accuracy will affect the reliability of subsequent analyses of different regions based on the classified maps. The traditional approach of map accuracy assessment based on an error matrix does not capture the spatial variation in classification accuracy. Here, per-pixel accuracy prediction methods are proposed based on interpolating accuracy values from a test sample to produce wall-to-wall accuracy maps. Different accuracy prediction methods were developed based on four factors: predictive domain (spatial versus spectral), interpolation function (constant, linear, Gaussian, and logistic), incorporation of class information (interpolating each class separately versus grouping them together), and sample size. Incorporation of spectral domain as explanatory feature spaces of classification accuracy interpolation was done for the first time in this research. Performance of the prediction methods was evaluated using 26 test blocks, with 10 km × 10 km dimensions, dispersed throughout the United States. The performance of the predictions was evaluated using the area under the curve (AUC) of the receiver operating characteristic. Relative to existing accuracy prediction methods, our proposed methods resulted in improvements of AUC of 0.15 or greater. 
Evaluation of the four factors comprising the accuracy prediction methods demonstrated that: i) interpolations should be done separately for each class instead of grouping all classes together; ii) if an all-classes approach is used, the spectral domain will result in substantially greater AUC than the spatial domain; iii) for the smaller sample size and per-class predictions, the spectral and spatial domain yielded similar AUC; iv) for the larger sample size (i.e., very dense spatial sample) and per-class predictions, the spatial domain yielded larger AUC; v) increasing the sample size improved accuracy predictions with a greater benefit accruing to the spatial domain; and vi) the function used for interpolation had the smallest effect on AUC.
Integrating multisource imagery and GIS analysis for mapping Bermuda's benthic habitats
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vierros, M.K.
1997-06-01
Bermuda is a group of isolated oceanic islands situated in the northwest Atlantic Ocean and surrounded by the Sargasso Sea. Bermuda possesses the northernmost coral reefs and mangroves in the Atlantic Ocean, and because of its high population density, both the terrestrial and marine environments are under intense human pressure. Although a long record of scientific research exists, this study is the first attempt to comprehensively map the area's benthic habitats, despite the need for such a map for resource assessment and management purposes. Multi-source and multi-date imagery were used for producing the habitat map due to the lack of a complete up-to-date image. Classifications were performed with SPOT data, and the results were verified using recent aerial photography and current aerial video, along with extensive ground truthing. Stratification of the image into regions prior to classification reduced the confusing effects of varying water depth. Classification accuracy in shallow areas was increased by derivation of a texture pseudo-channel, while bathymetry was used as a classification tool in deeper areas, where local patterns of zonation were well known. Because of seasonal variation in the extent of seagrasses, a classification scheme based on density could not be used. Instead, a set of classes based on the seagrass area's exposure to the open ocean was developed. The resulting habitat map is currently being assessed for accuracy with promising preliminary results, indicating its usefulness as a basis for future resource assessment studies.
Ramsey, Elijah W.; Nelson, Gene A.; Sapkota, Sijan
1998-01-01
A progressive classification of a marsh and forest system using Landsat Thematic Mapper (TM), color infrared (CIR) photograph, and ERS-1 synthetic aperture radar (SAR) data improved classification accuracy when compared to classification using solely TM reflective band data. The classification resulted in a detailed identification of differences within a nearly monotypic black needlerush marsh. Accuracy percentages of these classes were surprisingly high given the complexities of classification. The detailed classification resulted in a more accurate portrayal of the marsh transgressive sequence than was obtainable with TM data alone. Individual sensor contribution to the improved classification was compared to that using only the six reflective TM bands. Individually, the green reflective CIR and SAR data identified broad categories of water, marsh, and forest. In combination with TM, SAR and the green CIR band each improved overall accuracy by about 3% and 15% respectively. The SAR data improved the TM classification accuracy mostly in the marsh classes. The green CIR data also improved the marsh classification accuracy and accuracies in some water classes. The final combination of all sensor data improved almost all class accuracies from 2% to 70% with an overall improvement of about 20% over TM data alone. Not only was the identification of vegetation types improved, but the spatial detail of the classification approached 10 m in some areas.
ERIC Educational Resources Information Center
Duffrin, Christopher; Eakin, Angela; Bertrand, Brenda; Barber-Heidel, Kimberly; Carraway-Stage, Virginia
2011-01-01
The American College Health Association estimated that 31% of college students are overweight or obese. It is important that students have a correct perception of body weight status as extra weight has potential adverse health effects. This study assessed accuracy of perceived weight status versus medical classification among 102 college students.…
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan; Liu, Jingjing
2018-01-18
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33-100%, and ELM, with an accuracy rate of 98.01-100%. For level assessment, the R² related to the training set was above 0.97 and the R² related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016-0.3494, lower than the error of 0.5-1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level.
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
Sabr, Abutaleb; Moeinaddini, Mazaher; Azarnivand, Hossein; Guinot, Benjamin
2016-12-01
In recent years, dust storms originating from local abandoned agricultural lands have increasingly impacted Tehran and Karaj air quality. Studying land use/land cover change (LUCC) is necessary for designing and implementing mitigation plans. Land use/cover classification is particularly relevant in arid areas. This study aimed to map land use/cover by pixel- and object-based image classification methods, analyse landscape fragmentation and determine the effects of two different classification methods on landscape metrics. The same sets of ground data were used for both classification methods. Because accuracy of classification plays a key role in better understanding LUCC, both methods were employed. Land use/cover maps of the southwest area of Tehran city for the years 1985, 2000 and 2014 were obtained from Landsat digital images and classified into three categories: built-up, agricultural and barren lands. The results of our LUCC analysis showed that the most important changes in the built-up and agricultural land categories were observed in zone B (Shahriar, Robat Karim and Eslamshahr) between 1985 and 2014. The landscape metrics obtained for all categories indicated high landscape fragmentation in the study area. Although no significant difference was found between the two classification methods, the object-based classification led to a higher overall accuracy than the pixel-based classification. In particular, the accuracy of the built-up category showed a marked increase. In addition, both methods showed similar trends in fragmentation metrics. One of the reasons is that the object-based classification is able to identify buildings, impervious surfaces and roads in dense urban areas, which produced more accurate maps.
Classification of vegetation in an open landscape using full-waveform airborne laser scanner data
NASA Astrophysics Data System (ADS)
Alexander, Cici; Deák, Balázs; Kania, Adam; Mücke, Werner; Heilmeier, Hermann
2015-09-01
Airborne laser scanning (ALS) is increasingly being used for the mapping of vegetation, although the focus so far has been on woody vegetation, and ALS data have only rarely been used for the classification of grassland vegetation. In this study, we classified the vegetation of an open alkali landscape, characterized by two Natura 2000 habitat types: Pannonic salt steppes and salt marshes and Pannonic loess steppic grasslands. We generated 18 variables from an ALS dataset collected in the growing (leaf-on) season. Elevation is a key factor determining the patterns of vegetation types in the landscape, and hence 3 additional variables were based on a digital terrain model (DTM) generated from an ALS dataset collected in the dormant (leaf-off) season. We classified the vegetation into 24 classes based on these 21 variables, at a pixel size of 1 m. Two groups of variables with and without the DTM-based variables were used in a Random Forest classifier, to estimate the influence of elevation, on the accuracy of the classification. The resulting classes at Level 4, based on associations, were aggregated at three levels - Level 3 (11 classes), Level 2 (8 classes) and Level 1 (5 classes) - based on species pool, site conditions and structure, and the accuracies were assessed. The classes were also aggregated based on Natura 2000 habitat types to assess the accuracy of the classification, and its usefulness for the monitoring of habitat quality. The vegetation could be classified into dry grasslands, wetlands, weeds, woody species and man-made features, at Level 1, with an accuracy of 0.79 (Cohen's kappa coefficient, κ). The accuracies at Levels 2-4 and the classification based on the Natura 2000 habitat types were κ: 0.76, 0.61, 0.51 and 0.69, respectively. 
Levels 1 and 2 provide suitable information for nature conservationists and land managers, while Levels 3 and 4 are especially useful for ecologists, geologists and soil scientists as they provide high resolution data on species distribution, vegetation patterns, soil properties and on their correlations. Including the DTM-based variables increased the accuracy (κ) from 0.73 to 0.79 for Level 1. These findings show that the structural and spectral attributes of ALS echoes can be used for the classification of open landscapes, especially those where vegetation is influenced by elevation, such as coastal salt marshes, sand dunes, karst or alluvial areas; in these cases, ALS has a distinct advantage over other remotely sensed data.
Araki, Tadashi; Jain, Pankaj K; Suri, Harman S; Londhe, Narendra D; Ikeda, Nobutaka; El-Baz, Ayman; Shrivastava, Vimal K; Saba, Luca; Nicolaides, Andrew; Shafique, Shoaib; Laird, John R; Gupta, Ajay; Suri, Jasjit S
2017-01-01
Stroke risk stratification based on grayscale morphology of the ultrasound carotid wall has recently shown promise in classifying high risk versus low risk plaque or symptomatic versus asymptomatic plaques. In previous studies, this stratification has been mainly based on analysis of the far wall of the carotid artery. Due to the multifocal nature of atherosclerotic disease, plaque growth is not restricted to the far wall alone. This paper presents a new approach for stroke risk assessment by integrating assessment of both the near and far walls of the carotid artery using grayscale morphology of the plaque. Further, this paper presents a scientific validation system for stroke risk assessment. Neither of these innovations has been presented before. The methodology consists of an automated segmentation system of the near wall and far wall regions in grayscale carotid B-mode ultrasound scans. Sixteen grayscale texture features are computed and fed into the machine learning system. The training system utilizes the lumen diameter to create ground truth labels for the stratification of stroke risk. The cross-validation procedure is adapted in order to obtain the machine learning testing classification accuracy through the use of three sets of partition protocols (5, 10, and Jack Knife). The mean classification accuracy over all the sets of partition protocols for the automated system in the far and near walls is 95.08% and 93.47%, respectively. The corresponding accuracies for the manual system are 94.06% and 92.02%, respectively. The precision of merit of the automated machine learning system when compared against the manual risk assessment system is 98.05% and 97.53% for the far and near walls, respectively. The ROC of the risk assessment system for the far and near walls is close to 1.0, demonstrating high accuracy. Copyright © 2016 Elsevier Ltd. All rights reserved.
Dyrba, Martin; Barkhof, Frederik; Fellgiebel, Andreas; Filippi, Massimo; Hausner, Lucrezia; Hauenstein, Karlheinz; Kirste, Thomas; Teipel, Stefan J
2015-01-01
Alzheimer's disease (AD) patients show early changes in white matter (WM) structural integrity. We studied the use of diffusion tensor imaging (DTI) in assessing WM alterations in the predementia stage of mild cognitive impairment (MCI). We applied a Support Vector Machine (SVM) classifier to DTI and volumetric magnetic resonance imaging data from 35 amyloid-β42 negative MCI subjects (MCI-Aβ42-), 35 positive MCI subjects (MCI-Aβ42+), and 25 healthy controls (HC) retrieved from the European DTI Study on Dementia. The SVM was applied to DTI-derived fractional anisotropy, mean diffusivity (MD), and mode of anisotropy (MO) maps. For comparison, we studied classification based on gray matter (GM) and WM volume. We obtained accuracies of up to 68% for MO and 63% for GM volume when it came to distinguishing between MCI-Aβ42- and MCI-Aβ42+. When it came to separating MCI-Aβ42+ from HC we achieved an accuracy of up to 77% for MD and a significantly lower accuracy of 68% for GM volume. The accuracy of multimodal classification was not higher than the accuracy of the best single modality. Our results suggest that DTI data provide better prediction accuracy than GM volume in predementia AD. Copyright © 2015 by the American Society of Neuroimaging.
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E.; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J.
2016-01-01
Purpose: Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. Methods: We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. Results: In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%–74%) and 87% (95% CI, 86%–88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. Conclusions: This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group.
Translational Relevance: The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis. PMID:27668130
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images.
Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J
2016-09-01
Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%-74%) and 87% (95% CI, 86%-88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91-0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group. The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis.
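The Dice coefficient and consensus threshold used in this abstract can be sketched as follows. This is an illustrative Python example only: it assumes annotations are represented as sets of pixel coordinates and that the consensus threshold is the minimum fraction of annotators who must mark a pixel, which the abstract does not state explicitly. The names `dice_coefficient` and `consensus_mask` are hypothetical.

```python
def dice_coefficient(a, b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two pixel sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty annotations agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def consensus_mask(annotations, threshold):
    """Pixels marked by at least `threshold` (a fraction) of annotators.

    Assumption: this is one plausible reading of "consensus threshold".
    """
    counts = {}
    for annotation in annotations:
        for pixel in set(annotation):
            counts[pixel] = counts.get(pixel, 0) + 1
    n = len(annotations)
    return {pixel for pixel, c in counts.items() if c / n >= threshold}
```

Under these assumptions, the crowd's consensus mask at a given threshold would be compared with the expert annotation via `dice_coefficient`, and the threshold swept to find the value that maximizes the overlap.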
Lukas, Vanessa A; Fishbein, Kenneth W; Reiter, David A; Lin, Ping-Chang; Schneider, Erika; Spencer, Richard G
2015-07-01
To evaluate the sensitivity and specificity of classification of pathomimetically degraded bovine nasal cartilage at 3 Tesla and 37°C using univariate MRI measurements of both pure parameter values and intensities of parameter-weighted images. Pre- and posttrypsin degradation values of T1, T2, T2*, magnetization transfer ratio (MTR), and apparent diffusion coefficient (ADC), and corresponding weighted images, were analyzed. Classification based on the Euclidean distance was performed and the quality of classification was assessed through sensitivity, specificity and accuracy (ACC). The classifiers with the highest accuracy values were ADC (ACC = 0.82 ± 0.06), MTR (ACC = 0.78 ± 0.06), T1 (ACC = 0.99 ± 0.01), T2 derived from a three-dimensional (3D) spin-echo sequence (ACC = 0.74 ± 0.05), and T2 derived from a 2D spin-echo sequence (ACC = 0.77 ± 0.06), along with two of the diffusion-weighted signal intensities (b = 333 s/mm²: ACC = 0.80 ± 0.05; b = 666 s/mm²: ACC = 0.85 ± 0.04). In particular, T1 values differed substantially between the groups, resulting in atypically high classification accuracy. The second-best classifier, diffusion weighting with b = 666 s/mm², as well as all other parameters evaluated, exhibited substantial overlap between pre- and postdegradation groups, resulting in decreased accuracies. Classification according to T1 values showed excellent test characteristics (ACC = 0.99), with several other parameters also showing reasonable performance (ACC > 0.70). Of these, diffusion weighting is particularly promising as a potentially practical clinical modality. As in previous work, we again find that highly statistically significant group mean differences do not necessarily translate into accurate clinical classification rules. © 2014 Wiley Periodicals, Inc.
[Accuracy improvement of spectral classification of crop using microwave backscatter data].
Jia, Kun; Li, Qiang-Zi; Tian, Yi-Chen; Wu, Bing-Fang; Zhang, Fei-Fei; Meng, Ji-Hua
2011-02-01
In the present study, the use of VV polarization microwave backscatter data to improve the accuracy of spectral crop classification is investigated. Classification accuracies of different classifiers applied to the fusion of HJ satellite multi-spectral data and Envisat ASAR VV backscatter data are compared. The results indicate that the fused data can take full advantage of the spectral information of the HJ multi-spectral data and the structural sensitivity of the ASAR VV polarization data. The fused data enlarge the spectral differences among classes and improve crop classification accuracy. Classification accuracy using the fused data can be increased by 5 percent compared with HJ data alone. Furthermore, ASAR VV polarization data are sensitive to non-agrarian areas of planted fields, and including VV polarization data in the classification can effectively distinguish field borders. Combining VV polarization data with multi-spectral data in crop classification broadens the application of satellite data and has potential for wider use in agriculture.
Assessment of pedophilia using hemodynamic brain response to sexual stimuli.
Ponseti, Jorge; Granert, Oliver; Jansen, Olav; Wolff, Stephan; Beier, Klaus; Neutze, Janina; Deuschl, Günther; Mehdorn, Hubertus; Siebner, Hartwig; Bosinski, Hartmut
2012-02-01
Accurately assessing sexual preference is important in the treatment of child sex offenders. Phallometry is the standard method to identify sexual preference; however, this measure has been criticized for its intrusiveness and limited reliability. The objective was to evaluate whether the spatial response pattern to sexual stimuli, as revealed by a change in the blood oxygen level-dependent signal, facilitates the identification of pedophiles. During functional magnetic resonance imaging, pedophilic and nonpedophilic participants were briefly exposed to same- and opposite-sex images of nude children and adults. We calculated differences in blood oxygen level-dependent signals to child and adult sexual stimuli for each participant. The corresponding contrast images were entered into a group analysis to calculate whole-brain difference maps between groups. We calculated an expression value that corresponded to the group result for each participant. These expression values were submitted to 2 different classification algorithms: Fisher linear discriminant analysis and k-nearest neighbor analysis. This classification procedure was cross-validated using the leave-one-out method. The study was conducted at the Section of Sexual Medicine, Medical School, Christian Albrechts University of Kiel, Kiel, Germany. We recruited 24 participants with pedophilia who were sexually attracted to either prepubescent girls (n = 11) or prepubescent boys (n = 13) and 32 healthy male controls who were sexually attracted to either adult women (n = 18) or adult men (n = 14). The main outcome measures were the sensitivity and specificity scores of the 2 classification algorithms. The highest classification accuracy was achieved by Fisher linear discriminant analysis, which showed a mean accuracy of 95% (100% specificity, 88% sensitivity). Functional brain response patterns to sexual stimuli contain sufficient information to identify pedophiles with high accuracy. The automatic classification of these patterns is a promising objective tool to clinically diagnose pedophilia.
NASA Astrophysics Data System (ADS)
Knoefel, Patrick; Loew, Fabian; Conrad, Christopher
2015-04-01
Crop maps based on classification of remotely sensed data are receiving increased attention in agricultural management. This calls for more detailed knowledge about the reliability of such spatial information. However, classification of agricultural land use is often limited by high spectral similarities of the studied crop types. Moreover, spatially and temporally varying agro-ecological conditions can introduce confusion in crop mapping. Classification errors in crop maps may in turn influence model outputs, such as agricultural production monitoring. One major goal of the PhenoS project ("Phenological structuring to determine optimal acquisition dates for Sentinel-2 data for field crop classification") is the detection of optimal phenological time windows for land cover classification purposes. Since many crop species are spectrally highly similar, accurate classification requires the right selection of satellite images for a certain classification task. In the course of one growing season, phenological phases exist where crops are separable with higher accuracies. For this purpose, coupling of multi-temporal spectral characteristics and phenological events is promising. The focus of this study is on the separation of spectrally similar cereal crops like winter wheat, barley, and rye at two test sites in Germany called "Harz/Central German Lowland" and "Demmin". This study uses object-based random forest (RF) classification to investigate the impact of image acquisition frequency and timing on crop classification uncertainty by permuting all possible combinations of the available RapidEye time series recorded at the test sites between 2010 and 2014. The permutations were applied to different segmentation parameters. Then, classification uncertainty was assessed and analysed on a per-field basis, based on the probabilistic soft output from the RF algorithm. From this soft output, entropy was calculated as a spatial measure of classification uncertainty.
The results indicate that uncertainty estimates provide a valuable addition to traditional accuracy assessments and help the user to locate errors in crop maps.
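To make the entropy-based uncertainty measure in this abstract concrete, here is a minimal Python sketch. It assumes that the random forest soft output for one field is a list of class probabilities (for example, vote fractions) and that Shannon entropy is the measure intended; the abstract does not specify the exact formulation. The function names are hypothetical.

```python
import math


def shannon_entropy(probs, base=2.0):
    """Shannon entropy of one field's class-probability vector."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalize so that raw vote counts are accepted as well.
    normalized = [p / total for p in probs]
    return -sum(p * math.log(p, base) for p in normalized if p > 0)


def normalized_entropy(probs):
    """Entropy scaled to [0, 1] by the maximum for this number of classes."""
    n_classes = len(probs)
    if n_classes < 2:
        return 0.0
    return shannon_entropy(probs) / math.log(n_classes, 2.0)
```

A field whose votes are spread evenly across wheat, barley, and rye would score close to 1 on the normalized scale, while a field assigned almost unanimously to one crop would score close to 0, so mapping the value per field highlights where the crop map is least reliable.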
Energy-Based Metrics for Arthroscopic Skills Assessment.
Poursartip, Behnaz; LeBel, Marie-Eve; McCracken, Laura C; Escoto, Abelardo; Patel, Rajni V; Naish, Michael D; Trejos, Ana Luisa
2017-08-05
Minimally invasive skills assessment methods are essential in developing efficient surgical simulators and implementing consistent skills evaluation. Although numerous methods have been investigated in the literature, there is still a need to further improve the accuracy of surgical skills assessment. Energy expenditure can be an indication of motor skills proficiency. The goals of this study are to develop objective metrics based on energy expenditure, normalize these metrics, and investigate classifying trainees using these metrics. To this end, different forms of energy consisting of mechanical energy and work were considered and their values were divided by the related value of an ideal performance to develop normalized metrics. These metrics were used as inputs for various machine learning algorithms including support vector machines (SVM) and neural networks (NNs) for classification. The accuracy of the combination of the normalized energy-based metrics with these classifiers was evaluated through a leave-one-subject-out cross-validation. The proposed method was validated using 26 subjects at two experience levels (novices and experts) in three arthroscopic tasks. The results showed that there are statistically significant differences between novices and experts for almost all of the normalized energy-based metrics. The accuracy of classification using SVM and NN methods was between 70% and 95% for the various tasks. The results show that the normalized energy-based metrics and their combination with SVM and NN classifiers are capable of providing accurate classification of trainees. The assessment method proposed in this study can enhance surgical training by providing appropriate feedback to trainees about their level of expertise and can be used in the evaluation of proficiency.
NASA Astrophysics Data System (ADS)
Law, Yan Nei; Lieng, Monica Keiko; Li, Jingmei; Khoo, David Aik-Aun
2014-03-01
Breast cancer is the most common cancer and second leading cause of cancer death among women in the US. The relative survival rate is lower among women with a more advanced stage at diagnosis. Early detection through screening is vital. Mammography is the most widely used and only proven screening method for reliably and effectively detecting abnormal breast tissues. In particular, mammographic density is one of the strongest breast cancer risk factors, after age and gender, and can be used to assess the future risk of disease before individuals become symptomatic. A reliable method for automatic density assessment would be beneficial and could assist radiologists in the evaluation of mammograms. To address this problem, we propose a density classification method which uses statistical features from different parts of the breast. Our method is composed of three parts: breast region identification, feature extraction and building ensemble classifiers for density assessment. It explores the potential of the features extracted from second and higher order statistical information for mammographic density classification. We further investigate the registration of bilateral pairs and time-series of mammograms. The experimental results on 322 mammograms demonstrate that (1) a classifier using features from dense regions has higher discriminative power than a classifier using only features from the whole breast region; (2) these high-order features can be effectively combined to boost the classification accuracy; (3) a classifier using these statistical features from dense regions achieves 75% accuracy, which is a significant improvement from 70% accuracy obtained by the existing approaches.
NASA Astrophysics Data System (ADS)
Suiter, Ashley Elizabeth
Multi-spectral imagery provides a robust and low-cost dataset for assessing wetland extent and quality over broad regions and is frequently used for wetland inventories. However, in forested wetlands, hydrology is obscured by tree canopy, making it difficult to detect with multi-spectral imagery alone. Because of this, classification of forested wetlands often includes greater errors than that of other wetland types. Elevation and terrain derivatives have been shown to be useful for modelling wetland hydrology, but few studies have addressed the use of LiDAR intensity data for detecting hydrology in forested wetlands. Due to the tendency of the LiDAR signal to be attenuated by water, this research proposed the fusion of LiDAR intensity data with LiDAR elevation, terrain data, and aerial imagery for the detection of forested wetland hydrology. We examined the utility of LiDAR intensity data and determined whether the fusion of LiDAR-derived data with multi-spectral imagery increased the accuracy of forested wetland classification compared with a classification performed with only multi-spectral imagery. Four classifications were performed: Classification A -- All Imagery, Classification B -- All LiDAR, Classification C -- LiDAR without Intensity, and Classification D -- Fusion of All Data. These classifications were performed using random forest and each resulted in a 3-foot resolution thematic raster of forested upland and forested wetland locations in Vermilion County, Illinois. The accuracies of these classifications were compared using the Kappa Coefficient of Agreement. Importance statistics produced within the random forest classifier were evaluated in order to understand the contribution of individual datasets. Classification D, which used the fusion of LiDAR and multi-spectral imagery as input variables, had moderate to strong agreement between reference data and classification results.
It was found that Classification B, performed using all the LiDAR data and its derivatives (intensity, elevation, slope, aspect, curvatures, and Topographic Wetness Index), was the most accurate classification with Kappa: 78.04%, indicating moderate to strong agreement. However, Classification C, performed with LiDAR derivatives without intensity data, had less agreement than would be expected by chance, indicating that LiDAR intensity contributed significantly to the accuracy of Classification B.
An Assessment of Worldview-2 Imagery for the Classification Of a Mixed Deciduous Forest
NASA Astrophysics Data System (ADS)
Carter, Nahid
Remote sensing provides a variety of methods for classifying forest communities and can be a valuable tool for the impact assessment of invasive species. The emerald ash borer (Agrilus planipennis) infestation of ash trees (Fraxinus) in the United States has resulted in the mortality of large stands of ash throughout the Northeast. This study assessed the suitability of multi-temporal Worldview-2 multispectral satellite imagery for classifying a mixed deciduous forest in Upstate New York. Training sites were collected using a Global Positioning System (GPS) receiver, with each training site consisting of a single tree of a corresponding class. Six classes were collected: Ash, Maple, Oak, Beech, Evergreen, and Other. Three different classifications were investigated on four data sets: a six-class classification (6C), a two-class classification consisting of ash and all other classes combined (2C), and a five-class classification in which the ash and maple classes were merged (5C). The four data sets included Worldview-2 multispectral data collected in June 2010 (J-WV2) and September 2010 (S-WV2), a layer-stacked data set using J-WV2 and S-WV2 (LS-WV2), and a reduced data set (RD-WV2). RD-WV2 was created using a statistical analysis of the processed and unprocessed imagery, which reduced the dimensionality of the data and identified key bands. Overall accuracy varied considerably depending upon the classification type, but results indicated that ash was confused with maple in a majority of the classifications. Ash was most accurately identified using the 2C classification and RD-WV2 data set (81.48%). A combination of the ash and maple classes yielded an accuracy of 89.41%. Future work should focus on separating the ash and maple classes by using data sources such as hyperspectral imagery, LiDAR, or extensive forest surveys.
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2010-01-01
Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…
Zhou, Tao; Li, Zhaofu; Pan, Jianjun
2018-01-27
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively.
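The overall accuracy and kappa coefficient reported in this abstract are both derived from a confusion matrix. The following Python sketch shows the standard computation of Cohen's kappa; the matrix layout (a square list of lists of counts) and the function names are assumptions made for illustration, and the numbers in the usage note are invented, not taken from the paper.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohens_kappa(cm):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    p_o = overall_accuracy(cm)  # observed agreement
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)
```

For a hypothetical two-class matrix `[[40, 10], [5, 45]]`, observed agreement is 0.85 and chance agreement is 0.50, so kappa is 0.70. Kappa is lower than overall accuracy because it discounts the agreement that the class proportions alone would produce.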
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
NASA Astrophysics Data System (ADS)
Massey, Richard
Cropland characteristics and accurate maps of their spatial distribution are required to develop strategies for global food security by continental-scale assessments and agricultural land use policies. North America is the major producer and exporter of coarse grains, wheat, and other crops. While cropland characteristics such as crop types are available at country scale in North America, continental-scale cropland products are lacking at sufficiently fine resolution, such as 30 m. Additionally, automated, open, and rapid methods that map cropland characteristics over large areas without the need for ground samples, running on efficient high-performance computing platforms, are needed for timely and long-term cropland monitoring. In this study, I developed novel, automated, and open methods to map cropland extent, crop intensity, and crop types in the North American continent using large remote sensing datasets on high-performance computing platforms. First, a novel method was developed to map cropland extent at continental scale by fusing pixel-based classification of continental-scale Landsat data, using the Random Forest algorithm available on the Google Earth Engine cloud computing platform, with an object-based classification approach, recursive hierarchical segmentation (RHSeg). Using the fusion method, a continental-scale cropland extent map for North America at 30 m spatial resolution for the nominal year 2010 was produced. In this map, the total cropland area for North America was estimated at 275.2 million hectares (Mha). This map was assessed for accuracy using randomly distributed samples derived from United States Department of Agriculture (USDA) cropland data layer (CDL), Agriculture and Agri-Food Canada (AAFC) annual crop inventory (ACI), Servicio de Informacion Agroalimentaria y Pesquera (SIAP), Mexico's agricultural boundaries, and photo-interpretation of high-resolution imagery.
The overall accuracy of the map is 93.4%, with a producer's accuracy for the crop class of 85.4% and a user's accuracy of 74.5% across the continent. The sub-country statistics, including state-wise and county-wise cropland statistics derived from this map, compared well in regression models, resulting in R2 > 0.84. Secondly, an automated phenological pattern matching (PPM) method to efficiently map cropping intensity was also developed in this study. This study presents a continental-scale cropping intensity map for the North American continent at 250 m spatial resolution for 2010. In this map, the total areas for single crop, double crop, continuous crop, and fallow were estimated to be 123.5 Mha, 11.1 Mha, 64.0 Mha, and 83.4 Mha, respectively. This map was assessed using limited country-level reference datasets derived from United States Department of Agriculture cropland data layer and Agriculture and Agri-Food Canada annual crop inventory with overall accuracies of 79.8% and 80.2%, respectively. Third, two novel and automated decision tree classification approaches to map crop types across the conterminous United States (U.S.) using MODIS 250 m resolution data were developed: 1) generalized, and 2) year-specific classification. Both approaches use similarities and dissimilarities in crop type phenology derived from NDVI time-series data. Annual crop type maps were produced for 8 major crop types in the United States using the generalized classification approach for 2001-2014 and the year-specific approach for 2008, 2010, 2011 and 2012. The year-specific classification had overall accuracies greater than 78%, while the generalized classifier had accuracies greater than 75% for the conterminous U.S. for 2008, 2010, 2011, and 2012. The generalized classifier enables automated and routine crop type mapping without repeated and expensive ground sample collection year after year with overall accuracies > 70% across all independent years.
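Producer's and user's accuracy, as quoted in this abstract for the crop class, are per-class rates read off a confusion matrix. The Python sketch below assumes the common remote sensing convention that rows hold the mapped (classified) labels and columns hold the reference labels; the abstract does not state its layout, and the function names and example counts are hypothetical.

```python
def users_accuracy(cm, cls):
    """Share of samples mapped as `cls` that truly are `cls`.

    Assumption: rows = map labels, columns = reference labels.
    Equals 1 minus the commission error.
    """
    mapped_total = sum(cm[cls])
    return cm[cls][cls] / mapped_total


def producers_accuracy(cm, cls):
    """Share of reference `cls` samples that the map labels as `cls`.

    Equals 1 minus the omission error.
    """
    reference_total = sum(row[cls] for row in cm)
    return cm[cls][cls] / reference_total
```

With an invented matrix `[[30, 10], [20, 40]]` and class 0 as "crop", user's accuracy is 30/40 = 0.75 and producer's accuracy is 30/50 = 0.60. A gap between the two in either direction signals that the map either over-predicts or under-predicts the class.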
Taken together, these cropland products of extent, cropping intensity, and crop types, are significantly beneficial in agricultural and water use planning and monitoring to formulate policies towards global and North American food security issues.
Power System Transient Stability Based on Data Mining Theory
NASA Astrophysics Data System (ADS)
Cui, Zhen; Shi, Jia; Wu, Runsheng; Lu, Dan; Cui, Mingde
2018-01-01
In order to study the stability of power systems, a transient stability assessment method based on data mining theory is designed. By introducing association rule analysis from data mining theory, an association classification method for transient stability assessment is presented. A mathematical model of transient stability assessment based on data mining technology is established. The association classification method combines rule reasoning with classification prediction to perform transient stability assessment. The transient stability index is used to identify the samples that cannot be correctly classified by association classification. Then, according to the critical stability of each sample, the time domain simulation method is used to determine the state, so as to ensure the accuracy of the final results. The results show that this stability assessment system can improve the speed of operation under the premise that the analysis result is completely correct, and that the improved algorithm can find the inherent relation between changes in the power system operation mode and changes in the degree of transient stability.
NASA Astrophysics Data System (ADS)
Diesing, Markus; Green, Sophie L.; Stephens, David; Lark, R. Murray; Stewart, Heather A.; Dove, Dayton
2014-08-01
Marine spatial planning and conservation need underpinning with sufficiently detailed and accurate seabed substrate and habitat maps. Although multibeam echosounders enable us to map the seabed with high resolution and spatial accuracy, there is still a lack of fit-for-purpose seabed maps. This is due to the high costs involved in carrying out systematic seabed mapping programmes and the fact that the development of validated, repeatable, quantitative and objective methods of swath acoustic data interpretation is still in its infancy. We compared a wide spectrum of approaches including manual interpretation, geostatistics, object-based image analysis and machine-learning to gain further insights into the accuracy and comparability of acoustic data interpretation approaches based on multibeam echosounder data (bathymetry, backscatter and derivatives) and seabed samples with the aim to derive seabed substrate maps. Sample data were split into a training and validation data set to allow us to carry out an accuracy assessment. Overall thematic classification accuracy ranged from 67% to 76% and Cohen's kappa varied between 0.34 and 0.52. However, these differences were not statistically significant at the 5% level. Misclassifications were mainly associated with uncommon classes, which were rarely sampled. Map outputs were between 68% and 87% identical. To improve classification accuracy in seabed mapping, we suggest that more studies on the effects of factors affecting the classification performance as well as comparative studies testing the performance of different approaches need to be carried out with a view to developing guidelines for selecting an appropriate method for a given dataset. In the meantime, classification accuracy might be improved by combining different techniques to hybrid approaches and multi-method ensembles.
Empirical evaluation of data normalization methods for molecular classification.
Huang, Huei-Chung; Qin, Li-Xuan
2018-01-01
Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correct for such artifacts is through post hoc data adjustment such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers, an increasingly important application of microarrays in the era of personalized medicine. In this study, we set out to evaluate the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages at GitHub. In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated on independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation was associated with biased estimation of classification accuracy in the over-optimistic direction for all three normalization methods. Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy.
Erdodi, Laszlo A; Tyson, Bradley T; Shahein, Ayman G; Lichtenstein, Jonathan D; Abeare, Christopher A; Pelletier, Chantalle L; Zuccato, Brandon G; Kucharski, Brittany; Roth, Robert M
2017-05-01
The Recognition Memory Test (RMT) and Word Choice Test (WCT) are structurally similar, but psychometrically different. Previous research demonstrated that adding a time-to-completion cutoff improved the classification accuracy of the RMT. However, the contribution of WCT time-cutoffs to improve the detection of invalid responding has not been investigated. The present study was designed to evaluate the classification accuracy of time-to-completion on the WCT compared to the accuracy score and the RMT. Both tests were administered to 202 adults (M age = 45.3 years, SD = 16.8; 54.5% female) clinically referred for neuropsychological assessment in counterbalanced order as part of a larger battery of cognitive tests. Participants obtained lower and more variable scores on the RMT (M = 44.1, SD = 7.6) than on the WCT (M = 46.9, SD = 5.7). Similarly, they took longer to complete the recognition trial on the RMT (M = 157.2 s, SD = 71.8) than the WCT (M = 137.2 s, SD = 75.7). The optimal cutoff on the RMT (≤43) produced .60 sensitivity at .87 specificity. The optimal cutoff on the WCT (≤47) produced .57 sensitivity at .87 specificity. Time-cutoffs produced comparable classification accuracies for both RMT (≥192 s; .48 sensitivity at .88 specificity) and WCT (≥171 s; .49 sensitivity at .91 specificity). They also identified an additional 6-10% of the invalid profiles missed by accuracy score cutoffs, while maintaining good specificity (.93-.95). Functional equivalence was reached at accuracy scores ≤43 (RMT) and ≤47 (WCT) or time-to-completion ≥192 s (RMT) and ≥171 s (WCT). Time-to-completion cutoffs are valuable additions to both tests. They can function as independent validity indicators or enhance the sensitivity of accuracy scores without requiring additional measures or extending standard administration time.
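To make the cutoff logic in this abstract concrete, the following minimal sketch computes sensitivity and specificity for an "at or below" accuracy-score cutoff. The scores and validity labels are invented for illustration, and `sens_spec` is a hypothetical helper, not part of any test manual.

```python
def sens_spec(scores, is_invalid, cutoff):
    """Flag a profile as invalid when score <= cutoff; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, invalid in zip(scores, is_invalid):
        flagged = score <= cutoff
        if invalid and flagged:
            tp += 1          # invalid profile correctly flagged
        elif invalid:
            fn += 1          # invalid profile missed
        elif flagged:
            fp += 1          # valid profile wrongly flagged
        else:
            tn += 1          # valid profile correctly passed
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical accuracy scores and criterion labels (True = invalid responding).
scores = [40, 42, 45, 48, 50, 44, 49, 43]
is_invalid = [True, True, True, False, False, False, False, False]

sensitivity, specificity = sens_spec(scores, is_invalid, cutoff=43)
print(sensitivity, specificity)
```

A time-to-completion cutoff works the same way with the comparison reversed (flag when time is at or above the cutoff).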
Active relearning for robust supervised classification of pulmonary emphysema
NASA Astrophysics Data System (ADS)
Raghunath, Sushravya; Rajagopalan, Srinivasan; Karwoski, Ronald A.; Bartholmai, Brian J.; Robb, Richard A.
2012-03-01
Radiologists are adept at recognizing the appearance of lung parenchymal abnormalities in CT scans. However, the inconsistent differential diagnosis, due to subjective aggregation, mandates supervised classification. Towards optimizing emphysema classification, we introduce a physician-in-the-loop feedback approach in order to minimize uncertainty in the selected training samples. Using multi-view inductive learning with the training samples, an ensemble of Support Vector Machine (SVM) models, each based on a specific pair-wise dissimilarity metric, was constructed in less than six seconds. In the active relearning phase, the ensemble-expert label conflicts were resolved by an expert. This just-in-time feedback with unoptimized SVMs yielded a 15% increase in classification accuracy and a 25% reduction in the number of support vectors. The generality of relearning was assessed in the optimized parameter space of six different classifiers across seven dissimilarity metrics. The resultant average accuracy improved to 21%. The co-operative feedback method proposed here could enhance both diagnostic and staging throughput efficiency in chest radiology practice.
Monitoring strip mining and reclamation with LANDSAT data in Belmont County, Ohio
NASA Technical Reports Server (NTRS)
Witt, R. G.; Schaal, G. M.; Bly, B. G.
1983-01-01
The utility of LANDSAT digital data for mapping and monitoring surface mines in Belmont County, Ohio was investigated. Two data sets from 1976 and 1979 were processed to classify level 1 land covers and three strip mine categories in order to examine change over time and assess reclamation efforts. The two classifications were compared with aerial photographs. Results of the accuracy assessment show that both classifications are approximately 86 per cent correct, and that surface mine change detection (date-to-date comparison) is facilitated by the digital format of LANDSAT data.
Combining Machine Learning and Natural Language Processing to Assess Literary Text Comprehension
ERIC Educational Resources Information Center
Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.
2017-01-01
This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…
Assessing the accuracy of a regional land cover classification
William Clerke; Raymond Czaplewski; Jeff Campbell; Janet Fahringer
1996-01-01
The Southern Region USDA Forest Service recently completed the Southern Appalachian Assessment (SAA). The Assessment is a broad scale interagency analysis and sharing of existing information relative to the natural and human resources of the region. The SAA encompasses over 36 million acres extending from Northern Virginia to Northern Alabama. It was clear early in the...
Land-cover classification in a moist tropical region of Brazil with Landsat TM imagery
LI, GUIYING; LU, DENGSHENG; MORAN, EMILIO; HETRICK, SCOTT
2011-01-01
This research aims to improve land-cover classification accuracy in a moist tropical region in Brazil by examining the use of different remote sensing-derived variables and classification algorithms. Different scenarios based on Landsat Thematic Mapper (TM) spectral data and derived vegetation indices and textural images, and different classification algorithms – maximum likelihood classification (MLC), artificial neural network (ANN), classification tree analysis (CTA), and object-based classification (OBC), were explored. The results indicated that a combination of vegetation indices as extra bands into Landsat TM multispectral bands did not improve the overall classification performance, but the combination of textural images was valuable for improving vegetation classification accuracy. In particular, the combination of both vegetation indices and textural images into TM multispectral bands improved overall classification accuracy by 5.6% and kappa coefficient by 6.25%. Comparison of the different classification algorithms indicated that CTA and ANN have poor classification performance in this research, but OBC improved primary forest and pasture classification accuracies. This research indicates that use of textural images or use of OBC are especially valuable for improving the vegetation classes such as upland and liana forest classes having complex stand structures and having relatively large patch sizes. PMID:22368311
Classification of Aerosol Retrievals from Spaceborne Polarimetry Using a Multiparameter Algorithm
NASA Technical Reports Server (NTRS)
Russell, Philip B.; Kacenelenbogen, Meloe; Livingston, John M.; Hasekamp, Otto P.; Burton, Sharon P.; Schuster, Gregory L.; Johnson, Matthew S.; Knobelspiesse, Kirk D.; Redemann, Jens; Ramachandran, S.
2013-01-01
In this presentation, we demonstrate application of a new aerosol classification algorithm to retrievals from the POLDER-3 polarimeter on the PARASOL spacecraft. Motivation and method: Since the development of global aerosol measurements by satellites and AERONET, classification of observed aerosols into several types (e.g., urban-industrial, biomass burning, mineral dust, maritime, and various subtypes or mixtures of these) has proven useful for: understanding aerosol sources, transformations, effects, and feedback mechanisms; improving accuracy of satellite retrievals and quantifying assessments of aerosol radiative impacts on climate.
Accuracy assessment of NLCD 2006 land cover and impervious surface
Wickham, James D.; Stehman, Stephen V.; Gass, Leila; Dewitz, Jon; Fry, Joyce A.; Wade, Timothy G.
2013-01-01
Release of NLCD 2006 provides the first wall-to-wall land-cover change database for the conterminous United States from Landsat Thematic Mapper (TM) data. Accuracy assessment of NLCD 2006 focused on four primary products: 2001 land cover, 2006 land cover, land-cover change between 2001 and 2006, and impervious surface change between 2001 and 2006. The accuracy assessment was conducted by selecting a stratified random sample of pixels with the reference classification interpreted from multi-temporal high resolution digital imagery. The NLCD Level II (16 classes) overall accuracies for the 2001 and 2006 land cover were 79% and 78%, respectively, with Level II user's accuracies exceeding 80% for water, high density urban, all upland forest classes, shrubland, and cropland for both dates. Level I (8 classes) accuracies were 85% for NLCD 2001 and 84% for NLCD 2006. The high overall and user's accuracies for the individual dates translated into high user's accuracies for the 2001–2006 change reporting themes water gain and loss, forest loss, urban gain, and the no-change reporting themes for water, urban, forest, and agriculture. The main factor limiting higher accuracies for the change reporting themes appeared to be difficulty in distinguishing the context of grass. We discuss the need for more research on land-cover change accuracy assessment.
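The user's accuracies reported above are per-class quantities read off a confusion matrix. The sketch below shows the basic definitions under the assumption of simple sample counts with rows as map classes and columns as reference classes; a stratified design such as the one described would normally apply stratum weights, which this sketch omits. The counts and function names are hypothetical.

```python
def users_accuracy(matrix):
    """Per map class: correct samples divided by all samples mapped to that class (row total)."""
    return [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


def producers_accuracy(matrix):
    """Per reference class: correct samples divided by all reference samples of that class (column total)."""
    n = len(matrix)
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [matrix[j][j] / col_totals[j] for j in range(n)]


# Hypothetical counts: rows = map label, columns = reference label.
confusion = [[45, 5],
             [15, 35]]
print(users_accuracy(confusion))
print(producers_accuracy(confusion))
```

User's accuracy answers "how often is a mapped label right?", while producer's accuracy answers "how often is a true class found by the map?"; the two can differ substantially for the same class.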
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan
2018-01-01
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R² related to the training set was above 0.97 and the R² related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level. PMID:29346328
Classification of right-hand grasp movement based on EMOTIV Epoc+
NASA Astrophysics Data System (ADS)
Tobing, T. A. M. L.; Prawito, Wijaya, S. K.
2017-07-01
Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for best classification accuracy of right-hand grasp movement based on an EEG headset, the EMOTIV Epoc+. There are three movement classifications: grasping hand, relax, and opening hand. These classifications take advantage of the Event-Related Desynchronization (ERD) phenomenon, which makes it possible to differentiate relaxation, imagery, and movement states from each other. The combinations of elements are the usage of Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequency as features, and the Probabilistic Neural Network (PNN) and Radial Basis Function (RBF) classifiers. The average values of classification accuracy are approximately 83% for training and approximately 57% for testing. To give a better understanding of the signal quality recorded by the EMOTIV Epoc+, the classification accuracy for left- or right-hand grasping movement EEG signals (provided by Physionet) is also given, i.e., approximately 85% for training and approximately 70% for testing. The comparison of accuracy values from each combination, experiment condition, and external EEG data is provided for the purpose of value analysis of classification accuracy.
The Southwest Regional Gap Analysis Project (SW ReGAP) improves upon previous GAP projects conducted in Arizona, Colorado, Nevada, New Mexico, and Utah to provide a consistent, seamless vegetation map for this large and ecologically diverse geographic region. Nevada's compone...
Mapping of land cover in northern California with simulated hyperspectral satellite imagery
NASA Astrophysics Data System (ADS)
Clark, Matthew L.; Kilham, Nina E.
2016-09-01
Land-cover maps are important science products needed for natural resource and ecosystem service management, biodiversity conservation planning, and assessing human-induced and natural drivers of land change. Analysis of hyperspectral, or imaging spectrometer, imagery has shown an impressive capacity to map a wide range of natural and anthropogenic land cover. Applications have been mostly with single-date imagery from relatively small spatial extents. Future hyperspectral satellites will provide imagery at greater spatial and temporal scales, and there is a need to assess techniques for mapping land cover with these data. Here we used simulated multi-temporal HyspIRI satellite imagery over a 30,000 km² area in the San Francisco Bay Area, California to assess its capabilities for mapping classes defined by the international Land Cover Classification System (LCCS). We employed a mapping methodology and analysis framework that is applicable to regional and global scales. We used the Random Forests classifier with three sets of predictor variables (reflectance, MNF, hyperspectral metrics), two temporal resolutions (summer, spring-summer-fall), two sample scales (pixel, polygon) and two levels of classification complexity (12, 20 classes). Hyperspectral metrics provided a 16.4-21.8% and 3.1-6.7% increase in overall accuracy relative to MNF and reflectance bands, respectively, depending on pixel or polygon scales of analysis. Multi-temporal metrics improved overall accuracy by 0.9-3.1% over summer metrics, yet increases were only significant at the pixel scale of analysis. Overall accuracy at pixel scales was 72.2% (Kappa 0.70) with three seasons of metrics. Anthropogenic and homogenous natural vegetation classes had relatively high confidence and producer and user accuracies were over 70%; in comparison, woodland and forest classes had considerable confusion.
We next focused on plant functional types with relatively pure spectra by removing open-canopy shrublands, woodlands and mixed forests from the classification. This 12-class map had significantly improved accuracy of 85.1% (Kappa 0.83) and most classes had over 70% producer and user accuracies. Finally, we summarized important metrics from the multi-temporal Random Forests to infer the underlying chemical and structural properties that best discriminated our land-cover classes across seasons.
Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information
NASA Astrophysics Data System (ADS)
Jamshidpour, N.; Homayouni, S.; Safari, A.
2017-09-01
Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are the two most important issues, which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues through the contribution of unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce. When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for AVIRIS Indian Pine and Pavia University data sets respectively.
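The abstract does not state how the two Laplacians were merged. One common choice, assumed here purely for illustration, is a convex combination of the unnormalized Laplacians L = D - W of the spectral and spatial graphs. The weight `alpha`, the tiny 3-pixel adjacency matrices, and the function names are all hypothetical.

```python
def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def joint_laplacian(w_spectral, w_spatial, alpha):
    """Assumed merge rule: alpha * L_spectral + (1 - alpha) * L_spatial."""
    l_spec = laplacian(w_spectral)
    l_spat = laplacian(w_spatial)
    return [
        [alpha * a + (1 - alpha) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(l_spec, l_spat)
    ]


# Hypothetical 3-pixel graphs: spectral similarity and spatial adjacency.
w_spectral = [[0, 1, 0], [1, 0, 0.5], [0, 0.5, 0]]
w_spatial = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]

for row in joint_laplacian(w_spectral, w_spatial, alpha=0.5):
    print(row)
```

Because each input Laplacian has zero row sums, any convex combination also does, so the merged matrix is still a valid Laplacian of a weighted joint graph.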
Comparison of seven protocols to identify fecal contamination sources using Escherichia coli
Stoeckel, D.M.; Mathes, M.V.; Hyer, K.E.; Hagedorn, C.; Kator, H.; Lukasik, J.; O'Brien, T. L.; Fenger, T.W.; Samadpour, M.; Strickler, K.M.; Wiggins, B.A.
2004-01-01
Microbial source tracking (MST) uses various approaches to classify fecal-indicator microorganisms to source hosts. Reproducibility, accuracy, and robustness of seven phenotypic and genotypic MST protocols were evaluated by use of Escherichia coli from an eight-host library of known-source isolates and a separate, blinded challenge library. In reproducibility tests, measuring each protocol's ability to reclassify blinded replicates, only one (pulsed-field gel electrophoresis; PFGE) correctly classified all test replicates to host species; three protocols classified 48-62% correctly, and the remaining three classified fewer than 25% correctly. In accuracy tests, measuring each protocol's ability to correctly classify new isolates, ribotyping with EcoRI and PvuII approached 100% correct classification but only 6% of isolates were classified; four of the other six protocols (antibiotic resistance analysis, PFGE, and two repetitive-element PCR protocols) achieved better than random accuracy rates when 30-100% of challenge isolates were classified. In robustness tests, measuring each protocol's ability to recognize isolates from nonlibrary hosts, three protocols correctly classified 33-100% of isolates as "unknown origin," whereas four protocols classified all isolates to a source category. A relevance test, summarizing interpretations for a hypothetical water sample containing 30 challenge isolates, indicated that false-positive classifications would hinder interpretations for most protocols. Study results indicate that more representation in known-source libraries and better classification accuracy would be needed before field application. Thorough reliability assessment of classification results is crucial before and during application of MST protocols.
Tse, Samson; Davidson, Larry; Chung, Ka-Fai; Yu, Chong Ho; Ng, King Lam; Tsoi, Emily
2015-02-01
More mental health services are adopting the recovery paradigm. This study adds to prior research by (a) using measures of stages of recovery and elements of recovery that were designed and validated in a non-Western, Chinese culture and (b) testing which demographic factors predict advanced recovery and whether placing importance on certain elements predicts advanced recovery. We examined recovery and factors associated with recovery among 75 Hong Kong adults who were diagnosed with schizophrenia and assessed to be in clinical remission. Data were collected on socio-demographic factors, recovery stages and elements associated with recovery. Logistic regression analysis was used to identify variables that could best predict stages of recovery. Receiver operating characteristic curves were used to detect the classification accuracy of the model (i.e. rates of correct classification of stages of recovery). Logistic regression results indicated that stages of recovery could be distinguished with reasonable accuracy for Stage 3 ('living with disability', classification accuracy = 75.45%) and Stage 4 ('living beyond disability', classification accuracy = 75.50%). However, there was insufficient information to predict Combined Stages 1 and 2 ('overwhelmed by disability' and 'struggling with disability'). It was found that having a meaningful role and age were the most important differentiators of recovery stage. Preliminary findings suggest that adopting salient life roles personally is important to recovery and that this component should be incorporated into mental health services. © The Author(s) 2014.
Urban Change Detection of Pingtan City based on Bi-temporal Remote Sensing Images
NASA Astrophysics Data System (ADS)
Degang, JIANG; Jinyan, XU; Yikang, GAO
2017-02-01
In this paper, a pair of SPOT 5-6 images with a resolution of 0.5 m was selected. An object-oriented classification method was applied to the two images, and five classes of ground features were identified: man-made objects, farmland, forest, waterbody and unutilized land. An auxiliary ASTER GDEM was used to improve the classification accuracy. Change detection was then performed on the classification results, followed by an accuracy assessment, and satisfactory results were obtained. The results show that great changes in Pingtan city have been detected, namely the expansion of the city area and the increased intensity of man-made buildings, roads and other infrastructure following the establishment of the Pingtan comprehensive experimental zone. A wide range of open sea area along the island's coastal zones has been reclaimed for port and CBD construction.
Nijdam-Jones, Alicia; Rosenfeld, Barry
2017-11-01
The cross-cultural validity of feigning instruments and cut-scores is a critical concern for forensic mental health clinicians. This systematic review evaluated feigning classification accuracy and effect sizes across instruments and languages by summarizing 45 published peer-reviewed articles and unpublished doctoral dissertations conducted in Europe, Asia, and North America using linguistically, ethnically, and culturally diverse samples. The most common psychiatric symptom measures used with linguistically, ethnically, and culturally diverse samples included the Structured Inventory of Malingered Symptomatology, the Miller Forensic Assessment of Symptoms Test, and the Minnesota Multiphasic Personality Inventory (MMPI). The most frequently studied cognitive effort measures included the Word Recognition Test, the Test of Memory Malingering, and the Rey 15-item Memory test. The classification accuracy of these measures is compared and the implications of this research literature are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Pan, Jianjun
2018-01-01
This paper focuses on evaluating the ability and contribution of using backscatter intensity, texture, coherence, and color features extracted from Sentinel-1A data for urban land cover classification and comparing different multi-sensor land cover mapping methods to improve classification accuracy. Both Landsat-8 OLI and Hyperion images were also acquired, in combination with Sentinel-1A data, to explore the potential of different multi-sensor urban land cover mapping methods to improve classification accuracy. The classification was performed using a random forest (RF) method. The results showed that the optimal window size of the combination of all texture features was 9 × 9, and the optimal window size was different for each individual texture feature. For the four different feature types, the texture features contributed the most to the classification, followed by the coherence and backscatter intensity features; and the color features had the least impact on the urban land cover classification. Satisfactory classification results can be obtained using only the combination of texture and coherence features, with an overall accuracy up to 91.55% and a kappa coefficient up to 0.8935, respectively. Among all combinations of Sentinel-1A-derived features, the combination of the four features had the best classification result. Multi-sensor urban land cover mapping obtained higher classification accuracy. The combination of Sentinel-1A and Hyperion data achieved higher classification accuracy compared to the combination of Sentinel-1A and Landsat-8 OLI images, with an overall accuracy of up to 99.12% and a kappa coefficient up to 0.9889. When Sentinel-1A data was added to Hyperion images, the overall accuracy and kappa coefficient were increased by 4.01% and 0.0519, respectively. PMID:29382073
Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse
2013-05-14
We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness in diabetic patients to develop hypertension and vice versa. We ascertain the importance of ethnicity (and natives vs expatriate migrants) and of using regional data in risk assessment. Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross validation to obtain generalisation accuracies and errors. Kuwait Health Network (KHN) that integrates data from primary health centres and hospitals in Kuwait. 270 172 hospital visitors (of which, 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives, Asian and Arab expatriates. Incident type 2 diabetes, hypertension and comorbidity. Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign 'high' risk to 75% of diabetic patients and to 94% of hypertensive patients. Only 5% of diabetic patients are seen assigned 'low' risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in hypertensive population and those of hypertension are modelled. Two-stage aggregate classification models and risk assessment tools, built combining both the component models on diabetes (or on hypertension), perform better than individual models. Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case-control models to assess risks. These tools aid in the preliminary non-intrusive assessment of the population. 
Ethnicity is seen to be significant to the predictive models. Risk assessments need to be developed using regional data as we demonstrate the applicability of the American Diabetes Association online calculator on data from Kuwait.
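For readers unfamiliar with the evaluation protocol mentioned in this abstract, here is a minimal pure-Python sketch of fivefold cross-validation wrapped around a k-nearest-neighbours classifier. It is not the authors' pipeline: the two-feature toy data, the choice of k = 3, the Euclidean distance, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training points closest to x (squared Euclidean distance)."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def kfold_accuracy(xs, ys, folds=5, k=3, seed=0):
    """Mean held-out accuracy over `folds` disjoint test folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    accuracies = []
    for f in range(folds):
        test_idx = indices[f::folds]            # every folds-th shuffled index
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds


# Toy data: two well-separated clusters standing in for two outcome classes.
xs = [(i, i) for i in range(10)] + [(100 + i, 100 + i) for i in range(10)]
ys = [0] * 10 + [1] * 10
print(kfold_accuracy(xs, ys))
```

Each sample is used for testing exactly once and never appears in the training set of its own fold, which is what makes the averaged figure an estimate of generalisation accuracy.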
Object-oriented crop mapping and monitoring using multi-temporal polarimetric RADARSAT-2 data
NASA Astrophysics Data System (ADS)
Jiao, Xianfeng; Kovacs, John M.; Shang, Jiali; McNairn, Heather; Walters, Dan; Ma, Baoluo; Geng, Xiaoyuan
2014-10-01
The aim of this paper is to assess the accuracy of an object-oriented classification of polarimetric Synthetic Aperture Radar (PolSAR) data to map and monitor crops using 19 RADARSAT-2 fine beam polarimetric (FQ) images of an agricultural area in North-eastern Ontario, Canada. Polarimetric images and field data were acquired during the 2011 and 2012 growing seasons. The classification and field data collection focused on the main crop types grown in the region, which include: wheat, oat, soybean, canola and forage. The polarimetric parameters were extracted with PolSAR analysis using both the Cloude-Pottier and Freeman-Durden decompositions. The object-oriented classification, with a single date of PolSAR data, was able to classify all five crop types with an accuracy of 95% and Kappa of 0.93; a 6% improvement in comparison with linear-polarization only classification. However, the time of acquisition is crucial. The larger biomass crops of canola and soybean were most accurately mapped, whereas the identification of oat and wheat were more variable. The multi-temporal data using the Cloude-Pottier decomposition parameters provided the best classification accuracy compared to the linear polarizations and the Freeman-Durden decomposition parameters. In general, the object-oriented classifications were able to accurately map crop types by reducing the noise inherent in the SAR data. Furthermore, using the crop classification maps we were able to monitor crop growth stage based on a trend analysis of the radar response. Based on field data from canola crops, there was a strong relationship between the phenological growth stage based on the BBCH scale, and the HV backscatter and entropy.
An assessment of the effectiveness of a random forest classifier for land-cover classification
NASA Astrophysics Data System (ADS)
Rodriguez-Galiano, V. F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. P.
2012-01-01
Land cover monitoring using remotely sensed data requires robust classification methods which allow for the accurate mapping of complex land cover and land use categories. Random forest (RF) is a powerful machine learning classifier that is relatively unknown in land remote sensing and has not been evaluated thoroughly by the remote sensing community compared to more conventional pattern recognition techniques. Key advantages of RF include: its non-parametric nature; high classification accuracy; and capability to determine variable importance. However, the split rules for classification are unknown, therefore RF can be considered to be a black-box type classifier. RF provides an algorithm for estimating missing values; and flexibility to perform several types of data analysis, including regression, classification, survival analysis, and unsupervised learning. In this paper, the performance of the RF classifier for land cover classification of a complex area is explored. Evaluation was based on several criteria: mapping accuracy, sensitivity to data set size and noise. Landsat-5 Thematic Mapper data captured in European spring and summer were used with auxiliary variables derived from a digital terrain model to classify 14 different land categories in the south of Spain. Results show that the RF algorithm yields accurate land cover classifications, with 92% overall accuracy and a Kappa index of 0.92. RF is robust to training data reduction and noise because significant differences in kappa values were only observed for data reduction and noise addition values greater than 50 and 20%, respectively. Additionally, variables that RF identified as most important for classifying land cover coincided with expectations. A McNemar test indicates an overall better performance of the random forest model over a single decision tree at the 0.00001 significance level.
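The McNemar test mentioned at the end of this abstract compares two classifiers on the same test samples using only the cases where they disagree. The sketch below uses the chi-square form with continuity correction, which is one common variant; whether the authors used this exact form is not stated. The label lists are invented and the function name is hypothetical.

```python
import math


def mcnemar_test(truth, pred_a, pred_b):
    """Return (chi-square statistic with continuity correction, p-value, 1 d.f.)."""
    # b: A correct and B wrong; c: A wrong and B correct.
    b = sum(1 for t, a, p in zip(truth, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(truth, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree on correctness
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of chi-square with 1 d.f.: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical labels: classifier A is right on 30 samples that B misses, B on 10 that A misses.
truth = [1] * 40 + [1] * 60
pred_a = [1] * 30 + [0] * 10 + [1] * 60
pred_b = [0] * 30 + [1] * 10 + [1] * 60

print(mcnemar_test(truth, pred_a, pred_b))
```

Samples that both classifiers get right (or both get wrong) do not enter the statistic, which is why the test suits paired comparisons on a shared validation set.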
Hartling, Lisa; Bond, Kenneth; Santaguida, P Lina; Viswanathan, Meera; Dryden, Donna M
2011-08-01
To develop and test a study design classification tool. We contacted relevant organizations and individuals to identify tools used to classify study designs and ranked these using predefined criteria. The highest ranked tool was a design algorithm developed, but no longer advocated, by the Cochrane Non-Randomized Studies Methods Group; this was modified to include additional study designs and decision points. We developed a reference classification for 30 studies; 6 testers applied the tool to these studies. Interrater reliability (Fleiss' κ) and accuracy against the reference classification were assessed. The tool was further revised and retested. Initial reliability was fair among the testers (κ=0.26) and the reference standard raters (κ=0.33). Testing after revisions showed improved reliability (κ=0.45, moderate agreement) with improved, but still low, accuracy. The most common disagreements were whether the study design was experimental (5 of 15 studies), and whether there was a comparison of any kind (4 of 15 studies). Agreement was higher among testers who had completed graduate level training versus those who had not. The moderate reliability and low accuracy may be because of lack of clarity and comprehensiveness of the tool, inadequate reporting of the studies, and variability in tester characteristics. The results may not be generalizable to all published studies, as the test studies were selected because they had posed challenges for previous reviewers with respect to their design classification. Application of such a tool should be accompanied by training, pilot testing, and context-specific decision rules. Copyright © 2011 Elsevier Inc. All rights reserved.
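Fleiss' κ, the interrater statistic named in the abstract above, generalizes chance-corrected agreement to more than two raters. The sketch below shows the standard formula in plain Python; the function name and the toy rating tables are assumptions made for illustration and do not reproduce the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject is assumed to have the same number of
    raters (at least two).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Agreement within each subject: share of rater pairs that agree.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_subject) / n_subjects
    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in proportions)
    return (p_bar - p_e) / (1 - p_e)


# Toy example: 2 studies, 6 raters, 2 design categories.
print(fleiss_kappa([[6, 0], [0, 6]]))  # raters agree completely
print(fleiss_kappa([[3, 3], [3, 3]]))  # raters split evenly on every study
```

Values near zero or below indicate agreement no better than chance, which is how the "fair" and "moderate" labels in the abstract should be read.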
Empirical evaluation of data normalization methods for molecular classification
Huang, Huei-Chung
2018-01-01
Background Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correct for such artifacts is through post hoc data adjustment such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers—an increasingly important application of microarrays in the era of personalized medicine. Methods In this study, we set out to evaluate the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages at GitHub. Results In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated in an independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation was associated with biased estimation of classification accuracy in the over-optimistic direction for all three normalization methods. Conclusion Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy. PMID:29666754
Morris, Alan; Burgon, Nathan; McGann, Christopher; MacLeod, Robert; Cates, Joshua
2013-01-01
Radiofrequency ablation is a promising procedure for treating atrial fibrillation (AF) that relies on accurate lesion delivery in the left atrial (LA) wall for success. Late Gadolinium Enhancement MRI (LGE MRI) at three months post-ablation has proven effective for noninvasive assessment of the location and extent of scar formation, which are important factors for predicting patient outcome and planning of redo ablation procedures. We have developed an algorithm for automatic classification in LGE MRI of scar tissue in the LA wall and have evaluated accuracy and consistency compared to manual scar classifications by expert observers. Our approach clusters voxels based on normalized intensity and was chosen through a systematic comparison of the performance of multivariate clustering on many combinations of image texture. Algorithm performance was determined by overlap with ground truth, using multiple overlap measures, and the accuracy of the estimation of the total amount of scar in the LA. Ground truth was determined using the STAPLE algorithm, which produces a probabilistic estimate of the true scar classification from multiple expert manual segmentations. Evaluation of the ground truth data set was based on both inter- and intra-observer agreement, with variation among expert classifiers indicating the difficulty of scar classification for a given dataset. Our proposed automatic scar classification algorithm performs well for both scar localization and estimation of scar volume: for ground truth datasets considered easy, variability from the ground truth was low; for those considered difficult, variability from ground truth was on par with the variability across experts. PMID:24236224
NASA Astrophysics Data System (ADS)
Perry, Daniel; Morris, Alan; Burgon, Nathan; McGann, Christopher; MacLeod, Robert; Cates, Joshua
2012-03-01
Radiofrequency ablation is a promising procedure for treating atrial fibrillation (AF) that relies on accurate lesion delivery in the left atrial (LA) wall for success. Late Gadolinium Enhancement MRI (LGE MRI) at three months post-ablation has proven effective for noninvasive assessment of the location and extent of scar formation, which are important factors for predicting patient outcome and planning of redo ablation procedures. We have developed an algorithm for automatic classification in LGE MRI of scar tissue in the LA wall and have evaluated accuracy and consistency compared to manual scar classifications by expert observers. Our approach clusters voxels based on normalized intensity and was chosen through a systematic comparison of the performance of multivariate clustering on many combinations of image texture. Algorithm performance was determined by overlap with ground truth, using multiple overlap measures, and the accuracy of the estimation of the total amount of scar in the LA. Ground truth was determined using the STAPLE algorithm, which produces a probabilistic estimate of the true scar classification from multiple expert manual segmentations. Evaluation of the ground truth data set was based on both inter- and intra-observer agreement, with variation among expert classifiers indicating the difficulty of scar classification for a given dataset. Our proposed automatic scar classification algorithm performs well for both scar localization and estimation of scar volume: for ground truth datasets considered easy, variability from the ground truth was low; for those considered difficult, variability from ground truth was on par with the variability across experts.
NASA Astrophysics Data System (ADS)
Liu, F.; Chen, T.; He, J.; Wen, Q.; Yu, F.; Gu, X.; Wang, Z.
2018-04-01
In recent years, rapid improvements in SAR sensors have provided a useful complement to traditional optical remote sensing in terms of theory, technology and data. In this paper, Sentinel-1A SAR data and GF-1 optical data were selected for image fusion, with emphasis on dryland crop classification under a complex crop planting structure, taking corn and cotton as the research objects. To account for differences among data fusion methods, the principal component analysis (PCA), Gram-Schmidt (GS), Brovey and wavelet transform (WT) methods were compared, and the GS and Brovey methods proved more applicable in the study area. Classification was then conducted using an object-oriented workflow. For the GS and Brovey fusion images and the GF-1 optical image, the nearest neighbour algorithm was used for supervised classification with the same training samples. Accuracy was then assessed using sample plots in the study area. The overall accuracy and kappa coefficient of the fusion images were both higher than those of the GF-1 optical image, and the GS method performed better than the Brovey method. In particular, the overall accuracy of the GS fusion image was 79.8%, and the kappa coefficient was 0.644. Thus, the results showed that GS and Brovey fusion images were superior to optical images for dryland crop classification. This study suggests that the fusion of SAR and optical images is reliable for dryland crop classification under a complex crop planting structure.
Kos, Gregor; Sieger, Markus; McMullin, David; Zahradnik, Celine; Sulyok, Michael; Öner, Tuba; Mizaikoff, Boris; Krska, Rudolf
2016-10-01
The rapid identification of mycotoxins such as deoxynivalenol and aflatoxin B1 in agricultural commodities is an ongoing concern for food importers and processors. While sophisticated chromatography-based methods are well established for regulatory testing by food safety authorities, few techniques exist to provide a rapid assessment for traders. This study advances the development of a mid-infrared spectroscopic method, recording spectra with little sample preparation. Spectral data were classified using a bootstrap-aggregated (bagged) decision tree method, evaluating the protein and carbohydrate absorption regions of the spectrum. The method was able to classify 79% of 110 maize samples at the European Union regulatory limit for deoxynivalenol of 1750 µg kg⁻¹ and, for the first time, 77% of 92 peanut samples at 8 µg kg⁻¹ of aflatoxin B1. A subset model revealed a dependency on variety and type of fungal infection. The employed CRC and SBL maize varieties could be pooled in the model with a reduction of classification accuracy from 90% to 79%. Samples infected with Fusarium verticillioides were removed, leaving samples infected with F. graminearum and F. culmorum in the dataset improving classification accuracy from 73% to 79%. A 500 µg kg⁻¹ classification threshold for deoxynivalenol in maize performed even better with 85% accuracy. This is assumed to be due to a larger number of samples around the threshold increasing representativity. Comparison with established principal component analysis classification, which consistently showed overlapping clusters, confirmed the superior performance of bagged decision tree classification.
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis.
Myburgh, Hermanus C; van Zijl, Willemien H; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-03-01
Otitis media is one of the most common childhood diseases worldwide, but because of lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious and life-threatening complications. There is thus a need for an automated computer-based image-analyzing system that could assist in making accurate otitis media diagnoses anywhere. A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low cost custom-made video-otoscope. The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~64% to 80%) using traditional otoscopes, and therefore holds promise for the future in making automated diagnosis of otitis media in medically underserved populations.
Otitis Media Diagnosis for Developing Countries Using Tympanic Membrane Image-Analysis
Myburgh, Hermanus C.; van Zijl, Willemien H.; Swanepoel, DeWet; Hellström, Sten; Laurent, Claude
2016-01-01
Background Otitis media is one of the most common childhood diseases worldwide, but because of lack of doctors and health personnel in developing countries it is often misdiagnosed or not diagnosed at all. This may lead to serious and life-threatening complications. There is thus a need for an automated computer-based image-analyzing system that could assist in making accurate otitis media diagnoses anywhere. Methods A method for automated diagnosis of otitis media is proposed. The method uses image-processing techniques to classify otitis media. The system is trained using high quality pre-assessed images of tympanic membranes, captured by digital video-otoscopes, and classifies undiagnosed images into five otitis media categories based on predefined signs. Several verification tests analyzed the classification capability of the method. Findings An accuracy of 80.6% was achieved for images taken with commercial video-otoscopes, while an accuracy of 78.7% was achieved for images captured on-site with a low cost custom-made video-otoscope. Interpretation The high accuracy of the proposed otitis media classification system compares well with the classification accuracy of general practitioners and pediatricians (~64% to 80%) using traditional otoscopes, and therefore holds promise for the future in making automated diagnosis of otitis media in medically underserved populations. PMID:27077122
Sadikov, Aleksander; Groznik, Vida; Možina, Martin; Žabkar, Jure; Nyholm, Dag; Memedi, Mevludin; Bratko, Ivan; Georgiev, Dejan
2017-09-01
Parkinson's disease (PD) is currently incurable, however proper treatment can ease the symptoms and significantly improve the quality of life of patients. Since PD is a chronic disease, its efficient monitoring and management is very important. The objective of this paper was to investigate the feasibility of using the features and methodology of a spirography application, originally designed to detect early Parkinson's disease (PD) motoric symptoms, for automatically assessing motor symptoms of advanced PD patients experiencing motor fluctuations. More specifically, the aim was to objectively assess motor symptoms related to bradykinesias (slowness of movements occurring as a result of under-medication) and dyskinesias (involuntary movements occurring as a result of over-medication). This work combined spirography data and clinical assessments from a longitudinal clinical study in Sweden with the features and pre-processing methodology of a Slovenian spirography application. The study involved 65 advanced PD patients and over 30,000 spiral-drawing measurements over the course of three years. Machine learning methods were used to learn to predict the "cause" (bradykinesia or dyskinesia) of upper limb motor dysfunctions as assessed by a clinician who observed animated spirals in a web interface. The classification model was also tested for comprehensibility. For this purpose a visualisation technique was used to present visual clues to clinicians as to which parts of the spiral drawing (or its animation) are important for the given classification. Using the machine learning methods with feature descriptions and pre-processing from the Slovenian application resulted in 86% classification accuracy and over 0.90 AUC. The clinicians also rated the computer's visual explanations of its classifications as at least meaningful if not necessarily helpful in over 90% of the cases. 
The relatively high classification accuracy and AUC demonstrates the usefulness of this approach for objective monitoring of PD patients. The positive evaluation of computer's explanations suggests the potential use of this methodology in a decision support setting. Copyright © 2017 Elsevier B.V. All rights reserved.
The effect of finite field size on classification and atmospheric correction
NASA Technical Reports Server (NTRS)
Kaufman, Y. J.; Fraser, R. S.
1981-01-01
The atmospheric effect on the upward radiance of sunlight scattered from the Earth-atmosphere system is strongly influenced by the contrasts between fields and their sizes. For a given atmospheric turbidity, the atmospheric effect on classification of surface features is much stronger for nonuniform surfaces than for uniform surfaces. Therefore, the classification accuracy of agricultural fields and urban areas is dependent not only on the optical characteristics of the atmosphere, but also on the size of the surface do not account for the nonuniformity of the surface have only a slight effect on the classification accuracy; in other cases the classification accuracy decreases. The radiances above finite fields were computed to simulate radiances measured by a satellite. A simulation case including 11 agricultural fields and four natural fields (water, soil, savannah, and forest) was used to test the effect of the size of the background reflectance and the optical thickness of the atmosphere on classification accuracy. It is concluded that new atmospheric correction methods, which take into account the finite size of the fields, have to be developed to improve significantly the classification accuracy.
Correlation-based pattern recognition for implantable defibrillators.
Wilkins, J.
1996-01-01
An estimated 300,000 Americans die each year from cardiac arrhythmias. Historically, drug therapy or surgery were the only treatment options available for patients suffering from arrhythmias. Recently, implantable arrhythmia management devices have been developed. These devices allow abnormal cardiac rhythms to be sensed and corrected in vivo. Proper arrhythmia classification is critical to selecting the appropriate therapeutic intervention. The classification problem is made more challenging by the power/computation constraints imposed by the short battery life of implantable devices. Current devices utilize heart rate-based classification algorithms. Although easy to implement, rate-based approaches have unacceptably high error rates in distinguishing supraventricular tachycardia (SVT) from ventricular tachycardia (VT). Conventional morphology assessment techniques used in ECG analysis often require too much computation to be practical for implantable devices. In this paper, a computationally efficient arrhythmia classification architecture using correlation-based morphology assessment is presented. The architecture classifies individual heartbeats by assessing similarity between an incoming cardiac signal vector and a series of prestored class templates. A series of these beat classifications is used to make an overall rhythm assessment. The system makes use of several new results in the field of pattern recognition. The resulting system achieved excellent accuracy in discriminating SVT and VT. PMID:8947674
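The two-stage idea described above (label each beat by its similarity to stored templates, then combine beat labels into a rhythm decision) can be sketched as follows. This is a generic illustration only: the abstract does not give the actual similarity measure, template shapes, or voting rule, so the Pearson correlation, the five-sample templates, and the majority vote used here are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant signal vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_beat(beat, templates):
    """Return the label of the template most correlated with the beat."""
    return max(templates, key=lambda label: pearson(beat, templates[label]))


def classify_rhythm(beats, templates):
    """Majority vote over per-beat labels (an assumed combination rule)."""
    votes = {}
    for beat in beats:
        label = classify_beat(beat, templates)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Hypothetical five-sample morphology templates, not real ECG data.
templates = {"SVT": [0, 1, 4, 1, 0], "VT": [0, 2, 3, 3, 1]}
narrow_beat = [0, 1.1, 3.9, 0.9, 0.1]
wide_beat = [0, 2.1, 2.9, 3.1, 0.9]
print(classify_beat(narrow_beat, templates))
print(classify_rhythm([narrow_beat, narrow_beat, wide_beat], templates))
```

Correlation needs only sums of products over a short window, which is one reason a measure of this kind suits the computation limits the abstract mentions.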
Peatland classification of West Siberia based on Landsat imagery
NASA Astrophysics Data System (ADS)
Terentieva, I.; Glagolev, M.; Lapshina, E.; Maksyutov, S. S.
2014-12-01
Increasing interest in peatlands for prediction of environmental changes requires an understanding of their geographical distribution. The West Siberian Plain is the biggest peatland area in Eurasia and is situated in the high latitudes, which are experiencing an enhanced rate of climate change. West Siberian taiga mires are important globally, accounting for about 12.5% of the global wetland area. A number of peatland maps of West Siberia were developed in the 1970s, but their accuracy is limited. Here we report an effort to map West Siberian peatlands using 30 m resolution Landsat imagery. As a first step, a peatland classification scheme oriented on environmental parameter upscaling was developed. The overall workflow involves data pre-processing, training data collection, image classification on a scene-by-scene basis, regrouping of the derived classes into final peatland types and accuracy assessment. To avoid misclassification, peatlands were distinguished from other landscapes using a threshold method: for each scene, the Green-Red Vegetation Index was used for peatland masking and the 5th channel was used for masking water bodies. Peatland image masks were made in Quantum GIS, filtered in MATLAB and then classified in Multispec (Purdue Research Foundation) using the maximum likelihood algorithm of the supervised classification method. Training sample selection was mostly based on spectral signatures due to limited ancillary and high-resolution image data. As an additional source of information, we applied our field knowledge resulting from more than 10 years of fieldwork in West Siberia, summarized in an extensive dataset of botanical relevés, field photos, pH and electrical conductivity data from 40 test sites. After the classification procedure, discriminated spectral classes were generalized into 12 peatland types. Overall accuracy assessment was based on 439 randomly assigned test sites and showed that the final map accuracy was 80%. Total peatland area was estimated at 73.0 Mha.
Various ridge-hollow and ridge-hollow-pool bog complexes prevail here occupying 34.5 Mha. They are followed by lakes (11.1 Mha), fens (10.7 Mha), pine-dwarf-shrub sphagnum bogs (9.3 Mha) and palsa complexes (7.4 Mha).
NASA Astrophysics Data System (ADS)
Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.
2016-03-01
The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system.
Given that there are few surgeons and facilities specializing in burn care, this technology may improve the standard of burn care for patients without access to specialized facilities.
Austin, Peter C; Lee, Douglas S
2011-01-01
Purpose: Classification trees are increasingly being used to classify patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
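The reweight-and-vote procedure described in the abstract above can be illustrated with a small AdaBoost-style sketch. The abstract does not name the specific boosting algorithm or tree depth used, so the choices here (one-split "stump" trees, the AdaBoost weight update, a one-dimensional toy dataset, and all function names) are assumptions made purely for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """A one-split tree: predicts `polarity` when x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Grow `rounds` stumps on successively reweighted data (labels are +1/-1)."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = fit_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # this stump's vote weight
        ensemble.append((alpha, threshold, polarity))
        # Misclassified subjects gain weight; correctly classified ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted majority vote over all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
for rounds in (1, 3):
    model = adaboost(xs, ys, rounds)
    errors = sum(predict(model, x) != y for x, y in zip(xs, ys))
    print(f"{rounds} round(s): {errors} training errors")
```

On this toy set the combined vote fits the training data better than any single stump, but training fit says nothing about held-out accuracy, which is the quantity the study evaluates.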
Caminiti, Silvia Paola; Ballarini, Tommaso; Sala, Arianna; Cerami, Chiara; Presotto, Luca; Santangelo, Roberto; Fallanca, Federico; Vanoli, Emilia Giovanna; Gianolli, Luigi; Iannaccone, Sandro; Magnani, Giuseppe; Perani, Daniela
2018-01-01
In this multicentre study in clinical settings, we assessed the accuracy of optimized procedures for FDG-PET brain metabolism and CSF classifications in predicting or excluding the conversion to Alzheimer's disease (AD) dementia and non-AD dementias. We included 80 MCI subjects with neurological and neuropsychological assessments, FDG-PET scan and CSF measures at entry, all with clinical follow-up. FDG-PET data were analysed with a validated voxel-based SPM method. Resulting single-subject SPM maps were classified by five imaging experts according to the disease-specific patterns, as "typical-AD", "atypical-AD" (i.e. posterior cortical atrophy, asymmetric logopenic AD variant, frontal-AD variant), "non-AD" (i.e. behavioural variant FTD, corticobasal degeneration, semantic variant FTD; dementia with Lewy bodies) or "negative" patterns. To perform the statistical analyses, the individual patterns were grouped either as "AD dementia vs. non-AD dementia (all diseases)" or as "FTD vs. non-FTD (all diseases)". Aβ42, total and phosphorylated Tau CSF-levels were classified dichotomously, and using the Erlangen Score algorithm. Multivariate logistic models tested the prognostic accuracy of FDG-PET-SPM and CSF dichotomous classifications. Accuracy of Erlangen score and Erlangen Score aided by FDG-PET SPM classification was evaluated. The multivariate logistic model identified FDG-PET "AD" SPM classification (Expβ = 19.35, 95% C.I. 4.8-77.8, p < 0.001) and CSF Aβ42 (Expβ = 6.5, 95% C.I. 1.64-25.43, p < 0.05) as the best predictors of conversion from MCI to AD dementia. The "FTD" SPM pattern significantly predicted conversion to FTD dementias at follow-up (Expβ = 14, 95% C.I. 3.1-63, p < 0.001). Overall, FDG-PET-SPM classification was the most accurate biomarker, able to correctly differentiate either the MCI subjects who converted to AD or FTD dementias, and those who remained stable or reverted to normal cognition (Expβ = 17.9, 95% C.I. 4.55-70.46, p < 0.001). 
Our results support the relevant role of FDG-PET-SPM classification in predicting progression to different dementia conditions in prodromal MCI phase, and in the exclusion of progression, outperforming CSF biomarkers.
Di-codon Usage for Gene Classification
NASA Astrophysics Data System (ADS)
Nguyen, Minh N.; Ma, Jianmin; Fogel, Gary B.; Rajapakse, Jagath C.
Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.
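Codon and di-codon usage features of the kind described above can be computed by counting in-frame triplets and adjacent triplet pairs. The sketch below is one plausible reading, not the authors' feature definition: it assumes the sequence starts in frame, treats di-codons as overlapping pairs of consecutive codons, and uses relative frequencies; the function names and example sequence are hypothetical.

```python
from collections import Counter


def split_codons(sequence):
    """Split a coding sequence into in-frame codons, dropping any trailing bases."""
    sequence = sequence.upper()
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def relative_frequencies(items):
    """Map each distinct item to its share of the total count."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()}


def codon_usage(sequence):
    """Relative frequency of each codon observed in the sequence."""
    return relative_frequencies(split_codons(sequence))


def dicodon_usage(sequence):
    """Relative frequency of each pair of adjacent codons (assumed definition)."""
    codons = split_codons(sequence)
    pairs = [first + second for first, second in zip(codons, codons[1:])]
    return relative_frequencies(pairs)


example = "ATGGCCATGGCC"  # made-up sequence, not an HLA gene
print(codon_usage(example))
print(dicodon_usage(example))
```

The resulting dictionaries could be expanded into fixed-length vectors (64 codon and 4,096 di-codon entries) before being passed to a classifier such as an SVM.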
NASA Technical Reports Server (NTRS)
Cibula, William G.; Nyquist, Maurice O.
1987-01-01
An unsupervised computer classification of vegetation/landcover of Olympic National Park and surrounding environs was initially carried out using four bands of Landsat MSS data. The primary objective of the project was to derive a level of landcover classifications useful for park management applications while maintaining an acceptably high level of classification accuracy. Initially, nine generalized vegetation/landcover classes were derived. Overall classification accuracy was 91.7 percent. In an attempt to refine the level of classification, a geographic information system (GIS) approach was employed. Topographic data and watershed boundaries (inferred precipitation/temperature) data were registered with the Landsat MSS data. The resultant boolean operations yielded 21 vegetation/landcover classes while maintaining the same level of classification accuracy. The final classification provided much better identification and location of the major forest types within the park at the same high level of accuracy, and these met the project objective. This classification could now become inputs into a GIS system to help provide answers to park management coupled with other ancillary data programs such as fire management.
Rajasekaran, S; Bhushan, Manindra; Aiyer, Siddharth; Kanna, Rishi; Shetty, Ajoy Prasad
2018-01-09
To develop a classification based on the technical complexity encountered during pedicle screw insertion and to evaluate the performance of the AIRO® CT navigation system based on this classification, in the clinical scenario of complex spinal deformity. 31 complex spinal deformity correction surgeries were prospectively analyzed for performance of the AIRO® mobile CT-based navigation system. Pedicles were classified according to complexity of insertion into five types. Analysis was performed to estimate the accuracy of screw placement and time for screw insertion. Breach greater than 2 mm was considered for analysis. 452 pedicle screws were inserted (T1-T6: 116; T7-T12: 171; L1-S1: 165). The average Cobb angle was 68.3° (range 60°-104°). We had 242 grade 2 pedicles, 133 grade 3, and 77 grade 4, and 44 pedicles were unfit for pedicle screw insertion. We noted 27 pedicle screw breaches (medial: 10; lateral: 16; anterior: 1). Among the lateral breaches (n = 16), ten screws were planned for in-out-in pedicle screw insertion. Average screw insertion time was 1.76 ± 0.89 min. After accounting for planned breach, the effective breach rate was 3.8%, resulting in 96.2% accuracy for pedicle screw placement. This classification helps compare the accuracy of screw insertion in a range of conditions by considering the complexity of screw insertion. Considering the clinical scenario of complex pedicle anatomy in spinal deformity, AIRO® navigation showed an excellent accuracy rate of 96.2%.
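The reported accuracy can be checked against the counts in the abstract. The worked arithmetic below assumes that the effective breach rate is the number of unplanned breaches divided by all 452 inserted screws; the abstract does not state the denominator, so this is an assumption.

```latex
\[
\text{effective breach rate} = \frac{27 - 10}{452} = \frac{17}{452} \approx 3.8\%,
\qquad
\text{accuracy} = 1 - \frac{17}{452} \approx 96.2\%.
\]
```

Under that assumption the figures agree with the 3.8% and 96.2% quoted in the abstract.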
Accuracy assessment in the Large Area Crop Inventory Experiment
NASA Technical Reports Server (NTRS)
Houston, A. G.; Pitts, D. E.; Feiveson, A. H.; Badhwar, G.; Ferguson, M.; Hsu, E.; Potter, J.; Chhikara, R.; Rader, M.; Ahlers, C.
1979-01-01
The Accuracy Assessment System (AAS) of the Large Area Crop Inventory Experiment (LACIE) was responsible for determining the accuracy and reliability of LACIE estimates of wheat production, area, and yield, made at regular intervals throughout the crop season, and for investigating the various LACIE error sources, quantifying these errors, and relating them to their causes. Some results of using the AAS during the three years of LACIE are reviewed. As the program culminated, AAS was able not only to meet the goal of obtaining accurate statistical estimates of sampling and classification accuracy, but also the goal of evaluating component labeling errors. Furthermore, the ground-truth data processing matured from collecting data for one crop (small grains) to collecting, quality-checking, and archiving data for all crops in a LACIE small segment.
Modeling time-to-event (survival) data using classification tree analysis.
Linden, Ariel; Yarnold, Paul R
2017-12-01
Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability. Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves. The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high, or low, incidence of the outcome over time. Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework. © 2017 John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Lim, Jonghyuck; Kim, Namkug
2013-05-15
Purpose: To investigate the effect of using different computed tomography (CT) scanners on the accuracy of high-resolution CT (HRCT) images in classifying regional disease patterns in patients with diffuse lung disease, support vector machine (SVM) and Bayesian classifiers were applied to multicenter data. Methods: Two experienced radiologists marked sets of 600 rectangular 20 × 20 pixel regions of interest (ROIs) on HRCT images obtained from two scanners (GE and Siemens), including 100 ROIs for each of local patterns of lungs-normal lung and five of regional pulmonary disease patterns (ground-glass opacity, reticular opacity, honeycombing, emphysema, and consolidation). Each ROI was assessed using 22 quantitative features belonging to one of the following descriptors: histogram, gradient, run-length, gray level co-occurrence matrix, low-attenuation area cluster, and top-hat transform. For automatic classification, a Bayesian classifier and a SVM classifier were compared under three different conditions. First, classification accuracies were estimated using data from each scanner. Next, data from the GE and Siemens scanners were used for training and testing, respectively, and vice versa. Finally, all ROI data were integrated regardless of the scanner type and were then trained and tested together. All experiments were performed based on forward feature selection and fivefold cross-validation with 20 repetitions. Results: For each scanner, better classification accuracies were achieved with the SVM classifier than the Bayesian classifier (92% and 82%, respectively, for the GE scanner; and 92% and 86%, respectively, for the Siemens scanner). The classification accuracies were 82%/72% for training with GE data and testing with Siemens data, and 79%/72% for the reverse.
The use of training and test data obtained from the HRCT images of different scanners lowered the classification accuracy compared to the use of HRCT images from the same scanner. For integrated ROI data obtained from both scanners, the classification accuracies with the SVM and Bayesian classifiers were 92% and 77%, respectively. The selected features resulting from the classification process differed by scanner, with more features included for the classification of the integrated HRCT data than for the classification of the HRCT data from each scanner. For the integrated data, consisting of HRCT images of both scanners, the classification accuracy based on the SVM was statistically similar to the accuracy of the data obtained from each scanner. However, the classification accuracy of the integrated data using the Bayesian classifier was significantly lower than the classification accuracy of the ROI data of each scanner. Conclusions: The use of an integrated dataset along with a SVM classifier rather than a Bayesian classifier has benefits in terms of the classification accuracy of HRCT images acquired with more than one scanner. This finding is of relevance in studies involving large number of images, as is the case in a multicenter trial with different scanners.
Corn and soybean Landsat MSS classification performance as a function of scene characteristics
NASA Technical Reports Server (NTRS)
Batista, G. T.; Hixson, M. M.; Bauer, M. E.
1982-01-01
In order to fully utilize remote sensing to inventory crop production, it is important to identify the factors that affect the accuracy of Landsat classifications. The objective of this study was to investigate the effect of scene characteristics involving crop, soil, and weather variables on the accuracy of Landsat classifications of corn and soybeans. Segments sampling the U.S. Corn Belt were classified using a Gaussian maximum likelihood classifier on multitemporally registered data from two key acquisition periods. Field size had a strong effect on classification accuracy with small fields tending to have low accuracies even when the effect of mixed pixels was eliminated. Other scene characteristics accounting for variability in classification accuracy included proportions of corn and soybeans, crop diversity index, proportion of all field crops, soil drainage, slope, soil order, long-term average soybean yield, maximum yield, relative position of the segment in the Corn Belt, weather, and crop development stage.
A Classification of Remote Sensing Image Based on Improved Compound Kernels of Svm
NASA Astrophysics Data System (ADS)
Zhao, Jianing; Gao, Wanlin; Liu, Zili; Mou, Guifen; Lu, Lin; Yu, Lina
Remote sensing (RS) classification based on the support vector machine (SVM), which was developed from statistical learning theory, achieves high accuracy with a small number of training samples, which makes SVM methods satisfactory for RS classification. The traditional RS classification method combines visual interpretation with computer classification. The SVM method, however, greatly improves the accuracy of RS classification while saving much of the labor and time needed to interpret images and collect training samples. Kernel functions play an important part in the SVM algorithm. The proposed method uses an improved compound kernel function and therefore achieves higher classification accuracy on RS images. Moreover, the compound kernel improves the generalization and learning ability of the kernel.
NASA Astrophysics Data System (ADS)
Lewis, Donna L.; Phinn, Stuart
2011-01-01
Aerial photography interpretation is the most common mapping technique in the world. However, unlike an algorithm-based classification of satellite imagery, accuracy of aerial photography interpretation generated maps is rarely assessed. Vegetation communities covering an area of 530 km2 on Bullo River Station, Northern Territory, Australia, were mapped using an interpretation of 1:50,000 color aerial photography. Manual stereoscopic line-work was delineated at 1:10,000 and thematic maps generated at 1:25,000 and 1:100,000. Multivariate and intuitive analysis techniques were employed to identify 22 vegetation communities within the study area. The accuracy assessment was based on 50% of a field dataset collected over a 4 year period (2006 to 2009) and the remaining 50% of sites were used for map attribution. The overall accuracy and Kappa coefficient for both thematic maps was 66.67% and 0.63, respectively, calculated from standard error matrices. Our findings highlight the need for appropriate scales of mapping and accuracy assessment of aerial photography interpretation generated vegetation community maps.
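The overall accuracy and Kappa coefficient reported above are both derived from an error (confusion) matrix. The following is a minimal sketch of that arithmetic, assuming rows hold map classes and columns hold reference classes; the function name and the 2 × 2 example counts are hypothetical and are not data from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square error matrix.

    matrix[i][j] is the number of samples mapped as class i whose
    reference (field) class is j. This layout is an assumption.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)

    # Observed agreement: proportion of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / n

    # Chance agreement: sum over classes of (row total * column total) / n^2.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example (illustrative counts only).
example_matrix = [[35, 5],
                  [10, 50]]
oa, kappa = overall_accuracy_and_kappa(example_matrix)
print(f"overall accuracy = {oa:.4f}, kappa = {kappa:.4f}")
```

Kappa is lower than overall accuracy whenever some agreement would be expected by chance, which is why the two figures are usually reported together.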
Wickham, J.D.; Stehman, S.V.; Smith, J.H.; Wade, T.G.; Yang, L.
2004-01-01
Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, within-cluster correlation may reduce the precision of the accuracy estimates. The detailed population information to quantify a priori the effect of within-cluster correlation on precision is typically unavailable. Consequently, a convenient, practical approach to evaluate the likely performance of a two-stage cluster sample is needed. We describe such an a priori evaluation protocol focusing on the spatial distribution of the sample by land-cover class across different cluster sizes and costs of different sampling options, including options not imposing clustering. This protocol also assesses the two-stage design's adequacy for estimating the precision of accuracy estimates for rare land-cover classes. We illustrate the approach using two large-area, regional accuracy assessments from the National Land-Cover Data (NLCD), and describe how the a priori evaluation was used as a decision-making tool when implementing the NLCD design.
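The loss of precision from within-cluster correlation can be illustrated with the standard textbook design-effect approximation for equal-size clusters, deff = 1 + (m - 1)ρ. This is a sketch under that assumption only; it is not the evaluation protocol described in the abstract, and the function names and input values are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Approximate design effect for equal-size clusters.

    cluster_size: sample elements per cluster (m).
    icc: intracluster correlation of classification error (rho).
    """
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Size of a simple random sample that would give similar precision."""
    return n_total / design_effect(cluster_size, icc)


def accuracy_standard_error(p, n_total, cluster_size, icc):
    """Approximate standard error of an estimated accuracy proportion p."""
    n_eff = effective_sample_size(n_total, cluster_size, icc)
    return math.sqrt(p * (1.0 - p) / n_eff)


# Hypothetical design: 280 reference pixels, 10 per cluster, rho = 0.2.
deff = design_effect(10, 0.2)
n_eff = effective_sample_size(280, 10, 0.2)
se = accuracy_standard_error(0.9, 280, 10, 0.2)
print(f"deff = {deff:.2f}, effective n = {n_eff:.1f}, SE = {se:.4f}")
```

With ρ = 0 the design effect is 1 and clustering costs nothing in precision; the penalty grows with both cluster size and correlation, which is the trade-off against the lower cost of collecting clustered reference data.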
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet; Kabiri, Keivan
2012-07-01
This paper describes an assessment of coral reef mapping using multi-sensor satellite images such as Landsat ETM, SPOT and IKONOS images for Tioman Island, Malaysia. The study area is known to be one of the best islands in South East Asia for its unique collection of diversified coral reefs and plays host to thousands of tourists every year. For the coral reef identification, classification and analysis, Landsat ETM, SPOT and IKONOS images were collected, processed and classified using hierarchical classification schemes. At first, a decision tree classification method was implemented to separate three main land cover classes, i.e. water, rural and vegetation, and then a maximum likelihood supervised classification method was used to classify these main classes. The accuracy of the classification result is evaluated with a separate test sample set, which is selected based on the fieldwork survey and view interpretation from the IKONOS image. The types of ancillary data used are: (a) DGPS ground control points; (b) water quality parameters measured by Hydrolab DS4a; (c) sea-bed substrate spectra measured by Unispec; and (d) landcover observation photos along the Tioman Island coastal area. The overall accuracy of the final classification result was 92.25% with a kappa coefficient of 0.8940. Key words: Coral reef, Multi-spectral Segmentation, Pixel-Based Classification, Decision Tree, Tioman Island
NASA Technical Reports Server (NTRS)
Hoffbeck, Joseph P.; Landgrebe, David A.
1994-01-01
Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
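The invariance claim above can be checked with a short derivation. This is a sketch, assuming Gaussian class-conditional densities whose mean, covariance and prior for class i are estimated from the training samples; the symbols A, b and g_i are notation introduced here for illustration and are not taken from the paper.

```latex
\begin{align*}
g_i(x) &= \ln P_i - \tfrac{1}{2}\ln\lvert\Sigma_i\rvert
          - \tfrac{1}{2}(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)
          && \text{(discriminant for class } i\text{)} \\
y &= Ax + b,\ \det A \neq 0
   \;\Longrightarrow\;
   \mu_i' = A\mu_i + b,\quad \Sigma_i' = A\Sigma_i A^{\top} \\
(y-\mu_i')^{\top}\Sigma_i'^{-1}(y-\mu_i')
  &= (x-\mu_i)^{\top}A^{\top}\bigl(A\Sigma_i A^{\top}\bigr)^{-1}A\,(x-\mu_i)
   = (x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i) \\
\ln\lvert\Sigma_i'\rvert &= \ln\lvert\Sigma_i\rvert + 2\ln\lvert\det A\rvert \\
g_i'(y) &= g_i(x) - \ln\lvert\det A\rvert
\end{align*}
```

The shift by ln|det A| is the same for every class, so the class that maximizes the discriminant, and therefore the classification accuracy, is unchanged by any non-singular affine transformation.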
Ortega, Alonso; Labrenz, Stephan; Markowitsch, Hans J; Piefke, Martina
2013-01-01
In the last decade, different statistical techniques have been introduced to improve assessment of malingering-related poor effort. In this context, we have recently shown preliminary evidence that a Bayesian latent group model may help to optimize classification accuracy using a simulation research design. In the present study, we conducted two analyses. Firstly, we evaluated how accurately this Bayesian approach can distinguish between participants answering in an honest way (honest response group) and participants feigning cognitive impairment (experimental malingering group). Secondly, we tested the accuracy of our model in the differentiation between patients who had real cognitive deficits (cognitively impaired group) and participants who belonged to the experimental malingering group. All Bayesian analyses were conducted using the raw scores of a visual recognition forced-choice task (2AFC), the Test of Memory Malingering (TOMM, Trial 2), and the Word Memory Test (WMT, primary effort subtests). The first analysis showed 100% accuracy for the Bayesian model in distinguishing participants of both groups with all effort measures. The second analysis showed outstanding overall accuracy of the Bayesian model when estimates were obtained from the 2AFC and the TOMM raw scores. Diagnostic accuracy of the Bayesian model diminished when using the WMT total raw scores. Despite this, overall diagnostic accuracy can still be considered excellent. The most plausible explanation for this decrement is the low performance in verbal recognition and fluency tasks of some patients of the cognitively impaired group. Additionally, the Bayesian model provides individual estimates, p(zi |D), of examinees' effort levels. In conclusion, both high classification accuracy levels and Bayesian individual estimates of effort may be very useful for clinicians when assessing for effort in medico-legal settings.
NASA Astrophysics Data System (ADS)
Gutierrez-Velez, V. H.; DeFries, R. S.
2011-12-01
Oil palm expansion has led to clearing of extensive forest areas in the tropics. However, quantitative assessments of the contribution of oil palm expansion to deforestation have been challenging, due in large part to the limitations presented by conventional optical data sets for discriminating plantations from forests and other tree cover vegetation. Recently available information from active remote sensors has opened the possibility of using these data sources to overcome these limitations. The purpose of this analysis is to evaluate the accuracy of oil palm classification when using ALOS/PALSAR active satellite data in conjunction with Landsat information, compared to the use of Landsat data only. The analysis takes place in a focused region around the city of Pucallpa in the Ucayali province of the Peruvian Amazon for the year 2010. Oil palm plantations were separated into five categories consisting of four age classes (0-3, 3-5, 5-10 and > 10 yrs) and an additional class accounting for degraded plantations older than 15 yr. Other land covers were water bodies, unvegetated land, short and tall grass, fallow, secondary vegetation, and forest. Classifications were performed using random forests. Training points for calibration and validation consisted of 411 polygons measured in areas representative of the land covers of interest and totaled 6,367 ha. Overall classification accuracy increased from 89.9% using only Landsat data sets to 94.3% using both Landsat and ALOS/PALSAR. Both user's and producer's accuracy increased in all classes when using both data sets except for producer's accuracy in short grass, which decreased by 1%. The largest increase in user's accuracy was obtained in oil palm plantations older than 10 years, from 62 to 80%, while producer's accuracy improved the most in plantations in age class 3-5, from 63 to 80%.
Results demonstrate the suitability of data from ALOS/PALSAR and other active remote sensors to improve classification of oil palm plantations in age classes and discriminate them from other land covers. Results suggest a potential for improving discrimination of other tree cover types using a combination of active and conventional optical remote sensors.
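User's and producer's accuracy, as compared above, are per-class ratios taken from the rows and columns of an error matrix. The following is a minimal sketch, assuming rows are map classes and columns are reference classes; the function name and the example counts are hypothetical and are not results from the study.

```python
def users_and_producers_accuracy(matrix):
    """Return (user's, producer's) accuracy lists, one value per class.

    matrix[i][j] is the number of samples mapped as class i whose
    reference class is j (an assumed layout).
    User's accuracy     = diagonal / row total    (1 - commission error).
    Producer's accuracy = diagonal / column total (1 - omission error).
    """
    k = len(matrix)
    users, producers = [], []
    for c in range(k):
        row_total = sum(matrix[c])
        col_total = sum(matrix[r][c] for r in range(k))
        users.append(matrix[c][c] / row_total if row_total else float("nan"))
        producers.append(matrix[c][c] / col_total if col_total else float("nan"))
    return users, producers


# Hypothetical two-class example (illustrative counts only).
example_matrix = [[35, 5],
                  [10, 50]]
users, producers = users_and_producers_accuracy(example_matrix)
for c, (u, p) in enumerate(zip(users, producers)):
    print(f"class {c}: user's = {u:.3f}, producer's = {p:.3f}")
```

Reporting both values per class shows whether an improvement comes from fewer false assignments to a class (user's accuracy) or from fewer missed members of that class (producer's accuracy).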
NASA Astrophysics Data System (ADS)
Adjorlolo, Clement; Mutanga, Onisimo; Cho, Moses A.; Ismail, Riyad
2013-04-01
In this paper, a user-defined inter-band correlation filter function was used to resample hyperspectral data and thereby mitigate the problem of multicollinearity in classification analysis. The proposed resampling technique convolves the spectral dependence information between a chosen band-centre and its shorter and longer wavelength neighbours. Weighting threshold of inter-band correlation (WTC, Pearson's r) was calculated, whereby r = 1 at the band-centre. Various WTC (r = 0.99, r = 0.95 and r = 0.90) were assessed, and bands with coefficients beyond a chosen threshold were assigned r = 0. The resultant data were used in the random forest analysis to classify in situ C3 and C4 grass canopy reflectance. The respective WTC datasets yielded improved classification accuracies (kappa = 0.82, 0.79 and 0.76) with less correlated wavebands when compared to resampled Hyperion bands (kappa = 0.76). Overall, the results obtained from this study suggested that resampling of hyperspectral data should account for the spectral dependence information to improve overall classification accuracy as well as reducing the problem of multicollinearity.
Clemans, Katherine H; Musci, Rashelle J; Leoutsakos, Jeannie-Marie S; Ialongo, Nicholas S
2014-04-01
This study compared the ability of teacher, parent, and peer reports of aggressive behavior in early childhood to accurately classify cases of maladaptive outcomes in late adolescence and early adulthood. Weighted kappa analyses determined optimal cut points and relative classification accuracy among teacher, parent, and peer reports of aggression assessed for 691 students (54% male; 84% African American and 13% White) in the fall of first grade. Outcomes included antisocial personality, substance use, incarceration history, risky sexual behavior, and failure to graduate from high school on time. Peer reports were the most accurate classifier of all outcomes in the full sample. For most outcomes, the addition of teacher or parent reports did not improve overall classification accuracy once peer reports were accounted for. Additional gender-specific and adjusted kappa analyses supported the superior classification utility of the peer report measure. The results suggest that peer reports provided the most useful classification information of the 3 aggression measures. Implications for targeted intervention efforts in which screening measures are used to identify at-risk children are discussed.
A fuzzy hill-climbing algorithm for the development of a compact associative classifier
NASA Astrophysics Data System (ADS)
Mitra, Soumyaroop; Lam, Sarah S.
2012-02-01
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, not much effort has been devoted to building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule-base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability, when compared with other rule-based systems.
Derivation of an artificial gene to improve classification accuracy upon gene selection.
Seo, Minseok; Oh, Sejong
2012-02-01
Classification analysis has been developed continuously since 1936. This research field has advanced as a result of development of classifiers such as KNN, ANN, and SVM, as well as through data preprocessing areas. Feature (gene) selection is required for very high dimensional data such as microarray before classification work. The goal of feature selection is to choose a subset of informative features that reduces processing time and provides higher classification accuracy. In this study, we devised a method of artificial gene making (AGM) for microarray data to improve classification accuracy. Our artificial gene was derived from a whole microarray dataset, and combined with a result of gene selection for classification analysis. We experimentally confirmed a clear improvement of classification accuracy after inserting artificial gene. Our artificial gene worked well for popular feature (gene) selection algorithms and classifiers. The proposed approach can be applied to any type of high dimensional dataset. Copyright © 2011 Elsevier Ltd. All rights reserved.
SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.
SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases. PMID:25295306
Automated Clinical Assessment from Smart home-based Behavior Data
Dawadi, Prafulla Nath; Cook, Diane Joyce; Schmitter-Edgecombe, Maureen
2016-01-01
Smart home technologies offer potential benefits for assisting clinicians by automating health monitoring and well-being assessment. In this paper, we examine the actual benefits of smart home-based analysis by monitoring daily behaviour in the home and predicting standard clinical assessment scores of the residents. To accomplish this goal, we propose a Clinical Assessment using Activity Behavior (CAAB) approach to model a smart home resident’s daily behavior and predict the corresponding standard clinical assessment scores. CAAB uses statistical features that describe characteristics of a resident’s daily activity performance to train machine learning algorithms that predict the clinical assessment scores. We evaluate the performance of CAAB utilizing smart home sensor data collected from 18 smart homes over two years using prediction and classification-based experiments. In the prediction-based experiments, we obtain a statistically significant correlation (r = 0.72) between CAAB-predicted and clinician-provided cognitive assessment scores and a statistically significant correlation (r = 0.45) between CAAB-predicted and clinician-provided mobility scores. Similarly, for the classification-based experiments, we find CAAB has a classification accuracy of 72% while classifying cognitive assessment scores and 76% while classifying mobility scores. These prediction and classification results suggest that it is feasible to predict standard clinical scores using smart home sensor data and learning-based data analysis. PMID:26292348
NASA Astrophysics Data System (ADS)
Liu, Yansong; Monteiro, Sildomar T.; Saber, Eli
2015-10-01
Changes in vegetation cover, building construction, road network and traffic conditions caused by urban expansion affect the human habitat as well as the natural environment in rapidly developing cities. It is crucial to assess these changes and respond accordingly by identifying man-made and natural structures with accurate classification algorithms. With the increase in use of multi-sensor remote sensing systems, researchers are able to obtain a more complete description of the scene of interest. By utilizing multi-sensor data, the accuracy of classification algorithms can be improved. In this paper, we propose a method for combining 3D LiDAR point clouds and high-resolution color images to classify urban areas using Gaussian processes (GP). GP classification is a powerful non-parametric classification method that yields probabilistic classification results. It makes predictions in a way that addresses the uncertainty of the real world. In this paper, we attempt to identify man-made and natural objects in urban areas including buildings, roads, trees, grass, water and vehicles. LiDAR features are derived from the 3D point clouds and the spatial and color features are extracted from RGB images. For classification, we use the Laplacian approximation for GP binary classification on the new combined feature space. The multiclass classification has been implemented by using a one-vs-all binary classification strategy. The results of applying support vector machine (SVM) and logistic regression (LR) classifiers are also provided for comparison. Our experiments show a clear improvement of classification results when using the two sensors combined instead of each sensor separately. We also found that the GP approach has the advantage of handling the uncertainty in the classification result without compromising accuracy compared to SVM, which is considered a state-of-the-art classification method.
NASA Astrophysics Data System (ADS)
Iabchoon, Sanwit; Wongsai, Sangdao; Chankon, Kanoksuk
2017-10-01
Land use and land cover (LULC) data are important to monitor and assess environmental change. LULC classification using satellite images is a method widely used on a global and local scale. In particular, urban areas that have various LULC types are important components of the urban landscape and ecosystem. This study aims to classify urban LULC using WorldView-3 (WV-3) very high-spatial resolution satellite imagery and the object-based image analysis method. A decision rule set was applied to classify the WV-3 images in Kathu subdistrict, Phuket province, Thailand. The main steps were as follows: (1) the image was ortho-rectified with ground control points and using the digital elevation model, (2) multiscale image segmentation was applied to divide the image pixel level into image object level, (3) the decision rule set for LULC classification was developed using spectral bands, spectral indices, spatial and contextual information, and (4) accuracy assessment was computed using testing data, which were sampled by statistical random sampling. The results show that seven LULC classes (water, vegetation, open space, road, residential, building, and bare soil) were successfully classified with overall classification accuracy of 94.14% and a kappa coefficient of 92.91%.
Knauer, Uwe; Matros, Andrea; Petrovic, Tijana; Zanker, Timothy; Scott, Eileen S; Seiffert, Udo
2017-01-01
Hyperspectral imaging is an emerging means of assessing plant vitality, stress parameters, nutrition status, and diseases. Extraction of target values from the high-dimensional datasets either relies on pixel-wise processing of the full spectral information, appropriate selection of individual bands, or calculation of spectral indices. Limitations of such approaches are reduced classification accuracy, reduced robustness due to spatial variation of the spectral information across the surface of the objects measured as well as a loss of information intrinsic to band selection and use of spectral indices. In this paper we present an improved spatial-spectral segmentation approach for the analysis of hyperspectral imaging data and its application for the prediction of powdery mildew infection levels (disease severity) of intact Chardonnay grape bunches shortly before veraison. Instead of calculating texture features (spatial features) for the huge number of spectral bands independently, dimensionality reduction by means of Linear Discriminant Analysis (LDA) was applied first to derive a few descriptive image bands. Subsequent classification was based on modified Random Forest classifiers and selective extraction of texture parameters from the integral image representation of the image bands generated. Dimensionality reduction, integral images, and the selective feature extraction led to improved classification accuracies of up to [Formula: see text] for detached berries used as a reference sample (training dataset). Our approach was validated by predicting infection levels for a sample of 30 intact bunches. Classification accuracy improved with the number of decision trees of the Random Forest classifier. These results corresponded with qPCR results. An accuracy of 0.87 was achieved in classification of healthy, infected, and severely diseased bunches. 
However, discrimination between visually healthy and infected bunches proved to be challenging for a few samples, perhaps due to colonized berries or sparse mycelia hidden within the bunch or airborne conidia on the berries that were detected by qPCR. An advanced approach to hyperspectral image classification based on combined spatial and spectral image features, potentially applicable to many available hyperspectral sensor technologies, has been developed and validated to improve the detection of powdery mildew infection levels of Chardonnay grape bunches. The spatial-spectral approach improved especially the detection of light infection levels compared with pixel-wise spectral data analysis. This approach is expected to improve the speed and accuracy of disease detection once the thresholds for fungal biomass detected by hyperspectral imaging are established; it can also facilitate monitoring in plant phenotyping of grapevine and additional crops.
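The abstract above mentions extracting texture parameters from an integral image representation of the LDA-derived bands. As a rough illustration of why integral images help, the sketch below builds a summed-area table and reads the sum over any rectangular window in constant time. It is a generic textbook construction with hypothetical names and data, not the authors' implementation.

```python
def integral_image(img):
    """Build a summed-area table with one extra leading row and column of zeros.

    sat[y][x] holds the sum of img[0:y][0:x].
    """
    height, width = len(img), len(img[0])
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of the window rows top..bottom-1, columns left..right-1, in O(1)."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


# Hypothetical 3x3 image band.
band = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sat = integral_image(band)
window_mean = box_sum(sat, 1, 1, 3, 3) / 4  # mean of the lower-right 2x2 window
```

Window means (and, with a second table of squared values, window variances) can then be computed at many positions and scales without re-summing pixels.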
Zhou, Xi-Yin; Lei, Kun; Meng, Wei
2017-09-01
Coastal zones are regions of high population and economic intensity all over the world, and coastal habitat supports the sustainable development of human society. Accurate assessment of coastal habitat degradation is an essential prerequisite for coastal zone protection. In this study, an integrated framework for coastal habitat degradation assessment, comprising land use classification, habitat classification and zoning, an evaluation criterion for coastal habitat degradation, and a coastal habitat degradation index, has been established for better regional coastal habitat assessment. A detailed three-class land use classification reveals fine landscape change. The evaluation criterion, which relies on internal comparison based on the results of habitat classification and zoning, can indicate the levels of habitat degradation and distinguish the intensity of human disturbance in different habitat subareas under the same habitat classification. Finally, the results of the coastal habitat degradation assessment are obtained through the coastal habitat degradation index (CHI). A case study of the framework is carried out in the Circum-Bohai-Sea-Coast, China, and the main results show the following: (1) The accuracy of every land use class is above 90%, which indicates a satisfactory accuracy for the classification map. (2) The Circum-Bohai-Sea-Coast is divided into 3 kinds of habitats and 5 subareas. (3) The levels of coastal habitat degradation differ significantly among the five subareas of the Circum-Bohai-Sea-Coast. According to the area weighting of each habitat subarea, the Circum-Bohai-Sea-Coast as a whole is generally in a worse state. This assessment framework of coastal habitat degradation would characterize the land use change trend, enable better coastal habitat degradation assessment, reveal the habitat conservation tendency, and distinguish the intensity of human disturbances. 
Furthermore, it would support accurate coastal zone protection measures for specific coastal areas. Copyright © 2017 Elsevier B.V. All rights reserved.
Classification of large-scale fundus image data sets: a cloud-computing framework.
Roychowdhury, Sohini
2016-08-01
Large medical image data sets with high dimensionality require a substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from the STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
ERIC Educational Resources Information Center
Petersen, Douglas B.; Chanthongthip, Helen; Ukrainetz, Teresa A.; Spencer, Trina D.; Steeve, Roger W.
2017-01-01
Purpose: This study investigated the classification accuracy of a concentrated English narrative dynamic assessment (DA) for identifying language impairment (LI). Method: Forty-two Spanish-English bilingual kindergarten to third-grade children (10 LI and 32 with no LI) were administered two 25-min DA test-teach-test sessions. Pre- and posttest…
Shi, Rong; Schraedley-Desmond, Pamela; Napel, Sandy; Olcott, Eric W; Jeffrey, R Brooke; Yee, Judy; Zalis, Michael E; Margolis, Daniel; Paik, David S; Sherbondy, Anthony J; Sundaram, Padmavathi; Beaulieu, Christopher F
2006-06-01
To retrospectively determine if three-dimensional (3D) viewing improves radiologists' accuracy in classifying true-positive (TP) and false-positive (FP) polyp candidates identified with computer-aided detection (CAD) and to determine candidate polyp features that are associated with classification accuracy, with known polyps serving as the reference standard. Institutional review board approval and informed consent were obtained; this study was HIPAA compliant. Forty-seven computed tomographic (CT) colonography data sets were obtained in 26 men and 10 women (age range, 42-76 years). Four radiologists classified 705 polyp candidates (53 TP candidates, 652 FP candidates) identified with CAD; initially, only two-dimensional images were used, but these were later supplemented with 3D rendering. Another radiologist unblinded to colonoscopy findings characterized the features of each candidate, assessed colon distention and preparation, and defined the true nature of FP candidates. Receiver operating characteristic curves were used to compare readers' performance, and repeated-measures analysis of variance was used to test features that affect interpretation. Use of 3D viewing improved classification accuracy for three readers and increased the area under the receiver operating characteristic curve to 0.96-0.97 (P<.001). For TP candidates, maximum polyp width (P=.038), polyp height (P=.019), and preparation (P=.004) significantly affected accuracy. For FP candidates, colonic segment (P=.007), attenuation (P<.001), surface smoothness (P<.001), distention (P=.034), preparation (P<.001), and true nature of candidate lesions (P<.001) significantly affected accuracy. Use of 3D viewing increases reader accuracy in the classification of polyp candidates identified with CAD. Polyp size and examination quality are significantly associated with accuracy. Copyright (c) RSNA, 2006.
Real-time, resource-constrained object classification on a micro-air vehicle
NASA Astrophysics Data System (ADS)
Buck, Louis; Ray, Laura
2013-12-01
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of χ² feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128 × 192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. χ² mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors offer accuracy improvements only on colorful datasets.
2016-01-01
Moderate Resolution Imaging Spectroradiometer (MODIS) data forms the basis for numerous land use and land cover (LULC) mapping and analysis frameworks at regional scale. Compared to other satellite sensors, the spatial, temporal and spectral specifications of MODIS are considered as highly suitable for LULC classifications which support many different aspects of social, environmental and developmental research. The LULC mapping of this study was carried out in the context of the development of an evaluation approach for Zimbabwe’s land reform program. Within the discourse about the success of this program, a lack of spatially explicit methods to produce objective data, such as on the extent of agricultural area, is apparent. We therefore assessed the suitability of moderate spatial and high temporal resolution imagery and phenological parameters to retrieve regional figures about the extent of cropland area in former freehold tenure in a series of 13 years from 2001–2013. Time-series data was processed with TIMESAT and was stratified according to agro-ecological potential zoning of Zimbabwe. Random Forest (RF) classifications were used to produce annual binary crop/non crop maps which were evaluated with high spatial resolution data from other satellite sensors. We assessed the cropland products in former freehold tenure in terms of classification accuracy, inter-annual comparability and heterogeneity. Although general LULC patterns were depicted in classification results and an overall accuracy of over 80% was achieved, user accuracies for rainfed agriculture were limited to below 65%. We conclude that phenological analysis has to be treated with caution when rainfed agriculture and grassland in semi-humid tropical regions have to be separated based on MODIS spectral data and phenological parameters. 
Because classification results significantly underestimate redistributed commercial farmland in Zimbabwe, we argue that the method cannot be used to produce spatial information on land use that could be linked to tenure change. Hence, the capabilities of moderate resolution data for assessing Zimbabwe’s land reform are limited. To make use of the unquestionable potential of MODIS time-series analysis, we propose an analysis of plant productivity that allows annual growth and production of vegetation to be linked to ownership after Zimbabwe’s land reform. PMID:27253327
A Nonparametric Approach to Estimate Classification Accuracy and Consistency
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2014-01-01
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment.
Mezgec, Simon; Koroušić Seljak, Barbara
2017-06-27
Automatic food image recognition systems are alleviating the process of food-intake estimation and dietary assessment. However, due to the nature of food images, their recognition is a particularly challenging task, which is why traditional approaches in the field have achieved a low classification accuracy. Deep neural networks have outperformed such solutions, and we present a novel approach to the problem of food and drink image detection and recognition that uses a newly-defined deep convolutional neural network architecture, called NutriNet. This architecture was tuned on a recognition dataset containing 225,953 512 × 512 pixel images of 520 different food and drink items from a broad spectrum of food groups, on which we achieved a classification accuracy of 86.72%, along with an accuracy of 94.47% on a detection dataset containing 130,517 images. We also performed a real-world test on a dataset of self-acquired images, combined with images from Parkinson's disease patients, all taken using a smartphone camera, achieving a top-five accuracy of 55%, which is an encouraging result for real-world images. Additionally, we tested NutriNet on the University of Milano-Bicocca 2016 (UNIMIB2016) food image dataset, on which we improved upon the provided baseline recognition result. An online training component was implemented to continually fine-tune the food and drink recognition model on new images. The model is being used in practice as part of a mobile app for the dietary assessment of Parkinson's disease patients.
Berger, Rachel P; Parks, Sharyn; Fromkin, Janet; Rubin, Pamela; Pecora, Peter J
2015-04-01
To assess the accuracy of an International Classification of Diseases (ICD) code-based operational case definition for abusive head trauma (AHT). Subjects were children <5 years of age evaluated for AHT by a hospital-based Child Protection Team (CPT) at a tertiary care paediatric hospital with a completely electronic medical record (EMR) system. Subjects were designated as non-AHT traumatic brain injury (TBI) or AHT based on whether the CPT determined that the injuries were due to AHT. The sensitivity and specificity of the ICD-based definition were calculated. There were 223 children evaluated for AHT: 117 AHT and 106 non-AHT TBI. The sensitivity and specificity of the ICD-based operational case definition were 92% (95% CI 85.8 to 96.2) and 96% (95% CI 92.3 to 99.7), respectively. All errors in sensitivity and three of the four specificity errors were due to coder error; one specificity error was a physician error. In a paediatric tertiary care hospital with an EMR system, the accuracy of an ICD-based case definition for AHT was high. Additional studies are needed to assess the accuracy of this definition in all types of hospitals in which children with AHT are cared for. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
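Sensitivity and specificity, as reported above, follow directly from the counts of correctly and incorrectly coded cases. The sketch below shows the arithmetic. The 4 specificity errors among 106 non-AHT cases come from the abstract; the split of the 117 AHT cases into 108 detected and 9 missed is an assumption chosen only to be consistent with the reported 92%. The Wilson score interval is likewise an assumed choice of method and need not reproduce the published confidence intervals.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Non-AHT TBI cases: 106 in total, 4 coded incorrectly (from the abstract).
true_negatives, false_positives = 102, 4
# AHT cases: 117 in total; the 108/9 split is a hypothetical illustration.
true_positives, false_negatives = 108, 9

sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)
sens_ci = wilson_interval(true_positives, true_positives + false_negatives)
spec_ci = wilson_interval(true_negatives, true_negatives + false_positives)

print(f"sensitivity = {sensitivity:.3f}, CI = ({sens_ci[0]:.3f}, {sens_ci[1]:.3f})")
print(f"specificity = {specificity:.3f}, CI = ({spec_ci[0]:.3f}, {spec_ci[1]:.3f})")
```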
TMS combined with EEG in genetic generalized epilepsy: A phase II diagnostic accuracy study.
Kimiskidis, Vasilios K; Tsimpiris, Alkiviadis; Ryvlin, Philippe; Kalviainen, Reetta; Koutroumanidis, Michalis; Valentin, Antonio; Laskaris, Nikolaos; Kugiumtzis, Dimitris
2017-02-01
(A) To develop a TMS-EEG stimulation and data analysis protocol in genetic generalized epilepsy (GGE). (B) To investigate the diagnostic accuracy of TMS-EEG in GGE. Pilot experiments resulted in the development and optimization of a paired-pulse TMS-EEG protocol at rest, during hyperventilation (HV), and post-HV combined with multi-level data analysis. This protocol was applied in 11 controls (C) and 25 GGE patients (P), further dichotomized into responders to antiepileptic drugs (R, n=13) and non-responders (nR, n=12). Features (n=57) extracted from TMS-EEG responses after multi-level analysis were given to a feature selection scheme and a Bayesian classifier, and the accuracy of assigning participants into the classes P-C and R-nR was computed. On the basis of the optimal feature subset, the cross-validated accuracy of TMS-EEG for the classification P-C was 0.86 at rest, 0.81 during HV and 0.92 at post-HV, whereas for R-nR the corresponding figures were 0.80, 0.78 and 0.65, respectively. Applying a fusion approach on all conditions resulted in an accuracy of 0.84 for the classification P-C and 0.76 for the classification R-nR. TMS-EEG can be used for diagnostic purposes and for assessing the response to antiepileptic drugs. TMS-EEG holds significant diagnostic potential in GGE. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
Comparative analysis of Worldview-2 and Landsat 8 for coastal saltmarsh mapping accuracy assessment
NASA Astrophysics Data System (ADS)
Rasel, Sikdar M. M.; Chang, Hsing-Chung; Diti, Israt Jahan; Ralph, Tim; Saintilan, Neil
2016-05-01
Coastal saltmarshes and their constituent components and processes are of scientific interest because of their ecological functions and services. However, the heterogeneity and seasonal dynamics of coastal wetland systems make it challenging to map saltmarshes with remotely sensed data. This study selected four important saltmarsh species, Phragmites australis, Sporobolus virginicus, Ficinia nodosa and Schoenoplectus sp., as well as a mangrove and a pine tree species, Avicennia and Casuarina sp., respectively. High spatial resolution Worldview-2 data and coarse spatial resolution Landsat 8 imagery were used. Among the selected vegetation types, some patches were fragmented and close to the spatial resolution of Worldview-2 data, while others were larger than the 30 meter resolution of Landsat 8 data. This study aims to test the effectiveness of different classifiers for imagery with various spatial and spectral resolutions. Three classification algorithms, Maximum Likelihood Classifier (MLC), Support Vector Machine (SVM) and Artificial Neural Network (ANN), were tested and their mapping accuracies compared for both satellite images. For Worldview-2 data, SVM gave the highest overall accuracy (92.12%, kappa = 0.90), followed by ANN (90.82%, kappa = 0.89) and MLC (90.55%, kappa = 0.88). For Landsat 8 data, MLC (82.04%) showed the highest classification accuracy compared to SVM (77.31%) and ANN (75.23%). The producer's accuracies of the classification results are also presented in the paper.
Delineation of marsh types of the Texas coast from Corpus Christi Bay to the Sabine River in 2010
Enwright, Nicholas M.; Hartley, Stephen B.; Brasher, Michael G.; Visser, Jenneke M.; Mitchell, Michael K.; Ballard, Bart M.; Parr, Mark W.; Couvillion, Brady R.; Wilson, Barry C.
2014-01-01
Coastal zone managers and researchers often require detailed information regarding emergent marsh vegetation types for modeling habitat capacities and needs of marsh-reliant wildlife (such as waterfowl and alligator). Detailed information on the extent and distribution of marsh vegetation zones throughout the Texas coast has been historically unavailable. In response, the U.S. Geological Survey, in cooperation and collaboration with the U.S. Fish and Wildlife Service via the Gulf Coast Joint Venture, Texas A&M University-Kingsville, the University of Louisiana-Lafayette, and Ducks Unlimited, Inc., has produced a classification of marsh vegetation types along the middle and upper Texas coast from Corpus Christi Bay to the Sabine River. This study incorporates approximately 1,000 ground reference locations collected via helicopter surveys in coastal marsh areas and about 2,000 supplemental locations from fresh marsh, water, and “other” (that is, nonmarsh) areas. About two-thirds of these data were used for training, and about one-third were used for assessing accuracy. Decision-tree analyses using Rulequest See5 were used to classify emergent marsh vegetation types by using these data, multitemporal satellite-based multispectral imagery from 2009 to 2011, a bare-earth digital elevation model (DEM) based on airborne light detection and ranging (lidar), alternative contemporary land cover classifications, and other spatially explicit variables believed to be important for delineating the extent and distribution of marsh vegetation communities. Image objects were generated from segmentation of high-resolution airborne imagery acquired in 2010 and were used to refine the classification. The classification is dated 2010 because the year is both the midpoint of the multitemporal satellite-based imagery (2009–11) classified and the date of the high-resolution airborne imagery that was used to develop image objects. 
Overall accuracy corrected for bias (accuracy estimate incorporates true marginal proportions) was 91 percent (95 percent confidence interval [CI]: 89.2–92.8), with a kappa statistic of 0.79 (95 percent CI: 0.77–0.81). The classification performed best for saline marsh (user’s accuracy 81.5 percent; producer’s accuracy corrected for bias 62.9 percent) but showed a lesser ability to discriminate intermediate marsh (user’s accuracy 47.7 percent; producer’s accuracy corrected for bias 49.5 percent). Because of confusion in intermediate and brackish marsh classes, an alternative classification containing only three marsh types was created in which intermediate and brackish marshes were combined into a single class. Image objects were reattributed by using this alternative three-marsh-type classification. Overall accuracy, corrected for bias, of this more general classification was 92.4 percent (95 percent CI: 90.7–94.2), and the kappa statistic was 0.83 (95 percent CI: 0.81–0.85). Mean user’s accuracy for marshes within the four-marsh-type and three-marsh-type classifications was 65.4 percent and 75.6 percent, respectively, whereas mean producer’s accuracy was 56.7 percent and 65.1 percent, respectively. This study provides a more objective and repeatable method for classifying marsh types of the middle and upper Texas coast at an extent and greater level of detail than previously available for the study area. The seamless classification produced through this work is now available to help State agencies (such as the Texas Parks and Wildlife Department) and landscape-scale conservation partnerships (such as the Gulf Coast Prairie Landscape Conservation Cooperative and the Gulf Coast Joint Venture) to develop and (or) refine conservation plans targeting priority natural resources. Moreover, these data may improve projections of landscape change and serve as a baseline for monitoring future changes resulting from chronic and episodic stressors.
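User's and producer's accuracy, which the report above quotes per marsh class, are the two per-class readings of a confusion matrix. The sketch below shows the uncorrected versions only; it does not apply the bias correction based on true marginal proportions that the report describes. The function name, matrix layout and counts are hypothetical.

```python
def users_and_producers_accuracy(matrix):
    """Per-class user's and producer's accuracy from a confusion matrix.

    matrix[i][j] is the number of samples mapped as class i whose reference
    class is j (rows = map, columns = reference; an assumption of this sketch).
    """
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]        # samples per mapped class
    col_totals = [sum(col) for col in zip(*matrix)]  # samples per reference class
    # User's accuracy: of everything mapped as class i, how much is correct.
    users = [matrix[i][i] / row_totals[i] for i in range(n)]
    # Producer's accuracy: of all reference samples of class i, how much was found.
    producers = [matrix[i][i] / col_totals[i] for i in range(n)]
    return users, producers


# Hypothetical two-class counts (not taken from the report).
counts = [
    [40, 10],
    [5, 45],
]
users, producers = users_and_producers_accuracy(counts)
```

A class can have high user's accuracy and low producer's accuracy at the same time, which is why both are usually reported.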
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction from remote sensing data is important for policy and decision makers, as the extracted information provides base layers for many real-world applications. Classification of remotely sensed data is one of the most common methods of extracting information; however, it remains challenging because several factors affect the accuracy of the classification. The resolution of the imagery, the number and homogeneity of land cover classes, the purity of training data and the characteristics of the adopted classifiers are just some of these factors. Object-based image classification has some advantages over pixel-based classification for high resolution images since it uses geometry and structure information in addition to spectral information. Vegetation indices are also commonly used in the classification process since they provide additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object-based Support Vector Machines were implemented for the classification of crop types for a study area located in the Aegean region of Turkey. Results demonstrated that incorporating NDRE increased the overall classification accuracy from 79.96% to 86.80%, whereas NDVI decreased it from 79.96% to 78.90%. Moreover, it is shown that object-based classification with RapidEye data gives promising results for crop type mapping and analysis.
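Both indices named above are normalized band differences. A minimal sketch of the standard definitions follows, NDVI = (NIR - Red) / (NIR + Red) and NDRE = (NIR - RedEdge) / (NIR + RedEdge); the function names and reflectance values are hypothetical, and the zero-denominator handling is a choice made for this sketch.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), returning 0.0 when both bands are zero."""
    total = band_a + band_b
    return (band_a - band_b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized Difference Red Edge Index."""
    return normalized_difference(nir, red_edge)


# Hypothetical surface reflectances for one vegetated pixel.
pixel = {"nir": 0.5, "red_edge": 0.3, "red": 0.1}
pixel_ndvi = ndvi(pixel["nir"], pixel["red"])
pixel_ndre = ndre(pixel["nir"], pixel["red_edge"])
```

In an object-based workflow, values like these would typically be averaged over each image object and appended to its feature vector before classification.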
NASA Astrophysics Data System (ADS)
Gordon, Marshall N.; Cha, Kenny H.; Hadjiiski, Lubomir M.; Chan, Heang-Ping; Cohan, Richard H.; Caoili, Elaine M.; Paramagul, Chintana; Alva, Ajjai; Weizer, Alon Z.
2018-02-01
We are developing a decision support system for assisting clinicians in assessment of response to neoadjuvant chemotherapy for bladder cancer. Accurate treatment response assessment is crucial for identifying responders and improving quality of life for non-responders. An objective machine learning decision support system may help reduce variability and inaccuracy in treatment response assessment. We developed a predictive model to assess the likelihood that a patient will respond based on image and clinical features. With IRB approval, we retrospectively collected a data set of pre- and post- treatment CT scans along with clinical information from surgical pathology from 98 patients. A linear discriminant analysis (LDA) classifier was used to predict the likelihood that a patient would respond to treatment based on radiomic features extracted from CT urography (CTU), a radiologist's semantic feature, and a clinical feature extracted from surgical and pathology reports. The classification accuracy was evaluated using the area under the ROC curve (AUC) with a leave-one-case-out cross validation. The classification accuracy was compared for the systems based on radiomic features, clinical feature, and radiologist's semantic feature. For the system based on only radiomic features the AUC was 0.75. With the addition of clinical information from examination under anesthesia (EUA) the AUC was improved to 0.78. Our study demonstrated the potential of designing a decision support system to assist in treatment response assessment. The combination of clinical features, radiologist semantic features and CTU radiomic features improved the performance of the classifier and the accuracy of treatment response assessment.
Bolin, Jocelyn Holden; Finch, W Holmes
2014-01-01
Statistical classification of phenomena into observed groups is very common in the social and behavioral sciences. Statistical classification methods, however, are affected by the characteristics of the data under study. Statistical classification can be further complicated by initial misclassification of the observed groups. The purpose of this study is to investigate the impact of initial training data misclassification on several statistical classification and data mining techniques. Misclassification conditions in the three group case will be simulated and results will be presented in terms of overall as well as subgroup classification accuracy. Results show decreased classification accuracy as sample size, group separation and group size ratio decrease and as misclassification percentage increases with random forests demonstrating the highest accuracy across conditions.
Koutsouleris, Nikolaos; Meisenzahl, Eva M.; Davatzikos, Christos; Bottlender, Ronald; Frodl, Thomas; Scheuerecker, Johanna; Schmitt, Gisela; Zetzsche, Thomas; Decker, Petra; Reiser, Maximilian; Möller, Hans-Jürgen; Gaser, Christian
2014-01-01
Context: Identification of individuals at high risk of developing psychosis has relied on prodromal symptomatology. Recently, machine learning algorithms have been successfully used for magnetic resonance imaging–based diagnostic classification of neuropsychiatric patient populations. Objective: To determine whether multivariate neuroanatomical pattern classification facilitates identification of individuals in different at-risk mental states (ARMS) of psychosis and enables the prediction of disease transition at the individual level. Design: Multivariate neuroanatomical pattern classification was performed on the structural magnetic resonance imaging data of individuals in early or late ARMS vs healthy controls (HCs). The predictive power of the method was then evaluated by categorizing the baseline imaging data of individuals with transition to psychosis vs those without transition vs HCs after 4 years of clinical follow-up. Classification generalizability was estimated by cross-validation and by categorizing an independent cohort of 45 new HCs. Setting: Departments of Psychiatry and Psychotherapy, Ludwig-Maximilians-University, Munich, Germany. Participants: The first classification analysis included 20 early and 25 late at-risk individuals and 25 matched HCs. The second analysis consisted of 15 individuals with transition, 18 without transition, and 17 matched HCs. Main Outcome Measures: Specificity, sensitivity, and accuracy of classification. Results: The 3-group, cross-validated classification accuracies of the first analysis were 86% (HCs vs the rest), 91% (early at-risk individuals vs the rest), and 86% (late at-risk individuals vs the rest). The accuracies in the second analysis were 90% (HCs vs the rest), 88% (individuals with transition vs the rest), and 86% (individuals without transition vs the rest). Independent HCs were correctly classified in 96% (first analysis) and 93% (second analysis) of cases. 
Conclusions: Different ARMSs and their clinical outcomes may be reliably identified on an individual basis by assessing patterns of whole-brain neuroanatomical abnormalities. These patterns may serve as valuable biomarkers for the clinician to guide early detection in the prodromal phase of psychosis. PMID:19581561
Targeting an efficient target-to-target interval for P300 speller brain–computer interfaces
Sellers, Eric W.; Wang, Xingyu
2013-01-01
Longer target-to-target intervals (TTI) produce greater P300 event-related potential amplitude, which can increase brain–computer interface (BCI) classification accuracy and decrease the number of flashes needed for accurate character classification. However, longer TTIs require more time for each trial, which decreases the information transfer rate of the BCI. In this paper, a P300 BCI using a 7 × 12 matrix explored new flash patterns (16-, 18- and 21-flash patterns) with different TTIs to assess the effects of TTI on P300 BCI performance. The new flash patterns were designed to minimize TTI, decrease repetition blindness, and examine the temporal relationship between each flash of a given stimulus by placing a minimum of one (16-flash pattern), two (18-flash pattern), or three (21-flash pattern) non-target flashes between target flashes. Online results showed that the 16-flash pattern yielded the lowest classification accuracy among the three patterns. The results also showed that the 18-flash pattern provides a significantly higher information transfer rate (ITR) than the 21-flash pattern; both patterns provide high ITR and high accuracy for all subjects. PMID:22350331
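The trade-off described above, between accuracy and time per selection, can be made concrete with an information transfer rate calculation. The abstract does not state which ITR formula was used; the sketch below assumes the widely used Wolpaw definition, B = log2(N) + P log2(P) + (1 - P) log2((1 - P) / (N - 1)) bits per selection, with N = 84 choices for a 7 × 12 matrix. The accuracies and selection times are hypothetical.

```python
import math


def bits_per_selection(n_choices, accuracy):
    """Wolpaw bits per selection for n_choices targets at the given accuracy."""
    if n_choices < 2:
        raise ValueError("n_choices must be at least 2")
    bits = math.log2(n_choices)
    if accuracy > 0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1:
        bits += (1 - accuracy) * math.log2((1 - accuracy) / (n_choices - 1))
    return bits


def itr_bits_per_minute(n_choices, accuracy, seconds_per_selection):
    """Information transfer rate in bits per minute."""
    return bits_per_selection(n_choices, accuracy) * 60.0 / seconds_per_selection


# Hypothetical comparison: a faster but less accurate pattern versus a
# slower but more accurate one, both on a 7 x 12 (84-item) matrix.
fast = itr_bits_per_minute(84, accuracy=0.80, seconds_per_selection=20.0)
slow = itr_bits_per_minute(84, accuracy=0.95, seconds_per_selection=30.0)
print(f"fast pattern: {fast:.2f} bits/min, slow pattern: {slow:.2f} bits/min")
```

Under this definition, chance-level accuracy (1/N) carries zero bits, so shortening the TTI only pays off while accuracy stays well above chance.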
Mathieu, Renaud; Aryal, Jagannath; Chong, Albert K
2007-11-20
Effective assessment of biodiversity in cities requires detailed vegetation maps. To date, most remote sensing of urban vegetation has focused on thematically coarse land cover products. Detailed habitat maps are created by manual interpretation of aerial photographs, but this is time consuming and costly at large scale. To address this issue, we tested the effectiveness of object-based classifications that use automated image segmentation to extract meaningful ground features from imagery. We applied these techniques to very high resolution multispectral Ikonos images to produce vegetation community maps in Dunedin City, New Zealand. An Ikonos image was orthorectified and a multi-scale segmentation algorithm used to produce a hierarchical network of image objects. The upper level included four coarse strata: industrial/commercial (commercial buildings), residential (houses and backyard private gardens), vegetation (vegetation patches larger than 0.8/1 ha), and water. We focused on the vegetation stratum that was segmented at more detailed level to extract and classify fifteen classes of vegetation communities. The first classification yielded a moderate overall classification accuracy (64%, κ = 0.52), which led us to consider a simplified classification with ten vegetation classes. The overall classification accuracy from the simplified classification was 77% with a κ value close to the excellent range (κ = 0.74). These results compared favourably with similar studies in other environments. We conclude that this approach does not provide maps as detailed as those produced by manually interpreting aerial photographs, but it can still extract ecologically significant classes. It is an efficient way to generate accurate and detailed maps in significantly shorter time. The final map accuracy could be improved by integrating segmentation, automated and manual classification in the mapping process, especially when considering important vegetation classes with limited spectral contrast.
Automated reliability assessment for spectroscopic redshift measurements
NASA Astrophysics Data System (ADS)
Jamal, S.; Le Brun, V.; Le Fèvre, O.; Vibert, D.; Schmitt, A.; Surace, C.; Copin, Y.; Garilli, B.; Moresco, M.; Pozzetti, L.
2018-03-01
Context. Future large-scale surveys, such as the ESA Euclid mission, will produce a large set of galaxy redshifts (≥10^6) that will require fully automated data-processing pipelines to analyze the data, extract crucial information and ensure that all requirements are met. A fundamental element in these pipelines is to associate to each galaxy redshift measurement a quality, or reliability, estimate. Aim. In this work, we introduce a new approach to automate the spectroscopic redshift reliability assessment based on machine learning (ML) and characteristics of the redshift probability density function. Methods: We propose to rephrase the spectroscopic redshift estimation into a Bayesian framework, in order to incorporate all sources of information and uncertainties related to the redshift estimation process and produce a redshift posterior probability density function (PDF). To automate the assessment of a reliability flag, we exploit key features in the redshift posterior PDF and machine learning algorithms. Results: As a working example, public data from the VIMOS VLT Deep Survey is exploited to present and test this new methodology. We first tried to reproduce the existing reliability flags using supervised classification in order to describe different types of redshift PDFs, but due to the subjective definition of these flags (classification accuracy 58%), we soon opted for a new homogeneous partitioning of the data into distinct clusters via unsupervised classification. After assessing the accuracy of the new clusters via resubstitution and test predictions (classification accuracy 98%), we projected unlabeled data from preliminary mock simulations for the Euclid space mission into this mapping to predict their redshift reliability labels.
Conclusions: Through the development of a methodology in which a system can build its own experience to assess the quality of a parameter, we are able to set a preliminary basis of an automated reliability assessment for spectroscopic redshift measurements. This newly-defined method is very promising for next-generation large spectroscopic surveys from the ground and in space, such as Euclid and WFIRST. A table of the reclassified VVDS redshifts and reliability is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/611/A53
Minimum distance classification in remote sensing
NASA Technical Reports Server (NTRS)
Wacker, A. G.; Landgrebe, D. A.
1972-01-01
The utilization of minimum distance classification methods in remote sensing problems, such as crop species identification, is considered. Literature concerning both minimum distance classification problems and distance measures is reviewed. Experimental results are presented for several examples. The objective of these examples is to: (a) compare the sample classification accuracy of a minimum distance classifier, with the vector classification accuracy of a maximum likelihood classifier, and (b) compare the accuracy of a parametric minimum distance classifier with that of a nonparametric one. Results show the minimum distance classifier performance is 5% to 10% better than that of the maximum likelihood classifier. The nonparametric classifier is only slightly better than the parametric version.
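As a rough illustration of the simplest form of minimum distance classification, the sketch below assigns a single feature vector to the class whose mean is nearest in Euclidean distance. This is an assumption made for illustration: the report also considers classifying whole samples and other distance measures, which are not shown. The class labels and feature values are hypothetical.

```python
import math


def class_means(training):
    """Map each label to the mean of its training vectors.

    training: dict of label -> list of equal-length feature vectors.
    """
    means = {}
    for label, vectors in training.items():
        dim = len(vectors[0])
        means[label] = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return means


def classify(x, means):
    """Return the label whose class mean is closest to x (Euclidean distance)."""
    return min(means, key=lambda label: math.dist(x, means[label]))


# Hypothetical two-band training data (illustration only).
training = {
    "wheat": [[0.2, 0.6], [0.3, 0.7]],
    "soil": [[0.5, 0.3], [0.6, 0.2]],
}
means = class_means(training)
print(classify([0.3, 0.6], means))
```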
Assessment of sexual orientation using the hemodynamic brain response to visual sexual stimuli.
Ponseti, Jorge; Granert, Oliver; Jansen, Olav; Wolff, Stephan; Mehdorn, Hubertus; Bosinski, Hartmut; Siebner, Hartwig
2009-06-01
The assessment of sexual orientation is of importance to the diagnosis and treatment of sex offenders and paraphilic disorders. Phallometry is considered the gold standard in objectifying sexual orientation, yet this measurement has been criticized because of its intrusiveness and limited reliability. The aim was to evaluate whether the spatial response pattern to sexual stimuli as revealed by a change in blood oxygen level-dependent (BOLD) signal can be used for individual classification of sexual orientation. We used a preexisting functional MRI (fMRI) data set that had been acquired in a nonclinical sample of 12 heterosexual men and 14 homosexual men. During fMRI, participants were briefly exposed to pictures of same-sex and opposite-sex genitals. Data analysis involved four steps: (i) differences in the BOLD response to female and male sexual stimuli were calculated for each subject; (ii) these contrast images were entered into a group analysis to calculate whole-brain difference maps between homosexual and heterosexual participants; (iii) a single expression value was computed for each subject expressing its correspondence to the group result; and (iv) based on these expression values, Fisher's linear discriminant analysis and the k-nearest neighbor classification method were used to predict the sexual orientation of each subject. The main outcome measures were the sensitivity and specificity of the two classification methods in predicting individual sexual orientation. Both classification methods performed well in predicting individual sexual orientation with a mean accuracy of >85% (Fisher's linear discriminant analysis: 92% sensitivity, 85% specificity; k-nearest neighbor classification: 88% sensitivity, 92% specificity). Despite the small sample size, the functional response patterns of the brain to sexual stimuli contained sufficient information to predict individual sexual orientation with high accuracy.
These results suggest that fMRI-based classification methods hold promise for the diagnosis of paraphilic disorders (e.g., pedophilia).
Large-area settlement pattern recognition from Landsat-8 data
NASA Astrophysics Data System (ADS)
Wieland, Marc; Pittore, Massimiliano
2016-09-01
The study presents an image processing and analysis pipeline that combines object-based image analysis with a Support Vector Machine to derive a multi-layered settlement product from Landsat-8 data over large areas. 43 image scenes are processed over large parts of Central Asia (Southern Kazakhstan, Kyrgyzstan, Tajikistan and Eastern Uzbekistan). The main tasks tackled by this work include built-up area identification, settlement type classification and urban structure types pattern recognition. Besides commonly used accuracy assessments of the resulting map products, thorough performance evaluations are carried out under varying conditions to tune algorithm parameters and assess their applicability for the given tasks. As part of this, several research questions are being addressed. In particular the influence of the improved spatial and spectral resolution of Landsat-8 on the SVM performance to identify built-up areas and urban structure types are evaluated. Also the influence of an extended feature space including digital elevation model features is tested for mountainous regions. Moreover, the spatial distribution of classification uncertainties is analyzed and compared to the heterogeneity of the building stock within the computational unit of the segments. The study concludes that the information content of Landsat-8 images is sufficient for the tested classification tasks and even detailed urban structures could be extracted with satisfying accuracy. Freely available ancillary settlement point location data could further improve the built-up area classification. Digital elevation features and pan-sharpening could, however, not significantly improve the classification results. The study highlights the importance of dynamically tuned classifier parameters, and underlines the use of Shannon entropy computed from the soft answers of the SVM as a valid measure of the spatial distribution of classification uncertainties.
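The abstract above uses Shannon entropy of the SVM's soft outputs as an uncertainty measure. A minimal sketch of that quantity follows, assuming the soft outputs have already been converted to per-class probabilities for one segment. The function name and the example probabilities are hypothetical.

```python
import math


def shannon_entropy(probabilities, base=2.0):
    """Shannon entropy of a class-probability vector.

    The input is renormalised to sum to 1; zero entries contribute nothing.
    """
    total = sum(probabilities)
    entropy = 0.0
    for p in probabilities:
        q = p / total
        if q > 0:
            entropy -= q * math.log(q, base)
    return entropy


# Hypothetical class probabilities for two segments (illustration only).
confident = [0.9, 0.05, 0.05]
ambiguous = [0.4, 0.3, 0.3]
print(shannon_entropy(confident), shannon_entropy(ambiguous))
```

Entropy is 0 when one class has probability 1 and reaches its maximum, log2 of the number of classes, when all classes are equally likely, so mapping it per segment highlights where the classifier is least certain.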
Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan
2018-01-01
The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs (p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
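The trade-off described above, more classes versus lower accuracy, can be made concrete with an ITR formula. The sketch below assumes the commonly used Wolpaw definition; the abstract does not state which formula or trial duration the authors used, so the function names and the `trial_seconds` parameter are illustrative only.

```python
import math


def itr_bits_per_trial(n_classes, accuracy):
    """Wolpaw information transfer rate in bits per trial (assumed definition)."""
    if n_classes < 2 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("need n_classes >= 2 and 0 <= accuracy <= 1")
    bits = math.log2(n_classes)
    # Each term is skipped at its boundary, where p * log2(p) tends to 0.
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_minute(n_classes, accuracy, trial_seconds):
    """Scale bits per trial by the number of trials per minute."""
    return itr_bits_per_trial(n_classes, accuracy) * 60.0 / trial_seconds


# Three classes at the reported mean hBCI accuracy of 82.2%.
print(round(itr_bits_per_trial(3, 0.822), 3))
```

Under this definition a classifier at chance level carries zero bits per trial, and a perfect N-class classifier carries log2(N) bits, which is why adding a third class can raise the ITR even when accuracy drops somewhat.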
Selective classification for improved robustness of myoelectric control under nonideal conditions.
Scheme, Erik J; Englehart, Kevin B; Hudgins, Bernard S
2011-06-01
Recent literature in pattern recognition-based myoelectric control has highlighted a disparity between classification accuracy and the usability of upper limb prostheses. This paper suggests that the conventionally defined classification accuracy may be idealistic and may not reflect true clinical performance. Herein, a novel myoelectric control system based on a selective multiclass one-versus-one classification scheme, capable of rejecting unknown data patterns, is introduced. This scheme is shown to outperform nine other popular classifiers when compared using conventional classification accuracy as well as a form of leave-one-out analysis that may be more representative of real prosthetic use. Additionally, the classification scheme allows for real-time, independent adjustment of individual class-pair boundaries making it flexible and intuitive for clinical use.
Multi-source remotely sensed data fusion for improving land cover classification
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Bo; Xu, Bing
2017-02-01
Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.
3D micro-mapping: Towards assessing the quality of crowdsourcing to support 3D point cloud analysis
NASA Astrophysics Data System (ADS)
Herfort, Benjamin; Höfle, Bernhard; Klonner, Carolin
2018-03-01
In this paper, we propose a method to crowdsource the task of complex three-dimensional information extraction from 3D point clouds. We design web-based 3D micro tasks tailored to assess segmented LiDAR point clouds of urban trees and investigate the quality of the approach in an empirical user study. Our results for three different experiments with increasing complexity indicate that a single crowdsourcing task can be solved in a very short time of less than five seconds on average. Furthermore, the results of our empirical case study reveal that the accuracy, sensitivity and precision of 3D crowdsourcing are high for most information extraction problems. For our first experiment (binary classification with single answer) we obtain an accuracy of 91%, a sensitivity of 95% and a precision of 92%. For the more complex tasks of the second experiment (multiple answer classification) the accuracy ranges from 65% to 99% depending on the label class. Regarding the third experiment - the determination of the crown base height of individual trees - our study highlights that crowdsourcing can be a tool to obtain values with even higher accuracy in comparison to an automated computer-based approach. Finally, we found that the accuracy of the crowdsourced results for all experiments is hardly influenced by characteristics of the input point cloud data and of the users. Importantly, the results' accuracy can be estimated using agreement among volunteers as an intrinsic indicator, which makes a broad application of 3D micro-mapping very promising.
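For readers unfamiliar with the three measures reported for the binary experiment, the sketch below shows how accuracy, sensitivity and precision follow from the four cells of a binary confusion table. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity (recall) and precision from binary confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all answers that are correct
        "sensitivity": tp / (tp + fn),   # share of true positives that were found
        "precision": tp / (tp + fp),     # share of positive answers that are right
    }


# Hypothetical counts (illustration only).
metrics = binary_metrics(tp=8, fp=1, fn=2, tn=9)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```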
Analysis of near infrared spectra for age-grading of wild populations of Anopheles gambiae.
Krajacich, Benjamin J; Meyers, Jacob I; Alout, Haoues; Dabiré, Roch K; Dowell, Floyd E; Foy, Brian D
2017-11-07
Understanding the age-structure of mosquito populations, especially malaria vectors such as Anopheles gambiae, is important for assessing the risk of infectious mosquitoes, and how vector control interventions may impact this risk. The use of near-infrared spectroscopy (NIRS) for age-grading has been demonstrated previously on laboratory and semi-field mosquitoes, but to date has not been utilized on wild-caught mosquitoes whose age is externally validated via parity status or parasite infection stage. In this study, we developed regression and classification models using NIRS on datasets of wild An. gambiae (s.l.) reared from larvae collected from the field in Burkina Faso, and two laboratory strains. We compared the accuracy of these models for predicting the ages of wild-caught mosquitoes that had been scored for their parity status as well as for positivity for Plasmodium sporozoites. Regression models utilizing variable selection increased predictive accuracy over the more common full-spectrum partial least squares (PLS) approach for cross-validation of the datasets, validation, and independent test sets. Models produced from datasets that included the greatest range of mosquito samples (i.e. different sampling locations and times) had the highest predictive accuracy on independent testing sets, though overall accuracy on these samples was low. For classification, we found that intramodel accuracy ranged between 73.5-97.0% for grouping of mosquitoes into "early" and "late" age classes, with the highest prediction accuracy found in laboratory colonized mosquitoes. However, this accuracy was decreased on test sets, with the highest classification of an independent set of wild-caught larvae reared to set ages being 69.6%. Variation in NIRS data, likely from dietary, genetic, and other factors limits the accuracy of this technique with wild-caught mosquitoes. 
Alternative algorithms may help improve prediction accuracy, but care should be taken to either maximize variety in models or minimize confounders.
Vijayakumar, Vishal; Case, Michelle; Shirinpour, Sina; He, Bin
2017-12-01
Effective pain assessment and management strategies are needed to better manage pain. In addition to self-report, an objective pain assessment system can provide a more complete picture of the neurophysiological basis for pain. In this study, a robust and accurate machine learning approach is developed to quantify tonic thermal pain across healthy subjects into a maximum of ten distinct classes. A random forest model was trained to predict pain scores using time-frequency wavelet representations of independent components obtained from electroencephalography (EEG) data, and the relative importance of each frequency band to pain quantification is assessed. The mean classification accuracy for predicting pain on an independent test subject for a range of 1-10 is 89.45%, highest among existing state of the art quantification algorithms for EEG. The gamma band is the most important to both intersubject and intrasubject classification accuracy. The robustness and generalizability of the classifier are demonstrated. Our results demonstrate the potential of this tool to be used clinically to help us to improve chronic pain treatment and establish spectral biomarkers for future pain-related studies using EEG.
Estepp, Justin R.; Christensen, James C.
2015-01-01
The passive brain-computer interface (pBCI) framework has been shown to be a very promising construct for assessing cognitive and affective state in both individuals and teams. There is a growing body of work that focuses on solving the challenges of transitioning pBCI systems from the research laboratory environment to practical, everyday use. An interesting issue is what impact methodological variability may have on the ability to reliably identify (neuro)physiological patterns that are useful for state assessment. This work aimed at quantifying the effects of methodological variability in a pBCI design for detecting changes in cognitive workload. Specific focus was directed toward the effects of replacing electrodes over dual sessions (thus inducing changes in placement, electromechanical properties, and/or impedance between the electrode and skin surface) on the accuracy of several machine learning approaches in a binary classification problem. In investigating these methodological variables, it was determined that the removal and replacement of the electrode suite between sessions does not impact the accuracy of a number of learning approaches when trained on one session and tested on a second. This finding was confirmed by comparing to a control group for which the electrode suite was not replaced between sessions. This result suggests that sensors (both neurological and peripheral) may be removed and replaced over the course of many interactions with a pBCI system without affecting its performance. Future work on multi-session and multi-day pBCI system use should seek to replicate this (lack of) effect between sessions in other tasks, temporal time courses, and data analytic approaches while also focusing on non-stationarity and variable classification performance due to intrinsic factors. PMID:25805963
Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification
NASA Astrophysics Data System (ADS)
Höhle, J.
2014-09-01
A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is subsequently refined by using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out with the open source software "R"; the dense and accurate digital surface model is generated with the "Match-T DSM" program of the Trimble Company. A practical example of a 2D land cover map generation is carried out. Images of a multispectral medium-format aerial camera covering an urban area in Switzerland are used. The assessment of the produced land cover map is based on class-wise stratified sampling where reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes and based on 91 points per class reveals a high thematic accuracy for classes "building" (99 %, 95 % CI: 95 %-100 %) and "road and parking lot" (90 %, 95 % CI: 83 %-95 %). Some other accuracy measures (overall accuracy, kappa value) and their 95 % confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.
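The class-wise accuracies above are reported with 95 % confidence intervals from 91 sample points per class. The abstract does not say which interval method was used; the sketch below uses the Wilson score interval as one reasonable choice for a binomial proportion, so it is an assumption and may not reproduce the published bounds exactly. The function name and example counts are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95 %)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical example: 82 of 91 sample points classified correctly.
lower, upper = wilson_interval(82, 91)
print(f"accuracy = {82 / 91:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

Unlike the simple normal approximation, this interval stays within 0 to 1 and remains usable when the observed accuracy is close to 100 %, which matters for classes such as "building".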
Liu, Siqi; Oh, Heesoo; Chambers, David William; Xu, Tianmin; Baumrind, Sheldon
2018-04-06
Determine optimal weightings of Peer Assessment Rating (PAR) index and Discrepancy Index (DI) for malocclusion severity assessment in Chinese orthodontic patients. Sixty-nine Chinese orthodontists assessed a full set of pre-treatment records from a stratified random sample of 120 subjects gathered from six university orthodontic centres. Using professional judgment as the outcome variable, multiple regression analyses were performed to derive customized weighting systems for the PAR index and DI, for all subjects and each Angle classification subgroup. Professional judgment was consistent, with an Intraclass Correlation Coefficient (ICC) of 0.995. The PAR index or DI can be reliably measured, with ICC = 0.959 and 0.990, respectively. The predictive accuracy of PAR index was greatly improved by the Chinese weighting process (from r = 0.431 to r = 0.788) with almost equal distribution in each Angle classification subgroup. The Chinese-weighted DI showed a higher predictive accuracy, at P = 0.01, compared with the PAR index (r = 0.851 versus r = 0.788). A better performance was found in the Class II group (r = 0.890) when compared to Class I (r = 0.736) and III (r = 0.785) groups. The Chinese-weighted PAR index and DI were capable of predicting 62 per cent and 73 per cent of total variance in the professional judgment of malocclusion severity in Chinese patients. Differential prediction across Angle classifications merits attention since different weighting formulas were found.
Comparison of wheat classification accuracy using different classifiers of the image-100 system
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.
1981-01-01
Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.
Classification of Focal and Non Focal Epileptic Seizures Using Multi-Features and SVM Classifier.
Sriraam, N; Raghu, S
2017-09-02
Identifying epileptogenic zones prior to surgery is an essential and crucial step in treating patients having pharmacoresistant focal epilepsy. Electroencephalogram (EEG) is a significant measurement benchmark to assess patients suffering from epilepsy. This paper investigates the application of multi-features derived from different domains to recognize the focal and non focal epileptic seizures obtained from pharmacoresistant focal epilepsy patients from the Bern Barcelona database. From the dataset, five different classification tasks were formed. A total of 26 features were extracted from focal and non focal EEG. Significant features were selected using the Wilcoxon rank sum test with thresholds on the p-value (p < 0.05) and z-score (z < -1.96 or z > 1.96) at the 95% confidence level. It was hypothesized that removing outliers improves the classification accuracy. Tukey's range test was adopted for pruning outliers from the feature set. Finally, 21 features were classified using an optimized support vector machine (SVM) classifier with 10-fold cross validation. A Bayesian optimization technique was adopted to minimize the cross-validation loss. From the simulation results, it was inferred that the highest sensitivity, specificity, and classification accuracy of 94.56%, 89.74%, and 92.15%, respectively, were achieved and found to be better than the state-of-the-art approaches. Further, it was observed that the classification accuracy improved from 80.2% with outliers to 92.15% without outliers. The classifier performance metrics ensure the suitability of the proposed multi-features with the optimized SVM classifier. It can be concluded that the proposed approach can be applied for recognition of focal EEG signals to localize epileptogenic zones.
NASA Astrophysics Data System (ADS)
Fei, Baowei; Lu, Guolan; Wang, Xu; Zhang, Hongzheng; Little, James V.; Magliocca, Kelly R.; Chen, Amy Y.
2017-02-01
We are developing label-free hyperspectral imaging (HSI) for tumor margin assessment. HSI data, a hypercube (x, y, λ), consist of a series of high-resolution images of the same field of view that are acquired at different wavelengths. Every pixel on the HSI image has an optical spectrum. We developed preprocessing and classification methods for HSI data. We used spectral features from HSI data for the classification of cancer and benign tissue. We collected surgical tissue specimens from 16 human patients who underwent head and neck (H&N) cancer surgery. We acquired HSI, autofluorescence images, and fluorescence images with 2-NBDG and proflavine from the specimens. Digitized histologic slides were examined by an H&N pathologist. The hyperspectral imaging and classification method was able to distinguish between cancer and normal tissue from the oral cavity with an average accuracy of 90 ± 8%, sensitivity of 89 ± 9%, and specificity of 91 ± 6%. For tissue specimens from the thyroid, the method achieved an average accuracy of 94 ± 6%, sensitivity of 94 ± 6%, and specificity of 95 ± 6%. Hyperspectral imaging outperformed autofluorescence imaging or fluorescence imaging with vital dye (2-NBDG or proflavine). This study suggests that label-free hyperspectral imaging has great potential for tumor margin assessment in surgical tissue specimens of H&N cancer patients. Further development of the hyperspectral imaging technology is warranted for its application in image-guided surgery.
Validation of Accelerometer Cut-Points in Children With Cerebral Palsy Aged 4 to 5 Years.
Keawutan, Piyapa; Bell, Kristie L; Oftedal, Stina; Davies, Peter S W; Boyd, Roslyn N
2016-01-01
To derive and validate triaxial accelerometer cut-points in children with cerebral palsy (CP) and compare these with previously established cut-points in children with typical development. Eighty-four children with CP aged 4 to 5 years wore the ActiGraph during a play-based gross motor function measure assessment that was video-taped for direct observation. Receiver operating characteristic and Bland-Altman plots were used for analyses. The ActiGraph had good classification accuracy in Gross Motor Function Classification System (GMFCS) levels III and V and fair classification accuracy in GMFCS levels I, II, and IV. These results support the use of the previously established cut-points for sedentary time of 820 counts per minute in children with CP aged 4 to 5 years across all functional abilities. The cut-point provides an objective measure of sedentary and active time in children with CP. The cut-point is applicable to group data but not for individual children.
Accuracy of Remotely Sensed Classifications For Stratification of Forest and Nonforest Lands
Raymond L. Czaplewski; Paul L. Patterson
2001-01-01
We specify accuracy standards for remotely sensed classifications used by FIA to stratify landscapes into two categories: forest and nonforest. Accuracy must be highest when forest area approaches 100 percent of the landscape. If forest area is rare in a landscape, then accuracy in the nonforest stratum must be very high, even at the expense of accuracy in the forest...
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savanna tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
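The green vegetation and shade filters mentioned above can be sketched as simple per-pixel thresholds: NDVI, defined as (NIR - Red) / (NIR + Red), to keep green vegetation, and a minimum near-infrared reflectance to drop shaded pixels. The threshold values, the pixel representation and the function names below are hypothetical; the abstract does not give the values used in the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def keep_green_unshaded(pixels, ndvi_min=0.5, nir_min=0.2):
    """Keep pixels that look like sunlit green vegetation.

    ndvi_min and nir_min are illustrative thresholds, not values from the study.
    """
    return [
        p for p in pixels
        if ndvi(p["nir"], p["red"]) >= ndvi_min and p["nir"] >= nir_min
    ]


# Hypothetical reflectances (illustration only).
pixels = [
    {"nir": 0.50, "red": 0.10},  # sunlit canopy: kept
    {"nir": 0.10, "red": 0.02},  # shaded canopy: high NDVI but low NIR, dropped
    {"nir": 0.30, "red": 0.25},  # bare ground: low NDVI, dropped
]
print(len(keep_green_unshaded(pixels)))
```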
Comparing Features for Classification of MEG Responses to Motor Imagery.
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results. We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. 
Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system.
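The 5-fold cross-validation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `k_fold_indices` and `cross_validated_accuracy`, the shuffling seed, and the `fit`/`predict` callables are all hypothetical, and the sketch assumes at least as many samples as folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validated_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Mean accuracy over the k held-out folds (assumes len(labels) >= k)."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        # Train only on the k-1 training folds, then score the held-out fold.
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)
```

Each trial is held out exactly once, so the reported mean reflects performance on data the classifier did not see during training.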
PCA based feature reduction to improve the accuracy of decision tree c4.5 classification
NASA Astrophysics Data System (ADS)
Nasution, M. Z. F.; Sitompul, O. S.; Ramli, M.
2018-03-01
Attribute splitting is a major process in Decision Tree C4.5 classification. However, this process does not have a significant impact on the construction of the decision tree in terms of removing irrelevant features. A major problem in decision tree classification is over-fitting, which results from noisy data and irrelevant features. In turn, over-fitting causes misclassification and data imbalance. Many algorithms have been proposed to overcome misclassification and over-fitting in Decision Tree C4.5 classification. Feature reduction is an important issue in classification modelling; it is intended to remove irrelevant data in order to improve accuracy. The feature reduction framework is used to reduce high-dimensional data to low-dimensional data with non-correlated attributes. In this research, we propose a framework for selecting relevant and non-correlated feature subsets. We use principal component analysis (PCA) for feature reduction to perform non-correlated feature selection and the Decision Tree C4.5 algorithm for classification. In experiments on the cervical cancer data set from the UCI repository, with 858 instances and 36 attributes, we evaluated the performance of our framework based on accuracy, specificity and precision. Experimental results show that our proposed framework robustly enhances classification accuracy, reaching an accuracy of 90.70%.
ERIC Educational Resources Information Center
Petersen, Douglas B.; Allen, Melissa M.; Spencer, Trina D.
2016-01-01
The purpose of this study was to examine and compare the classification accuracy of early static prereading measures and early dynamic assessment reading measures administered to 600 kindergarten students. At the beginning of kindergarten, all of the participants were administered two commonly used static prereading measures. The participants were…
Improving crop classification through attention to the timing of airborne radar acquisitions
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Protz, R.
1984-01-01
Radar remote sensors may provide valuable input to crop classification procedures because of (1) their independence of weather conditions and solar illumination, and (2) their ability to respond to differences in crop type. Manual classification of multidate synthetic aperture radar (SAR) imagery resulted in an overall accuracy of 83 percent for corn, forest, grain, and 'other' cover types. Forests and corn fields were identified with accuracies approaching or exceeding 90 percent. Grain fields and 'other' fields were often confused with each other, resulting in classification accuracies of 51 and 66 percent, respectively. The 83 percent correct classification represents a 10 percent improvement when compared to similar SAR data for the same area collected at alternate time periods in 1978. These results demonstrate that improvements in crop classification accuracy can be achieved with SAR data by synchronizing data collection times with crop growth stages in order to maximize differences in the geometric and dielectric properties of the cover types of interest.
Gates to Gregg High Voltage Transmission Line Study. [California
NASA Technical Reports Server (NTRS)
Bergis, V.; Maw, K.; Newland, W.; Sinnott, D.; Thornbury, G.; Easterwood, P.; Bonderud, J.
1982-01-01
The usefulness of LANDSAT data in the planning of transmission line routes was assessed. LANDSAT digital data and image processing techniques, specifically a multi-date supervised classification approach, were used to develop a land cover map for an agricultural area near Fresno, California. Twenty-six land cover classes were identified, of which twenty classes were agricultural crops. High classification accuracies (greater than 80%) were attained for several classes, including cotton, grain, and vineyards. The primary products generated were 1:24,000, 1:100,000 and 1:250,000 scale maps of the classification and acreage summaries for all land cover classes within four alternate transmission line routes.
NASA Astrophysics Data System (ADS)
Malatesta, Luca; Attorre, Fabio; Altobelli, Alfredo; Adeeb, Ahmed; De Sanctis, Michele; Taleb, Nadim M.; Scholte, Paul T.; Vitale, Marcello
2013-01-01
Socotra Island (Yemen), a global biodiversity hotspot, is characterized by high geomorphological and biological diversity. In this study, we present a high-resolution vegetation map of the island based on combining vegetation analysis and classification with remote sensing. Two different image classification approaches were tested to assess the most accurate one in mapping the vegetation mosaic of Socotra. Spectral signatures of the vegetation classes were obtained through a Gaussian mixture distribution model, and a sequential maximum a posteriori (SMAP) classification was applied to account for the heterogeneity and the complex spatial pattern of the arid vegetation. This approach was compared to the traditional maximum likelihood (ML) classification. Satellite data were represented by a RapidEye image with 5 m pixel resolution and five spectral bands. Classified vegetation relevés were used to obtain the training and evaluation sets for the main plant communities. Postclassification sorting was performed to adjust the classification through various rule-based operations. Twenty-eight classes were mapped, and SMAP, with an accuracy of 87%, proved to be more effective than ML (accuracy: 66%). The resulting map will represent an important instrument for the elaboration of conservation strategies and the sustainable use of natural resources in the island.
Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment
Mukherjee, Rashmi; Manohar, Dhiraj Dhane; Das, Dev Kumar; Achar, Arun; Mitra, Analava; Chakraborty, Chandan
2014-01-01
The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images grabbed by normal digital camera were first transformed into HSI (hue, saturation, and intensity) color space and subsequently the “S” component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with highest kappa statistic value (0.793). PMID:25114925
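The abstract above reports an overall accuracy together with a kappa statistic. As a hedged illustration of how the two relate, the sketch below computes both from a square confusion matrix using the standard definition of Cohen's kappa. The function name and the example matrix are hypothetical and are not taken from the paper.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] counts samples of reference class i assigned to class j.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example: 35 of 50 samples lie on the diagonal.
accuracy, kappa = overall_accuracy_and_kappa([[20, 5], [10, 15]])
```

Kappa discounts the agreement that the marginal class frequencies would produce by chance, which is why it is lower than the raw accuracy for the same matrix.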
Proceedings of Technical Sessions, Volumes 1 and 2: the LACIE Symposium
NASA Technical Reports Server (NTRS)
1979-01-01
The technical design of the Large Area Crop Inventory Experiment is examined and data acquired over 3 global crop years is analyzed with respect to (1) sampling and aggregation; (2) growth size estimation; (3) classification and mensuration; (4) yield estimation; and (5) accuracy assessment. Seventy-nine papers delivered at conference sessions cover system implementation and operation; data processing systems; experiment results and accuracy; supporting research and technology; and the USDA application test system.
Hrabok, Marianne; Brooks, Brian L; Fay-McClymont, Taryn B; Sherman, Elisabeth M S
2014-01-01
The purpose of this article was to investigate the accuracy of the WISC-IV short forms in estimating Full Scale Intelligence Quotient (FSIQ) and General Ability Index (GAI) in pediatric epilepsy. One hundred and four children with epilepsy completed the WISC-IV as part of a neuropsychological assessment at a tertiary-level children's hospital. The clinical accuracy of eight short forms was assessed in two ways: (a) accuracy within +/- 5 index points of FSIQ and (b) the clinical classification rate according to Wechsler conventions. The sample was further subdivided into low FSIQ (≤ 80) and high FSIQ (> 80). All short forms were significantly correlated with FSIQ. Seven-subtest (Crawford et al. [2010] FSIQ) and 5-subtest (BdSiCdVcLn) short forms yielded the highest clinical accuracy rates (77%-89%). Overall, a 2-subtest (VcMr) short form yielded the lowest clinical classification rates for FSIQ (35%-63%). The short form yielding the most accurate estimate of GAI was VcSiMrBd (73%-84%). Short forms show promise as useful estimates. The 7-subtest (Crawford et al., 2010) and 5-subtest (BdSiVcLnCd) short forms yielded the most accurate estimates of FSIQ. VcSiMrBd yielded the most accurate estimate of GAI. Clinical recommendations are provided for use of short forms in pediatric epilepsy.
NASA Astrophysics Data System (ADS)
Książek, Judyta
2015-10-01
There is currently great interest in the development of texture-based image classification methods in many different areas. This study presents the results of research carried out to assess the usefulness of selected textural features for detection of asbestos-cement roofs in orthophotomap classification. Two different orthophotomaps of southern Poland (with ground resolutions of 5 cm and 25 cm) were used. On both orthoimages, representative samples were selected for two classes: asbestos-cement roofing sheets and other roofing materials. The usefulness of texture analysis was estimated using machine learning methods based on decision trees (C5.0 algorithm). For this purpose, various sets of texture parameters were calculated in MaZda software. During the calculation of decision trees, different numbers of texture parameter groups were considered. In order to obtain the best settings for the decision tree models, cross-validation was performed. Decision tree models with the lowest mean classification error were selected. The accuracy of the classification was assessed on validation data sets, which were not used for training the classifier. For the 5 cm ground resolution samples, the lowest mean classification error was 15.6%. The lowest mean classification error in the case of 25 cm ground resolution was 20.0%. The obtained results confirm the potential usefulness of texture parameter image processing for detection of asbestos-cement roofing sheets. In order to improve the accuracy, an extended study should be considered in which additional textural features as well as spectral characteristics are analyzed.
Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse
2013-01-01
Objective We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness in diabetic patients to develop hypertension and vice versa. We ascertain the importance of ethnicity (and natives vs expatriate migrants) and of using regional data in risk assessment. Design Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross validation to obtain generalisation accuracies and errors. Setting Kuwait Health Network (KHN) that integrates data from primary health centres and hospitals in Kuwait. Participants 270 172 hospital visitors (of which, 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives, Asian and Arab expatriates. Outcome measures Incident type 2 diabetes, hypertension and comorbidity. Results Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign ‘high’ risk to 75% of diabetic patients and to 94% of hypertensive patients. Only 5% of diabetic patients are seen assigned ‘low’ risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in hypertensive population and those of hypertension are modelled. Two-stage aggregate classification models and risk assessment tools, built combining both the component models on diabetes (or on hypertension), perform better than individual models. Conclusions Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case–control models to assess risks. 
These tools aid in the preliminary non-intrusive assessment of the population. Ethnicity is seen significant to the predictive models. Risk assessments need to be developed using regional data as we demonstrate the applicability of the American Diabetes Association online calculator on data from Kuwait. PMID:23676796
Global Optimization Ensemble Model for Classification Methods
Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab
2014-01-01
Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382
Characterization and classification of lupus patients based on plasma thermograms
Chaires, Jonathan B.; Mekmaysy, Chongkham S.; DeLeeuw, Lynn; Sivils, Kathy L.; Harley, John B.; Rovin, Brad H.; Kulasekera, K. B.; Jarjour, Wael N.
2017-01-01
Objective Plasma thermograms (thermal stability profiles of blood plasma) are being utilized as a new diagnostic approach for clinical assessment. In this study, we investigated the ability of plasma thermograms to classify systemic lupus erythematosus (SLE) patients versus non SLE controls using a sample of 300 SLE and 300 control subjects from the Lupus Family Registry and Repository. Additionally, we evaluated the heterogeneity of thermograms along age, sex, ethnicity, concurrent health conditions and SLE diagnostic criteria. Methods Thermograms were visualized graphically for important differences between covariates and summarized using various measures. A modified linear discriminant analysis was used to segregate SLE versus control subjects on the basis of the thermograms. Classification accuracy was measured based on multiple training/test splits of the data and compared to classification based on SLE serological markers. Results Median sensitivity, specificity, and overall accuracy based on classification using plasma thermograms was 86%, 83%, and 84% compared to 78%, 95%, and 86% based on a combination of five antibody tests. Combining thermogram and serology information together improved sensitivity from 78% to 86% and overall accuracy from 86% to 89% relative to serology alone. Predictive accuracy of thermograms for distinguishing SLE and osteoarthritis / rheumatoid arthritis patients was comparable. Both gender and anemia significantly interacted with disease status for plasma thermograms (p<0.001), with greater separation between SLE and control thermograms for females relative to males and for patients with anemia relative to patients without anemia. Conclusion Plasma thermograms constitute an additional biomarker which may help improve diagnosis of SLE patients, particularly when coupled with standard diagnostic testing. 
Differences in thermograms according to patient sex, ethnicity, clinical and environmental factors are important considerations for application of thermograms in a clinical setting. PMID:29149219
Gromski, Piotr S; Correa, Elon; Vaughan, Andrew A; Wedge, David C; Turner, Michael L; Goodacre, Royston
2014-11-01
Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.
Detection of epileptic seizure in EEG signals using linear least squares preprocessing.
Roshan Zamir, Z
2016-09-01
An epileptic seizure is a transient event of abnormal excessive neuronal discharge in the brain. This unwanted event can be obstructed by detection of electrical changes in the brain that happen before the seizure takes place. The automatic detection of seizures is necessary since the visual screening of EEG recordings is a time consuming task and requires experts to improve the diagnosis. Much of the prior research in detection of seizures has been developed based on artificial neural network, genetic programming, and wavelet transforms. Although the highest achieved accuracy for classification is 100%, there are drawbacks, such as the existence of unbalanced datasets and the lack of investigations in performances consistency. To address these, four linear least squares-based preprocessing models are proposed to extract key features of an EEG signal in order to detect seizures. The first two models are newly developed. The original signal (EEG) is approximated by a sinusoidal curve. Its amplitude is formed by a polynomial function and compared with the predeveloped spline function. Different statistical measures, namely classification accuracy, true positive and negative rates, false positive and negative rates and precision, are utilised to assess the performance of the proposed models. These metrics are derived from confusion matrices obtained from classifiers. Different classifiers are used over the original dataset and the set of extracted features. The proposed models significantly reduce the dimension of the classification problem and the computational time while the classification accuracy is improved in most cases. The first and third models are promising feature extraction methods with the classification accuracy of 100%. Logistic, LazyIB1, LazyIB5, and J48 are the best classifiers. Their true positive and negative rates are 1 while false positive and negative rates are 0 and the corresponding precision values are 1. 
Numerical results suggest that these models are robust and efficient for detecting epileptic seizure.
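The performance measures listed in the abstract above (classification accuracy, true and false positive and negative rates, and precision) all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions. The function name and the counts are hypothetical, and it assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard measures from a binary confusion matrix.

    tp/fn count positive cases (e.g. seizure) classified right/wrong;
    tn/fp count negative cases classified right/wrong.
    Assumes each denominator below is non-zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "true_positive_rate": tp / (tp + fn),   # sensitivity
        "true_negative_rate": tn / (tn + fp),   # specificity
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for a classifier that makes no errors at all.
perfect = binary_metrics(tp=50, fp=0, fn=0, tn=50)
```

With no misclassified cases, the true positive and negative rates and the precision are all 1 and the false positive and negative rates are 0, which is the pattern the abstract reports for its best classifiers.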
Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi
2017-03-15
Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combining both solutions to improve the prediction accuracy had not been explored before. We developed two algorithms, HH-fold and SVM-fold, for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features is extracted from three complementary sequence profiles. These two algorithms are then combined, resulting in the ensemble approach TA-fold. We performed a comprehensive assessment of the proposed methods by comparing them with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset, which consists of proteins from 27 folds. This represents an improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolution information. Availability: http://yanglab.nankai.edu.cn/TA-fold/. Contact: yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online.
A novel approach for individual tree crown delineation using lidar data
NASA Astrophysics Data System (ADS)
Liu, Tao
Individual tree crown delineation (ITCD) is an important technique to support precision forestry. ITCD is particularly difficult for deciduous forests where the existence of multiple branches can lead to false tree top detection. This thesis focused on developing a new ITCD model, which consists of two components: (1) boundary refinement using a novel algorithm called Fishing Net Dragging (FiND), and (2) segment merging using boundary classification. The proposed ITCD model was tested in both deciduous and mixed forests, attaining an overall accuracy of 74% and 78%, respectively. This compared favorably to an ITCD method commonly cited in the literature, which attained 41% and 51% on the same plots. To facilitate comparison of research in the ITCD community, this thesis also developed a new accuracy assessment scheme for ITCD. This new accuracy assessment is easy to interpret and convenient to implement while comprehensively evaluating ITCD accuracy.
Sex discrimination potential of buccolingual and mesiodistal tooth dimensions.
Acharya, Ashith B; Mainali, Sneedha
2008-07-01
Tooth crown dimensions are reasonably accurate predictors of sex and are useful adjuncts in sex assessment. This study explores the utility of buccolingual (BL) and mesiodistal (MD) measurements in sex differentiation when used independently. BL and MD measurements of 28 teeth (third molars excluded) were obtained from a group of 53 Nepalese subjects (22 women and 31 men) aged 19-28 years. Stepwise discriminant analyses were undertaken separately for both types of tooth crown variables and their accuracy in sex classification compared with one another. MD dimensions had recognizably greater accuracy (77.4-83%) in sex identification than BL measurements (62.3-64.2%)--results that are consistent with previous reports. However, the accuracy of MD variables is not high enough to warrant their exclusive use in odontometric sex assessment--higher accuracy levels have been obtained when both types of dimensions were used concurrently, implying that BL variables contribute to sex assessment to some extent. Hence, it is inferred that optimal results in dental sex assessment are obtained when both MD and BL variables are used together.
NutriNet: A Deep Learning Food and Drink Image Recognition System for Dietary Assessment
Koroušić Seljak, Barbara
2017-01-01
Automatic food image recognition systems are alleviating the process of food-intake estimation and dietary assessment. However, due to the nature of food images, their recognition is a particularly challenging task, which is why traditional approaches in the field have achieved a low classification accuracy. Deep neural networks have outperformed such solutions, and we present a novel approach to the problem of food and drink image detection and recognition that uses a newly-defined deep convolutional neural network architecture, called NutriNet. This architecture was tuned on a recognition dataset containing 225,953 512 × 512 pixel images of 520 different food and drink items from a broad spectrum of food groups, on which we achieved a classification accuracy of 86.72%, along with an accuracy of 94.47% on a detection dataset containing 130,517 images. We also performed a real-world test on a dataset of self-acquired images, combined with images from Parkinson’s disease patients, all taken using a smartphone camera, achieving a top-five accuracy of 55%, which is an encouraging result for real-world images. Additionally, we tested NutriNet on the University of Milano-Bicocca 2016 (UNIMIB2016) food image dataset, on which we improved upon the provided baseline recognition result. An online training component was implemented to continually fine-tune the food and drink recognition model on new images. The model is being used in practice as part of a mobile app for the dietary assessment of Parkinson’s disease patients. PMID:28653995
Porras-Alfaro, Andrea; Liu, Kuan-Liang; Kuske, Cheryl R; Xie, Gary
2014-02-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5' section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets.
Liu, Kuan-Liang; Kuske, Cheryl R.
2014-01-01
We compared the classification accuracy of two sections of the fungal internal transcribed spacer (ITS) region, individually and combined, and the 5′ section (about 600 bp) of the large-subunit rRNA (LSU), using a naive Bayesian classifier and BLASTN. A hand-curated ITS-LSU training set of 1,091 sequences and a larger training set of 8,967 ITS region sequences were used. Of the factors evaluated, database composition and quality had the largest effect on classification accuracy, followed by fragment size and use of a bootstrap cutoff to improve classification confidence. The naive Bayesian classifier and BLASTN gave similar results at higher taxonomic levels, but the classifier was faster and more accurate at the genus level when a bootstrap cutoff was used. All of the ITS and LSU sections performed well (>97.7% accuracy) at higher taxonomic ranks from kingdom to family, and differences between them were small at the genus level (within 0.66 to 1.23%). When full-length sequence sections were used, the LSU outperformed the ITS1 and ITS2 fragments at the genus level, but the ITS1 and ITS2 showed higher accuracy when smaller fragment sizes of the same length and a 50% bootstrap cutoff were used. In a comparison using the larger ITS training set, ITS1 and ITS2 had very similar accuracy classification for fragments between 100 and 200 bp. Collectively, the results show that any of the ITS or LSU sections we tested provided comparable classification accuracy to the genus level and underscore the need for larger and more diverse classification training sets. PMID:24242255
Modinos, Gemma; Mechelli, Andrea; Pettersson-Yeo, William; Allen, Paul; McGuire, Philip; Aleman, Andre
2013-01-01
We used Support Vector Machine (SVM) to perform multivariate pattern classification based on brain activation during emotional processing in healthy participants with subclinical depressive symptoms. Six-hundred undergraduate students completed the Beck Depression Inventory II (BDI-II). Two groups were subsequently formed: (i) subclinical (mild) mood disturbance (n = 17) and (ii) no mood disturbance (n = 17). Participants also completed a self-report questionnaire on subclinical psychotic symptoms, the Community Assessment of Psychic Experiences Questionnaire (CAPE) positive subscale. The functional magnetic resonance imaging (fMRI) paradigm entailed passive viewing of negative emotional and neutral scenes. The pattern of brain activity during emotional processing allowed correct group classification with an overall accuracy of 77% (p = 0.002), within a network of regions including the amygdala, insula, anterior cingulate cortex and medial prefrontal cortex. However, further analysis suggested that the classification accuracy could also be explained by subclinical psychotic symptom scores (correlation with SVM weights r = 0.459, p = 0.006). Psychosis proneness may thus be a confounding factor for neuroimaging studies in subclinical depression.
Clemans, Katherine H.; Musci, Rashelle J.; Leoutsakos, Jeannie-Marie S.; Ialongo, Nicholas S.
2014-01-01
Objective: This study compared the ability of teacher, parent, and peer reports of aggressive behavior in early childhood to accurately classify cases of maladaptive outcomes in late adolescence and early adulthood. Method: Weighted kappa analyses determined optimal cut points and relative classification accuracy among teacher, parent, and peer reports of aggression assessed for 691 students (54% male; 84% African American, 13% White) in the fall of first grade. Outcomes included antisocial personality, substance use, incarceration history, risky sexual behavior, and failure to graduate from high school on time. Results: Peer reports were the most accurate classifier of all outcomes in the full sample. For most outcomes, the addition of teacher or parent reports did not improve overall classification accuracy once peer reports were accounted for. Additional gender-specific and adjusted kappa analyses supported the superior classification utility of the peer report measure. Conclusion: The results suggest that peer reports provided the most useful classification information of the three aggression measures. Implications for targeted intervention efforts which use screening measures to identify at-risk children are discussed. PMID:24512126
Van Cott, Andrew; Hastings, Charles E; Landsiedel, Robert; Kolle, Susanne; Stinchcombe, Stefan
2018-02-01
In vivo acute systemic testing is a regulatory requirement for agrochemical formulations. GHS specifies an alternative computational approach (GHS additivity formula) for calculating the acute toxicity of mixtures. We collected acute systemic toxicity data from formulations that contained one of several acutely-toxic active ingredients. The resulting acute data set includes 210 formulations tested for oral toxicity, 128 formulations tested for inhalation toxicity and 31 formulations tested for dermal toxicity. The GHS additivity formula was applied to each of these formulations and compared with the experimental in vivo result. In the acute oral assay, the GHS additivity formula misclassified 110 formulations using the GHS classification criteria (48% accuracy) and 119 formulations using the USEPA classification criteria (43% accuracy). With acute inhalation, the GHS additivity formula misclassified 50 formulations using the GHS classification criteria (61% accuracy) and 34 formulations using the USEPA classification criteria (73% accuracy). For acute dermal toxicity, the GHS additivity formula misclassified 16 formulations using the GHS classification criteria (48% accuracy) and 20 formulations using the USEPA classification criteria (36% accuracy). This data indicates the acute systemic toxicity of many formulations is not the sum of the ingredients' toxicity (additivity); but rather, ingredients in a formulation can interact to result in lower or higher toxicity than predicted by the GHS additivity formula. Copyright © 2018 Elsevier Inc. All rights reserved.
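The additivity calculation referred to in this abstract can be illustrated with a minimal sketch. It assumes the basic form of the GHS formula, 100 / ATE_mix = Σ (C_i / ATE_i), applied only to ingredients with known acute toxicity, and it ignores the correction GHS applies when a large share of the mixture has unknown toxicity. The function names and the ingredient values are hypothetical and are not taken from the study; the oral category cutoffs are the commonly cited GHS values in mg/kg body weight.

```python
# Sketch of the basic GHS additivity formula: 100 / ATE_mix = sum(C_i / ATE_i).
# Ingredient values below are hypothetical, not data from the study.

# Upper bounds (mg/kg body weight) of GHS acute oral toxicity categories 1-5.
GHS_ORAL_CUTOFFS = [(5, 1), (50, 2), (300, 3), (2000, 4), (5000, 5)]


def ate_mix(ingredients):
    """Acute toxicity estimate of a mixture.

    ingredients: list of (concentration_percent, ATE) pairs, one per
    ingredient with known acute toxicity.
    """
    return 100.0 / sum(conc / ate for conc, ate in ingredients)


def ghs_oral_category(ate):
    """Return the GHS oral category (1-5), or None if above all cutoffs."""
    for upper, category in GHS_ORAL_CUTOFFS:
        if ate <= upper:
            return category
    return None  # not classified


# Hypothetical formulation: 20% of an active ingredient with LD50 100 mg/kg
# and 80% of co-formulants with a combined LD50 of 4000 mg/kg.
formulation = [(20.0, 100.0), (80.0, 4000.0)]
mix = ate_mix(formulation)
print(f"ATE_mix = {mix:.1f} mg/kg, category {ghs_oral_category(mix)}")
```

A misclassification in the sense of the abstract would be a case where the category computed this way differs from the category assigned from the in vivo result for the same formulation.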
Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun
2016-04-01
This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR), k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB), as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. Although the six substances considered have similar optical properties, their complex insertion loss in the THz part of the spectrum differs significantly because of differences in their frequency-dependent THz extinction coefficients, refractive indices and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification based solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms; these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. First, the measurements on samples with a thickness of 2 mm were classified, then samples with thicknesses of 4 mm and after that 3 mm, and the success rate and consistency of each classifier were recorded. In addition, mixtures having thicknesses of 2 and 4 mm as well as mixtures of 2, 3 and 4 mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms, which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies.
The work establishes a general methodology for assessing the performance of other hyperspectral dataset classifiers on the basis of 2-D cross-correlations in far-infrared spectroscopy or other parts of the electromagnetic spectrum. It also advances the wider proliferation of automated THz imaging systems across new application areas e.g., biomedical imaging, industrial processing and quality control where interpretation of hyperspectral images is still under development. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Geelen, Christopher D.; Wijnhoven, Rob G. J.; Dubbelman, Gijs; de With, Peter H. N.
2015-03-01
This research considers gender classification in surveillance environments, typically involving low-resolution images and a large amount of viewpoint variations and occlusions. Gender classification is inherently difficult due to the large intra-class variation and interclass correlation. We have developed a gender classification system, which is successfully evaluated on two novel datasets, which realistically consider the above conditions, typical for surveillance. The system reaches a mean accuracy of up to 90% and approaches our human baseline of 92.6%, proving a high-quality gender classification system. We also present an in-depth discussion of the fundamental differences between SVM and RF classifiers. We conclude that balancing the degree of randomization in any classifier is required for the highest classification accuracy. For our problem, an RF-SVM hybrid classifier exploiting the combination of HSV and LBP features results in the highest classification accuracy of 89.9 ± 0.2%, while classification computation time is negligible compared to the detection time of pedestrians.
NASA Astrophysics Data System (ADS)
Tamimi, E.; Ebadi, H.; Kiani, A.
2017-09-01
Automatic building detection from High Spatial Resolution (HSR) images is one of the most important issues in Remote Sensing (RS). Because HSR images have a limited number of spectral bands, adding other features can improve accuracy. Adding features, however, increases the probability that some of them are dependent, which reduces accuracy. In addition, some parameters must be determined in Support Vector Machine (SVM) classification. Therefore, it is necessary to simultaneously determine classification parameters and select independent features according to image type. An optimization algorithm is an efficient way to solve this problem. On the other hand, pixel-based classification faces several challenges, such as producing salt-and-pepper results and high computational time in high dimensional data. Hence, in this paper, a novel method is proposed to optimize object-based SVM classification by applying a continuous Ant Colony Optimization (ACO) algorithm. The advantages of the proposed method are a relatively high automation level, independence of image scene and type, reduced post-processing for building edge reconstruction, and improved accuracy. The proposed method was compared with pixel-based SVM and Random Forest (RF) classification in terms of accuracy. In comparison with optimized pixel-based SVM classification, the results showed that the proposed method improved quality factor and overall accuracy by 17% and 10%, respectively. Also, the Kappa coefficient of the proposed method was 6% higher than that of RF classification. The processing time of the proposed method was relatively low because the unit of image analysis is the image object. These results showed the superiority of the proposed method in terms of time and accuracy.
Kaufmann, Liane; Huber, Stefan; Mayer, Daniel; Moeller, Korbinian; Marksteiner, Josef
2018-04-01
Adverse effects of heavy drinking on cognition have frequently been reported. In the present study, we systematically examined for the first time whether clinical neuropsychological assessments may be sensitive to alcohol abuse in elderly patients with suspected minor neurocognitive disorder. A total of 144 elderly with and without alcohol abuse (each group n=72; mean age 66.7 years) were selected from a patient pool of n=738 by applying propensity score matching (a statistical method allowing to match participants in experimental and control group by balancing various covariates to reduce selection bias). Accordingly, study groups were almost perfectly matched regarding age, education, gender, and Mini Mental State Examination score. Neuropsychological performance was measured using the CERAD (Consortium to Establish a Registry for Alzheimer's Disease). Classification analyses (i.e., decision tree and boosted trees models) were conducted to examine whether CERAD variables or total score contributed to group classification. Decision tree models disclosed that groups could be reliably classified based on the CERAD variables "Word List Discriminability" (tapping verbal recognition memory, 64% classification accuracy) and "Trail Making Test A" (measuring visuo-motor speed, 59% classification accuracy). Boosted tree analyses further indicated the sensitivity of "Word List Recall" (measuring free verbal recall) for discriminating elderly with versus without a history of alcohol abuse. This indicates that specific CERAD variables seem to be sensitive to alcohol-related cognitive dysfunctions in elderly patients with suspected minor neurocognitive disorder. (JINS, 2018, 24, 360-371).
Friesz, Aaron M.; Wylie, Bruce K.; Howard, Daniel M.
2017-01-01
Crop cover maps have become widely used in a range of research applications. Multiple crop cover maps have been developed to suit particular research interests. The National Agricultural Statistics Service (NASS) Cropland Data Layers (CDL) are a series of commonly used crop cover maps for the conterminous United States (CONUS) that span from 2008 to 2013. In this investigation, we sought to contribute to the availability of consistent CONUS crop cover maps by extending temporal coverage of the NASS CDL archive back eight additional years to 2000 by creating annual NASS CDL-like crop cover maps derived from a classification tree model algorithm. We used over 11 million records to train a classification tree algorithm and develop a crop classification model (CCM). The model was used to create crop cover maps for the CONUS for years 2000–2013 at 250 m spatial resolution. The CCM and the maps for years 2008–2013 were assessed for accuracy relative to resampled NASS CDLs. The CCM performed well against a withheld test data set with a model prediction accuracy of over 90%. The assessment of the crop cover maps indicated that the model performed well spatially, placing crop cover pixels within their known domains; however, the model did show a bias towards the ‘Other’ crop cover class, which caused frequent misclassifications of pixels around the periphery of large crop cover patch clusters and of pixels that form small, sparsely dispersed crop cover patches.
Misra, A; Balaji, R
2015-07-01
The coastal zone along the districts of Surat, Navsari, and Valsad in southern Gujarat, India, is reported to be facing serious environmental challenges in the form of shoreline erosion, wetland loss, and man-made encroachments. This study assesses the decadal land use/ land cover (LULC) changes in these three districts for the years 1990, 2001, and 2014 using satellite datasets of Landsat TM, ETM, and OLI. The LULC changes are identified by using band ratios as a pre-classification step, followed by implementation of hybrid classification (a combination of supervised and unsupervised classification). An accuracy assessment is carried out for each dataset, and the overall accuracy ranges from 90 to 95%. It is observed that the spatial extents of aquaculture, urban built-up, and barren classes have appreciated over time, whereas the coverage of mudflats has depreciated due to rapid urbanization. The changes in the shoreline of these districts have also been analyzed for the same years, and significant changes are found in the form of shoreline erosion. The LULC maps prepared as well as the shoreline change analysis done for this study area will enable the local decision makers to adopt better land-use planning and shoreline protection measures, which will further aid in sustainable future developments in this region.
Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok
2016-12-05
High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
Zmiri, Dror; Shahar, Yuval; Taieb-Maimon, Meirav
2012-04-01
To test the feasibility of classifying emergency department patients into severity grades using data mining methods. Emergency department records of 402 patients were classified into five severity grades by two expert physicians. The Naïve Bayes and C4.5 algorithms were applied to produce classifiers from patient data into severity grades. The classifiers' results over several subsets of the data were compared with the physicians' assessments, with a random classifier, and with a classifier that selects the maximal-prevalence class. Outcome measures were positive predictive value, multiple-class extensions of sensitivity and specificity combinations, and entropy change. The mean accuracy of the data mining classifiers was 52.94 ± 5.89%, significantly better (P < 0.05) than the mean accuracy of a random classifier (34.60 ± 2.40%). The entropy of the input data sets was reduced through classification by a mean of 10.1%. Allowing for classification deviations of one severity grade led to mean accuracy of 85.42 ± 1.42%. The classifiers' accuracy in that case was similar to the physicians' consensus rate. Learning from consensus records led to better performance. Reducing the number of severity grades improved results in certain cases. The performance of the Naïve Bayes and C4.5 algorithms was similar; in unbalanced data sets, Naïve Bayes performed better. It is possible to produce a computerized classification model for the severity grade of triage patients, using data mining methods. Learning from patient records regarding which there is a consensus of several physicians is preferable to learning from each physician's patients. Either Naïve Bayes or C4.5 can be used; Naïve Bayes is preferable for unbalanced data sets. An ambiguity in the intermediate severity grades seems to hamper both the physicians' agreement and the classifiers' accuracy. © 2010 Blackwell Publishing Ltd.
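The comparisons described in this abstract (exact accuracy, accuracy allowing a deviation of one severity grade, and a maximal-prevalence baseline) can be sketched as follows. This is only an illustration under stated assumptions: the grade lists are invented, the helper names are hypothetical, and the study's own evaluation procedure is not described in enough detail here to reproduce it.

```python
from collections import Counter


def accuracy(y_true, y_pred, tolerance=0):
    """Fraction of cases whose predicted grade is within `tolerance`
    grades of the reference grade (tolerance=0 means exact match)."""
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def majority_baseline(y_true):
    """Predictions of a classifier that always picks the most prevalent grade."""
    most_common_grade, _ = Counter(y_true).most_common(1)[0]
    return [most_common_grade] * len(y_true)


# Hypothetical severity grades (1-5) for eight patients.
physician = [1, 2, 3, 3, 3, 4, 5, 2]
model = [1, 3, 3, 2, 3, 4, 3, 2]

print("exact accuracy:     ", accuracy(physician, model))
print("within one grade:   ", accuracy(physician, model, tolerance=1))
print("majority baseline:  ", accuracy(physician, majority_baseline(physician)))
```

Treating the grades as ordered integers is what makes the one-grade tolerance meaningful; for unordered classes only the exact-match form applies.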
Evaluation of an Algorithm to Predict Menstrual-Cycle Phase at the Time of Injury.
Tourville, Timothy W; Shultz, Sandra J; Vacek, Pamela M; Knudsen, Emily J; Bernstein, Ira M; Tourville, Kelly J; Hardy, Daniel M; Johnson, Robert J; Slauterbeck, James R; Beynnon, Bruce D
2016-01-01
Women are 2 to 8 times more likely to sustain an anterior cruciate ligament (ACL) injury than men, and previous studies indicated an increased risk for injury during the preovulatory phase of the menstrual cycle (MC). However, investigations of risk rely on retrospective classification of MC phase, and no tools for this have been validated. To evaluate the accuracy of an algorithm for retrospectively classifying MC phase at the time of a mock injury based on MC history and salivary progesterone (P4) concentration. Descriptive laboratory study. Research laboratory. Thirty-one healthy female collegiate athletes (age range, 18-24 years) provided serum or saliva (or both) samples at 8 visits over 1 complete MC. Self-reported MC information was obtained on a randomized date (1-45 days) after mock injury, which is the typical timeframe in which researchers have access to ACL-injured study participants. The MC phase was classified using the algorithm as applied in a stand-alone computational fashion and also by 4 clinical experts using the algorithm and additional subjective hormonal history information to help inform their decision. To assess algorithm accuracy, phase classifications were compared with the actual MC phase at the time of mock injury (ascertained using urinary luteinizing hormone tests and serial serum P4 samples). Clinical expert and computed classifications were compared using κ statistics. Fourteen participants (45%) experienced anovulatory cycles. The algorithm correctly classified MC phase for 23 participants (74%): 22 (76%) of 29 who were preovulatory/anovulatory and 1 (50%) of 2 who were postovulatory. Agreement between expert and algorithm classifications ranged from 80.6% (κ = 0.50) to 93% (κ = 0.83). Classifications based on same-day saliva sample and optimal P4 threshold were the same as those based on MC history alone (87.1% correct). Algorithm accuracy varied during the MC but at no time were both sensitivity and specificity levels acceptable. 
These findings raise concerns about the accuracy of previous retrospective MC-phase classification systems, particularly in a population with a high occurrence of anovulatory cycles.
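The agreement figures above pair a raw percentage with a κ value. A minimal sketch of Cohen's kappa for two raters is given below to show how the two relate; the label lists are invented for illustration and the function name is an assumption, not something taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement).

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both raters assign every case to the same single category.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical phase labels for ten participants.
expert = ["pre"] * 6 + ["post"] * 4
algorithm = ["pre"] * 5 + ["post"] + ["post"] * 3 + ["pre"]

print(round(cohens_kappa(expert, algorithm), 3))
```

In this invented example the raters agree on 80% of cases, yet kappa is noticeably lower because a good share of that agreement would be expected by chance given how often each rater uses each label.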
NASA Astrophysics Data System (ADS)
Zhu, Zhe; Gallant, Alisa L.; Woodcock, Curtis E.; Pengra, Bruce; Olofsson, Pontus; Loveland, Thomas R.; Jin, Suming; Dahal, Devendra; Yang, Limin; Auch, Roger F.
2016-12-01
The U.S. Geological Survey's Land Change Monitoring, Assessment, and Projection (LCMAP) initiative is a new end-to-end capability to continuously track and characterize changes in land cover, use, and condition to better support research and applications relevant to resource management and environmental change. Among the LCMAP product suite are annual land cover maps that will be available to the public. This paper describes an approach to optimize the selection of training and auxiliary data for deriving the thematic land cover maps based on all available clear observations from Landsats 4-8. Training data were selected from map products of the U.S. Geological Survey's Land Cover Trends project. The Random Forest classifier was applied for different classification scenarios based on the Continuous Change Detection and Classification (CCDC) algorithm. We found that extracting training data proportionally to the occurrence of land cover classes was superior to an equal distribution of training data per class, and suggest using a total of 20,000 training pixels to classify an area about the size of a Landsat scene. The problem of unbalanced training data was alleviated by extracting a minimum of 600 training pixels and a maximum of 8000 training pixels per class. We additionally explored removing outliers contained within the training data based on their spectral and spatial criteria, but observed no significant improvement in classification results. We also tested the importance of different types of auxiliary data that were available for the conterminous United States, including: (a) five variables used by the National Land Cover Database, (b) three variables from the cloud screening "Function of mask" (Fmask) statistics, and (c) two variables from the change detection results of CCDC. 
We found that auxiliary variables such as a Digital Elevation Model and its derivatives (aspect, position index, and slope), potential wetland index, water probability, snow probability, and cloud probability improved the accuracy of land cover classification. Compared to the original strategy of the CCDC algorithm (500 pixels per class), the use of the optimal strategy improved the classification accuracies substantially (15-percentage point increase in overall accuracy and 4-percentage point increase in minimum accuracy).
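The sampling strategy summarized above (proportional allocation of about 20,000 training pixels, bounded by 600 and 8000 per class) can be sketched roughly as follows. The class names and fractions are hypothetical, and the simple clip-after-scaling rule is an assumption; the abstract does not say how the bounds were applied or whether the total was rebalanced afterwards.

```python
def allocate_training_pixels(class_fractions, total=20000, minimum=600, maximum=8000):
    """Proportional allocation of training pixels, clipped to [minimum, maximum].

    class_fractions: dict mapping class name to its share of the mapped area.
    Note that clipping means the allocations need not sum exactly to `total`.
    """
    allocation = {}
    for name, fraction in class_fractions.items():
        proportional = round(total * fraction)
        allocation[name] = min(maximum, max(minimum, proportional))
    return allocation


# Hypothetical land cover shares for one Landsat-scene-sized area.
fractions = {"forest": 0.50, "cropland": 0.30, "developed": 0.18, "water": 0.02}
print(allocate_training_pixels(fractions))
```

Here the dominant class is capped at the maximum and the rare class is raised to the minimum, which is the effect the abstract credits with alleviating the unbalanced training data problem.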
Cardiac arrhythmia beat classification using DOST and PSO tuned SVM.
Raj, Sandeep; Ray, Kailash Chandra; Shankar, Om
2016-11-01
The increase in the number of deaths due to cardiovascular diseases (CVDs) has drawn significant attention to the study of electrocardiogram (ECG) signals. These ECG signals are studied by experienced cardiologists for accurate and proper diagnosis, but this becomes difficult and time-consuming for long-term recordings. Various signal processing techniques have been studied to analyze the ECG signal, but they bear limitations due to the non-stationary behavior of ECG signals. Hence, this study aims to improve the classification accuracy rate and provide an automated diagnostic solution for the detection of cardiac arrhythmias. The proposed methodology consists of four stages, i.e., filtering, R-peak detection, feature extraction and classification. In this study, a wavelet-based approach is used to filter the raw ECG signal, whereas the Pan-Tompkins algorithm is used for detecting the R-peak inside the ECG signal. In the feature extraction stage, the discrete orthogonal Stockwell transform (DOST) approach is presented for an efficient time-frequency representation (i.e., morphological descriptors) of a time domain signal; it retains the absolute phase information to distinguish the various non-stationary behaviors of ECG signals. Moreover, these morphological descriptors are further reduced to a lower dimensional space by using principal component analysis and combined with the dynamic features (i.e., those based on the RR-interval of the ECG signals) of the input signal. This combination of two different kinds of descriptors represents each feature set of an input signal that is utilized for classification into subsequent categories by employing PSO tuned support vector machines (SVM).
The proposed methodology is validated on the baseline MIT-BIH arrhythmia database and evaluated under two assessment schemes, yielding an improved overall accuracy, relative to the state of the art, of 99.18% for sixteen classes in the category-based scheme and 89.10% for five classes (mapped according to the AAMI standard) in the patient-based scheme. The results reported are further compared to the existing methodologies in the literature. The proposed feature representation of cardiac signals based on symmetrical features, along with the PSO based optimization technique for the SVM classifier, reported an improved classification accuracy in both the assessment schemes evaluated on the benchmark MIT-BIH arrhythmia database and hence can be utilized for automated computer-aided diagnosis of cardiac arrhythmia beats. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification.
The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multivariate analysis of variance.
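The error-matrix layout described in this abstract (rows as interpretation, columns as verification, diagonal as correct classifications) can be made concrete with a small sketch. The counts and the function name are hypothetical; only the row/column convention follows the text.

```python
def summarize_error_matrix(matrix):
    """Overall accuracy plus per-class commission and omission error rates.

    matrix[i][j] counts sample units interpreted as class i (row) and
    verified as class j (column). Commission errors are the off-diagonal
    share of each row; omission errors are the off-diagonal share of
    each column.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    commission = [(row_totals[i] - matrix[i][i]) / row_totals[i] for i in range(n)]
    omission = [(col_totals[j] - matrix[j][j]) / col_totals[j] for j in range(n)]
    return correct / total, commission, omission


# Hypothetical three-class error matrix (150 sample units).
error_matrix = [
    [45, 4, 1],
    [6, 36, 8],
    [9, 0, 41],
]

overall, commission, omission = summarize_error_matrix(error_matrix)
print("overall accuracy:", round(overall, 3))
print("commission errors:", commission)
print("omission errors:", omission)
```

One minus the commission error of a class corresponds to what later literature calls user's accuracy, and one minus the omission error to producer's accuracy.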
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-11-22
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer's and user's accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas.
Identifying At-Risk Students for Early Reading Intervention: Challenges and Possible Solutions
ERIC Educational Resources Information Center
McAlenney, Athena Lentini; Coyne, Michael D.
2011-01-01
Accurate identification of at-risk kindergarten and 1st-grade students through early reading screening is an essential element of responsiveness to intervention models of reading instruction. The authors consider predictive validity and classification accuracy of early reading screening assessments with attention to sensitivity and specificity.…
Practical Issues in Estimating Classification Accuracy and Consistency with R Package cacIRT
ERIC Educational Resources Information Center
Lathrop, Quinn N.
2015-01-01
There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remain decisions a researcher faces when…
HEp-2 cell image classification method based on very deep convolutional networks with small datasets
NASA Astrophysics Data System (ADS)
Lu, Mengchi; Gao, Long; Guo, Xifeng; Liu, Qiang; Yin, Jianping
2017-07-01
Classification of staining patterns in Human Epithelial-2 (HEp-2) cell images has been widely used to identify autoimmune diseases by the anti-nuclear antibody (ANA) test in the Indirect Immunofluorescence (IIF) protocol. Because manual testing is time consuming, subjective and labor intensive, image-based Computer Aided Diagnosis (CAD) systems for HEp-2 cell classification are being developed. However, recently proposed methods mostly rely on manual feature extraction and have low accuracy. Besides, the available benchmark datasets are small, which makes them poorly suited to deep learning methods. This issue directly affects the accuracy of cell classification even after data augmentation. To address these issues, this paper presents a high accuracy automatic HEp-2 cell classification method for small datasets that utilizes very deep convolutional networks (VGGNet). Specifically, the proposed method consists of three main phases, namely image preprocessing, feature extraction and classification. Moreover, an improved VGGNet is presented to address the challenges of small-scale datasets. Experimental results over two benchmark datasets demonstrate that the proposed method achieves superior performance in terms of accuracy compared with existing methods.
Greene, Barry R; Redmond, Stephen J; Caulfield, Brian
2017-05-01
Falls are the leading global cause of accidental death and disability in older adults and are the most common cause of injury and hospitalization. Accurate, early identification of patients at risk of falling could lead to timely intervention and a reduction in the incidence of fall-related injury and associated costs. We report a statistical method for fall risk assessment using standard clinical fall risk factors (N = 748). We also report a means of improving this method by automatically combining it with a fall risk assessment algorithm based on inertial sensor data and the timed-up-and-go test. Furthermore, we provide validation data on the sensor-based fall risk assessment method using a statistically independent dataset. Results obtained using cross-validation on a sample of 292 community dwelling older adults suggest that a combined clinical and sensor-based approach yields a classification accuracy of 76.0%, compared to either 73.6% for sensor-based assessment alone, or 68.8% for clinical risk factors alone. Increasing the cohort size by adding an additional 130 subjects from a separate recruitment wave (N = 422), and applying the same model building and validation method, resulted in a decrease in classification performance (68.5% for combined classifier, 66.8% for sensor data alone, and 58.5% for clinical data alone). This suggests that heterogeneity between cohorts may be a major challenge when attempting to develop fall risk assessment algorithms which generalize well. Independent validation of the sensor-based fall risk assessment algorithm on an independent cohort of 22 community dwelling older adults yielded a classification accuracy of 72.7%. Results suggest that the present method compares well to previously reported sensor-based fall risk assessment methods in assessing fall risk.
Implementation of objective fall risk assessment methods on a large scale has the potential to improve quality of care and lead to a reduction in associated hospital costs, due to fewer admissions and reduced injuries due to falling.
On-line analysis of algae in water by discrete three-dimensional fluorescence spectroscopy.
Zhao, Nanjing; Zhang, Xiaoling; Yin, Gaofang; Yang, Ruifang; Hu, Li; Chen, Shuang; Liu, Jianguo; Liu, Wenqing
2018-03-19
In view of the problem of on-line measurement for algae classification, this work studied a method for algae classification and concentration determination based on discrete three-dimensional fluorescence spectra. The discrete three-dimensional fluorescence spectra of twelve common species of algae belonging to five categories were analyzed, discrete three-dimensional standard spectra of the five categories were built, and recognition, classification and concentration prediction of the algae categories were realized by coupling the discrete three-dimensional fluorescence spectra with non-negative weighted least squares linear regression analysis. The results show that similarities between the discrete three-dimensional standard spectra of different categories were reduced and that the accuracies of recognition, classification and concentration prediction of the algae categories were significantly improved. Compared with the chlorophyll a fluorescence excitation spectra method, the discrete three-dimensional fluorescence spectra improved the recognition accuracy rate in pure samples by 1.38%, and the recovery rate and classification accuracy in pure diatom samples by 34.1% and 46.8%, respectively; for mixed samples, the recognition accuracy rate improved by 26.1%, the recovery rate of mixed samples with Chlorophyta by 37.8%, and the classification accuracy of mixed samples with diatoms by 54.6%.
Maize Cropping Systems Mapping Using RapidEye Observations in Agro-Ecological Landscapes in Kenya.
Richard, Kyalo; Abdel-Rahman, Elfatih M; Subramanian, Sevgan; Nyasani, Johnson O; Thiel, Michael; Jozani, Hosein; Borgemeister, Christian; Landmann, Tobias
2017-11-03
Cropping systems information on explicit scales is an important but rarely available variable in many crops modeling routines and of utmost importance for understanding pests and disease propagation mechanisms in agro-ecological landscapes. In this study, high spatial and temporal resolution RapidEye bio-temporal data were utilized within a novel 2-step hierarchical random forest (RF) classification approach to map areas of mono- and mixed maize cropping systems. A small-scale maize farming site in Machakos County, Kenya was used as a study site. Within the study site, field data was collected during the satellite acquisition period on general land use/land cover (LULC) and the two cropping systems. Firstly, non-cropland areas were masked out from other land use/land cover using the LULC mapping result. Subsequently an optimized RF model was applied to the cropland layer to map the two cropping systems (2nd classification step). An overall accuracy of 93% was attained for the LULC classification, while the class accuracies (PA: producer's accuracy and UA: user's accuracy) for the two cropping systems were consistently above 85%. We concluded that explicit mapping of different cropping systems is feasible in complex and highly fragmented agro-ecological landscapes if high resolution and multi-temporal satellite data such as 5 m RapidEye data is employed. Further research is needed on the feasibility of using freely available 10-20 m Sentinel-2 data for wide-area assessment of cropping systems as an important variable in numerous crop productivity models.
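The overall, producer's and user's accuracies quoted in the abstract above are all read off a confusion matrix. The following is a minimal sketch of that arithmetic, assuming rows hold the mapped classes and columns hold the reference classes. The function name and the counts are invented for illustration and are not taken from the study.

```python
def accuracy_measures(matrix):
    """Overall, producer's and user's accuracy from a square confusion matrix.

    Assumed layout (not from the source): matrix[i][j] is the number of
    samples mapped as class i whose reference class is j.
    Every class is assumed to have at least one mapped and one reference sample.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    overall = correct / total
    # User's accuracy: correct samples divided by the row (mapped) total.
    users = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    # Producer's accuracy: correct samples divided by the column (reference) total.
    producers = [
        matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)
    ]
    return overall, producers, users


# Hypothetical two-class example (e.g. mono- vs. mixed cropping).
example = [[45, 5],
           [10, 40]]
overall, producers, users = accuracy_measures(example)
print(overall, producers, users)
```

User's accuracy answers "how often is a mapped label right?", while producer's accuracy answers "how often is a reference class found?", which is why the two can differ for the same class.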
Validation of accelerometer cut points in toddlers with and without cerebral palsy.
Oftedal, Stina; Bell, Kristie L; Davies, Peter S W; Ware, Robert S; Boyd, Roslyn N
2014-09-01
The purpose of this study was to validate uni- and triaxial ActiGraph cut points for sedentary time in toddlers with cerebral palsy (CP) and typically developing children (TDC). Children (n = 103, 61 boys, mean age = 2 yr, SD = 6 months, range = 1 yr 6 months-3 yr) were divided into calibration (n = 65) and validation (n = 38) samples with separate analyses for TDC (n = 28) and ambulant (Gross Motor Function Classification System I-III, n = 51) and nonambulant (Gross Motor Function Classification System IV-V, n = 25) children with CP. An ActiGraph was worn during a videotaped assessment. Behavior was coded as sedentary or nonsedentary. Receiver operating characteristic-area under the curve analysis determined the classification accuracy of accelerometer data. Predictive validity was determined using the Bland-Altman analysis. Classification accuracy for uniaxial data was fair for the ambulatory CP and TDC group but poor for the nonambulatory CP group. Triaxial data showed good classification accuracy for all groups. The uniaxial ambulatory CP and TDC cut points significantly overestimated sedentary time (bias = -10.5%, 95% limits of agreement [LoA] = -30.2% to 9.1%; bias = -17.3%, 95% LoA = -44.3% to 8.3%). The triaxial ambulatory and nonambulatory CP and TDC cut points provided accurate group-level measures of sedentary time (bias = -1.5%, 95% LoA = -20% to 16.8%; bias = 2.1%, 95% LoA = -17.3% to 21.5%; bias = -5.1%, 95% LoA = -27.5% to 16.1%). Triaxial accelerometers provide useful group-level measures of sedentary time in children with CP across the spectrum of functional abilities and TDC. Uniaxial cut points are not recommended.
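The bias and 95% limits of agreement (LoA) reported above come from a Bland-Altman analysis of paired measurements. The sketch below shows the usual calculation, assuming the differences are taken as first method minus second method and that the limits are bias ± 1.96 sample standard deviations. The sign convention, the function name and the numbers are assumptions for illustration, not the study's data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    Differences are method_a - method_b (an assumed convention).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical sedentary-time percentages: observed vs. accelerometer estimate.
observed = [50, 60, 70, 80]
estimated = [55, 62, 78, 81]
bias, lower, upper = bland_altman(observed, estimated)
print(bias, lower, upper)
```

With this convention a negative bias means the second method reads higher on average than the first.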
NASA Astrophysics Data System (ADS)
Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco
2016-10-01
The classification of remote sensing hyperspectral images for land cover applications is a very intensive topic. In the case of supervised classification, Support Vector Machines (SVMs) play a dominant role. Recently, the Extreme Learning Machine algorithm (ELM) has been extensively used. The classification scheme previously published by the authors, and called WT-EMP, introduces spatial information in the classification process by means of an Extended Morphological Profile (EMP) that is created from features extracted by wavelets. In addition, the hyperspectral image is denoised in the 2-D spatial domain, also using wavelets and it is joined to the EMP via a stacked vector. In this paper, the scheme is improved achieving two goals. The first one is to reduce the classification time while preserving the accuracy of the classification by using ELM instead of SVM. The second one is to improve the accuracy results by performing not only a 2-D denoising for every spectral band, but also a previous additional 1-D spectral signature denoising applied to each pixel vector of the image. For each denoising the image is transformed by applying a 1-D or 2-D wavelet transform, and then a NeighShrink thresholding is applied. Improvements in terms of classification accuracy are obtained, especially for images with close regions in the classification reference map, because in these cases the accuracy of the classification in the edges between classes is more relevant.
Computer-aided diagnosis of pulmonary diseases using x-ray darkfield radiography
NASA Astrophysics Data System (ADS)
Einarsdóttir, Hildur; Yaroshenko, Andre; Velroyen, Astrid; Bech, Martin; Hellbach, Katharina; Auweter, Sigrid; Yildirim, Önder; Meinel, Felix G.; Eickelberg, Oliver; Reiser, Maximilian; Larsen, Rasmus; Kjær Ersbøll, Bjarne; Pfeiffer, Franz
2015-12-01
In this work we develop a computer-aided diagnosis (CAD) scheme for classification of pulmonary disease for grating-based x-ray radiography. In addition to conventional transmission radiography, the grating-based technique provides a dark-field imaging modality, which utilizes the scattering properties of the x-rays. This modality has shown great potential for diagnosing early stage emphysema and fibrosis in mouse lungs in vivo. The CAD scheme is developed to assist radiologists and other medical experts to develop new diagnostic methods when evaluating grating-based images. The scheme consists of three stages: (i) automatic lung segmentation; (ii) feature extraction from lung shape and dark-field image intensities; (iii) classification between healthy, emphysema and fibrosis lungs. A study of 102 mice was conducted with 34 healthy, 52 emphysema and 16 fibrosis subjects. Each image was manually annotated to build an experimental dataset. System performance was assessed by: (i) determining the quality of the segmentations; (ii) validating emphysema and fibrosis recognition by a linear support vector machine using leave-one-out cross-validation. In terms of segmentation quality, we obtained an overlap percentage (Ω) 92.63 ± 3.65%, Dice Similarity Coefficient (DSC) 89.74 ± 8.84% and Jaccard Similarity Coefficient 82.39 ± 12.62%. For classification, the accuracy, sensitivity and specificity of diseased lung recognition was 100%. Classification between emphysema and fibrosis resulted in an accuracy of 93%, whilst the sensitivity was 94% and specificity 88%. In addition to the automatic classification of lungs, deviation maps created by the CAD scheme provide a visual aid for medical experts to further assess the severity of pulmonary disease in the lung, and highlights regions affected.
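The segmentation-quality figures above use the Dice and Jaccard similarity coefficients. As a minimal sketch, assuming each segmentation is given as a set of pixel coordinates, the two coefficients can be computed as follows. The function name and the toy masks are illustrative only.

```python
def dice_jaccard(mask_a, mask_b):
    """Dice and Jaccard coefficients for two masks given as sets of pixels."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        raise ValueError("both masks are empty; the coefficients are undefined")
    intersection = len(a & b)
    dice = 2 * intersection / (len(a) + len(b))
    jaccard = intersection / len(a | b)
    return dice, jaccard


# Toy masks: a 2x2 block and an overlapping 3-pixel region.
automatic = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual = {(0, 1), (1, 1), (1, 2)}
dice, jaccard = dice_jaccard(automatic, manual)
print(dice, jaccard)
```

For a single pair of masks the two measures are related by J = D / (2 - D), so they rank segmentations identically and differ only in scale.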
Comparing Features for Classification of MEG Responses to Motor Imagery
Halme, Hanna-Leena; Parkkonen, Lauri
2016-01-01
Background: Motor imagery (MI) with real-time neurofeedback could be a viable approach, e.g., in rehabilitation of cerebral stroke. Magnetoencephalography (MEG) noninvasively measures electric brain activity at high temporal resolution and is well-suited for recording oscillatory brain signals. MI is known to modulate 10- and 20-Hz oscillations in the somatomotor system. In order to provide accurate feedback to the subject, the most relevant MI-related features should be extracted from MEG data. In this study, we evaluated several MEG signal features for discriminating between left- and right-hand MI and between MI and rest. Methods: MEG was measured from nine healthy participants imagining either left- or right-hand finger tapping according to visual cues. Data preprocessing, feature extraction and classification were performed offline. The evaluated MI-related features were power spectral density (PSD), Morlet wavelets, short-time Fourier transform (STFT), common spatial patterns (CSP), filter-bank common spatial patterns (FBCSP), spatio-spectral decomposition (SSD), and combined SSD+CSP, CSP+PSD, CSP+Morlet, and CSP+STFT. We also compared four classifiers applied to single trials using 5-fold cross-validation for evaluating the classification accuracy and its possible dependence on the classification algorithm. In addition, we estimated the inter-session left-vs-right accuracy for each subject. Results: The SSD+CSP combination yielded the best accuracy in both left-vs-right (mean 73.7%) and MI-vs-rest (mean 81.3%) classification. CSP+Morlet yielded the best mean accuracy in inter-session left-vs-right classification (mean 69.1%). There were large inter-subject differences in classification accuracy, and the level of the 20-Hz suppression correlated significantly with the subjective MI-vs-rest accuracy. Selection of the classification algorithm had only a minor effect on the results.
Conclusions: We obtained good accuracy in sensor-level decoding of MI from single-trial MEG data. Feature extraction methods utilizing both the spatial and spectral profile of MI-related signals provided the best classification results, suggesting good performance of these methods in an online MEG neurofeedback system. PMID:27992574
NASA Astrophysics Data System (ADS)
Piiroinen, Rami; Heiskanen, Janne; Mõttus, Matti; Pellikka, Petri
2015-07-01
Land use practices are changing at a fast pace in the tropics. In sub-Saharan Africa, forests, woodlands and bushlands are being transformed for agricultural use to produce food for the rapidly growing population. The objective of this study was to assess the prospects of mapping the common agricultural crops in a highly heterogeneous study area in south-eastern Kenya using high spatial and spectral resolution AisaEAGLE imaging spectroscopy data. Minimum noise fraction transformation was used to pack the coherent information into a smaller set of bands, and the data were classified with a support vector machine (SVM) algorithm. A total of 35 plant species were mapped in the field, and the seven most dominant ones were used as classification targets. Five of the targets were agricultural crops. The overall accuracy (OA) for the classification was 90.8%. To assess the possibility of excluding the remaining 28 plant species from the classification results, 10 different probability thresholds (PT) were tried with SVM. The impact of PT was assessed with validation polygons of all 35 mapped plant species. The results showed that as PT was increased, more pixels were excluded from non-target polygons than from the polygons of the seven classification targets. This increased the OA and reduced salt-and-pepper effects in the classification results. Very high spatial resolution imagery and a pixel-based classification approach worked well with small targets such as maize, while there was mixing of classes on the sides of the tree crowns.
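The probability-threshold (PT) step described above can be pictured as a rejection rule: a pixel keeps its most probable class only if that probability reaches the threshold, and is otherwise left unclassified. The sketch below assumes per-class probabilities are already available from the classifier. The function name, class labels and probability values are hypothetical and are not the authors' implementation.

```python
def classify_with_threshold(probabilities, classes, threshold):
    """Return the most probable class, or None if it falls below the threshold."""
    best = max(range(len(classes)), key=lambda k: probabilities[k])
    return classes[best] if probabilities[best] >= threshold else None


# Hypothetical class labels and per-pixel probability vectors.
classes = ["maize", "tree crop", "grass"]
pixels = [
    [0.85, 0.10, 0.05],  # confident: kept as "maize"
    [0.40, 0.35, 0.25],  # ambiguous: rejected at a threshold of 0.5
]
labels = [classify_with_threshold(p, classes, 0.5) for p in pixels]
print(labels)
```

Raising the threshold trades coverage for reliability: more pixels are left unlabeled, and the ones that remain are those the classifier is most certain about.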
Classification of leafy spurge with earth observing-1 advanced land imager
Stitt, S.; Root, R.; Brown, K.; Hager, S.; Mladinich, C.; Anderson, G.L.; Dudek, K.; Bustos, M.R.; Kokaly, R.
2006-01-01
Leafy spurge (Euphorbia esula L.) is an invasive exotic plant that can completely displace native plant communities. Automated techniques for monitoring the location and extent of leafy spurge, especially if available on a seasonal basis, could add greatly to the effectiveness of control measures. As part of a larger study including multiple sensors, this study examines the utility of mapping the location and extent of leafy spurge in Theodore Roosevelt National Park using Earth Observing-1 satellite Advanced Land Imager (ALI) scanner data. An unsupervised classification methodology was used producing accuracies in the range of 59% to 66%. Existing field studies, with their associated limitations, were used for identifying class membership and accuracy assessment. This sensor could be useful for broad landscape scale mapping of leafy spurge, from which control measures could be based.
Vegetation classification and distribution mapping report Mesa Verde National Park
Thomas, Kathryn A.; McTeague, Monica L.; Ogden, Lindsay; Floyd, M. Lisa; Schulz, Keith; Friesen, Beverly A.; Fancher, Tammy; Waltermire, Robert G.; Cully, Anne
2009-01-01
The classification and distribution mapping of the vegetation of Mesa Verde National Park (MEVE) and surrounding environment was achieved through a multi-agency effort between 2004 and 2007. The National Park Service’s Southern Colorado Plateau Network facilitated the team that conducted the work, which comprised the U.S. Geological Survey’s Southwest Biological Science Center, Fort Collins Research Center, and Rocky Mountain Geographic Science Center; Northern Arizona University; Prescott College; and NatureServe. The project team described 47 plant communities for MEVE, 34 of which were described from quantitative classification based on field-relevé data collected in 1993 and 2004. The team derived 13 additional plant communities from field observations during the photointerpretation phase of the project. The National Vegetation Classification Standard served as a framework for classifying these plant communities to the alliance and association level. Eleven of the 47 plant communities were classified as “park specials;” that is, plant communities with insufficient data to describe them as new alliances or associations. The project team also developed a spatial vegetation map database representing MEVE, with three different map-class schemas: base, group, and management map classes. The base map classes represent the finest level of spatial detail. Initial polygons were developed using Definiens Professional (at the time of our use, this software was called eCognition), assisted by interpretation of 1:12,000 true-color digital orthophoto quarter quadrangles (DOQQs). These polygons (base map classes) were labeled using manual photo interpretation of the DOQQs and 1:12,000 true-color aerial photography. Field visits verified interpretation concepts.
The vegetation map database includes 46 base map classes, which consist of associations, alliances, and park specials classified with quantitative analysis, additional associations and park specials noted during photointerpretation, and non-vegetated land cover, such as infrastructure, land use, and geological land cover. The base map classes consist of 5,007 polygons in the project area. A field-based accuracy assessment of the base map classes showed overall accuracy to be 43.5%. Seven map classes comprise 89.1% of the park vegetated land cover. The group map classes represent aggregations of the base map classes, approximating the group level of the National Vegetation Classification Standard, version 2 (Federal Geographic Data Committee 2007), and reflecting physiognomy and floristics. Terrestrial ecological systems, as described by NatureServe (Comer et al. 2003), were used as the first approximation of the group level. The project team identified 14 group map classes for this project. The overall accuracy of the group map classes was determined using the same accuracy assessment data as for the base map classes. The overall accuracy of the group representation of vegetation was 80.3%. In consultation with park staff, the team developed management map classes, consisting of park-defined groupings of base map classes intended to represent a balance between maintaining required accuracy and providing a focus on vegetation of particular interest or import to park managers. The 23 management map classes had an overall accuracy of 73.3%. While the main products of this project are the vegetation classification and the vegetation map database, a number of ancillary digital geographic information system and database products were also produced that can be used independently or to augment the main products. These products include shapefiles of the locations of field-collected data and relational databases of field-collected data.
Single-accelerometer-based daily physical activity classification.
Long, Xi; Yin, Bin; Aarts, Ronald M
2009-01-01
In this study, a single tri-axial accelerometer placed on the waist was used to record the acceleration data for human physical activity classification. The data collection involved 24 subjects performing daily real-life activities in a naturalistic environment without researchers' intervention. For the purpose of assessing customers' daily energy expenditure, walking, running, cycling, driving, and sports were chosen as target activities for classification. This study compared a Bayesian classification approach with a Decision Tree based approach. A Bayes classifier has the advantage of being more extensible, requiring little effort in classifier retraining and software update upon further expansion or modification of the target activities. Principal components analysis was applied to remove the correlation among features and to reduce the feature vector dimension. Experiments using leave-one-subject-out and 10-fold cross-validation protocols revealed a classification accuracy of approximately 80%, which was comparable with that obtained by a Decision Tree classifier.
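The leave-one-subject-out protocol mentioned above holds out all samples from one subject per fold, so that a classifier is never tested on a person it was trained on. The following is a minimal sketch of how such splits can be generated from a list of per-sample subject identifiers. The function name and the identifiers are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test


# Hypothetical subject label for each recorded sample.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
splits = list(leave_one_subject_out(subject_ids))
for subject, train, test in splits:
    print(subject, train, test)
```

Unlike ordinary 10-fold cross-validation, which may place samples from the same subject in both training and test folds, this scheme estimates performance on unseen people.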
Assessments of SENTINEL-2 Vegetation Red-Edge Spectral Bands for Improving Land Cover Classification
NASA Astrophysics Data System (ADS)
Qiu, S.; He, B.; Yin, C.; Liao, Z.
2017-09-01
The Multi Spectral Instrument (MSI) onboard Sentinel-2 can record the information in Vegetation Red-Edge (VRE) spectral domains. In this study, the performance of the VRE bands on improving land cover classification was evaluated based on a Sentinel-2A MSI image in East Texas, USA. Two classification scenarios were designed by excluding and including the VRE bands. A Random Forest (RF) classifier was used to generate land cover maps and evaluate the contributions of different spectral bands. The combination of VRE bands increased the overall classification accuracy by 1.40 %, which was statistically significant. Both confusion matrices and land cover maps indicated that the most beneficial increase was from vegetation-related land cover types, especially agriculture. Comparison of the relative importance of each band showed that the most beneficial VRE bands were Band 5 and Band 6. These results demonstrated the value of VRE bands for land cover classification.
NASA Astrophysics Data System (ADS)
Kim, Namkug; Seo, Joon Beom; Park, Sang Ok; Lee, Youngjoo; Lee, Jeongjin
2009-02-01
To evaluate the accuracy of computer aided differential diagnosis (CADD) between usual interstitial pneumonia (UIP) and nonspecific interstitial pneumonia (NSIP) at HRCT in comparison with that of a radiologist's decision. A computerized classification for six local disease patterns (normal, NL; ground-glass opacity, GGO; reticular opacity, RO; honeycombing, HC; emphysema, EM; and consolidation, CON) using texture/shape analyses and a SVM classifier at HRCT was used for pixel-by-pixel labeling on the whole lung area. The mode filter was applied on the results to reduce noise. Area fraction (AF) of each pattern, directional probabilistic density function (pdf) (dPDF: mean, SD, skewness of pdf in 3 directions: superior-inferior, anterior-posterior, central-peripheral), regional cluster distribution pattern (RCDP: number, mean, SD of clusters, mean, SD of centroid of clusters) were automatically evaluated. Spatially normalized left and right lungs were evaluated separately. Disease division index (DDI) on every combination of AFs and asymmetric index (AI) between left and right lung ((left-right)/left) were also evaluated. To assess the accuracy of the system, fifty-four HRCT data sets in patients with pathologically diagnosed UIP (n=26) and NSIP (n=28) were used. For the classification procedure, a CADD-SVM classifier with internal parameter optimization and sequential forward floating feature selection (SFFS) were employed. The accuracy was assessed by 5-fold cross-validation repeated 20 times. For comparison, two thoracic radiologists reviewed the whole HRCT images without clinical information and diagnosed each case as either UIP or NSIP. The accuracies of the radiologists' decisions were 0.75 and 0.87, respectively. The accuracies of the CADD system using the features of AF, dPDF, AI of dPDF, RDP, AI of RDP, DDI were 0.70, 0.79, 0.77, 0.80, 0.78, 0.81, respectively. The accuracy of optimized CADD using all features after SFFS was 0.91.
We developed the CADD system to differentiate between UIP and NSIP using automated assessment of the extent and distribution of regional disease patterns at HRCT.
Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks
Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.
2017-01-01
This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states, with data collected from 43 participants. The system employs autoregressive (AR) modeling as the feature extraction algorithm and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi-supervised learning method that combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low level; this prevents the network from overfitting and allows it to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that, using the AR feature extractor and DBN classifier, performance improves to a sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94, compared to ANN (sensitivity of 80.8%, specificity of 77.8%, accuracy of 79.3%, AUROC of 0.83) and BNN classifiers (sensitivity of 84.3%, specificity of 83%, accuracy of 83.6%, AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further, with a sensitivity of 93.9%, a specificity of 92.3%, an accuracy of 93.1%, and an AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5 percentage points over the ANN, BNN, and DBN classifiers, respectively. PMID:28326009
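The sensitivity, specificity and accuracy figures above are standard ratios of binary confusion counts. As a minimal sketch, assuming "fatigue" is treated as the positive class, they can be computed as follows. The function name and the counts are invented for illustration and are not the study's results.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from binary confusion counts."""
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 fatigue segments and 100 alert segments.
sensitivity, specificity, accuracy = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(sensitivity, specificity, accuracy)
```

With balanced classes, as in this toy example, accuracy is simply the mean of sensitivity and specificity.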
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has proved effective in recent years. However, current research focuses only on feature extraction and classifier design and does not consider instance selection. Earlier research by the authors showed that instance selection can improve classification accuracy. However, no attention has been paid to the relationship between speech samples and features until now. Therefore, this paper proposes a new PD diagnosis algorithm that simultaneously selects speech samples and features, based on a relevant feature weighting algorithm and a multiple kernel method, so as to find their synergy effects and thereby improve classification accuracy. Experimental results showed that the proposed algorithm clearly improved classification accuracy. It obtained a mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. In addition, the proposed algorithm detected the synergy effects of speech samples and features, which is valuable for speech marker extraction.
NASA Astrophysics Data System (ADS)
Hao Chiang, Shou; Valdez, Miguel; Chen, Chi-Farn
2016-06-01
Forests are an important ecosystem and natural resource. Based on forest inventories, governments can make decisions to conserve, improve, and manage forests sustainably. Field work for forest surveys is difficult, time consuming, and costly because it requires intensive physical labor, especially in remote mountainous regions. A reliable forest inventory provides more accurate and timely information for developing new and efficient approaches to forest management. Remote sensing technology has recently been used for forest investigation at large scales. To produce an informative forest inventory, forest attributes, including tree species, must be considered. This study aims to classify forest tree species in Erdenebulgan County, Huwsgul province, Mongolia, using the Maximum Entropy method. The study area is covered by dense forest, which accounts for almost 70% of the total territorial extent of Erdenebulgan County, and is located in a high mountain region in northern Mongolia. Landsat satellite imagery and a Digital Elevation Model (DEM) were acquired to perform tree species mapping. The forest tree species inventory map was obtained from the Forest Division of the Mongolian Ministry of Nature and Environment as training data and was also used as ground truth for the accuracy assessment of the tree species classification. Landsat images and the DEM were processed for maximum entropy modeling, and the model was applied in two experiments. The first uses Landsat surface reflectance for tree species classification; the second incorporates terrain variables in addition to the Landsat surface reflectance. All experimental results were compared with the tree species inventory to assess the classification accuracy.
Results show that the second experiment, which uses Landsat surface reflectance coupled with terrain variables, produced the better result, with higher overall accuracy and kappa coefficient than the first experiment. The results indicate that the Maximum Entropy method is applicable, and that classifying tree species using satellite imagery coupled with terrain information can improve the classification of tree species in the study area.
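The two measures used above to compare the experiments, overall accuracy and the kappa coefficient, can both be read off a confusion matrix. The following is a minimal sketch using the standard definitions; the function name and the example matrix are hypothetical and are not taken from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples whose reference class is i
    and whose mapped (predicted) class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example (illustrative counts only).
example_matrix = [[40, 10],
                  [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example_matrix)
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that the marginal class totals would produce by chance, which is why it is reported alongside overall accuracy.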
Ravindran, Sindhu; Jambek, Asral Bahari; Muthusamy, Hariharan; Neoh, Siew-Chin
2015-01-01
A novel clinical decision support system is proposed in this paper for evaluating fetal well-being from the cardiotocogram (CTG) dataset through an Improved Adaptive Genetic Algorithm (IAGA) and Extreme Learning Machine (ELM). IAGA employs a new scaling technique (called sigma scaling) to avoid premature convergence and applies adaptive crossover and mutation techniques with masking concepts to enhance population diversity. This search algorithm also uses three different fitness functions (two single-objective fitness functions and one multi-objective fitness function) to assess its performance. The classification results show that a promising classification accuracy of 94% is obtained with an optimal feature subset using IAGA. The classification results are also compared with those of other feature reduction techniques to substantiate its exhaustive search towards the global optimum. In addition, five other benchmark datasets are used to gauge the strength of the proposed IAGA algorithm.
NASA Astrophysics Data System (ADS)
Han, Xiaopeng; Huang, Xin; Li, Jiayi; Li, Yansheng; Yang, Michael Ying; Gong, Jianya
2018-04-01
In recent years, the availability of high-resolution imagery has enabled more detailed observation of the Earth. However, it is imperative to simultaneously achieve accurate interpretation and preserve the spatial details for the classification of such high-resolution data. To this aim, we propose the edge-preservation multi-classifier relearning framework (EMRF). This multi-classifier framework is made up of support vector machine (SVM), random forest (RF), and sparse multinomial logistic regression via variable splitting and augmented Lagrangian (LORSAL) classifiers, considering their complementary characteristics. To better characterize complex scenes of remote sensing images, relearning based on landscape metrics is proposed, which iteratively quantizes both the landscape composition and spatial configuration by the use of the initial classification results. In addition, a novel tri-training strategy is proposed to solve the over-smoothing effect of relearning by means of automatic selection of training samples with low classification certainties, which always distribute in or near the edge areas. Finally, EMRF flexibly combines the strengths of relearning and tri-training via the classification certainties calculated by the probabilistic output of the respective classifiers. It should be noted that, in order to achieve an unbiased evaluation, we assessed the classification accuracy of the proposed framework using both edge and non-edge test samples. The experimental results obtained with four multispectral high-resolution images confirm the efficacy of the proposed framework, in terms of both edge and non-edge accuracy.
NASA Astrophysics Data System (ADS)
Wu, Jie; Besnehard, Quentin; Marchessoux, Cédric
2011-03-01
Clinical studies for the validation of new medical imaging devices require hundreds of images. An important step in creating and tuning the study protocol is the classification of images into "difficult" and "easy" cases. This consists of classifying each image based on features such as the complexity of the background and the visibility of the disease (lesions). An automatic medical background classification tool for mammograms would therefore help such clinical studies. This classification tool is based on a multi-content analysis (MCA) framework, which was first developed to recognize the image content of computer screenshots. With the implementation of new texture features and a defined breast density scale, the MCA framework is able to automatically classify digital mammograms with satisfactory accuracy. The BI-RADS (Breast Imaging Reporting and Data System) density scale is used for grouping the mammograms; BI-RADS standardizes the mammography reporting terminology and the assessment and recommendation categories. Selected features are input into a decision tree classification scheme in the MCA framework, which serves as the so-called "weak classifier" (any classifier with a global error rate below 50%). With the AdaBoost iteration algorithm, these "weak classifiers" are combined into a "strong classifier" (a classifier with a low global error rate) for classifying one category. The classification results for one "strong classifier" show good accuracy with high true positive rates. For the four categories the results are: TP=90.38%, TN=67.88%, FP=32.12% and FN=9.62%.
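The reported percentages pair up to 100% (TP with FN, TN with FP), which is consistent with rates normalized by the number of actual positive and actual negative cases. A minimal sketch of that calculation follows; the function name and the counts are hypothetical and are not the study's data.

```python
def confusion_rates(tp, fn, tn, fp):
    """Convert raw confusion counts into percentage rates.

    TP and FN rates are normalized by the number of actual positives,
    TN and FP rates by the number of actual negatives, so each pair
    sums to 100 percent.
    """
    positives = tp + fn
    negatives = tn + fp
    return {
        "TP": 100.0 * tp / positives,
        "FN": 100.0 * fn / positives,
        "TN": 100.0 * tn / negatives,
        "FP": 100.0 * fp / negatives,
    }


# Hypothetical counts for one binary "strong classifier".
rates = confusion_rates(tp=45, fn=5, tn=30, fp=20)
for name, value in rates.items():
    print(f"{name} = {value:.2f}%")
```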
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
Foot-mounted inertial measurement unit for activity classification.
Ghobadi, Mostafa; Esfahani, Ehsan T
2014-01-01
This paper proposes a classification technique for basic daily activity recognition for human monitoring during physical therapy at home. The proposed method estimates foot motion using a single inertial measurement unit, then segments the motion into steps and classifies them by template matching as walking, stairs-up, or stairs-down steps. The results show high accuracy of activity recognition. Unlike previous works, which are limited to activity recognition, the proposed approach is more qualitative, providing a similarity index of any activity to its desired template, which can be used to assess a subject's improvement.
NASA Technical Reports Server (NTRS)
Spruce, Joseph P.; Ryan, Robert E.; Smoot, James; Kuper, Phillip; Prados, Donald; Russell, Jeffrey; Ross, Kenton; Gasser, Gerald; Sader, Steven; McKellip, Rodney
2007-01-01
This report details one of three experiments performed during FY 2007 for the NASA RPC (Rapid Prototyping Capability) at Stennis Space Center. This RPC experiment assesses the potential of VIIRS (Visible/Infrared Imager/Radiometer Suite) and MODIS (Moderate Resolution Imaging Spectroradiometer) data for detecting and monitoring forest defoliation from the non-native Eurasian gypsy moth (Lymantria dispar). The intent of the RPC experiment was to assess the degree to which VIIRS data can provide forest disturbance monitoring information as an input to a forest threat EWS (Early Warning System) as compared to the level of information that can be obtained from MODIS data. The USDA Forest Service (USFS) plans to use MODIS products for generating broad-scaled, regional monitoring products as input to an EWS for forest health threat assessment. NASA SSC is helping the USFS to evaluate and integrate currently available satellite remote sensing technologies and data products for the EWS, including the use of MODIS products for regional monitoring of forest disturbance. Gypsy moth defoliation of the mid-Appalachian highland region was selected as a case study. Gypsy moth is one of eight major forest insect threats listed in the Healthy Forest Restoration Act (HFRA) of 2003; the gypsy moth threatens eastern U.S. hardwood forests, which are also a concern highlighted in the HFRA of 2003. This region was selected for the project because extensive gypsy moth defoliation occurred there over multiple years during the MODIS operational period. This RPC experiment is relevant to several nationally important mapping applications, including agricultural efficiency, coastal management, ecological forecasting, disaster management, and carbon management. In this experiment, MODIS data and VIIRS data simulated from MODIS were assessed for their ability to contribute broad, regional geospatial information on gypsy moth defoliation. 
Landsat and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) data were used to assess the quality of gypsy moth defoliation mapping products derived from MODIS data and from simulated VIIRS data. The project focused on use of data from MODIS Terra as opposed to MODIS Aqua mainly because only MODIS Terra data were collected during 2000 and 2001, years with comparatively high amounts of gypsy moth defoliation within the study area. The project assessed the quality of VIIRS data simulation products. Hyperion data were employed to assess the quality of MODIS-based VIIRS simulation datasets using image correlation analysis techniques. The ART (Application Research Toolbox) software was used for data simulation. Correlation analysis between MODIS-simulated VIIRS data and Hyperion-simulated VIIRS data for red, NIR (near-infrared), and NDVI (Normalized Difference Vegetation Index) image data products collectively indicates that useful, effective VIIRS simulations can be produced using Hyperion and MODIS data sources. The r² values for red, NIR, and NDVI products were 0.56, 0.63, and 0.62, respectively, indicating a moderately high correlation between the two data sources. Temporal decorrelation from different data acquisition times and image misregistration may have lowered correlation results. The RPC experiment also generated MODIS-based time series data products using the TSPT (Time Series Product Tool) software. Time series of simulated VIIRS NDVI products were produced at approximately 400-meter resolution GSD (Ground Sampling Distance) at nadir for comparison to MODIS NDVI products at either 250- or 500-meter GSD. The project also computed MODIS (MOD02) NDMI (Normalized Difference Moisture Index) products at 500-meter GSD for comparison to NDVI-based products. 
For each year during 2000-2006, MODIS and VIIRS (simulated from MOD02) time series were computed during the peak gypsy moth defoliation time frame in the study area (approximately June 10 through July 27). Gypsy moth defoliation mapping products from simulated VIIRS and MOD02 time series were produced using multiple methods, including image classification and change detection via image differencing. The latter enabled an automated defoliation detection product computed using percent change in maximum NDVI for a peak defoliation period during 2001 compared to maximum NDVI across the entire 2000-2006 time frame. Final gypsy moth defoliation mapping products were assessed for accuracy using randomly sampled locations found on available geospatial reference data (Landsat and ASTER data in conjunction with defoliation map data from the USFS). Extensive gypsy moth defoliation patches were evident on screen displays of multitemporal color composites derived from MODIS data and from simulated VIIRS vegetation index data. Such defoliation was particularly evident for 2001, although widespread denuded forests were also seen for 2000 and 2003. These visualizations were validated using the aforementioned reference data. Defoliation patches were visible on displays of MODIS-based NDVI and NDMI data. The viewing of apparent defoliation patches on all of these products necessitated adoption of a specialized temporal data processing method (e.g., maximum NDVI during the peak defoliation time frame). The frequency of cloud cover necessitated this approach. Multitemporal simulated VIIRS and MODIS Terra data both produced effective general classifications of defoliated forest versus other land cover. For 2001, the MOD02-simulated VIIRS 400-meter NDVI classification produced a similar yet slightly lower overall accuracy (87.28 percent with 0.72 Kappa) than the MOD02 250-meter NDVI classification (88.44 percent with 0.75 Kappa). 
The MOD13 250-meter NDVI classification had a lower overall accuracy (79.13 percent) and a much lower Kappa (0.46). The report discusses accuracy assessment results in much more detail, comparing overall classification and individual class accuracy statistics for simulated VIIRS 400-meter NDVI, MOD02 250-meter NDVI, MOD02-500 meter NDVI, MOD13 250-meter NDVI, and MOD02 500-meter NDMI classifications. Automated defoliation detection products from simulated VIIRS and MOD02 data for 2001 also yielded similar, relatively high overall classification accuracy (85.55 percent for the VIIRS 400-meter NDVI versus 87.28 percent for the MOD02 250-meter NDVI). In contrast, the USFS aerial sketch map of gypsy moth defoliation showed a lower overall classification accuracy at 73.64 percent. The overall classification Kappa values were also similar for the VIIRS (approximately 0.67 Kappa) versus the MOD02 (approximately 0.72 Kappa) automated defoliation detection product, which were much higher than the values exhibited by the USFS sketch map product (overall Kappa of approximately 0.47). The report provides additional details on the accuracy of automated gypsy moth defoliation detection products compared with USFS sketch maps. The results suggest that VIIRS data can be effectively simulated from MODIS data and that VIIRS data will produce gypsy moth defoliation mapping products that are similar to MODIS-based products. The results of the RPC experiment indicate that VIIRS and MODIS data products have good potential for integration into the forest threat EWS. The accuracy assessment was performed only for 2001 because of time constraints and a relative scarcity of cloud-free Landsat and ASTER data for the peak defoliation period of the other years in the 2000-2006 time series. 
Additional work should be performed to assess the accuracy of gypsy moth defoliation detection products for additional years. The study area (mid-Appalachian highlands) and application (gypsy moth forest defoliation) are not necessarily representative of all forested regions and of all forest threat disturbance agents. Additional work should be performed on other inland and coastal regions as well as for other major forest threats.
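The automated detection product described in this report rests on two simple quantities: NDVI and the percent change in maximum NDVI between the peak defoliation window and a longer baseline. The sketch below illustrates both for a single pixel. The function names and sample values are hypothetical, and the report's actual processing in the TSPT software is not reproduced here.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def percent_change_in_max_ndvi(peak_window_ndvi, baseline_ndvi):
    """Percent change of the peak-window maximum NDVI relative to the
    maximum NDVI over the whole baseline period (for one pixel).

    Strongly negative values suggest canopy loss such as defoliation.
    Taking the maximum within a window also suppresses cloud-contaminated
    observations, which tend to lower NDVI.
    """
    peak_max = max(peak_window_ndvi)
    baseline_max = max(baseline_ndvi)
    return 100.0 * (peak_max - baseline_max) / baseline_max


# Hypothetical single-pixel values (not from the report).
peak_2001 = [0.40, 0.45, 0.30]           # NDVI in the peak defoliation window
all_years = [0.80, 0.90, 0.85, 0.45]     # NDVI across the full time series
print(f"NDVI = {ndvi(0.5, 0.1):.3f}")
print(f"change = {percent_change_in_max_ndvi(peak_2001, all_years):.1f}%")
```

Any threshold applied to the percent change in order to label a pixel as defoliated would be a further assumption and is left out of the sketch.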
DOE Office of Scientific and Technical Information (OSTI.GOV)
Getman, Daniel J
2008-01-01
Many attempts to observe changes in terrestrial systems over time would be significantly enhanced if it were possible to improve the accuracy of classifications of low-resolution historic satellite data. In an effort to examine improving the accuracy of historic satellite image classification by combining satellite and air photo data, two experiments were undertaken in which low-resolution multispectral data and high-resolution panchromatic data were combined and then classified using the ECHO spectral-spatial image classification algorithm and the Maximum Likelihood technique. The multispectral data consisted of 6 multispectral channels (30-meter pixel resolution) from Landsat 7. These data were augmented with panchromatic data (15m pixel resolution) from Landsat 7 in the first experiment, and with a mosaic of digital aerial photography (1m pixel resolution) in the second. The addition of the Landsat 7 panchromatic data provided a significant improvement in the accuracy of classifications made using the ECHO algorithm. Although the inclusion of aerial photography provided an improvement in accuracy, this improvement was only statistically significant at a 40-60% level. These results suggest that once error levels associated with combining aerial photography and multispectral satellite data are reduced, this approach has the potential to significantly enhance the precision and accuracy of classifications made using historic remotely sensed data, as a way to extend the time range of efforts to track temporal changes in terrestrial systems.
Zbroch, Tomasz; Knapp, Paweł Grzegorz; Knapp, Piotr Andrzej
2007-09-01
Increasing knowledge of carcinogenesis within the cervical epithelium has forced continuous modification of the cytological classification of cervical smears. Eventually, new descriptions of submicroscopic cytomorphological abnormalities enabled the implementation of the Bethesda System (TBS), which was meant to replace the former Papanicolaou classification, although the two are sometimes temporarily used simultaneously. The aim of this study was to compare the results of these two classification systems in terms of diagnostic accuracy, verified by further tests of the diagnostic algorithm for cervical lesion evaluation. The study was conducted in a group of women selected from the general population, the criteria being place of residence and cervical cancer age risk group, in consecutive periods of mass screening in the Podlaski region. The diagnostic tests performed were based on the commonly used algorithm, under identical laboratory and methodological conditions. The assessment revealed comparable diagnostic accuracy of the two classifications analyzed, verified by histological examination, although with markedly higher specificity for dysplastic lesions, a decreased number of HSIL results, and an increased number of LSIL diagnoses. A higher number of colposcopies and biopsies performed was an additional consequence of the TBS classification. Results based on the Bethesda System made it possible to identify the sources and causes of abnormalities with much greater precision, which enabled treatment of the causative agent. The two cytology classification systems evaluated, although not very different, showed the higher potential of TBS and better, more effective communication between the cytology laboratory and the gynecologist, making implementation of the Bethesda System in daily cytology screening work reasonable.
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). 
The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043
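The Press' Q test mentioned above checks whether a classifier's hit rate exceeds what chance would give. A minimal sketch follows, using the common textbook form Q = (N - nK)^2 / (N(K - 1)) compared against the chi-square critical value with one degree of freedom. The abstract does not spell out the exact computation used, so this form, the function name, and the sample size are assumptions.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_1DF_05 = 3.841


def press_q(n_total, n_correct, n_groups):
    """Press's Q statistic in its common textbook form (an assumption here).

    n_total   -- number of classified cases (N)
    n_correct -- number of correctly classified cases (n)
    n_groups  -- number of classes (K)
    """
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))


# Hypothetical example: 100 cases, 2 classes, 76 classified correctly.
q = press_q(n_total=100, n_correct=76, n_groups=2)
better_than_chance = q > CHI2_CRITICAL_1DF_05
print(f"Q = {q:.2f}, better than chance at the 0.05 level: {better_than_chance}")
```

With two classes and a hit rate of exactly 50 percent the statistic is zero, which matches the intuition that such a classifier is indistinguishable from guessing.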
Boskamp, Tobias; Lachmund, Delf; Oetjen, Janina; Cordero Hernandez, Yovany; Trede, Dennis; Maass, Peter; Casadonte, Rita; Kriegsmann, Jörg; Warth, Arne; Dienemann, Hendrik; Weichert, Wilko; Kriegsmann, Mark
2017-07-01
Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) shows a high potential for applications in histopathological diagnosis, and in particular for supporting tumor typing and subtyping. The development of such applications requires the extraction of spectral fingerprints that are relevant for the given tissue and the identification of biomarkers associated with these spectral patterns. We propose a novel data analysis method based on the extraction of characteristic spectral patterns (CSPs) that allow automated generation of classification models for spectral data. Formalin-fixed paraffin embedded (FFPE) tissue samples from N=445 patients assembled on 12 tissue microarrays were analyzed. The method was applied to discriminate primary lung and pancreatic cancer, as well as adenocarcinoma and squamous cell carcinoma of the lung. A classification accuracy of 100% and 82.8%, resp., could be achieved on core level, assessed by cross-validation. The method outperformed the more conventional classification method based on the extraction of individual m/z values in the first application, while achieving a comparable accuracy in the second. LC-MS/MS peptide identification demonstrated that the spectral features present in selected CSPs correspond to peptides relevant for the respective classification. This article is part of a Special Issue entitled: MALDI Imaging, edited by Dr. Corinna Henkel and Prof. Peter Hoffmann. Copyright © 2016 Elsevier B.V. All rights reserved.
Structural Validation of Nursing Terminologies
Hardiker, Nicholas R.; Rector, Alan L.
2001-01-01
Objective: The purpose of the study is twofold: 1) to explore the applicability of combinatorial terminologies as the basis for building enumerated classifications, and 2) to investigate the usefulness of formal terminological systems for performing such classification and for assisting in the refinement of both combinatorial terminologies and enumerated classifications. Design: A formal model of the beta version of the International Classification for Nursing Practice (ICNP) was constructed in the compositional terminological language GRAIL (GALEN Representation and Integration Language). Terms drawn from the North American Nursing Diagnosis Association Taxonomy I (NANDA taxonomy) were mapped into the model and classified automatically using GALEN technology. Measurements: The resulting generated hierarchy was compared with the NANDA taxonomy to assess coverage and accuracy of classification. Results: In terms of coverage, in this study ICNP was able to capture 77 percent of NANDA terms using concepts drawn from five of its eight axes. Three axes—Body Site, Topology, and Frequency—were not needed. In terms of accuracy, where hierarchic relationships existed in the generated hierarchy or the NANDA taxonomy, or both, 6 were identical, 19 existed in the generated hierarchy alone (2 of these were considered suitable for incorporation into the NANDA taxonomy and 17 were considered inaccurate), and 23 appeared in the NANDA taxonomy alone (8 of these were considered suitable for incorporation into ICNP, 9 were considered inaccurate, and 6 reflected different, equally valid perspectives). Sixty terms appeared at the top level, with no indenting, in both the generated hierarchy and the NANDA taxonomy. Conclusions: With appropriate refinement, combinatorial terminologies such as ICNP have the potential to provide a useful foundation for representing enumerated classifications such as NANDA. 
Technologies such as GALEN make possible the process of building automatically enumerated classifications while providing a useful means of validating and refining both combinatorial terminologies and enumerated classifications. PMID:11320066
Na, X D; Zang, S Y; Wu, C S; Li, W L
2015-11-01
Knowledge of the spatial extent of forested wetlands is essential to many studies, including wetland functioning assessment, greenhouse gas flux estimation, and identification of suitable wildlife habitat. To discriminate forested wetlands from adjacent land cover types, researchers have applied image analysis techniques to numerous remotely sensed data. Despite some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, the random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied them within the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the object-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy, and the difference between their kappa coefficients is statistically significant (p<0.01). There were noticeable omissions for forested and herbaceous wetlands in the per-pixel classifications using the RF algorithm. For the object-based image analysis, there were also statistically significant differences (p<0.01) in kappa coefficient between the results based on the RF and KNN algorithms. The object-based classification using RF provided a more visually adequate distribution of the land cover types of interest, while the object-based classifications using the KNN algorithm showed noticeable commissions for forested wetlands and omissions for agricultural land. This research demonstrates that object-based classification with RF using optical, radar, and topographical data improved the mapping accuracy of land covers and provided a feasible approach to discriminating forested wetlands from other land cover types in forested areas.
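The omissions and commissions discussed above are per-class error rates taken from a confusion matrix: omission error is the complement of producer's accuracy, and commission error is the complement of user's accuracy. A minimal sketch follows; the class labels and counts are hypothetical and are not the study's results.

```python
def omission_commission(matrix, labels):
    """Per-class omission and commission error from a confusion matrix.

    matrix[i][j] is the number of samples with reference class i that
    were mapped as class j.
    Omission error   = 1 - producer's accuracy (reference samples missed).
    Commission error = 1 - user's accuracy (mapped samples that are wrong).
    """
    k = len(matrix)
    errors = {}
    for c in range(k):
        reference_total = sum(matrix[c])
        mapped_total = sum(matrix[i][c] for i in range(k))
        correct = matrix[c][c]
        errors[labels[c]] = {
            "omission": 1 - correct / reference_total,
            "commission": 1 - correct / mapped_total,
        }
    return errors


# Hypothetical two-class example.
labels = ["forested wetland", "agriculture"]
matrix = [[30, 10],
          [5, 55]]
for name, err in omission_commission(matrix, labels).items():
    print(f"{name}: omission = {err['omission']:.3f}, "
          f"commission = {err['commission']:.3f}")
```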
Thematic Accuracy Assessment of the 2011 National Land ...
Accuracy assessment is a standard protocol of National Land Cover Database (NLCD) mapping. Here we report agreement statistics between map and reference labels for NLCD 2011, which includes land cover for ca. 2001, ca. 2006, and ca. 2011. The two main objectives were assessment of agreement between map and reference labels for the three, single-date NLCD land cover products at Level II and Level I of the classification hierarchy, and agreement for 17 land cover change reporting themes based on Level I classes (e.g., forest loss; forest gain; forest, no change) for three change periods (2001–2006, 2006–2011, and 2001–2011). The single-date overall accuracies were 82%, 83%, and 83% at Level II and 88%, 89%, and 89% at Level I for 2011, 2006, and 2001, respectively. Many class-specific user's accuracies met or exceeded a previously established nominal accuracy benchmark of 85%. Overall accuracies for 2006 and 2001 land cover components of NLCD 2011 were approximately 4% higher (at Level II and Level I) than the overall accuracies for the same components of NLCD 2006. The high Level I overall, user's, and producer's accuracies for the single-date eras in NLCD 2011 did not translate into high class-specific user's and producer's accuracies for many of the 17 change reporting themes. User's accuracies were high for the no change reporting themes, commonly exceeding 85%, but were typically much lower for the reporting themes that represented change. Only forest l
NASA Technical Reports Server (NTRS)
1984-01-01
Rectifications of multispectral scanner and thematic mapper data sets for full and subscene areas, analyses of planimetric errors, assessments of the number and distribution of ground control points required to minimize errors, and factors contributing to error residual are examined. Other investigations include the generation of three dimensional terrain models and the effects of spatial resolution on digital classification accuracies.
Compensatory neurofuzzy model for discrete data classification in biomedical
NASA Astrophysics Data System (ADS)
Ceylan, Rahime
2015-03-01
Biomedical data fall into two main categories: signals and discrete data. Studies in this area therefore concern either biomedical signal classification or biomedical discrete data classification. Artificial intelligence models exist for the classification of ECG, EMG, or EEG signals. Likewise, many models in the literature classify discrete data taken as sample values, such as the results of blood analysis or biopsy in a medical process. Not every algorithm achieves a high accuracy rate on the classification of both signal and discrete data. In this study, a compensatory neurofuzzy network model is presented for the classification of discrete data in the biomedical pattern recognition area. The compensatory neurofuzzy network is a hybrid, binary classifier. In this system, the parameters of the fuzzy systems are updated by the backpropagation algorithm. The resulting classifier model was applied to two benchmark datasets (the Wisconsin Breast Cancer dataset and the Pima Indian Diabetes dataset). Experimental studies show that the compensatory neurofuzzy network model achieved a 96.11% accuracy rate in classification of the breast cancer dataset, and a 69.08% accuracy rate was obtained in experiments on the diabetes dataset with only 10 iterations.
Improved fibrosis staging by elastometry and blood test in chronic hepatitis C.
Calès, Paul; Boursier, Jérôme; Ducancelle, Alexandra; Oberti, Frédéric; Hubert, Isabelle; Hunault, Gilles; de Lédinghen, Victor; Zarski, Jean-Pierre; Salmon, Dominique; Lunel, Françoise
2014-07-01
Our main objective was to improve non-invasive fibrosis staging accuracy by resolving the limits of previous methods via new test combinations. Our secondary objectives were to improve staging precision, by developing a detailed fibrosis classification, and reliability (personalized accuracy) determination. All patients (729) included in the derivation population had chronic hepatitis C, liver biopsy, 6 blood tests and Fibroscan. Validation populations included 1584 patients. The most accurate combination was provided by using most markers of FibroMeter and Fibroscan results targeted for significant fibrosis, i.e. 'E-FibroMeter'. Its classification accuracy (91.7%) and precision (assessed by F difference with Metavir: 0.62 ± 0.57) were better than those of FibroMeter (84.1%, P < 0.001; 0.72 ± 0.57, P < 0.001), Fibroscan (88.2%, P = 0.011; 0.68 ± 0.57, P = 0.020), and a previous CSF-SF classification of FibroMeter + Fibroscan (86.7%, P < 0.001; 0.65 ± 0.57, P = 0.044). The accuracy for fibrosis absence (F0) was increased, e.g. from 16.0% with Fibroscan to 75.0% with E-FibroMeter (P < 0.001). Cirrhosis sensitivity was improved, e.g. E-FibroMeter: 92.7% vs. Fibroscan: 83.3%, P = 0.004. The combination improved reliability by deleting unreliable results (accuracy <50%) observed with a single test (1.2% of patients) and increasing optimal reliability (accuracy ≥85%) from 80.4% of patients with Fibroscan (accuracy: 90.9%) to 94.2% of patients with E-FibroMeter (accuracy: 92.9%), P < 0.001. The patient rate with 100% predictive values for cirrhosis by the best combination was twice (36.2%) that of the best single test (FibroMeter: 16.2%, P < 0.001). The new test combination increased: accuracy, globally and especially in patients without fibrosis, staging precision, cirrhosis prediction, and even reliability, thus offering improved fibrosis staging. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
NASA Technical Reports Server (NTRS)
Nalepka, R. F. (Principal Investigator); Sadowski, F. E.; Sarno, J. E.
1976-01-01
The author has identified the following significant results. A supervised classification within two separate ground areas of the Sam Houston National Forest was carried out for two sq meters spatial resolution MSS data. Data were progressively coarsened to simulate five additional cases of spatial resolution ranging up to 64 sq meters. Similar processing and analysis of all spatial resolutions enabled evaluations of the effect of spatial resolution on classification accuracy for various levels of detail and the effects on area proportion estimation for very general forest features. For very coarse resolutions, a subset of spectral channels which simulated the proposed thematic mapper channels was used to study classification accuracy.
The use of Landsat data to inventory cotton and soybean acreage in North Alabama
NASA Technical Reports Server (NTRS)
Downs, S. W., Jr.; Faust, N. L.
1980-01-01
This study was performed to determine if Landsat data could be used to improve the accuracy of the estimation of cotton acreage. A linear classification algorithm and a maximum likelihood algorithm were used for computer classification of the area, and the classification was compared with ground truth. The classification accuracy for some fields was greater than 90 percent; however, the overall accuracy was 71 percent for cotton and 56 percent for soybeans. The results of this research indicate that computer analysis of Landsat data has potential for improving upon the methods presently being used to determine cotton acreage; however, additional experiments and refinements are needed before the method can be used operationally.
Functional Assessment of Genetic Variants with Outcomes Adapted to Clinical Decision-Making
Thouvenot, Pierre; Ben Yamin, Barbara; Fourrière, Lou; Lescure, Aurianne; Boudier, Thomas; Del Nery, Elaine; Chauchereau, Anne; Goldgar, David E.; Stoppa-Lyonnet, Dominique; Nicolas, Alain; Millot, Gaël A.
2016-01-01
Understanding the medical effect of an ever-growing number of human variants detected is a long term challenge in genetic counseling. Functional assays, based on in vitro or in vivo evaluations of the variant effects, provide essential information, but they require robust statistical validation, as well as adapted outputs, to be implemented in the clinical decision-making process. Here, we assessed 25 pathogenic and 15 neutral missense variants of the BRCA1 breast/ovarian cancer susceptibility gene in four BRCA1 functional assays. Next, we developed a novel approach that refines the variant ranking in these functional assays. Lastly, we developed a computational system that provides a probabilistic classification of variants, adapted to clinical interpretation. Using this system, the best functional assay exhibits a variant classification accuracy estimated at 93%. Additional theoretical simulations highlight the benefit of this ready-to-use system in the classification of variants after functional assessment, which should facilitate the consideration of functional evidences in the decision-making process after genetic testing. Finally, we demonstrate the versatility of the system with the classification of siRNAs tested for human cell growth inhibition in high throughput screening. PMID:27272900
NASA Technical Reports Server (NTRS)
Heric, Matthew; Cox, William; Gordon, Daniel K.
1987-01-01
In an attempt to improve the land cover/use classification accuracy obtainable from remotely sensed multispectral imagery, Airborne Imaging Spectrometer-1 (AIS-1) images were analyzed in conjunction with Thematic Mapper Simulator (NS001) Large Format Camera color infrared photography and black and white aerial photography. Specific portions of the combined data set were registered and used for classification. Following this procedure, the resulting derived data were tested using an overall accuracy assessment method. Precise photogrammetric 2D-3D-2D geometric modeling techniques are not the basis for this study. Instead, the discussion presents the spectral findings resulting from the image-to-image registrations. Problems associated with the AIS-1 TMS integration are considered, and useful applications of the imagery combination are presented. More advanced methodologies for imagery integration are needed if multisystem data sets are to be utilized fully. Nevertheless, the research described herein provides a formulation for future Earth Observation Station related multisensor studies.
Summer Crop Classification by Multi-Temporal COSMO-SkyMed® Data
NASA Astrophysics Data System (ADS)
Guarini, Rocchina; Bruzzone, Lorenzo; Santoni, Massimo; Vuolo, Francesco; Luigi, Dini
2016-08-01
In this study, we propose a multi-temporal and multi-polarization approach to discriminate different crop types in the Marchfeld region, Austria. The sensitivity of X-band COSMO-SkyMed® (CSK®) data with respect to five crop classes, namely carrot, corn, potato, soybean and sugarbeet, is investigated. In particular, the capabilities of dual-polarization (StripMap PingPong) HH/HV and single-polarization (StripMap Himage) HH and VH data in distinguishing among the five crop types are evaluated. A total of twenty-one Himage and ten PingPong images were acquired over a seven-month period, from April to October 2014. The backscattering coefficient was then extracted for each dataset, and the classification was performed using a pixel-based support vector machine (SVM) approach. The accuracy of the obtained crop classifications was assessed by comparing them with ground truth. The dual-polarization results are compared between the HH and HV polarizations and against the single-polarization results (HH and VH). The best accuracy is obtained by using a time series of StripMap Himage data, at VH polarization, covering the whole season.
Singha, Mrinal; Wu, Bingfang; Zhang, Miao
2016-01-01
Accurate and timely mapping of paddy rice is vital for food security and environmental sustainability. This study evaluates the utility of temporal features extracted from coarse resolution data for object-based paddy rice classification of fine resolution data. The coarse resolution vegetation index data are first fused with the fine resolution data to generate fine resolution time series data. Temporal features are extracted from the fused data and combined with the multi-spectral data to improve the classification accuracy. Temporal features provided the crop growth information, while multi-spectral data provided the pattern variation of paddy rice. The achieved overall classification accuracy and kappa coefficient were 84.37% and 0.68, respectively. The results indicate that the use of temporal features improved the overall classification accuracy of a single-date multi-spectral image by 18.75 percentage points, from 65.62% to 84.37%. The minimum sensitivity (MS) of the paddy rice classification was also improved. The comparison showed that the mapped paddy area was consistent with the agricultural statistics at the district level. This work also highlighted the importance of feature selection for achieving higher classification accuracies. These results demonstrate the potential of the combined use of temporal and spectral features for accurate paddy rice classification. PMID:28025525
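The overall accuracy and kappa coefficient reported in this abstract are both derived from a confusion (error) matrix. A minimal Python sketch of that calculation follows; the 2x2 matrix and the function names are hypothetical illustrations and are not taken from the study.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond what the marginal totals predict by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical matrix (not from the paper): rows = reference, columns = mapped class.
example = [[45, 5],
           [10, 40]]
print(overall_accuracy(example))
print(cohen_kappa(example))

# The reported gain is a difference in percentage points.
print(round(84.37 - 65.62, 2))
```

Kappa is lower than overall accuracy whenever some agreement is expected by chance, which is why abstracts in this listing often report both figures.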
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
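The three performance figures in this abstract are tied together by the group sizes (49 patients, 49 controls). The Python sketch below shows the arithmetic under the assumption that 37 patients and 43 controls were classified correctly; these counts are inferred from the reported percentages and are not stated in the abstract. The exact binomial tail is one possible way to compare against chance and is not necessarily the test the authors used.

```python
from math import comb


def rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


def binomial_upper_tail(successes, trials, p=0.5):
    """One-sided exact binomial P(X >= successes) under chance level p."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))


# Assumed counts (inferred, not given in the abstract): 37 of 49 patients
# and 43 of 49 controls classified correctly.
acc, sens, spec = rates(tp=37, fn=12, tn=43, fp=6)
print(round(100 * acc, 1), round(100 * sens, 1), round(100 * spec, 1))

# Probability of at least 80 correct out of 98 if the classifier guessed.
p_value = binomial_upper_tail(37 + 43, 98)
print(p_value)
```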
Automatic classification of protein structures using physicochemical parameters.
Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam
2014-09-01
Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
The Role of Facial Attractiveness and Facial Masculinity/Femininity in Sex Classification of Faces
Hoss, Rebecca A.; Ramsey, Jennifer L.; Griffin, Angela M.; Langlois, Judith H.
2005-01-01
We tested whether adults (Experiment 1) and 4–5-year-old children (Experiment 2) identify the sex of high attractive faces faster and more accurately than low attractive faces in a reaction time task. We also assessed whether facial masculinity/femininity facilitated identification of sex. Results showed that attractiveness facilitated adults’ sex classification of both female and male faces and children’s sex classification of female, but not male, faces. Moreover, attractiveness affected the speed and accuracy of sex classification independent of masculinity/femininity. High masculinity in male faces, but not high femininity in female faces, also facilitated sex classification for both adults and children. These findings provide important new data on how the facial cues of attractiveness and masculinity/femininity contribute to the task of sex classification and provide evidence for developmental differences in how adults and children use these cues. Additionally, these findings provide support for Langlois and Roggman’s (1990) averageness theory of attractiveness. PMID:16457167
Classification accuracy for stratification with remotely sensed data
Raymond L. Czaplewski; Paul L. Patterson
2003-01-01
Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an “error matrix,” which is...
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
In this paper we introduce a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that, when a global metric is learned from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on the Sleep-EDF dataset under various classification settings. The overall accuracies for the Awake/Sleep and 4-class classification settings are 98.32% and 94.49%, respectively. Furthermore, this accuracy is achieved by performing classification on a low-dimensional feature space derived from the time and frequency domains, without the need for artifact removal as a preprocessing step.
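The idea of replacing the Euclidean metric in a k-nearest neighbor rule can be illustrated with a Mahalanobis-style distance of the form (x - y)^T M (x - y). The Python sketch below is only an illustration: the matrix `learned` is a hand-picked stand-in for a metric learned from training data, and the two-feature samples, labels, and function names are hypothetical rather than taken from the paper.

```python
from collections import Counter


def metric_distance(x, y, m):
    """Squared distance (x - y)^T M (x - y) for a symmetric PSD matrix M."""
    d = [a - b for a, b in zip(x, y)]
    size = len(d)
    return sum(d[i] * m[i][j] * d[j] for i in range(size) for j in range(size))


def knn_predict(query, samples, labels, m, k=1):
    """Majority label among the k training samples closest to query under M."""
    order = sorted(range(len(samples)),
                   key=lambda i: metric_distance(query, samples[i], m))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set and query.
samples = [(0.0, 3.0), (2.0, 0.0)]
labels = ["sleep", "awake"]
query = (0.0, 0.0)

euclidean = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix = default Euclidean metric
learned = [[4.0, 0.0], [0.0, 0.25]]   # stand-in for a metric learned from data

print(knn_predict(query, samples, labels, euclidean))
print(knn_predict(query, samples, labels, learned))
```

In this toy case the two metrics disagree because the stand-in matrix stretches the first feature and shrinks the second, which changes which training sample counts as the nearest neighbor.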
Forest/non-forest stratification in Georgia with Landsat Thematic Mapper data
William H. Cooke
2000-01-01
Geographically accurate Forest Inventory and Analysis (FIA) data may be useful for training, classification, and accuracy assessment of Landsat Thematic Mapper (TM) data. Minimum expectation for maps derived from Landsat data is accurate discrimination of several land cover classes. Landsat TM costs have decreased dramatically, but acquiring cloud-free scenes at...
ERIC Educational Resources Information Center
Kroopnick, Marc Howard
2010-01-01
When Item Response Theory (IRT) is operationally applied for large scale assessments, unidimensionality is typically assumed. This assumption requires that the test measures a single latent trait. Furthermore, when tests are vertically scaled using IRT, the assumption of unidimensionality would require that the battery of tests across grades…
Bahadure, Nilesh Bhaskarrao; Ray, Arun Kumar; Thethi, Har Pal
2018-01-17
The detection of a brain tumor and its classification from modern imaging modalities is a primary concern, but it is time-consuming and tedious work when performed by radiologists or clinical supervisors. The accuracy of detection and classification of tumor stages by radiologists depends solely on their experience, so computer-aided technology is very important for improving diagnostic accuracy. In this study, to improve the performance of tumor detection, we compared different segmentation techniques and selected the best one on the basis of segmentation score. Further, to improve the classification accuracy, a genetic algorithm is employed for the automatic classification of tumor stage. The classification decision is supported by extracting relevant features and calculating area. The proposed technique is evaluated and validated for performance and quality on magnetic resonance brain images, based on segmentation score, accuracy, sensitivity, specificity, and Dice similarity index coefficient. The experiments achieved 92.03% accuracy, 91.42% specificity, 92.36% sensitivity, and an average segmentation score between 0.82 and 0.93, demonstrating the effectiveness of the proposed technique for identifying normal and abnormal tissues in brain MR images. The experiments also yielded an average Dice similarity index coefficient of 93.79%, which indicates a high degree of overlap between the automatically extracted tumor regions and the tumor regions manually extracted by radiologists.
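The Dice similarity coefficient mentioned in this abstract measures the overlap between an automatically extracted region and a manually drawn one. A minimal Python sketch of the standard definition, 2TP / (2TP + FP + FN), is given below; the flattened six-pixel masks and function names are hypothetical and serve only to show the calculation.

```python
def confusion_counts(predicted, truth):
    """Count (TP, FP, FN, TN) for two equal-length binary (0/1) masks."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    return tp, fp, fn, tn


def dice(predicted, truth):
    """Dice similarity coefficient between two binary masks."""
    tp, fp, fn, _ = confusion_counts(predicted, truth)
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical flattened masks: 1 = tumor pixel, 0 = background.
automatic = [1, 1, 0, 0, 1, 0]
manual = [1, 0, 0, 0, 1, 1]

print(confusion_counts(automatic, manual))
print(dice(automatic, manual))
```

Unlike pixel accuracy, the Dice coefficient ignores true negatives, so a large background area does not inflate the score.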
NASA Astrophysics Data System (ADS)
Kurniawan, Dian; Suparti; Sugito
2018-05-01
Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia had reached 237.6 million people. Therefore, to control the population growth rate, the government runs the Family Planning, or Keluarga Berencana (KB), program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to achieve a prosperous society by controlling births while keeping population growth in check. The data used in this study are the updated family data of Semarang city in 2016, collected by the National Family Planning Coordinating Board (BKKBN). From these data, classifiers are built with kernel discriminant analysis, and the classification accuracy of the method is obtained. The analysis showed that normal kernel discriminant analysis gives 71.05% classification accuracy with 28.95% classification error, whereas triweight kernel discriminant analysis gives 73.68% classification accuracy with 26.32% classification error. Using the triweight kernel discriminant to process the 2016 data on family planning participation of childbearing-age couples in Semarang City can therefore be considered better than using the normal kernel discriminant.
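Kernel discriminant analysis assigns an observation to the class with the largest prior-weighted kernel density estimate. The Python sketch below illustrates this in one dimension with the normal (Gaussian) and triweight kernels named in the abstract; the sample values, class labels, bandwidth, and function names are hypothetical and do not come from the BKKBN data.

```python
from math import exp, pi, sqrt


def gaussian_kernel(u):
    """Standard normal kernel."""
    return exp(-0.5 * u * u) / sqrt(2 * pi)


def triweight_kernel(u):
    """Triweight kernel: (35/32)(1 - u^2)^3 on [-1, 1], zero elsewhere."""
    return 35 / 32 * (1 - u * u) ** 3 if abs(u) <= 1 else 0.0


def class_density(x, points, h, kernel):
    """Kernel density estimate at x from one class's training points."""
    return sum(kernel((x - p) / h) for p in points) / (len(points) * h)


def classify(x, groups, h, kernel):
    """Pick the class with the largest prior * density (ties go to the first class)."""
    total = sum(len(points) for points in groups.values())
    scores = {label: len(points) / total * class_density(x, points, h, kernel)
              for label, points in groups.items()}
    return max(scores, key=scores.get)


# Hypothetical one-dimensional training data for two classes.
groups = {"participant": [1.0, 1.2, 0.8],
          "non-participant": [3.0, 3.2, 2.8]}
bandwidth = 1.0  # assumed value; in practice it would be chosen from the data

for kernel in (gaussian_kernel, triweight_kernel):
    print(kernel.__name__, classify(1.1, groups, bandwidth, kernel),
          classify(2.9, groups, bandwidth, kernel))
```

Because the triweight kernel has bounded support, points farther than one bandwidth from every training sample receive zero density from all classes, which is one practical difference from the normal kernel.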
The study of vehicle classification equipment with solutions to improve accuracy in Oklahoma.
DOT National Transportation Integrated Search
2014-12-01
The accuracy of vehicle counting and classification data is vital for appropriate future highway and road design, including determining pavement characteristics, eliminating traffic jams, and improving safety. Organizations relying on vehicle cla...
Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds
Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd
2017-01-01
Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classification, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach, but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%), with overall accuracies of 86.6% and 92.3%, respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights, were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN model at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159
Wang, Wei; Ackland, David C; McClelland, Jodie A; Webster, Kate E; Halgamuge, Saman
2018-01-01
Quantitative gait analysis is an important tool in objective assessment and management of total knee arthroplasty (TKA) patients. Studies evaluating gait patterns in TKA patients have tended to focus on discrete data such as spatiotemporal information, joint range of motion and peak values of kinematics and kinetics, or consider selected principal components of gait waveforms for analysis. These strategies may not have the capacity to capture small variations in gait patterns associated with each joint across an entire gait cycle, and may ultimately limit the accuracy of gait classification. The aim of this study was to develop an automatic feature extraction method to analyse patterns from high-dimensional autocorrelated gait waveforms. A general linear feature extraction framework was proposed and a hierarchical partial least squares method derived for discriminant analysis of multiple gait waveforms. The effectiveness of this strategy was verified using a dataset of joint angle and ground reaction force waveforms from 43 patients after TKA surgery and 31 healthy control subjects. Compared with principal component analysis and partial least squares methods, the hierarchical partial least squares method achieved generally better classification performance on all possible combinations of waveforms, with the highest classification accuracy . The novel hierarchical partial least squares method proposed is capable of capturing virtually all significant differences between TKA patients and the controls, and provides new insights into data visualization. The proposed framework presents a foundation for more rigorous classification of gait, and may ultimately be used to evaluate the effects of interventions such as surgery and rehabilitation.
Wu, S.-S.; Qiu, X.; Usery, E.L.; Wang, L.
2009-01-01
Detailed urban land use data are important to government officials, researchers, and businesspeople for a variety of purposes. This article presents an approach to classifying detailed urban land use based on geometrical, textural, and contextual information of land parcels. An area of 6 by 14 km in Austin, Texas, with land parcel boundaries delineated by the Travis Central Appraisal District of Travis County, Texas, is tested for the approach. We derive fifty parcel attributes from relevant geographic information system (GIS) and remote sensing data and use them to discriminate among nine urban land uses: single family, multifamily, commercial, office, industrial, civic, open space, transportation, and undeveloped. Half of the 33,025 parcels in the study area are used as training data for land use classification and the other half are used as testing data for accuracy assessment. The best result with a decision tree classification algorithm has an overall accuracy of 96 percent and a kappa coefficient of 0.78, and two naive, baseline models based on the majority rule and the spatial autocorrelation rule have overall accuracy of 89 percent and 79 percent, respectively. The algorithm is relatively good at classifying single-family, multifamily, commercial, open space, and undeveloped land uses and relatively poor at classifying office, industrial, civic, and transportation land uses. The most important attributes for land use classification are the geometrical attributes, particularly those related to building areas. Next are the contextual attributes, particularly those relevant to the spatial relationship between buildings, then the textural attributes, particularly the semivariance texture statistic from 0.61-m resolution images.
Douglas, P K; Harris, Sam; Yuille, Alan; Cohen, Mark S
2011-05-15
Machine learning (ML) has become a popular tool for mining functional neuroimaging data, and there are now hopes of performing such analyses efficiently in real-time. Towards this goal, we compared accuracy of six different ML algorithms applied to neuroimaging data of persons engaged in a bivariate task, asserting their belief or disbelief of a variety of propositional statements. We performed unsupervised dimension reduction and automated feature extraction using independent component (IC) analysis and extracted IC time courses. Optimization of classification hyperparameters across each classifier occurred prior to assessment. Maximum accuracy was achieved at 92% for Random Forest, followed by 91% for AdaBoost, 89% for Naïve Bayes, 87% for a J48 decision tree, 86% for K*, and 84% for support vector machine. For real-time decoding applications, finding a parsimonious subset of diagnostic ICs might be useful. We used a forward search technique to sequentially add ranked ICs to the feature subspace. For the current data set, we determined that approximately six ICs represented a meaningful basis set for classification. We then projected these six IC spatial maps forward onto a later scanning session within subject. We then applied the optimized ML algorithms to these new data instances, and found that classification accuracy results were reproducible. Additionally, we compared our classification method to our previously published general linear model results on this same data set. The highest ranked IC spatial maps show similarity to brain regions associated with contrasts for belief > disbelief, and disbelief < belief. Copyright © 2010 Elsevier Inc. All rights reserved.
Salvatore, C; Cerasa, A; Castiglioni, I; Gallivanone, F; Augimeri, A; Lopez, M; Arabia, G; Morelli, M; Gilardi, M C; Quattrone, A
2014-01-30
Supervised machine learning has been proposed as a revolutionary approach for identifying sensitive medical image biomarkers (or combination of them) allowing for automatic diagnosis of individual subjects. The aim of this work was to assess the feasibility of a supervised machine learning algorithm for the assisted diagnosis of patients with clinically diagnosed Parkinson's disease (PD) and Progressive Supranuclear Palsy (PSP). Morphological T1-weighted Magnetic Resonance Images (MRIs) of PD patients (28), PSP patients (28) and healthy control subjects (28) were used by a supervised machine learning algorithm based on the combination of Principal Components Analysis as feature extraction technique and on Support Vector Machines as classification algorithm. The algorithm was able to obtain voxel-based morphological biomarkers of PD and PSP. The algorithm allowed individual diagnosis of PD versus controls, PSP versus controls and PSP versus PD with an Accuracy, Specificity and Sensitivity>90%. Voxels influencing classification between PD and PSP patients involved midbrain, pons, corpus callosum and thalamus, four critical regions known to be strongly involved in the pathophysiological mechanisms of PSP. Classification accuracy of individual PSP patients was consistent with previous manual morphological metrics and with other supervised machine learning application to MRI data, whereas accuracy in the detection of individual PD patients was significantly higher with our classification method. The algorithm provides excellent discrimination of PD patients from PSP patients at an individual level, thus encouraging the application of computer-based diagnosis in clinical practice. Copyright © 2013 Elsevier B.V. All rights reserved.
Reduction of Topographic Effect for Curve Number Estimated from Remotely Sensed Imagery
NASA Astrophysics Data System (ADS)
Zhang, Wen-Yan; Lin, Chao-Yuan
2016-04-01
The Soil Conservation Service Curve Number (SCS-CN) method is commonly used in hydrology to estimate direct runoff volume. The CN is an empirical parameter that corresponds to land use/land cover, hydrologic soil group and antecedent soil moisture condition. In large watersheds with complex topography, satellite remote sensing is an appropriate approach for acquiring land use change information. However, topographic effects are commonly found in remotely sensed imagery and affect land use classification. This research selected summer and winter scenes of Landsat-5 TM from 2008 to classify land use in the Chen-You-Lan Watershed, Taiwan. The b-correction, an empirical topographic correction method, was applied to the Landsat-5 TM data. Land use was categorized using K-means classification into 4 groups, i.e. forest, grassland, agriculture and river. Accuracy assessment of the image classification was performed against the national land use map. The results showed that after topographic correction, the overall classification accuracy increased from 68.0% to 74.5%. The average CN estimated from remotely sensed imagery decreased from 48.69 to 45.35, whereas the average CN estimated from the national LULC map was 44.11. Therefore, the topographic correction method is recommended to normalize the topographic effect in satellite remote sensing data before estimating the CN.
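To see why a shift in average CN matters for runoff, the following is a minimal sketch of the commonly cited SI form of the SCS-CN runoff equation. It assumes the conventional initial abstraction ratio of 0.2, which the abstract does not state, and the 200 mm storm depth is a hypothetical value chosen only for illustration. The three CN values are the averages quoted in the abstract.

```python
def potential_retention_mm(cn):
    """Potential maximum retention S (mm) for a curve number 0 < CN <= 100."""
    if not 0 < cn <= 100:
        raise ValueError("curve number must be in (0, 100]")
    return 25400.0 / cn - 254.0


def direct_runoff_mm(rainfall_mm, cn, ia_ratio=0.2):
    """Direct runoff Q (mm) for a storm depth P (mm).

    ia_ratio is the initial abstraction ratio (Ia = ia_ratio * S); the value
    0.2 is the conventional default and is an assumption here.
    """
    s = potential_retention_mm(cn)
    ia = ia_ratio * s
    if rainfall_mm <= ia:
        return 0.0  # all rainfall is absorbed before runoff begins
    return (rainfall_mm - ia) ** 2 / (rainfall_mm - ia + s)


# Hypothetical 200 mm storm; CN values are the averages from the abstract.
storm_mm = 200.0
for label, cn in [("uncorrected imagery", 48.69),
                  ("topographically corrected imagery", 45.35),
                  ("national LULC map", 44.11)]:
    print(f"{label}: CN={cn}, Q={direct_runoff_mm(storm_mm, cn):.1f} mm")
```

Because runoff increases with CN for a fixed storm, an overestimated CN from uncorrected imagery would overstate the estimated runoff relative to the map-based reference.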
Automated structural classification of lipids by machine learning.
Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T
2015-03-01
Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Assessing herbivore foraging behavior with GPS collars in a semiarid grassland.
Augustine, David J; Derner, Justin D
2013-03-15
Advances in global positioning system (GPS) technology have dramatically enhanced the ability to track and study distributions of free-ranging livestock. Understanding factors controlling the distribution of free-ranging livestock requires the ability to assess when and where they are foraging. For four years (2008-2011), we periodically collected GPS and activity sensor data together with direct observations of collared cattle grazing semiarid rangeland in eastern Colorado. From these data, we developed classification tree models that allowed us to discriminate between grazing and non-grazing activities. We evaluated: (1) which activity sensor measurements from the GPS collars were most valuable in predicting cattle foraging behavior, (2) the accuracy of binary (grazing, non-grazing) activity models vs. models with multiple activity categories (grazing, resting, traveling, mixed), and (3) the accuracy of models that are robust across years vs. models specific to a given year. A binary classification tree correctly removed 86.5% of the non-grazing locations, while correctly retaining 87.8% of the locations where the animal was grazing, for an overall misclassification rate of 12.9%. A classification tree that separated activity into four different categories yielded a greater misclassification rate of 16.0%. Distance travelled in a 5 minute interval and the proportion of the interval with the sensor indicating a head down position were the two most important variables predicting grazing activity. Fitting annual models of cattle foraging activity did not improve model accuracy compared to a single model based on all four years combined. This suggests that increased sample size was more valuable than accounting for interannual variation in foraging behavior associated with variation in forage production. 
Our models differ from previous assessments in semiarid rangeland of Israel and mesic pastures in the United States in terms of the value of different activity sensor measurements for identifying grazing activity, suggesting that the use of GPS collars to classify cattle grazing behavior will require calibrations specific to the environment and vegetation being studied.
NASA Technical Reports Server (NTRS)
Quattrochi, D. A.
1984-01-01
An initial analysis of LANDSAT 4 Thematic Mapper (TM) data for the discrimination of agricultural, forested wetland, and urban land covers is conducted using a scene of data collected over Arkansas and Tennessee. A classification of agricultural lands derived from multitemporal LANDSAT Multispectral Scanner (MSS) data is compared with a classification of TM data for the same area. Results from this comparative analysis show that the multitemporal MSS classification produced an overall accuracy of 80.91% while the TM classification yields an overall classification accuracy of 97.06% correct.
Variations in the Intragene Methylation Profiles Hallmark Induced Pluripotency
Druzhkov, Pavel; Zolotykh, Nikolay; Meyerov, Iosif; Alsaedi, Ahmed; Shutova, Maria; Ivanchenko, Mikhail; Zaikin, Alexey
2015-01-01
We demonstrate the potential of differentiating embryonic and induced pluripotent stem cells by regularized linear and decision tree machine learning classification algorithms, based on a number of intragene methylation measures. The resulting average classification accuracy was found to be above 95%, which exceeds earlier results. We propose a constructive and transparent method of feature selection based on classifier accuracy. Enrichment analysis reveals a statistically meaningful presence of stemness group and cancer discriminating genes among the selected best classifying features. These findings stimulate further research on the functional consequences of these differences in methylation patterns. The presented approach can be broadly used to discriminate cells of different phenotype or in different states by their methylation profiles, identify groups of genes constituting multifeature classifiers, and assess enrichment of these groups by sets of genes with a functionality of interest. PMID:26618180
seXY: a tool for sex inference from genotype arrays.
Qian, David C; Busam, Jonathan A; Xiao, Xiangjun; O'Mara, Tracy A; Eeles, Rosalind A; Schumacher, Frederick R; Phelan, Catherine M; Amos, Christopher I
2017-02-15
Checking concordance between reported sex and genotype-inferred sex is a crucial quality control measure in genome-wide association studies (GWAS). However, limited insights exist regarding the true accuracy of software that infers sex from genotype array data. We present seXY, a logistic regression model trained on both X chromosome heterozygosity and Y chromosome missingness, that consistently demonstrated >99.5% sex inference accuracy in cross-validation for 889 males and 5,361 females enrolled in prostate cancer and ovarian cancer GWAS. Compared to PLINK, one of the most popular tools for sex inference in GWAS that assesses only X chromosome heterozygosity, seXY achieved marginally better male classification and 3% more accurate female classification. Availability: https://github.com/Christopher-Amos-Lab/seXY. Contact: Christopher.I.Amos@dartmouth.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
The limb movement analysis of rehabilitation exercises using wearable inertial sensors.
Bingquan Huang; Giggins, Oonagh; Kechadi, Tahar; Caulfield, Brian
2016-08-01
Due to no supervision of a therapist in home based exercise programs, inertial sensor based feedback systems which can accurately assess movement repetitions are urgently required. The synchronicity and the degrees of freedom both show that one movement might resemble another movement signal which is mixed in with another not precisely defined movement. Therefore, the data and feature selections are important for movement analysis. This paper explores the data and feature selection for the limb movement analysis of rehabilitation exercises. The results highlight that the classification accuracy is very sensitive to the mount location of the sensors. The results show that the use of 2 or 3 sensor units, the combination of acceleration and gyroscope data, and the feature sets combined by the statistical feature set with another type of feature, can significantly improve the classification accuracy rates. The results illustrate that acceleration data is more effective than gyroscope data for most of the movement analysis.
Texture classification of lung computed tomography images
NASA Astrophysics Data System (ADS)
Pheng, Hang See; Shamsuddin, Siti M.
2013-03-01
The development of algorithms for computer-aided diagnosis (CAD) schemes is growing rapidly to assist radiologists in medical image interpretation. Texture analysis of computed tomography (CT) scans is an important preliminary stage in computerized detection and classification systems for lung cancer. Among the different types of image feature analysis, Haralick texture features, with a variety of statistical measures, have been widely used in image texture description. The extraction of texture feature values is essential for a CAD system, especially in classifying normal and abnormal tissue on cross-sectional CT images. This paper aims to compare experimental results using texture extraction and different machine learning methods in the classification of normal and abnormal tissues in lung CT images. The machine learning methods involved in this assessment are Artificial Immune Recognition System (AIRS), Naive Bayes, Decision Tree (J48) and Backpropagation Neural Network. AIRS is found to provide high accuracy (99.2%) and sensitivity (98.0%) in the assessment. For experiments and testing purposes, publicly available datasets in the Reference Image Database to Evaluate Therapy Response (RIDER) are used as study cases.
NASA Technical Reports Server (NTRS)
Rignot, Eric; Williams, Cynthia; Way, Jobea; Viereck, Leslie
1993-01-01
A maximum a posteriori Bayesian classifier for multifrequency polarimetric SAR data is used to perform a supervised classification of forest types in the floodplains of Alaska. The image classes include white spruce, balsam poplar, black spruce, alder, non-forests, and open water. The authors investigate the effect on classification accuracy of changing environmental conditions, and of frequency and polarization of the signal. The highest classification accuracy (86 percent correctly classified forest pixels, and 91 percent overall) is obtained combining L- and C-band frequencies fully polarimetric on a date where the forest is just recovering from flooding. The forest map compares favorably with a vegetation map assembled from digitized aerial photos which took five years for completion, and address the state of the forest in 1978, ignoring subsequent fires, changes in the course of the river, clear-cutting of trees, and tree growth. HV-polarization is the most useful polarization at L- and C-band for classification. C-band VV (ERS-1 mode) and L-band HH (J-ERS-1 mode) alone or combined yield unsatisfactory classification accuracies. Additional data acquired in the winter season during thawed and frozen days yield classification accuracies respectively 20 percent and 30 percent lower due to a greater confusion between conifers and deciduous trees. Data acquired at the peak of flooding in May 1991 also yield classification accuracies 10 percent lower because of dominant trunk-ground interactions which mask out finer differences in radar backscatter between tree species. Combination of several of these dates does not improve classification accuracy. For comparison, panchromatic optical data acquired by SPOT in the summer season of 1991 are used to classify the same area. 
The classification accuracy (78 percent for the forest types and 90 percent if open water is included) is lower than that obtained with AIRSAR although conifers and deciduous trees are better separated due to the presence of leaves on the deciduous trees. Optical data do not separate black spruce and white spruce as well as SAR data, cannot separate alder from balsam poplar, and are of course limited by the frequent cloud cover in the polar regions. Yet, combining SPOT and AIRSAR offers better chances to identify vegetation types independent of ground truth information using a combination of NDVI indexes from SPOT, biomass numbers from AIRSAR, and a segmentation map from either one.
NASA Technical Reports Server (NTRS)
Spruce, Joseph P.; Ross, Kenton W.; Graham, William D.
2006-01-01
Hurricane Katrina inflicted widespread damage to vegetation in southwestern coastal Mississippi upon landfall on August 29, 2005. Storm damage to surface vegetation types at the NASA John C. Stennis Space Center (SSC) was mapped and quantified using IKONOS data originally acquired on September 2, 2005, and later obtained via a Department of Defense ClearView contract. NASA SSC management required an assessment of the hurricane's impact on the 125,000-acre buffer zone used to mitigate rocket engine testing noise and vibration impacts and to manage forestry and fire risk. This study employed ERDAS IMAGINE software to apply traditional classification techniques to the IKONOS data. Spectral signatures were collected from multiple ISODATA classifications of subset areas across the entire region and then appended to a master file representative of major targeted cover type conditions. The master file was subsequently used with the IKONOS data and with a maximum likelihood algorithm to produce a supervised classification later refined using GIS-based editing. The final results enabled mapped, quantitative areal estimates of hurricane-induced damage according to general surface cover type. The IKONOS classification accuracy was assessed using higher resolution aerial imagery and field survey data. In-situ data and GIS analysis indicate that the results compare well to FEMA maps of flooding extent. The IKONOS classification also mapped open areas with woody storm debris. The detection of such storm damage categories is potentially useful for government officials responsible for hurricane disaster mitigation.
Efficient alignment-free DNA barcode analytics.
Kuksa, Pavel; Pavlovic, Vladimir
2009-11-10
In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens possibility for accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.) On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding.
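The fixed-length spectrum representation mentioned in this abstract can be illustrated with a short sketch. It counts overlapping k-mers in each sequence and compares two sequences without alignment. The choice of cosine similarity, the value k = 3 and the toy sequences are assumptions made here for illustration; the abstract does not specify the similarity measure or parameters the authors used.

```python
from collections import Counter
from math import sqrt


def kmer_spectrum(seq, k):
    """Count every overlapping substring of length k in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count vectors (illustrative choice)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy sequences, not real barcodes.
reference = kmer_spectrum("ACGTACGTGACC", 3)
near_copy = kmer_spectrum("ACGTACGTGACT", 3)
unrelated = kmer_spectrum("TTTTTTTTTTTT", 3)

print(cosine_similarity(reference, near_copy))
print(cosine_similarity(reference, unrelated))
```

A nearest-neighbour rule over such similarities is one simple way to assign a query barcode to a species, although the classifiers evaluated in the paper may differ.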
Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia
2016-01-01
This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer’s and user’s accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas. PMID:27879661
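Producer's and user's accuracies, as reported for the bamboo class, are both read from a confusion matrix. The sketch below assumes rows hold the mapped (classified) class and columns hold the reference class; that layout and the counts are hypothetical and are not taken from the study.

```python
def users_accuracy(matrix, k):
    """Share of samples mapped as class k that truly are class k (row-wise)."""
    row_total = sum(matrix[k])
    return matrix[k][k] / row_total if row_total else float("nan")


def producers_accuracy(matrix, k):
    """Share of reference samples of class k that were mapped as k (column-wise)."""
    col_total = sum(row[k] for row in matrix)
    return matrix[k][k] / col_total if col_total else float("nan")


def overall_accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical counts: rows = mapped class, columns = reference class.
# Class 0 = bamboo, class 1 = everything else.
confusion = [
    [40, 10],
    [5, 45],
]

print(f"user's accuracy (bamboo):     {users_accuracy(confusion, 0):.3f}")
print(f"producer's accuracy (bamboo): {producers_accuracy(confusion, 0):.3f}")
print(f"overall accuracy:             {overall_accuracy(confusion):.3f}")
```

User's accuracy reflects commission error (how far a mapped bamboo pixel can be trusted), while producer's accuracy reflects omission error (how much of the real bamboo was found).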
Vujaklija, Ivan; Roche, Aidan D; Hasenoehrl, Timothy; Sturma, Agnes; Amsuess, Sebastian; Farina, Dario; Aszmann, Oskar C
2017-01-01
Missing an upper limb dramatically impairs daily-life activities. Efforts in overcoming the issues arising from this disability have been made in both academia and industry, although their clinical outcome is still limited. Translation of prosthetic research into clinics has been challenging because of the difficulties in meeting the necessary requirements of the market. In this perspective article, we suggest that one relevant factor determining the relatively small clinical impact of myocontrol algorithms for upper limb prostheses is the limit of commonly used laboratory performance metrics. The laboratory conditions, in which the majority of the solutions are being evaluated, fail to sufficiently replicate real-life challenges. We qualitatively support this argument with representative data from seven transradial amputees. Their ability to control a myoelectric prosthesis was tested by measuring the accuracy of offline EMG signal classification, as a typical laboratory performance metrics, as well as by clinical scores when performing standard tests of daily living. Despite all subjects reaching relatively high classification accuracy offline, their clinical scores varied greatly and were not strongly predicted by classification accuracy. We therefore support the suggestion to test myocontrol systems using clinical tests on amputees, fully fitted with sockets and prostheses highly resembling the systems they would use in daily living, as evaluation benchmark. Agreement on this level of testing for systems developed in research laboratories would facilitate clinically relevant progresses in this field.
The Effect of Normalization in Violence Video Classification Performance
NASA Astrophysics Data System (ADS)
Ali, Ashikin; Senan, Norhalina
2017-08-01
Data pre-processing is an important part of data mining. Normalization is a pre-processing stage for any type of problem, especially in video classification. Video classification is challenging because of heterogeneous content, large variations in video quality and the complex semantic meanings of the concepts involved. A thorough pre-processing stage that includes normalization therefore aids the robustness of classification performance. Normalization scales all numeric variables into a certain range to make them more meaningful for later phases of the available data mining techniques. This paper examines the effect of two normalization techniques, Min-max normalization and Z-score, on the classification rate in violence video classification using a Multi-layer perceptron (MLP) classifier. With Min-max normalization to the range [0,1], the results show almost 98% accuracy; with Min-max normalization to the range [-1,1], accuracy is 59%; and with Z-score, accuracy is 50%.
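The two normalization techniques compared in this abstract can be sketched in a few lines. This is a generic illustration using the standard definitions; the feature values are hypothetical, and the use of the population standard deviation for the Z-score is an assumption, since the abstract does not say which variant was used.

```python
from statistics import mean, pstdev


def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]  # constant feature: no spread to rescale
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def z_score(values):
    """Centre values on their mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


feature = [2, 4, 6]  # hypothetical raw feature values

print(min_max_scale(feature))              # range [0, 1]
print(min_max_scale(feature, -1.0, 1.0))   # range [-1, 1]
print(z_score(feature))
```

All three transforms are linear in the input, so any difference in classifier accuracy comes from how the resulting ranges interact with the classifier, for example with activation functions and weight initialization in an MLP.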
NASA Astrophysics Data System (ADS)
Gajda, Agnieszka; Wójtowicz-Nowakowska, Anna
2013-04-01
A comparison of the accuracy of pixel based and object based classifications of integrated optical and LiDAR data
Land cover maps are generally produced on the basis of high resolution imagery. Recently, LiDAR (Light Detection and Ranging) data have been brought into use in diverse applications including land cover mapping. In this study we attempted to assess the accuracy of land cover classification using both high resolution aerial imagery and LiDAR data (airborne laser scanning, ALS), testing two classification approaches: a pixel-based classification and object-oriented image analysis (OBIA). The study was conducted on three test areas (3 km² each) in the administrative area of Kraków, Poland, along the course of the Vistula River. They represent three different dominating land cover types of the Vistula River valley. Test site 1 had semi-natural vegetation, with riparian forests and shrubs; test site 2 represented a densely built-up area; and test site 3 was an industrial site. Point clouds from ALS and orthophotomaps were both captured in November 2007. Point cloud density was on average 16 pt/m², and the point clouds contained additional information about intensity and encoded RGB values. Orthophotomaps had a spatial resolution of 10 cm. From the point clouds two raster maps were generated: (1) intensity and (2) normalised Digital Surface Model (nDSM), both with a spatial resolution of 50 cm. To classify the aerial data, a supervised classification approach was selected. Pixel-based classification was carried out in ERDAS Imagine software. Orthophotomaps and the intensity and nDSM rasters were used in classification. 15 homogeneous training areas representing each cover class were chosen. Classified pixels were clumped to avoid the salt-and-pepper effect. Object-oriented image classification was carried out in eCognition software, which uses both the optical and ALS data. Elevation layers (intensity, first/last reflection, etc.) were used at the segmentation stage with appropriate weights. Thus more precise and unambiguous boundaries of segments (objects) were obtained. As a result of the classification, 5 classes of land cover (buildings, water, high and low vegetation and others) were extracted. Both pixel-based image analysis and OBIA were conducted with a minimum mapping unit of 10 m². Results were validated on the basis of manual classification and random points (80 per test area); the reference data set was manually interpreted using orthophotomaps and expert knowledge of the test site areas.
Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No
2015-11-01
One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Anitha, J.; Vijila, C. Kezi Selva; Hemanth, D. Jude
2010-02-01
Diabetic retinopathy (DR) is a chronic eye disease for which early detection is highly essential to avoid any fatal results. Image processing of retinal images emerges as a feasible tool for this early diagnosis. Digital image processing techniques involve image classification, which is a significant technique for detecting abnormality in the eye. Various automated classification systems have been developed in recent years, but most of them lack high classification accuracy. Artificial neural networks are a widely preferred artificial intelligence technique since they yield superior results in terms of classification accuracy. In this work, a Radial Basis Function (RBF) neural network based bi-level classification system is proposed to differentiate abnormal DR images and normal retinal images. The results are analyzed in terms of classification accuracy, sensitivity and specificity. A comparative analysis is performed with the results of a probabilistic classifier, namely the Bayesian classifier, to show the superior nature of the neural classifier. Experimental results are promising for the neural classifier in terms of the performance measures.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Protein classification based on text document classification techniques.
Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith
2005-03-01
The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.
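The "reduction in residual error" figures quoted in this abstract follow directly from the reported accuracies. A minimal sketch of the arithmetic, using only numbers stated above (the function name is an illustrative choice):

```python
def residual_error_reduction(baseline_accuracy, new_accuracy):
    """Percentage by which the error rate shrinks relative to the baseline.

    Both arguments are accuracies in percent; the residual error is 100 - accuracy.
    """
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return 100.0 * (baseline_error - new_error) / baseline_error


# SVM (baseline) vs. Naive Bayes accuracies from the abstract.
level_1 = residual_error_reduction(88.4, 93.0)
level_2 = residual_error_reduction(86.3, 92.4)

print(f"level I:  {level_1:.1f}% reduction in residual error")
print(f"level II: {level_2:.1f}% reduction in residual error")
```

For level I, the error falls from 11.6% to 7.0%, and 4.6 / 11.6 is about 39.7%. For level II, the error falls from 13.7% to 7.6%, and 6.1 / 13.7 is about 44.5%. Both match the figures in the abstract.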
Zhe Fan; Zhong Wang; Guanglin Li; Ruomei Wang
2016-08-01
Motion classification systems based on surface electromyography (sEMG) pattern recognition have achieved good results under experimental conditions, but clinical implementation and practical application remain a challenge. Many factors contribute to the difficulty of clinical use of EMG-based dexterous control. The most obvious and important is noise in the EMG signal caused by electrode shift, muscle fatigue, motion artifact, inherent instability of the signal and biological signals such as the electrocardiogram. In this paper, a novel method based on Canonical Correlation Analysis (CCA) was developed to eliminate the reduction in classification accuracy caused by electrode shift. The average classification accuracy of our method was above 95% for the healthy subjects. In the process, we validated the influence of electrode shift on motion classification accuracy and discovered a strong correlation, with a correlation coefficient of >0.9, between shift position data and normal position data.
Rifai Chai; Naik, Ganesh R; Tran, Yvonne; Sai Ho Ling; Craig, Ashley; Nguyen, Hung T
2015-08-01
An electroencephalography (EEG)-based counter measure device could be used for fatigue detection during driving. This paper explores the classification of fatigue and alert states using power spectral density (PSD) as a feature extractor and fuzzy swarm based-artificial neural network (ANN) as a classifier. An independent component analysis of entropy rate bound minimization (ICA-ERBM) is investigated as a novel source separation technique for fatigue classification using EEG analysis. A comparison of the classification accuracy of source separator versus no source separator is presented. Classification performance based on 43 participants without the inclusion of the source separator resulted in an overall sensitivity of 71.67%, a specificity of 75.63% and an accuracy of 73.65%. However, these results were improved after the inclusion of a source separator module, resulting in an overall sensitivity of 78.16%, a specificity of 79.60% and an accuracy of 78.88% (p < 0.05).
Boursier, Jérôme; Bertrais, Sandrine; Oberti, Frédéric; Gallois, Yves; Fouchard-Hubert, Isabelle; Rousselet, Marie-Christine; Zarski, Jean-Pierre; Calès, Paul
2011-11-30
Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10⁻³) in single expert pathologist. Significant discrepancy (≥ 2 FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10⁻³). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10⁻³) or Fibrotest (0.84 ± 0.80, p < 10⁻³). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10⁻³ (p < 10⁻³).
Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10⁻³ (p < 10⁻³). The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test.
2011-01-01
Background Non-invasive tests have been constructed and evaluated mainly for binary diagnoses such as significant fibrosis. Recently, detailed fibrosis classifications for several non-invasive tests have been developed, but their accuracy has not been thoroughly evaluated in comparison to liver biopsy, especially in clinical practice and for Fibroscan. Therefore, the main aim of the present study was to evaluate the accuracy of detailed fibrosis classifications available for non-invasive tests and liver biopsy. The secondary aim was to validate these accuracies in independent populations. Methods Four HCV populations provided 2,068 patients with liver biopsy, four different pathologist skill-levels and non-invasive tests. Results were expressed as percentages of correctly classified patients. Results In population #1 including 205 patients and comparing liver biopsy (reference: consensus reading by two experts) and blood tests, Metavir fibrosis (FM) stage accuracy was 64.4% in local pathologists vs. 82.2% (p < 10⁻³) in single expert pathologist. Significant discrepancy (≥ 2 FM vs reference histological result) rates were: Fibrotest: 17.2%, FibroMeter2G: 5.6%, local pathologists: 4.9%, FibroMeter3G: 0.5%, expert pathologist: 0% (p < 10⁻³). In population #2 including 1,056 patients and comparing blood tests, the discrepancy scores, taking into account the error magnitude, of detailed fibrosis classification were significantly different between FibroMeter2G (0.30 ± 0.55) and FibroMeter3G (0.14 ± 0.37, p < 10⁻³) or Fibrotest (0.84 ± 0.80, p < 10⁻³). In population #3 (and #4) including 458 (359) patients and comparing blood tests and Fibroscan, accuracies of detailed fibrosis classification were, respectively: Fibrotest: 42.5% (33.5%), Fibroscan: 64.9% (50.7%), FibroMeter2G: 68.7% (68.2%), FibroMeter3G: 77.1% (83.4%), p < 10⁻³ (p < 10⁻³).
Significant discrepancy (≥ 2 FM) rates were, respectively: Fibrotest: 21.3% (22.2%), Fibroscan: 12.9% (12.3%), FibroMeter2G: 5.7% (6.0%), FibroMeter3G: 0.9% (0.9%), p < 10⁻³ (p < 10⁻³). Conclusions The accuracy in detailed fibrosis classification of the best-performing blood test outperforms liver biopsy read by a local pathologist, i.e., in clinical practice; however, the classification precision is apparently lesser. This detailed classification accuracy is much lower than that of significant fibrosis with Fibroscan and even Fibrotest but higher with FibroMeter3G. FibroMeter classification accuracy was significantly higher than those of other non-invasive tests. Finally, for hepatitis C evaluation in clinical practice, fibrosis degree can be evaluated using an accurate blood test. PMID:22129438
Evaluation of space SAR as a land-cover classification
NASA Technical Reports Server (NTRS)
Brisco, B.; Ulaby, F. T.; Williams, T. H. L.
1985-01-01
The multidimensional approach to the mapping of land cover, crops, and forests is reported. Dimensionality is achieved by using data from sensors such as LANDSAT to augment Seasat and Shuttle Imaging Radar (SIR) data, using different image features such as tone and texture, and acquiring multidate data. Seasat, Shuttle Imaging Radar (SIR-A), and LANDSAT data are used both individually and in combination to map land cover in Oklahoma. The results indicate that radar is the best single sensor (72% accuracy) and produces the best sensor combination (97.5% accuracy) for discriminating among five land cover categories. Multidate Seasat data and a single date of LANDSAT coverage are then used in a crop classification study of western Kansas. The highest accuracy for a single channel is achieved using a Seasat scene, which produces a classification accuracy of 67%. Classification accuracy increases to approximately 75% when either a multidate Seasat combination or LANDSAT data in a multisensor combination is used. The tonal and textural elements of SIR-A data are then used both alone and in combination to classify forests into five categories.
Comparative Analysis of Haar and Daubechies Wavelet for Hyper Spectral Image Classification
NASA Astrophysics Data System (ADS)
Sharif, I.; Khare, S.
2014-11-01
With the number of channels in the hundreds instead of in the tens, Hyper spectral imagery possesses much richer spectral information than multispectral imagery. The increased dimensionality of such Hyper spectral data provides a challenge to the current techniques for analyzing data. Conventional classification methods may not be useful without dimension reduction pre-processing, so dimension reduction has become a significant part of Hyper spectral image processing. This paper presents a comparative analysis of the efficacy of Haar and Daubechies wavelets for dimensionality reduction in achieving image classification. Spectral data reduction using Wavelet Decomposition could be useful because it preserves the distinction among spectral signatures. Daubechies wavelets optimally capture the polynomial trends, while the Haar wavelet is discontinuous and resembles a step function. The performance of these wavelets is compared in terms of classification accuracy and time complexity. This paper shows that wavelet reduction yields more separable classes and better or comparable classification accuracy. In the context of the dimensionality reduction algorithm, it is found that the classification performance of Daubechies wavelets is better than that of the Haar wavelet, while Daubechies takes more time compared to the Haar wavelet. The experimental results demonstrate that the classification system consistently provides over 84% classification accuracy.
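To illustrate the kind of wavelet-based dimension reduction described in this abstract, here is a minimal sketch that keeps only the Haar approximation coefficients of a pixel spectrum, halving the number of bands at each level. The abstract does not give the decomposition depth, the padding rule or the data, so `haar_reduce`, the three levels and the 200-band spectrum are all illustrative assumptions.

```python
import math

def haar_reduce(spectrum, levels):
    """Return the Haar approximation coefficients after `levels`
    decompositions; each level halves the number of values."""
    approx = [float(v) for v in spectrum]
    for _ in range(levels):
        if len(approx) % 2:                 # assumption: pad odd lengths by
            approx.append(approx[-1])       # repeating the last value
        approx = [(a + b) / math.sqrt(2.0)  # orthonormal Haar low-pass step
                  for a, b in zip(approx[0::2], approx[1::2])]
    return approx

# Hypothetical 200-band spectrum for one pixel.
spectrum = [0.1 + 0.004 * i for i in range(200)]
reduced = haar_reduce(spectrum, levels=3)
print(len(spectrum), "->", len(reduced))
```

A Daubechies wavelet would use a longer low-pass filter in place of the two-tap average, which is consistent with the abstract's remark that it costs more time than Haar.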
ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.
Rosenfield, George H.; Fitzpatrick-Lins, Katherine
1984-01-01
Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the counts corrected, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0.51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.
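The coefficient of agreement for a whole map and the conditional coefficient for a single category can be sketched as follows. This assumes the commonly used Cohen's kappa and a row-conditional kappa; the abstract does not spell out its formulas, and the 2 x 2 error matrix (rows taken as map classes, columns as reference classes) is hypothetical.

```python
def kappa(matrix):
    """Cohen's kappa for a square error (confusion) matrix of counts."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(matrix[i]) for i in range(k)]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    p_o = sum(matrix[i][i] for i in range(k)) / n             # observed agreement
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)  # chance agreement
    return (p_o - p_e) / (1.0 - p_e)

def conditional_kappa(matrix, i):
    """Kappa conditioned on row category i (one common definition)."""
    n = sum(sum(row) for row in matrix)
    p_ii = matrix[i][i] / n
    p_row = sum(matrix[i]) / n
    p_col = sum(row[i] for row in matrix) / n
    return (p_ii - p_row * p_col) / (p_row - p_row * p_col)

# Hypothetical counts, not data from the paper.
error_matrix = [[20, 5],
                [10, 15]]
print(kappa(error_matrix))
print(conditional_kappa(error_matrix, 0), conditional_kappa(error_matrix, 1))
```

Unlike the total percent correct, both coefficients use the off-diagonal counts through the row and column totals.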
De Nunzio, Cosimo; Pastore, Antonio Luigi; Lombardo, Riccardo; Simone, Giuseppe; Leonardo, Costantino; Mastroianni, Riccardo; Collura, Devis; Muto, Giovanni; Gallucci, Michele; Carbone, Antonio; Fuschi, Andrea; Dutto, Lorenzo; Witt, Joern Heinrich; De Dominicis, Carlo; Tubaro, Andrea
2018-06-01
To evaluate the differences between the old and the new Gleason score classification systems in upgrading and downgrading rates. Between 2012 and 2015, we identified 9703 patients treated with retropubic radical prostatectomy (RP) in four tertiary centers. Biopsy specimens as well as radical prostatectomy specimens were graded according to both 2005 Gleason and 2014 ISUP five-tier Gleason grading system (five-tier GG system). Upgrading and downgrading rates on radical prostatectomy were first recorded for both classifications and then compared. The accuracy of the biopsy for each histological classification was determined by using the kappa coefficient of agreement and by assessing sensitivity, specificity, positive and negative predictive value. The five-tier GG system presented a lower clinically significant upgrading rate (1895/9703: 19.5% vs 2332/9703: 24.0%; p = .001) and a similar clinically significant downgrading rate (756/9703: 7.7% vs 779/9703: 8%; p = .267) when compared to the 2005 ISUP classification. When evaluating their accuracy, the new five-tier GG system presented a better specificity (91% vs 83%) and a better negative predictive value (78% vs 60%). The kappa-statistics measures of agreement between needle biopsy and radical prostatectomy specimens were poor and good respectively for the five-tier GG system and for the 2005 Gleason score (k = 0.360 ± 0.007 vs k = 0.426 ± 0.007). The new Epstein classification significantly reduces upgrading events. The implementation of this new classification could better define prostate cancer aggressiveness with important clinical implications, particularly in prostate cancer management. Copyright © 2018 Elsevier Ltd, BASO ~ The Association for Cancer Surgery, and the European Society of Surgical Oncology. All rights reserved.
Research on Remote Sensing Image Classification Based on Feature Level Fusion
NASA Astrophysics Data System (ADS)
Yuan, L.; Zhu, G.
2018-04-01
Remote sensing image classification, as an important direction of remote sensing image processing and application, has been widely studied. However, existing classification algorithms still suffer from misclassification and missing points, so the final classification accuracy is not high. In this paper, we selected Sentinel-1A and Landsat8 OLI images as data sources and propose a classification method based on feature level fusion. We compare three kinds of feature level fusion algorithms (i.e., Gram-Schmidt spectral sharpening, Principal Component Analysis transform and Brovey transform), and then select the best fused image for the classification experiment. In the classification process, we choose four kinds of image classification algorithms (i.e., Minimum distance, Mahalanobis distance, Support Vector Machine and ISODATA) for a comparison experiment. We use overall classification precision and the Kappa coefficient as the classification accuracy evaluation criteria, and the four classification results of the fused image are analysed. The experimental results show that the fusion effect of Gram-Schmidt spectral sharpening is better than that of the other methods. Among the four classification algorithms, the fused image is best suited to Support Vector Machine classification: the overall classification precision is 94.01% and the Kappa coefficient is 0.91. The fused image of Sentinel-1A and Landsat8 OLI not only has more spatial information and spectral texture characteristics, but also enhances the distinguishing features of the images. The proposed method is beneficial for improving the accuracy and stability of remote sensing image classification.
Thematic accuracy assessment of the 2011 National Land Cover Database (NLCD)
Wickham, James; Stehman, Stephen V.; Gass, Leila; Dewitz, Jon; Sorenson, Daniel G.; Granneman, Brian J.; Poss, Richard V.; Baer, Lori Anne
2017-01-01
Accuracy assessment is a standard protocol of National Land Cover Database (NLCD) mapping. Here we report agreement statistics between map and reference labels for NLCD 2011, which includes land cover for ca. 2001, ca. 2006, and ca. 2011. The two main objectives were assessment of agreement between map and reference labels for the three, single-date NLCD land cover products at Level II and Level I of the classification hierarchy, and agreement for 17 land cover change reporting themes based on Level I classes (e.g., forest loss; forest gain; forest, no change) for three change periods (2001–2006, 2006–2011, and 2001–2011). The single-date overall accuracies were 82%, 83%, and 83% at Level II and 88%, 89%, and 89% at Level I for 2011, 2006, and 2001, respectively. Many class-specific user's accuracies met or exceeded a previously established nominal accuracy benchmark of 85%. Overall accuracies for 2006 and 2001 land cover components of NLCD 2011 were approximately 4% higher (at Level II and Level I) than the overall accuracies for the same components of NLCD 2006. The high Level I overall, user's, and producer's accuracies for the single-date eras in NLCD 2011 did not translate into high class-specific user's and producer's accuracies for many of the 17 change reporting themes. User's accuracies were high for the no change reporting themes, commonly exceeding 85%, but were typically much lower for the reporting themes that represented change. Only forest loss, forest gain, and urban gain had user's accuracies that exceeded 70%. Lower user's accuracies for the other change reporting themes may be attributable to the difficulty in determining the context of grass (e.g., open urban, grassland, agriculture) and between the components of the forest-shrubland-grassland gradient at either the mapping phase, reference label assignment phase, or both. 
NLCD 2011 user's accuracies for forest loss, forest gain, and urban gain compare favorably with results from other land cover change accuracy assessments.
Shrivastava, Vimal K; Londhe, Narendra D; Sonawane, Rajendra S; Suri, Jasjit S
2015-10-01
A large percentage of a dermatologist's decision in psoriasis disease assessment is based on color. The current computer-aided diagnosis systems for psoriasis risk stratification and classification lack the vigor of a color paradigm. The paper presents an automated psoriasis computer-aided diagnosis (pCAD) system for classification of psoriasis skin images into psoriatic lesion and healthy skin, which solves the two major challenges: (i) fulfills the color feature requirements and (ii) selects the powerful dominant color features while retaining high classification accuracy. Fourteen color spaces are discovered for psoriasis disease analysis leading to 86 color features. The pCAD system is implemented in a support vector-based machine learning framework where the offline image data set is used for computing the offline color machine learning parameters. These are then used for transformation of the online color features to predict the class labels for healthy vs. diseased cases. The above paradigm uses principal component analysis for color feature selection of dominant features, keeping the original color features unaltered. Using the cross-validation protocol, the above machine learning protocol is compared against the standalone grayscale features with 60 features and against the combined grayscale and color feature set of 146. Using a fixed data size of 540 images with an equal number of healthy and diseased, a 10-fold cross-validation protocol, and an SVM with a polynomial kernel of type two, the pCAD system shows an accuracy of 99.94% with sensitivity and specificity of 99.93% and 99.96%. Using a varying data size protocol, the mean classification accuracies for color, grayscale, and combined scenarios are: 92.85%, 93.83% and 93.99%, respectively. The reliabilities of the system in these three scenarios are: 94.42%, 97.39% and 96.00%, respectively.
We conclude that the pCAD system using color space alone is comparable to grayscale space or combined color and grayscale spaces. We validated our pCAD system against facial color databases and the results are consistent in accuracy and reliability. Copyright © 2015 Elsevier Ltd. All rights reserved.
High-Throughput Classification of Radiographs Using Deep Convolutional Neural Networks.
Rajkomar, Alvin; Lingam, Sneha; Taylor, Andrew G; Blum, Michael; Mongan, John
2017-02-01
The study aimed to determine if computer vision techniques rooted in deep learning can use a small set of radiographs to perform clinically relevant image classification with high fidelity. One thousand eight hundred eighty-five chest radiographs on 909 patients obtained between January 2013 and July 2015 at our institution were retrieved and anonymized. The source images were manually annotated as frontal or lateral and randomly divided into training, validation, and test sets. Training and validation sets were augmented to over 150,000 images using standard image manipulations. We then pre-trained a series of deep convolutional networks based on the open-source GoogLeNet with various transformations of the open-source ImageNet (non-radiology) images. These trained networks were then fine-tuned using the original and augmented radiology images. The model with highest validation accuracy was applied to our institutional test set and a publicly available set. Accuracy was assessed by using the Youden Index to set a binary cutoff for frontal or lateral classification. This retrospective study was IRB approved prior to initiation. A network pre-trained on 1.2 million greyscale ImageNet images and fine-tuned on augmented radiographs was chosen. The binary classification method correctly classified 100 % (95 % CI 99.73-100 %) of both our test set and the publicly available images. Classification was rapid, at 38 images per second. A deep convolutional neural network created using non-radiological images, and an augmented set of radiographs is effective in highly accurate classification of chest radiograph view type and is a feasible, rapid method for high-throughput annotation.
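The abstract states that the Youden Index was used to set a binary cutoff but does not describe the procedure. The sketch below shows one standard way to do this, assuming the cutoff is the score threshold that maximizes J = sensitivity + specificity - 1; the function name, scores and labels are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.
    A sample is predicted positive when its score is >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j

# Hypothetical network outputs (probability of "frontal") and true labels.
scores = [0.05, 0.20, 0.35, 0.60, 0.80, 0.95]
labels = [0, 0, 0, 1, 1, 1]
threshold, j = youden_cutoff(scores, labels)
print(threshold, j)
```

In this toy data the two classes are perfectly separable, so J reaches its maximum of 1 at the lowest positive score.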
NASA Astrophysics Data System (ADS)
Mücher, C. A.; Roupioz, L.; Kramer, H.; Bogers, M. M. B.; Jongman, R. H. G.; Lucas, R. M.; Kosmidou, V. E.; Petrou, Z.; Manakos, I.; Padoa-Schioppa, E.; Adamo, M.; Blonda, P.
2015-05-01
A major challenge is to develop a biodiversity observation system that is cost effective and applicable in any geographic region. Measuring and reliable reporting of trends and changes in biodiversity requires amongst others detailed and accurate land cover and habitat maps in a standard and comparable way. The objective of this paper is to assess the EODHaM (EO Data for Habitat Mapping) classification results for a Dutch case study. The EODHaM system was developed within the BIO_SOS (The BIOdiversity multi-SOurce monitoring System: from Space TO Species) project and contains the decision rules for each land cover and habitat class based on spectral and height information. One of the main findings is that canopy height models, as derived from LiDAR, in combination with very high resolution satellite imagery provides a powerful input for the EODHaM system for the purpose of generic land cover and habitat mapping for any location across the globe. The assessment of the EODHaM classification results based on field data showed an overall accuracy of 74% for the land cover classes as described according to the Food and Agricultural Organization (FAO) Land Cover Classification System (LCCS) taxonomy at level 3, while the overall accuracy was lower (69.0%) for the habitat map based on the General Habitat Category (GHC) system for habitat surveillance and monitoring. A GHC habitat class is determined for each mapping unit on the basis of the composition of the individual life forms and height measurements. The classification showed very good results for forest phanerophytes (FPH) when individual life forms were analyzed in terms of their percentage coverage estimates per mapping unit from the LCCS classification and validated with field surveys. Analysis for shrubby chamaephytes (SCH) showed less accurate results, but might also be due to less accurate field estimates of percentage coverage. 
Overall, the EODHaM classification results encouraged us to derive the heights of all vegetated objects in the Netherlands from LiDAR data, in preparation for new habitat classifications.
NASA Astrophysics Data System (ADS)
Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.
2013-05-01
Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. 
The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
AVHRR composite period selection for land cover classification
Maxwell, S.K.; Hoffer, R.M.; Chapman, P.L.
2002-01-01
Multitemporal satellite image datasets provide valuable information on the phenological characteristics of vegetation, thereby significantly increasing the accuracy of cover type classifications compared to single date classifications. However, the processing of these datasets can become very complex when dealing with multitemporal data combined with multispectral data. Advanced Very High Resolution Radiometer (AVHRR) biweekly composite data are commonly used to classify land cover over large regions. Selecting a subset of these biweekly composite periods may be required to reduce the complexity and cost of land cover mapping. The objective of our research was to evaluate the effect of reducing the number of composite periods and altering the spacing of those composite periods on classification accuracy. Because inter-annual variability can have a major impact on classification results, 5 years of AVHRR data were evaluated. AVHRR biweekly composite images for spectral channels 1-4 (visible, near-infrared and two thermal bands) covering the entire growing season were used to classify 14 cover types over the entire state of Colorado for each of five different years. A supervised classification method was applied to maintain consistent procedures for each case tested. Results indicate that the number of composite periods can be halved (reduced from 14 composite dates to seven composite dates) without significantly reducing overall classification accuracy (80.4% Kappa accuracy for the 14-composite dataset as compared to 80.0% for a seven-composite dataset). At least seven composite periods were required to ensure the classification accuracy was not affected by inter-annual variability due to climate fluctuations.
Concentrating more composites near the beginning and end of the growing season, as compared to using evenly spaced time periods, consistently produced slightly higher classification values over the 5 years tested (average Kappa of 80.3% for the heavy early/late case as compared to 79.0% for the alternate dataset case).
Sleep state classification using pressure sensor mats.
Baran Pouyan, M; Nourani, M; Pompeo, M
2015-08-01
Sleep state detection is valuable in assessing patient's sleep quality and in-bed general behavior. In this paper, a novel classification approach of sleep states (sleep, pre-wake, wake) is proposed that uses only surface pressure sensors. In our method, a mobility metric is defined based on successive pressure body maps. Then, suitable statistical features are computed based on the mobility metric. Finally, a customized random forest classifier is employed to identify various classes including a new class for pre-wake state. Our algorithm achieves 96.1% and 88% accuracies for two (sleep, wake) and three (sleep, pre-wake, wake) class identification, respectively.
Assessment of various supervised learning algorithms using different performance metrics
NASA Astrophysics Data System (ADS)
Susheel Kumar, S. M.; Laxkar, Deepak; Adhikari, Sourav; Vijayarajan, V.
2017-11-01
Our work compares the performance of supervised machine learning algorithms on a binary classification task. The supervised machine learning algorithms considered are Support Vector Machine (SVM), Decision Tree (DT), K Nearest Neighbour (KNN), Naïve Bayes (NB) and Random Forest (RF). This paper mostly focuses on comparing the performance of these algorithms on one binary classification task by analysing metrics such as Accuracy, F-Measure, G-Measure, Precision, Misclassification Rate, False Positive Rate, True Positive Rate, Specificity and Prevalence.
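All of the metrics listed in this abstract can be derived from the four cells of a binary confusion matrix. The following sketch uses the usual textbook definitions; the G-Measure is taken here as the geometric mean of precision and recall, which is one common definition and may differ from the one used in the paper. The counts are hypothetical.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Common binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)                       # true positive rate (recall)
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "precision": precision,
        "true_positive_rate": tpr,
        "false_positive_rate": fp / (fp + tn),
        "specificity": tn / (fp + tn),
        "prevalence": (tp + fn) / total,
        "f_measure": 2 * precision * tpr / (precision + tpr),
        "g_measure": math.sqrt(precision * tpr),   # assumed definition
    }

# Hypothetical counts for one classifier on 100 test samples.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting several of these together matters when classes are imbalanced, since accuracy alone can hide a poor true positive rate.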
Hao, Pengyu; Wang, Li; Niu, Zheng
2015-01-01
A range of single classifiers have been proposed to classify crop types using time series vegetation indices, and hybrid classifiers are used to improve discriminatory power. Traditional fusion rules use the product of multi-single classifiers, but that strategy cannot integrate the classification output of machine learning classifiers. In this research, the performance of two hybrid strategies, multiple voting (M-voting) and probabilistic fusion (P-fusion), for crop classification using NDVI time series were tested with different training sample sizes at both pixel and object levels, and two representative counties in north Xinjiang were selected as study area. The single classifiers employed in this research included Random Forest (RF), Support Vector Machine (SVM), and See 5 (C 5.0). The results indicated that classification performance improved (increased the mean overall accuracy by 5%~10%, and reduced standard deviation of overall accuracy by around 1%) substantially with the training sample number, and when the training sample size was small (50 or 100 training samples), hybrid classifiers substantially outperformed single classifiers with higher mean overall accuracy (1%~2%). However, when abundant training samples (4,000) were employed, single classifiers could achieve good classification accuracy, and all classifiers obtained similar performances. Additionally, although object-based classification did not improve accuracy, it resulted in greater visual appeal, especially in study areas with a heterogeneous cropping pattern. PMID:26360597
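The two hybrid strategies named in this abstract can be contrasted with a small sketch. The abstract does not define M-voting or P-fusion precisely, so this assumes that M-voting takes the majority class label across classifiers and that P-fusion averages the per-class probabilities and picks the largest; the crop names and probabilities are hypothetical.

```python
from collections import Counter

def majority_vote(labels):
    """Assumed M-voting: the class predicted by most single classifiers."""
    return Counter(labels).most_common(1)[0][0]

def probability_fusion(prob_vectors):
    """Assumed P-fusion: average class probabilities, then take the argmax."""
    n = len(prob_vectors)
    classes = list(prob_vectors[0].keys())
    mean = {c: sum(p[c] for p in prob_vectors) / n for c in classes}
    return max(mean, key=mean.get)

# Hypothetical per-pixel outputs of three single classifiers (RF, SVM, C5.0).
outputs = [
    {"cotton": 0.55, "maize": 0.45},
    {"cotton": 0.52, "maize": 0.48},
    {"cotton": 0.05, "maize": 0.95},
]
hard_labels = [max(p, key=p.get) for p in outputs]
voted = majority_vote(hard_labels)
fused = probability_fusion(outputs)
print(voted, fused)
```

In this example the two strategies disagree: two weakly confident votes win under voting, while one highly confident classifier dominates the averaged probabilities.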
Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp
2015-01-01
Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236
As the rapidly growing archives of satellite remote sensing imagery now span decades' worth of data, there is increasing interest in the study of long-term regional land cover change across multiple image dates. In most cases, however, temporally coincident ground sampled data are...
Quantifying the abundance of co-occurring conifers along Inland Northwest (USA) climate gradients
Gerald E. Rehfeldt; Dennis E. Ferguson; Nicholas L. Crookston
2008-01-01
The occurrence and abundance of conifers along climate gradients in the Inland Northwest (USA) was assessed using data from 5082 field plots, 81% of which were forested. Analyses using the Random Forests classification tree revealed that the sequential distribution of species along an altitudinal gradient could be predicted with reasonable accuracy from a single...
ERIC Educational Resources Information Center
Liu, Ren; Huggins-Manley, Anne Corinne; Bradshaw, Laine
2017-01-01
There is an increasing demand for assessments that can provide more fine-grained information about examinees. In response to the demand, diagnostic measurement provides students with feedback on their strengths and weaknesses on specific skills by classifying them into mastery or nonmastery attribute categories. These attributes often form a…
A semi-automated method for bone age assessment using cervical vertebral maturation.
Baptista, Roberto S; Quaglio, Camila L; Mourad, Laila M E H; Hummel, Anderson D; Caetano, Cesar Augusto C; Ortolani, Cristina Lúcia F; Pisa, Ivan T
2012-07-01
To propose a semi-automated method for pattern classification to predict individuals' stage of growth based on morphologic characteristics that are described in the modified cervical vertebral maturation (CVM) method of Baccetti et al. A total of 188 lateral cephalograms were collected, digitized, evaluated manually, and grouped into cervical stages by two expert examiners. Landmarks were located on each image and measured. Three pattern classifiers based on the Naïve Bayes algorithm were built and assessed using a software program. The classifier with the greatest accuracy according to the weighted kappa test was considered best. The classifier showed a weighted kappa coefficient of 0.861 ± 0.020. If an adjacent estimated pre-stage or poststage value was taken to be acceptable, the classifier would show a weighted kappa coefficient of 0.992 ± 0.019. Results from this study show that the proposed semi-automated pattern classification method can help orthodontists identify the stage of CVM. However, additional studies are needed before this semi-automated classification method for CVM assessment can be implemented in clinical practice.
Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.
Yousef, Malik; Saçar Demirci, Müşerref Duygu; Khalifa, Waleed; Allmer, Jens
2016-01-01
MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.
Word pair classification during imagined speech using direct brain recordings
NASA Astrophysics Data System (ADS)
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José Del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-05-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70-150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications.
Word pair classification during imagined speech using direct brain recordings
Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2016-01-01
People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70–150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications. PMID:27165452
Comparing ecoregional classifications for natural areas management in the Klamath Region, USA
Sarr, Daniel A.; Duff, Andrew; Dinger, Eric C.; Shafer, Sarah L.; Wing, Michael; Seavy, Nathaniel E.; Alexander, John D.
2015-01-01
We compared three existing ecoregional classification schemes (Bailey, Omernik, and World Wildlife Fund) with two derived schemes (Omernik Revised and Climate Zones) to explore their effectiveness in explaining species distributions and to better understand natural resource geography in the Klamath Region, USA. We analyzed presence/absence data derived from digital distribution maps for trees, amphibians, large mammals, small mammals, migrant birds, and resident birds using three statistical analyses of classification accuracy (Analysis of Similarity, Canonical Analysis of Principal Coordinates, and Classification Strength). The classifications were roughly comparable in classification accuracy, with Omernik Revised showing the best overall performance. Trees showed the strongest fidelity to the classifications, and large mammals showed the weakest fidelity. We discuss the implications for regional biogeography and describe how intermediate resolution ecoregional classifications may be appropriate for use as natural areas management domains.
Vegetation inventory, mapping, and classification report, Fort Bowie National Historic Site
Studd, Sarah; Fallon, Elizabeth; Crumbacher, Laura; Drake, Sam; Villarreal, Miguel
2013-01-01
A vegetation mapping and characterization effort was conducted at Fort Bowie National Historic Site in 2008-10 by the Sonoran Desert Network office in collaboration with researchers from the Office of Arid Lands Studies, Remote Sensing Center at the University of Arizona. This vegetation mapping effort was completed under the National Park Service Vegetation Inventory program, which aims to complete baseline mapping inventories at over 270 national park units. The vegetation map data was collected to provide park managers with a digital map product that met national standards of spatial and thematic accuracy, while also placing the vegetation into a regional and even national context. Work comprised three major phases: 1) concurrent field-based classification data collection and mapping (map unit delineation), 2) development of vegetation community types at the National Vegetation Classification alliance or association level and 3) map accuracy assessment. Phase 1 was completed in late 2008 and early 2009. Community type descriptions were drafted to meet the then-current hierarchy (version 1) of the National Vegetation Classification System (NVCS) and these were applied to each of the mapped areas. This classification was developed from both plot level data and censused polygon data (map units) as this project was conducted as a concurrent mapping and classification effort. The third stage, accuracy assessment, was completed in the fall of 2010; it consisted of a complete census of each map unit and was conducted almost entirely by park staff. Following accuracy assessment the map was amended where needed and final products were developed including this report, a digital map and full vegetation descriptions. Fort Bowie National Historic Site covers only 1000 acres yet has a relatively complex landscape, topography and geology. A total of 16 distinct communities were described and mapped at Fort Bowie NHS. 
These ranged from lush riparian woodlands lining the ephemeral washes dominated by Ash (Fraxinus), Walnut (Juglans) and Hackberry (Celtis) to drier upland sites typical of desert scrub and semi-desert grassland communities. These shrublands boast a diverse mixture of shrubs, succulents and perennial grasses. In many places the vegetation could be seen to echo the history of the fort site, with management of shrub encroachment apparent in the grasslands and the paucity of trees evidence of historic cutting for timber and fire wood. Seven of the 16 vegetation types were ‘accepted’ types within the NVC while the others have been described here as specific to FOBO and have proposed status within the NVC. The map was designed to facilitate ecologically-based natural resources management and research. The map is in digital format within a geodatabase structure that allows for complex relationships to be established between spatial and tabular data, and makes accessing the product easy and seamless. The GIS format allows user flexibility and will also enable updates to be made as new information becomes available (such as revised NVC codes or vegetation type names) or in the event of major disturbance events that could impact the vegetation.
Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa
2015-11-03
Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.
A GSA-SVM Hybrid System for Classification of Binary Problems
NASA Astrophysics Data System (ADS)
Sarafrazi, Soroor; Nezamabadi-pour, Hossein; Barahman, Mojgan
2011-06-01
This paper hybridizes the gravitational search algorithm (GSA) with a support vector machine (SVM) to produce a novel GSA-SVM hybrid system that improves classification accuracy in binary problems. GSA is a heuristic optimization tool used to optimize the value of the SVM kernel parameter (in this paper, the radial basis function (RBF) is chosen as the kernel function). The experimental results show that this new approach can achieve high classification accuracy and is comparable to or better than the particle swarm optimization (PSO)-SVM and genetic algorithm (GA)-SVM, which are two hybrid systems for classification.
Typicality effects in artificial categories: is there a hemisphere difference?
Richards, L G; Chiarello, C
1990-07-01
In category classification tasks, typicality effects are usually found: accuracy and reaction time depend upon distance from a prototype. In this study, subjects learned either verbal or nonverbal dot pattern categories, followed by a lateralized classification task. Comparable typicality effects were found in both reaction time and accuracy across visual fields for both verbal and nonverbal categories. Both hemispheres appeared to use a similarity-to-prototype matching strategy in classification. This indicates that merely having a verbal label does not differentiate classification in the two hemispheres.
Wearable-Sensor-Based Classification Models of Faller Status in Older Adults.
Howcroft, Jennifer; Lemaire, Edward D; Kofman, Jonathan
2016-01-01
Wearable sensors have potential for quantitative, gait-based, point-of-care fall risk assessment that can be easily and quickly implemented in clinical-care and older-adult living environments. This investigation generated models for wearable-sensor based fall-risk classification in older adults and identified the optimal sensor type, location, combination, and modelling method; for walking with and without a cognitive load task. A convenience sample of 100 older individuals (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m under single-task and dual-task conditions while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, and left and right shanks. Participants also completed the Activities-specific Balance Confidence scale, Community Health Activities Model Program for Seniors questionnaire, six minute walk test, and ranked their fear of falling. Fall risk classification models were assessed for all sensor combinations and three model types: multi-layer perceptron neural network, naïve Bayesian, and support vector machine. The best performing model was a multi-layer perceptron neural network with input parameters from pressure-sensing insoles and head, pelvis, and left shank accelerometers (accuracy = 84%, F1 score = 0.600, MCC score = 0.521). Head sensor-based models had the best performance of the single-sensor models for single-task gait assessment. Single-task gait assessment models outperformed models based on dual-task walking or clinical assessment data. Support vector machines and neural networks were the best modelling technique for fall risk classification. Fall risk classification models developed for point-of-care environments should be developed using support vector machines and neural networks, with a multi-sensor single-task gait assessment.
Transportation Modes Classification Using Sensors on Smartphones.
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-08-19
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user's transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes.
Transportation Modes Classification Using Sensors on Smartphones
Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu
2016-01-01
This paper investigates the transportation and vehicular modes classification by using big data from smartphone sensors. The three types of sensors used in this paper include the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms including decision trees, K-nearest neighbor, and support vector machine to classify the user’s transportation and vehicular modes. In the experiments, we discussed and compared the performance from different perspectives including the accuracy for both modes, the executive time, and the model size. Results show that the proposed features enhance the accuracy, in which the support vector machine provides the best performance in classification accuracy whereas it consumes the largest prediction time. This paper also investigates the vehicle classification mode and compares the results with that of the transportation modes. PMID:27548182
Wu, Zhuoting; Thenkabail, Prasad S.; Mueller, Rick; Zakzeski, Audra; Melton, Forrest; Johnson, Lee; Rosevelt, Carolyn; Dwyer, John; Jones, Jeanine; Verdin, James P.
2014-01-01
Increasing drought occurrences and growing populations demand accurate, routine, and consistent cultivated and fallow cropland products to enable water and food security analysis. The overarching goal of this research was to develop and test an automated cropland classification algorithm (ACCA) that provides accurate, consistent, and repeatable information on seasonal cultivated as well as seasonal fallow cropland extents and areas based on the Moderate Resolution Imaging Spectroradiometer remote sensing data. The seasonal ACCA development process involves writing a series of iterative decision tree codes to separate cultivated and fallow croplands from noncroplands, aiming to accurately mirror reliable reference data sources. A pixel-by-pixel accuracy assessment when compared with the U.S. Department of Agriculture (USDA) cropland data showed, on average, a producer’s accuracy of 93% and a user’s accuracy of 85% across all months. Further, ACCA-derived cropland maps agreed well with the USDA Farm Service Agency crop acreage-reported data for both cultivated and fallow croplands with R-square values over 0.7 and field surveys with an accuracy of ≥95% for cultivated croplands and ≥76% for fallow croplands. Our results demonstrated the ability of ACCA to generate cropland products, such as cultivated and fallow cropland extents and areas, accurately, automatically, and repeatedly throughout the growing season.
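Producer's and user's accuracy, as quoted in this abstract, are the two per-class ratios read off an error matrix. The sketch below assumes the common layout in which rows hold the mapped class and columns hold the reference class; the counts and the three class names are invented for illustration and are not taken from the study.

```python
def producers_users_accuracy(matrix):
    """Per-class producer's and user's accuracy from a square error matrix.

    matrix[i][j] is the number of samples mapped as class i whose
    reference class is j (assumed layout). Every row and column total
    is assumed to be non-zero.
    """
    n = len(matrix)
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Producer's accuracy: share of reference samples that were mapped correctly.
    producers = [matrix[k][k] / col_totals[k] for k in range(n)]
    # User's accuracy: share of mapped samples that match the reference.
    users = [matrix[k][k] / row_totals[k] for k in range(n)]
    return producers, users


# Hypothetical counts for cultivated, fallow and noncropland pixels.
error_matrix = [
    [90, 20, 10],
    [5, 25, 0],
    [5, 5, 40],
]
producers, users = producers_users_accuracy(error_matrix)
print([round(p, 3) for p in producers])
print([round(u, 3) for u in users])
```

Producer's accuracy is the complement of omission error and user's accuracy is the complement of commission error, so a class can score well on one and poorly on the other.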
NASA Astrophysics Data System (ADS)
Wu, Zhuoting; Thenkabail, Prasad S.; Mueller, Rick; Zakzeski, Audra; Melton, Forrest; Johnson, Lee; Rosevelt, Carolyn; Dwyer, John; Jones, Jeanine; Verdin, James P.
2014-01-01
Increasing drought occurrences and growing populations demand accurate, routine, and consistent cultivated and fallow cropland products to enable water and food security analysis. The overarching goal of this research was to develop and test an automated cropland classification algorithm (ACCA) that provides accurate, consistent, and repeatable information on seasonal cultivated as well as seasonal fallow cropland extents and areas based on the Moderate Resolution Imaging Spectroradiometer remote sensing data. The seasonal ACCA development process involves writing a series of iterative decision tree codes to separate cultivated and fallow croplands from noncroplands, aiming to accurately mirror reliable reference data sources. A pixel-by-pixel accuracy assessment when compared with the U.S. Department of Agriculture (USDA) cropland data showed, on average, a producer's accuracy of 93% and a user's accuracy of 85% across all months. Further, ACCA-derived cropland maps agreed well with the USDA Farm Service Agency crop acreage-reported data for both cultivated and fallow croplands with R-square values over 0.7 and field surveys with an accuracy of ≥95% for cultivated croplands and ≥76% for fallow croplands. Our results demonstrated the ability of ACCA to generate cropland products, such as cultivated and fallow cropland extents and areas, accurately, automatically, and repeatedly throughout the growing season.
NASA Astrophysics Data System (ADS)
Dondurur, Mehmet
The primary objective of this study was to determine the degree to which modern SAR systems can be used to obtain information about the Earth's vegetative resources. Information obtainable from microwave synthetic aperture radar (SAR) data was compared with that obtainable from LANDSAT-TM and SPOT data. Three hypotheses were tested: (a) Classification of land cover/use from SAR data can be accomplished on a pixel-by-pixel basis with the same overall accuracy as from LANDSAT-TM and SPOT data. (b) Classification accuracy for individual land cover/use classes will differ between sensors. (c) Combining information derived from optical and SAR data into an integrated monitoring system will improve overall and individual land cover/use class accuracies. The study was conducted with three data sets for the Sleeping Bear Dunes test site in the northwestern part of Michigan's lower peninsula, including an October 1982 LANDSAT-TM scene, a June 1989 SPOT scene and C-, L- and P-Band radar data from the Jet Propulsion Laboratory AIRSAR. Reference data were derived from the Michigan Resource Information System (MIRIS) and available color infrared aerial photos. Classification and rectification of data sets were done using ERDAS Image Processing Programs. Classification algorithms included Maximum Likelihood, Mahalanobis Distance, Minimum Spectral Distance, ISODATA, Parallelepiped, and Sequential Cluster Analysis. Classified images were rectified as necessary so that all were at the same scale and oriented north-up. Results were analyzed with contingency tables and percent correctly classified (PCC) and Cohen's Kappa (CK) as accuracy indices using CSLANT and ImagePro programs developed for this study. Accuracy analyses were based upon a 1.4 by 6.5 km area with its long axis east-west. 
Reference data for this subscene total 55,770 15 by 15 m pixels with sixteen cover types, including seven level III forest classes, three level III urban classes, two level II range classes, two water classes, one wetland class and one agriculture class. An initial analysis was made without correcting the 1978 MIRIS reference data to the different dates of the TM, SPOT and SAR data sets. In this analysis, highest overall classification accuracy (PCC) was 87% with the TM data set, with both SPOT and C-Band SAR at 85%, a difference statistically significant at the 0.05 level. When the reference data were corrected for land cover change between 1978 and 1991, classification accuracy with the C-Band SAR data increased to 87%. Classification accuracy differed from sensor to sensor for individual land cover classes. Combining sensors into hypothetical multi-sensor systems resulted in higher accuracies than for any single sensor. Combining LANDSAT-TM and C-Band SAR yielded an overall classification accuracy (PCC) of 92%. The results of this study indicate that C-Band SAR data provide an acceptable substitute for LANDSAT-TM or SPOT data when land cover information is desired of areas where cloud cover obscures the terrain. Even better results can be obtained by integrating TM and C-Band SAR data into a multi-sensor system.
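The two accuracy indices used in this study, percent correctly classified (PCC) and Cohen's Kappa, can both be computed from a contingency table in a few lines. The sketch below is not the CSLANT or ImagePro code mentioned in the abstract, which is not available here; it is a generic illustration, and the 2 by 2 table is invented.

```python
def pcc_and_kappa(table):
    """Return (PCC, Cohen's kappa) for a square contingency table.

    table[i][j] is the number of pixels classified as class i whose
    reference class is j. Kappa is (p_o - p_e) / (1 - p_e), where p_o is
    the observed agreement (PCC) and p_e is the agreement expected by
    chance from the row and column totals.
    """
    n = len(table)
    total = sum(sum(row) for row in table)
    observed = sum(table[k][k] for k in range(n)) / total
    expected = sum(
        sum(table[k]) * sum(row[k] for row in table) for k in range(n)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class table (e.g. forest vs. water), 100 pixels.
pcc, kappa = pcc_and_kappa([[40, 10], [5, 45]])
print(round(pcc, 3), round(kappa, 3))
```

Because Kappa subtracts chance agreement, it is always lower than PCC for an imperfect classification, and the gap widens when a few classes dominate the scene.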
A contour-based shape descriptor for biomedical image classification and retrieval
NASA Astrophysics Data System (ADS)
You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-12-01
Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.
Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use
Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil
2013-01-01
The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and evaluate classification measures exploiting characteristic signatures of such histograms. Two histogram matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648
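The idea of classifying an image object by the shape of its digital number histogram, rather than by its mean, can be sketched as follows. The abstract does not name the two curve matching measures that were evaluated, so this example substitutes histogram intersection as an assumed similarity measure; the bin count, value range, class names and pixel values are likewise hypothetical.

```python
def normalized_histogram(values, bins=8, lo=0, hi=256):
    """Relative-frequency histogram of pixel values assumed to lie in [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms: 1.0 if identical, 0.0 if disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def classify_by_histogram(object_values, class_histograms):
    """Assign the class whose reference histogram overlaps the object's most."""
    h = normalized_histogram(object_values)
    return max(class_histograms,
               key=lambda name: histogram_intersection(h, class_histograms[name]))


# Hypothetical training histograms built from per-class sample pixels.
reference = {
    "vegetation": normalized_histogram([20, 30, 40, 50, 60, 35, 45, 25]),
    "pavement": normalized_histogram([180, 190, 200, 210, 220, 230, 205, 195]),
}
print(classify_by_histogram([28, 33, 41, 52, 47, 38], reference))
```

A nearest neighbor to mean rule would reduce each object to a single number, whereas the histogram keeps the full distribution, which matters for classes that share a mean but differ in spread or modality.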
NASA Astrophysics Data System (ADS)
Lin, Yi; Jiang, Miao
2017-01-01
Tree species information is essential for forest research and management purposes, which in turn require approaches for accurate and precise classification of tree species. One such remote sensing technology, terrestrial laser scanning (TLS), has proved to be capable of characterizing detailed tree structures, such as tree stem geometry. Can TLS further differentiate between broad- and needle-leaves? If the answer is positive, TLS data can be used for classification of taxonomic tree groups by directly examining their differences in leaf morphology. An analysis was proposed to assess TLS-represented broad- and needle-leaf structures, followed by a Bayes classifier to perform the classification. Tests indicated that the proposed method can basically implement the task, with an overall accuracy of 77.78%. This study indicates a way of implementing the classification of the two major broad- and needle-leaf taxonomies measured by TLS in accordance to their literal definitions, and manifests the potential of extending TLS applications in forestry.
Kilavuz, Ahmet Erdem; Songu, Murat; İmre, Abdulkadir; Arslanoğlu, Secil; Özkul, Yilmaz; Pinar, Ercan; Ateş, Düzgün
2018-05-01
The accuracy of fine-needle aspiration biopsy (FNAB) is controversial in parotid tumors. We aimed to compare FNAB results with the final histopathological diagnosis and to apply the "Sal classification" to our data and discuss its results and its place in parotid gland cytology. The FNAB cytological findings and final histological diagnosis were assessed retrospectively in 2 different scenarios based on the distribution of nondefinitive cytology, and we applied the Sal classification and determined malignancy rate, sensitivity, and specificity for each category. In 2 different scenarios FNAB sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were found to be 81%, 87%, 54.7%, and 96.1%; and 65.3%, 100%, 100%, and 96.1%, respectively. The malignancy rates and sensitivity and specificity were also calculated and discussed for each Sal category. We believe that the Sal classification has a great potential to be a useful tool in classification of parotid gland cytology. © 2018 Wiley Periodicals, Inc.
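The four figures quoted for each scenario follow from a 2 by 2 table of cytology result against final histopathology. The abstract does not give the underlying counts, so the numbers below are invented purely to show the arithmetic, with malignancy taken as the positive class.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant cases correctly flagged
        "specificity": tn / (tn + fp),  # benign cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged cases that are malignant
        "npv": tn / (tn + fn),          # cleared cases that are benign
    }


# Hypothetical counts: 20 malignant and 80 benign tumors.
metrics = diagnostic_metrics(tp=16, fp=12, fn=4, tn=68)
for name, value in metrics.items():
    print(name, round(value, 3))
```

When malignancy is uncommon in the sample, even a modest number of false positives pulls PPV well below sensitivity and specificity while NPV stays high, which is the general pattern seen in the first scenario reported above.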
Brain-Computer Interface Based on Generation of Visual Images
Bobrov, Pavel; Frolov, Alexander; Cantor, Charles; Fedulova, Irina; Bakhnyan, Mikhail; Zhavoronkov, Alexander
2011-01-01
This paper examines the task of recognizing EEG patterns that correspond to performing three mental tasks: relaxation and imagining of two types of pictures: faces and houses. The experiments were performed using two EEG headsets: BrainProducts ActiCap and Emotiv EPOC. The Emotiv headset is becoming widely used in consumer BCI applications, which may allow large-scale EEG experiments to be conducted in the future. Since classification accuracy significantly exceeded the level of random classification during the first three days of the experiment with the EPOC headset, a control experiment was performed on the fourth day using ActiCap. The control experiment has shown that utilization of high-quality research equipment can enhance classification accuracy (up to 68% in some subjects) and that the accuracy is independent of the presence of EEG artifacts related to blinking and eye movement. This study also shows that a computationally inexpensive Bayesian classifier based on covariance matrix analysis yields similar classification accuracy in this problem as a more sophisticated Multi-class Common Spatial Patterns (MCSP) classifier. PMID:21695206
NASA Technical Reports Server (NTRS)
Mulligan, P. J.; Gervin, J. C.; Lu, Y. C.
1985-01-01
An area bordering the Eastern Shore of the Chesapeake Bay was selected for study and classified using unsupervised techniques applied to LANDSAT-2 MSS data and several band combinations of LANDSAT-4 TM data. The accuracies of these Level I land cover classifications were verified using the Taylor's Island USGS 7.5 minute topographic map, which was photointerpreted, digitized and rasterized. For the Taylor's Island map, comparing the MSS and TM three-band (2, 3, 4) classifications, the increased resolution of TM produced a small improvement in overall accuracy of 1%, due primarily to small improvements of 1% and 3% in areas such as water and woodland. This was expected, as the MSS data typically produce high accuracies for categories which cover large contiguous areas. However, in the categories covering smaller areas within the map there was generally an improvement of at least 10%. Classification of the important residential category improved 12%, and wetlands were mapped with 11% greater accuracy.
NASA Astrophysics Data System (ADS)
Al-Doasari, Ahmad E.
The 1991 Gulf War caused massive environmental damage in Kuwait. Deposition of oil and soot droplets from hundreds of burning oil-wells created a layer of tarcrete on the desert surface covering over 900 km². This research investigates the spatial change in the tarcrete extent from 1991 to 1998 using Landsat Thematic Mapper (TM) imagery and statistical modeling techniques. The pixel structure of TM data allows the spatial analysis of the change in tarcrete extent to be conducted at the pixel (cell) level within a geographical information system (GIS). There are two components to this research. The first is a comparison of three remote sensing classification techniques used to map the tarcrete layer. The second is a spatial-temporal analysis and simulation of tarcrete changes through time. The analysis focuses on an area of 389 km² located south of the Al-Burgan oil field. Five TM images acquired in 1991, 1993, 1994, 1995, and 1998 were geometrically and atmospherically corrected. These images were classified into six classes: oil lakes; heavy, intermediate, light, and traces of tarcrete; and sand. The classification methods tested were unsupervised, supervised, and neural network supervised (fuzzy ARTMAP). Field data of tarcrete characteristics were collected to support the classification process and to evaluate the classification accuracies. Overall, the neural network method is more accurate (60 percent) than the other two methods; both the unsupervised and the supervised classification accuracy assessments resulted in 46 percent accuracy. The five classifications were used in a lagged autologistic model to analyze the spatial changes of the tarcrete through time. The autologistic model correctly identified overall tarcrete contraction between 1991-1993 and 1995-1998. However, tarcrete contraction between 1993-1994 and 1994-1995 was less well marked, in part because of classification errors in the maps from these time periods.
Initial simulations of tarcrete contraction with a cellular automaton model were not very successful. However, more accurate classifications could improve the simulations. This study illustrates how an empirical investigation using satellite images, field data, GIS, and spatial statistics can simulate dynamic land-cover change through the use of a discrete statistical and cellular automaton model.
Liljeqvist, Henning T G; Muscatello, David; Sara, Grant; Dinh, Michael; Lawrence, Glenda L
2014-09-23
Syndromic surveillance in emergency departments (EDs) may be used to deliver early warnings of increases in disease activity, to provide situational awareness during events of public health significance, to supplement other information on trends in acute disease and injury, and to support the development and monitoring of prevention or response strategies. Changes in mental health related ED presentations may be relevant to these goals, provided they can be identified accurately and efficiently. This study aimed to measure the accuracy of using diagnostic codes in electronic ED presentation records to identify mental health-related visits. We selected a random sample of 500 records from a total of 1,815,588 ED electronic presentation records from 59 NSW public hospitals during 2010. ED diagnoses were recorded using any of ICD-9, ICD-10 or SNOMED CT classifications. Three clinicians, blinded to the automatically generated syndromic grouping and each other's classification, reviewed the triage notes and classified each of the 500 visits as mental health-related or not. A "mental health problem presentation" for the purposes of this study was defined as any ED presentation where either a mental disorder or a mental health problem was the reason for the ED visit. The combined clinicians' assessment of the records was used as reference standard to measure the sensitivity, specificity, and positive and negative predictive values of the automatic classification of coded emergency department diagnoses. Agreement between the reference standard and the automated coded classification was estimated using the Kappa statistic. Agreement between the clinicians' classification and the automated coded classification was substantial (Kappa = 0.73; 95% CI: 0.58-0.87).
The automatic syndromic grouping of coded ED diagnoses for mental health-related visits was found to be moderately sensitive (68%; 95% CI: 46%-84%) and highly specific (99%; 95% CI: 98%-99.7%) when compared with the reference standard in identifying mental health related ED visits. Positive predictive value was 81% (95% CI: 57%-94%) and negative predictive value was 98% (95% CI: 97%-99%). Mental health presentations identified using diagnoses coded with various classifications in electronic ED presentation records offer sufficient accuracy for application in near real-time syndromic surveillance.
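As an illustration of how the measures reported above relate to one another, the following minimal Python sketch derives sensitivity, specificity, predictive values and Cohen's kappa from a 2x2 table. The function name and the example counts are hypothetical and are not the counts from this study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV and Cohen's kappa for a 2x2 table.

    tp/fp/fn/tn are counts of the automated classification compared with
    the reference standard (hypothetical inputs, for illustration only).
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts chosen so the arithmetic is easy to follow.
example = diagnostic_metrics(tp=40, fp=10, fn=10, tn=40)
```

With these made-up counts every rate is 0.8, observed agreement is 0.8, chance agreement is 0.5, and kappa is therefore 0.6, which shows why kappa can be noticeably lower than raw agreement.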
Investigating the sex-related geometric variation of the human cranium.
Bertsatos, Andreas; Papageorgopoulou, Christina; Valakos, Efstratios; Chovalopoulou, Maria-Eleni
2018-01-29
Accurate sexing methods are of great importance in forensic anthropology since sex assessment is among the principal tasks when examining human skeletal remains. The present study explores a novel approach in assessing the most accurate metric traits of the human cranium for sex estimation based on 80 ectocranial landmarks from 176 modern individuals of known age and sex from the Athens Collection. The purpose of the study is to identify those distance and angle measurements that can be most effectively used in sex assessment. Three-dimensional landmark coordinates were digitized with a Microscribe 3DX and analyzed in GNU Octave. An iterative linear discriminant analysis of all possible combinations of landmarks was performed for each unique set of the 3160 distances and 246,480 angles. Cross-validated correct classification as well as multivariate DFA on top-performing variables reported 13 craniometric distances with over 85% classification accuracy, 7 angles over 78%, as well as certain multivariate combinations yielding over 95%. Linear regression of these variables with the centroid size was used to assess their relation to the size of the cranium. In contrast to the use of generalized Procrustes analysis (GPA) and principal component analysis (PCA), which constitute the common analytical workflow for such data, our method, although computationally intensive, produced easily applicable discriminant functions of high accuracy while at the same time exploring the maximum of cranial variability.
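The counts of distances and angles quoted above can be reproduced with a short calculation. The sketch below assumes that a distance is an unordered pair of landmarks and that an angle is a vertex landmark plus an unordered pair of the remaining landmarks; the abstract does not state this definition, but it yields exactly the reported figures.

```python
from math import comb

n_landmarks = 80

# Every unordered pair of landmarks defines one inter-landmark distance.
n_distances = comb(n_landmarks, 2)

# Assumed definition: choose the vertex, then an unordered pair of the others.
n_angles = n_landmarks * comb(n_landmarks - 1, 2)

print(n_distances, n_angles)
```

Under these assumptions the script gives 3160 distances and 246480 angles, matching the numbers in the abstract.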
Pettersson-Yeo, William; Benetti, Stefania; Marquand, Andre F.; Joules, Richard; Catani, Marco; Williams, Steve C. R.; Allen, Paul; McGuire, Philip; Mechelli, Andrea
2014-01-01
In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification accuracies must be improved for such utility to be realized. One possible solution is to integrate different data types to provide a single combined output classification; either by generating a single decision function based on an integrated kernel matrix, or, by creating an ensemble of multiple single modality classifiers and integrating their predictions. Here, we describe four integrative approaches: (1) an un-weighted sum of kernels, (2) multi-kernel learning, (3) prediction averaging, and (4) majority voting, and compare their ability to enhance classification accuracy relative to the best single-modality classification accuracy. We achieve this by integrating structural, functional, and diffusion tensor magnetic resonance imaging data, in order to compare ultra-high risk (n = 19), first episode psychosis (n = 19) and healthy control subjects (n = 23). Our results show that (i) whilst integration can enhance classification accuracy by up to 13%, the frequency of such instances may be limited, (ii) where classification can be enhanced, simple methods may yield greater increases relative to more computationally complex alternatives, and, (iii) the potential for classification enhancement is highly influenced by the specific diagnostic comparison under consideration. In conclusion, our findings suggest that for moderately sized clinical neuroimaging datasets, combining different imaging modalities in a data-driven manner is no “magic bullet” for increasing classification accuracy. 
However, it remains possible that this conclusion is dependent on the use of neuroimaging modalities that had little, or no, complementary information to offer one another, and that the integration of more diverse types of data would have produced greater classification enhancement. We suggest that future studies ideally examine a greater variety of data types (e.g., genetic, cognitive, and neuroimaging) in order to identify the data types and combinations optimally suited to the classification of early stage psychosis. PMID:25076868
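Two of the four integrative approaches named above, majority voting and prediction averaging, are simple enough to sketch in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline; the function names, class labels ("FEP", "HC") and scores are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by most classifiers (first seen wins ties)."""
    return Counter(labels).most_common(1)[0][0]


def average_predictions(scores):
    """Average per-class scores over classifiers; return (top class, means)."""
    classes = scores[0].keys()
    mean = {c: sum(s[c] for s in scores) / len(scores) for c in classes}
    return max(mean, key=mean.get), mean


# Hypothetical outputs of three single-modality classifiers for one subject.
per_modality_scores = [
    {"FEP": 0.60, "HC": 0.40},  # e.g. structural MRI
    {"FEP": 0.30, "HC": 0.70},  # e.g. functional MRI
    {"FEP": 0.55, "HC": 0.45},  # e.g. diffusion tensor imaging
]
per_modality_labels = [max(s, key=s.get) for s in per_modality_scores]

voted = majority_vote(per_modality_labels)
averaged, mean_scores = average_predictions(per_modality_scores)
```

In this made-up case two of three classifiers vote "FEP", while the averaged scores favour "HC" because the dissenting classifier is more confident, so the two ensemble rules can disagree on the same subject.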
NASA Technical Reports Server (NTRS)
Fagan, Matthew E.; Defries, Ruth S.; Sesnie, Steven E.; Arroyo-Mora, J. Pablo; Soto, Carlomagno; Singh, Aditya; Townsend, Philip A.; Chazdon, Robin L.
2015-01-01
An efficient means to map tree plantations is needed to detect tropical land use change and evaluate reforestation projects. To analyze recent tree plantation expansion in northeastern Costa Rica, we examined the potential of combining moderate-resolution hyperspectral imagery (2005 HyMap mosaic) with multitemporal, multispectral data (Landsat) to accurately classify (1) general forest types and (2) tree plantations by species composition. Following a linear discriminant analysis to reduce data dimensionality, we compared four Random Forest classification models: hyperspectral data (HD) alone; HD plus interannual spectral metrics; HD plus a multitemporal forest regrowth classification; and all three models combined. The fourth, combined model achieved overall accuracy of 88.5%. Adding multitemporal data significantly improved classification accuracy (p < 0.0001) of all forest types, although the effect on tree plantation accuracy was modest. The hyperspectral data alone classified six species of tree plantations with 75% to 93% producer's accuracy; adding multitemporal spectral data increased accuracy only for two species with dense canopies. Non-native tree species had higher classification accuracy overall and made up the majority of tree plantations in this landscape. Our results indicate that combining occasionally acquired hyperspectral data with widely available multitemporal satellite imagery enhances mapping and monitoring of reforestation in tropical landscapes.
NASA Technical Reports Server (NTRS)
Spann, G. W.; Faust, N. L.
1974-01-01
It is known from several previous investigations that many categories of land use can be mapped via computer processing of Earth Resources Technology Satellite data. The results of one such experiment, which used the USGS/NASA land-use classification system, are presented. Douglas County, Georgia, was chosen as the test site for this project, primarily because of its recent rapid growth and future growth potential. Results of the investigation indicate an overall land-use mapping accuracy of 67%, with higher accuracies in rural areas and lower accuracies in urban areas. It is estimated, however, that 95% of the State of Georgia could be mapped by these techniques with an accuracy of 80% to 90%.
Machine learning of neural representations of suicide and emotion concepts identifies suicidal youth
Just, Marcel Adam; Pan, Lisa; Cherkassky, Vladimir L.; McMakin, Dana; Cha, Christine; Nock, Matthew K.; Brent, David
2017-01-01
The clinical assessment of suicidal risk would be significantly complemented by a biologically-based measure that assesses alterations in the neural representations of concepts related to death and life in people who engage in suicidal ideation. This study used machine-learning algorithms (Gaussian Naïve Bayes) to identify such individuals (17 suicidal ideators vs 17 controls) with high (91%) accuracy, based on their altered fMRI neural signatures of death and life-related concepts. The most discriminating concepts were death, cruelty, trouble, carefree, good, and praise. A similar classification accurately (94%) discriminated 9 suicidal ideators who had made a suicide attempt from 8 who had not. Moreover, a major facet of the concept alterations was the evoked emotion, whose neural signature served as an alternative basis for accurate (85%) group classification. The study establishes a biological, neurocognitive basis for altered concept representations in participants with suicidal ideation, which enables highly accurate group membership classification. PMID:29367952
A vectorial semantics approach to personality assessment.
Neuman, Yair; Cohen, Yochai
2014-04-23
Personality assessment and, specifically, the assessment of personality disorders have traditionally been indifferent to computational models. Computational personality is a new field that involves the automatic classification of individuals' personality traits that can be compared against gold-standard labels. In this context, we introduce a new vectorial semantics approach to personality assessment, which involves the construction of vectors representing personality dimensions and disorders, and the automatic measurements of the similarity between these vectors and texts written by human subjects. We evaluated our approach by using a corpus of 2468 essays written by students who were also assessed through the five-factor personality model. To validate our approach, we measured the similarity between the essays and the personality vectors to produce personality disorder scores. These scores and their correspondence with the subjects' classification of the five personality factors reproduce patterns well-documented in the psychological literature. In addition, we show that, based on the personality vectors, we can predict each of the five personality factors with high accuracy.
NASA Astrophysics Data System (ADS)
Park, M.; Stenstrom, M. K.
2004-12-01
Recognizing urban information from satellite imagery is problematic due to the diverse features and dynamic changes of urban land use. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use, since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed optimized networks for urban land use classification from Landsat ETM+ images of the Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables, and links were added to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information on the variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced to those input variables important for classification, which simplifies the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation.
The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI did little to improve the overall accuracy (72%) and kappa coefficient (65%). Including the Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not as useful as locational ancillary data.
NASA Astrophysics Data System (ADS)
Müller-Putz, Gernot R.; Scherer, Reinhold; Brauneis, Christian; Pfurtscheller, Gert
2005-12-01
Brain-computer interfaces (BCIs) can be realized on the basis of steady-state evoked potentials (SSEPs). These types of brain signals resulting from repetitive stimulation have the same fundamental frequency as the stimulation but also include higher harmonics. This study investigated how the classification accuracy of a 4-class BCI system can be improved by incorporating visually evoked harmonic oscillations. The current study revealed that the use of three SSVEP harmonics yielded a significantly higher classification accuracy than was the case for one or two harmonics. During feedback experiments, the five subjects investigated reached a classification accuracy between 42.5% and 94.4%.
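To make the role of harmonics concrete, the following Python sketch scores each candidate stimulation frequency by the summed spectral power at its fundamental and higher harmonics, then picks the best-scoring frequency. It is a simplified illustration, not the classifier used in the study; the function names, sampling rate, stimulation frequencies and synthetic signal are all assumptions.

```python
import math


def power_at(signal, freq, fs):
    """Normalized squared magnitude of the DFT of `signal` at `freq` (Hz)."""
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return (re * re + im * im) / len(signal) ** 2


def classify_ssvep(signal, fs, stimulus_freqs, n_harmonics=3):
    """Return the stimulus frequency with the largest power summed over harmonics."""
    def score(f):
        return sum(power_at(signal, k * f, fs) for k in range(1, n_harmonics + 1))
    return max(stimulus_freqs, key=score)


# Synthetic one-second "EEG" trace: a 13 Hz response plus its second harmonic.
fs = 256  # assumed sampling rate in Hz
trace = [
    math.sin(2 * math.pi * 13 * i / fs) + 0.5 * math.sin(2 * math.pi * 26 * i / fs)
    for i in range(fs)
]
detected = classify_ssvep(trace, fs, stimulus_freqs=[6, 7, 8, 13])
```

Setting `n_harmonics=1` reduces the score to the fundamental alone, which is the baseline the abstract compares against when it reports the benefit of using three harmonics.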
Lambron, Julien; Rakotonjanahary, Josué; Loisel, Didier; Frampas, Eric; De Carli, Emilie; Delion, Matthieu; Rialland, Xavier; Toulgoat, Frédérique
2016-02-01
Magnetic resonance (MR) images from children with optic pathway glioma (OPG) are complex. We initiated this study to evaluate the accuracy of MR imaging (MRI) interpretation and to propose a simple and reproducible imaging classification for MRI. We randomly selected 140 MRIs from among 510 MRIs performed on 104 children diagnosed with OPG in France from 1990 to 2004. These images were reviewed independently by three radiologists (F.T., 15 years of experience in neuroradiology; D.L., 25 years of experience in pediatric radiology; and J.L., 3 years of experience in radiology) using a classification derived from the Dodge and modified Dodge classifications. Intra- and interobserver reliabilities were assessed using the Bland-Altman method and the kappa coefficient. These reviews allowed the definition of reliable criteria for MRI interpretation. The reviews showed intraobserver variability and large discrepancies among the three radiologists (kappa coefficient varying from 0.11 to 1). These variabilities were too large for the interpretation to be considered reproducible over time or among observers. A consensual analysis, taking into account all observed variabilities, allowed the development of a definitive interpretation protocol. Using this revised protocol, we observed consistent intra- and interobserver results (kappa coefficient varying from 0.56 to 1). The mean interobserver difference for the solid portion of the tumor with contrast enhancement was 0.8 cm³ (limits of agreement = -16 to 17). We propose simple and precise rules for improving the accuracy and reliability of MRI interpretation for children with OPG. Further studies will be necessary to investigate the possible prognostic value of this approach.
Using spectrotemporal indices to improve the fruit-tree crop classification accuracy
NASA Astrophysics Data System (ADS)
Peña, M. A.; Liao, R.; Brenning, A.
2017-06-01
This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this "enhanced" feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.
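The feature construction described above, forming a normalized difference index from every pair of bands in the time series, can be sketched as follows. This is a minimal illustration with hypothetical names and reflectance values; it does not reproduce the study's band set or its penalized LDA step.

```python
from itertools import combinations


def normalized_difference_indices(values):
    """Return {(i, j): (v_i - v_j) / (v_i + v_j)} for every pair i < j.

    `values` holds one pixel's reflectances, flattened over all bands and
    image dates, so pairs may combine bands from different dates.
    """
    ndis = {}
    for (i, a), (j, b) in combinations(enumerate(values), 2):
        denom = a + b
        # Guard against division by zero for dark pixels.
        ndis[(i, j)] = (a - b) / denom if denom != 0 else 0.0
    return ndis


# Hypothetical reflectances for three band/date combinations of one pixel.
pixel = [0.05, 0.30, 0.15]
features = normalized_difference_indices(pixel)
```

For n band/date values this produces n(n-1)/2 indices, which is why the enhanced feature set grows quickly and motivates the penalized variants of LDA. Note also that the 23% improvement quoted in the abstract is relative: (0.13 - 0.10) / 0.13 is approximately 0.23.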
High Accuracy Human Activity Recognition Based on Sparse Locality Preserving Projections.
Zhu, Xiangbin; Qiu, Huiling
2016-01-01
Human activity recognition (HAR) from temporal streams of sensory data has been applied to many fields, such as healthcare services, intelligent environments and cyber security. However, the classification accuracy of most existing methods is insufficient for some applications, especially healthcare services. To improve accuracy, it is necessary to develop a novel method that takes full account of the intrinsic sequential characteristics of time-series sensory data. Moreover, each human activity may have correlated feature relationships at different levels. Therefore, in this paper, we propose a three-stage continuous hidden Markov model (TSCHMM) approach to recognize human activities. The proposed method comprises coarse, fine and accurate classification. Feature reduction is an important step in classification processing. In this paper, sparse locality preserving projections (SpLPP) is exploited to determine the optimal feature subsets for accurate classification of the stationary-activity data. It can extract more discriminative activity features from the sensor data than locality preserving projections. Furthermore, all of the gyro-based features are used for accurate classification of the moving data. Compared with other methods, our method uses significantly fewer features, and the overall accuracy is clearly improved.
NASA Technical Reports Server (NTRS)
Sadowski, F. E.; Sarno, J. E.
1976-01-01
First, an analysis of forest feature signatures was used to help explain the large variation in classification accuracy that can occur among individual forest features for any one case of spatial resolution and the inconsistent changes in classification accuracy that were demonstrated among features as spatial resolution was degraded. Second, the classification rejection threshold was varied in an effort to reduce the large proportion of unclassified resolution elements that previously appeared in the processing of coarse resolution data when a constant rejection threshold was used for all cases of spatial resolution. For the signature analysis, two-channel ellipse plots showing the feature signature distributions for several cases of spatial resolution indicated that the capability of signatures to correctly identify their respective features is dependent on the amount of statistical overlap among signatures. Reductions in signature variance that occur in data of degraded spatial resolution may not necessarily decrease the amount of statistical overlap among signatures having large variance and small mean separations. Features classified by such signatures may thus continue to have similar amounts of misclassified elements in coarser resolution data, and thus, not necessarily improve in classification accuracy.
Maizlin, Ilan I; Redden, David T; Beierle, Elizabeth A; Chen, Mike K; Russell, Robert T
2017-04-01
Surgical wound classification, introduced in 1964, stratifies the risk of surgical site infection (SSI) based on a clinical estimate of the inoculum of bacteria encountered during the procedure. Recent literature has questioned the accuracy of predicting SSI risk based on wound classification. We hypothesized that a more specific model founded on specific patient and perioperative factors would more accurately predict the risk of SSI. Using all observations from the 2012 to 2014 pediatric National Surgical Quality Improvement Program-Pediatric (NSQIP-P) Participant Use File, patients were randomized into model creation and model validation datasets. Potential perioperative predictive factors were assessed with univariate analysis for each of 4 outcomes: wound dehiscence, superficial wound infection, deep wound infection, and organ space infection. A multiple logistic regression model with a step-wise backwards elimination was performed. A receiver operating characteristic curve with c-statistic was generated to assess the model discrimination for each outcome. A total of 183,233 patients were included. All perioperative NSQIP factors were evaluated for clinical pertinence. Of the original 43 perioperative predictive factors selected, 6 to 9 predictors for each outcome were significantly associated with postoperative SSI. The predictive accuracy level of our model compared favorably with the traditional wound classification in each outcome of interest. The proposed model from NSQIP-P demonstrated a significantly improved predictive ability for postoperative SSIs than the current wound classification system. This model will allow providers to more effectively counsel families and patients of these risks, and more accurately reflect true risks for individual surgical patients to hospitals and payers. Copyright © 2017 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
Tomizawa, Yutaka; Iyer, Prasad G; Wongkeesong, Louis M; Buttar, Navtej S; Lutzke, Lori S; Wu, Tsung-Teh; Wang, Kenneth K
2013-01-01
AIM: To investigate a classification of endocytoscopy (ECS) images in Barrett’s esophagus (BE) and evaluate its diagnostic performance and interobserver variability. METHODS: ECS was applied to surveillance endoscopic mucosal resection (EMR) specimens of BE ex-vivo. The mucosal surface of specimen was stained with 1% methylene blue and surveyed with a catheter-type endocytoscope. We selected still images that were most representative of the endoscopically suspect lesion and matched with the final histopathological diagnosis to accomplish accurate correlation. The diagnostic performance and inter-observer variability of the new classification scheme were assessed in a blinded fashion by physicians with expertise in both BE and ECS and inexperienced physicians with no prior exposure to ECS. RESULTS: Three staff physicians and 22 gastroenterology fellows classified eight randomly assigned unknown still ECS pictures (two images per each classification) into one of four histopathologic categories as follows: (1) BEC1-squamous epithelium; (2) BEC2-BE without dysplasia; (3) BEC3-BE with dysplasia; and (4) BEC4-esophageal adenocarcinoma (EAC) in BE. Accuracy of diagnosis in staff physicians and clinical fellows were, respectively, 100% and 99.4% for BEC1, 95.8% and 83.0% for BEC2, 91.7% and 83.0% for BEC3, and 95.8% and 98.3% for BEC4. Interobserver agreement of the faculty physicians and fellows in classifying each category were 0.932 and 0.897, respectively. CONCLUSION: This is the first study to investigate classification system of ECS in BE. This ex-vivo pilot study demonstrated acceptable diagnostic accuracy and excellent interobserver agreement. PMID:24379583
NASA Astrophysics Data System (ADS)
Jokar Arsanjani, Jamal; Vaz, Eric
2015-03-01
Until recently, land surveys and digital interpretation of remotely sensed imagery have been used to generate land use inventories. These techniques, however, are often cumbersome and costly, consuming large amounts of technical resources and time. The advances of web 2.0 have brought a wide array of technological achievements, stimulating participation in collaborative and crowd-sourced mapping products. This has been fostered by GPS-enabled devices and accessible tools that enable visual interpretation of high-resolution satellite images/air photos provided in collaborative mapping projects. Such technologies offer an integrative approach to geography by promoting public participation and allowing accurate assessment and classification of land use as well as geographical features. OpenStreetMap (OSM) has supported the evolution of such techniques, contributing to the existence of a large inventory of spatial land use information. This paper explores the introduction of this novel participatory phenomenon for land use classification in Europe's metropolitan regions. We adopt a positivistic approach to assess comparatively the accuracy of these contributions of OSM for land use classifications in seven large European metropolitan regions. Thematic accuracy and degree of completeness of OSM data were compared to available Global Monitoring for Environment and Security Urban Atlas (GMESUA) datasets for the chosen metropolises. We further extend our findings of land use within a novel framework for geography, justifying that volunteered geographic information (VGI) sources are of great benefit for land use mapping depending on location and degree of VGI dynamism and offer a great alternative to traditional mapping techniques for metropolitan regions throughout Europe.
Evaluation of several land use types at the local level suggests that a number of OSM classes (such as anthropogenic land use, agricultural and some natural environment classes) are viable alternatives for land use classification. These classes are highly accurate and can be integrated into planning decisions for stakeholders and policymakers.
Bullich, Santiago; Seibyl, John; Catafau, Ana M; Jovalekic, Aleksandar; Koglin, Norman; Barthel, Henryk; Sabri, Osama; De Santi, Susan
2017-01-01
Standardized uptake value ratios (SUVRs) calculated from cerebral cortical areas can be used to categorize 18 F-Florbetaben (FBB) PET scans by applying appropriate cutoffs. The objective of this work was first to generate FBB SUVR cutoffs using visual assessment (VA) as standard of truth (SoT) for a number of reference regions (RR) (cerebellar gray matter (GCER), whole cerebellum (WCER), pons (PONS), and subcortical white matter (SWM)). Secondly, to validate the FBB PET scan categorization performed by SUVR cutoffs against the categorization made by post-mortem histopathological confirmation of the Aβ presence. Finally, to evaluate the added value of SUVR cutoff categorization to VA. SUVR cutoffs were generated for each RR using FBB scans from 143 subjects who were visually assessed by 3 readers. SUVR cutoffs were validated in 78 end-of life subjects using VA from 8 independent blinded readers (3 expert readers and 5 non-expert readers) and histopathological confirmation of the presence of neuritic beta-amyloid plaques as SoT. Finally, the number of correctly or incorrectly classified scans according to pathology results using VA and SUVR cutoffs was compared. Composite SUVR cutoffs generated were 1.43 (GCER), 0.96 (WCER), 0.78 (PONS) and 0.71 (SWM). Accuracy values were high and consistent across RR (range 83-94% for histopathology, and 85-94% for VA). SUVR cutoff performed similarly as VA but did not improve VA classification of FBB scans read either by expert readers or the majority read but provided higher accuracy than some non-expert readers. The accurate scan classification obtained in this study supports the use of VA as SoT to generate site-specific SUVR cutoffs. For an elderly end of life population, VA and SUVR cutoff categorization perform similarly in classifying FBB scans as Aβ-positive or Aβ-negative. These results emphasize the additional contribution that SUVR cutoff classification may have compared with VA performed by non-expert readers.
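The cutoff-based categorization described in this abstract can be illustrated with a minimal Python sketch. The four cutoff values are the ones reported above; the function names, the input format, and the convention that a value strictly above the cutoff counts as Aβ-positive are assumptions made for illustration, not details taken from the study.

```python
# Composite SUVR cutoffs per reference region, as reported in the abstract.
SUVR_CUTOFFS = {"GCER": 1.43, "WCER": 0.96, "PONS": 0.78, "SWM": 0.71}


def categorize_scan(suvr, reference_region):
    """Return 'positive' or 'negative' for one composite SUVR value.

    Assumption: values strictly above the cutoff count as amyloid-positive.
    """
    cutoff = SUVR_CUTOFFS[reference_region]
    return "positive" if suvr > cutoff else "negative"


def accuracy(predicted, truth):
    """Fraction of scans whose category matches the standard of truth."""
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)
```

Comparing the output of `categorize_scan` against histopathology labels with `accuracy` mirrors, in simplified form, the validation step the abstract describes.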
NASA Astrophysics Data System (ADS)
Fujita, Yusuke; Mitani, Yoshihiro; Hamamoto, Yoshihiko; Segawa, Makoto; Terai, Shuji; Sakaida, Isao
2017-03-01
Ultrasound imaging is a popular, non-invasive tool used in the diagnosis of liver disease. Cirrhosis is a chronic liver disease that can advance to liver cancer. Early detection and appropriate treatment are crucial to prevent liver cancer. However, ultrasound image analysis is very challenging because of the low signal-to-noise ratio of ultrasound images. To achieve higher classification performance, the selection of training regions of interest (ROIs) is very important, because it affects classification accuracy. The purpose of our study is to detect cirrhosis with high accuracy using liver ultrasound images. In our previous work, training ROI selection by MILBoost and multiple-ROI classification based on the product rule were proposed to achieve high classification performance. In this article, we propose a self-training method to select training ROIs effectively. Experiments were performed to evaluate the effect of self-training, using both manually selected and automatically selected ROIs. Experimental results show that self-training with manually selected ROIs achieved higher classification performance than other approaches, including our conventional methods. Manual ROI definition and sample selection are important for improving classification accuracy in cirrhosis detection using ultrasound images.
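The product rule mentioned in this abstract is a standard way to fuse several classifier outputs: the per-class posterior probabilities from each ROI are multiplied, and the class with the largest product wins. The sketch below shows only that generic rule; the input format (one dictionary of class probabilities per ROI) and the function name are hypothetical, and the authors' actual classifier is not reproduced here.

```python
import math


def product_rule(roi_posteriors):
    """Combine per-ROI class posteriors by multiplying them class-wise.

    roi_posteriors: list of dicts mapping class label -> probability,
    one dict per ROI (hypothetical input format).
    Returns the winning label and the normalized combined scores.
    """
    labels = list(roi_posteriors[0].keys())
    # Sum logs instead of multiplying to avoid underflow with many ROIs.
    log_scores = {
        c: sum(math.log(max(p[c], 1e-12)) for p in roi_posteriors)
        for c in labels
    }
    peak = max(log_scores.values())
    unnormalized = {c: math.exp(s - peak) for c, s in log_scores.items()}
    total = sum(unnormalized.values())
    combined = {c: v / total for c, v in unnormalized.items()}
    return max(combined, key=combined.get), combined
```

For three ROIs with cirrhosis probabilities 0.6, 0.7 and 0.4, the class-wise products are 0.168 and 0.072, so the image is labelled as cirrhosis even though one ROI disagrees.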
The impact of OCR accuracy on automated cancer classification of pathology reports.
Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle
2012-01-01
To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
Multi-phenology WorldView-2 imagery improves remote sensing of savannah tree species
NASA Astrophysics Data System (ADS)
Madonsela, Sabelo; Cho, Moses Azong; Mathieu, Renaud; Mutanga, Onisimo; Ramoelo, Abel; Kaszta, Żaneta; Kerchove, Ruben Van De; Wolff, Eléonore
2017-06-01
Biodiversity mapping in African savannah is important for monitoring changes and ensuring sustainable use of ecosystem resources. Biodiversity mapping can benefit from multi-spectral instruments such as WorldView-2 with very high spatial resolution and a spectral configuration encompassing important spectral regions not previously available for vegetation mapping. This study investigated i) the benefits of the eight-band WorldView-2 (WV-2) spectral configuration for discriminating tree species in Southern African savannah and ii) whether multiple images acquired at key points of the typical phenological development of savannahs (peak productivity, transition to senescence) improve tree species classifications. We first assessed the discriminatory power of WV-2 bands using interspecies-Spectral Angle Mapper (SAM) via a Band Add-On procedure and tested the spectral capability of WorldView-2 against simulated IKONOS for tree species classification. The results from the interspecies-SAM procedure identified the yellow and red bands as the most statistically significant bands (p = 0.000251 and p = 0.000039 respectively) in the discriminatory power of WV-2 during the transition from wet to dry season (April). Using a Random Forest classifier, the classification scenarios investigated showed that i) the eight bands of the WV-2 sensor achieved higher classification accuracy for the April date (transition from wet to dry season, senescence) compared to the March date (peak productivity season), ii) the WV-2 spectral configuration systematically outperformed the IKONOS sensor spectral configuration and iii) the multi-temporal approach (March and April combined) improved the discrimination of tree species and produced the highest overall accuracy at 80.4%. Consistent with the interspecies-SAM procedure, the yellow (605 nm) band also showed a statistically significant contribution to the improved classification accuracy from WV-2.
These results highlight the mapping opportunities presented by WV-2 data for monitoring the distribution status of, e.g., species often harvested by local communities (e.g. Sclerocarya birrea), encroaching species, or species-specific tree losses induced by elephants.
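The spectral angle that underlies Spectral Angle Mapper methods treats two spectra as vectors and measures the angle between them, so that a small angle indicates similar spectral shape regardless of overall brightness. The sketch below shows only this standard angle computation; the interspecies-SAM and Band Add-On procedures used in the study are not reproduced, and the function name is an assumption.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra given as equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)
```

Two spectra that differ only by a constant scale factor give an angle of approximately zero, which is why the measure is relatively insensitive to illumination differences.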
NASA Astrophysics Data System (ADS)
de Oliveira Silveira, Eduarda Martiniano; de Menezes, Michele Duarte; Acerbi Júnior, Fausto Weimar; Castro Nunes Santos Terra, Marcela; de Mello, José Márcio
2017-07-01
Accurate mapping and monitoring of savanna and semiarid woodland biomes are needed to support the selection of areas of conservation, to provide sustainable land use, and to improve the understanding of vegetation. The potential of geostatistical features, derived from medium spatial resolution satellite imagery, to characterize contrasted landscape vegetation cover and improve object-based image classification is studied. The study site in Brazil includes cerrado sensu stricto, deciduous forest, and palm swamp vegetation cover. Sentinel 2 and Landsat 8 images were acquired and divided into objects, for each of which a semivariogram was calculated using near-infrared (NIR) and normalized difference vegetation index (NDVI) to extract the set of geostatistical features. The features selected by principal component analysis were used as input data to train a random forest algorithm. Tests were conducted, combining spectral and geostatistical features. Change detection evaluation was performed using a confusion matrix and its accuracies. The semivariogram curves were efficient to characterize spatial heterogeneity, with similar results using NIR and NDVI from Sentinel 2 and Landsat 8. Accuracy was significantly greater when combining geostatistical features with spectral data, suggesting that this method can improve image classification results.
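The semivariogram used here as a source of geostatistical features summarizes how dissimilar pixel values become as the distance between them grows. A minimal sketch of the classical empirical estimator, half the mean squared difference between values separated by a given lag, is shown below for a regularly spaced one-dimensional transect. The restriction to one dimension and the function names are simplifying assumptions; the study computes semivariograms per image object.

```python
def semivariance(values, lag):
    """Empirical semivariance of a regularly spaced 1-D transect at one lag."""
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    if not pairs:
        raise ValueError("lag is too large for this transect")
    # Half the mean squared difference over all pairs separated by `lag`.
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


def semivariogram(values, max_lag):
    """Semivariance for every lag from 1 to max_lag inclusive."""
    return [semivariance(values, h) for h in range(1, max_lag + 1)]
```

Features such as the sill or range would then be read off the resulting curve; how the authors derived their feature set is not specified in the abstract.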
Significance of perceptually relevant image decolorization for scene classification
NASA Astrophysics Data System (ADS)
Viswanathan, Sowmya; Divakaran, Govind; Soman, Kutti Padanyl
2017-11-01
Color images contain luminance and chrominance components representing the intensity and color information, respectively. The objective of this paper is to show the significance of incorporating chrominance information to the task of scene classification. An improved color-to-grayscale image conversion algorithm that effectively incorporates chrominance information is proposed using the color-to-gray structure similarity index and singular value decomposition to improve the perceptual quality of the converted grayscale images. The experimental results based on an image quality assessment for image decolorization and its success rate (using the Cadik and COLOR250 datasets) show that the proposed image decolorization technique performs better than eight existing benchmark algorithms for image decolorization. In the second part of the paper, the effectiveness of incorporating the chrominance component for scene classification tasks is demonstrated using a deep belief network-based image classification system developed using dense scale-invariant feature transforms. The amount of chrominance information incorporated into the proposed image decolorization technique is confirmed with the improvement to the overall scene classification accuracy. Moreover, the overall scene classification performance improved by combining the models obtained using the proposed method and conventional decolorization methods.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms based on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. The location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
NASA Astrophysics Data System (ADS)
Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen
2017-03-01
In this paper, we investigate the effect of the error criterion used during the training phase of an artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected by interstitial lung diseases (ILD). The mean square error (MSE) and cross-entropy (CE) criteria are chosen because they are the most popular choices in state-of-the-art implementations. The classification experiment is performed on six ILD patterns, viz. Consolidation, Emphysema, Ground Glass Opacity, Micronodules, Fibrosis and Healthy, from the MedGIFT database. Texture features from an arbitrary region of interest (AROI) are extracted using a Gabor filter. Two different neural networks are trained with the scaled conjugate gradient back-propagation algorithm, using the MSE and CE error criteria respectively for weight updating. Performance is evaluated in terms of the average accuracy of these classifiers using 4-fold cross-validation. Each network is trained five times for each fold with randomly initialized weight vectors, and accuracies are computed. A significant improvement in classification accuracy is observed when the ANN is trained using CE (67.27%) as the error function compared to MSE (63.60%). Moreover, the standard deviation of the classification accuracy for the network trained with the CE criterion (6.69) is lower than that for the network trained with the MSE criterion (10.32).
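The two error criteria compared in this abstract can be written down in a few lines. The sketch below uses their textbook forms for a single sample with a one-hot target; the function names and the small clipping constant are assumptions, and nothing here reproduces the authors' network or training procedure.

```python
import math


def mean_squared_error(target, output):
    """Average squared difference between target and network output."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


def cross_entropy(target, output, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities.

    Outputs are clipped at eps (an arbitrary choice) to avoid log(0).
    """
    return -sum(t * math.log(max(o, eps)) for t, o in zip(target, output))
```

With a one-hot target, cross-entropy depends only on the probability assigned to the correct class and grows without bound as that probability approaches zero, whereas MSE is bounded; this difference in how strongly confident mistakes are penalized is one commonly cited reason the two criteria can train differently.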
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
Characterization and classification of South American land cover types using satellite data
NASA Technical Reports Server (NTRS)
Townshend, J. R. G.; Justice, C. O.; Kalb, V.
1987-01-01
Various methods are compared for carrying out land cover classifications of South America using multitemporal Advanced Very High Resolution Radiometer data. Fifty-two images of the normalized difference vegetation index (NDVI) from a 1-year period are used to generate multitemporal data sets. Three main approaches to land cover classification are considered, namely the use of the principal components transformed images, the use of a characteristic curves procedure based on NDVI values plotted against time, and finally application of the maximum likelihood rule to multitemporal data sets. Comparison of results from training sites indicates that the last approach yields the most accurate results. Despite the reliance on training site figures for performance assessment, the results are nevertheless extremely encouraging, with accuracies for several cover types exceeding 90 per cent.
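The normalized difference vegetation index on which these multitemporal data sets are built is the difference between near-infrared and red reflectance divided by their sum. A minimal sketch follows; the function name and the choice to return 0.0 when both bands are zero are assumptions made for illustration.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denominator = nir + red
    if denominator == 0:
        # Assumption: treat a pixel with no signal in either band as 0.0.
        return 0.0
    return (nir - red) / denominator


def ndvi_profile(nir_series, red_series):
    """NDVI for each date in a multitemporal series of band values."""
    return [ndvi(n, r) for n, r in zip(nir_series, red_series)]
```

A profile of this kind, computed per pixel across the images of a year, is the sort of NDVI-against-time curve that the characteristic curves procedure in the abstract works with.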
Ahmad, Iftikhar; Ahmad, Manzoor; Khan, Karim; Ikram, Masroor
2016-06-01
Optical polarimetry was employed for assessment of ex vivo healthy and basal cell carcinoma (BCC) tissue samples from human skin. Polarimetric analyses revealed that depolarization and retardance for healthy tissue group were significantly higher (p<0.001) compared to BCC tissue group. Histopathology indicated that these differences partially arise from BCC-related characteristic changes in tissue morphology. Wilks lambda statistics demonstrated the potential of all investigated polarimetric properties for computer assisted classification of the two tissue groups. Based on differences in polarimetric properties, partial least square (PLS) regression classified the samples with 100% accuracy, sensitivity and specificity. These findings indicate that optical polarimetry together with PLS statistics hold promise for automated pathology classification. Copyright © 2016 Elsevier B.V. All rights reserved.
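The accuracy, sensitivity and specificity reported in this abstract follow from the counts of a two-class confusion matrix. The sketch below shows the standard definitions, taking BCC as the positive class; the function name and the example counts in the note are hypothetical and are not the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive samples classified correctly/incorrectly.
    tn/fp: negative samples classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

A result of 100% on all three measures, as reported above, corresponds to `fn == 0` and `fp == 0`; with hypothetical counts of 8, 2, 9 and 1 the same function gives 0.85, 0.8 and 0.9.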
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused merely on either the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification, together with voxel selection schemes, in terms of classification accuracy and computation time. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) Considering classification accuracy and computation time together, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy at lower time cost. Conclusions/Significance The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods.
Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, with part of the principal components kept as features, is the better choice. PMID:21359184
NASA Astrophysics Data System (ADS)
Melville, Bethany; Lucieer, Arko; Aryal, Jagannath
2018-04-01
This paper presents a random forest classification approach for identifying and mapping three types of lowland native grassland communities found in the Tasmanian Midlands region. Due to the high conservation priority assigned to these communities, there has been an increasing need to identify appropriate datasets that can be used to derive accurate and frequently updateable maps of community extent. Therefore, this paper proposes a method employing repeat classification and statistical significance testing as a means of identifying the most appropriate dataset for mapping these communities. Two datasets were acquired and analysed; a Landsat ETM+ scene, and a WorldView-2 scene, both from 2010. Training and validation data were randomly subset using a k-fold (k = 50) approach from a pre-existing field dataset. Poa labillardierei, Themeda triandra and lowland native grassland complex communities were identified in addition to dry woodland and agriculture. For each subset of randomly allocated points, a random forest model was trained based on each dataset, and then used to classify the corresponding imagery. Validation was performed using the reciprocal points from the independent subset that had not been used to train the model. Final training and classification accuracies were reported as per class means for each satellite dataset. Analysis of Variance (ANOVA) was undertaken to determine whether classification accuracy differed between the two datasets, as well as between classifications. Results showed mean class accuracies between 54% and 87%. Class accuracy only differed significantly between datasets for the dry woodland and Themeda grassland classes, with the WorldView-2 dataset showing higher mean classification accuracies. 
The results of this study indicate that remote sensing is a viable method for the identification of lowland native grassland communities in the Tasmanian Midlands, and that repeat classification and statistical significance testing can be used to identify optimal datasets for vegetation community mapping.
The effect of call libraries and acoustic filters on the identification of bat echolocation.
Clement, Matthew J; Murray, Kevin L; Solick, Donald I; Gruver, Jeffrey C
2014-09-01
Quantitative methods for species identification are commonly used in acoustic surveys for animals. While various identification models have been studied extensively, there has been little study of methods for selecting calls prior to modeling or methods for validating results after modeling. We obtained two call libraries with a combined 1556 pulse sequences from 11 North American bat species. We used four acoustic filters to automatically select and quantify bat calls from the combined library. For each filter, we trained a species identification model (a quadratic discriminant function analysis) and compared the classification ability of the models. In a separate analysis, we trained a classification model using just one call library. We then compared a conventional model assessment that used the training library against an alternative approach that used the second library. We found that filters differed in the share of known pulse sequences that were selected (68 to 96%), the share of non-bat noises that were excluded (37 to 100%), their measurement of various pulse parameters, and their overall correct classification rate (41% to 85%). Although the top two filters did not differ significantly in overall correct classification rate (85% and 83%), rates differed significantly for some bat species. In our assessment of call libraries, overall correct classification rates were significantly lower (15% to 23% lower) when tested on the second call library instead of the training library. Well-designed filters obviated the need for subjective and time-consuming manual selection of pulses. Accordingly, researchers should carefully design and test filters and include adequate descriptions in publications. Our results also indicate that it may not be possible to extend inferences about model accuracy beyond the training library. 
If so, the accuracy of acoustic-only surveys may be lower than commonly reported, which could affect ecological understanding or management decisions based on acoustic surveys.
The effect of call libraries and acoustic filters on the identification of bat echolocation
Clement, Matthew J; Murray, Kevin L; Solick, Donald I; Gruver, Jeffrey C
2014-01-01
Quantitative methods for species identification are commonly used in acoustic surveys for animals. While various identification models have been studied extensively, there has been little study of methods for selecting calls prior to modeling or methods for validating results after modeling. We obtained two call libraries with a combined 1556 pulse sequences from 11 North American bat species. We used four acoustic filters to automatically select and quantify bat calls from the combined library. For each filter, we trained a species identification model (a quadratic discriminant function analysis) and compared the classification ability of the models. In a separate analysis, we trained a classification model using just one call library. We then compared a conventional model assessment that used the training library against an alternative approach that used the second library. We found that filters differed in the share of known pulse sequences that were selected (68 to 96%), the share of non-bat noises that were excluded (37 to 100%), their measurement of various pulse parameters, and their overall correct classification rate (41% to 85%). Although the top two filters did not differ significantly in overall correct classification rate (85% and 83%), rates differed significantly for some bat species. In our assessment of call libraries, overall correct classification rates were significantly lower (15% to 23% lower) when tested on the second call library instead of the training library. Well-designed filters obviated the need for subjective and time-consuming manual selection of pulses. Accordingly, researchers should carefully design and test filters and include adequate descriptions in publications. Our results also indicate that it may not be possible to extend inferences about model accuracy beyond the training library. 
If so, the accuracy of acoustic-only surveys may be lower than commonly reported, which could affect ecological understanding or management decisions based on acoustic surveys. PMID:25535563
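The gap between testing on the training library and testing on a second library can be illustrated with a small sketch. It is not the authors' method: a nearest-centroid classifier stands in for the quadratic discriminant function analysis so that the example needs only the standard library, and the two "libraries", the species labels, and the pulse parameters (frequency in kHz, duration in ms) are made up.

```python
from statistics import mean

def fit_centroids(samples):
    """Fit a nearest-centroid model: one mean feature vector per species."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_label.items()}

def predict(centroids, features):
    """Assign the species whose centroid is closest (squared Euclidean distance)."""
    def sqdist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def correct_rate(centroids, samples):
    """Share of samples whose predicted species matches the known species."""
    hits = sum(predict(centroids, f) == y for f, y in samples)
    return hits / len(samples)

# Hypothetical call libraries: ((frequency kHz, duration ms), species).
library_a = [((40.0, 5.0), "sp1"), ((42.0, 5.5), "sp1"),
             ((25.0, 9.0), "sp2"), ((27.0, 8.5), "sp2")]
library_b = [((39.0, 5.2), "sp1"), ((30.0, 7.0), "sp1"),
             ((26.0, 9.1), "sp2"), ((35.0, 6.5), "sp2")]

model = fit_centroids(library_a)                    # train on library A only
rate_train = correct_rate(model, library_a)         # conventional assessment
rate_independent = correct_rate(model, library_b)   # assessment on second library
print(rate_train, rate_independent)
```

With these invented numbers the model classifies every training pulse correctly but only half of the pulses from the second library, which is the kind of optimism the abstract warns about.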
Nutritional status in sick children and adolescents is not accurately reflected by BMI-SDS.
Fusch, Gerhard; Raja, Preeya; Dung, Nguyen Quang; Karaolis-Danckert, Nadina; Barr, Ronald; Fusch, Christoph
2013-01-01
Nutritional status provides helpful information about disease severity and treatment effectiveness. Body mass index standard deviation scores (BMI-SDS) provide an approximation of body composition and thus are frequently used to classify nutritional status of sick children and adolescents. However, the accuracy of estimating body composition in this population using BMI-SDS has not been assessed. Thus, this study aims to evaluate the accuracy of nutritional status classification in sick infants and adolescents using BMI-SDS, upon comparison to classification using percentage body fat (%BF) reference charts. BMI-SDS was calculated from anthropometric measurements and %BF was measured using dual-energy x-ray absorptiometry (DXA) for 393 sick children and adolescents (5 months-18 years). Subjects were classified by nutritional status (underweight, normal weight, overweight, and obese), using 2 methods: (1) BMI-SDS, based on age- and gender-specific percentiles, and (2) %BF reference charts (standard). Linear regression and a correlation analysis were conducted to compare agreement between both methods of nutritional status classification. %BF reference value comparisons were also made between 3 independent sources based on German, Canadian, and American study populations. Nutritional status classifications by BMI-SDS and %BF agreed moderately (r² = 0.75 and 0.76 in boys and girls, respectively). The misclassification of nutritional status in sick children and adolescents using BMI-SDS was 27% when using German %BF references. Similar rates were observed when using Canadian and American %BF references (24% and 23%, respectively). Using BMI-SDS to determine nutritional status in a sick population is not considered an appropriate clinical tool for identifying individual underweight or overweight children or adolescents. However, BMI-SDS may be appropriate for longitudinal measurements or for screening purposes in large field studies. 
When accurate nutritional status classification of a sick patient is needed for clinical purposes, nutritional status will be assessed more accurately using methods that accurately measure %BF, such as DXA.
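A misclassification rate of the kind reported above can be computed by comparing each subject's category under the test method with the category under the reference method. The following is a minimal sketch with invented labels for eight hypothetical subjects; it does not reproduce the study's data or its percentile cut-offs.

```python
def misclassification_rate(test_labels, reference_labels):
    """Share of subjects whose test-method category differs from the reference."""
    if len(test_labels) != len(reference_labels):
        raise ValueError("both label lists must describe the same subjects")
    disagreements = sum(t != r for t, r in zip(test_labels, reference_labels))
    return disagreements / len(reference_labels)

# Hypothetical categories: BMI-SDS (test) versus %BF reference charts (standard).
bmi_class = ["normal", "underweight", "normal", "overweight",
             "obese", "normal", "normal", "overweight"]
bf_class = ["normal", "normal", "normal", "overweight",
            "overweight", "normal", "underweight", "overweight"]

rate = misclassification_rate(bmi_class, bf_class)
print(f"misclassified: {rate:.1%}")
```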
Estimation of different data compositions for early-season crop type classification.
Hao, Pengyu; Wu, Mingquan; Niu, Zheng; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two study regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer's accuracies (PAs) and user's accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. 
Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study.
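The trade-off between composition periods can be sketched in a few lines. The abstract does not state which compositing rule was used, so the example assumes maximum-value compositing, a common choice for NDVI; the daily values and the function name `composite` are invented, and `None` marks a day with no usable pixel.

```python
def composite(daily_ndvi, period):
    """Maximum-value composite: keep the largest valid NDVI in each window."""
    composites = []
    for start in range(0, len(daily_ndvi), period):
        window = [v for v in daily_ndvi[start:start + period] if v is not None]
        # A window with no valid observation stays missing.
        composites.append(max(window) if window else None)
    return composites

# Hypothetical 16-day NDVI series for one pixel (None = cloud or no data).
daily = [0.20, None, 0.25, 0.22, None, None, 0.31, 0.35,
         None, None, None, None, 0.50, 0.48, None, 0.55]

four_day = composite(daily, 4)    # short period: finer timing, but gaps remain
eight_day = composite(daily, 8)   # longer period: no gaps, coarser timing
print(four_day)
print(eight_day)
```

In this toy series the 4-day composite still contains a missing value, while the 8-day composite fills it at the cost of temporal detail.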
Estimation of different data compositions for early-season crop type classification
Wu, Mingquan; Wang, Li; Zhan, Yulin
2018-01-01
Timely and accurate crop type distribution maps are important inputs for crop yield estimation and production forecasting as multi-temporal images can observe phenological differences among crops. Therefore, time series remote sensing data are essential for crop type mapping, and image composition has commonly been used to improve the quality of the image time series. However, the optimal composition period is unclear as long composition periods (such as compositions lasting half a year) are less informative and short composition periods lead to information redundancy and missing pixels. In this study, we initially acquired daily 30 m Normalized Difference Vegetation Index (NDVI) time series by fusing MODIS, Landsat, Gaofen and Huanjing (HJ) NDVI, and then composited the NDVI time series using four strategies (daily, 8-day, 16-day, and 32-day). We used Random Forest to identify crop types and evaluated the classification performances of the NDVI time series generated from four composition strategies in two study regions from Xinjiang, China. Results indicated that crop classification performance improved as crop separabilities and classification accuracies increased, and classification uncertainties dropped in the green-up stage of the crops. When using daily NDVI time series, overall accuracies saturated at 113-day and 116-day in Bole and Luntai, and the saturated overall accuracies (OAs) were 86.13% and 91.89%, respectively. Cotton could be identified 40∼60 days and 35∼45 days earlier than the harvest in Bole and Luntai when using daily, 8-day and 16-day composition NDVI time series since both producer’s accuracies (PAs) and user’s accuracies (UAs) were higher than 85%. Among the four compositions, the daily NDVI time series generated the highest classification accuracies. 
Although the 8-day, 16-day and 32-day compositions had similar saturated overall accuracies (around 85% in Bole and 83% in Luntai), the 8-day and 16-day compositions achieved these accuracies around 155-day in Bole and 133-day in Luntai, which were earlier than the 32-day composition (170-day in both Bole and Luntai). Therefore, when the daily NDVI time series cannot be acquired, the 16-day composition is recommended in this study. PMID:29868265
Three-Class Mammogram Classification Based on Descriptive CNN Features
Zhang, Qianni; Jadoon, Adeel
2017-01-01
In this paper, a novel classification technique for a large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we present two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed into four subbands by means of the two-dimensional discrete wavelet transform (2D-DWT), while in the second method the discrete curvelet transform (DCT) is used. In both methods, dense scale invariant features (DSIFT) are extracted for all subbands. An input data matrix containing these subband features of all the mammogram patches is created and processed as input to a convolutional neural network (CNN). A softmax layer and a support vector machine (SVM) layer are used to train the CNN for classification. The proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT achieved accuracy rates of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques. PMID:28191461
Three-Class Mammogram Classification Based on Descriptive CNN Features.
Jadoon, M Mohsin; Zhang, Qianni; Haq, Ihsan Ul; Butt, Sharjeel; Jadoon, Adeel
2017-01-01
In this paper, a novel classification technique for a large data set of mammograms using a deep learning method is proposed. The proposed model targets a three-class classification study (normal, malignant, and benign cases). In our model we present two methods, namely, convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT). An augmented data set is generated by using mammogram patches. To enhance the contrast of mammogram images, the data set is filtered by contrast limited adaptive histogram equalization (CLAHE). In the CNN-DW method, enhanced mammogram images are decomposed into four subbands by means of the two-dimensional discrete wavelet transform (2D-DWT), while in the second method the discrete curvelet transform (DCT) is used. In both methods, dense scale invariant features (DSIFT) are extracted for all subbands. An input data matrix containing these subband features of all the mammogram patches is created and processed as input to a convolutional neural network (CNN). A softmax layer and a support vector machine (SVM) layer are used to train the CNN for classification. The proposed methods have been compared with existing methods in terms of accuracy rate, error rate, and various validation assessment measures. CNN-DW and CNN-CT achieved accuracy rates of 81.83% and 83.74%, respectively. Simulation results clearly validate the significance and impact of our proposed model as compared to other well-known existing techniques.
Automated classification and quantitative analysis of arterial and venous vessels in fundus images
NASA Astrophysics Data System (ADS)
Alam, Minhaj; Son, Taeyoon; Toslak, Devrim; Lim, Jennifer I.; Yao, Xincheng
2018-02-01
It is known that retinopathies may affect arteries and veins differently. Therefore, reliable differentiation of arteries and veins is essential for computer-aided analysis of fundus images. The purpose of this study is to validate an automated method for robust classification of arteries and veins (A-V) in digital fundus images. We combine optical density ratio (ODR) analysis and a blood vessel tracking algorithm to classify arteries and veins. A matched filtering method is used to enhance retinal blood vessels. Bottom hat filtering and global thresholding are used to segment the vessels and skeletonize individual blood vessels. The vessel tracking algorithm is used to locate the optic disk and to identify source nodes of blood vessels in the optic disk area. Each node can be identified as vein or artery using ODR information. Using the source nodes as starting points, the whole vessel trace is then tracked and classified as vein or artery using vessel curvature and angle information. Fifty color fundus images from diabetic retinopathy patients were used to test the algorithm. Sensitivity, specificity, and accuracy metrics were measured to assess the validity of the proposed classification method compared to ground truths created by two independent observers. The algorithm demonstrated 97.52% accuracy in identifying blood vessels as vein or artery. A quantitative analysis based on the A-V classification showed that the average A-V ratio of width for NPDR subjects with hypertension decreased significantly (43.13%).
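Sensitivity, specificity, and accuracy follow directly from the counts of agreement between predicted and ground-truth labels. The sketch below treats "artery" as the positive class, which is an arbitrary choice made here for illustration, and uses invented labels for eight vessel segments rather than the study's data.

```python
def binary_metrics(predicted, truth, positive="artery"):
    """Return (sensitivity, specificity, accuracy) for a two-class problem."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == positive and t == positive for p, t in pairs)  # true positives
    tn = sum(p != positive and t != positive for p, t in pairs)  # true negatives
    fp = sum(p == positive and t != positive for p, t in pairs)  # false positives
    fn = sum(p != positive and t == positive for p, t in pairs)  # false negatives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy

# Hypothetical labels for eight vessel segments.
truth = ["artery", "artery", "artery", "vein", "vein", "vein", "vein", "artery"]
predicted = ["artery", "artery", "vein", "vein", "vein", "artery", "vein", "artery"]

sens, spec, acc = binary_metrics(predicted, truth)
print(sens, spec, acc)
```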
A deep learning method for classifying mammographic breast density categories.
Mohamed, Aly A; Berg, Wendie A; Peng, Hong; Luo, Yahong; Jankowitz, Rachel C; Wu, Shandong
2018-01-01
Mammographic breast density is an established risk marker for breast cancer and is visually assessed by radiologists in routine mammogram image reading, using four qualitative Breast Imaging and Reporting Data System (BI-RADS) breast density categories. It is particularly difficult for radiologists to consistently distinguish the two most common and most variably assigned BI-RADS categories, i.e., "scattered density" and "heterogeneously dense". The aim of this work was to investigate a deep learning-based breast density classifier to consistently distinguish these two categories, aiming at providing a potential computerized tool to assist radiologists in assigning a BI-RADS category in current clinical workflow. In this study, we constructed a convolutional neural network (CNN)-based model coupled with a large (i.e., 22,000 images) digital mammogram imaging dataset to evaluate the classification performance between the two aforementioned breast density categories. All images were collected from a cohort of 1,427 women who underwent standard digital mammography screening from 2005 to 2016 at our institution. The truths of the density categories were based on standard clinical assessment made by board-certified breast imaging radiologists. Effects of direct training from scratch solely using digital mammogram images and transfer learning of a pretrained model on a large nonmedical imaging dataset were evaluated for the specific task of breast density classification. In order to measure the classification performance, the CNN classifier was also tested on a refined version of the mammogram image dataset by removing some potentially inaccurately labeled images. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to measure the accuracy of the classifier. 
The AUC was 0.9421 when the CNN model was trained from scratch on our own mammogram images, and the accuracy increased gradually with the size of the training sample. Using the pretrained model followed by a fine-tuning process with as few as 500 mammogram images led to an AUC of 0.9265. After removing the potentially inaccurately labeled images, the AUC increased to 0.9882 and 0.9857 without and with the pretrained model, respectively, both significantly higher (P < 0.001) than when using the full imaging dataset. Our study demonstrated high classification accuracies between two difficult-to-distinguish breast density categories that are routinely assessed by radiologists. We anticipate that our approach will help enhance current clinical assessment of breast density and better support consistent density notification to patients in breast cancer screening. © 2017 American Association of Physicists in Medicine.
Semi-supervised classification tool for DubaiSat-2 multispectral imagery
NASA Astrophysics Data System (ADS)
Al-Mansoori, Saeed
2015-10-01
This paper presents a semi-supervised, pixel-based classification tool for multispectral satellite imagery. Few studies have demonstrated such an algorithm for multispectral images, especially when the image consists of 4 bands (Red, Green, Blue and Near Infrared) as in DubaiSat-2 satellite images. The proposed approach applies unsupervised and supervised classification schemes sequentially to identify four classes in the image, namely, water bodies, vegetation, land (developed and undeveloped areas) and paved areas (i.e. roads). The unsupervised classification is applied to identify two classes, water bodies and vegetation, based on the Normalized Difference Vegetation Index (NDVI), a well-known index that uses the distinct wavelengths of visible and near-infrared sunlight absorbed and reflected by plants. Afterward, the supervised classification is performed by selecting homogeneous training samples for roads and land areas. Here, a precise selection of training samples plays a vital role in the classification accuracy. Post-classification processing is finally performed to enhance the classification accuracy, where the classified image is sieved, clumped and filtered before producing the final output. Overall, the supervised classification approach produced higher accuracy than the unsupervised method. This paper shows some current preliminary research results which point out the effectiveness of the proposed technique in a virtual perspective.
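The NDVI step can be written out for a single pixel as NDVI = (NIR - Red) / (NIR + Red). The sketch below then labels pixels with two thresholds; the threshold values (0.0 for water, 0.3 for vegetation), the function names, and the reflectance values are assumptions made here for illustration, since the abstract does not give the cut-offs actually used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total

def label_pixel(nir, red, water_max=0.0, vegetation_min=0.3):
    """Label a pixel from NDVI alone; thresholds are hypothetical."""
    value = ndvi(nir, red)
    if value < water_max:
        return "water"
    if value > vegetation_min:
        return "vegetation"
    return "unassigned"  # left for the supervised step (land, paved areas)

# Hypothetical (NIR, Red) reflectances for three pixels.
pixels = [(0.05, 0.10), (0.50, 0.10), (0.30, 0.25)]
labels = [label_pixel(nir, red) for nir, red in pixels]
print(labels)
```

Pixels that NDVI cannot assign are the ones the abstract's supervised stage would handle with training samples.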
Classification with spatio-temporal interpixel class dependency contexts
NASA Technical Reports Server (NTRS)
Jeon, Byeungwoo; Landgrebe, David A.
1992-01-01
A contextual classifier which can utilize both spatial and temporal interpixel dependency contexts is investigated. After spatial and temporal neighbors are defined, a general form of the maximum a posteriori spatiotemporal contextual classifier is derived. This contextual classifier is simplified under several assumptions. Joint prior probabilities of the classes of each pixel and its spatial neighbors are modeled by the Gibbs random field. The classification is performed in a recursive manner to allow a computationally efficient contextual classification. Experimental results with bitemporal TM data show significant improvement of classification accuracy over noncontextual pixelwise classifiers. This spatiotemporal contextual classifier should find use in many applications of remote sensing, especially when the classification accuracy is important.
Gastric precancerous diseases classification using CNN with a concise model.
Zhang, Xu; Hu, Weiling; Chen, Fei; Liu, Jiquan; Yang, Yuanhang; Wang, Liangjing; Duan, Huilong; Si, Jianmin
2017-01-01
Gastric precancerous diseases (GPD) may deteriorate into early gastric cancer if misdiagnosed, so it is important to help doctors recognize GPD accurately and quickly. In this paper, we realize the classification of 3-class GPD, namely, polyp, erosion, and ulcer using convolutional neural networks (CNN) with a concise model called the Gastric Precancerous Disease Network (GPDNet). GPDNet introduces fire modules from SqueezeNet to reduce the model size and parameters about 10 times while improving speed for quick classification. To maintain classification accuracy with fewer parameters, we propose an innovative method called iterative reinforced learning (IRL). After training GPDNet from scratch, we apply IRL to fine-tune the parameters whose values are close to 0, and then we take the modified model as a pretrained model for the next training. The result shows that IRL can improve the accuracy about 9% after 6 iterations. The final classification accuracy of our GPDNet was 88.90%, which is promising for clinical GPD recognition.
Convolutional neural network with transfer learning for rice type classification
NASA Astrophysics Data System (ADS)
Patel, Vaibhav Amit; Joshi, Manjunath V.
2018-04-01
Presently, rice type is identified manually by humans, which is time consuming and error prone. Therefore, there is a need to do this by machine, which would make it faster and more accurate. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, while using transfer learning in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2-class model for the classification of broken and normal rice using 1600 training images. We observe that despite having distinct rice images, our architecture, pretrained on ImageNet data, boosts classification accuracy significantly.
ERIC Educational Resources Information Center
Laracy, Seth D.; Hojnoski, Robin L.; Dever, Bridget V.
2016-01-01
Receiver operating characteristic curve (ROC) analysis was used to investigate the ability of early numeracy curriculum-based measures (EN-CBM) administered in preschool to predict performance below the 25th and 40th percentiles on a quantity discrimination measure in kindergarten. Areas under the curve derived from a sample of 279 students ranged…
Ronald E. McRoberts
2014-01-01
Multiple remote sensing-based approaches to estimating gross afforestation, gross deforestation, and net deforestation are possible. However, many of these approaches have severe data requirements in the form of long time series of remotely sensed data and/or large numbers of observations of land cover change to train classifiers and assess the accuracy of...
ERIC Educational Resources Information Center
Koziol, Natalie A.
2016-01-01
Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…
Participatory Classification in a System for Assessing Multimodal Transportation Patterns
2015-02-17
Culler. Electrical Engineering and Computer Sciences, University of California at Berkeley. Technical Report No. UCB/EECS-2015-8. http...California at Berkeley, Electrical Engineering and Computer Sciences, Berkeley, CA 94720 ...confirmation screen. This section sketches the characteristics of the data that was collected, computes the accuracy of the automated inference algorithm
Hettige, Nuwan C; Nguyen, Thai Binh; Yuan, Chen; Rajakulendran, Thanara; Baddour, Jermeen; Bhagwat, Nikhil; Bani-Fatemi, Ali; Voineskos, Aristotle N; Mallar Chakravarty, M; De Luca, Vincenzo
2017-07-01
Suicide is a major concern for those afflicted by schizophrenia. Identifying patients at the highest risk for future suicide attempts remains a complex problem for psychiatric interventions. Machine learning models allow for the integration of many risk factors in order to build an algorithm that predicts which patients are likely to attempt suicide. Currently it is unclear how to integrate previously identified risk factors into a clinically relevant predictive tool to estimate the probability that a patient with schizophrenia will attempt suicide. We conducted a cross-sectional assessment on a sample of 345 participants diagnosed with schizophrenia spectrum disorders. Suicide attempters and non-attempters were clearly identified using the Columbia Suicide Severity Rating Scale (C-SSRS) and the Beck Suicide Ideation Scale (BSS). We developed four classification algorithms using regularized regression, random forest, elastic net, and support vector machine models with sociocultural and clinical variables as features to train the models. All classification models performed similarly in identifying suicide attempters and non-attempters. Our regularized logistic regression model demonstrated an accuracy of 67% and an area under the curve (AUC) of 0.71, while the random forest model demonstrated 66% accuracy and an AUC of 0.67. The support vector classifier (SVC) model demonstrated an accuracy of 67% and an AUC of 0.70, and the elastic net model demonstrated an accuracy of 65% and an AUC of 0.71. Machine learning algorithms offer a relatively successful method for incorporating many clinical features to predict individuals at risk for future suicide attempts. Increased performance of these models using clinically relevant variables offers the potential to facilitate early treatment and intervention to prevent future suicide attempts. Copyright © 2017 Elsevier Inc. All rights reserved.
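Accuracy and AUC measure different things, which is why the abstract reports both for each model. As a minimal sketch with invented scores, the AUC can be computed as the share of (positive, negative) pairs that the model ranks correctly, with ties counted as half, while accuracy depends on a chosen cut-off (0.5 here, an assumption).

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count correctly ordered pairs; a tie contributes one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(scores, labels, threshold=0.5):
    """Share of cases classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical predicted risks and true outcomes (1 = attempter).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc_value = auc(scores, labels)
acc_value = accuracy(scores, labels)
print(round(auc_value, 3), round(acc_value, 3))
```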
Dutch population specific sex estimation formulae using the proximal femur.
Colman, K L; Janssen, M C L; Stull, K E; van Rijn, R R; Oostra, R J; de Boer, H H; van der Merwe, A E
2018-05-01
Sex estimation techniques are frequently applied in forensic anthropological analyses of unidentified human skeletal remains. While morphological sex estimation methods are able to endure population differences, the classification accuracy of metric sex estimation methods is population-specific. No metric sex estimation method currently exists for the Dutch population. The purpose of this study is to create Dutch population specific sex estimation formulae by means of osteometric analyses of the proximal femur. Since the Netherlands lacks a representative contemporary skeletal reference population, 2D plane reconstructions, derived from clinical computed tomography (CT) data, were used as an alternative source for a representative reference sample. The first part of this study assesses the intra- and inter-observer error, or reliability, of twelve measurements of the proximal femur. The technical error of measurement (TEM) and relative TEM (%TEM) were calculated using 26 dry adult femora. In addition, the agreement, or accuracy, between the dry bone and CT-based measurements was determined by percent agreement. Only reliable and accurate measurements were retained for the logistic regression sex estimation formulae; a training set (n=86) was used to create the models while an independent testing set (n=28) was used to validate the models. Due to high levels of multicollinearity, only single variable models were created. Cross-validated classification accuracies ranged from 86% to 92%. The high cross-validated classification accuracies indicate that the developed formulae can contribute to the biological profile and specifically in sex estimation of unidentified human skeletal remains in the Netherlands. Furthermore, the results indicate that clinical CT data can be a valuable alternative source of data when representative skeletal collections are unavailable. Copyright © 2017 Elsevier B.V. All rights reserved.
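For two repeated measurement sessions, the technical error of measurement is commonly defined as TEM = sqrt(sum(d^2) / (2n)), where d is the difference between the paired measurements and n is the number of specimens, and the relative TEM is 100 * TEM divided by the mean of all measurements. The sketch below assumes that standard two-session definition, which the abstract does not spell out, and uses invented measurements in millimetres.

```python
from math import sqrt

def technical_error(first, second):
    """Return (TEM, %TEM) for paired repeat measurements of the same specimens."""
    if len(first) != len(second):
        raise ValueError("each specimen needs one measurement per session")
    n = len(first)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    tem = sqrt(squared_diffs / (2 * n))
    grand_mean = (sum(first) + sum(second)) / (2 * n)
    return tem, 100 * tem / grand_mean

# Hypothetical femoral head diameters (mm) measured twice on four femora.
session_1 = [45.0, 47.0, 44.0, 46.0]
session_2 = [45.5, 46.5, 44.0, 47.0]

tem, relative_tem = technical_error(session_1, session_2)
print(round(tem, 3), round(relative_tem, 3))
```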
Time-dependent classification accuracy curve under marker-dependent sampling.
Zhu, Zhaoyin; Wang, Xiaofei; Saha-Chaudhuri, Paramita; Kosinski, Andrzej S; George, Stephen L
2016-07-01
Evaluating the classification accuracy of a candidate biomarker signaling the onset of disease or disease status is essential for medical decision making. A good biomarker would accurately identify the patients who are likely to progress or die at a particular time in the future or who are in urgent need of active treatments. To assess the performance of a candidate biomarker, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) are commonly used. In many cases, the standard simple random sampling (SRS) design used for biomarker validation studies is costly and inefficient. In order to improve the efficiency and reduce the cost of biomarker validation, marker-dependent sampling (MDS) may be used. In an MDS design, the selection of patients to assess true survival time is dependent on the result of a biomarker assay. In this article, we introduce a nonparametric estimator for time-dependent AUC under an MDS design. The consistency and the asymptotic normality of the proposed estimator are established. Simulation shows the unbiasedness of the proposed estimator and a significant efficiency gain of the MDS design over the SRS design. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Macaluso, P J
2011-02-01
Digital photogrammetric methods were used to collect diameter, area, and perimeter data of the acetabulum for a twentieth-century skeletal sample from France (Georges Olivier Collection, Musée de l'Homme, Paris) consisting of 46 males and 36 females. The measurements were then subjected to both discriminant function and logistic regression analyses in order to develop osteometric standards for sex assessment. Univariate discriminant functions and logistic regression equations yielded overall correct classification accuracy rates for both the left and the right acetabula ranging from 84.1% to 89.6%. The multivariate models developed in this study did not provide increased accuracy over those using only a single variable. Classification sex bias ratios ranged between 1.1% and 7.3% for the majority of models. The results of this study, therefore, demonstrate that metric analysis of acetabular size provides a highly accurate, and easily replicable, method of discriminating sex in this documented skeletal collection. The results further suggest that the addition of area and perimeter data derived from digital images may provide a more effective method of sex assessment than that offered by traditional linear measurements alone. Copyright © 2010 Elsevier GmbH. All rights reserved.
Preoperative assessment of microvascular invasion in hepatocellular carcinoma
NASA Astrophysics Data System (ADS)
Chakraborty, Jayasree; Zheng, Jian; Gönen, Mithat; Jarnagin, William R.; DeMatteo, Ronald P.; Do, Richard K. G.; Simpson, Amber L.
2017-03-01
Hepatocellular carcinoma (HCC) is the most common liver cancer and the third leading cause of cancer-related death worldwide [1]. Resection or liver transplantation may be curative in patients with early-stage HCC, but early recurrence is common [2, 3]. Microvascular invasion (MVI) is one of the most important predictors of early recurrence [3]. The identification of MVI prior to surgery would optimally select patients for potentially curative resection or liver transplant. However, MVI can only be diagnosed by microscopic assessment of the resected tumor. The aim of the present study is to apply CT-based texture analysis to identify pre-operative imaging predictors of MVI in patients with HCC. Texture features are derived from CT and analyzed individually as well as in combination, to evaluate their ability to predict MVI. A two-stage classification is employed: HCC tumors are automatically categorized into uniform or heterogeneous groups, followed by classification into the presence or absence of MVI. We achieve an area under the receiver operating characteristic curve (AUC) of 0.76 and accuracy of 76.7% for uniform lesions and AUC of 0.79 and accuracy of 74.06% for heterogeneous tumors. These results suggest that MVI can be accurately and objectively predicted from preoperative CT scans.
NASA Astrophysics Data System (ADS)
Gupta, Rajendra Kumar
The increase in lion and leopard populations in the Gir Wildlife Sanctuary and National Park (Gir Protected Area) demands periodic, precise monitoring of habitat at close intervals using space-based remote sensing data. Besides characterizing the different forest classes, remote sensing needs to support the assessment of thermal stress zones and the identification of possible corridors for lion dispersion to new home ranges. The study focuses on assessing the thematic forest classification accuracies in percentage terms (CA) attainable using single-date post-monsoon (CA=60, kappa = 0.514) as well as leaf-shedding (CA=48.4, kappa = 0.372) season data in the visible and near-IR spectral bands of IRS/LISS-III at 23.5 m spatial resolution, and the improvement of CA obtained by using joint two-date (multi-temporal) data sets (CA=87.2, kappa = 0.843) in the classification. The 188 m spatial resolution IRS/WiFS and 23.5 m spatial resolution LISS-III data were used to study the possible corridors for dispersion of lions from the Gir protected areas (PA). A relative thermal stress index (RTSI) for the Gir PA has been developed using NOAA/AVHRR data sets from the post-monsoon, leaf-shedding and summer seasons. The paper discusses the role of RTSI as a tool to work out forest management plans using leaf-shedding season data to combat thermal stress in the habitat, by identifying locations for artificial water holes during the ensuing summer season.
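The CA and kappa figures quoted above are both derived from a confusion matrix of mapped versus reference classes. A minimal sketch of the standard (unweighted) Cohen's kappa computation follows; the matrix values and the function name are hypothetical and do not come from the Gir study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are mapped classes and columns are reference classes.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix (pixel counts).
confusion = [[40, 10],
             [20, 30]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

Kappa is lower than the raw accuracy because it discounts the agreement that the class proportions alone would produce.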
Borges, Díbio L; Vidal, Flávio B; Flores, Marta R P; Melani, Rodolfo F H; Guimarães, Marco A; Machado, Carlos E P
2018-03-01
Age assessment from images is of high interest in the forensic community because of the need to establish formal protocols for identifying child pornography, missing children and child abuse, cases in which visual evidence is the evidence most often admissible. Recently, photoanthropometric methods have been found useful for age estimation, correlating facial proportions in image databases with samples of some age groups. Notwithstanding the advances, newer facial features and further analysis are needed to improve accuracy and establish larger applicability. In this investigation, frontal images of 1000 individuals (500 females, 500 males), equally distributed in five age groups (6, 10, 14, 18, 22 years old), were used in a 10-fold cross-validated experiment for three age-threshold classifications (<10, <14, <18 years old). A set of 40 novel features, based on a relation between landmark distances and the iris diameter, is proposed, and joint mutual information is used to select the most relevant and complementary features for the classification task. In a civil image identification database with diverse ancestry, receiver operating characteristic (ROC) curves were plotted to verify accuracy, and the resultant AUCs achieved 0.971, 0.969, and 0.903 for the age classifications (<10, <14, <18 years old), respectively. These results add support to continuing research in age assessment from images using the metric approach. Still, larger samples are necessary to evaluate reliability in extensive conditions. Copyright © 2017 Elsevier B.V. All rights reserved.
Physical activity classification using the GENEA wrist-worn accelerometer.
Zhang, Shaoyan; Rowlands, Alex V; Murray, Peter; Hurst, Tina L
2012-04-01
Most accelerometer-based activity monitors are worn on the waist or lower back for assessment of habitual physical activity. Output is in arbitrary counts that can be classified by activity intensity according to published thresholds. The purpose of this study was to develop methods to classify physical activities into walking, running, household, or sedentary activities based on raw acceleration data from the GENEA (Gravity Estimator of Normal Everyday Activity) and compare classification accuracy from a wrist-worn GENEA with a waist-worn GENEA. Sixty participants (age = 49.4 ± 6.5 yr, body mass index = 24.6 ± 3.4 kg·m⁻²) completed an ordered series of 10-12 semistructured activities in the laboratory and outdoor environment. Throughout, three GENEA accelerometers were worn: one at the waist, one on the left wrist, and one on the right wrist. Acceleration data were collected at 80 Hz. Features obtained from both fast Fourier transform and wavelet decomposition were extracted, and machine learning algorithms were used to classify four types of daily activities including sedentary, household, walking, and running activities. The computational results demonstrated that the algorithm we developed can accurately classify certain types of daily activities, with high overall classification accuracy for both waist-worn GENEA (0.99) and wrist-worn GENEA (right wrist = 0.97, left wrist = 0.96). We have successfully developed algorithms suitable for use with wrist-worn accelerometers for detecting certain types of physical activities; the performance is comparable to waist-worn accelerometers for assessment of physical activity.
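The abstract does not list the individual frequency-domain features that were used. As one illustrative feature, assuming a single-axis window sampled at the reported 80 Hz, the dominant frequency can be found with a plain discrete Fourier transform. The function name and the synthetic 2 Hz signal are hypothetical, and this sketch is not the authors' feature set.

```python
import cmath
import math

SAMPLE_RATE_HZ = 80  # sampling rate reported in the abstract

def dominant_frequency(window, sample_rate=SAMPLE_RATE_HZ):
    """Frequency (Hz) of the strongest non-DC component in a signal window."""
    n = len(window)
    mean = sum(window) / n
    centered = [x - mean for x in window]  # remove the gravity/DC offset
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n

# Synthetic two-second window with a 2 Hz oscillation (hypothetical cadence).
window = [math.sin(2 * math.pi * 2.0 * t / SAMPLE_RATE_HZ)
          for t in range(2 * SAMPLE_RATE_HZ)]
print(dominant_frequency(window))
```

A feature such as this would be computed per window and passed, with other features, to a classifier; a production pipeline would normally use an FFT routine rather than this direct O(n²) transform.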
Stefano, A; Gallivanone, F; Messa, C; Gilardi, M C; Gastiglioni, I
2014-12-01
The aim of this work is to evaluate the metabolic impact of Partial Volume Correction (PVC) on the measurement of the Standard Uptake Value (SUV) from [18F]FDG PET-CT oncological studies for treatment monitoring purposes. Twenty-nine breast cancer patients with bone lesions (42 lesions in total) underwent [18F]FDG PET-CT studies after surgical resection of breast cancer primitives, and before (PET-I) and after (PET-II) chemotherapy and hormone treatment. PVC of bone lesion uptake was performed on the two [18F]FDG PET-CT studies, using a method based on Recovery Coefficients (RC) and on an automatic measurement of lesion metabolic volume. Body-weight average SUV was calculated for each lesion, with and without PVC. The accuracy, reproducibility, clinical feasibility and metabolic impact on treatment response of the considered PVC method were evaluated. The PVC method was found clinically feasible in bone lesions, with an accuracy of 93% for lesion sphere-equivalent diameter >1 cm. Applying PVC, average SUV values increased, from 7% up to 154% considering both PET-I and PET-II studies, proving the need for the correction. As the main finding, PVC modified the therapy response classification in 6 cases according to the EORTC 1999 classification and in 5 cases according to the PERCIST 1.0 classification. PVC has an important metabolic impact on the assessment of tumor response to treatment by [18F]FDG PET-CT oncological studies.
Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.
Subasi, Abdulhamit
2013-06-01
Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model has been proposed that hybridized the particle swarm optimization (PSO) and SVM to improve the EMG signal classification accuracy. This optimization mechanism involves kernel parameter setting in the SVM training procedure, which significantly influences the classification accuracy. The experiments were conducted on the basis of EMG signal to classify into normal, neurogenic or myopathic. In the proposed method the EMG signals were decomposed into the frequency sub-bands using discrete wavelet transform (DWT) and a set of statistical features were extracted from these sub-bands to represent the distribution of wavelet coefficients. The obtained results obviously validate the superiority of the SVM method compared to conventional machine learning methods, and suggest that further significant enhancements in terms of classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.
Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.
2016-01-01
Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. 
In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832
NASA Astrophysics Data System (ADS)
Hafizt, M.; Manessa, M. D. M.; Adi, N. S.; Prayudha, B.
2017-12-01
Benthic habitat mapping using satellite data is a challenging task for practitioners and academics, as benthic objects are covered by a light-attenuating water column that obscures object discrimination. One common method to reduce this water-column effect is to use a depth-invariant index (DII) image. However, applying the correction in shallow coastal areas is challenging, as a dark object such as seagrass can have a very low pixel value, preventing its reliable identification and classification. This limitation can be addressed by applying the classification process separately to areas with different water depth levels. The water depth level can be extracted from satellite imagery using the Relative Water Depth Index (RWDI). This study proposed a new approach to improve the mapping accuracy, particularly for dark benthic objects, by combining the DII of Lyzenga's water column correction method and the RWDI of Stumpf's method. This research was conducted on Lintea Island, which has a high variation of benthic cover, using Sentinel-2A imagery. To assess the effectiveness of the proposed approach for benthic habitat mapping, two different classification procedures were implemented. The first procedure is the commonly applied method in benthic habitat mapping, where the DII image is used as input data for image classification over the whole coastal area regardless of depth variation. The second procedure is the proposed approach, which begins with the separation of the study area into shallow and deep waters using the RWDI image. The shallow area was then classified using the sunglint-corrected image as input data, and the deep area was classified using the DII image as input data. The final classification maps of those two areas were merged into a single benthic habitat map. A confusion matrix was then applied to evaluate the mapping accuracy of the final map. 
The results show that the proposed mapping approach can be used to map all benthic objects in all depth ranges and achieves better accuracy than the classification map produced using DII alone.
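The two indices combined in this study can be sketched as follows. This assumes the widely cited forms of Lyzenga's depth-invariant index, ln(L_i) − (k_i/k_j)·ln(L_j), and of the Stumpf log-ratio, ln(n·R_blue)/ln(n·R_green); the abstract does not state that RWDI is computed exactly this way. The function names, the constant n and the depth threshold are hypothetical.

```python
import math

def attenuation_ratio(log_band_i, log_band_j):
    """Estimate k_i/k_j from log radiances over one bottom type at varying depth."""
    n = len(log_band_i)
    mean_i = sum(log_band_i) / n
    mean_j = sum(log_band_j) / n
    var_i = sum((x - mean_i) ** 2 for x in log_band_i) / n
    var_j = sum((y - mean_j) ** 2 for y in log_band_j) / n
    cov = sum((x - mean_i) * (y - mean_j)
              for x, y in zip(log_band_i, log_band_j)) / n
    a = (var_i - var_j) / (2 * cov)
    return a + math.sqrt(a * a + 1)

def depth_invariant_index(radiance_i, radiance_j, ratio):
    """Depth-invariant index for one pixel (deep-water-corrected radiances)."""
    return math.log(radiance_i) - ratio * math.log(radiance_j)

def relative_depth_index(reflectance_blue, reflectance_green, n=1000.0):
    """Log-ratio of two bands; larger values are taken to indicate deeper water."""
    return math.log(n * reflectance_blue) / math.log(n * reflectance_green)

def choose_input(relative_depth, threshold):
    """Pick the classifier input per pixel, mirroring the second procedure."""
    return "sunglint_corrected" if relative_depth < threshold else "dii"

# Hypothetical log radiances of sand pixels at three depths.
log_j = [1.0, 2.0, 3.0]
log_i = [2.5, 4.5, 6.5]
ratio = attenuation_ratio(log_i, log_j)
print(ratio, choose_input(relative_depth_index(0.05, 0.03), threshold=1.1))
```

With the ratio estimated this way, the index stays constant for a given bottom type as depth changes, which is what allows classification regardless of the water column.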
Efficient alignment-free DNA barcode analytics
Kuksa, Pavel; Pavlovic, Vladimir
2009-01-01
Background In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens possibility for accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. Results New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.) On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Conclusion Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding. PMID:19900305
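The fixed-length spectrum representation described above can be illustrated with k-mer counts. This is a minimal sketch only: the cosine similarity, the nearest-reference rule, the value k = 3 and the toy sequences are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter

def spectrum(sequence, k=3):
    """Count every overlapping length-k fragment (k-mer) in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def cosine_similarity(a, b):
    """Cosine similarity between two k-mer count vectors."""
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(query, references, k=3):
    """Assign the species whose reference barcode is most similar to the query."""
    query_spectrum = spectrum(query, k)
    return max(references,
               key=lambda name: cosine_similarity(query_spectrum,
                                                  spectrum(references[name], k)))

# Hypothetical toy barcodes; real barcodes are several hundred bases long.
references = {"species_A": "ACGTACGTAC", "species_B": "TTGGCCTTGG"}
print(classify("ACGTACGAAC", references))
```

No alignment is computed at any point: two sequences are compared only through the fragments they share, which is what makes the approach fast.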
Gradus, Jaimie L; King, Matthew W; Galatzer-Levy, Isaac; Street, Amy E
2017-08-01
Suicide rates among recent veterans have led to interest in risk identification. Evidence of gender- and trauma-specific predictors of suicidal ideation necessitates the use of advanced computational methods capable of elucidating these important and complex associations. In this study, we used machine learning to examine gender-specific associations between predeployment and military factors, traumatic deployment experiences, and psychopathology and suicidal ideation (SI) in a national sample of veterans deployed during the Iraq and Afghanistan conflicts (n = 2,244). Classification, regression tree analyses, and random forests were used to identify associations with SI and determine their classification accuracy. Findings converged on several associations for men that included depression, posttraumatic stress disorder (PTSD), and somatic complaints. Sexual harassment during deployment emerged as a key factor that interacted with PTSD and depression and demonstrated a stronger association with SI among women. Classification accuracy for SI presence or absence was good based on the receiver operating characteristic area under the curve, men = .91, women = .92. The risk for SI was classifiable with good accuracy, with associations that varied by gender. The use of machine learning analyses allowed for the discovery of rich, nuanced results that should be replicated in other samples and may eventually be a basis for the development of gender-specific actuarial tools to assess SI risk among veterans. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio
NASA Astrophysics Data System (ADS)
Nababan, A. A.; Sitompul, O. S.; Tulus
2018-04-01
K-Nearest Neighbor (KNN) is a good classifier, but several studies have found that its accuracy is still lower than that of other methods. One cause of the low accuracy is that every attribute has the same influence on the classification process, so less relevant attributes lead to misclassification of new data. In this research, we propose Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio, in which the gain ratio is used as a parameter to assess the correlation between each attribute in the data and also as the basis for weighting each attribute of the dataset. The resulting accuracy is compared with that of the original KNN method using 10-fold cross-validation on several datasets from the UCI Machine Learning Repository and the KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the test results, the proposed method increased the classification accuracy of KNN: the largest accuracy difference, 12.73%, was obtained on the hayes-roth dataset, and the smallest, 0.07%, on the abalone dataset. Averaged over all datasets, accuracy increased by 5.33%.
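A minimal sketch of the idea, assuming categorical attributes and a weighted mismatch distance (the abstract does not specify the distance function or how continuous attributes are handled): each attribute's gain ratio becomes its weight in the neighbor search. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of a discrete attribute divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    return (entropy(labels) - conditional) / split_info if split_info > 0 else 0.0

def weighted_knn(train_rows, train_labels, query, weights, k=3):
    """Majority vote among the k nearest rows under a weighted mismatch distance."""
    def distance(row):
        return sum(w for w, a, b in zip(weights, row, query) if a != b)
    nearest = sorted(range(len(train_rows)),
                     key=lambda i: distance(train_rows[i]))[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical data: the first attribute determines the class, the second is noise.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")]
labels = ["yes", "yes", "no", "no"]
weights = [gain_ratio([r[j] for r in rows], labels) for j in range(2)]
print(weights, weighted_knn(rows, labels, ("sunny", "hot"), weights))
```

The uninformative attribute receives a weight of zero, so it no longer pushes relevant neighbors away from the query.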
Wang, Xueyi; Davidson, Nicholas J.
2011-01-01
Ensemble methods have been widely used to improve prediction accuracy over individual classifiers. In this paper, we derive several results about the prediction accuracies of ensemble methods for binary classification that have been missed or misinterpreted in the previous literature. First we show the upper and lower bounds of the prediction accuracies (i.e. the best and worst possible prediction accuracies) of ensemble methods. Next we show that an ensemble method can achieve > 0.5 prediction accuracy, while individual classifiers have < 0.5 prediction accuracies. Furthermore, for individual classifiers with different prediction accuracies, the average of the individual accuracies determines the upper and lower bounds. We perform two experiments to verify the results and show that it is hard to achieve the upper- and lower-bound accuracies with random individual classifiers and that better algorithms need to be developed. PMID:21853162
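The claim that a majority-vote ensemble can exceed 0.5 accuracy while every member stays below 0.5 can be checked on a small constructed example. The labels and predictions below are hypothetical and chosen so that the members' correct votes are concentrated on the same three samples; they are not the paper's experiments.

```python
def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

def majority_vote(*classifier_outputs):
    """Per-sample majority label (0 or 1) over an odd number of binary classifiers."""
    return [1 if sum(votes) * 2 > len(votes) else 0
            for votes in zip(*classifier_outputs)]

truth = [1, 1, 1, 0, 0]
clf_a = [1, 1, 0, 1, 1]  # correct on samples 0 and 1 only
clf_b = [0, 1, 1, 1, 1]  # correct on samples 1 and 2 only
clf_c = [1, 0, 1, 1, 1]  # correct on samples 0 and 2 only

ensemble = majority_vote(clf_a, clf_b, clf_c)
print([accuracy(c, truth) for c in (clf_a, clf_b, clf_c)])
print(accuracy(ensemble, truth))
```

Each member is right on 2 of 5 samples (0.4), yet two of the three members agree on the correct label for samples 0, 1 and 2, so the vote is right on 3 of 5 (0.6). With three members of accuracy 0.4 this is also the best achievable, since each correctly voted sample consumes at least two of the six correct votes.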
Articular cartilage degeneration classification by means of high-frequency ultrasound.
Männicke, N; Schöne, M; Oelze, M; Raum, K
2014-10-01
To date only single ultrasound parameters were regarded in statistical analyses to characterize osteoarthritic changes in articular cartilage and the potential benefit of using parameter combinations for characterization remains unclear. Therefore, the aim of this work was to utilize feature selection and classification of a Mankin subset score (i.e., cartilage surface and cell sub-scores) using ultrasound-based parameter pairs and investigate both classification accuracy and the sensitivity towards different degeneration stages. 40 punch biopsies of human cartilage were previously scanned ex vivo with a 40-MHz transducer. Ultrasound-based surface parameters, as well as backscatter and envelope statistics parameters were available. Logistic regression was performed with each unique US parameter pair as predictor and different degeneration stages as response variables. The best ultrasound-based parameter pair for each Mankin subset score value was assessed by highest classification accuracy and utilized in receiver operating characteristics (ROC) analysis. The classifications discriminating between early degenerations yielded area under the ROC curve (AUC) values of 0.94-0.99 (mean ± SD: 0.97 ± 0.03). In contrast, classifications among higher Mankin subset scores resulted in lower AUC values: 0.75-0.91 (mean ± SD: 0.84 ± 0.08). Variable sensitivities of the different ultrasound features were observed with respect to different degeneration stages. Our results strongly suggest that combinations of high-frequency ultrasound-based parameters exhibit potential to characterize different, particularly very early, degeneration stages of hyaline cartilage. Variable sensitivities towards different degeneration stages suggest that a concurrent estimation of multiple ultrasound-based parameters is diagnostically valuable. In-vivo application of the present findings is conceivable in both minimally invasive arthroscopic ultrasound and high-frequency transcutaneous ultrasound. 
Copyright © 2014 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Selim, Serdar; Sonmez, Namik Kemal; Onur, Isin; Coslu, Mesut
2017-10-01
Connecting similar landscape patches with ecological corridors supports the habitat quality of these patches, increases urban ecological quality, and provides an important living and expansion area for wildlife. Furthermore, the habitat connectivity provided by urban green areas supports biodiversity in urban areas. In this study, possible ecological connections between landscape patches were derived using the Expert classification technique and modeled with the probabilistic connection index. First, the reflectance responses of plants in various bands are used as data in the hypotheses. An important feature of this method is that more than one image can be used at the same time when forming a hypothesis. For this reason, the base images are prepared before the Expert classification is applied. In addition to the main image, hypothesis conditions were also created for each class with the NDVI image, which is commonly used in vegetation research. The results of a previously conducted supervised classification were also taken into account. We applied this classification method using the raster imagery with user-defined variables. Then, to provide ecological connections for the tree cover obtained from the classification, we used the Probabilistic Connection (PC) index. The probabilistic connection model, which is used in landscape planning and conservation studies to detect and prioritize critical areas for ecological connection, characterizes the possibility of direct connection between habitats. As a result, we obtained over 90% total accuracy in the accuracy assessment analysis. We provided ecological connections with the PC index and created a system of inter-connected green spaces. Thus, we proposed and implemented a green infrastructure system model, a topic that has been on the agenda in recent years.
Common component classification: what can we learn from machine learning?
Anderson, Ariana; Labus, Jennifer S; Vianna, Eduardo P; Mayer, Emeran A; Cohen, Mark S
2011-05-15
Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classification machines' power is due to artifactual noise versus genuine neurological signal. To examine the true strength and power of machine learning classifiers we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal and show that removal of such artifacts can reduce predictive accuracy even when data has been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error seen in publication to be a biased estimate of the testing error seen in practice and measure this bias by purposefully making flawed models. We discuss other ways to introduce bias and the statistical assumptions lying behind the data and model themselves. Finally we discuss the complications in drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on the Type 1 and Type 2 error rates, and how publication bias can give a false confidence of the power of such methods. Collectively this work identifies challenges specific to fMRI classification and methods affecting the stability of models. Copyright © 2010 Elsevier Inc. All rights reserved.
Can the Ni classification of vessels predict neoplasia? A systematic review and meta-analysis.
Mehlum, Camilla S; Rosenberg, Tine; Dyrvig, Anne-Kirstine; Groentved, Aagot Moeller; Kjaergaard, Thomas; Godballe, Christian
2018-01-01
The Ni classification of vascular change from 2011 is well documented for evaluating pharyngeal and laryngeal lesions, primarily focusing on cancer. In the planning of surgery it may be more relevant to differentiate neoplasia from non-neoplasia. We aimed to evaluate the ability of the Ni classification to predict laryngeal or hypopharyngeal neoplasia and to investigate if a changed cutoff value would support the recent European Laryngological Society (ELS) proposal of perpendicular vascular changes as indicative of neoplasia. PubMed, Embase, Cochrane, and Scopus databases. A systematic review and meta-analysis was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis statement. We systematically searched for publications from 2011 until 2016. All retrieved studies were reviewed and qualitatively assessed. The pooled sensitivity and specificity of the Ni classification with two different cutoffs were calculated, and bubble and summary receiver operating characteristics plots were created. The combined sensitivity of five studies (n = 687) with Ni type IV-V defined as test-positive was 0.89 (95% confidence interval [CI]: 0.76-0.95), and specificity was 0.82 (95% CI: 0.72-0.89). The equivalent combined sensitivity of four studies (n = 624) with Ni type V defined as test-positive was 0.82 (95% CI: 0.75-0.87), and specificity was 0.93 (95% CI: 0.82-0.97). The diagnostic accuracy of the Ni classification in predicting neoplasia was high, without significant difference between the two analyzed cutoff values. Implementation of the proposed ELS classification of vascular changes seems reasonable from a clinical perspective, with comparable accuracy. Attention must be drawn to the accompanying risk of exposing patients to unnecessary surgery. Laryngoscope, 128:168-176, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
Landenburger, L.; Lawrence, R.L.; Podruzny, S.; Schwartz, C.C.
2008-01-01
Moderate resolution satellite imagery traditionally has been thought to be inadequate for mapping vegetation at the species level. This has made comprehensive mapping of regional distributions of sensitive species, such as whitebark pine, either impractical or extremely time consuming. We sought to determine whether using a combination of moderate resolution satellite imagery (Landsat Enhanced Thematic Mapper Plus), extensive stand data collected by land management agencies for other purposes, and modern statistical classification techniques (boosted classification trees) could result in successful mapping of whitebark pine. Overall classification accuracies exceeded 90%, with similar individual class accuracies. Accuracies on a localized basis varied based on elevation. Accuracies also varied among administrative units, although we were not able to determine whether these differences related to inherent spatial variations or differences in the quality of available reference data.
Zare, Marzieh; Rezvani, Zahra; Benasich, April A
2016-07-01
This study assesses the ability of a novel, "automatic classification" approach to facilitate identification of infants at highest familial risk for language-learning disorders (LLD) and to provide converging assessments to enable earlier detection of developmental disorders that disrupt language acquisition. Network connectivity measures derived from 62-channel electroencephalogram (EEG) recording were used to identify selected features within two infant groups who differed on LLD risk: infants with a family history of LLD (FH+) and typically-developing infants without such a history (FH-). A support vector machine was deployed; global efficiency and global and local clustering coefficients were computed. A novel minimum spanning tree (MST) approach was also applied. Cross-validation was employed to assess the resultant classification. Infants were classified with about 80% accuracy into FH+ and FH- groups with 89% specificity and precision of 92%. Clustering patterns differed by risk group and MST network analysis suggests that FH+ infants' EEG complexity patterns were significantly different from FH- infants. The automatic classification techniques used here were shown to be both robust and reliable and should provide valuable information when applied to early identification of risk or clinical groups. The ability to identify infants at highest risk for LLD using "automatic classification" strategies is a novel convergent approach that may facilitate earlier diagnosis and remediation. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
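Accuracy, specificity and precision, as reported above, are all ratios taken from the same 2×2 table of predicted versus true group membership. A minimal sketch with standard definitions follows; the counts are hypothetical and do not reproduce the study's figures.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard classification metrics from the four cells of a 2x2 table."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }

# Hypothetical counts, with the at-risk group treated as the positive class.
metrics = binary_metrics(tp=30, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Because the metrics use different denominators, a classifier can show high specificity and precision while its overall accuracy is lower, which is why all of them are worth reporting together.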
Data Processing and Quality Evaluation of a Boat-Based Mobile Laser Scanning System
Vaaja, Matti; Kukko, Antero; Kaartinen, Harri; Kurkela, Matti; Kasvi, Elina; Flener, Claude; Hyyppä, Hannu; Hyyppä, Juha; Järvelä, Juha; Alho, Petteri
2013-01-01
Mobile mapping systems (MMSs) are used for mapping topographic and urban features which are difficult and time consuming to measure with other instruments. The benefits of MMSs include efficient data collection and versatile usability. This paper investigates the data processing steps and quality of a boat-based mobile mapping system (BoMMS) data for generating terrain and vegetation points in a river environment. Our aim in data processing was to filter noise points, detect shorelines as well as points below water surface and conduct ground point classification. Previous studies of BoMMS have investigated elevation accuracies and usability in detection of fluvial erosion and deposition areas. The new findings concerning BoMMS data are that the improved data processing approach allows for identification of multipath reflections and shoreline delineation. We demonstrate the possibility to measure bathymetry data in shallow (0–1 m) and clear water. Furthermore, we evaluate for the first time the accuracy of the BoMMS ground points classification compared to manually classified data. We also demonstrate the spatial variations of the ground point density and assess elevation and vertical accuracies of the BoMMS data. PMID:24048340
Data processing and quality evaluation of a boat-based mobile laser scanning system.
Vaaja, Matti; Kukko, Antero; Kaartinen, Harri; Kurkela, Matti; Kasvi, Elina; Flener, Claude; Hyyppä, Hannu; Hyyppä, Juha; Järvelä, Juha; Alho, Petteri
2013-09-17
Mobile mapping systems (MMSs) are used for mapping topographic and urban features which are difficult and time consuming to measure with other instruments. The benefits of MMSs include efficient data collection and versatile usability. This paper investigates the data processing steps and quality of a boat-based mobile mapping system (BoMMS) data for generating terrain and vegetation points in a river environment. Our aim in data processing was to filter noise points, detect shorelines as well as points below water surface and conduct ground point classification. Previous studies of BoMMS have investigated elevation accuracies and usability in detection of fluvial erosion and deposition areas. The new findings concerning BoMMS data are that the improved data processing approach allows for identification of multipath reflections and shoreline delineation. We demonstrate the possibility to measure bathymetry data in shallow (0-1 m) and clear water. Furthermore, we evaluate for the first time the accuracy of the BoMMS ground points classification compared to manually classified data. We also demonstrate the spatial variations of the ground point density and assess elevation and vertical accuracies of the BoMMS data.
Heredia-Juesas, Juan; Thatcher, Jeffrey E; Lu, Yang; Squiers, John J; King, Darlene; Fan, Wensheng; DiMaio, J Michael; Martinez-Lorenzo, Jose A
2018-04-01
The process of burn debridement is a challenging technique requiring significant skills to identify the regions that need excision and their appropriate excision depths. In order to assist surgeons, a machine learning tool is being developed to provide a quantitative assessment of burn-injured tissue. This paper presents three non-invasive optical imaging techniques capable of distinguishing four kinds of tissue (healthy skin, viable wound bed, shallow burn, and deep burn) during serial burn debridement in a porcine model. All combinations of these three techniques have been studied through a k-fold cross-validation method. In terms of global performance, the combination of all three techniques significantly improves the classification accuracy with respect to just one technique, from 0.42 up to more than 0.76. Furthermore, a non-linear spatial filtering based on the mode of a small neighborhood has been applied as a post-processing technique, in order to improve the performance of the classification. Using this technique, the global accuracy reaches a value close to 0.78 and, for some particular tissues and combinations of techniques, the accuracy improves by 13%.
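The mode-based post-processing step mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the window size or the tie-breaking rule, so the square window (`radius`) and the choice to keep the original label on ties are assumptions made here for illustration only.

```python
from collections import Counter


def mode_filter(labels, radius=1):
    """Replace each label by the most frequent label in its square neighborhood.

    labels: 2-D list of class labels (one per pixel).
    radius: half-width of the window; radius=1 gives a 3x3 window (assumed).
    """
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            # Collect the labels in the window, clipped at the image border.
            window = [
                labels[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            counts = Counter(window)
            best = max(counts.values())
            # Only relabel when another class strictly outnumbers the current one.
            if counts[labels[r][c]] < best:
                out[r][c] = next(lab for lab, n in counts.most_common() if n == best)
    return out


# A single isolated "deep burn" pixel (3) inside a "healthy skin" region (0).
noisy = [
    [0, 0, 0],
    [0, 3, 0],
    [0, 0, 0],
]
smoothed = mode_filter(noisy)
```

Because the mode is taken over class labels rather than intensities, the filter removes isolated misclassified pixels without inventing intermediate classes.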
NASA Technical Reports Server (NTRS)
Wrigley, R. C.; Acevedo, W.; Alexander, D.; Buis, J.; Card, D.
1984-01-01
An experiment of a factorial design was conducted to test the effects on classification accuracy of land cover types due to the improved spatial, spectral and radiometric characteristics of the Thematic Mapper (TM) in comparison to the Multispectral Scanner (MSS). High altitude aircraft scanner data from the Airborne Thematic Mapper instrument was acquired over central California in August, 1983 and used to simulate Thematic Mapper data as well as all combinations of the three characteristics for eight data sets in all. Results for the training sites (field center pixels) showed better classification accuracies for MSS spatial resolution, TM spectral bands and TM radiometry in order of importance.
NASA Astrophysics Data System (ADS)
Cavigelli, Lukas; Bernath, Dominic; Magno, Michele; Benini, Luca
2016-10-01
Detecting and classifying targets in video streams from surveillance cameras is a cumbersome, error-prone and expensive task. Often, the incurred costs are prohibitive for real-time monitoring. This leads to data being stored locally or transmitted to a central storage site for post-incident examination. The required communication links and archiving of the video data are still expensive and this setup excludes preemptive actions to respond to imminent threats. An effective way to overcome these limitations is to build a smart camera that analyzes the data on-site, close to the sensor, and transmits alerts when relevant video sequences are detected. Deep neural networks (DNNs) have come to outperform humans in visual classification tasks and are also performing exceptionally well on other computer vision tasks. The concept of DNNs and Convolutional Networks (ConvNets) can easily be extended to make use of higher-dimensional input data such as multispectral data. We explore this opportunity in terms of achievable accuracy and required computational effort. To analyze the precision of DNNs for scene labeling in an urban surveillance scenario we have created a dataset with 8 classes obtained in a field experiment. We combine an RGB camera with a 25-channel VIS-NIR snapshot sensor to assess the potential of multispectral image data for target classification. We evaluate several new DNNs, showing that the spectral information fused together with the RGB frames can be used to improve the accuracy of the system or to achieve similar accuracy with a 3x smaller computation effort. We achieve a very high per-pixel accuracy of 99.1%. Even for scarcely occurring, but particularly interesting classes, such as cars, 75% of the pixels are labeled correctly with errors occurring only around the border of the objects. This high accuracy was obtained with a training set of only 30 labeled images, paving the way for fast adaptation to various application scenarios.
NASA Astrophysics Data System (ADS)
Bratic, G.; Brovelli, M. A.; Molinari, M. E.
2018-04-01
The availability of thematic maps has significantly increased over the last few years. Validation of these maps is a key factor in assessing their suitability for different applications. The evaluation of the accuracy of classified data is carried out through a comparison with a reference dataset and the generation of a confusion matrix from which many quality indexes can be derived. In this work, an ad hoc free and open source Python tool was implemented to automatically compute all the confusion matrix-derived accuracy indexes proposed in the literature. The tool was integrated into the GRASS GIS environment and successfully applied to evaluate the quality of three high-resolution global datasets (GlobeLand30, Global Urban Footprint, Global Human Settlement Layer Built-Up Grid) in the Lombardy Region area (Italy). In addition to the most commonly used accuracy measures, e.g. overall accuracy and Kappa, the tool made it possible to compute and investigate less known indexes such as the Ground Truth and the Classification Success Index. The promising tool will be further extended with spatial autocorrelation analysis functions and made available to the research and user community.
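The two most common indexes named in this abstract, overall accuracy and Kappa, can be derived from a confusion matrix as in the sketch below. This is not the tool described in the abstract; the function name, the row/column convention and the example counts are assumptions made for illustration, and the sketch assumes every class occurs at least once in both the map and the reference data.

```python
def accuracy_indexes(matrix):
    """Overall accuracy, Cohen's kappa and per-class accuracies.

    matrix[i][j] = number of samples mapped as class i whose reference
    class is j (rows = classified map, columns = reference; assumed here).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(matrix[i]) for i in range(k)]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    overall = sum(matrix[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    kappa = (overall - expected) / (1 - expected)

    return {
        "overall_accuracy": overall,
        "kappa": kappa,
        # User's accuracy: correct share of each mapped class (row-wise).
        "users_accuracy": [matrix[i][i] / row_tot[i] for i in range(k)],
        # Producer's accuracy: correct share of each reference class (column-wise).
        "producers_accuracy": [matrix[i][i] / col_tot[i] for i in range(k)],
    }


# Hypothetical two-class confusion matrix (e.g. built-up vs. not built-up).
result = accuracy_indexes([[40, 10], [5, 45]])
```

For this hypothetical matrix the overall accuracy is 0.85 while Kappa is 0.70, which shows how Kappa discounts the agreement that the marginal totals alone would produce.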
NASA Astrophysics Data System (ADS)
Kong, Xianyu; Che, Xiaowei; Su, Rongguo; Zhang, Chuansong; Yao, Qingzhen; Shi, Xiaoyong
2017-05-01
There is an urgent need to develop efficient evaluation tools that use easily measured variables to make rapid and timely eutrophication assessments, which are important for marine health management, and to implement eutrophication monitoring programs. In this study, an approach for rapidly assessing the eutrophication status of coastal waters with three easily measured parameters (turbidity, chlorophyll a and dissolved oxygen) was developed by the grid search (GS) optimized support vector machine (SVM), with trophic index TRIX classification results as the reference. With the optimized penalty parameter C = 64 and the kernel parameter γ = 1, the classification accuracy rates reached 89.3% for the training data, 88.3% for the cross-validation, and 88.5% for the validation dataset. Because the developed approach only used three easy-to-measure variables, its application could facilitate the rapid assessment of the eutrophication status of coastal waters, resulting in potential cost savings in marine monitoring programs and assisting in the provision of timely advice for marine management.
Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.
Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong
2018-05-24
This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is performed using a single anisotropic magnetoresistive sensor, with the vehicle models involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, the original vehicle magnetic dataset collected is imbalanced, which may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
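The abstract does not describe how SMOTE is configured, so the following is only a minimal sketch of the basic idea behind the Synthetic Minority Over-sampling Technique: new minority-class samples are interpolated between an existing minority sample and one of its nearest minority neighbors. The function name, the parameter names (`n_new`, `k`, `seed`) and the toy feature vectors are hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples by interpolating between minority samples.

    minority: list of feature tuples from the under-represented class
              (at least two samples are assumed).
    k:        number of nearest minority neighbors to choose from.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base sample (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Toy two-feature minority class (values are illustrative only).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_samples = smote_sketch(minority, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region already occupied by that class rather than duplicating existing points.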
Examining the Classification Accuracy of a Vocabulary Screening Measure with Preschool Children
ERIC Educational Resources Information Center
Marcotte, Amanda M.; Clemens, Nathan H.; Parker, Christopher; Whitcomb, Sara A.
2016-01-01
This study investigated the classification accuracy of the "Dynamic Indicators of Vocabulary Skills" (DIVS) as a preschool vocabulary screening measure. With a sample of 240 preschoolers, fall and winter DIVS scores were used to predict year-end vocabulary risk using the 25th percentile on the "Peabody Picture Vocabulary Test--Third…
ERIC Educational Resources Information Center
Daniels, Brian; Volpe, Robert J.; Fabiano, Gregory A.; Briesch, Amy M.
2017-01-01
This study examines the classification accuracy and teacher acceptability of a problem-focused screener for academic and disruptive behavior problems, which is directly linked to evidence-based intervention. Participants included 39 classroom teachers from 2 public school districts in the Northeastern United States. Teacher ratings were obtained…
ERIC Educational Resources Information Center
Furey, William M.; Marcotte, Amanda M.; Hintze, John M.; Shackett, Caroline M.
2016-01-01
The study presents a critical analysis of written expression curriculum-based measurement (WE-CBM) metrics derived from 3- and 10-min test lengths. Criterion validity and classification accuracy were examined for Total Words Written (TWW), Correct Writing Sequences (CWS), Percent Correct Writing Sequences (%CWS), and Correct Minus Incorrect…