Sample records for classification error sources

  1. Sources of error in estimating truck traffic from automatic vehicle classification data

    DOT National Transportation Integrated Search

    1998-10-01

    Truck annual average daily traffic estimation errors resulting from sample classification counts are computed in this paper under two scenarios. One scenario investigates an improper factoring procedure that may be used by highway agencies. The study...

  2. J-Plus: Morphological Classification Of Compact And Extended Sources By Pdf Analysis

    NASA Astrophysics Data System (ADS)

    López-Sanjuan, C.; Vázquez-Ramió, H.; Varela, J.; Spinoso, D.; Cristóbal-Hornillos, D.; Viironen, K.; Muniesa, D.; J-PLUS Collaboration

    2017-10-01

    We present a morphological classification of J-PLUS EDR sources into compact (i.e. stars) and extended (i.e. galaxies). Such classification is based on the Bayesian modelling of the concentration distribution, including observational errors and magnitude + sky position priors. We provide the star / galaxy probability of each source computed from the gri images. The comparison with the SDSS number counts support our classification up to r 21. The 31.7 deg² analised comprises 150k stars and 101k galaxies.

  3. Assessing the statistical significance of the achieved classification error of classifiers constructed using serum peptide profiles, and a prescription for random sampling repeated studies for massive high-throughput genomic and proteomic studies.

    PubMed

    Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos

    2005-01-01

    Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.

  4. Exception handling for sensor fusion

    NASA Astrophysics Data System (ADS)

    Chavez, G. T.; Murphy, Robin R.

    1993-08-01

    This paper presents a control scheme for handling sensing failures (sensor malfunctions, significant degradations in performance due to changes in the environment, and errant expectations) in sensor fusion for autonomous mobile robots. The advantages of the exception handling mechanism are that it emphasizes a fast response to sensing failures, is able to use only a partial causal model of sensing failure, and leads to a graceful degradation of sensing if the sensing failure cannot be compensated for. The exception handling mechanism consists of two modules: error classification and error recovery. The error classification module in the exception handler attempts to classify the type and source(s) of the error using a modified generate-and-test procedure. If the source of the error is isolated, the error recovery module examines its cache of recovery schemes, which either repair or replace the current sensing configuration. If the failure is due to an error in expectation or cannot be identified, the planner is alerted. Experiments using actual sensor data collected by the CSM Mobile Robotics/Machine Perception Laboratory's Denning mobile robot demonstrate the operation of the exception handling mechanism.

  5. Feasibility of Equivalent Dipole Models for Electroencephalogram-Based Brain Computer Interfaces.

    PubMed

    Schimpf, Paul H

    2017-09-15

    This article examines the localization errors of equivalent dipolar sources inverted from the surface electroencephalogram in order to determine the feasibility of using their location as classification parameters for non-invasive brain computer interfaces. Inverse localization errors are examined for two head models: a model represented by four concentric spheres and a realistic model based on medical imagery. It is shown that the spherical model results in localization ambiguity such that a number of dipolar sources, with different azimuths and varying orientations, provide a near match to the electroencephalogram of the best equivalent source. No such ambiguity exists for the elevation of inverted sources, indicating that for spherical head models, only the elevation of inverted sources (and not the azimuth) can be expected to provide meaningful classification parameters for brain-computer interfaces. In a realistic head model, all three parameters of the inverted source location are found to be reliable, providing a more robust set of parameters. In both cases, the residual error hypersurfaces demonstrate local minima, indicating that a search for the best-matching sources should be global. Source localization error vs. signal-to-noise ratio is also demonstrated for both head models.

  6. Multiple-rule bias in the comparison of classification rules

    PubMed Central

    Yousefi, Mohammadmahdi R.; Hua, Jianping; Dougherty, Edward R.

    2011-01-01

    Motivation: There is growing discussion in the bioinformatics community concerning overoptimism of reported results. Two approaches contributing to overoptimism in classification are (i) the reporting of results on datasets for which a proposed classification rule performs well and (ii) the comparison of multiple classification rules on a single dataset that purports to show the advantage of a certain rule. Results: This article provides a careful probabilistic analysis of the second issue and the ‘multiple-rule bias’, resulting from choosing a classification rule having minimum estimated error on the dataset. It quantifies this bias corresponding to estimating the expected true error of the classification rule possessing minimum estimated error and it characterizes the bias from estimating the true comparative advantage of the chosen classification rule relative to the others by the estimated comparative advantage on the dataset. The analysis is applied to both synthetic and real data using a number of classification rules and error estimators. Availability: We have implemented in C code the synthetic data distribution model, classification rules, feature selection routines and error estimation methods. The code for multiple-rule analysis is implemented in MATLAB. The source code is available at http://gsp.tamu.edu/Publications/supplementary/yousefi11a/. Supplementary simulation results are also included. Contact: edward@ece.tamu.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21546390

  7. The Sources of Error in Spanish Writing.

    ERIC Educational Resources Information Center

    Justicia, Fernando; Defior, Sylvia; Pelegrina, Santiago; Martos, Francisco J.

    1999-01-01

    Determines the pattern of errors in Spanish spelling. Analyzes and proposes a classification system for the errors made by children in the initial stages of the acquisition of spelling skills. Finds the diverse forms of only 20 Spanish words produces 36% of the spelling errors in Spanish; and substitution is the most frequent type of error. (RS)

  8. Wildlife management by habitat units: A preliminary plan of action

    NASA Technical Reports Server (NTRS)

    Frentress, C. D.; Frye, R. G.

    1975-01-01

    Procedures for yielding vegetation type maps were developed using LANDSAT data and a computer assisted classification analysis (LARSYS) to assist in managing populations of wildlife species by defined area units. Ground cover in Travis County, Texas was classified on two occasions using a modified version of the unsupervised approach to classification. The first classification produced a total of 17 classes. Examination revealed that further grouping was justified. A second analysis produced 10 classes which were displayed on printouts which were later color-coded. The final classification was 82 percent accurate. While the classification map appeared to satisfactorily depict the existing vegetation, two classes were determined to contain significant error. The major sources of error could have been eliminated by stratifying cluster sites more closely among previously mapped soil associations that are identified with particular plant associations and by precisely defining class nomenclature using established criteria early in the analysis.

  9. LACIE performance predictor FOC users manual

    NASA Technical Reports Server (NTRS)

    1976-01-01

    The LACIE Performance Predictor (LPP) is a computer simulation of the LACIE process for predicting worldwide wheat production. The simulation provides for the introduction of various errors into the system and provides estimates based on these errors, thus allowing the user to determine the impact of selected error sources. The FOC LPP simulates the acquisition of the sample segment data by the LANDSAT Satellite (DAPTS), the classification of the agricultural area within the sample segment (CAMS), the estimation of the wheat yield (YES), and the production estimation and aggregation (CAS). These elements include data acquisition characteristics, environmental conditions, classification algorithms, the LACIE aggregation and data adjustment procedures. The operational structure for simulating these elements consists of the following key programs: (1) LACIE Utility Maintenance Process, (2) System Error Executive, (3) Ephemeris Generator, (4) Access Generator, (5) Acquisition Selector, (6) LACIE Error Model (LEM), and (7) Post Processor.

  10. A data-driven modeling approach to stochastic computation for low-energy biomedical devices.

    PubMed

    Lee, Kyong Ho; Jang, Kuk Jin; Shoeb, Ali; Verma, Naveen

    2011-01-01

    Low-power devices that can detect clinically relevant correlations in physiologically-complex patient signals can enable systems capable of closed-loop response (e.g., controlled actuation of therapeutic stimulators, continuous recording of disease states, etc.). In ultra-low-power platforms, however, hardware error sources are becoming increasingly limiting. In this paper, we present how data-driven methods, which allow us to accurately model physiological signals, also allow us to effectively model and overcome prominent hardware error sources with nearly no additional overhead. Two applications, EEG-based seizure detection and ECG-based arrhythmia-beat classification, are synthesized to a logic-gate implementation, and two prominent error sources are introduced: (1) SRAM bit-cell errors and (2) logic-gate switching errors ('stuck-at' faults). Using patient data from the CHB-MIT and MIT-BIH databases, performance similar to error-free hardware is achieved even for very high fault rates (up to 0.5 for SRAMs and 7 × 10(-2) for logic) that cause computational bit error rates as high as 50%.

  11. Real-time Neuroimaging and Cognitive Monitoring Using Wearable Dry EEG

    PubMed Central

    Mullen, Tim R.; Kothe, Christian A.E.; Chi, Mike; Ojeda, Alejandro; Kerth, Trevor; Makeig, Scott; Jung, Tzyy-Ping; Cauwenberghs, Gert

    2015-01-01

    Goal We present and evaluate a wearable high-density dry electrode EEG system and an open-source software framework for online neuroimaging and state classification. Methods The system integrates a 64-channel dry EEG form-factor with wireless data streaming for online analysis. A real-time software framework is applied, including adaptive artifact rejection, cortical source localization, multivariate effective connectivity inference, data visualization, and cognitive state classification from connectivity features using a constrained logistic regression approach (ProxConn). We evaluate the system identification methods on simulated 64-channel EEG data. Then we evaluate system performance, using ProxConn and a benchmark ERP method, in classifying response errors in 9 subjects using the dry EEG system. Results Simulations yielded high accuracy (AUC=0.97±0.021) for real-time cortical connectivity estimation. Response error classification using cortical effective connectivity (sdDTF) was significantly above chance with similar performance (AUC) for cLORETA (0.74±0.09) and LCMV (0.72±0.08) source localization. Cortical ERP-based classification was equivalent to ProxConn for cLORETA (0.74±0.16) but significantly better for LCMV (0.82±0.12). Conclusion We demonstrated the feasibility for real-time cortical connectivity analysis and cognitive state classification from high-density wearable dry EEG. Significance This paper is the first validated application of these methods to 64-channel dry EEG. The work addresses a need for robust real-time measurement and interpretation of complex brain activity in the dynamic environment of the wearable setting. Such advances can have broad impact in research, medicine, and brain-computer interfaces. The pipelines are made freely available in the open-source SIFT and BCILAB toolboxes. PMID:26415149

  12. Construction of a Calibrated Probabilistic Classification Catalog: Application to 50k Variable Sources in the All-Sky Automated Survey

    NASA Astrophysics Data System (ADS)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Brink, Henrik; Crellin-Quick, Arien

    2012-12-01

    With growing data volumes from synoptic surveys, astronomers necessarily must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of classification purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In addition to producing accurate classifications, we show how to estimate calibrated class probabilities and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All-Sky Automated Survey (ASAS), and release the Machine-learned ASAS Classification Catalog (MACC), a 28 class probabilistic classification catalog of 50,124 ASAS sources in the ASAS Catalog of Variable Stars. We estimate that MACC achieves a sub-20% classification error rate and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes.

  13. CONSTRUCTION OF A CALIBRATED PROBABILISTIC CLASSIFICATION CATALOG: APPLICATION TO 50k VARIABLE SOURCES IN THE ALL-SKY AUTOMATED SURVEY

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.

    2012-12-15

    With growing data volumes from synoptic surveys, astronomers necessarily must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of classification purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In additionmore » to producing accurate classifications, we show how to estimate calibrated class probabilities and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All-Sky Automated Survey (ASAS), and release the Machine-learned ASAS Classification Catalog (MACC), a 28 class probabilistic classification catalog of 50,124 ASAS sources in the ASAS Catalog of Variable Stars. We estimate that MACC achieves a sub-20% classification error rate and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes.« less

  14. Content-based multiple bitstream image transmission over noisy channels.

    PubMed

    Cao, Lei; Chen, Chang Wen

    2002-01-01

    In this paper, we propose a novel combined source and channel coding scheme for image transmission over noisy channels. The main feature of the proposed scheme is a systematic decomposition of image sources so that unequal error protection can be applied according to not only bit error sensitivity but also visual content importance. The wavelet transform is adopted to hierarchically decompose the image. The association between the wavelet coefficients and what they represent spatially in the original image is fully exploited so that wavelet blocks are classified based on their corresponding image content. The classification produces wavelet blocks in each class with similar content and statistics, therefore enables high performance source compression using the set partitioning in hierarchical trees (SPIHT) algorithm. To combat the channel noise, an unequal error protection strategy with rate-compatible punctured convolutional/cyclic redundancy check (RCPC/CRC) codes is implemented based on the bit contribution to both peak signal-to-noise ratio (PSNR) and visual quality. At the receiving end, a postprocessing method making use of the SPIHT decoding structure and the classification map is developed to restore the degradation due to the residual error after channel decoding. Experimental results show that the proposed scheme is indeed able to provide protection both for the bits that are more sensitive to errors and for the more important visual content under a noisy transmission environment. In particular, the reconstructed images illustrate consistently better visual quality than using the single-bitstream-based schemes.

  15. Measuring diagnoses: ICD code accuracy.

    PubMed

    O'Malley, Kimberly J; Cook, Karon F; Price, Matt D; Wildes, Kimberly Raiford; Hurdle, John F; Ashton, Carol M

    2005-10-01

    To examine potential sources of errors at each step of the described inpatient International Classification of Diseases (ICD) coding process. The use of disease codes from the ICD has expanded from classifying morbidity and mortality information for statistical purposes to diverse sets of applications in research, health care policy, and health care finance. By describing a brief history of ICD coding, detailing the process for assigning codes, identifying where errors can be introduced into the process, and reviewing methods for examining code accuracy, we help code users more systematically evaluate code accuracy for their particular applications. We summarize the inpatient ICD diagnostic coding process from patient admission to diagnostic code assignment. We examine potential sources of errors at each step and offer code users a tool for systematically evaluating code accuracy. Main error sources along the "patient trajectory" include amount and quality of information at admission, communication among patients and providers, the clinician's knowledge and experience with the illness, and the clinician's attention to detail. Main error sources along the "paper trail" include variance in the electronic and written records, coder training and experience, facility quality-control efforts, and unintentional and intentional coder errors, such as misspecification, unbundling, and upcoding. By clearly specifying the code assignment process and heightening their awareness of potential error sources, code users can better evaluate the applicability and limitations of codes for their particular situations. ICD codes can then be used in the most appropriate ways.

  16. Measuring Diagnoses: ICD Code Accuracy

    PubMed Central

    O'Malley, Kimberly J; Cook, Karon F; Price, Matt D; Wildes, Kimberly Raiford; Hurdle, John F; Ashton, Carol M

    2005-01-01

    Objective To examine potential sources of errors at each step of the described inpatient International Classification of Diseases (ICD) coding process. Data Sources/Study Setting The use of disease codes from the ICD has expanded from classifying morbidity and mortality information for statistical purposes to diverse sets of applications in research, health care policy, and health care finance. By describing a brief history of ICD coding, detailing the process for assigning codes, identifying where errors can be introduced into the process, and reviewing methods for examining code accuracy, we help code users more systematically evaluate code accuracy for their particular applications. Study Design/Methods We summarize the inpatient ICD diagnostic coding process from patient admission to diagnostic code assignment. We examine potential sources of errors at each step and offer code users a tool for systematically evaluating code accuracy. Principle Findings Main error sources along the “patient trajectory” include amount and quality of information at admission, communication among patients and providers, the clinician's knowledge and experience with the illness, and the clinician's attention to detail. Main error sources along the “paper trail” include variance in the electronic and written records, coder training and experience, facility quality-control efforts, and unintentional and intentional coder errors, such as misspecification, unbundling, and upcoding. Conclusions By clearly specifying the code assignment process and heightening their awareness of potential error sources, code users can better evaluate the applicability and limitations of codes for their particular situations. ICD codes can then be used in the most appropriate ways. PMID:16178999

  17. The pot calling the kettle black: the extent and type of errors in a computerized immunization registry and by parent report.

    PubMed

    MacDonald, Shannon E; Schopflocher, Donald P; Golonka, Richard P

    2014-01-04

    Accurate classification of children's immunization status is essential for clinical care, administration and evaluation of immunization programs, and vaccine program research. Computerized immunization registries have been proposed as a valuable alternative to provider paper records or parent report, but there is a need to better understand the challenges associated with their use. This study assessed the accuracy of immunization status classification in an immunization registry as compared to parent report and determined the number and type of errors occurring in both sources. This study was a sub-analysis of a larger study which compared the characteristics of children whose immunizations were up to date (UTD) at two years as compared to those not UTD. Children's immunization status was initially determined from a population-based immunization registry, and then compared to parent report of immunization status, as reported in a postal survey. Discrepancies between the two sources were adjudicated by review of immunization providers' hard-copy clinic records. Descriptive analyses included calculating proportions and confidence intervals for errors in classification and reporting of the type and frequency of errors. Among the 461 survey respondents, there were 60 discrepancies in immunization status. The majority of errors were due to parent report (n = 44), but the registry was not without fault (n = 16). Parents tended to erroneously report their child as UTD, whereas the registry was more likely to wrongly classify children as not UTD. Reasons for registry errors included failure to account for varicella disease history, variable number of doses required due to age at series initiation, and doses administered out of the region. These results confirm that parent report is often flawed, but also identify that registries are prone to misclassification of immunization status. Immunization program administrators and researchers need to institute measures to identify and reduce misclassification, in order for registries to play an effective role in the control of vaccine-preventable disease.

  18. The pot calling the kettle black: the extent and type of errors in a computerized immunization registry and by parent report

    PubMed Central

    2014-01-01

    Background Accurate classification of children’s immunization status is essential for clinical care, administration and evaluation of immunization programs, and vaccine program research. Computerized immunization registries have been proposed as a valuable alternative to provider paper records or parent report, but there is a need to better understand the challenges associated with their use. This study assessed the accuracy of immunization status classification in an immunization registry as compared to parent report and determined the number and type of errors occurring in both sources. Methods This study was a sub-analysis of a larger study which compared the characteristics of children whose immunizations were up to date (UTD) at two years as compared to those not UTD. Children’s immunization status was initially determined from a population-based immunization registry, and then compared to parent report of immunization status, as reported in a postal survey. Discrepancies between the two sources were adjudicated by review of immunization providers’ hard-copy clinic records. Descriptive analyses included calculating proportions and confidence intervals for errors in classification and reporting of the type and frequency of errors. Results Among the 461 survey respondents, there were 60 discrepancies in immunization status. The majority of errors were due to parent report (n = 44), but the registry was not without fault (n = 16). Parents tended to erroneously report their child as UTD, whereas the registry was more likely to wrongly classify children as not UTD. Reasons for registry errors included failure to account for varicella disease history, variable number of doses required due to age at series initiation, and doses administered out of the region. Conclusions These results confirm that parent report is often flawed, but also identify that registries are prone to misclassification of immunization status. Immunization program administrators and researchers need to institute measures to identify and reduce misclassification, in order for registries to play an effective role in the control of vaccine-preventable disease. PMID:24387002

  19. Development of a methodology for classifying software errors

    NASA Technical Reports Server (NTRS)

    Gerhart, S. L.

    1976-01-01

    A mathematical formalization of the intuition behind classification of software errors is devised and then extended to a classification discipline: Every classification scheme should have an easily discernible mathematical structure and certain properties of the scheme should be decidable (although whether or not these properties hold is relative to the intended use of the scheme). Classification of errors then becomes an iterative process of generalization from actual errors to terms defining the errors together with adjustment of definitions according to the classification discipline. Alternatively, whenever possible, small scale models may be built to give more substance to the definitions. The classification discipline and the difficulties of definition are illustrated by examples of classification schemes from the literature and a new study of observed errors in published papers of programming methodologies.

  20. Classification and reduction of pilot error

    NASA Technical Reports Server (NTRS)

    Rogers, W. H.; Logan, A. L.; Boley, G. D.

    1989-01-01

    Human error is a primary or contributing factor in about two-thirds of commercial aviation accidents worldwide. With the ultimate goal of reducing pilot error accidents, this contract effort is aimed at understanding the factors underlying error events and reducing the probability of certain types of errors by modifying underlying factors such as flight deck design and procedures. A review of the literature relevant to error classification was conducted. Classification includes categorizing types of errors, the information processing mechanisms and factors underlying them, and identifying factor-mechanism-error relationships. The classification scheme developed by Jens Rasmussen was adopted because it provided a comprehensive yet basic error classification shell or structure that could easily accommodate addition of details on domain-specific factors. For these purposes, factors specific to the aviation environment were incorporated. Hypotheses concerning the relationship of a small number of underlying factors, information processing mechanisms, and error types types identified in the classification scheme were formulated. ASRS data were reviewed and a simulation experiment was performed to evaluate and quantify the hypotheses.

  1. Automatic classification of diseases from free-text death certificates for real-time surveillance.

    PubMed

    Koopman, Bevan; Karimi, Sarvnaz; Nguyen, Anthony; McGuire, Rhydwyn; Muscatello, David; Kemp, Madonna; Truran, Donna; Zhang, Ming; Thackway, Sarah

    2015-07-15

    Death certificates provide an invaluable source for mortality statistics which can be used for surveillance and early warnings of increases in disease activity and to support the development and monitoring of prevention or response strategies. However, their value can be realised only if accurate, quantitative data can be extracted from death certificates, an aim hampered by both the volume and variable nature of certificates written in natural language. This study aims to develop a set of machine learning and rule-based methods to automatically classify death certificates according to four high impact diseases of interest: diabetes, influenza, pneumonia and HIV. Two classification methods are presented: i) a machine learning approach, where detailed features (terms, term n-grams and SNOMED CT concepts) are extracted from death certificates and used to train a set of supervised machine learning models (Support Vector Machines); and ii) a set of keyword-matching rules. These methods were used to identify the presence of diabetes, influenza, pneumonia and HIV in a death certificate. An empirical evaluation was conducted using 340,142 death certificates, divided between training and test sets, covering deaths from 2000-2007 in New South Wales, Australia. Precision and recall (positive predictive value and sensitivity) were used as evaluation measures, with F-measure providing a single, overall measure of effectiveness. A detailed error analysis was performed on classification errors. Classification of diabetes, influenza, pneumonia and HIV was highly accurate (F-measure 0.96). More fine-grained ICD-10 classification effectiveness was more variable but still high (F-measure 0.80). The error analysis revealed that word variations as well as certain word combinations adversely affected classification. In addition, anomalies in the ground truth likely led to an underestimation of the effectiveness. The high accuracy and low cost of the classification methods allow for an effective means for automatic and real-time surveillance of diabetes, influenza, pneumonia and HIV deaths. In addition, the methods are generally applicable to other diseases of interest and to other sources of medical free-text besides death certificates.

  2. Investigating the error sources of the online state of charge estimation methods for lithium-ion batteries in electric vehicles

    NASA Astrophysics Data System (ADS)

    Zheng, Yuejiu; Ouyang, Minggao; Han, Xuebing; Lu, Languang; Li, Jianqiu

    2018-02-01

    Sate of charge (SOC) estimation is generally acknowledged as one of the most important functions in battery management system for lithium-ion batteries in new energy vehicles. Though every effort is made for various online SOC estimation methods to reliably increase the estimation accuracy as much as possible within the limited on-chip resources, little literature discusses the error sources for those SOC estimation methods. This paper firstly reviews the commonly studied SOC estimation methods from a conventional classification. A novel perspective focusing on the error analysis of the SOC estimation methods is proposed. SOC estimation methods are analyzed from the views of the measured values, models, algorithms and state parameters. Subsequently, the error flow charts are proposed to analyze the error sources from the signal measurement to the models and algorithms for the widely used online SOC estimation methods in new energy vehicles. Finally, with the consideration of the working conditions, choosing more reliable and applicable SOC estimation methods is discussed, and the future development of the promising online SOC estimation methods is suggested.

  3. Unbiased Taxonomic Annotation of Metagenomic Samples

    PubMed Central

    Fosso, Bruno; Pesole, Graziano; Rosselló, Francesc

    2018-01-01

    Abstract The classification of reads from a metagenomic sample using a reference taxonomy is usually based on first mapping the reads to the reference sequences and then classifying each read at a node under the lowest common ancestor of the candidate sequences in the reference taxonomy with the least classification error. However, this taxonomic annotation can be biased by an imbalanced taxonomy and also by the presence of multiple nodes in the taxonomy with the least classification error for a given read. In this article, we show that the Rand index is a better indicator of classification error than the often used area under the receiver operating characteristic (ROC) curve and F-measure for both balanced and imbalanced reference taxonomies, and we also address the second source of bias by reducing the taxonomic annotation problem for a whole metagenomic sample to a set cover problem, for which a logarithmic approximation can be obtained in linear time and an exact solution can be obtained by integer linear programming. Experimental results with a proof-of-concept implementation of the set cover approach to taxonomic annotation in a next release of the TANGO software show that the set cover approach further reduces ambiguity in the taxonomic annotation obtained with TANGO without distorting the relative abundance profile of the metagenomic sample. PMID:29028181

  4. Global land cover mapping: a review and uncertainty analysis

    USGS Publications Warehouse

    Congalton, Russell G.; Gu, Jianyu; Yadav, Kamini; Thenkabail, Prasad S.; Ozdogan, Mutlu

    2014-01-01

    Given the advances in remotely sensed imagery and associated technologies, several global land cover maps have been produced in recent times including IGBP DISCover, UMD Land Cover, Global Land Cover 2000 and GlobCover 2009. However, the utility of these maps for specific applications has often been hampered due to considerable amounts of uncertainties and inconsistencies. A thorough review of these global land cover projects including evaluating the sources of error and uncertainty is prudent and enlightening. Therefore, this paper describes our work in which we compared, summarized and conducted an uncertainty analysis of the four global land cover mapping projects using an error budget approach. The results showed that the classification scheme and the validation methodology had the highest error contribution and implementation priority. A comparison of the classification schemes showed that there are many inconsistencies between the definitions of the map classes. This is especially true for the mixed type classes for which thresholds vary for the attributes/discriminators used in the classification process. Examination of these four global mapping projects provided quite a few important lessons for the future global mapping projects including the need for clear and uniform definitions of the classification scheme and an efficient, practical, and valid design of the accuracy assessment.

  5. Accuracy assessment, using stratified plurality sampling, of portions of a LANDSAT classification of the Arctic National Wildlife Refuge Coastal Plain

    NASA Technical Reports Server (NTRS)

    Card, Don H.; Strong, Laurence L.

    1989-01-01

    An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.

  6. [Study of inversion and classification of particle size distribution under dependent model algorithm].

    PubMed

    Sun, Xiao-Gang; Tang, Hong; Yuan, Gui-Bin

    2008-05-01

    For the total light scattering particle sizing technique, an inversion and classification method was proposed with the dependent model algorithm. The measured particle system was inversed simultaneously by different particle distribution functions whose mathematic model was known in advance, and then classified according to the inversion errors. The simulation experiments illustrated that it is feasible to use the inversion errors to determine the particle size distribution. The particle size distribution function was obtained accurately at only three wavelengths in the visible light range with the genetic algorithm, and the inversion results were steady and reliable, which decreased the number of multi wavelengths to the greatest extent and increased the selectivity of light source. The single peak distribution inversion error was less than 5% and the bimodal distribution inversion error was less than 10% when 5% stochastic noise was put in the transmission extinction measurement values at two wavelengths. The running time of this method was less than 2 s. The method has advantages of simplicity, rapidity, and suitability for on-line particle size measurement.

  7. A Robust Sound Source Localization Approach for Microphone Array with Model Errors

    NASA Astrophysics Data System (ADS)

    Xiao, Hua; Shao, Huai-Zong; Peng, Qi-Cong

    In this paper, a robust sound source localization approach is proposed. The approach retains good performance even when model errors exist. Compared with previous work in this field, the contributions of this paper are as follows. First, an improved broad-band and near-field array model is proposed. It takes array gain, phase perturbations into account and is based on the actual positions of the elements. It can be used in arbitrary planar geometry arrays. Second, a subspace model errors estimation algorithm and a Weighted 2-Dimension Multiple Signal Classification (W2D-MUSIC) algorithm are proposed. The subspace model errors estimation algorithm estimates unknown parameters of the array model, i. e., gain, phase perturbations, and positions of the elements, with high accuracy. The performance of this algorithm is improved with the increasing of SNR or number of snapshots. The W2D-MUSIC algorithm based on the improved array model is implemented to locate sound sources. These two algorithms compose the robust sound source approach. The more accurate steering vectors can be provided for further processing such as adaptive beamforming algorithm. Numerical examples confirm effectiveness of this proposed approach.

  8. An investigation of the usability of sound recognition for source separation of packaging wastes in reverse vending machines.

    PubMed

    Korucu, M Kemal; Kaplan, Özgür; Büyük, Osman; Güllü, M Kemal

    2016-10-01

    In this study, we investigate the usability of sound recognition for source separation of packaging wastes in reverse vending machines (RVMs). For this purpose, an experimental setup equipped with a sound recording mechanism was prepared. Packaging waste sounds generated by three physical impacts such as free falling, pneumatic hitting and hydraulic crushing were separately recorded using two different microphones. To classify the waste types and sizes based on sound features of the wastes, a support vector machine (SVM) and a hidden Markov model (HMM) based sound classification systems were developed. In the basic experimental setup in which only free falling impact type was considered, SVM and HMM systems provided 100% classification accuracy for both microphones. In the expanded experimental setup which includes all three impact types, material type classification accuracies were 96.5% for dynamic microphone and 97.7% for condenser microphone. When both the material type and the size of the wastes were classified, the accuracy was 88.6% for the microphones. The modeling studies indicated that hydraulic crushing impact type recordings were very noisy for an effective sound recognition application. In the detailed analysis of the recognition errors, it was observed that most of the errors occurred in the hitting impact type. According to the experimental results, it can be said that the proposed novel approach for the separation of packaging wastes could provide a high classification performance for RVMs. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Fully Convolutional Networks for Ground Classification from LIDAR Point Clouds

    NASA Astrophysics Data System (ADS)

    Rizaldy, A.; Persello, C.; Gevaert, C. M.; Oude Elberink, S. J.

    2018-05-01

    Deep Learning has been massively used for image classification in recent years. The use of deep learning for ground classification from LIDAR point clouds has also been recently studied. However, point clouds need to be converted into an image in order to use Convolutional Neural Networks (CNNs). In state-of-the-art techniques, this conversion is slow because each point is converted into a separate image. This approach leads to highly redundant computation during conversion and classification. The goal of this study is to design a more efficient data conversion and ground classification. This goal is achieved by first converting the whole point cloud into a single image. The classification is then performed by a Fully Convolutional Network (FCN), a modified version of CNN designed for pixel-wise image classification. The proposed method is significantly faster than state-of-the-art techniques. On the ISPRS Filter Test dataset, it is 78 times faster for conversion and 16 times faster for classification. Our experimental analysis on the same dataset shows that the proposed method results in 5.22 % of total error, 4.10 % of type I error, and 15.07 % of type II error. Compared to the previous CNN-based technique and LAStools software, the proposed method reduces the total error and type I error (while type II error is slightly higher). The method was also tested on a very high point density LIDAR point clouds resulting in 4.02 % of total error, 2.15 % of type I error and 6.14 % of type II error.

  10. Problems with the Use of Terminology in Genetics Education: 1, A Literature Review and Classification Scheme.

    ERIC Educational Resources Information Center

    Pearson, J. T.; Hughes, W. J.

    1988-01-01

    Examines the technical vocabulary of genetics as a source of error and confusion. Suggests that it is necessary to identify different types of problems associated with terminology and to organize them into logical classes to deal effectively with the difficulties. Highlights terms misused in textbooks. (RT)

  11. Bayes-LQAS: classifying the prevalence of global acute malnutrition

    PubMed Central

    2010-01-01

    Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classifications schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications. PMID:20534159

  12. Bayes-LQAS: classifying the prevalence of global acute malnutrition.

    PubMed

    Olives, Casey; Pagano, Marcello

    2010-06-09

    Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classifications schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications.

  13. Water displacement leg volumetry in clinical studies - A discussion of error sources

    PubMed Central

    2010-01-01

    Background Water displacement leg volumetry is a highly reproducible method, allowing the confirmation of efficacy of vasoactive substances. Nevertheless errors of its execution and the selection of unsuitable patients are likely to negatively affect the outcome of clinical studies in chronic venous insufficiency (CVI). Discussion Placebo controlled double-blind drug studies in CVI were searched (Cochrane Review 2005, MedLine Search until December 2007) and assessed with regard to efficacy (volume reduction of the leg), patient characteristics, and potential methodological error sources. Almost every second study reported only small drug effects (≤ 30 mL volume reduction). As the most relevant error source the conduct of volumetry was identified. Because the practical use of available equipment varies, volume differences of more than 300 mL - which is a multifold of a potential treatment effect - have been reported between consecutive measurements. Other potential error sources were insufficient patient guidance or difficulties with the transition from the Widmer CVI classification to the CEAP (Clinical Etiological Anatomical Pathophysiological) grading. Summary Patients should be properly diagnosed with CVI and selected for stable oedema and further clinical symptoms relevant for the specific study. Centres require a thorough training on the use of the volumeter and on patient guidance. Volumetry should be performed under constant conditions. The reproducibility of short term repeat measurements has to be ensured. PMID:20070899

  14. Evaluation of the confusion matrix method in the validation of an automated system for measuring feeding behaviour of cattle.

    PubMed

    Ruuska, Salla; Hämäläinen, Wilhelmiina; Kajava, Sari; Mughal, Mikaela; Matilainen, Pekka; Mononen, Jaakko

    2018-03-01

    The aim of the present study was to evaluate empirically confusion matrices in device validation. We compared the confusion matrix method to linear regression and error indices in the validation of a device measuring feeding behaviour of dairy cattle. In addition, we studied how to extract additional information on classification errors with confusion probabilities. The data consisted of 12 h behaviour measurements from five dairy cows; feeding and other behaviour were detected simultaneously with a device and from video recordings. The resulting 216 000 pairs of classifications were used to construct confusion matrices and calculate performance measures. In addition, hourly durations of each behaviour were calculated and the accuracy of measurements was evaluated with linear regression and error indices. All three validation methods agreed when the behaviour was detected very accurately or inaccurately. Otherwise, in the intermediate cases, the confusion matrix method and error indices produced relatively concordant results, but the linear regression method often disagreed with them. Our study supports the use of confusion matrix analysis in validation since it is robust to any data distribution and type of relationship, it makes a stringent evaluation of validity, and it offers extra information on the type and sources of errors. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Organic Chemical Attribution Signatures for the Sourcing of a Mustard Agent and Its Starting Materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fraga, Carlos G.; Bronk, Krys; Dockendorff, Brian P.

    Chemical attribution signatures (CAS) are being investigated for the sourcing of chemical warfare (CW) agents and their starting materials that may be implicated in chemical attacks or CW proliferation. The work reported here demonstrates for the first time trace impurities produced during the synthesis of tris(2-chloroethyl)amine (HN3) that point to specific reagent stocks used in the synthesis of this CW agent. Thirty batches of HN3 were synthesized using different combinations of commercial stocks of triethanolamine (TEA), thionyl chloride, chloroform, and acetone. The HN3 batches and reagent stocks were then analyzed for impurities by gas chromatography/mass spectrometry. Reaction-produced impurities indicative ofmore » specific TEA and chloroform stocks were exclusively discovered in HN3 batches made with those reagent stocks. In addition, some reagent impurities were found in the HN3 batches that were presumably not altered during synthesis and believed to be indicative of reagent type regardless of stock. Supervised classification using partial least squares discriminant analysis (PLSDA) on the impurity profiles of chloroform samples from seven stocks resulted in an average classification error by cross-validation of 2.4%. A classification error of zero was obtained using the seven-stock PLSDA model on a validation set of samples from an arbitrarily selected chloroform stock. In a separate analysis, all samples from two of seven chloroform stocks that were purposely not modeled had their samples matched to a chloroform stock rather than assigned a “no class” classification.« less

  16. Systematic bias in genomic classification due to contaminating non-neoplastic tissue in breast tumor samples.

    PubMed

    Elloumi, Fathi; Hu, Zhiyuan; Li, Yan; Parker, Joel S; Gulley, Margaret L; Amos, Keith D; Troester, Melissa A

    2011-06-30

    Genomic tests are available to predict breast cancer recurrence and to guide clinical decision making. These predictors provide recurrence risk scores along with a measure of uncertainty, usually a confidence interval. The confidence interval conveys random error and not systematic bias. Standard tumor sampling methods make this problematic, as it is common to have a substantial proportion (typically 30-50%) of a tumor sample comprised of histologically benign tissue. This "normal" tissue could represent a source of non-random error or systematic bias in genomic classification. To assess the performance characteristics of genomic classification to systematic error from normal contamination, we collected 55 tumor samples and paired tumor-adjacent normal tissue. Using genomic signatures from the tumor and paired normal, we evaluated how increasing normal contamination altered recurrence risk scores for various genomic predictors. Simulations of normal tissue contamination caused misclassification of tumors in all predictors evaluated, but different breast cancer predictors showed different types of vulnerability to normal tissue bias. While two predictors had unpredictable direction of bias (either higher or lower risk of relapse resulted from normal contamination), one signature showed predictable direction of normal tissue effects. Due to this predictable direction of effect, this signature (the PAM50) was adjusted for normal tissue contamination and these corrections improved sensitivity and negative predictive value. For all three assays quality control standards and/or appropriate bias adjustment strategies can be used to improve assay reliability. Normal tissue sampled concurrently with tumor is an important source of bias in breast genomic predictors. All genomic predictors show some sensitivity to normal tissue contamination and ideal strategies for mitigating this bias vary depending upon the particular genes and computational methods used in the predictor.

  17. Satellite inventory of Minnesota forest resources

    NASA Technical Reports Server (NTRS)

    Bauer, Marvin E.; Burk, Thomas E.; Ek, Alan R.; Coppin, Pol R.; Lime, Stephen D.; Walsh, Terese A.; Walters, David K.; Befort, William; Heinzen, David F.

    1993-01-01

    The methods and results of using Landsat Thematic Mapper (TM) data to classify and estimate the acreage of forest covertypes in northeastern Minnesota are described. Portions of six TM scenes covering five counties with a total area of 14,679 square miles were classified into six forest and five nonforest classes. The approach involved the integration of cluster sampling, image processing, and estimation. Using cluster sampling, 343 plots, each 88 acres in size, were photo interpreted and field mapped as a source of reference data for classifier training and calibration of the TM data classifications. Classification accuracies of up to 75 percent were achieved; most misclassification was between similar or related classes. An inverse method of calibration, based on the error rates obtained from the classifications of the cluster plots, was used to adjust the classification class proportions for classification errors. The resulting area estimates for total forest land in the five-county area were within 3 percent of the estimate made independently by the USDA Forest Service. Area estimates for conifer and hardwood forest types were within 0.8 and 6.0 percent respectively, of the Forest Service estimates. A trial of a second method of estimating the same classes as the Forest Service resulted in standard errors of 0.002 to 0.015. A study of the use of multidate TM data for change detection showed that forest canopy depletion, canopy increment, and no change could be identified with greater than 90 percent accuracy. The project results have been the basis for the Minnesota Department of Natural Resources and the Forest Service to define and begin to implement an annual system of forest inventory which utilizes Landsat TM data to detect changes in forest cover.

  18. Classification-Based Spatial Error Concealment for Visual Communications

    NASA Astrophysics Data System (ADS)

    Chen, Meng; Zheng, Yefeng; Wu, Min

    2006-12-01

    In an error-prone transmission environment, error concealment is an effective technique to reconstruct the damaged visual content. Due to large variations of image characteristics, different concealment approaches are necessary to accommodate the different nature of the lost image content. In this paper, we address this issue and propose using classification to integrate the state-of-the-art error concealment techniques. The proposed approach takes advantage of multiple concealment algorithms and adaptively selects the suitable algorithm for each damaged image area. With growing awareness that the design of sender and receiver systems should be jointly considered for efficient and reliable multimedia communications, we proposed a set of classification-based block concealment schemes, including receiver-side classification, sender-side attachment, and sender-side embedding. Our experimental results provide extensive performance comparisons and demonstrate that the proposed classification-based error concealment approaches outperform the conventional approaches.

  19. PSD Source Classification for Safety Kleen's Lubricating Oil Recovery Facility

    EPA Pesticide Factsheets

    This document may be of assistance in applying the New Source Review (NSR) air permitting regulations including the Prevention of Significant Deterioration (PSD) requirements. This document is part of the NSR Policy and Guidance Database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.

  20. Microscopic saw mark analysis: an empirical approach.

    PubMed

    Love, Jennifer C; Derrick, Sharon M; Wiersema, Jason M; Peters, Charles

    2015-01-01

    Microscopic saw mark analysis is a well published and generally accepted qualitative analytical method. However, little research has focused on identifying and mitigating potential sources of error associated with the method. The presented study proposes the use of classification trees and random forest classifiers as an optimal, statistically sound approach to mitigate the potential for error of variability and outcome error in microscopic saw mark analysis. The statistical model was applied to 58 experimental saw marks created with four types of saws. The saw marks were made in fresh human femurs obtained through anatomical gift and were analyzed using a Keyence digital microscope. The statistical approach weighed the variables based on discriminatory value and produced decision trees with an associated outcome error rate of 8.62-17.82%. © 2014 American Academy of Forensic Sciences.

  1. Exposure assessment in investigations of waterborne illness: a quantitative estimate of measurement error

    PubMed Central

    Jones, Andria Q; Dewey, Catherine E; Doré, Kathryn; Majowicz, Shannon E; McEwen, Scott A; Waltner-Toews, David

    2006-01-01

    Background Exposure assessment is typically the greatest weakness of epidemiologic studies of disinfection by-products (DBPs) in drinking water, which largely stems from the difficulty in obtaining accurate data on individual-level water consumption patterns and activity. Thus, surrogate measures for such waterborne exposures are commonly used. Little attention however, has been directed towards formal validation of these measures. Methods We conducted a study in the City of Hamilton, Ontario (Canada) in 2001–2002, to assess the accuracy of two surrogate measures of home water source: (a) urban/rural status as assigned using residential postal codes, and (b) mapping of residential postal codes to municipal water systems within a Geographic Information System (GIS). We then assessed the accuracy of a commonly-used surrogate measure of an individual's actual drinking water source, namely, their home water source. Results The surrogates for home water source provided good classification of residents served by municipal water systems (approximately 98% predictive value), but did not perform well in classifying those served by private water systems (average: 63.5% predictive value). More importantly, we found that home water source was a poor surrogate measure of the individuals' actual drinking water source(s), being associated with high misclassification errors. Conclusion This study demonstrated substantial misclassification errors associated with a surrogate measure commonly used in studies of drinking water disinfection byproducts. Further, the limited accuracy of two surrogate measures of an individual's home water source heeds caution in their use in exposure classification methodology. While these surrogates are inexpensive and convenient, they should not be substituted for direct collection of accurate data pertaining to the subjects' waterborne disease exposure. In instances where such surrogates must be used, estimation of the misclassification and its subsequent effects are recommended for the interpretation and communication of results. Our results also lend support for further investigation into the quantification of the exposure misclassification associated with these surrogate measures, which would provide useful estimates for consideration in interpretation of waterborne disease studies. PMID:16729887

  2. C-fuzzy variable-branch decision tree with storage and classification error rate constraints

    NASA Astrophysics Data System (ADS)

    Yang, Shiueng-Bien

    2009-10-01

    The C-fuzzy decision tree (CFDT), which is based on the fuzzy C-means algorithm, has recently been proposed. The CFDT is grown by selecting the nodes to be split according to its classification error rate. However, the CFDT design does not consider the classification time taken to classify the input vector. Thus, the CFDT can be improved. We propose a new C-fuzzy variable-branch decision tree (CFVBDT) with storage and classification error rate constraints. The design of the CFVBDT consists of two phases-growing and pruning. The CFVBDT is grown by selecting the nodes to be split according to the classification error rate and the classification time in the decision tree. Additionally, the pruning method selects the nodes to prune based on the storage requirement and the classification time of the CFVBDT. Furthermore, the number of branches of each internal node is variable in the CFVBDT. Experimental results indicate that the proposed CFVBDT outperforms the CFDT and other methods.

  3. A catalog of selected compact radio sources for the construction of an extragalactic radio/optical reference frame (Argue et al. 1984): Documentation for the machine-readable version

    NASA Technical Reports Server (NTRS)

    1990-01-01

    This document describes the machine readable version of the Selected Compact Radio Source Catalog as it is currently being distributed from the international network of astronomical data centers. It is intended to enable users to read and process the computerized catalog. The catalog contains 233 strong, compact extragalactic radio sources having identified optical counterparts. The machine version contains the same data as the published catalog and includes source identifications, equatorial positions at J2000.0 and their mean errors, object classifications, visual magnitudes, redshift, 5-GHz flux densities, and comments.

  4. Sources of variation in Landsat autocorrelation

    NASA Technical Reports Server (NTRS)

    Craig, R. G.; Labovitz, M. L.

    1980-01-01

    Analysis of sixty-four scan lines representing diverse conditions across satellites, channels, scanners, locations and cloud cover confirms that Landsat data are autocorrelated and consistently follow an Arima (1,0,1) pattern. The AR parameter varies significantly with location and the MA coefficient with cloud cover. Maximum likelihood classification functions are considerably in error unless this autocorrelation is compensated for in sampling.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mirjankar, Nikhil S.; Fraga, Carlos G.; Carman, April J.

    Chemical attribution signatures (CAS) for chemical threat agents (CTAs) are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. In a previous study, anionic impurity profiles developed using high performance ion chromatography (HPIC) were demonstrated as CAS for matching samples from eight potassium cyanide (KCN) stocks to their reported countries of origin. Herein, a larger number of solid KCN stocks (n = 13) and, for the first time, solid sodium cyanide (NaCN) stocks (n = 15) were examined to determine what additional sourcing information can be obtained through anion, carbon stablemore » isotope, and elemental analyses of cyanide stocks by HPIC, isotope ratio mass spectrometry (IRMS), and inductively coupled plasma optical emission spectroscopy (ICP-OES), respectively. The HPIC anion data was evaluated using the variable selection methods of Fisher-ratio (F-ratio), interval partial least squares (iPLS), and genetic algorithm-based partial least squares (GAPLS) and the classification methods of partial least squares discriminate analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminate analysis (SVMDA). In summary, hierarchical cluster analysis (HCA) of anion impurity profiles from multiple cyanide stocks from six reported country of origins resulted in cyanide samples clustering into three groups: Czech Republic, Germany, and United States, independent of the associated alkali metal (K or Na). The three country groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries with known solid cyanide factories. Both the anion and elemental CAS are believed to originate from the aqueous alkali hydroxides used in cyanide manufacture. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). The carbon isotope CAS is believed to originate from the carbon source and process used to make the HCN utilized in cyanide synthesis. Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability (so far) of using anion impurities for matching a cyanide sample to its country of manufacture (i.e., factory). Variable selection reduced errors for those classification methods having errors greater than zero with iPLS-forward selection, and F-ratio typically providing the lowest errors. Finally, using anion profiles to match cyanides to a specific stock or stock group resulted in cross-validation errors ranging from zero to 5.3%.« less

  6. Clarification of terminology in medication errors: definitions and classification.

    PubMed

    Ferner, Robin E; Aronson, Jeffrey K

    2006-01-01

    We have previously described and analysed some terms that are used in drug safety and have proposed definitions. Here we discuss and define terms that are used in the field of medication errors, particularly terms that are sometimes misunderstood or misused. We also discuss the classification of medication errors. A medication error is a failure in the treatment process that leads to, or has the potential to lead to, harm to the patient. Errors can be classified according to whether they are mistakes, slips, or lapses. Mistakes are errors in the planning of an action. They can be knowledge based or rule based. Slips and lapses are errors in carrying out an action - a slip through an erroneous performance and a lapse through an erroneous memory. Classification of medication errors is important because the probabilities of errors of different classes are different, as are the potential remedies.

  7. Hand posture classification using electrocorticography signals in the gamma band over human sensorimotor brain areas

    NASA Astrophysics Data System (ADS)

    Chestek, Cynthia A.; Gilja, Vikash; Blabe, Christine H.; Foster, Brett L.; Shenoy, Krishna V.; Parvizi, Josef; Henderson, Jaimie M.

    2013-04-01

    Objective. Brain-machine interface systems translate recorded neural signals into command signals for assistive technology. In individuals with upper limb amputation or cervical spinal cord injury, the restoration of a useful hand grasp could significantly improve daily function. We sought to determine if electrocorticographic (ECoG) signals contain sufficient information to select among multiple hand postures for a prosthetic hand, orthotic, or functional electrical stimulation system.Approach. We recorded ECoG signals from subdural macro- and microelectrodes implanted in motor areas of three participants who were undergoing inpatient monitoring for diagnosis and treatment of intractable epilepsy. Participants performed five distinct isometric hand postures, as well as four distinct finger movements. Several control experiments were attempted in order to remove sensory information from the classification results. Online experiments were performed with two participants. Main results. Classification rates were 68%, 84% and 81% for correct identification of 5 isometric hand postures offline. Using 3 potential controls for removing sensory signals, error rates were approximately doubled on average (2.1×). A similar increase in errors (2.6×) was noted when the participant was asked to make simultaneous wrist movements along with the hand postures. In online experiments, fist versus rest was successfully classified on 97% of trials; the classification output drove a prosthetic hand. Online classification performance for a larger number of hand postures remained above chance, but substantially below offline performance. In addition, the long integration windows used would preclude the use of decoded signals for control of a BCI system. Significance. These results suggest that ECoG is a plausible source of command signals for prosthetic grasp selection. Overall, avenues remain for improvement through better electrode designs and placement, better participant training, and characterization of non-stationarities such that ECoG could be a viable signal source for grasp control for amputees or individuals with paralysis.

  8. Quantifying uncertainty in carbon and nutrient pools of coarse woody debris

    NASA Astrophysics Data System (ADS)

    See, C. R.; Campbell, J. L.; Fraver, S.; Domke, G. M.; Harmon, M. E.; Knoepp, J. D.; Woodall, C. W.

    2016-12-01

    Woody detritus constitutes a major pool of both carbon and nutrients in forested ecosystems. Estimating coarse wood stocks relies on many assumptions, even when full surveys are conducted. Researchers rarely report error in coarse wood pool estimates, despite the importance to ecosystem budgets and modelling efforts. To date, no study has attempted a comprehensive assessment of error rates and uncertainty inherent in the estimation of this pool. Here, we use Monte Carlo analysis to propagate the error associated with the major sources of uncertainty present in the calculation of coarse wood carbon and nutrient (i.e., N, P, K, Ca, Mg, Na) pools. We also evaluate individual sources of error to identify the importance of each source of uncertainty in our estimates. We quantify sampling error by comparing the three most common field methods used to survey coarse wood (two transect methods and a whole-plot survey). We quantify the measurement error associated with length and diameter measurement, and technician error in species identification and decay class using plots surveyed by multiple technicians. We use previously published values of model error for the four most common methods of volume estimation: Smalian's, conical frustum, conic paraboloid, and average-of-ends. We also use previously published values for error in the collapse ratio (cross-sectional height/width) of decayed logs that serves as a surrogate for the volume remaining. We consider sampling error in chemical concentration and density for all decay classes, using distributions from both published and unpublished studies. Analytical uncertainty is calculated using standard reference plant material from the National Institute of Standards. Our results suggest that technician error in decay classification can have a large effect on uncertainty, since many of the error distributions included in the calculation (e.g. density, chemical concentration, volume-model selection, collapse ratio) are decay-class specific.

  9. Determining the optimal window length for pattern recognition-based myoelectric control: balancing the competing effects of classification error and controller delay.

    PubMed

    Smith, Lauren H; Hargrove, Levi J; Lock, Blair A; Kuiken, Todd A

    2011-04-01

    Pattern recognition-based control of myoelectric prostheses has shown great promise in research environments, but has not been optimized for use in a clinical setting. To explore the relationship between classification error, controller delay, and real-time controllability, 13 able-bodied subjects were trained to operate a virtual upper-limb prosthesis using pattern recognition of electromyogram (EMG) signals. Classification error and controller delay were varied by training different classifiers with a variety of analysis window lengths ranging from 50 to 550 ms and either two or four EMG input channels. Offline analysis showed that classification error decreased with longer window lengths (p < 0.01 ). Real-time controllability was evaluated with the target achievement control (TAC) test, which prompted users to maneuver the virtual prosthesis into various target postures. The results indicated that user performance improved with lower classification error (p < 0.01 ) and was reduced with longer controller delay (p < 0.01 ), as determined by the window length. Therefore, both of these effects should be considered when choosing a window length; it may be beneficial to increase the window length if this results in a reduced classification error, despite the corresponding increase in controller delay. For the system employed in this study, the optimal window length was found to be between 150 and 250 ms, which is within acceptable controller delays for conventional multistate amplitude controllers.

  10. The use of a contextual, modal and psychological classification of medication errors in the emergency department: a retrospective descriptive study.

    PubMed

    Cabilan, C J; Hughes, James A; Shannon, Carl

    2017-12-01

    To describe the contextual, modal and psychological classification of medication errors in the emergency department to know the factors associated with the reported medication errors. The causes of medication errors are unique in every clinical setting; hence, error minimisation strategies are not always effective. For this reason, it is fundamental to understand the causes specific to the emergency department so that targeted strategies can be implemented. Retrospective analysis of reported medication errors in the emergency department. All voluntarily staff-reported medication-related incidents from 2010-2015 from the hospital's electronic incident management system were retrieved for analysis. Contextual classification involved the time, place and the type of medications involved. Modal classification pertained to the stage and issue (e.g. wrong medication, wrong patient). Psychological classification categorised the errors in planning (knowledge-based and rule-based errors) and skill (slips and lapses). There were 405 errors reported. Most errors occurred in the acute care area, short-stay unit and resuscitation area, during the busiest shifts (0800-1559, 1600-2259). Half of the errors involved high-alert medications. Many of the errors occurred during administration (62·7%), prescribing (28·6%) and commonly during both stages (18·5%). Wrong dose, wrong medication and omission were the issues that dominated. Knowledge-based errors characterised the errors that occurred in prescribing and administration. The highest proportion of slips (79·5%) and lapses (76·1%) occurred during medication administration. It is likely that some of the errors occurred due to the lack of adherence to safety protocols. Technology such as computerised prescribing, barcode medication administration and reminder systems could potentially decrease the medication errors in the emergency department. There was a possibility that some of the errors could be prevented if safety protocols were adhered to, which highlights the need to also address clinicians' attitudes towards safety. Technology can be implemented to help minimise errors in the ED, but this must be coupled with efforts to enhance the culture of safety. © 2017 John Wiley & Sons Ltd.

  11. Source Attribution of Cyanides Using Anionic Impurity Profiling, Stable Isotope Ratios, Trace Elemental Analysis and Chemometrics.

    PubMed

    Mirjankar, Nikhil S; Fraga, Carlos G; Carman, April J; Moran, James J

    2016-02-02

    Chemical attribution signatures (CAS) for chemical threat agents (CTAs), such as cyanides, are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. Herein, stocks of KCN and NaCN were analyzed for trace anions by high performance ion chromatography (HPIC), carbon stable isotope ratio (δ(13)C) by isotope ratio mass spectrometry (IRMS), and trace elements by inductively coupled plasma optical emission spectroscopy (ICP-OES). The collected analytical data were evaluated using hierarchical cluster analysis (HCA), Fisher-ratio (F-ratio), interval partial least-squares (iPLS), genetic algorithm-based partial least-squares (GAPLS), partial least-squares discriminant analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminant analysis (SVMDA). HCA of anion impurity profiles from multiple cyanide stocks from six reported countries of origin resulted in cyanide samples clustering into three groups, independent of the associated alkali metal (K or Na). The three groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries each having one known solid cyanide factory: Czech Republic, Germany, and United States. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability associated with using anion impurities for matching a cyanide sample to its factory using our current cyanide stocks. Variable selection methods reduced errors for those classification methods having errors greater than zero; iPLS-forward selection and F-ratio typically provided the lowest errors. Finally, using anion profiles to classify cyanides to a specific stock or stock group for a subset of United States stocks resulted in cross-validation errors ranging from 0 to 5.3%.

  12. Effects of uncertainty and variability on population declines and IUCN Red List classifications.

    PubMed

    Rueda-Cediel, Pamela; Anderson, Kurt E; Regan, Tracey J; Regan, Helen M

    2018-01-22

    The International Union for Conservation of Nature (IUCN) Red List Categories and Criteria is a quantitative framework for classifying species according to extinction risk. Population models may be used to estimate extinction risk or population declines. Uncertainty and variability arise in threat classifications through measurement and process error in empirical data and uncertainty in the models used to estimate extinction risk and population declines. Furthermore, species traits are known to affect extinction risk. We investigated the effects of measurement and process error, model type, population growth rate, and age at first reproduction on the reliability of risk classifications based on projected population declines on IUCN Red List classifications. We used an age-structured population model to simulate true population trajectories with different growth rates, reproductive ages and levels of variation, and subjected them to measurement error. We evaluated the ability of scalar and matrix models parameterized with these simulated time series to accurately capture the IUCN Red List classification generated with true population declines. Under all levels of measurement error tested and low process error, classifications were reasonably accurate; scalar and matrix models yielded roughly the same rate of misclassifications, but the distribution of errors differed; matrix models led to greater overestimation of extinction risk than underestimations; process error tended to contribute to misclassifications to a greater extent than measurement error; and more misclassifications occurred for fast, rather than slow, life histories. These results indicate that classifications of highly threatened taxa (i.e., taxa with low growth rates) under criterion A are more likely to be reliable than for less threatened taxa when assessed with population models. Greater scrutiny needs to be placed on data used to parameterize population models for species with high growth rates, particularly when available evidence indicates a potential transition to higher risk categories. © 2018 Society for Conservation Biology.

  13. Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

    NASA Technical Reports Server (NTRS)

    Alexander, Tiffaney Miller

    2017-01-01

    Research results have shown that more than half of aviation, aerospace and aeronautics mishaps incidents are attributed to human error. As a part of Safety within space exploration ground processing operations, the identification and/or classification of underlying contributors and causes of human error must be identified, in order to manage human error. This research provides a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS), as an analysis tool to identify contributing factors, their impact on human error events, and predict the Human Error probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.

  14. Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

    NASA Technical Reports Server (NTRS)

    Alexander, Tiffaney Miller

    2017-01-01

    Research results have shown that more than half of aviation, aerospace and aeronautics mishaps/incidents are attributed to human error. As a part of Safety within space exploration ground processing operations, the identification and/or classification of underlying contributors and causes of human error must be identified, in order to manage human error. This research provides a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS), as an analysis tool to identify contributing factors, their impact on human error events, and predict the Human Error probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.

  15. Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

    NASA Technical Reports Server (NTRS)

    Alexander, Tiffaney Miller

    2017-01-01

    Research results have shown that more than half of aviation, aerospace and aeronautics mishaps incidents are attributed to human error. As a part of Quality within space exploration ground processing operations, the identification and or classification of underlying contributors and causes of human error must be identified, in order to manage human error.This presentation will provide a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS), as an analysis tool to identify contributing factors, their impact on human error events, and predict the Human Error probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.

  16. Factors That Affect Large Subunit Ribosomal DNA Amplicon Sequencing Studies of Fungal Communities: Classification Method, Primer Choice, and Error

    PubMed Central

    Porter, Teresita M.; Golding, G. Brian

    2012-01-01

    Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and to an increasing extent also amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, makes primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: 1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); 2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and, 3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and, NBC runs significantly faster than the other tested methods. All methods performed poorly with the shortest 50–100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys. PMID:22558215

  17. Masked and unmasked error-related potentials during continuous control and feedback

    NASA Astrophysics Data System (ADS)

    Lopes Dias, Catarina; Sburlea, Andreea I.; Müller-Putz, Gernot R.

    2018-06-01

    The detection of error-related potentials (ErrPs) in tasks with discrete feedback is well established in the brain–computer interface (BCI) field. However, the decoding of ErrPs in tasks with continuous feedback is still in its early stages. Objective. We developed a task in which subjects have continuous control of a cursor’s position by means of a joystick. The cursor’s position was shown to the participants in two different modalities of continuous feedback: normal and jittered. The jittered feedback was created to mimic the instability that could exist if participants controlled the trajectory directly with brain signals. Approach. This paper studies the electroencephalographic (EEG)—measurable signatures caused by a loss of control over the cursor’s trajectory, causing a target miss. Main results. In both feedback modalities, time-locked potentials revealed the typical frontal-central components of error-related potentials. Errors occurring during the jittered feedback (masked errors) were delayed in comparison to errors occurring during normal feedback (unmasked errors). Masked errors displayed lower peak amplitudes than unmasked errors. Time-locked classification analysis allowed a good distinction between correct and error classes (average Cohen-, average TPR  =  81.8% and average TNR  =  96.4%). Time-locked classification analysis between masked error and unmasked error classes revealed results at chance level (average Cohen-, average TPR  =  60.9% and average TNR  =  58.3%). Afterwards, we performed asynchronous detection of ErrPs, combining both masked and unmasked trials. The asynchronous detection of ErrPs in a simulated online scenario resulted in an average TNR of 84.0% and in an average TPR of 64.9%. Significance. The time-locked classification results suggest that the masked and unmasked errors were indistinguishable in terms of classification. The asynchronous classification results suggest that the feedback modality did not hinder the asynchronous detection of ErrPs.

  18. Accuracy assessment in the Large Area Crop Inventory Experiment

    NASA Technical Reports Server (NTRS)

    Houston, A. G.; Pitts, D. E.; Feiveson, A. H.; Badhwar, G.; Ferguson, M.; Hsu, E.; Potter, J.; Chhikara, R.; Rader, M.; Ahlers, C.

    1979-01-01

    The Accuracy Assessment System (AAS) of the Large Area Crop Inventory Experiment (LACIE) was responsible for determining the accuracy and reliability of LACIE estimates of wheat production, area, and yield, made at regular intervals throughout the crop season, and for investigating the various LACIE error sources, quantifying these errors, and relating them to their causes. Some results of using the AAS during the three years of LACIE are reviewed. As the program culminated, AAS was able not only to meet the goal of obtaining accurate statistical estimates of sampling and classification accuracy, but also the goal of evaluating component labeling errors. Furthermore, the ground-truth data processing matured from collecting data for one crop (small grains) to collecting, quality-checking, and archiving data for all crops in a LACIE small segment.

  19. A compendium of fossil marine animal families, 2nd edition

    NASA Technical Reports Server (NTRS)

    Sepkoski, J. J. Jr; Sepkoski JJ, J. r. (Principal Investigator)

    1992-01-01

    A comprehensive listing of 4075 taxonomic families of marine animals known from the fossil record is presented. This listing covers invertebrates, vertebrates, and animal-like protists, gives time intervals of apparent origination and extinction, and provides literature sources for these data. The time intervals are mostly 81 internationally recognized stratigraphic stages; more than half of the data are resolved to one of 145 substage divisions, providing more highly resolved data for studies of taxic macroevolution. Families are classified by order, class, and phylum, reflecting current classifications in the published literature. This compendium is a new edition of the 1982 publication, correcting errors and presenting greater stratigraphic resolution and more current ideas about acceptable families and their classification.

  20. On the use of interaction error potentials for adaptive brain computer interfaces.

    PubMed

    Llera, A; van Gerven, M A J; Gómez, V; Jensen, O; Kappen, H J

    2011-12-01

    We propose an adaptive classification method for the Brain Computer Interfaces (BCI) which uses Interaction Error Potentials (IErrPs) as a reinforcement signal and adapts the classifier parameters when an error is detected. We analyze the quality of the proposed approach in relation to the misclassification of the IErrPs. In addition we compare static versus adaptive classification performance using artificial and MEG data. We show that the proposed adaptive framework significantly improves the static classification methods. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. A neural network for noise correlation classification

    NASA Astrophysics Data System (ADS)

    Paitz, Patrick; Gokhberg, Alexey; Fichtner, Andreas

    2018-02-01

    We present an artificial neural network (ANN) for the classification of ambient seismic noise correlations into two categories, suitable and unsuitable for noise tomography. By using only a small manually classified data subset for network training, the ANN allows us to classify large data volumes with low human effort and to encode the valuable subjective experience of data analysts that cannot be captured by a deterministic algorithm. Based on a new feature extraction procedure that exploits the wavelet-like nature of seismic time-series, we efficiently reduce the dimensionality of noise correlation data, still keeping relevant features needed for automated classification. Using global- and regional-scale data sets, we show that classification errors of 20 per cent or less can be achieved when the network training is performed with as little as 3.5 per cent and 16 per cent of the data sets, respectively. Furthermore, the ANN trained on the regional data can be applied to the global data, and vice versa, without a significant increase of the classification error. An experiment where four students manually classified the data, revealed that the classification error they would assign to each other is substantially larger than the classification error of the ANN (>35 per cent). This indicates that reproducibility would be hampered more by human subjectivity than by imperfections of the ANN.

  2. Comparison of artificial intelligence classifiers for SIP attack data

    NASA Astrophysics Data System (ADS)

    Safarik, Jakub; Slachta, Jiri

    2016-05-01

    Honeypot application is a source of valuable data about attacks on the network. We run several SIP honeypots in various computer networks, which are separated geographically and logically. Each honeypot runs on public IP address and uses standard SIP PBX ports. All information gathered via honeypot is periodically sent to the centralized server. This server classifies all attack data by neural network algorithm. The paper describes optimizations of a neural network classifier, which lower the classification error. The article contains the comparison of two neural network algorithm used for the classification of validation data. The first is the original implementation of the neural network described in recent work; the second neural network uses further optimizations like input normalization or cross-entropy cost function. We also use other implementations of neural networks and machine learning classification algorithms. The comparison test their capabilities on validation data to find the optimal classifier. The article result shows promise for further development of an accurate SIP attack classification engine.

  3. Analyzing thematic maps and mapping for accuracy

    USGS Publications Warehouse

    Rosenfield, G.H.

    1982-01-01

    Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given. classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multiviariate analysis of variance.

  4. Analysis of DSN software anomalies

    NASA Technical Reports Server (NTRS)

    Galorath, D. D.; Hecht, H.; Hecht, M.; Reifer, D. J.

    1981-01-01

    A categorized data base of software errors which were discovered during the various stages of development and operational use of the Deep Space Network DSN/Mark 3 System was developed. A study team identified several existing error classification schemes (taxonomies), prepared a detailed annotated bibliography of the error taxonomy literature, and produced a new classification scheme which was tuned to the DSN anomaly reporting system and encapsulated the work of others. Based upon the DSN/RCI error taxonomy, error data on approximately 1000 reported DSN/Mark 3 anomalies were analyzed, interpreted and classified. Next, error data are summarized and histograms were produced highlighting key tendencies.

  5. Neyman-Pearson classification algorithms and NP receiver operating characteristics

    PubMed Central

    Tong, Xin; Feng, Yang; Li, Jingyi Jessica

    2018-01-01

    In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies. PMID:29423442

  6. Neyman-Pearson classification algorithms and NP receiver operating characteristics.

    PubMed

    Tong, Xin; Feng, Yang; Li, Jingyi Jessica

    2018-02-01

    In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies.

  7. Evaluation of normalization methods for cDNA microarray data by k-NN classification

    PubMed Central

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-01-01

    Background Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Conclusion Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics. PMID:16045803

  8. Evaluation of normalization methods for cDNA microarray data by k-NN classification.

    PubMed

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-07-26

    Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.

  9. Bayesian learning for spatial filtering in an EEG-based brain-computer interface.

    PubMed

    Zhang, Haihong; Yang, Huijuan; Guan, Cuntai

    2013-07-01

    Spatial filtering for EEG feature extraction and classification is an important tool in brain-computer interface. However, there is generally no established theory that links spatial filtering directly to Bayes classification error. To address this issue, this paper proposes and studies a Bayesian analysis theory for spatial filtering in relation to Bayes error. Following the maximum entropy principle, we introduce a gamma probability model for describing single-trial EEG power features. We then formulate and analyze the theoretical relationship between Bayes classification error and the so-called Rayleigh quotient, which is a function of spatial filters and basically measures the ratio in power features between two classes. This paper also reports our extensive study that examines the theory and its use in classification, using three publicly available EEG data sets and state-of-the-art spatial filtering techniques and various classifiers. Specifically, we validate the positive relationship between Bayes error and Rayleigh quotient in real EEG power features. Finally, we demonstrate that the Bayes error can be practically reduced by applying a new spatial filter with lower Rayleigh quotient.

  10. Automated Classification of Phonological Errors in Aphasic Language

    PubMed Central

    Ahuja, Sanjeev B.; Reggia, James A.; Berndt, Rita S.

    1984-01-01

    Using heuristically-guided state space search, a prototype program has been developed to simulate and classify phonemic errors occurring in the speech of neurologically-impaired patients. Simulations are based on an interchangeable rule/operator set of elementary errors which represent a theory of phonemic processing faults. This work introduces and evaluates a novel approach to error simulation and classification, it provides a prototype simulation tool for neurolinguistic research, and it forms the initial phase of a larger research effort involving computer modelling of neurolinguistic processes.

  11. HIV classification using the coalescent theory

    PubMed Central

    Bulla, Ingo; Schultz, Anne-Kathrin; Schreiber, Fabian; Zhang, Ming; Leitner, Thomas; Korber, Bette; Morgenstern, Burkhard; Stanke, Mario

    2010-01-01

    Motivation: Existing coalescent models and phylogenetic tools based on them are not designed for studying the genealogy of sequences like those of HIV, since in HIV recombinants with multiple cross-over points between the parental strains frequently arise. Hence, ambiguous cases in the classification of HIV sequences into subtypes and circulating recombinant forms (CRFs) have been treated with ad hoc methods in lack of tools based on a comprehensive coalescent model accounting for complex recombination patterns. Results: We developed the program ARGUS that scores classifications of sequences into subtypes and recombinant forms. It reconstructs ancestral recombination graphs (ARGs) that reflect the genealogy of the input sequences given a classification hypothesis. An ARG with maximal probability is approximated using a Markov chain Monte Carlo approach. ARGUS was able to distinguish the correct classification with a low error rate from plausible alternative classifications in simulation studies with realistic parameters. We applied our algorithm to decide between two recently debated alternatives in the classification of CRF02 of HIV-1 and find that CRF02 is indeed a recombinant of Subtypes A and G. Availability: ARGUS is implemented in C++ and the source code is available at http://gobics.de/software Contact: ibulla@uni-goettingen.de Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:20400454

  12. ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.

    USGS Publications Warehouse

    Rosenfield, George H.; Fitzpatrick-Lins, Katherine

    1984-01-01

    Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the counts corrected, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0. 51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.

  13. New wideband radar target classification method based on neural learning and modified Euclidean metric

    NASA Astrophysics Data System (ADS)

    Jiang, Yicheng; Cheng, Ping; Ou, Yangkui

    2001-09-01

    A new method for target classification of high-range resolution radar is proposed. It tries to use neural learning to obtain invariant subclass features of training range profiles. A modified Euclidean metric based on the Box-Cox transformation technique is investigated for Nearest Neighbor target classification improvement. The classification experiments using real radar data of three different aircraft have demonstrated that classification error can reduce 8% if this method proposed in this paper is chosen instead of the conventional method. The results of this paper have shown that by choosing an optimized metric, it is indeed possible to reduce the classification error without increasing the number of samples.

  14. What Do Spelling Errors Tell Us? Classification and Analysis of Errors Made by Greek Schoolchildren with and without Dyslexia

    ERIC Educational Resources Information Center

    Protopapas, Athanassios; Fakou, Aikaterini; Drakopoulou, Styliani; Skaloumbakas, Christos; Mouzaki, Angeliki

    2013-01-01

    In this study we propose a classification system for spelling errors and determine the most common spelling difficulties of Greek children with and without dyslexia. Spelling skills of 542 children from the general population and 44 children with dyslexia, Grades 3-4 and 7, were assessed with a dictated common word list and age-appropriate…

  15. Dutch population specific sex estimation formulae using the proximal femur.

    PubMed

    Colman, K L; Janssen, M C L; Stull, K E; van Rijn, R R; Oostra, R J; de Boer, H H; van der Merwe, A E

    2018-05-01

    Sex estimation techniques are frequently applied in forensic anthropological analyses of unidentified human skeletal remains. While morphological sex estimation methods are able to endure population differences, the classification accuracy of metric sex estimation methods are population-specific. No metric sex estimation method currently exists for the Dutch population. The purpose of this study is to create Dutch population specific sex estimation formulae by means of osteometric analyses of the proximal femur. Since the Netherlands lacks a representative contemporary skeletal reference population, 2D plane reconstructions, derived from clinical computed tomography (CT) data, were used as an alternative source for a representative reference sample. The first part of this study assesses the intra- and inter-observer error, or reliability, of twelve measurements of the proximal femur. The technical error of measurement (TEM) and relative TEM (%TEM) were calculated using 26 dry adult femora. In addition, the agreement, or accuracy, between the dry bone and CT-based measurements was determined by percent agreement. Only reliable and accurate measurements were retained for the logistic regression sex estimation formulae; a training set (n=86) was used to create the models while an independent testing set (n=28) was used to validate the models. Due to high levels of multicollinearity, only single variable models were created. Cross-validated classification accuracies ranged from 86% to 92%. The high cross-validated classification accuracies indicate that the developed formulae can contribute to the biological profile and specifically in sex estimation of unidentified human skeletal remains in the Netherlands. Furthermore, the results indicate that clinical CT data can be a valuable alternative source of data when representative skeletal collections are unavailable. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Effectiveness of Global Features for Automatic Medical Image Classification and Retrieval – the experiences of OHSU at ImageCLEFmed

    PubMed Central

    Kalpathy-Cramer, Jayashree; Hersh, William

    2008-01-01

    In 2006 and 2007, Oregon Health & Science University (OHSU) participated in the automatic image annotation task for medical images at ImageCLEF, an annual international benchmarking event that is part of the Cross Language Evaluation Forum (CLEF). The goal of the automatic annotation task was to classify 1000 test images based on the Image Retrieval in Medical Applications (IRMA) code, given a set of 10,000 training images. There were 116 distinct classes in 2006 and 2007. We evaluated the efficacy of a variety of primarily global features for this classification task. These included features based on histograms, gray level correlation matrices and the gist technique. A multitude of classifiers including k-nearest neighbors, two-level neural networks, support vector machines, and maximum likelihood classifiers were evaluated. Our official error rates for the 1000 test images were 26% in 2006 using the flat classification structure. The error count in 2007 was 67.8 using the hierarchical classification error computation based on the IRMA code in 2007. Confusion matrices as well as clustering experiments were used to identify visually similar classes. The use of the IRMA code did not help us in the classification task as the semantic hierarchy of the IRMA classes did not correspond well with the hierarchy based on clustering of image features that we used. Our most frequent misclassification errors were along the view axis. Subsequent experiments based on a two-stage classification system decreased our error rate to 19.8% for the 2006 dataset and our error count to 55.4 for the 2007 data. PMID:19884953

  17. Investigations in adaptive processing of multispectral data

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.; Horwitz, H. M.

    1973-01-01

    Adaptive data processing procedures are applied to the problem of classifying objects in a scene scanned by multispectral sensor. These procedures show a performance improvement over standard nonadaptive techniques. Some sources of error in classification are identified and those correctable by adaptive processing are discussed. Experiments in adaptation of signature means by decision-directed methods are described. Some of these methods assume correlation between the trajectories of different signature means; for others this assumption is not made.

  18. An assessment of the cultivated cropland class of NLCD 2006 using a multi-source and multi-criteria approach

    USGS Publications Warehouse

    Danielson, Patrick; Yang, Limin; Jin, Suming; Homer, Collin G.; Napton, Darrell

    2016-01-01

    We developed a method that analyzes the quality of the cultivated cropland class mapped in the USA National Land Cover Database (NLCD) 2006. The method integrates multiple geospatial datasets and a Multi Index Integrated Change Analysis (MIICA) change detection method that captures spectral changes to identify the spatial distribution and magnitude of potential commission and omission errors for the cultivated cropland class in NLCD 2006. The majority of the commission and omission errors in NLCD 2006 are in areas where cultivated cropland is not the most dominant land cover type. The errors are primarily attributed to the less accurate training dataset derived from the National Agricultural Statistics Service Cropland Data Layer dataset. In contrast, error rates are low in areas where cultivated cropland is the dominant land cover. Agreement between model-identified commission errors and independently interpreted reference data was high (79%). Agreement was low (40%) for omission error comparison. The majority of the commission errors in the NLCD 2006 cultivated crops were confused with low-intensity developed classes, while the majority of omission errors were from herbaceous and shrub classes. Some errors were caused by inaccurate land cover change from misclassification in NLCD 2001 and the subsequent land cover post-classification process.

  19. Error-related brain activity and error awareness in an error classification paradigm.

    PubMed

    Di Gregorio, Francesco; Steinhauser, Marco; Maier, Martin E

    2016-10-01

    Error-related brain activity has been linked to error detection enabling adaptive behavioral adjustments. However, it is still unclear which role error awareness plays in this process. Here, we show that the error-related negativity (Ne/ERN), an event-related potential reflecting early error monitoring, is dissociable from the degree of error awareness. Participants responded to a target while ignoring two different incongruent distractors. After responding, they indicated whether they had committed an error, and if so, whether they had responded to one or to the other distractor. This error classification paradigm allowed distinguishing partially aware errors, (i.e., errors that were noticed but misclassified) and fully aware errors (i.e., errors that were correctly classified). The Ne/ERN was larger for partially aware errors than for fully aware errors. Whereas this speaks against the idea that the Ne/ERN foreshadows the degree of error awareness, it confirms the prediction of a computational model, which relates the Ne/ERN to post-response conflict. This model predicts that stronger distractor processing - a prerequisite of error classification in our paradigm - leads to lower post-response conflict and thus a smaller Ne/ERN. This implies that the relationship between Ne/ERN and error awareness depends on how error awareness is related to response conflict in a specific task. Our results further indicate that the Ne/ERN but not the degree of error awareness determines adaptive performance adjustments. Taken together, we conclude that the Ne/ERN is dissociable from error awareness and foreshadows adaptive performance adjustments. Our results suggest that the relationship between the Ne/ERN and error awareness is correlative and mediated by response conflict. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Toward attenuating the impact of arm positions on electromyography pattern-recognition based motion classification in transradial amputees

    PubMed Central

    2012-01-01

    Background Electromyography (EMG) pattern-recognition based control strategies for multifunctional myoelectric prosthesis systems have been studied commonly in a controlled laboratory setting. Before these myoelectric prosthesis systems are clinically viable, it will be necessary to assess the effect of some disparities between the ideal laboratory setting and practical use on the control performance. One important obstacle is the impact of arm position variation that causes the changes of EMG pattern when performing identical motions in different arm positions. This study aimed to investigate the impacts of arm position variation on EMG pattern-recognition based motion classification in upper-limb amputees and the solutions for reducing these impacts. Methods With five unilateral transradial (TR) amputees, the EMG signals and tri-axial accelerometer mechanomyography (ACC-MMG) signals were simultaneously collected from both amputated and intact arms when performing six classes of arm and hand movements in each of five arm positions that were considered in the study. The effect of the arm position changes was estimated in terms of motion classification error and compared between amputated and intact arms. Then the performance of three proposed methods in attenuating the impact of arm positions was evaluated. Results With EMG signals, the average intra-position and inter-position classification errors across all five arm positions and five subjects were around 7.3% and 29.9% from amputated arms, respectively, about 1.0% and 10% low in comparison with those from intact arms. While ACC-MMG signals could yield a similar intra-position classification error (9.9%) as EMG, they had much higher inter-position classification error with an average value of 81.1% over the arm positions and the subjects. When the EMG data from all five arm positions were involved in the training set, the average classification error reached a value of around 10.8% for amputated arms. Using a two-stage cascade classifier, the average classification error was around 9.0% over all five arm positions. Reducing ACC-MMG channels from 8 to 2 only increased the average position classification error across all five arm positions from 0.7% to 1.0% in amputated arms. Conclusions The performance of EMG pattern-recognition based method in classifying movements strongly depends on arm positions. This dependency is a little stronger in intact arm than in amputated arm, which suggests that the investigations associated with practical use of a myoelectric prosthesis should use the limb amputees as subjects instead of using able-body subjects. The two-stage cascade classifier mode with ACC-MMG for limb position identification and EMG for limb motion classification may be a promising way to reduce the effect of limb position variation on classification performance. PMID:23036049

  1. Algorithmic Classification of Five Characteristic Types of Paraphasias.

    PubMed

    Fergadiotis, Gerasimos; Gorman, Kyle; Bedrick, Steven

    2016-12-01

    This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.

  2. Human error analysis of commercial aviation accidents using the human factors analysis and classification system (HFACS)

    DOT National Transportation Integrated Search

    2001-02-01

    The Human Factors Analysis and Classification System (HFACS) is a general human error framework : originally developed and tested within the U.S. military as a tool for investigating and analyzing the human : causes of aviation accidents. Based upon ...

  3. Current Assessment and Classification of Suicidal Phenomena using the FDA 2012 Draft Guidance Document on Suicide Assessment: A Critical Review.

    PubMed

    Sheehan, David V; Giddens, Jennifer M; Sheehan, Kathy Harnett

    2014-09-01

    Standard international classification criteria require that classification categories be comprehensive to avoid type II error. Categories should be mutually exclusive and definitions should be clear and unambiguous (to avoid type I and type II errors). In addition, the classification system should be robust enough to last over time and provide comparability between data collections. This article was designed to evaluate the extent to which the classification system contained in the United States Food and Drug Administration 2012 Draft Guidance for the prospective assessment and classification of suicidal ideation and behavior in clinical trials meets these criteria. A critical review is used to assess the extent to which the proposed categories contained in the Food and Drug Administration 2012 Draft Guidance are comprehensive, unambiguous, and robust. Assumptions that underlie the classification system are also explored. The Food and Drug Administration classification system contained in the 2012 Draft Guidance does not capture the full range of suicidal ideation and behavior (type II error). Definitions, moreover, are frequently ambiguous (susceptible to multiple interpretations), and the potential for misclassification (type I and type II errors) is compounded by frequent mismatches in category titles and definitions. These issues have the potential to compromise data comparability within clinical trial sites, across sites, and over time. These problems need to be remedied because of the potential for flawed data output and consequent threats to public health, to research on the safety of medications, and to the search for effective medication treatments for suicidality.

  4. An experimental study of interstitial lung tissue classification in HRCT images using ANN and role of cost functions

    NASA Astrophysics Data System (ADS)

    Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen

    2017-03-01

    In this paper, we investigate the effect of the error criteria used during a training phase of the artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected with Interstitial Lung Diseases (ILD). Mean square error (MSE) and the cross-entropy (CE) criteria are chosen being most popular choice in state-of-the-art implementations. The classification experiment performed on the six interstitial lung disease (ILD) patterns viz. Consolidation, Emphysema, Ground Glass Opacity, Micronodules, Fibrosis and Healthy from MedGIFT database. The texture features from an arbitrary region of interest (AROI) are extracted using Gabor filter. Two different neural networks are trained with the scaled conjugate gradient back propagation algorithm with MSE and CE error criteria function respectively for weight updation. Performance is evaluated in terms of average accuracy of these classifiers using 4 fold cross-validation. Each network is trained for five times for each fold with randomly initialized weight vectors and accuracies are computed. Significant improvement in classification accuracy is observed when ANN is trained by using CE (67.27%) as error function compared to MSE (63.60%). Moreover, standard deviation of the classification accuracy for the network trained with CE (6.69) error criteria is found less as compared to network trained with MSE (10.32) criteria.

  5. Particle Swarm Optimization approach to defect detection in armour ceramics.

    PubMed

    Kesharaju, Manasa; Nagarajah, Romesh

    2017-03-01

    In this research, various extracted features were used in the development of an automated ultrasonic sensor based inspection system that enables defect classification in each ceramic component prior to despatch to the field. Classification is an important task and large number of irrelevant, redundant features commonly introduced to a dataset reduces the classifiers performance. Feature selection aims to reduce the dimensionality of the dataset while improving the performance of a classification system. In the context of a multi-criteria optimization problem (i.e. to minimize classification error rate and reduce number of features) such as one discussed in this research, the literature suggests that evolutionary algorithms offer good results. Besides, it is noted that Particle Swarm Optimization (PSO) has not been explored especially in the field of classification of high frequency ultrasonic signals. Hence, a binary coded Particle Swarm Optimization (BPSO) technique is investigated in the implementation of feature subset selection and to optimize the classification error rate. In the proposed method, the population data is used as input to an Artificial Neural Network (ANN) based classification system to obtain the error rate, as ANN serves as an evaluator of PSO fitness function. Copyright © 2016. Published by Elsevier B.V.

  6. The impact of OCR accuracy on automated cancer classification of pathology reports.

    PubMed

    Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle

    2012-01-01

    To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.

  7. Error Detection in Mechanized Classification Systems

    ERIC Educational Resources Information Center

    Hoyle, W. G.

    1976-01-01

    When documentary material is indexed by a mechanized classification system, and the results judged by trained professionals, the number of documents in disagreement, after suitable adjustment, defines the error rate of the system. In a test case disagreement was 22 percent and, of this 22 percent, the computer correctly identified two-thirds of…

  8. Privacy-Preserving Evaluation of Generalization Error and Its Application to Model and Attribute Selection

    NASA Astrophysics Data System (ADS)

    Sakuma, Jun; Wright, Rebecca N.

    Privacy-preserving classification is the task of learning or training a classifier on the union of privately distributed datasets without sharing the datasets. The emphasis of existing studies in privacy-preserving classification has primarily been put on the design of privacy-preserving versions of particular data mining algorithms, However, in classification problems, preprocessing and postprocessing— such as model selection or attribute selection—play a prominent role in achieving higher classification accuracy. In this paper, we show generalization error of classifiers in privacy-preserving classification can be securely evaluated without sharing prediction results. Our main technical contribution is a new generalized Hamming distance protocol that is universally applicable to preprocessing and postprocessing of various privacy-preserving classification problems, such as model selection in support vector machine and attribute selection in naive Bayes classification.

  9. Putzmeister Inc. and Implication of Changes in Nonattainment Classification

    EPA Pesticide Factsheets

    This document may be of assistance in applying the New Source Review (NSR) air permitting regulations including the Prevention of Significant Deterioration (PSD) requirements. This document is part of the NSR Policy and Guidance Database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.

  10. Classification of the Bardstown Fuel Alcohol Company under PSD

    EPA Pesticide Factsheets

    This document may be of assistance in applying the New Source Review (NSR) air permitting regulations including the Prevention of Significant Deterioration (PSD) requirements. This document is part of the NSR Policy and Guidance Database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.

  11. Classification of Ethanol Fuel Plants under PSD

    EPA Pesticide Factsheets

    This document may be of assistance in applying the New Source Review (NSR) air permitting regulations including the Prevention of Significant Deterioration (PSD) requirements. This document is part of the NSR Policy and Guidance Database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.

  12. Classification of Emissions from Landfills for NSR Applicability Purposes

    EPA Pesticide Factsheets

    This document may be of assistance in applying the New Source Review (NSR) air permitting regulations including the Prevention of Significant Deterioration (PSD) requirements. This document is part of the NSR Policy and Guidance Database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.

  13. Unsupervised classification of operator workload from brain signals.

    PubMed

    Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin

    2016-06-01

    In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects' error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC, cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.

  14. Unsupervised classification of operator workload from brain signals

    NASA Astrophysics Data System (ADS)

    Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin

    2016-06-01

    Objective. In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Approach. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects’ error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Main results. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC, cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Significance. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.

  15. Low-Power Analog Processing for Sensing Applications: Low-Frequency Harmonic Signal Classification

    PubMed Central

    White, Daniel J.; William, Peter E.; Hoffman, Michael W.; Balkir, Sina

    2013-01-01

    A low-power analog sensor front-end is described that reduces the energy required to extract environmental sensing spectral features without using Fast Fouriér Transform (FFT) or wavelet transforms. An Analog Harmonic Transform (AHT) allows selection of only the features needed by the back-end, in contrast to the FFT, where all coefficients must be calculated simultaneously. We also show that the FFT coefficients can be easily calculated from the AHT results by a simple back-substitution. The scheme is tailored for low-power, parallel analog implementation in an integrated circuit (IC). Two different applications are tested with an ideal front-end model and compared to existing studies with the same data sets. Results from the military vehicle classification and identification of machine-bearing fault applications shows that the front-end suits a wide range of harmonic signal sources. Analog-related errors are modeled to evaluate the feasibility of and to set design parameters for an IC implementation to maintain good system-level performance. Design of a preliminary transistor-level integrator circuit in a 0.13 μm complementary metal-oxide-silicon (CMOS) integrated circuit process showed the ability to use online self-calibration to reduce fabrication errors to a sufficiently low level. Estimated power dissipation is about three orders of magnitude less than similar vehicle classification systems that use commercially available FFT spectral extraction. PMID:23892765

  16. Information analysis of a spatial database for ecological land classification

    NASA Technical Reports Server (NTRS)

    Davis, Frank W.; Dozier, Jeff

    1990-01-01

    An ecological land classification was developed for a complex region in southern California using geographic information system techniques of map overlay and contingency table analysis. Land classes were identified by mutual information analysis of vegetation pattern in relation to other mapped environmental variables. The analysis was weakened by map errors, especially errors in the digital elevation data. Nevertheless, the resulting land classification was ecologically reasonable and performed well when tested with higher quality data from the region.

  17. Using the EC decision on case definitions for communicable diseases as a terminology source--lessons learned.

    PubMed

    Balkanyi, Laszlo; Heja, Gergely; Nagy, Attlia

    2014-01-01

    Extracting scientifically accurate terminology from an EU public health regulation is part of the knowledge engineering work at the European Centre for Disease Prevention and Control (ECDC). ECDC operates information systems at the crossroads of many areas - posing a challenge for transparency and consistency. Semantic interoperability is based on the Terminology Server (TS). TS value sets (structured vocabularies) describe shared domains as "diseases", "organisms", "public health terms", "geo-entities" "organizations" and "administrative terms" and others. We extracted information from the relevant EC Implementing Decision on case definitions for reporting communicable diseases, listing 53 notifiable infectious diseases, containing clinical, diagnostic, laboratory and epidemiological criteria. We performed a consistency check; a simplification - abstraction; we represented lab criteria in triplets: as 'y' procedural result /of 'x' organism-substance/on 'z' specimen and identified negations. The resulting new case definition value set represents the various formalized criteria, meanwhile the existing disease value set has been extended, new signs and symptoms were added. New organisms enriched the organism value set. Other new categories have been added to the public health value set, as transmission modes; substances; specimens and procedures. We identified problem areas, as (a) some classification error(s); (b) inconsistent granularity of conditions; (c) seemingly nonsense criteria, medical trivialities; (d) possible logical errors, (e) seemingly factual errors that might be phrasing errors. We think our hypothesis regarding room for possible improvements is valid: there are some open issues and a further improved legal text might lead to more precise epidemiologic data collection. It has to be noted that formal representation for automatic classification of cases was out of scope, such a task would require other formalism, as e.g. those used by rule-based decision support systems.

  18. Semiautomated object-based classification of rain-induced landslides with VHR multispectral images on Madeira Island

    NASA Astrophysics Data System (ADS)

    Heleno, Sandra; Matias, Magda; Pina, Pedro; Sousa, António Jorge

    2016-04-01

    A method for semiautomated landslide detection and mapping, with the ability to separate source and run-out areas, is presented in this paper. It combines object-based image analysis and a support vector machine classifier and is tested using a GeoEye-1 multispectral image, sensed 3 days after a major damaging landslide event that occurred on Madeira Island (20 February 2010), and a pre-event lidar digital terrain model. The testing is developed in a 15 km2 wide study area, where 95 % of the number of landslides scars are detected by this supervised approach. The classifier presents a good performance in the delineation of the overall landslide area, with commission errors below 26 % and omission errors below 24 %. In addition, fair results are achieved in the separation of the source from the run-out landslide areas, although in less illuminated slopes this discrimination is less effective than in sunnier, east-facing slopes.

  19. Structural MRI-based detection of Alzheimer's disease using feature ranking and classification error.

    PubMed

    Beheshti, Iman; Demirel, Hasan; Farokhian, Farnaz; Yang, Chunlan; Matsuda, Hiroshi

    2016-12-01

    This paper presents an automatic computer-aided diagnosis (CAD) system based on feature ranking for detection of Alzheimer's disease (AD) using structural magnetic resonance imaging (sMRI) data. The proposed CAD system is composed of four systematic stages. First, global and local differences in the gray matter (GM) of AD patients compared to the GM of healthy controls (HCs) are analyzed using a voxel-based morphometry technique. The aim is to identify significant local differences in the volume of GM as volumes of interests (VOIs). Second, the voxel intensity values of the VOIs are extracted as raw features. Third, the raw features are ranked using a seven-feature ranking method, namely, statistical dependency (SD), mutual information (MI), information gain (IG), Pearson's correlation coefficient (PCC), t-test score (TS), Fisher's criterion (FC), and the Gini index (GI). The features with higher scores are more discriminative. To determine the number of top features, the estimated classification error based on training set made up of the AD and HC groups is calculated, with the vector size that minimized this error selected as the top discriminative feature. Fourth, the classification is performed using a support vector machine (SVM). In addition, a data fusion approach among feature ranking methods is introduced to improve the classification performance. The proposed method is evaluated using a data-set from ADNI (130 AD and 130 HC) with 10-fold cross-validation. The classification accuracy of the proposed automatic system for the diagnosis of AD is up to 92.48% using the sMRI data. An automatic CAD system for the classification of AD based on feature-ranking method and classification errors is proposed. In this regard, seven-feature ranking methods (i.e., SD, MI, IG, PCC, TS, FC, and GI) are evaluated. The optimal size of top discriminative features is determined by the classification error estimation in the training phase. The experimental results indicate that the performance of the proposed system is comparative to that of state-of-the-art classification models. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

    PubMed

    Bokulich, Nicholas A; Kaehler, Benjamin D; Rideout, Jai Ram; Dillon, Matthew; Bolyen, Evan; Knight, Rob; Huttley, Gavin A; Gregory Caporaso, J

    2018-05-17

    Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier ( https://github.com/qiime2/q2-feature-classifier ), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit ( https://github.com/caporaso-lab/tax-credit-data ). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.

  1. Documentation of procedures for textural/spatial pattern recognition techniques

    NASA Technical Reports Server (NTRS)

    Haralick, R. M.; Bryant, W. F.

    1976-01-01

    A C-130 aircraft was flown over the Sam Houston National Forest on March 21, 1973 at 10,000 feet altitude to collect multispectral scanner (MSS) data. Existing textural and spatial automatic processing techniques were used to classify the MSS imagery into specified timber categories. Several classification experiments were performed on this data using features selected from the spectral bands and a textural transform band. The results indicate that (1) spatial post-processing a classified image can cut the classification error to 1/2 or 1/3 of its initial value, (2) spatial post-processing the classified image using combined spectral and textural features produces a resulting image with less error than post-processing a classified image using only spectral features and (3) classification without spatial post processing using the combined spectral textural features tends to produce about the same error rate as a classification without spatial post processing using only spectral features.

  2. Acquiring Research-grade ALSM Data in the Commercial Marketplace

    NASA Astrophysics Data System (ADS)

    Haugerud, R. A.; Harding, D. J.; Latypov, D.; Martinez, D.; Routh, S.; Ziegler, J.

    2003-12-01

    The Puget Sound Lidar Consortium, working with TerraPoint, LLC, has procured a large volume of ALSM (topographic lidar) data for scientific research. Research-grade ALSM data can be characterized by their completeness, density, and accuracy. Complete data include-at a minimum-X, Y, Z, time, and classification (ground, vegetation, structure, blunder) for each laser reflection. Off-nadir angle and return number for multiple returns are also useful. We began with a pulse density of 1/sq m, and after limited experiments still find this density satisfactory in the dense second-growth forests of western Washington. Lower pulse densities would have produced unacceptably limited sampling in forested areas and aliased some topographic features. Higher pulse densities do not produce markedly better topographic models, in part because of limitations of reproducibility between the overlapping survey swaths used to achieve higher density. Our experience in a variety of forest types demonstrates that the fraction of pulses that produce ground returns varies with vegetation cover, laser beam divergence, laser power, and detector sensitivity, but have not quantified this relationship. The most significant operational limits on vertical accuracy of ALSM appear to be instrument calibration and the accuracy with which returns are classified as ground or vegetation. TerraPoint has recently implemented in-situ calibration using overlapping swaths (Latypov and Zosse, 2002, see http://www.terrapoint.com/News_damirACSM_ASPRS2002.html). On the consumer side, we routinely perform a similar overlap analysis to produce maps of relative Z error between swaths; we find that in bare, low-slope regions the in-situ calibration has reduced this internal Z error to 6-10 cm RMSE. Comparison with independent ground control points commonly illuminates inconsistencies in how GPS heights have been reduced to orthometric heights. Once these inconsistencies are resolved, it appears that the internal errors are the bulk of the error of the survey. The error maps suggest that with in-situ calibration, minor time-varying errors with a period of circa 1 sec are the largest remaining source of survey error. For forested terrain, limited ground penetration and errors in return classification can severely limit the accuracy of resulting topographic models. Initial work by Haugerud and Harding demonstrated the feasibility of fully-automatic return classification; however, TerraPoint has found that better results can be obtained more effectively with 3rd-party classification software that allows a mix of automated routines and human intervention. Our relationship has been evolving since early 2000. Important aspects of this relationship include close communication between data producer and consumer, a willingness to learn from each other, significant technical expertise and resources on the consumer side, and continued refinement of achievable, quantitative performance and accuracy specifications. Most recently we have instituted a slope-dependent Z accuracy specification that TerraPoint first developed as a heuristic for surveying mountainous terrain in Switzerland. We are now working on quantifying the internal consistency of topographic models in forested areas, using a variant of overlap analysis, and standards for the spatial distribution of internal errors.

  3. Medical error and systems of signaling: conceptual and linguistic definition.

    PubMed

    Smorti, Andrea; Cappelli, Francesco; Zarantonello, Roberta; Tani, Franca; Gensini, Gian Franco

    2014-09-01

    In recent years the issue of patient safety has been the subject of detailed investigations, particularly as a result of the increasing attention from the patients and the public on the problem of medical error. The purpose of this work is firstly to define the classification of medical errors, which are distinguished between two perspectives: those that are personal, and those that are caused by the system. Furthermore we will briefly review some of the main methods used by healthcare organizations to identify and analyze errors. During this discussion it has been determined that, in order to constitute a practical, coordinated and shared action to counteract the error, it is necessary to promote an analysis that considers all elements (human, technological and organizational) that contribute to the occurrence of a critical event. Therefore, it is essential to create a culture of constructive confrontation that encourages an open and non-punitive debate about the causes that led to error. In conclusion we have thus underlined that in health it is essential to affirm a system discussion that considers the error as a learning source, and as a result of the interaction between the individual and the organization. In this way, one should encourage a non-guilt bearing discussion on evident errors and on those which are not immediately identifiable, in order to create the conditions that recognize and corrects the error even before it produces negative consequences.

  4. Segmentation, classification, and pose estimation of military vehicles in low resolution laser radar images

    NASA Astrophysics Data System (ADS)

    Neulist, Joerg; Armbruster, Walter

    2005-05-01

    Model-based object recognition in range imagery typically involves matching the image data to the expected model data for each feasible model and pose hypothesis. Since the matching procedure is computationally expensive, the key to efficient object recognition is the reduction of the set of feasible hypotheses. This is particularly important for military vehicles, which may consist of several large moving parts such as the hull, turret, and gun of a tank, and hence require an eight or higher dimensional pose space to be searched. The presented paper outlines techniques for reducing the set of feasible hypotheses based on an estimation of target dimensions and orientation. Furthermore, the presence of a turret and a main gun and their orientations are determined. The vehicle parts dimensions as well as their error estimates restrict the number of model hypotheses whereas the position and orientation estimates and their error bounds reduce the number of pose hypotheses needing to be verified. The techniques are applied to several hundred laser radar images of eight different military vehicles with various part classifications and orientations. On-target resolution in azimuth, elevation and range is about 30 cm. The range images contain up to 20% dropouts due to atmospheric absorption. Additionally some target retro-reflectors produce outliers due to signal crosstalk. The presented algorithms are extremely robust with respect to these and other error sources. The hypothesis space for hull orientation is reduced to about 5 degrees as is the error for turret rotation and gun elevation, provided the main gun is visible.

  5. VizieR Online Data Catalog: SDSS-DR8 galaxies classified by WND-CHARM (Kuminski+, 2016)

    NASA Astrophysics Data System (ADS)

    Kuminski, E.; Shamir, L.

    2016-06-01

    The image analysis method used to classify the images is WND-CHARM (wndchrm; Shamir et al. 2008, BMC Source Code for Biology and Medicine, 3: 13; 2010PLSCB...6E0974S; 2013ascl.soft12002S), which first computes 2885 numerical descriptors from each SDSS image such as textures, edges, shapes), the statistical distribution of the pixel intensities, the polynomial decomposition of the image, and fractal features. These features are extracted from the raw pixels, as well as the image transforms and multi-order image transforms. See section 2 for further explanations. In a similar way than the catalog we also compiled a catalog of all objects with spectra in DR8. For each object, that catalog contains the spec ObjID, the R.A., the decl., the z, z error, the certainty of classification as elliptical, the certainty of classification as spiral, and the certainty of classification as a star. See section 3.1 for further explanations. (2 data files).

  6. Human error analysis of commercial aviation accidents: application of the Human Factors Analysis and Classification system (HFACS).

    PubMed

    Wiegmann, D A; Shappell, S A

    2001-11-01

    The Human Factors Analysis and Classification System (HFACS) is a general human error framework originally developed and tested within the U.S. military as a tool for investigating and analyzing the human causes of aviation accidents. Based on Reason's (1990) model of latent and active failures, HFACS addresses human error at all levels of the system, including the condition of aircrew and organizational factors. The purpose of the present study was to assess the utility of the HFACS framework as an error analysis and classification tool outside the military. The HFACS framework was used to analyze human error data associated with aircrew-related commercial aviation accidents that occurred between January 1990 and December 1996 using database records maintained by the NTSB and the FAA. Investigators were able to reliably accommodate all the human causal factors associated with the commercial aviation accidents examined in this study using the HFACS system. In addition, the classification of data using HFACS highlighted several critical safety issues in need of intervention research. These results demonstrate that the HFACS framework can be a viable tool for use within the civil aviation arena. However, additional research is needed to examine its applicability to areas outside the flight deck, such as aircraft maintenance and air traffic control domains.

  7. Human Vision-Motivated Algorithm Allows Consistent Retinal Vessel Classification Based on Local Color Contrast for Advancing General Diagnostic Exams.

    PubMed

    Ivanov, Iliya V; Leitritz, Martin A; Norrenberg, Lars A; Völker, Michael; Dynowski, Marek; Ueffing, Marius; Dietter, Johannes

    2016-02-01

    Abnormalities of blood vessel anatomy, morphology, and ratio can serve as important diagnostic markers for retinal diseases such as AMD or diabetic retinopathy. Large cohort studies demand automated and quantitative image analysis of vascular abnormalities. Therefore, we developed an analytical software tool to enable automated standardized classification of blood vessels supporting clinical reading. A dataset of 61 images was collected from a total of 33 women and 8 men with a median age of 38 years. The pupils were not dilated, and images were taken after dark adaption. In contrast to current methods in which classification is based on vessel profile intensity averages, and similar to human vision, local color contrast was chosen as a discriminator to allow artery vein discrimination and arterial-venous ratio (AVR) calculation without vessel tracking. With 83% ± 1 standard error of the mean for our dataset, we achieved best classification for weighted lightness information from a combination of the red, green, and blue channels. Tested on an independent dataset, our method reached 89% correct classification, which, when benchmarked against conventional ophthalmologic classification, shows significantly improved classification scores. Our study demonstrates that vessel classification based on local color contrast can cope with inter- or intraimage lightness variability and allows consistent AVR calculation. We offer an open-source implementation of this method upon request, which can be integrated into existing tool sets and applied to general diagnostic exams.

  8. Practical Procedures for Constructing Mastery Tests to Minimize Errors of Classification and to Maximize or Optimize Decision Reliability.

    ERIC Educational Resources Information Center

    Byars, Alvin Gregg

    The objectives of this investigation are to develop, describe, assess, and demonstrate procedures for constructing mastery tests to minimize errors of classification and to maximize decision reliability. The guidelines are based on conditions where item exchangeability is a reasonable assumption and the test constructor can control the number of…

  9. Comparing K-mer based methods for improved classification of 16S sequences.

    PubMed

    Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars

    2015-07-01

    The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison on five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other both in data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length. The difference in classification error obtained by the methods seemed to be small, but they were stable and for both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all the five methods reached an error-plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau indicating improved training data is needed to improve classification from here. Classification errors occur most frequent for genera with few sequences present. For improving the taxonomy and testing new classification methods, the need for a better and more universal and robust training data set is crucial.

  10. An experimental evaluation of the incidence of fitness-function/search-algorithm combinations on the classification performance of myoelectric control systems with iPCA tuning

    PubMed Central

    2013-01-01

    Background The information of electromyographic signals can be used by Myoelectric Control Systems (MCSs) to actuate prostheses. These devices allow the performing of movements that cannot be carried out by persons with amputated limbs. The state of the art in the development of MCSs is based on the use of individual principal component analysis (iPCA) as a stage of pre-processing of the classifiers. The iPCA pre-processing implies an optimization stage which has not yet been deeply explored. Methods The present study considers two factors in the iPCA stage: namely A (the fitness function), and B (the search algorithm). The A factor comprises two levels, namely A1 (the classification error) and A2 (the correlation factor). Otherwise, the B factor has four levels, specifically B1 (the Sequential Forward Selection, SFS), B2 (the Sequential Floating Forward Selection, SFFS), B3 (Artificial Bee Colony, ABC), and B4 (Particle Swarm Optimization, PSO). This work evaluates the incidence of each one of the eight possible combinations between A and B factors over the classification error of the MCS. Results A two factor ANOVA was performed on the computed classification errors and determined that: (1) the interactive effects over the classification error are not significative (F0.01,3,72 = 4.0659 > f AB  = 0.09), (2) the levels of factor A have significative effects on the classification error (F0.02,1,72 = 5.0162 < f A  = 6.56), and (3) the levels of factor B over the classification error are not significative (F0.01,3,72 = 4.0659 > f B  = 0.08). Conclusions Considering the classification performance we found a superiority of using the factor A2 in combination with any of the levels of factor B. With respect to the time performance the analysis suggests that the PSO algorithm is at least 14 percent better than its best competitor. The latter behavior has been observed for a particular configuration set of parameters in the search algorithms. Future works will investigate the effect of these parameters in the classification performance, such as length of the reduced size vector, number of particles and bees used during optimal search, the cognitive parameters in the PSO algorithm as well as the limit of cycles to improve a solution in the ABC algorithm. PMID:24369728

  11. Exploring human error in military aviation flight safety events using post-incident classification systems.

    PubMed

    Hooper, Brionny J; O'Hare, David P A

    2013-08-01

    Human error classification systems theoretically allow researchers to analyze postaccident data in an objective and consistent manner. The Human Factors Analysis and Classification System (HFACS) framework is one such practical analysis tool that has been widely used to classify human error in aviation. The Cognitive Error Taxonomy (CET) is another. It has been postulated that the focus on interrelationships within HFACS can facilitate the identification of the underlying causes of pilot error. The CET provides increased granularity at the level of unsafe acts. The aim was to analyze the influence of factors at higher organizational levels on the unsafe acts of front-line operators and to compare the errors of fixed-wing and rotary-wing operations. This study analyzed 288 aircraft incidents involving human error from an Australasian military organization occurring between 2001 and 2008. Action errors accounted for almost twice (44%) the proportion of rotary wing compared to fixed wing (23%) incidents. Both classificatory systems showed significant relationships between precursor factors such as the physical environment, mental and physiological states, crew resource management, training and personal readiness, and skill-based, but not decision-based, acts. The CET analysis showed different predisposing factors for different aspects of skill-based behaviors. Skill-based errors in military operations are more prevalent in rotary wing incidents and are related to higher level supervisory processes in the organization. The Cognitive Error Taxonomy provides increased granularity to HFACS analyses of unsafe acts.

  12. Machine Learning for Discriminating Quantum Measurement Trajectories and Improving Readout.

    PubMed

    Magesan, Easwar; Gambetta, Jay M; Córcoles, A D; Chow, Jerry M

    2015-05-22

    Current methods for classifying measurement trajectories in superconducting qubit systems produce fidelities systematically lower than those predicted by experimental parameters. Here, we place current classification methods within the framework of machine learning (ML) algorithms and improve on them by investigating more sophisticated ML approaches. We find that nonlinear algorithms and clustering methods produce significantly higher assignment fidelities that help close the gap to the fidelity possible under ideal noise conditions. Clustering methods group trajectories into natural subsets within the data, which allows for the diagnosis of systematic errors. We find large clusters in the data associated with T1 processes and show these are the main source of discrepancy between our experimental and ideal fidelities. These error diagnosis techniques help provide a path forward to improve qubit measurements.

  13. VizieR Online Data Catalog: ROSAT detected quasars. I. (Brinkmann+ 1997)

    NASA Astrophysics Data System (ADS)

    Brinkmann, W.; Yuan, W.

    1996-09-01

    We have compiled a sample of all quasars with measured radio emission from the Veron-Cetty - Veron catalogue (1993, VV93 ) detected by ROSAT in the ALL-SKY SURVEY (RASS, Voges 1992), as targets of pointed observations, or as serendipitous sources from pointed observations as publicly available from the ROSAT point source catalogue (ROSAT-SRC, Voges et al. 1995). The total number of ROSAT detected radio quasars from the above three sources is 654 objects. 69 of the objects are classified as radio-quiet using the defining line at a radio-loudness of 1.0, and 10 objects have no classification. The 5GHz data are from the 87GB radio survey, the NED database, or from the Veron-Cetty - Veron catalogue. The power law indices and their errors are estimated from the two hardness ratios given by the SASS assuming Galactic absorption. The X-ray flux densities in the ROSAT band (0.1-2.4keV) are calculated from the count rates using the energy to counts conversion factor for power law spectra and Galactic absorption. For the photon index we use the value obtained for a individual source if the estimated 1 sigma error is smaller than 0.5, otherwise we use the mean value 2.14. (1 data file).

  14. How should children with speech sound disorders be classified? A review and critical evaluation of current classification systems.

    PubMed

    Waring, R; Knight, R

    2013-01-01

    Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of theoretically differing classification systems have been proposed based on either an aetiological (medical) approach, a descriptive-linguistic approach or a processing approach. To describe and review the supporting evidence, and to provide a critical evaluation of the current childhood SSD classification systems. Descriptions of the major specific approaches to classification are reviewed and research papers supporting the reliability and validity of the systems are evaluated. Three specific paediatric SSD classification systems; the aetiologic-based Speech Disorders Classification System, the descriptive-linguistic Differential Diagnosis system, and the processing-based Psycholinguistic Framework are identified as potentially useful in classifying children with SSD into homogeneous subgroups. The Differential Diagnosis system has a growing body of empirical support from clinical population studies, across language error pattern studies and treatment efficacy studies. The Speech Disorders Classification System is currently a research tool with eight proposed subgroups. The Psycholinguistic Framework is a potential bridge to linking cause and surface level speech errors. There is a need for a universally agreed-upon classification system that is useful to clinicians and researchers. The resulting classification system needs to be robust, reliable and valid. A universal classification system would allow for improved tailoring of treatments to subgroups of SSD which may, in turn, lead to improved treatment efficacy. © 2012 Royal College of Speech and Language Therapists.

  15. The application of Aronson's taxonomy to medication errors in nursing.

    PubMed

    Johnson, Maree; Young, Helen

    2011-01-01

    Medication administration is a frequent nursing activity that is prone to error. In this study of 318 self-reported medication incidents (including near misses), very few resulted in patient harm-7% required intervention or prolonged hospitalization or caused temporary harm. Aronson's classification system provided an excellent framework for analysis of the incidents with a close connection between the type of error and the change strategy to minimize medication incidents. Taking a behavioral approach to medication error classification has provided helpful strategies for nurses such as nurse-call cards on patient lockers when patients are absent and checking of medication sign-off by outgoing and incoming staff at handover.

  16. Center for Seismic Studies Final Technical Report, October 1992 through October 1993

    DTIC Science & Technology

    1994-02-07

    SECURITY CLASSIFICATION 18. SECURITY CLASSIFICATION 19. SECURITY CLASSIFICATION 20. LIMITATION OF ABSTRACT OF REPORT OF THIS PAGE OF ABSTRACT...Upper limit of depth error as a function of mb for estimates based on P and S waves for three netowrks : GSETr-2, ALPHA, and ALPHA + a 50 station...U 4A 4 U 4S as 1 I I I Figure 42: Upper limit of depth error as a function of mb for estimatesbased on P and S waves for three netowrk : GSETT-2o ALPHA

  17. Assimilation of a knowledge base and physical models to reduce errors in passive-microwave classifications of sea ice

    NASA Technical Reports Server (NTRS)

    Maslanik, J. A.; Key, J.

    1992-01-01

    An expert system framework has been developed to classify sea ice types using satellite passive microwave data, an operational classification algorithm, spatial and temporal information, ice types estimated from a dynamic-thermodynamic model, output from a neural network that detects the onset of melt, and knowledge about season and region. The rule base imposes boundary conditions upon the ice classification, modifies parameters in the ice algorithm, determines a `confidence' measure for the classified data, and under certain conditions, replaces the algorithm output with model output. Results demonstrate the potential power of such a system for minimizing overall error in the classification and for providing non-expert data users with a means of assessing the usefulness of the classification results for their applications.

  18. Spelling in adolescents with dyslexia: errors and modes of assessment.

    PubMed

    Tops, Wim; Callens, Maaike; Bijn, Evi; Brysbaert, Marc

    2014-01-01

    In this study we focused on the spelling of high-functioning students with dyslexia. We made a detailed classification of the errors in a word and sentence dictation task made by 100 students with dyslexia and 100 matched control students. All participants were in the first year of their bachelor's studies and had Dutch as mother tongue. Three main error categories were distinguished: phonological, orthographic, and grammatical errors (on the basis of morphology and language-specific spelling rules). The results indicated that higher-education students with dyslexia made on average twice as many spelling errors as the controls, with effect sizes of d ≥ 2. When the errors were classified as phonological, orthographic, or grammatical, we found a slight dominance of phonological errors in students with dyslexia. Sentence dictation did not provide more information than word dictation in the correct classification of students with and without dyslexia. © Hammill Institute on Disabilities 2012.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mosher, J.C.; Leahy, R.M.

    A new method for source localization is described that is based on a modification of the well known multiple signal classification (MUSIC) algorithm. In classical MUSIC, the array manifold vector is projected onto an estimate of the signal subspace, but errors in the estimate can make location of multiple sources difficult. Recursively applied and projected (RAP) MUSIC uses each successively located source to form an intermediate array gain matrix, and projects both the array manifold and the signal subspace estimate into its orthogonal complement. The MUSIC projection is then performed in this reduced subspace. Using the metric of principal angles,more » the authors describe a general form of the RAP-MUSIC algorithm for the case of diversely polarized sources. Through a uniform linear array simulation, the authors demonstrate the improved Monte Carlo performance of RAP-MUSIC relative to MUSIC and two other sequential subspace methods, S and IES-MUSIC.« less

  20. A Guide for Setting the Cut-Scores to Minimize Weighted Classification Errors in Test Batteries

    ERIC Educational Resources Information Center

    Grabovsky, Irina; Wainer, Howard

    2017-01-01

    In this article, we extend the methodology of the Cut-Score Operating Function that we introduced previously and apply it to a testing scenario with multiple independent components and different testing policies. We derive analytically the overall classification error rate for a test battery under the policy when several retakes are allowed for…

  1. [Classifications in forensic medicine and their logical basis].

    PubMed

    Kovalev, A V; Shmarov, L A; Ten'kov, A A

    2014-01-01

    The objective of the present study was to characterize the main requirements for the correct construction of classifications used in forensic medicine, with special reference to the errors that occur in the relevant text-books, guidelines, and manuals and the ways to avoid them. This publication continues the series of thematic articles of the authors devoted to the logical errors in the expert conclusions. The preparation of further publications is underway to report the results of the in-depth analysis of the logical errors encountered in expert conclusions, text-books, guidelines, and manuals.

  2. Sensitivity analysis of the GEMS soil organic carbon model to land cover land use classification uncertainties under different climate scenarios in Senegal

    USGS Publications Warehouse

    Dieye, A.M.; Roy, David P.; Hanan, N.P.; Liu, S.; Hansen, M.; Toure, A.

    2012-01-01

    Spatially explicit land cover land use (LCLU) change information is needed to drive biogeochemical models that simulate soil organic carbon (SOC) dynamics. Such information is increasingly being mapped using remotely sensed satellite data with classification schemes and uncertainties constrained by the sensing system, classification algorithms and land cover schemes. In this study, automated LCLU classification of multi-temporal Landsat satellite data were used to assess the sensitivity of SOC modeled by the Global Ensemble Biogeochemical Modeling System (GEMS). The GEMS was run for an area of 1560 km2 in Senegal under three climate change scenarios with LCLU maps generated using different Landsat classification approaches. This research provides a method to estimate the variability of SOC, specifically the SOC uncertainty due to satellite classification errors, which we show is dependent not only on the LCLU classification errors but also on where the LCLU classes occur relative to the other GEMS model inputs.

  3. Evaluation of automated global mapping of Reference Soil Groups of WRB2015

    NASA Astrophysics Data System (ADS)

    Mantel, Stephan; Caspari, Thomas; Kempen, Bas; Schad, Peter; Eberhardt, Einar; Ruiperez Gonzalez, Maria

    2017-04-01

    SoilGrids is an automated system that provides global predictions for standard numeric soil properties at seven standard depths down to 200 cm, currently at spatial resolutions of 1km and 250m. In addition, the system provides predictions of depth to bedrock and distribution of soil classes based on WRB and USDA Soil Taxonomy (ST). In SoilGrids250m(1), soil classes (WRB, version 2006) consist of the RSG and the first prefix qualifier, whereas in SoilGrids1km(2), the soil class was assessed at RSG level. Automated mapping of World Reference Base (WRB) Reference Soil Groups (RSGs) at a global level has great advantages. Maps can be updated in a short time span with relatively little effort when new data become available. To translate soil names of older versions of FAO/WRB and national classification systems of the source data into names according to WRB 2006, correlation tables are used in SoilGrids. Soil properties and classes are predicted independently from each other. This means that the combinations of soil properties for the same cells or soil property-soil class combinations do not necessarily yield logical combinations when the map layers are studied jointly. The model prediction procedure is robust and probably has a low source of error in the prediction of RSGs. It seems that the quality of the original soil classification in the data and the use of correlation tables are the largest sources of error in mapping the RSG distribution patterns. Predicted patterns of dominant RSGs were evaluated in selected areas and sources of error were identified. Suggestions are made for improvement of WRB2015 RSG distribution predictions in SoilGrids. Keywords: Automated global mapping; World Reference Base for Soil Resources; Data evaluation; Data quality assurance References 1 Hengl T, de Jesus JM, Heuvelink GBM, Ruiperez Gonzalez M, Kilibarda M, et al. (2016) SoilGrids250m: global gridded soil information based on Machine Learning. Earth System Science Data (ESSD), in review. 2 Hengl T, de Jesus JM, MacMillan RA, Batjes NH, Heuvelink GBM, et al. (2014) SoilGrids1km — Global Soil Information Based on Automated Mapping. PLoS ONE 9(8): e105992. doi:10.1371/journal.pone.0105992

  4. Survey methods for assessing land cover map accuracy

    USGS Publications Warehouse

    Nusser, S.M.; Klaas, E.E.

    2003-01-01

    The increasing availability of digital photographic materials has fueled efforts by agencies and organizations to generate land cover maps for states, regions, and the United States as a whole. Regardless of the information sources and classification methods used, land cover maps are subject to numerous sources of error. In order to understand the quality of the information contained in these maps, it is desirable to generate statistically valid estimates of accuracy rates describing misclassification errors. We explored a full sample survey framework for creating accuracy assessment study designs that balance statistical and operational considerations in relation to study objectives for a regional assessment of GAP land cover maps. We focused not only on appropriate sample designs and estimation approaches, but on aspects of the data collection process, such as gaining cooperation of land owners and using pixel clusters as an observation unit. The approach was tested in a pilot study to assess the accuracy of Iowa GAP land cover maps. A stratified two-stage cluster sampling design addressed sample size requirements for land covers and the need for geographic spread while minimizing operational effort. Recruitment methods used for private land owners yielded high response rates, minimizing a source of nonresponse error. Collecting data for a 9-pixel cluster centered on the sampled pixel was simple to implement, and provided better information on rarer vegetation classes as well as substantial gains in precision relative to observing data at a single-pixel.

  5. GDF v2.0, an enhanced version of GDF

    NASA Astrophysics Data System (ADS)

    Tsoulos, Ioannis G.; Gavrilis, Dimitris; Dermatas, Evangelos

    2007-12-01

    An improved version of the function estimation program GDF is presented. The main enhancements of the new version include: multi-output function estimation, capability of defining custom functions in the grammar and selection of the error function. The new version has been evaluated on a series of classification and regression datasets, that are widely used for the evaluation of such methods. It is compared to two known neural networks and outperforms them in 5 (out of 10) datasets. Program summaryTitle of program: GDF v2.0 Catalogue identifier: ADXC_v2_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADXC_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 98 147 No. of bytes in distributed program, including test data, etc.: 2 040 684 Distribution format: tar.gz Programming language: GNU C++ Computer: The program is designed to be portable in all systems running the GNU C++ compiler Operating system: Linux, Solaris, FreeBSD RAM: 200000 bytes Classification: 4.9 Does the new version supersede the previous version?: Yes Nature of problem: The technique of function estimation tries to discover from a series of input data a functional form that best describes them. This can be performed with the use of parametric models, whose parameters can adapt according to the input data. Solution method: Functional forms are being created by genetic programming which are approximations for the symbolic regression problem. Reasons for new version: The GDF package was extended in order to be more flexible and user customizable than the old package. The user can extend the package by defining his own error functions and he can extend the grammar of the package by adding new functions to the function repertoire. Also, the new version can perform function estimation of multi-output functions and it can be used for classification problems. Summary of revisions: The following features have been added to the package GDF: Multi-output function approximation. The package can now approximate any function f:R→R. This feature gives also to the package the capability of performing classification and not only regression. User defined function can be added to the repertoire of the grammar, extending the regression capabilities of the package. This feature is limited to 3 functions, but easily this number can be increased. Capability of selecting the error function. The package offers now to the user apart from the mean square error other error functions such as: mean absolute square error, maximum square error. Also, user defined error functions can be added to the set of error functions. More verbose output. The main program displays more information to the user as well as the default values for the parameters. Also, the package gives to the user the capability to define an output file, where the output of the gdf program for the testing set will be stored after the termination of the process. Additional comments: A technical report describing the revisions, experiments and test runs is packaged with the source code. Running time: Depending on the train data.

  6. Evaluating data mining algorithms using molecular dynamics trajectories.

    PubMed

    Tatsis, Vasileios A; Tjortjis, Christos; Tzirakis, Panagiotis

    2013-01-01

    Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the mus time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets which resulted from computer simulations, of a potential enzyme mimetic biomolecule. We evaluated 65 classifiers available in the well-known data mining toolkit Weka, using 'classification' errors to assess algorithmic performance. Results suggest that: (i) 'meta' classifiers perform better than the other groups, when applied to molecular dynamics data sets; (ii) Random Forest and Rotation Forest are the best classifiers for all three data sets; and (iii) classification via clustering yields the highest classification error. Our findings are consistent with bibliographic evidence, suggesting a 'roadmap' for dealing with such data.

  7. A Categorization of Dynamic Analyzers

    NASA Technical Reports Server (NTRS)

    Lujan, Michelle R.

    1997-01-01

    Program analysis techniques and tools are essential to the development process because of the support they provide in detecting errors and deficiencies at different phases of development. The types of information rendered through analysis includes the following: statistical measurements of code, type checks, dataflow analysis, consistency checks, test data,verification of code, and debugging information. Analyzers can be broken into two major categories: dynamic and static. Static analyzers examine programs with respect to syntax errors and structural properties., This includes gathering statistical information on program content, such as the number of lines of executable code, source lines. and cyclomatic complexity. In addition, static analyzers provide the ability to check for the consistency of programs with respect to variables. Dynamic analyzers in contrast are dependent on input and the execution of a program providing the ability to find errors that cannot be detected through the use of static analysis alone. Dynamic analysis provides information on the behavior of a program rather than on the syntax. Both types of analysis detect errors in a program, but dynamic analyzers accomplish this through run-time behavior. This paper focuses on the following broad classification of dynamic analyzers: 1) Metrics; 2) Models; and 3) Monitors. Metrics are those analyzers that provide measurement. The next category, models, captures those analyzers that present the state of the program to the user at specified points in time. The last category, monitors, checks specified code based on some criteria. The paper discusses each classification and the techniques that are included under them. In addition, the role of each technique in the software life cycle is discussed. Familiarization with the tools that measure, model and monitor programs provides a framework for understanding the program's dynamic behavior from different, perspectives through analysis of the input/output data.

  8. The localization of focal heart activity via body surface potential measurements: tests in a heterogeneous torso phantom

    NASA Astrophysics Data System (ADS)

    Wetterling, F.; Liehr, M.; Schimpf, P.; Liu, H.; Haueisen, J.

    2009-09-01

    The non-invasive localization of focal heart activity via body surface potential measurements (BSPM) could greatly benefit the understanding and treatment of arrhythmic heart diseases. However, the in vivo validation of source localization algorithms is rather difficult with currently available measurement techniques. In this study, we used a physical torso phantom composed of different conductive compartments and seven dipoles, which were placed in the anatomical position of the human heart in order to assess the performance of the Recursively Applied and Projected Multiple Signal Classification (RAP-MUSIC) algorithm. Electric potentials were measured on the torso surface for single dipoles with and without further uncorrelated or correlated dipole activity. The localization error averaged 11 ± 5 mm over 22 dipoles, which shows the ability of RAP-MUSIC to distinguish an uncorrelated dipole from surrounding sources activity. For the first time, real computational modelling errors could be included within the validation procedure due to the physically modelled heterogeneities. In conclusion, the introduced heterogeneous torso phantom can be used to validate state-of-the-art algorithms under nearly realistic measurement conditions.

  9. Evaluation criteria for software classification inventories, accuracies, and maps

    NASA Technical Reports Server (NTRS)

    Jayroe, R. R., Jr.

    1976-01-01

    Statistical criteria are presented for modifying the contingency table used to evaluate tabular classification results obtained from remote sensing and ground truth maps. This classification technique contains information on the spatial complexity of the test site, on the relative location of classification errors, on agreement of the classification maps with ground truth maps, and reduces back to the original information normally found in a contingency table.

  10. The search for structure - Object classification in large data sets. [for astronomers

    NASA Technical Reports Server (NTRS)

    Kurtz, Michael J.

    1988-01-01

    Research concerning object classifications schemes are reviewed, focusing on large data sets. Classification techniques are discussed, including syntactic, decision theoretic methods, fuzzy techniques, and stochastic and fuzzy grammars. Consideration is given to the automation of MK classification (Morgan and Keenan, 1973) and other problems associated with the classification of spectra. In addition, the classification of galaxies is examined, including the problems of systematic errors, blended objects, galaxy types, and galaxy clusters.

  11. Evaluation of forest cover estimates for Haiti using supervised classification of Landsat data

    NASA Astrophysics Data System (ADS)

    Churches, Christopher E.; Wampler, Peter J.; Sun, Wanxiao; Smith, Andrew J.

    2014-08-01

    This study uses 2010-2011 Landsat Thematic Mapper (TM) imagery to estimate total forested area in Haiti. The thematic map was generated using radiometric normalization of digital numbers by a modified normalization method utilizing pseudo-invariant polygons (PIPs), followed by supervised classification of the mosaicked image using the Food and Agriculture Organization (FAO) of the United Nations Land Cover Classification System. Classification results were compared to other sources of land-cover data produced for similar years, with an emphasis on the statistics presented by the FAO. Three global land cover datasets (GLC2000, Globcover, 2009, and MODIS MCD12Q1), and a national-scale dataset (a land cover analysis by Haitian National Centre for Geospatial Information (CNIGS)) were reclassified and compared. According to our classification, approximately 32.3% of Haiti's total land area was tree covered in 2010-2011. This result was confirmed using an error-adjusted area estimator, which predicted a tree covered area of 32.4%. Standardization to the FAO's forest cover class definition reduces the amount of tree cover of our supervised classification to 29.4%. This result was greater than the reported FAO value of 4% and the value for the recoded GLC2000 dataset of 7.0%, but is comparable to values for three other recoded datasets: MCD12Q1 (21.1%), Globcover (2009) (26.9%), and CNIGS (19.5%). We propose that at coarse resolutions, the segmented and patchy nature of Haiti's forests resulted in a systematic underestimation of the extent of forest cover. It appears the best explanation for the significant difference between our results, FAO statistics, and compared datasets is the accuracy of the data sources and the resolution of the imagery used for land cover analyses. Analysis of recoded global datasets and results from this study suggest a strong linear relationship (R2 = 0.996 for tree cover) between spatial resolution and land cover estimates.

  12. Procedural Error and Task Interruption

    DTIC Science & Technology

    2016-09-30

    red for research on errors and individual differences . Results indicate predictive validity for fluid intelligence and specifi c forms of work...TERMS procedural error, task interruption, individual differences , fluid intelligence, sleep deprivation 16. SECURITY CLASSIFICATION OF: 17...and individual differences . It generates rich data on several kinds of errors, including procedural errors in which steps are skipped or repeated

  13. A research of selected textural features for detection of asbestos-cement roofing sheets using orthoimages

    NASA Astrophysics Data System (ADS)

    Książek, Judyta

    2015-10-01

    At present, there has been a great interest in the development of texture based image classification methods in many different areas. This study presents the results of research carried out to assess the usefulness of selected textural features for detection of asbestos-cement roofs in orthophotomap classification. Two different orthophotomaps of southern Poland (with ground resolution: 5 cm and 25 cm) were used. On both orthoimages representative samples for two classes: asbestos-cement roofing sheets and other roofing materials were selected. Estimation of texture analysis usefulness was conducted using machine learning methods based on decision trees (C5.0 algorithm). For this purpose, various sets of texture parameters were calculated in MaZda software. During the calculation of decision trees different numbers of texture parameters groups were considered. In order to obtain the best settings for decision trees models cross-validation was performed. Decision trees models with the lowest mean classification error were selected. The accuracy of the classification was held based on validation data sets, which were not used for the classification learning. For 5 cm ground resolution samples, the lowest mean classification error was 15.6%. The lowest mean classification error in the case of 25 cm ground resolution was 20.0%. The obtained results confirm potential usefulness of the texture parameter image processing for detection of asbestos-cement roofing sheets. In order to improve the accuracy another extended study should be considered in which additional textural features as well as spectral characteristics should be analyzed.

  14. Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements.

    PubMed

    Caprihan, A; Pearlson, G D; Calhoun, V D

    2008-08-15

    Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data in two groups, then these set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method, which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the one-leave-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA, than with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.

  15. Pareto-optimal multi-objective dimensionality reduction deep auto-encoder for mammography classification.

    PubMed

    Taghanaki, Saeid Asgari; Kawahara, Jeremy; Miles, Brandon; Hamarneh, Ghassan

    2017-07-01

    Feature reduction is an essential stage in computer aided breast cancer diagnosis systems. Multilayer neural networks can be trained to extract relevant features by encoding high-dimensional data into low-dimensional codes. Optimizing traditional auto-encoders works well only if the initial weights are close to a proper solution. They are also trained to only reduce the mean squared reconstruction error (MRE) between the encoder inputs and the decoder outputs, but do not address the classification error. The goal of the current work is to test the hypothesis that extending traditional auto-encoders (which only minimize reconstruction error) to multi-objective optimization for finding Pareto-optimal solutions provides more discriminative features that will improve classification performance when compared to single-objective and other multi-objective approaches (i.e. scalarized and sequential). In this paper, we introduce a novel multi-objective optimization of deep auto-encoder networks, in which the auto-encoder optimizes two objectives: MRE and mean classification error (MCE) for Pareto-optimal solutions, rather than just MRE. These two objectives are optimized simultaneously by a non-dominated sorting genetic algorithm. We tested our method on 949 X-ray mammograms categorized into 12 classes. The results show that the features identified by the proposed algorithm allow a classification accuracy of up to 98.45%, demonstrating favourable accuracy over the results of state-of-the-art methods reported in the literature. We conclude that adding the classification objective to the traditional auto-encoder objective and optimizing for finding Pareto-optimal solutions, using evolutionary multi-objective optimization, results in producing more discriminative features. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Automatic intelligibility classification of sentence-level pathological speech

    PubMed Central

    Kim, Jangwon; Kumar, Naveen; Tsiartas, Andreas; Li, Ming; Narayanan, Shrikanth S.

    2014-01-01

    Pathological speech usually refers to the condition of speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. Although automatic evaluation of speech intelligibility and quality could come in handy in these scenarios to assist experts in diagnosis and treatment design, the many sources and types of variability often make it a very challenging computational processing problem. In this work we propose novel sentence-level features to capture abnormal variation in the prosodic, voice quality and pronunciation aspects in pathological speech. In addition, we propose a post-classification posterior smoothing scheme which refines the posterior of a test sample based on the posteriors of other test samples. Finally, we perform feature-level fusions and subsystem decision fusion for arriving at a final intelligibility decision. The performances are tested on two pathological speech datasets, the NKI CCRT Speech Corpus (advanced head and neck cancer) and the TORGO database (cerebral palsy or amyotrophic lateral sclerosis), by evaluating classification accuracy without overlapping subjects’ data among training and test partitions. Results show that the feature sets of each of the voice quality subsystem, prosodic subsystem, and pronunciation subsystem, offer significant discriminating power for binary intelligibility classification. We observe that the proposed posterior smoothing in the acoustic space can further reduce classification errors. The smoothed posterior score fusion of subsystems shows the best classification performance (73.5% for unweighted, and 72.8% for weighted, average recalls of the binary classes). PMID:25414544

  17. The Relationship between Occurrence Timing of Dispensing Errors and Subsequent Danger to Patients under the Situation According to the Classification of Drugs by Efficacy.

    PubMed

    Tsuji, Toshikazu; Nagata, Kenichiro; Kawashiri, Takehiro; Yamada, Takaaki; Irisa, Toshihiro; Murakami, Yuko; Kanaya, Akiko; Egashira, Nobuaki; Masuda, Satohiro

    2016-01-01

    There are many reports regarding various medical institutions' attempts at the prevention of dispensing errors. However, the relationship between occurrence timing of dispensing errors and subsequent danger to patients has not been studied under the situation according to the classification of drugs by efficacy. Therefore, we analyzed the relationship between position and time regarding the occurrence of dispensing errors. Furthermore, we investigated the relationship between occurrence timing of them and danger to patients. In this study, dispensing errors and incidents in three categories (drug name errors, drug strength errors, drug count errors) were classified into two groups in terms of its drug efficacy (efficacy similarity (-) group, efficacy similarity (+) group), into three classes in terms of the occurrence timing of dispensing errors (initial phase errors, middle phase errors, final phase errors). Then, the rates of damage shifting from "dispensing errors" to "damage to patients" were compared as an index of danger between two groups and among three classes. Consequently, the rate of damage in "efficacy similarity (-) group" was significantly higher than that in "efficacy similarity (+) group". Furthermore, the rate of damage is the highest in "initial phase errors", the lowest in "final phase errors" among three classes. From the results of this study, it became clear that the earlier the timing of dispensing errors occurs, the more severe the damage to patients becomes.

  18. Classification Model for Forest Fire Hotspot Occurrences Prediction Using ANFIS Algorithm

    NASA Astrophysics Data System (ADS)

    Wijayanto, A. K.; Sani, O.; Kartika, N. D.; Herdiyeni, Y.

    2017-01-01

    This study proposed the application of data mining technique namely Adaptive Neuro-Fuzzy inference system (ANFIS) on forest fires hotspot data to develop classification models for hotspots occurrence in Central Kalimantan. Hotspot is a point that is indicated as the location of fires. In this study, hotspot distribution is categorized as true alarm and false alarm. ANFIS is a soft computing method in which a given inputoutput data set is expressed in a fuzzy inference system (FIS). The FIS implements a nonlinear mapping from its input space to the output space. The method of this study classified hotspots as target objects by correlating spatial attributes data using three folds in ANFIS algorithm to obtain the best model. The best result obtained from the 3rd fold provided low error for training (error = 0.0093676) and also low error testing result (error = 0.0093676). Attribute of distance to road is the most determining factor that influences the probability of true and false alarm where the level of human activities in this attribute is higher. This classification model can be used to develop early warning system of forest fire.

  19. The GALEX Time Domain Survey. I. Selection and Classification of Over a Thousand Ultraviolet Variable Sources

    NASA Astrophysics Data System (ADS)

    Gezari, S.; Martin, D. C.; Forster, K.; Neill, J. D.; Huber, M.; Heckman, T.; Bianchi, L.; Morrissey, P.; Neff, S. G.; Seibert, M.; Schiminovich, D.; Wyder, T. K.; Burgett, W. S.; Chambers, K. C.; Kaiser, N.; Magnier, E. A.; Price, P. A.; Tonry, J. L.

    2013-03-01

    We present the selection and classification of over a thousand ultraviolet (UV) variable sources discovered in ~40 deg2 of GALEX Time Domain Survey (TDS) NUV images observed with a cadence of 2 days and a baseline of observations of ~3 years. The GALEX TDS fields were designed to be in spatial and temporal coordination with the Pan-STARRS1 Medium Deep Survey, which provides deep optical imaging and simultaneous optical transient detections via image differencing. We characterize the GALEX photometric errors empirically as a function of mean magnitude, and select sources that vary at the 5σ level in at least one epoch. We measure the statistical properties of the UV variability, including the structure function on timescales of days and years. We report classifications for the GALEX TDS sample using a combination of optical host colors and morphology, UV light curve characteristics, and matches to archival X-ray, and spectroscopy catalogs. We classify 62% of the sources as active galaxies (358 quasars and 305 active galactic nuclei), and 10% as variable stars (including 37 RR Lyrae, 53 M dwarf flare stars, and 2 cataclysmic variables). We detect a large-amplitude tail in the UV variability distribution for M-dwarf flare stars and RR Lyrae, reaching up to |Δm| = 4.6 mag and 2.9 mag, respectively. The mean amplitude of the structure function for quasars on year timescales is five times larger than observed at optical wavelengths. The remaining unclassified sources include UV-bright extragalactic transients, two of which have been spectroscopically confirmed to be a young core-collapse supernova and a flare from the tidal disruption of a star by dormant supermassive black hole. We calculate a surface density for variable sources in the UV with NUV < 23 mag and |Δm| > 0.2 mag of ~8.0, 7.7, and 1.8 deg-2 for quasars, active galactic nuclei, and RR Lyrae stars, respectively. We also calculate a surface density rate in the UV for transient sources, using the effective survey time at the cadence appropriate to each class, of ~15 and 52 deg-2 yr-1 for M dwarfs and extragalactic transients, respectively.

  20. Classification based upon gene expression data: bias and precision of error rates.

    PubMed

    Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L

    2007-06-01

    Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp

  1. Sensitivity of geographic information system outputs to errors in remotely sensed data

    NASA Technical Reports Server (NTRS)

    Ramapriyan, H. K.; Boyd, R. K.; Gunther, F. J.; Lu, Y. C.

    1981-01-01

    The sensitivity of the outputs of a geographic information system (GIS) to errors in inputs derived from remotely sensed data (RSD) is investigated using a suitability model with per-cell decisions and a gridded geographic data base whose cells are larger than the RSD pixels. The process of preparing RSD as input to a GIS is analyzed, and the errors associated with classification and registration are examined. In the case of the model considered, it is found that the errors caused during classification and registration are partially compensated by the aggregation of pixels. The compensation is quantified by means of an analytical model, a Monte Carlo simulation, and experiments with Landsat data. The results show that error reductions of the order of 50% occur because of aggregation when 25 pixels of RSD are used per cell in the geographic data base.

  2. Development of an FBG Sensor Array for Multi-Impact Source Localization on CFRP Structures.

    PubMed

    Jiang, Mingshun; Sai, Yaozhang; Geng, Xiangyi; Sui, Qingmei; Liu, Xiaohui; Jia, Lei

    2016-10-24

    We proposed and studied an impact detection system based on a fiber Bragg grating (FBG) sensor array and multiple signal classification (MUSIC) algorithm to determine the location and the number of low velocity impacts on a carbon fiber-reinforced polymer (CFRP) plate. A FBG linear array, consisting of seven FBG sensors, was used for detecting the ultrasonic signals from impacts. The edge-filter method was employed for signal demodulation. Shannon wavelet transform was used to extract narrow band signals from the impacts. The Gerschgorin disc theorem was used for estimating the number of impacts. We used the MUSIC algorithm to obtain the coordinates of multi-impacts. The impact detection system was tested on a 500 mm × 500 mm × 1.5 mm CFRP plate. The results show that the maximum error and average error of the multi-impacts' localization are 9.2 mm and 7.4 mm, respectively.

  3. Automated Error Detection in Physiotherapy Training.

    PubMed

    Jovanović, Marko; Seiffarth, Johannes; Kutafina, Ekaterina; Jonas, Stephan M

    2018-01-01

    Manual skills teaching, such as physiotherapy education, requires immediate teacher feedback for the students during the learning process, which to date can only be performed by expert trainers. A machine-learning system trained only on correct performances to classify and score performed movements, to identify sources of errors in the movement and give feedback to the learner. We acquire IMU and sEMG sensor data from a commercial-grade wearable device and construct an HMM-based model for gesture classification, scoring and feedback giving. We evaluate the model on publicly available and self-generated data of an exemplary movement pattern executions. The model achieves an overall accuracy of 90.71% on the public dataset and 98.9% on our dataset. An AUC of 0.99 for the ROC of the scoring method could be achieved to discriminate between correct and untrained incorrect executions. The proposed system demonstrated its suitability for scoring and feedback in manual skills training.

  4. Audio visual speech source separation via improved context dependent association model

    NASA Astrophysics Data System (ADS)

    Kazemi, Alireza; Boostani, Reza; Sobhanmanesh, Fariborz

    2014-12-01

    In this paper, we exploit the non-linear relation between a speech source and its associated lip video as a source of extra information to propose an improved audio-visual speech source separation (AVSS) algorithm. The audio-visual association is modeled using a neural associator which estimates the visual lip parameters from a temporal context of acoustic observation frames. We define an objective function based on mean square error (MSE) measure between estimated and target visual parameters. This function is minimized for estimation of the de-mixing vector/filters to separate the relevant source from linear instantaneous or time-domain convolutive mixtures. We have also proposed a hybrid criterion which uses AV coherency together with kurtosis as a non-Gaussianity measure. Experimental results are presented and compared in terms of visually relevant speech detection accuracy and output signal-to-interference ratio (SIR) of source separation. The suggested audio-visual model significantly improves relevant speech classification accuracy compared to existing GMM-based model and the proposed AVSS algorithm improves the speech separation quality compared to reference ICA- and AVSS-based methods.

  5. Error Analysis in Composition of Iranian Lower Intermediate Students

    ERIC Educational Resources Information Center

    Taghavi, Mehdi

    2012-01-01

    Learners make errors during the process of learning languages. This study examines errors in writing task of twenty Iranian lower intermediate male students aged between 13 and 15. A subject was given to the participants was a composition about the seasons of a year. All of the errors were identified and classified. Corder's classification (1967)…

  6. Combining multiple decisions: applications to bioinformatics

    NASA Astrophysics Data System (ADS)

    Yukinawa, N.; Takenouchi, T.; Oba, S.; Ishii, S.

    2008-01-01

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods.

  7. Credit Risk Evaluation Using a C-Variable Least Squares Support Vector Classification Model

    NASA Astrophysics Data System (ADS)

    Yu, Lean; Wang, Shouyang; Lai, K. K.

    Credit risk evaluation is one of the most important issues in financial risk management. In this paper, a C-variable least squares support vector classification (C-VLSSVC) model is proposed for credit risk analysis. The main idea of this model is based on the prior knowledge that different classes may have different importance for modeling and more weights should be given to those classes with more importance. The C-VLSSVC model can be constructed by a simple modification of the regularization parameter in LSSVC, whereby more weights are given to the lease squares classification errors with important classes than the lease squares classification errors with unimportant classes while keeping the regularized terms in its original form. For illustration purpose, a real-world credit dataset is used to test the effectiveness of the C-VLSSVC model.

  8. Using beta binomials to estimate classification uncertainty for ensemble models.

    PubMed

    Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin

    2014-01-01

    Quantitative structure-activity (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.

  9. Textural Analysis and Substrate Classification in the Nearshore Region of Lake Superior Using High-Resolution Multibeam Bathymetry

    NASA Astrophysics Data System (ADS)

    Dennison, Andrew G.

    Classification of the seafloor substrate can be done with a variety of methods. These methods include Visual (dives, drop cameras); mechanical (cores, grab samples); acoustic (statistical analysis of echosounder returns). Acoustic methods offer a more powerful and efficient means of collecting useful information about the bottom type. Due to the nature of an acoustic survey, larger areas can be sampled, and by combining the collected data with visual and mechanical survey methods provide greater confidence in the classification of a mapped region. During a multibeam sonar survey, both bathymetric and backscatter data is collected. It is well documented that the statistical characteristic of a sonar backscatter mosaic is dependent on bottom type. While classifying the bottom-type on the basis on backscatter alone can accurately predict and map bottom-type, i.e a muddy area from a rocky area, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing of high-resolution multibeam data can capture the pertinent details about the bottom-type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features, and provide the basis for an accurate classification scheme. The development of a new classification method is described here. It is based upon the analysis of textural features in conjunction with ground truth sampling. The processing and classification result of two geologically distinct areas in nearshore regions of Lake Superior; off the Lester River,MN and Amnicon River, WI are presented here, using the Minnesota Supercomputer Institute's Mesabi computing cluster for initial processing. Processed data is then calibrated using ground truth samples to conduct an accuracy assessment of the surveyed areas. From analysis of high-resolution bathymetry data collected at both survey sites is was possible to successfully calculate a series of measures that describe textural information about the lake floor. Further processing suggests that the features calculated capture a significant amount of statistical information about the lake floor terrain as well. Two sources of error, an anomalous heave and refraction error significantly deteriorated the quality of the processed data and resulting validate results. Ground truth samples used to validate the classification methods utilized for both survey sites, however, resulted in accuracy values ranging from 5 -30 percent at the Amnicon River, and between 60-70 percent for the Lester River. The final results suggest that this new processing methodology does adequately capture textural information about the lake floor and does provide an acceptable classification in the absence of significant data quality issues.

  10. Original and Mirror Face Images and Minimum Squared Error Classification for Visible Light Face Recognition.

    PubMed

    Wang, Rong

    2015-01-01

    In real-world applications, the image of faces varies with illumination, facial expression, and poses. It seems that more training samples are able to reveal possible images of the faces. Though minimum squared error classification (MSEC) is a widely used method, its applications on face recognition usually suffer from the problem of a limited number of training samples. In this paper, we improve MSEC by using the mirror faces as virtual training samples. We obtained the mirror faces generated from original training samples and put these two kinds of samples into a new set. The face recognition experiments show that our method does obtain high accuracy performance in classification.

  11. Development of an Independent Global Land Cover Validation Dataset

    NASA Astrophysics Data System (ADS)

    Sulla-Menashe, D. J.; Olofsson, P.; Woodcock, C. E.; Holden, C.; Metcalfe, M.; Friedl, M. A.; Stehman, S. V.; Herold, M.; Giri, C.

    2012-12-01

    Accurate information related to the global distribution and dynamics in global land cover is critical for a large number of global change science questions. A growing number of land cover products have been produced at regional to global scales, but the uncertainty in these products and the relative strengths and weaknesses among available products are poorly characterized. To address this limitation we are compiling a database of high spatial resolution imagery to support international land cover validation studies. Validation sites were selected based on a probability sample, and may therefore be used to estimate statistically defensible accuracy statistics and associated standard errors. Validation site locations were identified using a stratified random design based on 21 strata derived from an intersection of Koppen climate classes and a population density layer. In this way, the two major sources of global variation in land cover (climate and human activity) are explicitly included in the stratification scheme. At each site we are acquiring high spatial resolution (< 1-m) satellite imagery for 5-km x 5-km blocks. The response design uses an object-oriented hierarchical legend that is compatible with the UN FAO Land Cover Classification System. Using this response design, we are classifying each site using a semi-automated algorithm that blends image segmentation with a supervised RandomForest classification algorithm. In the long run, the validation site database is designed to support international efforts to validate land cover products. To illustrate, we use the site database to validate the MODIS Collection 4 Land Cover product, providing a prototype for validating the VIIRS Surface Type Intermediate Product scheduled to start operational production early in 2013. As part of our analysis we evaluate sources of error in coarse resolution products including semantic issues related to the class definitions, mixed pixels, and poor spectral separation between classes.

  12. Rank preserving sparse learning for Kinect based scene classification.

    PubMed

    Tao, Dapeng; Jin, Lianwen; Yang, Zhao; Li, Xuelong

    2013-10-01

    With the rapid development of the RGB-D sensors and the promptly growing population of the low-cost Microsoft Kinect sensor, scene classification, which is a hard, yet important, problem in computer vision, has gained a resurgence of interest recently. That is because the depth of information provided by the Kinect sensor opens an effective and innovative way for scene classification. In this paper, we propose a new scheme for scene classification, which applies locality-constrained linear coding (LLC) to local SIFT features for representing the RGB-D samples and classifies scenes through the cooperation between a new rank preserving sparse learning (RPSL) based dimension reduction and a simple classification method. RPSL considers four aspects: 1) it preserves the rank order information of the within-class samples in a local patch; 2) it maximizes the margin between the between-class samples on the local patch; 3) the L1-norm penalty is introduced to obtain the parsimony property; and 4) it models the classification error minimization by utilizing the least-squares error minimization. Experiments are conducted on the NYU Depth V1 dataset and demonstrate the robustness and effectiveness of RPSL for scene classification.

  13. Comparison of two Classification methods (MLC and SVM) to extract land use and land cover in Johor Malaysia

    NASA Astrophysics Data System (ADS)

    Rokni Deilmai, B.; Ahmad, B. Bin; Zabihi, H.

    2014-06-01

    Mapping is essential for the analysis of the land use and land cover, which influence many environmental processes and properties. For the purpose of the creation of land cover maps, it is important to minimize error. These errors will propagate into later analyses based on these land cover maps. The reliability of land cover maps derived from remotely sensed data depends on an accurate classification. In this study, we have analyzed multispectral data using two different classifiers including Maximum Likelihood Classifier (MLC) and Support Vector Machine (SVM). To pursue this aim, Landsat Thematic Mapper data and identical field-based training sample datasets in Johor Malaysia used for each classification method, which results indicate in five land cover classes forest, oil palm, urban area, water, rubber. Classification results indicate that SVM was more accurate than MLC. With demonstrated capability to produce reliable cover results, the SVM methods should be especially useful for land cover classification.

  14. On the Discriminant Analysis in the 2-Populations Case

    NASA Astrophysics Data System (ADS)

    Rublík, František

    2008-01-01

    The empirical Bayes Gaussian rule, which in the normal case yields good values of the probability of total error, may yield high values of the maximum probability error. From this point of view the presented modified version of the classification rule of Broffitt, Randles and Hogg appears to be superior. The modification included in this paper is termed as a WR method, and the choice of its weights is discussed. The mentioned methods are also compared with the K nearest neighbours classification rule.

  15. A Practical and Automated Approach to Large Area Forest Disturbance Mapping with Remote Sensing

    PubMed Central

    Ozdogan, Mutlu

    2014-01-01

    In this paper, I describe a set of procedures that automate forest disturbance mapping using a pair of Landsat images. The approach is built on the traditional pair-wise change detection method, but is designed to extract training data without user interaction and uses a robust classification algorithm capable of handling incorrectly labeled training data. The steps in this procedure include: i) creating masks for water, non-forested areas, clouds, and cloud shadows; ii) identifying training pixels whose value is above or below a threshold defined by the number of standard deviations from the mean value of the histograms generated from local windows in the short-wave infrared (SWIR) difference image; iii) filtering the original training data through a number of classification algorithms using an n-fold cross validation to eliminate mislabeled training samples; and finally, iv) mapping forest disturbance using a supervised classification algorithm. When applied to 17 Landsat footprints across the U.S. at five-year intervals between 1985 and 2010, the proposed approach produced forest disturbance maps with 80 to 95% overall accuracy, comparable to those obtained from traditional approaches to forest change detection. The primary sources of mis-classification errors included inaccurate identification of forests (errors of commission), issues related to the land/water mask, and clouds and cloud shadows missed during image screening. The approach requires images from the peak growing season, at least for the deciduous forest sites, and cannot readily distinguish forest harvest from natural disturbances or other types of land cover change. The accuracy of detecting forest disturbance diminishes with the number of years between the images that make up the image pair. Nevertheless, the relatively high accuracies, little or no user input needed for processing, speed of map production, and simplicity of the approach make the new method especially practical for forest cover change analysis over very large regions. PMID:24717283

  16. A practical and automated approach to large area forest disturbance mapping with remote sensing.

    PubMed

    Ozdogan, Mutlu

    2014-01-01

    In this paper, I describe a set of procedures that automate forest disturbance mapping using a pair of Landsat images. The approach is built on the traditional pair-wise change detection method, but is designed to extract training data without user interaction and uses a robust classification algorithm capable of handling incorrectly labeled training data. The steps in this procedure include: i) creating masks for water, non-forested areas, clouds, and cloud shadows; ii) identifying training pixels whose value is above or below a threshold defined by the number of standard deviations from the mean value of the histograms generated from local windows in the short-wave infrared (SWIR) difference image; iii) filtering the original training data through a number of classification algorithms using an n-fold cross validation to eliminate mislabeled training samples; and finally, iv) mapping forest disturbance using a supervised classification algorithm. When applied to 17 Landsat footprints across the U.S. at five-year intervals between 1985 and 2010, the proposed approach produced forest disturbance maps with 80 to 95% overall accuracy, comparable to those obtained from traditional approaches to forest change detection. The primary sources of mis-classification errors included inaccurate identification of forests (errors of commission), issues related to the land/water mask, and clouds and cloud shadows missed during image screening. The approach requires images from the peak growing season, at least for the deciduous forest sites, and cannot readily distinguish forest harvest from natural disturbances or other types of land cover change. The accuracy of detecting forest disturbance diminishes with the number of years between the images that make up the image pair. Nevertheless, the relatively high accuracies, little or no user input needed for processing, speed of map production, and simplicity of the approach make the new method especially practical for forest cover change analysis over very large regions.

  17. A classification of errors in lay comprehension of medical documents.

    PubMed

    Keselman, Alla; Smith, Catherine Arnott

    2012-12-01

    Emphasis on participatory medicine requires that patients and consumers participate in tasks traditionally reserved for healthcare providers. This includes reading and comprehending medical documents, often but not necessarily in the context of interacting with Personal Health Records (PHRs). Research suggests that while giving patients access to medical documents has many benefits (e.g., improved patient-provider communication), lay people often have difficulty understanding medical information. Informatics can address the problem by developing tools that support comprehension; this requires in-depth understanding of the nature and causes of errors that lay people make when comprehending clinical documents. The objective of this study was to develop a classification scheme of comprehension errors, based on lay individuals' retellings of two documents containing clinical text: a description of a clinical trial and a typical office visit note. While not comprehensive, the scheme can serve as a foundation of further development of a taxonomy of patients' comprehension errors. Eighty participants, all healthy volunteers, read and retold two medical documents. A data-driven content analysis procedure was used to extract and classify retelling errors. The resulting hierarchical classification scheme contains nine categories and 23 subcategories. The most common error made by the participants involved incorrectly recalling brand names of medications. Other common errors included misunderstanding clinical concepts, misreporting the objective of a clinical research study and physician's findings during a patient's visit, and confusing and misspelling clinical terms. A combination of informatics support and health education is likely to improve the accuracy of lay comprehension of medical documents. Published by Elsevier Inc.

  18. An extension of the receiver operating characteristic curve and AUC-optimal classification.

    PubMed

    Takenouchi, Takashi; Komori, Osamu; Eguchi, Shinto

    2012-10-01

    While most proposed methods for solving classification problems focus on minimization of the classification error rate, we are interested in the receiver operating characteristic (ROC) curve, which provides more information about classification performance than the error rate does. The area under the ROC curve (AUC) is a natural measure for overall assessment of a classifier based on the ROC curve. We discuss a class of concave functions for AUC maximization in which a boosting-type algorithm including RankBoost is considered, and the Bayesian risk consistency and the lower bound of the optimum function are discussed. A procedure derived by maximizing a specific optimum function has high robustness, based on gross error sensitivity. Additionally, we focus on the partial AUC, which is the partial area under the ROC curve. For example, in medical screening, a high true-positive rate to the fixed lower false-positive rate is preferable and thus the partial AUC corresponding to lower false-positive rates is much more important than the remaining AUC. We extend the class of concave optimum functions for partial AUC optimality with the boosting algorithm. We investigated the validity of the proposed method through several experiments with data sets in the UCI repository.

  19. Review of medication errors that are new or likely to occur more frequently with electronic medication management systems.

    PubMed

    Van de Vreede, Melita; McGrath, Anne; de Clifford, Jan

    2018-05-14

    Objective. The aim of the present study was to identify and quantify medication errors reportedly related to electronic medication management systems (eMMS) and those considered likely to occur more frequently with eMMS. This included developing a new classification system relevant to eMMS errors. Methods. Eight Victorian hospitals with eMMS participated in a retrospective audit of reported medication incidents from their incident reporting databases between May and July 2014. Site-appointed project officers submitted deidentified incidents they deemed new or likely to occur more frequently due to eMMS, together with the Incident Severity Rating (ISR). The authors reviewed and classified incidents. Results. There were 5826 medication-related incidents reported. In total, 93 (47 prescribing errors, 46 administration errors) were identified as new or potentially related to eMMS. Only one ISR2 (moderate) and no ISR1 (severe or death) errors were reported, so harm to patients in this 3-month period was minimal. The most commonly reported error types were 'human factors' and 'unfamiliarity or training' (70%) and 'cross-encounter or hybrid system errors' (22%). Conclusions. Although the results suggest that the errors reported were of low severity, organisations must remain vigilant to the risk of new errors and avoid the assumption that eMMS is the panacea to all medication error issues. What is known about the topic? eMMS have been shown to reduce some types of medication errors, but it has been reported that some new medication errors have been identified and some are likely to occur more frequently with eMMS. There are few published Australian studies that have reported on medication error types that are likely to occur more frequently with eMMS in more than one organisation and that include administration and prescribing errors. What does this paper add? This paper includes a new simple classification system for eMMS that is useful and outlines the most commonly reported incident types and can inform organisations and vendors on possible eMMS improvements. The paper suggests a new classification system for eMMS medication errors. What are the implications for practitioners? The results of the present study will highlight to organisations the need for ongoing review of system design, refinement of workflow issues, staff education and training and reporting and monitoring of errors.

  20. EEG error potentials detection and classification using time-frequency features for robot reinforcement learning.

    PubMed

    Boubchir, Larbi; Touati, Youcef; Daachi, Boubaker; Chérif, Arab Ali

    2015-08-01

    In thought-based steering of robots, error potentials (ErrP) can appear when the action resulting from the brain-machine interface (BMI) classifier/controller does not correspond to the user's thought. Using the Steady State Visual Evoked Potentials (SSVEP) techniques, ErrP, which appear when a classification error occurs, are not easily recognizable by only examining the temporal or frequency characteristics of EEG signals. A supplementary classification process is therefore needed to identify them in order to stop the course of the action and back up to a recovery state. This paper presents a set of time-frequency (t-f) features for the detection and classification of EEG ErrP in extra-brain activities due to misclassification observed by a user exploiting non-invasive BMI and robot control in the task space. The proposed features are able to characterize and detect ErrP activities in the t-f domain. These features are derived from the information embedded in the t-f representation of EEG signals, and include the Instantaneous Frequency (IF), t-f information complexity, SVD information, energy concentration and sub-bands' energies. The experiment results on real EEG data show that the use of the proposed t-f features for detecting and classifying EEG ErrP achieved an overall classification accuracy up to 97% for 50 EEG segments using 2-class SVM classifier.

  1. Neural network approaches versus statistical methods in classification of multisource remote sensing data

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.

    1990-01-01

    Neural network learning procedures and statistical classificaiton methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that two different approaches have unique advantages and disadvantages in this classification application.

  2. Evaluation of spatial filtering on the accuracy of wheat area estimate

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Moreira, M. A.; Chen, S. C.; Delima, A. M.

    1982-01-01

    A 3 x 3 pixel spatial filter for postclassification was used for wheat classification to evaluate the effects of this procedure on the accuracy of area estimation using LANDSAT digital data obtained from a single pass. Quantitative analyses were carried out in five test sites (approx 40 sq km each) and t tests showed that filtering with threshold values significantly decreased errors of commission and omission. In area estimation filtering improved the overestimate of 4.5% to 2.7% and the root-mean-square error decreased from 126.18 ha to 107.02 ha. Extrapolating the same procedure of automatic classification using spatial filtering for postclassification to the whole study area, the accuracy in area estimate was improved from the overestimate of 10.9% to 9.7%. It is concluded that when single pass LANDSAT data is used for crop identification and area estimation the postclassification procedure using a spatial filter provides a more accurate area estimate by reducing classification errors.

  3. Maximum likelihood estimation of label imperfections and its use in the identification of mislabeled patterns

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.

  4. Data fusion and classification using a hybrid intrinsic cellular inference network

    NASA Astrophysics Data System (ADS)

    Woodley, Robert; Walenz, Brett; Seiffertt, John; Robinette, Paul; Wunsch, Donald

    2010-04-01

    Hybrid Intrinsic Cellular Inference Network (HICIN) is designed for battlespace decision support applications. We developed an automatic method of generating hypotheses for an entity-attribute classifier. The capability and effectiveness of a domain specific ontology was used to generate automatic categories for data classification. Heterogeneous data is clustered using an Adaptive Resonance Theory (ART) inference engine on a sample (unclassified) data set. The data set is the Lahman baseball database. The actual data is immaterial to the architecture, however, parallels in the data can be easily drawn (i.e., "Team" maps to organization, "Runs scored/allowed" to Measure of organization performance (positive/negative), "Payroll" to organization resources, etc.). Results show that HICIN classifiers create known inferences from the heterogonous data. These inferences are not explicitly stated in the ontological description of the domain and are strictly data driven. HICIN uses data uncertainty handling to reduce errors in the classification. The uncertainty handling is based on subjective logic. The belief mass allows evidence from multiple sources to be mathematically combined to increase or discount an assertion. In military operations the ability to reduce uncertainty will be vital in the data fusion operation.

  5. Application of FTIR-ATR spectroscopy coupled with multivariate analysis for rapid estimation of butter adulteration.

    PubMed

    Fadzlillah, Nurrulhidayah Ahmad; Rohman, Abdul; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi

    2013-01-01

    In dairy product sector, butter is one of the potential sources of fat soluble vitamins, namely vitamin A, D, E, K; consequently, butter is taken into account as high valuable price from other dairy products. This fact has attracted unscrupulous market players to blind butter with other animal fats to gain economic profit. Animal fats like mutton fat (MF) are potential to be mixed with butter due to the similarity in terms of fatty acid composition. This study focused on the application of FTIR-ATR spectroscopy in conjunction with chemometrics for classification and quantification of MF as adulterant in butter. The FTIR spectral region of 3910-710 cm⁻¹ was used for classification between butter and butter blended with MF at various concentrations with the aid of discriminant analysis (DA). DA is able to classify butter and adulterated butter without any mistakenly grouped. For quantitative analysis, partial least square (PLS) regression was used to develop a calibration model at the frequency regions of 3910-710 cm⁻¹. The equation obtained for the relationship between actual value of MF and FTIR predicted values of MF in PLS calibration model was y = 0.998x + 1.033, with the values of coefficient of determination (R²) and root mean square error of calibration are 0.998 and 0.046% (v/v), respectively. The PLS calibration model was subsequently used for the prediction of independent samples containing butter in the binary mixtures with MF. Using 9 principal components, root mean square error of prediction (RMSEP) is 1.68% (v/v). The results showed that FTIR spectroscopy can be used for the classification and quantification of MF in butter formulation for verification purposes.

  6. Misclassification Errors in Unsupervised Classification Methods. Comparison Based on the Simulation of Targeted Proteomics Data

    PubMed Central

    Andreev, Victor P; Gillespie, Brenda W; Helfand, Brian T; Merion, Robert M

    2016-01-01

    Unsupervised classification methods are gaining acceptance in omics studies of complex common diseases, which are often vaguely defined and are likely the collections of disease subtypes. Unsupervised classification based on the molecular signatures identified in omics studies have the potential to reflect molecular mechanisms of the subtypes of the disease and to lead to more targeted and successful interventions for the identified subtypes. Multiple classification algorithms exist but none is ideal for all types of data. Importantly, there are no established methods to estimate sample size in unsupervised classification (unlike power analysis in hypothesis testing). Therefore, we developed a simulation approach allowing comparison of misclassification errors and estimating the required sample size for a given effect size, number, and correlation matrix of the differentially abundant proteins in targeted proteomics studies. All the experiments were performed in silico. The simulated data imitated the expected one from the study of the plasma of patients with lower urinary tract dysfunction with the aptamer proteomics assay Somascan (SomaLogic Inc, Boulder, CO), which targeted 1129 proteins, including 330 involved in inflammation, 180 in stress response, 80 in aging, etc. Three popular clustering methods (hierarchical, k-means, and k-medoids) were compared. K-means clustering performed much better for the simulated data than the other two methods and enabled classification with misclassification error below 5% in the simulated cohort of 100 patients based on the molecular signatures of 40 differentially abundant proteins (effect size 1.5) from among the 1129-protein panel. PMID:27524871

  7. Constraints as a destriping tool for Hires images

    NASA Technical Reports Server (NTRS)

    Cao, YU; Prince, Thomas A.

    1994-01-01

    Images produced from the Maximum Correlation Method sometimes suffer from visible striping artifacts, especially for areas of extended sources. Possible causes are different baseline levels and calibration errors in the detectors. We incorporated these factors into the MCM algorithm, and tested the effects of different constraints on the output image. The result shows significant visual improvement over the standard MCM Method. In some areas the new images show intelligible structures that are otherwise corrupted by striping artifacts, and the removal of these artifacts could enhance performance of object classification algorithms. The constraints were also tested on low surface brightness areas, and were found to be effective in reducing the noise level.

  8. Development and Validation of Various Phenotyping Algorithms for Diabetes Mellitus Using Data from Electronic Health Records.

    PubMed

    Esteban, Santiago; Rodríguez Tablado, Manuel; Peper, Francisco; Mahumud, Yamila S; Ricci, Ricardo I; Kopitowski, Karin; Terrasa, Sergio

    2017-01-01

    Precision medicine requires extremely large samples. Electronic health records (EHR) are thought to be a cost-effective source of data for that purpose. Phenotyping algorithms help reduce classification errors, making EHR a more reliable source of information for research. Four algorithm development strategies for classifying patients according to their diabetes status (diabetics; non-diabetics; inconclusive) were tested (one codes-only algorithm; one boolean algorithm, four statistical learning algorithms and six stacked generalization meta-learners). The best performing algorithms within each strategy were tested on the validation set. The stacked generalization algorithm yielded the highest Kappa coefficient value in the validation set (0.95 95% CI 0.91, 0.98). The implementation of these algorithms allows for the exploitation of data from thousands of patients accurately, greatly reducing the costs of constructing retrospective cohorts for research.

  9. Computer discrimination procedures applicable to aerial and ERTS multispectral data

    NASA Technical Reports Server (NTRS)

    Richardson, A. J.; Torline, R. J.; Allen, W. A.

    1970-01-01

    Two statistical models are compared in the classification of crops recorded on color aerial photographs. A theory of error ellipses is applied to the pattern recognition problem. An elliptical boundary condition classification model (EBC), useful for recognition of candidate patterns, evolves out of error ellipse theory. The EBC model is compared with the minimum distance to the mean (MDM) classification model in terms of pattern recognition ability. The pattern recognition results of both models are interpreted graphically using scatter diagrams to represent measurement space. Measurement space, for this report, is determined by optical density measurements collected from Kodak Ektachrome Infrared Aero Film 8443 (EIR). The EBC model is shown to be a significant improvement over the MDM model.

  10. Statistical learning from nonrecurrent experience with discrete input variables and recursive-error-minimization equations

    NASA Astrophysics Data System (ADS)

    Carter, Jeffrey R.; Simon, Wayne E.

    1990-08-01

    Neural networks are trained using Recursive Error Minimization (REM) equations to perform statistical classification. Using REM equations with continuous input variables reduces the required number of training experiences by factors of one to two orders of magnitude over standard back propagation. Replacing the continuous input variables with discrete binary representations reduces the number of connections by a factor proportional to the number of variables reducing the required number of experiences by another order of magnitude. Undesirable effects of using recurrent experience to train neural networks for statistical classification problems are demonstrated and nonrecurrent experience used to avoid these undesirable effects. 1. THE 1-41 PROBLEM The statistical classification problem which we address is is that of assigning points in ddimensional space to one of two classes. The first class has a covariance matrix of I (the identity matrix) the covariance matrix of the second class is 41. For this reason the problem is known as the 1-41 problem. Both classes have equal probability of occurrence and samples from both classes may appear anywhere throughout the ddimensional space. Most samples near the origin of the coordinate system will be from the first class while most samples away from the origin will be from the second class. Since the two classes completely overlap it is impossible to have a classifier with zero error. The minimum possible error is known as the Bayes error and

  11. Calibration of remotely sensed proportion or area estimates for misclassification error

    Treesearch

    Raymond L. Czaplewski; Glenn P. Catts

    1992-01-01

    Classifications of remotely sensed data contain misclassification errors that bias areal estimates. Monte Carlo techniques were used to compare two statistical methods that correct or calibrate remotely sensed areal estimates for misclassification bias using reference data from an error matrix. The inverse calibration estimator was consistently superior to the...

  12. Mapping gully-affected areas in the region of Taroudannt, Morocco based on Object-Based Image Analysis (OBIA)

    NASA Astrophysics Data System (ADS)

    d'Oleire-Oltmanns, Sebastian; Marzolff, Irene; Tiede, Dirk; Blaschke, Thomas

    2015-04-01

    The need for area-wide landform mapping approaches, especially in terms of land degradation, can be ascribed to the fact that within area-wide landform mapping approaches, the (spatial) context of erosional landforms is considered by providing additional information on the physiography neighboring the distinct landform. This study presents an approach for the detection of gully-affected areas by applying object-based image analysis in the region of Taroudannt, Morocco, which is highly affected by gully erosion while simultaneously representing a major region of agro-industry with a high demand of arable land. Various sensors provide readily available high-resolution optical satellite data with a much better temporal resolution than 3D terrain data which lead to the development of an area-wide mapping approach to extract gully-affected areas using only optical satellite imagery. The classification rule-set was developed with a clear focus on virtual spatial independence within the software environment of eCognition Developer. This allows the incorporation of knowledge about the target objects under investigation. Only optical QuickBird-2 satellite data and freely-available OpenStreetMap (OSM) vector data were used as input data. The OSM vector data were incorporated in order to mask out plantations and residential areas. Optical input data are more readily available for a broad range of users compared to terrain data, which is considered to be a major advantage. The methodology additionally incorporates expert knowledge and freely-available vector data in a cyclic object-based image analysis approach. This connects the two fields of geomorphology and remote sensing. The classification results allow conclusions on the current distribution of gullies. The results of the classification were checked against manually delineated reference data incorporating expert knowledge based on several field campaigns in the area, resulting in an overall classification accuracy of 62%. The error of omission accounts for 38% and the error of commission for 16%, respectively. Additionally, a manual assessment was carried out to assess the quality of the applied classification algorithm. The limited error of omission contributes with 23% to the overall error of omission and the limited error of commission contributes with 98% to the overall error of commission. This assessment improves the results and confirms the high quality of the developed approach for area-wide mapping of gully-affected areas in larger regions. In the field of landform mapping, the overall quality of the classification results is often assessed with more than one method to incorporate all aspects adequately.

  13. The Effectiveness of Using Limited Gauge Measurements for Bias Adjustment of Satellite-Based Precipitation Estimation over Saudi Arabia

    NASA Astrophysics Data System (ADS)

    Alharbi, Raied; Hsu, Kuolin; Sorooshian, Soroosh; Braithwaite, Dan

    2018-01-01

    Precipitation is a key input variable for hydrological and climate studies. Rain gauges are capable of providing reliable precipitation measurements at point scale. However, the uncertainty of rain measurements increases when the rain gauge network is sparse. Satellite -based precipitation estimations appear to be an alternative source of precipitation measurements, but they are influenced by systematic bias. In this study, a method for removing the bias from the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Cloud Classification System (PERSIANN-CCS) over a region where the rain gauge is sparse is investigated. The method consists of monthly empirical quantile mapping, climate classification, and inverse-weighted distance method. Daily PERSIANN-CCS is selected to test the capability of the method for removing the bias over Saudi Arabia during the period of 2010 to 2016. The first six years (2010 - 2015) are calibrated years and 2016 is used for validation. The results show that the yearly correlation coefficient was enhanced by 12%, the yearly mean bias was reduced by 93% during validated year. Root mean square error was reduced by 73% during validated year. The correlation coefficient, the mean bias, and the root mean square error show that the proposed method removes the bias on PERSIANN-CCS effectively that the method can be applied to other regions where the rain gauge network is sparse.

  14. An evaluation of the quality of obstetric morbidity coding using an objective assessment tool, the Performance Indicators For Coding Quality (PICQ).

    PubMed

    Lamb, Mary K; Innes, Kerry; Saad, Patricia; Rust, Julie; Dimitropoulos, Vera; Cumerlato, Megan

    The Performance Indicators for Coding Quality (PICQ) is a data quality assessment tool developed by Australia's National Centre for Classification in Health (NCCH). PICQ consists of a number of indicators covering all ICD-10-AM disease chapters, some procedure chapters from the Australian Classification of Health Intervention (ACHI) and some Australian Coding Standards (ACS). The indicators can be used to assess the coding quality of hospital morbidity data by monitoring compliance of coding conventions and ACS; this enables the identification of particular records that may be incorrectly coded, thus providing a measure of data quality. There are 31 obstetric indicators available for the ICD-10-AM Fourth Edition. Twenty of these 31 indicators were classified as Fatal, nine as Warning and two Relative. These indicators were used to examine coding quality of obstetric records in the 2004-2005 financial year Australian national hospital morbidity dataset. Records with obstetric disease or procedure codes listed anywhere in the code string were extracted and exported from the SPSS source file. Data were then imported into a Microsoft Access database table as per PICQ instructions, and run against all Fatal and Warning and Relative (N=31) obstetric PICQ 2006 Fourth Edition Indicators v.5 for the ICD-10- AM Fourth Edition. There were 689,905 gynaecological and obstetric records in the 2004-2005 financial year, of which 1.14% were found to have triggered Fatal degree errors, 3.78% Warning degree errors and 8.35% Relative degree errors. The types of errors include completeness, redundancy, specificity and sequencing problems. It was found that PICQ is a useful initial screening tool for the assessment of ICD-10-AM/ACHI coding quality. The overall quality of codes assigned to obstetric records in the 2004- 2005 Australian national morbidity dataset is of fair quality.

  15. Classification of light sources and their interaction with active and passive environments

    NASA Astrophysics Data System (ADS)

    El-Dardiry, Ramy G. S.; Faez, Sanli; Lagendijk, Ad

    2011-03-01

    Emission from a molecular light source depends on its optical and chemical environment. This dependence is different for various sources. We present a general classification in terms of constant-amplitude and constant-power sources. Using this classification, we have described the response to both changes in the local density of states and stimulated emission. The unforeseen consequences of this classification are illustrated for photonic studies by random laser experiments and are in good agreement with our correspondingly developed theory. Our results require a revision of studies on sources in complex media.

  16. Improved discrimination between monocotyledonous and dicotyledonous plants for weed control based on the blue-green region of ultraviolet-induced fluorescence spectra.

    PubMed

    Panneton, Bernard; Guillaume, Serge; Roger, Jean-Michel; Samson, Guy

    2010-01-01

    Precision weeding by spot spraying in real time requires sensors to discriminate between weeds and crop without contact. Among the optical based solutions, the ultraviolet (UV) induced fluorescence of the plants appears as a promising alternative. In a first paper, the feasibility of discriminating between corn hybrids, monocotyledonous, and dicotyledonous weeds was demonstrated on the basis of the complete spectra. Some considerations about the different sources of fluorescence oriented the focus to the blue-green fluorescence (BGF) part, ignoring the chlorophyll fluorescence that is inherently more variable in time. This paper investigates the potential of performing weed/crop discrimination on the basis of several large spectral bands in the BGF area. A partial least squares discriminant analysis (PLS-DA) was performed on a set of 1908 spectra of corn and weed plants over 3 years and various growing conditions. The discrimination between monocotyledonous and dicotyledonous plants based on the blue-green fluorescence yielded robust models (classification error between 1.3 and 4.6% for between-year validation). On the basis of the analysis of the PLS-DA model, two large bands were chosen in the blue-green fluorescence zone (400-425 nm and 425-490 nm). A linear discriminant analysis based on the signal from these two bands also provided very robust inter-year results (classification error from 1.5% to 5.2%). The same selection process was applied to discriminate between monocotyledonous weeds and maize but yielded no robust models (up to 50% inter-year error). Further work will be required to solve this problem and provide a complete UV fluorescence based sensor for weed-maize discrimination.

  17. Maximum-likelihood techniques for joint segmentation-classification of multispectral chromosome images.

    PubMed

    Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L

    2005-12-01

    Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.

  18. Classification accuracy on the family planning participation status using kernel discriminant analysis

    NASA Astrophysics Data System (ADS)

    Kurniawan, Dian; Suparti; Sugito

    2018-05-01

    Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia has reached 237.6 million people. Therefore, to control the population growth rate, the government hold Family Planning or Keluarga Berencana (KB) program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to manifest prosperous society by controlling births while ensuring control of population growth. The data used in this study is the updated family data of Semarang city in 2016 that conducted by National Family Planning Coordinating Board (BKKBN). From these data, classifiers with kernel discriminant analysis will be obtained, and also classification accuracy will be obtained from that method. The result of the analysis showed that normal kernel discriminant analysis gives 71.05 % classification accuracy with 28.95 % classification error. Whereas triweight kernel discriminant analysis gives 73.68 % classification accuracy with 26.32 % classification error. Using triweight kernel discriminant for data preprocessing of family planning participation of childbearing age couples in Semarang City of 2016 can be stated better than with normal kernel discriminant.

  19. A parametric multiclass Bayes error estimator for the multispectral scanner spatial model performance evaluation

    NASA Technical Reports Server (NTRS)

    Mobasseri, B. G.; Mcgillem, C. D.; Anuta, P. E. (Principal Investigator)

    1978-01-01

    The author has identified the following significant results. The probability of correct classification of various populations in data was defined as the primary performance index. The multispectral data being of multiclass nature as well, required a Bayes error estimation procedure that was dependent on a set of class statistics alone. The classification error was expressed in terms of an N dimensional integral, where N was the dimensionality of the feature space. The multispectral scanner spatial model was represented by a linear shift, invariant multiple, port system where the N spectral bands comprised the input processes. The scanner characteristic function, the relationship governing the transformation of the input spatial, and hence, spectral correlation matrices through the systems, was developed.

  20. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies.

    PubMed

    Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel

    2016-10-01

    Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply so on an average or on a population level and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. We compare the classification performance of a number of important and widely used machine learning algorithms, namely the Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. For smaller number of correlated features, number of features not exceeding approximately half the sample size, LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and outplays that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provide more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study. © The Author(s) 2013.

  1. Using Gaussian mixture models to detect and classify dolphin whistles and pulses.

    PubMed

    Peso Parada, Pablo; Cardenal-López, Antonio

    2014-06-01

    In recent years, a number of automatic detection systems for free-ranging cetaceans have been proposed that aim to detect not just surfaced, but also submerged, individuals. These systems are typically based on pattern-recognition techniques applied to underwater acoustic recordings. Using a Gaussian mixture model, a classification system was developed that detects sounds in recordings and classifies them as one of four types: background noise, whistles, pulses, and combined whistles and pulses. The classifier was tested using a database of underwater recordings made off the Spanish coast during 2011. Using cepstral-coefficient-based parameterization, a sound detection rate of 87.5% was achieved for a 23.6% classification error rate. To improve these results, two parameters computed using the multiple signal classification algorithm and an unpredictability measure were included in the classifier. These parameters, which helped to classify the segments containing whistles, increased the detection rate to 90.3% and reduced the classification error rate to 18.1%. Finally, the potential of the multiple signal classification algorithm and unpredictability measure for estimating whistle contours and classifying cetacean species was also explored, with promising results.

  2. Defining and classifying medical error: lessons for patient safety reporting systems.

    PubMed

    Tamuz, M; Thomas, E J; Franchois, K E

    2004-02-01

    It is important for healthcare providers to report safety related events, but little attention has been paid to how the definition and classification of events affects a hospital's ability to learn from its experience. To examine how the definition and classification of safety related events influences key organizational routines for gathering information, allocating incentives, and analyzing event reporting data. In semi-structured interviews, professional staff and administrators in a tertiary care teaching hospital and its pharmacy were asked to describe the existing programs designed to monitor medication safety, including the reporting systems. With a focus primarily on the pharmacy staff, interviews were audio recorded, transcribed, and analyzed using qualitative research methods. Eighty six interviews were conducted, including 36 in the hospital pharmacy. Examples are presented which show that: (1) the definition of an event could lead to under-reporting; (2) the classification of a medication error into alternative categories can influence the perceived incentives and disincentives for incident reporting; (3) event classification can enhance or impede organizational routines for data analysis and learning; and (4) routines that promote organizational learning within the pharmacy can reduce the flow of medication error data to the hospital. These findings from one hospital raise important practical and research questions about how reporting systems are influenced by the definition and classification of safety related events. By understanding more clearly how hospitals define and classify their experience, we may improve our capacity to learn and ultimately improve patient safety.

  3. Fisher classifier and its probability of error estimation

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    Computationally efficient expressions are derived for estimating the probability of error using the leave-one-out method. The optimal threshold for the classification of patterns projected onto Fisher's direction is derived. A simple generalization of the Fisher classifier to multiple classes is presented. Computational expressions are developed for estimating the probability of error of the multiclass Fisher classifier.

  4. Human factors analysis and classification system-HFACS.

    DOT National Transportation Integrated Search

    2000-02-01

    Human error has been implicated in 70 to 80% of all civil and military aviation accidents. Yet, most accident : reporting systems are not designed around any theoretical framework of human error. As a result, most : accident databases are not conduci...

  5. Automatic Generation of Data Types for Classification of Deep Web Sources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ngu, A H; Buttler, D J; Critchlow, T J

    2005-02-14

    A Service Class Description (SCD) is an effective meta-data based approach for discovering Deep Web sources whose data exhibit some regular patterns. However, it is tedious and error prone to create an SCD description manually. Moreover, a manually created SCD is not adaptive to the frequent changes of Web sources. It requires its creator to identify all the possible input and output types of a service a priori. In many domains, it is impossible to exhaustively list all the possible input and output data types of a source in advance. In this paper, we describe machine learning approaches for automaticmore » generation of the data types of an SCD. We propose two different approaches for learning data types of a class of Web sources. The Brute-Force Learner is able to generate data types that can achieve high recall, but with low precision. The Clustering-based Learner generates data types that have a high precision rate, but with a lower recall rate. We demonstrate the feasibility of these two learning-based solutions for automatic generation of data types for citation Web sources and presented a quantitative evaluation of these two solutions.« less

  6. The application of wavelet denoising in material discrimination system

    NASA Astrophysics Data System (ADS)

    Fu, Kenneth; Ranta, Dale; Guest, Clark; Das, Pankaj

    2010-01-01

    Recently, the need for cargo inspection imaging systems to provide a material discrimination function has become desirable. This is done by scanning the cargo container with x-rays at two different energy levels. The ratio of attenuations of the two energy scans can provide information on the composition of the material. However, with the statistical error from noise, the accuracy of such systems can be low. Because the moving source emits two energies of x-rays alternately, images from the two scans will not be identical. That means edges of objects in the two images are not perfectly aligned. Moreover, digitization creates blurry-edge artifacts. Different energy x-rays produce different edge spread functions. Those combined effects contribute to a source of false classification namely, the "edge effect." Other types of false classification are caused by noise, mainly Poisson noise associated with photons. The Poisson noise in xray images can be dealt with using either a Wiener filter or a wavelet shrinkage denoising approach. In this paper, we propose a method that uses the wavelet shrinkage denoising approach to enhance the performance of the material identification system. Test results show that this wavelet-based approach has improved performance in object detection and eliminating false positives due to the edge effects.

  7. The Galex Time Domain Survey. I. Selection And Classification of Over a Thousand Ultraviolet Variable Sources

    NASA Technical Reports Server (NTRS)

    Gezari, S.; Martin, D. C.; Forster, K.; Neill, J. D.; Huber, M.; Heckman, T.; Bianchi, L.; Morrissey, P.; Neff, S. G.; Seibert, M.; hide

    2013-01-01

    We present the selection and classification of over a thousand ultraviolet (UV) variable sources discovered in approximately 40 deg(exp 2) of GALEX Time Domain Survey (TDS) NUV images observed with a cadence of 2 days and a baseline of observations of approximately 3 years. The GALEX TDS fields were designed to be in spatial and temporal coordination with the Pan-STARRS1 Medium Deep Survey, which provides deep optical imaging and simultaneous optical transient detections via image differencing. We characterize the GALEX photometric errors empirically as a function of mean magnitude, and select sources that vary at the 5 sigma level in at least one epoch. We measure the statistical properties of the UV variability, including the structure function on timescales of days and years. We report classifications for the GALEX TDS sample using a combination of optical host colors and morphology, UV light curve characteristics, and matches to archival X-ray, and spectroscopy catalogs. We classify 62% of the sources as active galaxies (358 quasars and 305 active galactic nuclei), and 10% as variable stars (including 37 RR Lyrae, 53 M dwarf flare stars, and 2 cataclysmic variables). We detect a large-amplitude tail in the UV variability distribution for M-dwarf flare stars and RR Lyrae, reaching up to absolute value(?m) = 4.6 mag and 2.9 mag, respectively. The mean amplitude of the structure function for quasars on year timescales is five times larger than observed at optical wavelengths. The remaining unclassified sources include UV-bright extragalactic transients, two of which have been spectroscopically confirmed to be a young core-collapse supernova and a flare from the tidal disruption of a star by dormant supermassive black hole. We calculate a surface density for variable sources in the UV with NUV less than 23 mag and absolute value(?m) greater than 0.2 mag of approximately 8.0, 7.7, and 1.8 deg(exp -2) for quasars, active galactic nuclei, and RR Lyrae stars, respectively. We also calculate a surface density rate in the UV for transient sources, using the effective survey time at the cadence appropriate to each class, of approximately 15 and 52 deg(exp -2 yr-1 for M dwarfs and extragalactic transients, respectively.

  8. IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY

    EPA Science Inventory

    Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
    Currently, most thematic accuracy assessments of classified remotely sensed images oily account for errors between the various classes employed, at particular pixels of interest, thu...

  9. Sea ice classification using fast learning neural networks

    NASA Technical Reports Server (NTRS)

    Dawson, M. S.; Fung, A. K.; Manry, M. T.

    1992-01-01

    A first learning neural network approach to the classification of sea ice is presented. The fast learning (FL) neural network and a multilayer perceptron (MLP) trained with backpropagation learning (BP network) were tested on simulated data sets based on the known dominant scattering characteristics of the target class. Four classes were used in the data simulation: open water, thick lossy saline ice, thin saline ice, and multiyear ice. The BP network was unable to consistently converge to less than 25 percent error while the FL method yielded an average error of approximately 1 percent on the first iteration of training. The fast learning method presented can significantly reduce the CPU time necessary to train a neural network as well as consistently yield higher classification accuracy than BP networks.

  10. High-density force myography: A possible alternative for upper-limb prosthetic control.

    PubMed

    Radmand, Ashkan; Scheme, Erik; Englehart, Kevin

    2016-01-01

    Several multiple degree-of-freedom upper-limb prostheses that have the promise of highly dexterous control have recently been developed. Inadequate controllability, however, has limited adoption of these devices. Introducing more robust control methods will likely result in higher acceptance rates. This work investigates the suitability of using high-density force myography (HD-FMG) for prosthetic control. HD-FMG uses a high-density array of pressure sensors to detect changes in the pressure patterns between the residual limb and socket caused by the contraction of the forearm muscles. In this work, HD-FMG outperforms the standard electromyography (EMG)-based system in detecting different wrist and hand gestures. With the arm in a fixed, static position, eight hand and wrist motions were classified with 0.33% error using the HD-FMG technique. Comparatively, classification errors in the range of 2.2%-11.3% have been reported in the literature for multichannel EMG-based approaches. As with EMG, position variation in HD-FMG can introduce classification error, but incorporating position variation into the training protocol reduces this effect. Channel reduction was also applied to the HD-FMG technique to decrease the dimensionality of the problem as well as the size of the sensorized area. We found that with informed, symmetric channel reduction, classification error could be decreased to 0.02%.

  11. Improved wetland remote sensing in Yellowstone National Park using classification trees to combine TM imagery and ancillary environmental data

    USGS Publications Warehouse

    Wright, C.; Gallant, Alisa L.

    2007-01-01

    The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub–shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. The developed method is portable, relatively easy to implement, and should be applicable in other settings and over larger extents.

  12. Common component classification: what can we learn from machine learning?

    PubMed

    Anderson, Ariana; Labus, Jennifer S; Vianna, Eduardo P; Mayer, Emeran A; Cohen, Mark S

    2011-05-15

    Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classification machines' power is due to artifactual noise versus genuine neurological signal. To examine the true strength and power of machine learning classifiers we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal and show that removal of such artifacts can reduce predictive accuracy even when data has been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error seen in publication to be a biased estimate of the testing error seen in practice and measure this bias by purposefully making flawed models. We discuss other ways to introduce bias and the statistical assumptions lying behind the data and model themselves. Finally we discuss the complications in drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on the Type 1 and Type 2 error rates, and how publication bias can give a false confidence of the power of such methods. Collectively this work identifies challenges specific to fMRI classification and methods affecting the stability of models. Copyright © 2010 Elsevier Inc. All rights reserved.

  13. Automated source classification of new transient sources

    NASA Astrophysics Data System (ADS)

    Oertel, M.; Kreikenbohm, A.; Wilms, J.; DeLuca, A.

    2017-10-01

    The EXTraS project harvests the hitherto unexplored temporal domain information buried in the serendipitous data collected by the European Photon Imaging Camera (EPIC) onboard the ESA XMM-Newton mission since its launch. This includes a search for fast transients, missed by standard image analysis, and a search and characterization of variability in hundreds of thousands of sources. We present an automated classification scheme for new transient sources in the EXTraS project. The method is as follows: source classification features of a training sample are used to train machine learning algorithms (performed in R; randomForest (Breiman, 2001) in supervised mode) which are then tested on a sample of known source classes and used for classification.

  14. Enhancing navigation in biomedical databases by community voting and database-driven text classification

    PubMed Central

    Duchrow, Timo; Shtatland, Timur; Guettler, Daniel; Pivovarov, Misha; Kramer, Stefan; Weissleder, Ralph

    2009-01-01

    Background The breadth of biological databases and their information content continues to increase exponentially. Unfortunately, our ability to query such sources is still often suboptimal. Here, we introduce and apply community voting, database-driven text classification, and visual aids as a means to incorporate distributed expert knowledge, to automatically classify database entries and to efficiently retrieve them. Results Using a previously developed peptide database as an example, we compared several machine learning algorithms in their ability to classify abstracts of published literature results into categories relevant to peptide research, such as related or not related to cancer, angiogenesis, molecular imaging, etc. Ensembles of bagged decision trees met the requirements of our application best. No other algorithm consistently performed better in comparative testing. Moreover, we show that the algorithm produces meaningful class probability estimates, which can be used to visualize the confidence of automatic classification during the retrieval process. To allow viewing long lists of search results enriched by automatic classifications, we added a dynamic heat map to the web interface. We take advantage of community knowledge by enabling users to cast votes in Web 2.0 style in order to correct automated classification errors, which triggers reclassification of all entries. We used a novel framework in which the database "drives" the entire vote aggregation and reclassification process to increase speed while conserving computational resources and keeping the method scalable. In our experiments, we simulate community voting by adding various levels of noise to nearly perfectly labelled instances, and show that, under such conditions, classification can be improved significantly. Conclusion Using PepBank as a model database, we show how to build a classification-aided retrieval system that gathers training data from the community, is completely controlled by the database, scales well with concurrent change events, and can be adapted to add text classification capability to other biomedical databases. The system can be accessed at . PMID:19799796

  15. An evaluation of satellite data for estimating the area of small forestland in the southern lower peninsula of Michigan. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Karteris, M. A. (Principal Investigator)

    1980-01-01

    A winter black and white band 5, a winter color, a fall color, and a diazo color composite of the fall scene were used to assess the use and potential of LANDSAT images for mapping and estimating acreage of small scattered forest tracts in Barry County, Michigan. Forests as small as 2.5 acres were mapped from each LANDSAT data source. The maps for each image were compared with an available forest-type map. Mapping errors detected were categorized as boundary and identification errors. The most frequently misclassified areas were agriculture lands, treed-bogs, brushlands and lowland and mixed hardwood stands. Stocking level affected interpretation more than stand size. The overall level of the interpretation performance was expressed through the estimation of classification, interpretation, and mapping accuracies. These accuracies ranged from 74 between 74% and 98%. Considering errors, accuracy, and cost, winter color imagery is the best LANDSAT alternative for mapping small forest tracts. However, since the availability of cloud-free winter images of the study area is significantly lower than images for other seasons, a diazo enhanced image of a fall scene is recommended as the best next best alternative.

  16. Classification of driver fatigue in an electroencephalography-based countermeasure system with source separation module.

    PubMed

    Rifai Chai; Naik, Ganesh R; Tran, Yvonne; Sai Ho Ling; Craig, Ashley; Nguyen, Hung T

    2015-08-01

    An electroencephalography (EEG)-based counter measure device could be used for fatigue detection during driving. This paper explores the classification of fatigue and alert states using power spectral density (PSD) as a feature extractor and fuzzy swarm based-artificial neural network (ANN) as a classifier. An independent component analysis of entropy rate bound minimization (ICA-ERBM) is investigated as a novel source separation technique for fatigue classification using EEG analysis. A comparison of the classification accuracy of source separator versus no source separator is presented. Classification performance based on 43 participants without the inclusion of the source separator resulted in an overall sensitivity of 71.67%, a specificity of 75.63% and an accuracy of 73.65%. However, these results were improved after the inclusion of a source separator module, resulting in an overall sensitivity of 78.16%, a specificity of 79.60% and an accuracy of 78.88% (p <; 0.05).

  17. On the impact of approximate computation in an analog DeSTIN architecture.

    PubMed

    Young, Steven; Lu, Junjie; Holleman, Jeremy; Arel, Itamar

    2014-05-01

    Deep machine learning (DML) holds the potential to revolutionize machine learning by automating rich feature extraction, which has become the primary bottleneck of human engineering in pattern recognition systems. However, the heavy computational burden renders DML systems implemented on conventional digital processors impractical for large-scale problems. The highly parallel computations required to implement large-scale deep learning systems are well suited to custom hardware. Analog computation has demonstrated power efficiency advantages of multiple orders of magnitude relative to digital systems while performing nonideal computations. In this paper, we investigate typical error sources introduced by analog computational elements and their impact on system-level performance in DeSTIN--a compositional deep learning architecture. These inaccuracies are evaluated on a pattern classification benchmark, clearly demonstrating the robustness of the underlying algorithm to the errors introduced by analog computational elements. A clear understanding of the impacts of nonideal computations is necessary to fully exploit the efficiency of analog circuits.

  18. Automatic Classification of Time-variable X-Ray Sources

    NASA Astrophysics Data System (ADS)

    Lo, Kitty K.; Farrell, Sean; Murphy, Tara; Gaensler, B. M.

    2014-05-01

    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the Second XMM-Newton Serendipitous Source Catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, and other multi-wavelength contextual information. The 10 fold cross validation accuracy of the training data is ~97% on a 7 class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest derived outlier measure, we identified 12 anomalous sources, of which 2XMM J180658.7-500250 appears to be the most unusual source in the sample. Its X-ray spectra is suggestive of a ultraluminous X-ray source but its variability makes it highly unusual. Machine-learned classification and anomaly detection will facilitate scientific discoveries in the era of all-sky surveys.

  19. An analysis of computer-related patient safety incidents to inform the development of a classification.

    PubMed

    Magrabi, Farah; Ong, Mei-Sing; Runciman, William; Coiera, Enrico

    2010-01-01

    To analyze patient safety incidents associated with computer use to develop the basis for a classification of problems reported by health professionals. Incidents submitted to a voluntary incident reporting database across one Australian state were retrieved and a subset (25%) was analyzed to identify 'natural categories' for classification. Two coders independently classified the remaining incidents into one or more categories. Free text descriptions were analyzed to identify contributing factors. Where available medical specialty, time of day and consequences were examined. Descriptive statistics; inter-rater reliability. A search of 42,616 incidents from 2003 to 2005 yielded 123 computer related incidents. After removing duplicate and unrelated incidents, 99 incidents describing 117 problems remained. A classification with 32 types of computer use problems was developed. Problems were grouped into information input (31%), transfer (20%), output (20%) and general technical (24%). Overall, 55% of problems were machine related and 45% were attributed to human-computer interaction. Delays in initiating and completing clinical tasks were a major consequence of machine related problems (70%) whereas rework was a major consequence of human-computer interaction problems (78%). While 38% (n=26) of the incidents were reported to have a noticeable consequence but no harm, 34% (n=23) had no noticeable consequence. Only 0.2% of all incidents reported were computer related. Further work is required to expand our classification using incident reports and other sources of information about healthcare IT problems. Evidence based user interface design must focus on the safe entry and retrieval of clinical information and support users in detecting and correcting errors and malfunctions.

  20. Statistical sensor fusion of ECG data using automotive-grade sensors

    NASA Astrophysics Data System (ADS)

    Koenig, A.; Rehg, T.; Rasshofer, R.

    2015-11-01

    Driver states such as fatigue, stress, aggression, distraction or even medical emergencies continue to be yield to severe mistakes in driving and promote accidents. A pathway towards improving driver state assessment can be found in psycho-physiological measures to directly quantify the driver's state from physiological recordings. Although heart rate is a well-established physiological variable that reflects cognitive stress, obtaining heart rate contactless and reliably is a challenging task in an automotive environment. Our aim was to investigate, how sensory fusion of two automotive grade sensors would influence the accuracy of automatic classification of cognitive stress levels. We induced cognitive stress in subjects and estimated levels from their heart rate signals, acquired from automotive ready ECG sensors. Using signal quality indices and Kalman filters, we were able to decrease Root Mean Squared Error (RMSE) of heart rate recordings by 10 beats per minute. We then trained a neural network to classify the cognitive workload state of subjects from heart rate and compared classification performance for ground truth, the individual sensors and the fused heart rate signal. We obtained an increase of 5 % higher correct classification by fusing signals as compared to individual sensors, staying only 4 % below the maximally possible classification accuracy from ground truth. These results are a first step towards real world applications of psycho-physiological measurements in vehicle settings. Future implementations of driver state modeling will be able to draw from a larger pool of data sources, such as additional physiological values or vehicle related data, which can be expected to drive classification to significantly higher values.

  1. Halftoning Algorithms and Systems.

    DTIC Science & Technology

    1996-08-01

    TERMS 15. NUMBER IF PAGESi. Halftoning algorithms; error diffusions ; color printing; topographic maps 16. PRICE CODE 17. SECURITY CLASSIFICATION 18...graylevels for each screen level. In the case of error diffusion algorithms, the calibration procedure using the new centering concept manifests itself as a...Novel Centering Concept for Overlapping Correction Paper / Transparency (Patent Applied 5/94)I * Applications To Error Diffusion * To Dithering (IS&T

  2. Simulation techniques for estimating error in the classification of normal patterns

    NASA Technical Reports Server (NTRS)

    Whitsitt, S. J.; Landgrebe, D. A.

    1974-01-01

    Methods of efficiently generating and classifying samples with specified multivariate normal distributions were discussed. Conservative confidence tables for sample sizes are given for selective sampling. Simulation results are compared with classified training data. Techniques for comparing error and separability measure for two normal patterns are investigated and used to display the relationship between the error and the Chernoff bound.

  3. Investigation of Error Patterns in Geographical Databases

    NASA Technical Reports Server (NTRS)

    Dryer, David; Jacobs, Derya A.; Karayaz, Gamze; Gronbech, Chris; Jones, Denise R. (Technical Monitor)

    2002-01-01

    The objective of the research conducted in this project is to develop a methodology to investigate the accuracy of Airport Safety Modeling Data (ASMD) using statistical, visualization, and Artificial Neural Network (ANN) techniques. Such a methodology can contribute to answering the following research questions: Over a representative sampling of ASMD databases, can statistical error analysis techniques be accurately learned and replicated by ANN modeling techniques? This representative ASMD sample should include numerous airports and a variety of terrain characterizations. Is it possible to identify and automate the recognition of patterns of error related to geographical features? Do such patterns of error relate to specific geographical features, such as elevation or terrain slope? Is it possible to combine the errors in small regions into an error prediction for a larger region? What are the data density reduction implications of this work? ASMD may be used as the source of terrain data for a synthetic visual system to be used in the cockpit of aircraft when visual reference to ground features is not possible during conditions of marginal weather or reduced visibility. In this research, United States Geologic Survey (USGS) digital elevation model (DEM) data has been selected as the benchmark. Artificial Neural Networks (ANNS) have been used and tested as alternate methods in place of the statistical methods in similar problems. They often perform better in pattern recognition, prediction and classification and categorization problems. Many studies show that when the data is complex and noisy, the accuracy of ANN models is generally higher than those of comparable traditional methods.

  4. Optimal number of features as a function of sample size for various classification rules.

    PubMed

    Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R

    2005-04-15

    Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. For the companion website, please visit http://public.tgen.org/tamu/ofs/ e-dougherty@ee.tamu.edu.

  5. The Human Factors Analysis and Classification System : HFACS : final report.

    DOT National Transportation Integrated Search

    2000-02-01

    Human error has been implicated in 70 to 80% of all civil and military aviation accidents. Yet, most accident reporting systems are not designed around any theoretical framework of human error. As a result, most accident databases are not conducive t...

  6. A Confidence Paradigm for Classification Systems

    DTIC Science & Technology

    2008-09-01

    methodology to determine how much confi- dence one should have in a classifier output. This research proposes a framework to determine the level of...theoretical framework that attempts to unite the viewpoints of the classification system developer (or engineer) and the classification system user (or...operating point. An algorithm is developed that minimizes a “confidence” measure called Binned Error in the Posterior ( BEP ). Then, we prove that training a

  7. A False Alarm Reduction Method for a Gas Sensor Based Electronic Nose

    PubMed Central

    Rahman, Mohammad Mizanur; Suksompong, Prapun; Toochinda, Pisanu; Taparugssanagorn, Attaphongse

    2017-01-01

    Electronic noses (E-Noses) are becoming popular for food and fruit quality assessment due to their robustness and repeated usability without fatigue, unlike human experts. An E-Nose equipped with classification algorithms and having open ended classification boundaries such as the k-nearest neighbor (k-NN), support vector machine (SVM), and multilayer perceptron neural network (MLPNN), are found to suffer from false classification errors of irrelevant odor data. To reduce false classification and misclassification errors, and to improve correct rejection performance; algorithms with a hyperspheric boundary, such as a radial basis function neural network (RBFNN) and generalized regression neural network (GRNN) with a Gaussian activation function in the hidden layer should be used. The simulation results presented in this paper show that GRNN has more correct classification efficiency and false alarm reduction capability compared to RBFNN. As the design of a GRNN and RBFNN is complex and expensive due to large numbers of neuron requirements, a simple hyperspheric classification method based on minimum, maximum, and mean (MMM) values of each class of the training dataset was presented. The MMM algorithm was simple and found to be fast and efficient in correctly classifying data of training classes, and correctly rejecting data of extraneous odors, and thereby reduced false alarms. PMID:28895910

  8. A False Alarm Reduction Method for a Gas Sensor Based Electronic Nose.

    PubMed

    Rahman, Mohammad Mizanur; Charoenlarpnopparut, Chalie; Suksompong, Prapun; Toochinda, Pisanu; Taparugssanagorn, Attaphongse

    2017-09-12

    Electronic noses (E-Noses) are becoming popular for food and fruit quality assessment due to their robustness and repeated usability without fatigue, unlike human experts. An E-Nose equipped with classification algorithms and having open ended classification boundaries such as the k -nearest neighbor ( k -NN), support vector machine (SVM), and multilayer perceptron neural network (MLPNN), are found to suffer from false classification errors of irrelevant odor data. To reduce false classification and misclassification errors, and to improve correct rejection performance; algorithms with a hyperspheric boundary, such as a radial basis function neural network (RBFNN) and generalized regression neural network (GRNN) with a Gaussian activation function in the hidden layer should be used. The simulation results presented in this paper show that GRNN has more correct classification efficiency and false alarm reduction capability compared to RBFNN. As the design of a GRNN and RBFNN is complex and expensive due to large numbers of neuron requirements, a simple hyperspheric classification method based on minimum, maximum, and mean (MMM) values of each class of the training dataset was presented. The MMM algorithm was simple and found to be fast and efficient in correctly classifying data of training classes, and correctly rejecting data of extraneous odors, and thereby reduced false alarms.

  9. Comparison of the resulting error in data fusion techniques when used with remote sensing, earth observation, and in-situ data sets for water quality applications

    NASA Astrophysics Data System (ADS)

    Ziemba, Alexander; El Serafy, Ghada

    2016-04-01

    Ecological modeling and water quality investigations are complex processes which can require a high level of parameterization and a multitude of varying data sets in order to properly execute the model in question. Since models are generally complex, their calibration and validation can benefit from the application of data and information fusion techniques. The data applied to ecological models comes from a wide range of sources such as remote sensing, earth observation, and in-situ measurements, resulting in a high variability in the temporal and spatial resolution of the various data sets available to water quality investigators. It is proposed that effective fusion into a comprehensive singular set will provide a more complete and robust data resource with which models can be calibrated, validated, and driven by. Each individual product contains a unique valuation of error resulting from the method of measurement and application of pre-processing techniques. The uncertainty and error is further compounded when the data being fused is of varying temporal and spatial resolution. In order to have a reliable fusion based model and data set, the uncertainty of the results and confidence interval of the data being reported must be effectively communicated to those who would utilize the data product or model outputs in a decision making process[2]. Here we review an array of data fusion techniques applied to various remote sensing, earth observation, and in-situ data sets whose domains' are varied in spatial and temporal resolution. The data sets examined are combined in a manner so that the various classifications, complementary, redundant, and cooperative, of data are all assessed to determine classification's impact on the propagation and compounding of error. In order to assess the error of the fused data products, a comparison is conducted with data sets containing a known confidence interval and quality rating. We conclude with a quantification of the performance of the data fusion techniques and a recommendation on the feasibility of applying of the fused products in operating forecast systems and modeling scenarios. The error bands and confidence intervals derived can be used in order to clarify the error and confidence of water quality variables produced by prediction and forecasting models. References [1] F. Castanedo, "A Review of Data Fusion Techniques", The Scientific World Journal, vol. 2013, pp. 1-19, 2013. [2] T. Keenan, M. Carbone, M. Reichstein and A. Richardson, "The model-data fusion pitfall: assuming certainty in an uncertain world", Oecologia, vol. 167, no. 3, pp. 587-597, 2011.

  10. Are Nomothetic or Ideographic Approaches Superior in Predicting Daily Exercise Behaviors?

    PubMed

    Cheung, Ying Kuen; Hsueh, Pei-Yun Sabrina; Qian, Min; Yoon, Sunmoo; Meli, Laura; Diaz, Keith M; Schwartz, Joseph E; Kronish, Ian M; Davidson, Karina W

    2017-01-01

    The understanding of how stress influences health behavior can provide insights into developing healthy lifestyle interventions. This understanding is traditionally attained through observational studies that examine associations at a population level. This nomothetic approach, however, is fundamentally limited by the fact that the environment- person milieu that constitutes stress exposure and experience can vary substantially between individuals, and the modifiable elements of these exposures and experiences are individual-specific. With recent advances in smartphone and sensing technologies, it is now possible to conduct idiographic assessment in users' own environment, leveraging the full-range observations of actions and experiences that result in differential response to naturally occurring events. The aim of this paper is to explore the hypothesis that an ideographic N-of-1 model can better capture an individual's stress- behavior pathway (or the lack thereof) and provide useful person-specific predictors of exercise behavior. This paper used the data collected in an observational study in 79 participants who were followed for up to a 1-year period, wherein their physical activity was continuously and objectively monitored by actigraphy and their stress experience was recorded via ecological momentary assessment on a mobile app. In addition, our analyses considered exogenous and environmental variables retrieved from public archive such as day in a week, daylight time, temperature and precipitation. Leveraging the multiple data sources, we developed prediction algorithms for exercise behavior using random forest and classification tree techniques using a nomothetic approach and an N-of-1 approach. The two approaches were compared based on classification errors in predicting personalized exercise behavior. Eight factors were selected by random forest for the nomothetic decision model, which was used to predict whether a participant would exercise on a particular day. The predictors included previous exercise behavior, emotional factors (e.g., midday stress), external factors such as weather (e.g., temperature), and self-determination factors (e.g., expectation of exercise). The nomothetic model yielded an average classification error of 36%. The ideographic N-of-1 models used on average about two predictors for each individual, and had an average classification error of 25%, which represented an improvement of 11 percentage points. Compared to the traditional one-size-fits-all, nomothetic model that generalizes population-evidence for individuals, the proposed N-of-1 model can better capture the individual difference in their stressbehavior pathways. In this paper, we demonstrate it is feasible to perform personalized exercise behavior prediction, mainly made possible by mobile health technology and machine learning analytics. Schattauer GmbH.

  11. Differences in chewing sounds of dry-crisp snacks by multivariate data analysis

    NASA Astrophysics Data System (ADS)

    De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.

    2003-09-01

    Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.

  12. Statistical methods and neural network approaches for classification of data from multiple sources

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon Atli; Swain, Philip H.

    1990-01-01

    Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A problem with using conventional multivariate statistical approaches for classification of data of multiple types is in general that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution free since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem involving how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.

  13. Kernel Wiener filter and its application to pattern recognition.

    PubMed

    Yoshino, Hirokazu; Dong, Chen; Washizawa, Yoshikazu; Yamashita, Yukihiko

    2010-11-01

    The Wiener filter (WF) is widely used for inverse problems. From an observed signal, it provides the best estimated signal with respect to the squared error averaged over the original and the observed signals among linear operators. The kernel WF (KWF), extended directly from WF, has a problem that an additive noise has to be handled by samples. Since the computational complexity of kernel methods depends on the number of samples, a huge computational cost is necessary for the case. By using the first-order approximation of kernel functions, we realize KWF that can handle such a noise not by samples but as a random variable. We also propose the error estimation method for kernel filters by using the approximations. In order to show the advantages of the proposed methods, we conducted the experiments to denoise images and estimate errors. We also apply KWF to classification since KWF can provide an approximated result of the maximum a posteriori classifier that provides the best recognition accuracy. The noise term in the criterion can be used for the classification in the presence of noise or a new regularization to suppress changes in the input space, whereas the ordinary regularization for the kernel method suppresses changes in the feature space. In order to show the advantages of the proposed methods, we conducted experiments of binary and multiclass classifications and classification in the presence of noise.

  14. Robust automated classification of first-motion polarities for focal mechanism determination with machine learning

    NASA Astrophysics Data System (ADS)

    Ross, Z. E.; Meier, M. A.; Hauksson, E.

    2017-12-01

    Accurate first-motion polarities are essential for determining earthquake focal mechanisms, but are difficult to measure automatically because of picking errors and signal to noise issues. Here we develop an algorithm for reliable automated classification of first-motion polarities using machine learning algorithms. A classifier is designed to identify whether the first-motion polarity is up, down, or undefined by examining the waveform data directly. We first improve the accuracy of automatic P-wave onset picks by maximizing a weighted signal/noise ratio for a suite of candidate picks around the automatic pick. We then use the waveform amplitudes before and after the optimized pick as features for the classification. We demonstrate the method's potential by training and testing the classifier on tens of thousands of hand-made first-motion picks by the Southern California Seismic Network. The classifier assigned the same polarity as chosen by an analyst in more than 94% of the records. We show that the method is generalizable to a variety of learning algorithms, including neural networks and random forest classifiers. The method is suitable for automated processing of large seismic waveform datasets, and can potentially be used in real-time applications, e.g. for improving the source characterizations of earthquake early warning algorithms.

  15. Consequences of land-cover misclassification in models of impervious surface

    USGS Publications Warehouse

    McMahon, G.

    2007-01-01

    Model estimates of impervious area as a function of landcover area may be biased and imprecise because of errors in the land-cover classification. This investigation of the effects of land-cover misclassification on impervious surface models that use National Land Cover Data (NLCD) evaluates the consequences of adjusting land-cover within a watershed to reflect uncertainty assessment information. Model validation results indicate that using error-matrix information to adjust land-cover values used in impervious surface models does not substantially improve impervious surface predictions. Validation results indicate that the resolution of the landcover data (Level I and Level II) is more important in predicting impervious surface accurately than whether the land-cover data have been adjusted using information in the error matrix. Level I NLCD, adjusted for land-cover misclassification, is preferable to the other land-cover options for use in models of impervious surface. This result is tied to the lower classification error rates for the Level I NLCD. ?? 2007 American Society for Photogrammetry and Remote Sensing.

  16. Improved classification accuracy by feature extraction using genetic algorithms

    NASA Astrophysics Data System (ADS)

    Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.

    2003-05-01

    A feature extraction algorithm has been developed for the purposes of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence, and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators: crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature extractor derived features yielded lower error rates than using standard pulse sequences, and with features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.

  17. Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models

    PubMed Central

    Rice, John D.; Taylor, Jeremy M. G.

    2016-01-01

    One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but diers from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492

  18. Automatic classification of time-variable X-ray sources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lo, Kitty K.; Farrell, Sean; Murphy, Tara

    2014-05-01

    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the Second XMM-Newton Serendipitous Source Catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, andmore » other multi-wavelength contextual information. The 10 fold cross validation accuracy of the training data is ∼97% on a 7 class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest derived outlier measure, we identified 12 anomalous sources, of which 2XMM J180658.7–500250 appears to be the most unusual source in the sample. Its X-ray spectra is suggestive of a ultraluminous X-ray source but its variability makes it highly unusual. Machine-learned classification and anomaly detection will facilitate scientific discoveries in the era of all-sky surveys.« less

  19. Error detection and reduction in blood banking.

    PubMed

    Motschman, T L; Moore, S B

    1996-12-01

    Error management plays a major role in facility process improvement efforts. By detecting and reducing errors, quality and, therefore, patient care improve. It begins with a strong organizational foundation of management attitude with clear, consistent employee direction and appropriate physical facilities. Clearly defined critical processes, critical activities, and SOPs act as the framework for operations as well as active quality monitoring. To assure that personnel can detect an report errors they must be trained in both operational duties and error management practices. Use of simulated/intentional errors and incorporation of error detection into competency assessment keeps employees practiced, confident, and diminishes fear of the unknown. Personnel can clearly see that errors are indeed used as opportunities for process improvement and not for punishment. The facility must have a clearly defined and consistently used definition for reportable errors. Reportable errors should include those errors with potentially harmful outcomes as well as those errors that are "upstream," and thus further away from the outcome. A well-written error report consists of who, what, when, where, why/how, and follow-up to the error. Before correction can occur, an investigation to determine the underlying cause of the error should be undertaken. Obviously, the best corrective action is prevention. Correction can occur at five different levels; however, only three of these levels are directed at prevention. Prevention requires a method to collect and analyze data concerning errors. In the authors' facility a functional error classification method and a quality system-based classification have been useful. An active method to search for problems uncovers them further upstream, before they can have disastrous outcomes. In the continual quest for improving processes, an error management program is itself a process that needs improvement, and we must strive to always close the circle of quality assurance. Ultimately, the goal of better patient care will be the reward.

  20. PDF text classification to leverage information extraction from publication reports.

    PubMed

    Bui, Duy Duc An; Del Fiol, Guilherme; Jonnalagadda, Siddhartha

    2016-06-01

    Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task, however majority of IE systems were not designed to work on Portable Document Format (PDF) document, an important and common extraction source for systematic review. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which add challenges to the underlining natural language processing algorithm. Our goal is to categorize PDF texts for strategic use by IE systems. We used an open-source tool to extract raw texts from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, and compared it with machine learning classifier, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p<0.001) higher than the best performing machine learning classifier that used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and MEDADATA (+14.2%). In addition, use of the algorithm to filter semi-structured texts and publication metadata improved performance of the outcome extraction system (F-measure +4.1%, p=0.002). It also reduced of number of sentences to be processed by 44.9% (p<0.001), which corresponds to a processing time reduction of 50% (p=0.005). The rule-based multi-pass sieve framework can be used effectively in categorizing texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Simulation of seagrass bed mapping by satellite images based on the radiative transfer model

    NASA Astrophysics Data System (ADS)

    Sagawa, Tatsuyuki; Komatsu, Teruhisa

    2015-06-01

    Seagrass and seaweed beds play important roles in coastal marine ecosystems. They are food sources and habitats for many marine organisms, and influence the physical, chemical, and biological environment. They are sensitive to human impacts such as reclamation and pollution. Therefore, their management and preservation are necessary for a healthy coastal environment. Satellite remote sensing is a useful tool for mapping and monitoring seagrass beds. The efficiency of seagrass mapping, seagrass bed classification in particular, has been evaluated by mapping accuracy using an error matrix. However, mapping accuracies are influenced by coastal environments such as seawater transparency, bathymetry, and substrate type. Coastal management requires sufficient accuracy and an understanding of mapping limitations for monitoring coastal habitats including seagrass beds. Previous studies are mainly based on case studies in specific regions and seasons. Extensive data are required to generalise assessments of classification accuracy from case studies, which has proven difficult. This study aims to build a simulator based on a radiative transfer model to produce modelled satellite images and assess the visual detectability of seagrass beds under different transparencies and seagrass coverages, as well as to examine mapping limitations and classification accuracy. Our simulations led to the development of a model of water transparency and the mapping of depth limits and indicated the possibility for seagrass density mapping under certain ideal conditions. The results show that modelling satellite images is useful in evaluating the accuracy of classification and that establishing seagrass bed monitoring by remote sensing is a reliable tool.

  2. Modified Bat Algorithm for Feature Selection with the Wisconsin Diagnosis Breast Cancer (WDBC) Dataset

    PubMed

    Jeyasingh, Suganthi; Veluchamy, Malathi

    2017-05-01

    Early diagnosis of breast cancer is essential to save lives of patients. Usually, medical datasets include a large variety of data that can lead to confusion during diagnosis. The Knowledge Discovery on Database (KDD) process helps to improve efficiency. It requires elimination of inappropriate and repeated data from the dataset before final diagnosis. This can be done using any of the feature selection algorithms available in data mining. Feature selection is considered as a vital step to increase the classification accuracy. This paper proposes a Modified Bat Algorithm (MBA) for feature selection to eliminate irrelevant features from an original dataset. The Bat algorithm was modified using simple random sampling to select the random instances from the dataset. Ranking was with the global best features to recognize the predominant features available in the dataset. The selected features are used to train a Random Forest (RF) classification algorithm. The MBA feature selection algorithm enhanced the classification accuracy of RF in identifying the occurrence of breast cancer. The Wisconsin Diagnosis Breast Cancer Dataset (WDBC) was used for estimating the performance analysis of the proposed MBA feature selection algorithm. The proposed algorithm achieved better performance in terms of Kappa statistic, Mathew’s Correlation Coefficient, Precision, F-measure, Recall, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE). Creative Commons Attribution License

  3. Simulated rRNA/DNA Ratios Show Potential To Misclassify Active Populations as Dormant

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steven, Blaire; Hesse, Cedar; Soghigian, John

    The use of rRNA/DNA ratios derived from surveys of rRNA sequences in RNA and DNA extracts is an appealing but poorly validated approach to infer the activity status of environmental microbes. To improve the interpretation of rRNA/DNA ratios, we performed simulations to investigate the effects of community structure, rRNA amplification, and sampling depth on the accuracy of rRNA/DNA ratios in classifying bacterial populations as “active” or “dormant.” Community structure was an insignificant factor. In contrast, the extent of rRNA amplification that occurs as cells transition from dormant to growing had a significant effect (P < 0.0001) on classification accuracy, withmore » misclassification errors ranging from 16 to 28%, depending on the rRNA amplification model. The error rate increased to 47% when communities included a mixture of rRNA amplification models, but most of the inflated error was false negatives (i.e., active populations misclassified as dormant). Sampling depth also affected error rates (P < 0.001). Inadequate sampling depth produced various artifacts that are characteristic of rRNA/DNA ratios generated from real communities. These data show important constraints on the use of rRNA/DNA ratios to infer activity status. Whereas classification of populations as active based on rRNA/DNA ratios appears generally valid, classification of populations as dormant is potentially far less accurate.« less

  4. Simulated rRNA/DNA Ratios Show Potential To Misclassify Active Populations as Dormant

    DOE PAGES

    Steven, Blaire; Hesse, Cedar; Soghigian, John; ...

    2017-03-31

    The use of rRNA/DNA ratios derived from surveys of rRNA sequences in RNA and DNA extracts is an appealing but poorly validated approach to infer the activity status of environmental microbes. To improve the interpretation of rRNA/DNA ratios, we performed simulations to investigate the effects of community structure, rRNA amplification, and sampling depth on the accuracy of rRNA/DNA ratios in classifying bacterial populations as “active” or “dormant.” Community structure was an insignificant factor. In contrast, the extent of rRNA amplification that occurs as cells transition from dormant to growing had a significant effect (P < 0.0001) on classification accuracy, withmore » misclassification errors ranging from 16 to 28%, depending on the rRNA amplification model. The error rate increased to 47% when communities included a mixture of rRNA amplification models, but most of the inflated error was false negatives (i.e., active populations misclassified as dormant). Sampling depth also affected error rates (P < 0.001). Inadequate sampling depth produced various artifacts that are characteristic of rRNA/DNA ratios generated from real communities. These data show important constraints on the use of rRNA/DNA ratios to infer activity status. Whereas classification of populations as active based on rRNA/DNA ratios appears generally valid, classification of populations as dormant is potentially far less accurate.« less

  5. Regulation of IAP (Inhibitor of Apoptosis) Gene Expression by the p53 Tumor Suppressor Protein

    DTIC Science & Technology

    2005-05-01

    adenovirus, gene therapy, polymorphism, 31 16. PRICE CODE 17. SECURITY CLASSIFICATION 18. SECURITY CLASSIFICATION 19. SECURITY CLASSIFICATION 20...averaged results of three inde- pendent experiments, with standard error. Right panel: Level of p53 in infected cells using the antibody Ab-6 (Calbiochem...with highly purified mitochondria as described in (2). The arrow marks oligomerized BAK. The right _ -. panel depicts the purity of BMH CrosIinked Mito

  6. An Analysis of U.S. Army Fratricide Incidents during the Global War on Terror (11 September 2001 to 31 March 2008)

    DTIC Science & Technology

    2010-03-15

    Swiss cheese model of human error causation. ................................................................... 3  2. Results for the classification of...based on Reason’s “ Swiss cheese ” model of human error (1990). Figure 1 describes how an accident is likely to occur when all of the errors, or “holes...align. A detailed description of HFACS can be found in Wiegmann and Shappell (2003). Figure 1. The Swiss cheese model of human error

  7. Limited-information goodness-of-fit testing of diagnostic classification item response models.

    PubMed

    Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen

    2016-11-01

    Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics such as Pearson's X 2 and the likelihood ratio statistic G 2 suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited-information fit statistics such as Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M 2 have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M 2 statistic to diagnostic classification models. Through a series of simulation studies, we found that M 2 is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q-matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M 2 was largely insensitive to misspecifications in the distribution of higher-order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M 2 , we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic XLD2 for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements. The XLD2 statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M 2 and XLD2 statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al., (2011, IJT, 11, 144). © 2016 The British Psychological Society.

  8. Object-Based Land Use Classification of Agricultural Land by Coupling Multi-Temporal Spectral Characteristics and Phenological Events in Germany

    NASA Astrophysics Data System (ADS)

    Knoefel, Patrick; Loew, Fabian; Conrad, Christopher

    2015-04-01

    Crop maps based on classification of remotely sensed data are of increased attendance in agricultural management. This induces a more detailed knowledge about the reliability of such spatial information. However, classification of agricultural land use is often limited by high spectral similarities of the studied crop types. More, spatially and temporally varying agro-ecological conditions can introduce confusion in crop mapping. Classification errors in crop maps in turn may have influence on model outputs, like agricultural production monitoring. One major goal of the PhenoS project ("Phenological structuring to determine optimal acquisition dates for Sentinel-2 data for field crop classification"), is the detection of optimal phenological time windows for land cover classification purposes. Since many crop species are spectrally highly similar, accurate classification requires the right selection of satellite images for a certain classification task. In the course of one growing season, phenological phases exist where crops are separable with higher accuracies. For this purpose, coupling of multi-temporal spectral characteristics and phenological events is promising. The focus of this study is set on the separation of spectrally similar cereal crops like winter wheat, barley, and rye of two test sites in Germany called "Harz/Central German Lowland" and "Demmin". However, this study uses object based random forest (RF) classification to investigate the impact of image acquisition frequency and timing on crop classification uncertainty by permuting all possible combinations of available RapidEye time series recorded on the test sites between 2010 and 2014. The permutations were applied to different segmentation parameters. Then, classification uncertainty was assessed and analysed, based on the probabilistic soft-output from the RF algorithm at the per-field basis. From this soft output, entropy was calculated as a spatial measure of classification uncertainty. The results indicate that uncertainty estimates provide a valuable addition to traditional accuracy assessments and helps the user to allocate error in crop maps.

  9. A new classification of glaucomas

    PubMed Central

    Bordeianu, Constantin-Dan

    2014-01-01

    Purpose To suggest a new glaucoma classification that is pathogenic, etiologic, and clinical. Methods After discussing the logical pathway used in criteria selection, the paper presents the new classification and compares it with the classification currently in use, that is, the one issued by the European Glaucoma Society in 2008. Results The paper proves that the new classification is clear (being based on a coherent and consistently followed set of criteria), is comprehensive (framing all forms of glaucoma), and helps in understanding the sickness understanding (in that it uses a logical framing system). The great advantage is that it facilitates therapeutic decision making in that it offers direct therapeutic suggestions and avoids errors leading to disasters. Moreover, the scheme remains open to any new development. Conclusion The suggested classification is a pathogenic, etiologic, and clinical classification that fulfills the conditions of an ideal classification. The suggested classification is the first classification in which the main criterion is consistently used for the first 5 to 7 crossings until its differentiation capabilities are exhausted. Then, secondary criteria (etiologic and clinical) pick up the relay until each form finds its logical place in the scheme. In order to avoid unclear aspects, the genetic criterion is no longer used, being replaced by age, one of the clinical criteria. The suggested classification brings only benefits to all categories of ophthalmologists: the beginners will have a tool to better understand the sickness and to ease their decision making, whereas the experienced doctors will have their practice simplified. For all doctors, errors leading to therapeutic disasters will be less likely to happen. Finally, researchers will have the object of their work gathered in the group of glaucoma with unknown or uncertain pathogenesis, whereas the results of their work will easily find a logical place in the scheme, as the suggested classification remains open to any new development. PMID:25246759

  10. Classification of burn wounds using support vector machines

    NASA Astrophysics Data System (ADS)

    Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose

    2004-05-01

    The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds into their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted in segmenting the burn wound from the rest of the image and classifying the burn into its depth. In this paper we focus on the classification problem only. We already proposed to use a Fuzzy-ARTMAP neural network (NN). However, we may take advantage of new powerful classification tools such as Support Vector Machines (SVM). We apply the five-folded cross validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which will give us the set of features that yields the smallest classification error for each classifier. Features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As data of the problem faced here are not linearly separable, the SVM was trained using some different kernels. The validating process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms classification results obtained with the rest of the classifiers, yielding an error classification rate of 0.7% whereas the Fuzzy-ARTMAP NN attained 1.6 %.

  11. Comparative assessment of LANDSAT-D MSS and TM data quality for mapping applications in the Southeast

    NASA Technical Reports Server (NTRS)

    1984-01-01

    Rectifications of multispectral scanner and thematic mapper data sets for full and subscene areas, analyses of planimetric errors, assessments of the number and distribution of ground control points required to minimize errors, and factors contributing to error residual are examined. Other investigations include the generation of three dimensional terrain models and the effects of spatial resolution on digital classification accuracies.

  12. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

    PubMed

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-04-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis.

  13. Influence of ECG measurement accuracy on ECG diagnostic statements.

    PubMed

    Zywietz, C; Celikag, D; Joseph, G

    1996-01-01

    Computer analysis of electrocardiograms (ECGs) provides a large amount of ECG measurement data, which may be used for diagnostic classification and storage in ECG databases. Until now, neither error limits for ECG measurements have been specified nor has their influence on diagnostic statements been systematically investigated. An analytical method is presented to estimate the influence of measurement errors on the accuracy of diagnostic ECG statements. Systematic (offset) errors will usually result in an increase of false positive or false negative statements since they cause a shift of the working point on the receiver operating characteristics curve. Measurement error dispersion broadens the distribution function of discriminative measurement parameters and, therefore, usually increases the overlap between discriminative parameters. This results in a flattening of the receiver operating characteristics curve and an increase of false positive and false negative classifications. The method developed has been applied to ECG conduction defect diagnoses by using the proposed International Electrotechnical Commission's interval measurement tolerance limits. These limits appear too large because more than 30% of false positive atrial conduction defect statements and 10-18% of false intraventricular conduction defect statements could be expected due to tolerated measurement errors. To assure long-term usability of ECG measurement databases, it is recommended that systems provide its error tolerance limits obtained on a defined test set.

  14. Decision support system for determining the contact lens for refractive errors patients with classification ID3

    NASA Astrophysics Data System (ADS)

    Situmorang, B. H.; Setiawan, M. P.; Tosida, E. T.

    2017-01-01

    Refractive errors are abnormalities of the refraction of light so that the shadows do not focus precisely on the retina resulting in blurred vision [1]. Refractive errors causing the patient should wear glasses or contact lenses in order eyesight returned to normal. The use of glasses or contact lenses in a person will be different from others, it is influenced by patient age, the amount of tear production, vision prescription, and astigmatic. Because the eye is one organ of the human body is very important to see, then the accuracy in determining glasses or contact lenses which will be used is required. This research aims to develop a decision support system that can produce output on the right contact lenses for refractive errors patients with a value of 100% accuracy. Iterative Dichotomize Three (ID3) classification methods will generate gain and entropy values of attributes that include code sample data, age of the patient, astigmatic, the ratio of tear production, vision prescription, and classes that will affect the outcome of the decision tree. The eye specialist test result for the training data obtained the accuracy rate of 96.7% and an error rate of 3.3%, the result test using confusion matrix obtained the accuracy rate of 96.1% and an error rate of 3.1%; for the data testing obtained accuracy rate of 100% and an error rate of 0.

  15. A posteriori error estimates in voice source recovery

    NASA Astrophysics Data System (ADS)

    Leonov, A. S.; Sorokin, V. N.

    2017-12-01

    The inverse problem of voice source pulse recovery from a segment of a speech signal is under consideration. A special mathematical model is used for the solution that relates these quantities. A variational method of solving inverse problem of voice source recovery for a new parametric class of sources, that is for piecewise-linear sources (PWL-sources), is proposed. Also, a technique for a posteriori numerical error estimation for obtained solutions is presented. A computer study of the adequacy of adopted speech production model with PWL-sources is performed in solving the inverse problems for various types of voice signals, as well as corresponding study of a posteriori error estimates. Numerical experiments for speech signals show satisfactory properties of proposed a posteriori error estimates, which represent the upper bounds of possible errors in solving the inverse problem. The estimate of the most probable error in determining the source-pulse shapes is about 7-8% for the investigated speech material. It is noted that a posteriori error estimates can be used as a criterion of the quality for obtained voice source pulses in application to speaker recognition.

  16. SU-E-J-125: Classification of CBCT Noises in Terms of Their Contribution to Proton Range Uncertainty

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brousmiche, S; Orban de Xivry, J; Macq, B

    2014-06-01

    Purpose: This study assesses the potential use of CBCT images in adaptive protontherapy by estimating the contribution of the main sources of noise and calibration errors to the proton range uncertainty. Methods: Measurements intended to highlight each particular source have been achieved by adapting either the testbench configuration, e.g. use of filtration, fan-beam collimation, beam stop arrays, phantoms and detector reset light, or the sequence of correction algorithms including water precorrection. Additional Monte-Carlo simulations have been performed to complement these measurements, especially for the beam hardening and the scatter cases. Simulations of proton beams penetration through the resulting images havemore » then been carried out to quantify the range change due to these effects. The particular case of a brain irradiation is considered mainly because of the multiple effects that the skull bones have on the internal soft tissues. Results: On top of the range error sources is the undercorrection of scatter. Its influence has been analyzed from a comparison of fan-beam and full axial FOV acquisitions. In this case, large range errors of about 12 mm can be reached if the assumption is made that the scatter has only a constant contribution over the projection images. Even the detector lag, which a priori induces a much smaller effect, has been shown to contribute for up to 2 mm to the overall error if its correction only aims at reducing the skin artefact. This last result can partially be understood by the larger interface between tissues and bones inside the skull. Conclusion: This study has set the basis of a more systematical analysis of the effect CBCT noise on range uncertainties based on a combination of measurements, simulations and theoretical results. With our method, even more subtle effects such as the cone-beam artifact or the detector lag can be assessed. SBR and JOR are financed by iMagX, a public-private partnership between the region Wallone of Belgium and IBA under convention #1217662.« less

  17. The generalization ability of online SVM classification based on Markov sampling.

    PubMed

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

  18. Aspen, climate, and sudden decline in western USA

    Treesearch

    Gerald E. Rehfeldt; Dennis E. Ferguson; Nicholas L. Crookston

    2009-01-01

    A bioclimate model predicting the presence or absence of aspen, Populus tremuloides, in western USA from climate variables was developed by using the Random Forests classification tree on Forest Inventory data from about 118,000 permanent sample plots. A reasonably parsimonious model used eight predictors to describe aspen's climate profile. Classification errors...

  19. Objected-oriented remote sensing image classification method based on geographic ontology model

    NASA Astrophysics Data System (ADS)

    Chu, Z.; Liu, Z. J.; Gu, H. Y.

    2016-11-01

    Nowadays, with the development of high resolution remote sensing image and the wide application of laser point cloud data, proceeding objected-oriented remote sensing classification based on the characteristic knowledge of multi-source spatial data has been an important trend on the field of remote sensing image classification, which gradually replaced the traditional method through improving algorithm to optimize image classification results. For this purpose, the paper puts forward a remote sensing image classification method that uses the he characteristic knowledge of multi-source spatial data to build the geographic ontology semantic network model, and carries out the objected-oriented classification experiment to implement urban features classification, the experiment uses protégé software which is developed by Stanford University in the United States, and intelligent image analysis software—eCognition software as the experiment platform, uses hyperspectral image and Lidar data that is obtained through flight in DaFeng City of JiangSu as the main data source, first of all, the experiment uses hyperspectral image to obtain feature knowledge of remote sensing image and related special index, the second, the experiment uses Lidar data to generate nDSM(Normalized DSM, Normalized Digital Surface Model),obtaining elevation information, the last, the experiment bases image feature knowledge, special index and elevation information to build the geographic ontology semantic network model that implement urban features classification, the experiment results show that, this method is significantly higher than the traditional classification algorithm on classification accuracy, especially it performs more evidently on the respect of building classification. The method not only considers the advantage of multi-source spatial data, for example, remote sensing image, Lidar data and so on, but also realizes multi-source spatial data knowledge integration and application of the knowledge to the field of remote sensing image classification, which provides an effective way for objected-oriented remote sensing image classification in the future.

  20. Multi-template tensor-based morphometry: Application to analysis of Alzheimer's disease

    PubMed Central

    Koikkalainen, Juha; Lötjönen, Jyrki; Thurfjell, Lennart; Rueckert, Daniel; Waldemar, Gunhild; Soininen, Hilkka

    2012-01-01

    In this paper methods for using multiple templates in tensor-based morphometry (TBM) are presented and comparedtothe conventional single-template approach. TBM analysis requires non-rigid registrations which are often subject to registration errors. When using multiple templates and, therefore, multiple registrations, it can be assumed that the registration errors are averaged and eventually compensated. Four different methods are proposed for multi-template TBM. The methods were evaluated using magnetic resonance (MR) images of healthy controls, patients with stable or progressive mild cognitive impairment (MCI), and patients with Alzheimer's disease (AD) from the ADNI database (N=772). The performance of TBM features in classifying images was evaluated both quantitatively and qualitatively. Classification results show that the multi-template methods are statistically significantly better than the single-template method. The overall classification accuracy was 86.0% for the classification of control and AD subjects, and 72.1%for the classification of stable and progressive MCI subjects. The statistical group-level difference maps produced using multi-template TBM were smoother, formed larger continuous regions, and had larger t-values than the maps obtained with single-template TBM. PMID:21419228

  1. Predicting alpine headwater stream intermittency: a case study in the northern Rocky Mountains

    USGS Publications Warehouse

    Sando, Thomas R.; Blasch, Kyle W.

    2015-01-01

    This investigation used climatic, geological, and environmental data coupled with observational stream intermittency data to predict alpine headwater stream intermittency. Prediction was made using a random forest classification model. Results showed that the most important variables in the prediction model were snowpack persistence, represented by average snow extent from March through July, mean annual mean monthly minimum temperature, and surface geology types. For stream catchments with intermittent headwater streams, snowpack, on average, persisted until early June, whereas for stream catchments with perennial headwater streams, snowpack, on average, persisted until early July. Additionally, on average, stream catchments with intermittent headwater streams were about 0.7 °C warmer than stream catchments with perennial headwater streams. Finally, headwater stream catchments primarily underlain by coarse, permeable sediment are significantly more likely to have intermittent headwater streams than those primarily underlain by impermeable bedrock. Comparison of the predicted streamflow classification with observed stream status indicated a four percent classification error for first-order streams and a 21 percent classification error for all stream orders in the study area.

  2. A framework for software fault tolerance in real-time systems

    NASA Technical Reports Server (NTRS)

    Anderson, T.; Knight, J. C.

    1983-01-01

    A classification scheme for errors and a technique for the provision of software fault tolerance in cyclic real-time systems is presented. The technique requires that the process structure of a system be represented by a synchronization graph which is used by an executive as a specification of the relative times at which they will communicate during execution. Communication between concurrent processes is severely limited and may only take place between processes engaged in an exchange. A history of error occurrences is maintained by an error handler. When an error is detected, the error handler classifies it using the error history information and then initiates appropriate recovery action.

  3. Software platform for managing the classification of error- related potentials of observers

    NASA Astrophysics Data System (ADS)

    Asvestas, P.; Ventouras, E.-C.; Kostopoulos, S.; Sidiropoulos, K.; Korfiatis, V.; Korda, A.; Uzunolglu, A.; Karanasiou, I.; Kalatzis, I.; Matsopoulos, G.

    2015-09-01

    Human learning is partly based on observation. Electroencephalographic recordings of subjects who perform acts (actors) or observe actors (observers), contain a negative waveform in the Evoked Potentials (EPs) of the actors that commit errors and of observers who observe the error-committing actors. This waveform is called the Error-Related Negativity (ERN). Its detection has applications in the context of Brain-Computer Interfaces. The present work describes a software system developed for managing EPs of observers, with the aim of classifying them into observations of either correct or incorrect actions. It consists of an integrated platform for the storage, management, processing and classification of EPs recorded during error-observation experiments. The system was developed using C# and the following development tools and frameworks: MySQL, .NET Framework, Entity Framework and Emgu CV, for interfacing with the machine learning library of OpenCV. Up to six features can be computed per EP recording per electrode. The user can select among various feature selection algorithms and then proceed to train one of three types of classifiers: Artificial Neural Networks, Support Vector Machines, k-nearest neighbour. Next the classifier can be used for classifying any EP curve that has been inputted to the database.

  4. R package PRIMsrc: Bump Hunting by Patient Rule Induction Method for Survival, Regression and Classification

    PubMed Central

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil

    2015-01-01

    PRIMsrc is a novel implementation of a non-parametric bump hunting procedure, based on the Patient Rule Induction Method (PRIM), offering a unified treatment of outcome variables, including censored time-to-event (Survival), continuous (Regression) and discrete (Classification) responses. To fit the model, it uses a recursive peeling procedure with specific peeling criteria and stopping rules depending on the response. To validate the model, it provides an objective function based on prediction-error or other specific statistic, as well as two alternative cross-validation techniques, adapted to the task of decision-rule making and estimation in the three types of settings. PRIMsrc comes as an open source R package, including at this point: (i) a main function for fitting a Survival Bump Hunting model with various options allowing cross-validated model selection to control model size (#covariates) and model complexity (#peeling steps) and generation of cross-validated end-point estimates; (ii) parallel computing; (iii) various S3-generic and specific plotting functions for data visualization, diagnostic, prediction, summary and display of results. It is available on CRAN and GitHub. PMID:26798326

  5. Assessing and minimizing contamination in time of flight based validation data

    NASA Astrophysics Data System (ADS)

    Lennox, Kristin P.; Rosenfield, Paul; Blair, Brenton; Kaplan, Alan; Ruz, Jaime; Glenn, Andrew; Wurtz, Ronald

    2017-10-01

    Time of flight experiments are the gold standard method for generating labeled training and testing data for the neutron/gamma pulse shape discrimination problem. As the popularity of supervised classification methods increases in this field, there will also be increasing reliance on time of flight data for algorithm development and evaluation. However, time of flight experiments are subject to various sources of contamination that lead to neutron and gamma pulses being mislabeled. Such labeling errors have a detrimental effect on classification algorithm training and testing, and should therefore be minimized. This paper presents a method for identifying minimally contaminated data sets from time of flight experiments and estimating the residual contamination rate. This method leverages statistical models describing neutron and gamma travel time distributions and is easily implemented using existing statistical software. The method produces a set of optimal intervals that balance the trade-off between interval size and nuisance particle contamination, and its use is demonstrated on a time of flight data set for Cf-252. The particular properties of the optimal intervals for the demonstration data are explored in detail.

  6. An analysis of computer-related patient safety incidents to inform the development of a classification

    PubMed Central

    Ong, Mei-Sing; Runciman, William; Coiera, Enrico

    2010-01-01

    Objective To analyze patient safety incidents associated with computer use to develop the basis for a classification of problems reported by health professionals. Design Incidents submitted to a voluntary incident reporting database across one Australian state were retrieved and a subset (25%) was analyzed to identify ‘natural categories’ for classification. Two coders independently classified the remaining incidents into one or more categories. Free text descriptions were analyzed to identify contributing factors. Where available medical specialty, time of day and consequences were examined. Measurements Descriptive statistics; inter-rater reliability. Results A search of 42 616 incidents from 2003 to 2005 yielded 123 computer related incidents. After removing duplicate and unrelated incidents, 99 incidents describing 117 problems remained. A classification with 32 types of computer use problems was developed. Problems were grouped into information input (31%), transfer (20%), output (20%) and general technical (24%). Overall, 55% of problems were machine related and 45% were attributed to human–computer interaction. Delays in initiating and completing clinical tasks were a major consequence of machine related problems (70%) whereas rework was a major consequence of human–computer interaction problems (78%). While 38% (n=26) of the incidents were reported to have a noticeable consequence but no harm, 34% (n=23) had no noticeable consequence. Conclusion Only 0.2% of all incidents reported were computer related. Further work is required to expand our classification using incident reports and other sources of information about healthcare IT problems. Evidence based user interface design must focus on the safe entry and retrieval of clinical information and support users in detecting and correcting errors and malfunctions. PMID:20962128

  7. Analysis of the load selection on the error of source characteristics identification for an engine exhaust system

    NASA Astrophysics Data System (ADS)

    Zheng, Sifa; Liu, Haitao; Dan, Jiabi; Lian, Xiaomin

    2015-05-01

    Linear time-invariant assumption for the determination of acoustic source characteristics, the source strength and the source impedance in the frequency domain has been proved reasonable in the design of an exhaust system. Different methods have been proposed to its identification and the multi-load method is widely used for its convenience by varying the load number and impedance. Theoretical error analysis has rarely been referred to and previous results have shown an overdetermined set of open pipes can reduce the identification error. This paper contributes a theoretical error analysis for the load selection. The relationships between the error in the identification of source characteristics and the load selection were analysed. A general linear time-invariant model was built based on the four-load method. To analyse the error of the source impedance, an error estimation function was proposed. The dispersion of the source pressure was obtained by an inverse calculation as an indicator to detect the accuracy of the results. It was found that for a certain load length, the load resistance at the frequency points of one-quarter wavelength of odd multiples results in peaks and in the maximum error for source impedance identification. Therefore, the load impedance of frequency range within the one-quarter wavelength of odd multiples should not be used for source impedance identification. If the selected loads have more similar resistance values (i.e., the same order of magnitude), the identification error of the source impedance could be effectively reduced.

  8. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines

    PubMed Central

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano

    2015-01-01

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392

  9. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

    PubMed

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

    2015-06-17

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.

  10. Multi-factorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules.

    PubMed

    Khondoker, Mizanur R; Bachmann, Till T; Mewissen, Muriel; Dickinson, Paul; Dobrzelecki, Bartosz; Campbell, Colin J; Mount, Andrew R; Walton, Anthony J; Crain, Jason; Schulze, Holger; Giraud, Gerard; Ross, Alan J; Ciani, Ilenia; Ember, Stuart W J; Tlili, Chaker; Terry, Jonathan G; Grant, Eilidh; McDonnell, Nicola; Ghazal, Peter

    2010-12-01

    Machine learning and statistical model based classifiers have increasingly been used with more complex and high dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive, under investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray based data characteristics on the predictive performance for various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. Optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble that of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network (http://www.cran.r-project.org/web/packages/optBiomarker/).

  11. New low-resolution spectrometer spectra for IRAS sources

    NASA Astrophysics Data System (ADS)

    Volk, Kevin; Kwok, Sun; Stencel, R. E.; Brugel, E.

    1991-12-01

    Low-resolution spectra of 486 IRAS point sources with Fnu(12 microns) in the range 20-40 Jy are presented. This is part of an effort to extract and classify spectra that were not included in the Atlas of Low-Resolution Spectra and represents an extension of the earlier work by Volk and Cohen which covers sources with Fnu(12 microns) greater than 40 Jy. The spectra have been examined by eye and classified into nine groups based on the spectral morphology. This new classification scheme is compared with the mechanical classification of the Atlas, and the differences are noted. Oxygen-rich stars of the asymptotic giant branch make up 33 percent of the sample. Solid state features dominate the spectra of most sources. It is found that the nature of the sources as implied by the present spectral classification is consistent with the classifications based on broad-band colors of the sources.

  12. Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography

    PubMed Central

    Umut, İlhan; Çentik, Güven

    2016-01-01

    The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present. PMID:27213008

  13. Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography.

    PubMed

    Umut, İlhan; Çentik, Güven

    2016-01-01

    The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present.

  14. Effects of stress typicality during speeded grammatical classification.

    PubMed

    Arciuli, Joanne; Cupples, Linda

    2003-01-01

    The experiments reported here were designed to investigate the influence of stress typicality during speeded grammatical classification of disyllabic English words by native and non-native speakers. Trochaic nouns and iambic gram verbs were considered to be typically stressed, whereas iambic nouns and trochaic verbs were considered to be atypically stressed. Experiments 1a and 2a showed that while native speakers classified typically stressed words individual more quickly and more accurately than atypically stressed words during differences reading, there were no overall effects during classification of spoken stimuli. However, a subgroup of native speakers with high error rates did show a significant effect during classification of spoken stimuli. Experiments 1b and 2b showed that non-native speakers classified typically stressed words more quickly and more accurately than atypically stressed words during reading. Typically stressed words were classified more accurately than atypically stressed words when the stimuli were spoken. Importantly, there was a significant relationship between error rates, vocabulary size and the size of the stress typicality effect in each experiment. We conclude that participants use information about lexical stress to help them distinguish between disyllabic nouns and verbs during speeded grammatical classification. This is especially so for individuals with a limited vocabulary who lack other knowledge (e.g., semantic knowledge) about the differences between these grammatical categories.

  15. Uncertainty Analyses for Back Projection Methods

    NASA Astrophysics Data System (ADS)

    Zeng, H.; Wei, S.; Wu, W.

    2017-12-01

    So far few comprehensive error analyses for back projection methods have been conducted, although it is evident that high frequency seismic waves can be easily affected by earthquake depth, focal mechanisms and the Earth's 3D structures. Here we perform 1D and 3D synthetic tests for two back projection methods, MUltiple SIgnal Classification (MUSIC) (Meng et al., 2011) and Compressive Sensing (CS) (Yao et al., 2011). We generate synthetics for both point sources and finite rupture sources with different depths, focal mechanisms, as well as 1D and 3D structures in the source region. The 3D synthetics are generated through a hybrid scheme of Direct Solution Method and Spectral Element Method. Then we back project the synthetic data using MUSIC and CS. The synthetic tests show that the depth phases can be back projected as artificial sources both in space and time. For instance, for a source depth of 10km, back projection gives a strong signal 8km away from the true source. Such bias increases with depth, e.g., the error of horizontal location could be larger than 20km for a depth of 40km. If the array is located around the nodal direction of direct P-waves the teleseismic P-waves are dominated by the depth phases. Therefore, back projections are actually imaging the reflection points of depth phases more than the rupture front. Besides depth phases, the strong and long lasted coda waves due to 3D effects near trench can lead to additional complexities tested here. The strength contrast of different frequency contents in the rupture models also produces some variations to the back projection results. In the synthetic tests, MUSIC and CS derive consistent results. While MUSIC is more computationally efficient, CS works better for sparse arrays. In summary, our analyses indicate that the impact of various factors mentioned above should be taken into consideration when interpreting back projection images, before we can use them to infer the earthquake rupture physics.

  16. Measures of Linguistic Accuracy in Second Language Writing Research.

    ERIC Educational Resources Information Center

    Polio, Charlene G.

    1997-01-01

    Investigates the reliability of measures of linguistic accuracy in second language writing. The study uses a holistic scale, error-free T-units, and an error classification system on the essays of English-as-a-Second-Language students and discusses why disagreements arise within a rater and between raters. (24 references) (Author/CK)

  17. Classification accuracy for stratification with remotely sensed data

    Treesearch

    Raymond L. Czaplewski; Paul L. Patterson

    2003-01-01

    Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an “error matrix,” which is...

  18. Comparison of wheat classification accuracy using different classifiers of the image-100 system

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.

    1981-01-01

    Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.

  19. Identification of spilled oils by NIR spectroscopy technology based on KPCA and LSSVM

    NASA Astrophysics Data System (ADS)

    Tan, Ailing; Bi, Weihong

    2011-08-01

    Oil spills on the sea surface are seen relatively often with the development of the petroleum exploitation and transportation of the sea. Oil spills are great threat to the marine environment and the ecosystem, thus the oil pollution in the ocean becomes an urgent topic in the environmental protection. To develop the oil spill accident treatment program and track the source of the spilled oils, a novel qualitative identification method combined Kernel Principal Component Analysis (KPCA) and Least Square Support Vector Machine (LSSVM) was proposed. The proposed method adapt Fourier transform NIR spectrophotometer to collect the NIR spectral data of simulated gasoline, diesel fuel and kerosene oil spills samples and do some pretreatments to the original spectrum. We use the KPCA algorithm which is an extension of Principal Component Analysis (PCA) using techniques of kernel methods to extract nonlinear features of the preprocessed spectrum. Support Vector Machines (SVM) is a powerful methodology for solving spectral classification tasks in chemometrics. LSSVM are reformulations to the standard SVMs which lead to solving a system of linear equations. So a LSSVM multiclass classification model was designed which using Error Correcting Output Code (ECOC) method borrowing the idea of error correcting codes used for correcting bit errors in transmission channels. The most common and reliable approach to parameter selection is to decide on parameter ranges, and to then do a grid search over the parameter space to find the optimal model parameters. To test the proposed method, 375 spilled oil samples of unknown type were selected to study. The optimal model has the best identification capabilities with the accuracy of 97.8%. Experimental results show that the proposed KPCA plus LSSVM qualitative analysis method of near infrared spectroscopy has good recognition result, which could work as a new method for rapid identification of spilled oils.

  20. Ensemble of classifiers for confidence-rated classification of NDE signal

    NASA Astrophysics Data System (ADS)

    Banerjee, Portia; Safdarnejad, Seyed; Udpa, Lalita; Udpa, Satish

    2016-02-01

    Ensemble of classifiers in general, aims to improve classification accuracy by combining results from multiple weak hypotheses into a single strong classifier through weighted majority voting. Improved versions of ensemble of classifiers generate self-rated confidence scores which estimate the reliability of each of its prediction and boost the classifier using these confidence-rated predictions. However, such a confidence metric is based only on the rate of correct classification. In existing works, although ensemble of classifiers has been widely used in computational intelligence, the effect of all factors of unreliability on the confidence of classification is highly overlooked. With relevance to NDE, classification results are affected by inherent ambiguity of classifica-tion, non-discriminative features, inadequate training samples and noise due to measurement. In this paper, we extend the existing ensemble classification by maximizing confidence of every classification decision in addition to minimizing the classification error. Initial results of the approach on data from eddy current inspection show improvement in classification performance of defect and non-defect indications.

  1. Predictors of Errors of Novice Java Programmers

    ERIC Educational Resources Information Center

    Bringula, Rex P.; Manabat, Geecee Maybelline A.; Tolentino, Miguel Angelo A.; Torres, Edmon L.

    2012-01-01

    This descriptive study determined which of the sources of errors would predict the errors committed by novice Java programmers. Descriptive statistics revealed that the respondents perceived that they committed the identified eighteen errors infrequently. Thought error was perceived to be the main source of error during the laboratory programming…

  2. On the statistical assessment of classifiers using DNA microarray data

    PubMed Central

    Ancona, N; Maglietta, R; Piepoli, A; D'Addabbo, A; Cotugno, R; Savino, M; Liuni, S; Carella, M; Pesole, G; Perri, F

    2006-01-01

    Background In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia – Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependant on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037) respectively. Moreover, the error rate decreases as the training set size increases, reaching its best performances with 35 training examples. In this case, RLS and SVM have error rates of e = 14% (p = 0.027) and e = 11% (p = 0.019). Concerning the number of genes, we found about 6000 genes (p < 0.05) correlated with the pathology, resulting from the signal-to-noise statistic. Moreover the performances of RLS and SVM classifiers do not change when 74% of genes is used. They progressively reduce up to e = 16% (p < 0.05) when only 2 genes are employed. The biological relevance of a set of genes determined by our statistical analysis and the major roles they play in colorectal tumorigenesis is discussed. Conclusions The method proposed provides statistically significant answers to precise questions relevant for the diagnosis and prognosis of cancer. We found that, with as few as 15 examples, it is possible to train statistically significant classifiers for colon cancer diagnosis. As for the definition of the number of genes sufficient for a reliable classification of colon cancer, our results suggest that it depends on the accuracy required. PMID:16919171

  3. Speech variability effects on recognition accuracy associated with concurrent task performance by pilots

    NASA Technical Reports Server (NTRS)

    Simpson, C. A.

    1985-01-01

    In the present study of the responses of pairs of pilots to aircraft warning classification tasks using an isolated word, speaker-dependent speech recognition system, the induced stress was manipulated by means of different scoring procedures for the classification task and by the inclusion of a competitive manual control task. Both speech patterns and recognition accuracy were analyzed, and recognition errors were recorded by type for an isolated word speaker-dependent system and by an offline technique for a connected word speaker-dependent system. While errors increased with task loading for the isolated word system, there was no such effect for task loading in the case of the connected word system.

  4. Robust Transmission of H.264/AVC Streams Using Adaptive Group Slicing and Unequal Error Protection

    NASA Astrophysics Data System (ADS)

    Thomos, Nikolaos; Argyropoulos, Savvas; Boulgouris, Nikolaos V.; Strintzis, Michael G.

    2006-12-01

    We present a novel scheme for the transmission of H.264/AVC video streams over lossy packet networks. The proposed scheme exploits the error-resilient features of H.264/AVC codec and employs Reed-Solomon codes to protect effectively the streams. A novel technique for adaptive classification of macroblocks into three slice groups is also proposed. The optimal classification of macroblocks and the optimal channel rate allocation are achieved by iterating two interdependent steps. Dynamic programming techniques are used for the channel rate allocation process in order to reduce complexity. Simulations clearly demonstrate the superiority of the proposed method over other recent algorithms for transmission of H.264/AVC streams.

  5. Harmonization of forest disturbance datasets of the conterminous USA from 1986 to 2011

    USGS Publications Warehouse

    Soulard, Christopher E.; Acevedo, William; Cohen, Warren B.; Yang, Zhiqiang; Stehman, Stephen V.; Taylor, Janis L.

    2017-01-01

    Several spatial forest disturbance datasets exist for the conterminous USA. The major problem with forest disturbance mapping is that variability between map products leads to uncertainty regarding the actual rate of disturbance. In this article, harmonized maps were produced from multiple data sources (i.e., Global Forest Change, LANDFIRE Vegetation Disturbance, National Land Cover Database, Vegetation Change Tracker, and Web-Enabled Landsat Data). The harmonization process involved fitting common class ontologies and determining spatial congruency to produce forest disturbance maps for four time intervals (1986–1992, 1992–2001, 2001–2006, and 2006–2011). Pixels mapped as disturbed for two or more datasets were labeled as disturbed in the harmonized maps. The primary advantage gained by harmonization was improvement in commission error rates relative to the individual disturbance products. Disturbance omission errors were high for both harmonized and individual forest disturbance maps due to underlying limitations in mapping subtle disturbances with Landsat classification algorithms. To enhance the value of the harmonized disturbance products, we used fire perimeter maps to add information on the cause of disturbance.

  6. Harmonization of forest disturbance datasets of the conterminous USA from 1986 to 2011.

    PubMed

    Soulard, Christopher E; Acevedo, William; Cohen, Warren B; Yang, Zhiqiang; Stehman, Stephen V; Taylor, Janis L

    2017-04-01

    Several spatial forest disturbance datasets exist for the conterminous USA. The major problem with forest disturbance mapping is that variability between map products leads to uncertainty regarding the actual rate of disturbance. In this article, harmonized maps were produced from multiple data sources (i.e., Global Forest Change, LANDFIRE Vegetation Disturbance, National Land Cover Database, Vegetation Change Tracker, and Web-Enabled Landsat Data). The harmonization process involved fitting common class ontologies and determining spatial congruency to produce forest disturbance maps for four time intervals (1986-1992, 1992-2001, 2001-2006, and 2006-2011). Pixels mapped as disturbed for two or more datasets were labeled as disturbed in the harmonized maps. The primary advantage gained by harmonization was improvement in commission error rates relative to the individual disturbance products. Disturbance omission errors were high for both harmonized and individual forest disturbance maps due to underlying limitations in mapping subtle disturbances with Landsat classification algorithms. To enhance the value of the harmonized disturbance products, we used fire perimeter maps to add information on the cause of disturbance.

  7. Fact or fallacy? Immunisation arguments in the New Zealand print media.

    PubMed

    Petousis-Harris, Helen A; Goodyear-Smith, Felicity A; Kameshwar, Kamya; Turner, Nikki

    2010-10-01

    To explore New Zealand's four major daily newspapers' coverage of immunisation with regards to errors of fact and fallacy in construction of immunisation-related arguments. All articles from 2002 to 2007 were assessed for errors of fact and logic. Fact was defined as that which was supported by the most current evidence-based medical literature. Errors of logic were assessed using a classical taxonomy broadly based in Aristotle's classifications. Numerous errors of both fact and logic were identified, predominantly used by anti-immunisation proponents, but occasionally by health authorities. The proportion of media articles reporting exclusively fact changes over time during the life of a vaccine where new vaccines incur little fallacious reporting and established vaccines generate inaccurate claims. Fallacious arguments can be deconstructed and classified into a classical taxonomy including non sequitur and argumentum ad Hominem. Most media 'balance' given to immunisation relies on 'he said, she said' arguments using quotes from opposing spokespersons with a failure to verify the scientific validity of both the material and the source. Health professionals and media need training so that recognising and critiquing public health arguments becomes accepted practice: stronger public relations strategies should challenge poor quality articles to journalists' code of ethics and the health sector needs to be proactive in predicting and pre-empting the expected responses to introduction of new public health initiatives such as a new vaccine. © 2010 The Authors. Journal Compilation © 2010 Public Health Association of Australia.

  8. Bayesian Network Structure Learning for Urban Land Use Classification from Landsat ETM+ and Ancillary Data

    NASA Astrophysics Data System (ADS)

    Park, M.; Stenstrom, M. K.

    2004-12-01

    Recognizing urban information from the satellite imagery is problematic due to the diverse features and dynamic changes of urban landuse. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed the optimized networks for urban land use classification from Landsat ETM+ images of Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables and added the links to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information of variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced with those input variables important for classification. This minimizes the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation. The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI did not much improve the overall accuracy (72%) and k coefficient (65%). Including Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not useful as locational ancillary data.

  9. Vector quantizer designs for joint compression and terrain categorization of multispectral imagery

    NASA Technical Reports Server (NTRS)

    Gorman, John D.; Lyons, Daniel F.

    1994-01-01

    Two vector quantizer designs for compression of multispectral imagery and their impact on terrain categorization performance are evaluated. The mean-squared error (MSE) and classification performance of the two quantizers are compared, and it is shown that a simple two-stage design minimizing MSE subject to a constraint on classification performance has a significantly better classification performance than a standard MSE-based tree-structured vector quantizer followed by maximum likelihood classification. This improvement in classification performance is obtained with minimal loss in MSE performance. The results show that it is advantageous to tailor compression algorithm designs to the required data exploitation tasks. Applications of joint compression/classification include compression for the archival or transmission of Landsat imagery that is later used for land utility surveys and/or radiometric analysis.

  10. The impact of catchment source group classification on the accuracy of sediment fingerprinting outputs.

    PubMed

    Pulley, Simon; Foster, Ian; Collins, Adrian L

    2017-06-01

    The objective classification of sediment source groups is at present an under-investigated aspect of source tracing studies, which has the potential to statistically improve discrimination between sediment sources and reduce uncertainty. This paper investigates this potential using three different source group classification schemes. The first classification scheme was simple surface and subsurface groupings (Scheme 1). The tracer signatures were then used in a two-step cluster analysis to identify the sediment source groupings naturally defined by the tracer signatures (Scheme 2). The cluster source groups were then modified by splitting each one into a surface and subsurface component to suit catchment management goals (Scheme 3). The schemes were tested using artificial mixtures of sediment source samples. Controlled corruptions were made to some of the mixtures to mimic the potential causes of tracer non-conservatism present when using tracers in natural fluvial environments. It was determined how accurately the known proportions of sediment sources in the mixtures were identified after unmixing modelling using the three classification schemes. The cluster analysis derived source groups (2) significantly increased tracer variability ratios (inter-/intra-source group variability) (up to 2122%, median 194%) compared to the surface and subsurface groupings (1). As a result, the composition of the artificial mixtures was identified an average of 9.8% more accurately on the 0-100% contribution scale. It was found that the cluster groups could be reclassified into a surface and subsurface component (3) with no significant increase in composite uncertainty (a 0.1% increase over Scheme 2). The far smaller effects of simulated tracer non-conservatism for the cluster analysis based schemes (2 and 3) was primarily attributed to the increased inter-group variability producing a far larger sediment source signal that the non-conservatism noise (1). Modified cluster analysis based classification methods have the potential to reduce composite uncertainty significantly in future source tracing studies. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Medication errors: definitions and classification

    PubMed Central

    Aronson, Jeffrey K

    2009-01-01

    To understand medication errors and to identify preventive strategies, we need to classify them and define the terms that describe them. The four main approaches to defining technical terms consider etymology, usage, previous definitions, and the Ramsey–Lewis method (based on an understanding of theory and practice). A medication error is ‘a failure in the treatment process that leads to, or has the potential to lead to, harm to the patient’. Prescribing faults, a subset of medication errors, should be distinguished from prescription errors. A prescribing fault is ‘a failure in the prescribing [decision-making] process that leads to, or has the potential to lead to, harm to the patient’. The converse of this, ‘balanced prescribing’ is ‘the use of a medicine that is appropriate to the patient's condition and, within the limits created by the uncertainty that attends therapeutic decisions, in a dosage regimen that optimizes the balance of benefit to harm’. This excludes all forms of prescribing faults, such as irrational, inappropriate, and ineffective prescribing, underprescribing and overprescribing. A prescription error is ‘a failure in the prescription writing process that results in a wrong instruction about one or more of the normal features of a prescription’. The ‘normal features’ include the identity of the recipient, the identity of the drug, the formulation, dose, route, timing, frequency, and duration of administration. Medication errors can be classified, invoking psychological theory, as knowledge-based mistakes, rule-based mistakes, action-based slips, and memory-based lapses. This classification informs preventive strategies. PMID:19594526

  12. 3D Complex: A Structural Classification of Protein Complexes

    PubMed Central

    Levy, Emmanuel D; Pereira-Leal, Jose B; Chothia, Cyrus; Teichmann, Sarah A

    2006-01-01

    Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes. PMID:17112313

  13. Linear and Order Statistics Combiners for Pattern Classification

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)

    2001-01-01

    Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that to a first order approximation, the error rate obtained over and above the Bayes error rate, is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging. the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.

  14. Searching for Unresolved Binary Brown Dwarfs

    NASA Astrophysics Data System (ADS)

    Albretsen, Jacob; Stephens, Denise

    2007-10-01

    There are currently L and T brown dwarfs (BDs) with errors in their classification of +/- 1 to 2 spectra types. Metallicity and gravitational differences have accounted for some of these discrepancies, and recent studies have shown unresolved binary BDs may offer some explanation as well. However limitations in technology and resources often make it difficult to clearly resolve an object that may be binary in nature. Stephens and Noll (2006) identified statistically strong binary source candidates from Hubble Space Telescope (HST) images of Trans-Neptunian Objects (TNOs) that were apparently unresolved using model point-spread functions for single and binary sources. The HST archive contains numerous observations of BDs using the Near Infrared Camera and Multi-Object Spectrometer (NICMOS) that have never been rigorously analyzed for binary properties. Using methods developed by Stephens and Noll (2006), BD observations from the HST data archive are being analyzed for possible unresolved binaries. Preliminary results will be presented. This technique will identify potential candidates for future observations to determine orbital information.

  15. Quantifying the Contributions of Environmental Parameters to Ceres Surface Net Radiation Error in China

    NASA Astrophysics Data System (ADS)

    Pan, X.; Yang, Y.; Liu, Y.; Fan, X.; Shan, L.; Zhang, X.

    2018-04-01

    Error source analyses are critical for the satellite-retrieved surface net radiation (Rn) products. In this study, we evaluate the Rn error sources in the Clouds and the Earth's Radiant Energy System (CERES) project at 43 sites from July in 2007 to December in 2007 in China. The results show that cloud fraction (CF), land surface temperature (LST), atmospheric temperature (AT) and algorithm error dominate the Rn error, with error contributions of -20, 15, 10 and 10 W/m2 (net shortwave (NSW)/longwave (NLW) radiation), respectively. For NSW, the dominant error source is algorithm error (more than 10 W/m2), particularly in spring and summer with abundant cloud. For NLW, due to the high sensitivity of algorithm and large LST/CF error, LST and CF are the largest error sources, especially in northern China. The AT influences the NLW error large in southern China because of the large AT error in there. The total precipitable water has weak influence on Rn error even with the high sensitivity of algorithm. In order to improve Rn quality, CF and LST (AT) error in northern (southern) China should be decreased.

  16. Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification.

    PubMed

    Rueckauer, Bodo; Lungu, Iulia-Alexandra; Hu, Yuhuang; Pfeiffer, Michael; Liu, Shih-Chii

    2017-01-01

    Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch-normalization and Inception-modules. This paper presents spiking equivalents of these operations therefore allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs in particular when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications.

  17. Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification

    PubMed Central

    Rueckauer, Bodo; Lungu, Iulia-Alexandra; Hu, Yuhuang; Pfeiffer, Michael; Liu, Shih-Chii

    2017-01-01

    Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch-normalization and Inception-modules. This paper presents spiking equivalents of these operations therefore allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs in particular when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications. PMID:29375284

  18. 3D multi-view convolutional neural networks for lung nodule classification

    PubMed Central

    Kang, Guixia; Hou, Beibei; Zhang, Ningbo

    2017-01-01

    The 3D convolutional neural network (CNN) is able to make full use of the spatial 3D context information of lung nodules, and the multi-view strategy has been shown to be useful for improving the performance of 2D CNN in classifying lung nodules. In this paper, we explore the classification of lung nodules using the 3D multi-view convolutional neural networks (MV-CNN) with both chain architecture and directed acyclic graph architecture, including 3D Inception and 3D Inception-ResNet. All networks employ the multi-view-one-network strategy. We conduct a binary classification (benign and malignant) and a ternary classification (benign, primary malignant and metastatic malignant) on Computed Tomography (CT) images from Lung Image Database Consortium and Image Database Resource Initiative database (LIDC-IDRI). All results are obtained via 10-fold cross validation. As regards the MV-CNN with chain architecture, results show that the performance of 3D MV-CNN surpasses that of 2D MV-CNN by a significant margin. Finally, a 3D Inception network achieved an error rate of 4.59% for the binary classification and 7.70% for the ternary classification, both of which represent superior results for the corresponding task. We compare the multi-view-one-network strategy with the one-view-one-network strategy. The results reveal that the multi-view-one-network strategy can achieve a lower error rate than the one-view-one-network strategy. PMID:29145492

  19. Automatic classification for mammogram backgrounds based on bi-rads complexity definition and on a multi content analysis framework

    NASA Astrophysics Data System (ADS)

    Wu, Jie; Besnehard, Quentin; Marchessoux, Cédric

    2011-03-01

    Clinical studies for the validation of new medical imaging devices require hundreds of images. An important step in creating and tuning the study protocol is the classification of images into "difficult" and "easy" cases. This consists of classifying the image based on features like the complexity of the background, the visibility of the disease (lesions). Therefore, an automatic medical background classification tool for mammograms would help for such clinical studies. This classification tool is based on a multi-content analysis framework (MCA) which was firstly developed to recognize image content of computer screen shots. With the implementation of new texture features and a defined breast density scale, the MCA framework is able to automatically classify digital mammograms with a satisfying accuracy. BI-RADS (Breast Imaging Reporting Data System) density scale is used for grouping the mammograms, which standardizes the mammography reporting terminology and assessment and recommendation categories. Selected features are input into a decision tree classification scheme in MCA framework, which is the so called "weak classifier" (any classifier with a global error rate below 50%). With the AdaBoost iteration algorithm, these "weak classifiers" are combined into a "strong classifier" (a classifier with a low global error rate) for classifying one category. The results of classification for one "strong classifier" show the good accuracy with the high true positive rates. For the four categories the results are: TP=90.38%, TN=67.88%, FP=32.12% and FN =9.62%.

  20. Classification of electroencephalograph signals using time-frequency decomposition and linear discriminant analysis

    NASA Astrophysics Data System (ADS)

    Szuflitowska, B.; Orlowski, P.

    2017-08-01

    Automated detection system consists of two key steps: extraction of features from EEG signals and classification for detection of pathology activity. The EEG sequences were analyzed using Short-Time Fourier Transform and the classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. The classification error below 10% has been considered a success. The higher accuracy are obtained for new data of unknown classes than testing data. The methodology can be helpful in differentiation epilepsy seizure and disturbances in the EEG signal in Alzheimer's Disease.

  1. The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi

    2013-01-01

    Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…

  2. Lexical Errors in Second Language Scientific Writing: Some Conceptual Implications

    ERIC Educational Resources Information Center

    Carrió Pastor, María Luisa; Mestre-Mestre, Eva María

    2014-01-01

    Nowadays, scientific writers are required not only a thorough knowledge of their subject field, but also a sound command of English as a lingua franca. In this paper, the lexical errors produced in scientific texts written in English by non-native researchers are identified to propose a classification of the categories they contain. This study…

  3. Land use surveys by means of automatic interpretation of LANDSAT system data

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Lombardo, M. A.; Novo, E. M. L. D.; Niero, M.; Foresti, C.

    1981-01-01

    Analyses for seven land-use classes are presented. The classes are: urban area, industrial area, bare soil, cultivated area, pastureland, reforestation, and natural vegetation. The automatic classification of LANDSAT MSS data using a maximum likelihood algorithm shows a 39% average error of emission and a 3.45 error of commission for the seven classes.

  4. Anatomical and/or pathological predictors for the “incorrect” classification of red dot markers on wrist radiographs taken following trauma

    PubMed Central

    Kranz, R

    2015-01-01

    Objective: To establish the prevalence of red dot markers in a sample of wrist radiographs and to identify any anatomical and/or pathological characteristics that predict “incorrect” red dot classification. Methods: Accident and emergency (A&E) wrist cases from a digital imaging and communications in medicine/digital teaching library were examined for red dot prevalence and for the presence of several anatomical and pathological features. Binary logistic regression analyses were run to establish if any of these features were predictors of incorrect red dot classification. Results: 398 cases were analysed. Red dot was “incorrectly” classified in 8.5% of cases; 6.3% were “false negatives” (“FNs”)and 2.3% false positives (FPs) (one decimal place). Old fractures [odds ratio (OR), 5.070 (1.256–20.471)] and reported degenerative change [OR, 9.870 (2.300–42.359)] were found to predict FPs. Frykman V [OR, 9.500 (1.954–46.179)], Frykman VI [OR, 6.333 (1.205–33.283)] and non-Frykman positive abnormalities [OR, 4.597 (1.264–16.711)] predict “FNs”. Old fractures and Frykman VI were predictive of error at 90% confidence interval (CI); the rest at 95% CI. Conclusion: The five predictors of incorrect red dot classification may inform the image interpretation training of radiographers and other professionals to reduce diagnostic error. Verification with larger samples would reinforce these findings. Advances in knowledge: All healthcare providers strive to eradicate diagnostic error. By examining specific anatomical and pathological predictors on radiographs for such error, as well as extrinsic factors that may affect reporting accuracy, image interpretation training can focus on these “problem” areas and influence which radiographic abnormality detection schemes are appropriate to implement in A&E departments. PMID:25496373

  5. How Should Children with Speech Sound Disorders be Classified? A Review and Critical Evaluation of Current Classification Systems

    ERIC Educational Resources Information Center

    Waring, R.; Knight, R.

    2013-01-01

    Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…

  6. Optimization of the ANFIS using a genetic algorithm for physical work rate classification.

    PubMed

    Habibi, Ehsanollah; Salehi, Mina; Yadegarfar, Ghasem; Taheri, Ali

    2018-03-13

    Recently, a new method was proposed for physical work rate classification based on an adaptive neuro-fuzzy inference system (ANFIS). This study aims to present a genetic algorithm (GA)-optimized ANFIS model for a highly accurate classification of physical work rate. Thirty healthy men participated in this study. Directly measured heart rate and oxygen consumption of the participants in the laboratory were used for training the ANFIS classifier model in MATLAB version 8.0.0 using a hybrid algorithm. A similar process was done using the GA as an optimization technique. The accuracy, sensitivity and specificity of the ANFIS classifier model were increased successfully. The mean accuracy of the model was increased from 92.95 to 97.92%. Also, the calculated root mean square error of the model was reduced from 5.4186 to 3.1882. The maximum estimation error of the optimized ANFIS during the network testing process was ± 5%. The GA can be effectively used for ANFIS optimization and leads to an accurate classification of physical work rate. In addition to high accuracy, simple implementation and inter-individual variability consideration are two other advantages of the presented model.

  7. Galaxy Zoo 1: data release of morphological classifications for nearly 900 000 galaxies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Linott, C.; Slosar, A.; Lintott, C.

    Morphology is a powerful indicator of a galaxy's dynamical and merger history. It is strongly correlated with many physical parameters, including mass, star formation history and the distribution of mass. The Galaxy Zoo project collected simple morphological classifications of nearly 900,000 galaxies drawn from the Sloan Digital Sky Survey, contributed by hundreds of thousands of volunteers. This large number of classifications allows us to exclude classifier error, and measure the influence of subtle biases inherent in morphological classification. This paper presents the data collected by the project, alongside measures of classification accuracy and bias. The data are now publicly availablemore » and full catalogues can be downloaded in electronic format from http://data.galaxyzoo.org.« less

  8. A method for classification of multisource data using interval-valued probabilities and its application to HIRIS data

    NASA Technical Reports Server (NTRS)

    Kim, H.; Swain, P. H.

    1991-01-01

    A method of classifying multisource data in remote sensing is presented. The proposed method considers each data source as an information source providing a body of evidence, represents statistical evidence by interval-valued probabilities, and uses Dempster's rule to integrate information based on multiple data source. The method is applied to the problems of ground-cover classification of multispectral data combined with digital terrain data such as elevation, slope, and aspect. Then this method is applied to simulated 201-band High Resolution Imaging Spectrometer (HIRIS) data by dividing the dimensionally huge data source into smaller and more manageable pieces based on the global statistical correlation information. It produces higher classification accuracy than the Maximum Likelihood (ML) classification method when the Hughes phenomenon is apparent.

  9. Spotting East African mammals in open savannah from space.

    PubMed

    Yang, Zheng; Wang, Tiejun; Skidmore, Andrew K; de Leeuw, Jan; Said, Mohammed Y; Freer, Jim

    2014-01-01

    Knowledge of population dynamics is essential for managing and conserving wildlife. Traditional methods of counting wild animals such as aerial survey or ground counts not only disturb animals, but also can be labour intensive and costly. New, commercially available very high-resolution satellite images offer great potential for accurate estimates of animal abundance over large open areas. However, little research has been conducted in the area of satellite-aided wildlife census, although computer processing speeds and image analysis algorithms have vastly improved. This paper explores the possibility of detecting large animals in the open savannah of Maasai Mara National Reserve, Kenya from very high-resolution GeoEye-1 satellite images. A hybrid image classification method was employed for this specific purpose by incorporating the advantages of both pixel-based and object-based image classification approaches. This was performed in two steps: firstly, a pixel-based image classification method, i.e., artificial neural network was applied to classify potential targets with similar spectral reflectance at pixel level; and then an object-based image classification method was used to further differentiate animal targets from the surrounding landscapes through the applications of expert knowledge. As a result, the large animals in two pilot study areas were successfully detected with an average count error of 8.2%, omission error of 6.6% and commission error of 13.7%. The results of the study show for the first time that it is feasible to perform automated detection and counting of large wild animals in open savannahs from space, and therefore provide a complementary and alternative approach to the conventional wildlife survey techniques.

  10. Comparison of three methods for long-term monitoring of boreal lake area using Landsat TM and ETM+ imagery

    USGS Publications Warehouse

    Roach, Jennifer K.; Griffith, Brad; Verbyla, David

    2012-01-01

    Programs to monitor lake area change are becoming increasingly important in high latitude regions, and their development often requires evaluating tradeoffs among different approaches in terms of accuracy of measurement, consistency across multiple users over long time periods, and efficiency. We compared three supervised methods for lake classification from Landsat imagery (density slicing, classification trees, and feature extraction). The accuracy of lake area and number estimates was evaluated relative to high-resolution aerial photography acquired within two days of satellite overpasses. The shortwave infrared band 5 was better at separating surface water from nonwater when used alone than when combined with other spectral bands. The simplest of the three methods, density slicing, performed best overall. The classification tree method resulted in the most omission errors (approx. 2x), feature extraction resulted in the most commission errors (approx. 4x), and density slicing had the least directional bias (approx. half of the lakes with overestimated area and half of the lakes with underestimated area). Feature extraction was the least consistent across training sets (i.e., large standard error among different training sets). Density slicing was the best of the three at classifying small lakes as evidenced by its lower optimal minimum lake size criterion of 5850 m2 compared with the other methods (8550 m2). Contrary to conventional wisdom, the use of additional spectral bands and a more sophisticated method not only required additional processing effort but also had a cost in terms of the accuracy and consistency of lake classifications.

  11. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

    PubMed Central

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-01-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis. PMID:20011037

  12. Electroencephalography epilepsy classifications using hybrid cuckoo search and neural network

    NASA Astrophysics Data System (ADS)

    Pratiwi, A. B.; Damayanti, A.; Miswanto

    2017-07-01

    Epilepsy is a condition that affects the brain and causes repeated seizures. This seizure is episodes that can vary and nearly undetectable to long periods of vigorous shaking or brain contractions. Epilepsy often can be confirmed with an electrocephalography (EEG). Neural Networks has been used in biomedic signal analysis, it has successfully classified the biomedic signal, such as EEG signal. In this paper, a hybrid cuckoo search and neural network are used to recognize EEG signal for epilepsy classifications. The weight of the multilayer perceptron is optimized by the cuckoo search algorithm based on its error. The aim of this methods is making the network faster to obtained the local or global optimal then the process of classification become more accurate. Based on the comparison results with the traditional multilayer perceptron, the hybrid cuckoo search and multilayer perceptron provides better performance in term of error convergence and accuracy. The purpose methods give MSE 0.001 and accuracy 90.0 %.

  13. A statistical study of radio-source structure effects on astrometric very long baseline interferometry observations

    NASA Technical Reports Server (NTRS)

    Ulvestad, J. S.

    1989-01-01

    Errors from a number of sources in astrometric very long baseline interferometry (VLBI) have been reduced in recent years through a variety of methods of calibration and modeling. Such reductions have led to a situation in which the extended structure of the natural radio sources used in VLBI is a significant error source in the effort to improve the accuracy of the radio reference frame. In the past, work has been done on individual radio sources to establish the magnitude of the errors caused by their particular structures. The results of calculations on 26 radio sources are reported in which an effort is made to determine the typical delay and delay-rate errors for a number of sources having different types of structure. It is found that for single observations of the types of radio sources present in astrometric catalogs, group-delay and phase-delay scatter in the 50 to 100 psec range due to source structure can be expected at 8.4 GHz on the intercontinental baselines available in the Deep Space Network (DSN). Delay-rate scatter of approx. 5 x 10(exp -15) sec sec(exp -1) (or approx. 0.002 mm sec (exp -1) is also expected. If such errors mapped directly into source position errors, they would correspond to position uncertainties of approx. 2 to 5 nrad, similar to the best position determinations in the current JPL VLBI catalog. With the advent of wider bandwidth VLBI systems on the large DSN antennas, the system noise will be low enough so that the structure-induced errors will be a significant part of the error budget. Several possibilities for reducing the structure errors are discussed briefly, although it is likely that considerable effort will have to be devoted to the structure problem in order to reduce the typical error by a factor of two or more.

  14. 22 CFR 9.6 - Derivative classification.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 22 Foreign Relations 1 2012-04-01 2012-04-01 false Derivative classification. 9.6 Section 9.6... classification. (a) Definition. Derivative classification is the incorporating, paraphrasing, restating or... with the classification of the source material. Duplication or reproduction of existing classified...

  15. 22 CFR 9.6 - Derivative classification.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 22 Foreign Relations 1 2013-04-01 2013-04-01 false Derivative classification. 9.6 Section 9.6... classification. (a) Definition. Derivative classification is the incorporating, paraphrasing, restating or... with the classification of the source material. Duplication or reproduction of existing classified...

  16. 22 CFR 9.6 - Derivative classification.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 22 Foreign Relations 1 2014-04-01 2014-04-01 false Derivative classification. 9.6 Section 9.6... classification. (a) Definition. Derivative classification is the incorporating, paraphrasing, restating or... with the classification of the source material. Duplication or reproduction of existing classified...

  17. 22 CFR 9.6 - Derivative classification.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 22 Foreign Relations 1 2011-04-01 2011-04-01 false Derivative classification. 9.6 Section 9.6... classification. (a) Definition. Derivative classification is the incorporating, paraphrasing, restating or... with the classification of the source material. Duplication or reproduction of existing classified...

  18. North Alabama Lightning Mapping Array (LMA): VHF Source Retrieval Algorithm and Error Analyses

    NASA Technical Reports Server (NTRS)

    Koshak, W. J.; Solakiewicz, R. J.; Blakeslee, R. J.; Goodman, S. J.; Christian, H. J.; Hall, J.; Bailey, J.; Krider, E. P.; Bateman, M. G.; Boccippio, D.

    2003-01-01

    Two approaches are used to characterize how accurately the North Alabama Lightning Mapping Array (LMA) is able to locate lightning VHF sources in space and in time. The first method uses a Monte Carlo computer simulation to estimate source retrieval errors. The simulation applies a VHF source retrieval algorithm that was recently developed at the NASA Marshall Space Flight Center (MSFC) and that is similar, but not identical to, the standard New Mexico Tech retrieval algorithm. The second method uses a purely theoretical technique (i.e., chi-squared Curvature Matrix Theory) to estimate retrieval errors. Both methods assume that the LMA system has an overall rms timing error of 50 ns, but all other possible errors (e.g., multiple sources per retrieval attempt) are neglected. The detailed spatial distributions of retrieval errors are provided. Given that the two methods are completely independent of one another, it is shown that they provide remarkably similar results. However, for many source locations, the Curvature Matrix Theory produces larger altitude error estimates than the (more realistic) Monte Carlo simulation.

  19. Peculiarities of use of ECOC and AdaBoost based classifiers for thematic processing of hyperspectral data

    NASA Astrophysics Data System (ADS)

    Dementev, A. O.; Dmitriev, E. V.; Kozoderov, V. V.; Egorov, V. D.

    2017-10-01

    Hyperspectral imaging is up-to-date promising technology widely applied for the accurate thematic mapping. The presence of a large number of narrow survey channels allows us to use subtle differences in spectral characteristics of objects and to make a more detailed classification than in the case of using standard multispectral data. The difficulties encountered in the processing of hyperspectral images are usually associated with the redundancy of spectral information which leads to the problem of the curse of dimensionality. Methods currently used for recognizing objects on multispectral and hyperspectral images are usually based on standard base supervised classification algorithms of various complexity. Accuracy of these algorithms can be significantly different depending on considered classification tasks. In this paper we study the performance of ensemble classification methods for the problem of classification of the forest vegetation. Error correcting output codes and boosting are tested on artificial data and real hyperspectral images. It is demonstrates, that boosting gives more significant improvement when used with simple base classifiers. The accuracy in this case in comparable the error correcting output code (ECOC) classifier with Gaussian kernel SVM base algorithm. However the necessity of boosting ECOC with Gaussian kernel SVM is questionable. It is demonstrated, that selected ensemble classifiers allow us to recognize forest species with high enough accuracy which can be compared with ground-based forest inventory data.

  20. Error, Power, and Blind Sentinels: The Statistics of Seagrass Monitoring

    PubMed Central

    Schultz, Stewart T.; Kruschel, Claudia; Bakran-Petricioli, Tatjana; Petricioli, Donat

    2015-01-01

    We derive statistical properties of standard methods for monitoring of habitat cover worldwide, and criticize them in the context of mandated seagrass monitoring programs, as exemplified by Posidonia oceanica in the Mediterranean Sea. We report the novel result that cartographic methods with non-trivial classification errors are generally incapable of reliably detecting habitat cover losses less than about 30 to 50%, and the field labor required to increase their precision can be orders of magnitude higher than that required to estimate habitat loss directly in a field campaign. We derive a universal utility threshold of classification error in habitat maps that represents the minimum habitat map accuracy above which direct methods are superior. Widespread government reliance on blind-sentinel methods for monitoring seafloor can obscure the gradual and currently ongoing losses of benthic resources until the time has long passed for meaningful management intervention. We find two classes of methods with very high statistical power for detecting small habitat cover losses: 1) fixed-plot direct methods, which are over 100 times as efficient as direct random-plot methods in a variable habitat mosaic; and 2) remote methods with very low classification error such as geospatial underwater videography, which is an emerging, low-cost, non-destructive method for documenting small changes at millimeter visual resolution. General adoption of these methods and their further development will require a fundamental cultural change in conservation and management bodies towards the recognition and promotion of requirements of minimal statistical power and precision in the development of international goals for monitoring these valuable resources and the ecological services they provide. PMID:26367863

  1. Land use in the Paraiba Valley through remotely sensed data. [Brazil

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Lombardo, M. A.; Novo, E. M. L. D.; Niero, M.; Foresti, C.

    1980-01-01

    A methodology for land use survey was developed and land use modification rates were determined using LANDSAT imagery of the Paraiba Valley (state of Sao Paulo). Both visual and automatic interpretation methods were employed to analyze seven land use classes: urban area, industrial area, bare soil, cultivated area, pastureland, reforestation and natural vegetation. By means of visual interpretation, little spectral differences are observed among those classes. The automatic classification of LANDSAT MSS data using maximum likelihood algorithm shows a 39% average error of omission and a 3.4% error of inclusion for the seven classes. The complexity of land uses in the study area, the large spectral variations of analyzed classes, and the low resolution of LANDSAT data influenced the classification results.

  2. Multilayer perceptron, fuzzy sets, and classification

    NASA Technical Reports Server (NTRS)

    Pal, Sankar K.; Mitra, Sushmita

    1992-01-01

    A fuzzy neural network model based on the multilayer perceptron, using the back-propagation algorithm, and capable of fuzzy classification of patterns is described. The input vector consists of membership values to linguistic properties while the output vector is defined in terms of fuzzy class membership values. This allows efficient modeling of fuzzy or uncertain patterns with appropriate weights being assigned to the backpropagated errors depending upon the membership values at the corresponding outputs. During training, the learning rate is gradually decreased in discrete steps until the network converges to a minimum error solution. The effectiveness of the algorithm is demonstrated on a speech recognition problem. The results are compared with those of the conventional MLP, the Bayes classifier, and the other related models.

  3. Measurement uncertainty and feasibility study of a flush airdata system for a hypersonic flight experiment

    NASA Technical Reports Server (NTRS)

    Whitmore, Stephen A.; Moes, Timothy R.

    1994-01-01

    Presented is a feasibility and error analysis for a hypersonic flush airdata system on a hypersonic flight experiment (HYFLITE). HYFLITE heating loads make intrusive airdata measurement impractical. Although this analysis is specifically for the HYFLITE vehicle and trajectory, the problems analyzed are generally applicable to hypersonic vehicles. A layout of the flush-port matrix is shown. Surface pressures are related airdata parameters using a simple aerodynamic model. The model is linearized using small perturbations and inverted using nonlinear least-squares. Effects of various error sources on the overall uncertainty are evaluated using an error simulation. Error sources modeled include boundarylayer/viscous interactions, pneumatic lag, thermal transpiration in the sensor pressure tubing, misalignment in the matrix layout, thermal warping of the vehicle nose, sampling resolution, and transducer error. Using simulated pressure data for input to the estimation algorithm, effects caused by various error sources are analyzed by comparing estimator outputs with the original trajectory. To obtain ensemble averages the simulation is run repeatedly and output statistics are compiled. Output errors resulting from the various error sources are presented as a function of Mach number. Final uncertainties with all modeled error sources included are presented as a function of Mach number.

  4. One-Class Classification-Based Real-Time Activity Error Detection in Smart Homes.

    PubMed

    Das, Barnan; Cook, Diane J; Krishnan, Narayanan C; Schmitter-Edgecombe, Maureen

    2016-08-01

    Caring for individuals with dementia is frequently associated with extreme physical and emotional stress, which often leads to depression. Smart home technology and advances in machine learning techniques can provide innovative solutions to reduce caregiver burden. One key service that caregivers provide is prompting individuals with memory limitations to initiate and complete daily activities. We hypothesize that sensor technologies combined with machine learning techniques can automate the process of providing reminder-based interventions. The first step towards automated interventions is to detect when an individual faces difficulty with activities. We propose machine learning approaches based on one-class classification that learn normal activity patterns. When we apply these classifiers to activity patterns that were not seen before, the classifiers are able to detect activity errors, which represent potential prompt situations. We validate our approaches on smart home sensor data obtained from older adult participants, some of whom faced difficulties performing routine activities and thus committed errors.

  5. Improving the mapping of crop types in the Midwestern U.S. by fusing Landsat and MODIS satellite data

    NASA Astrophysics Data System (ADS)

    Zhu, Likai; Radeloff, Volker C.; Ives, Anthony R.

    2017-06-01

    Mapping crop types is of great importance for assessing agricultural production, land-use patterns, and the environmental effects of agriculture. Indeed, both radiometric and spatial resolution of Landsat's sensors images are optimized for cropland monitoring. However, accurate mapping of crop types requires frequent cloud-free images during the growing season, which are often not available, and this raises the question of whether Landsat data can be combined with data from other satellites. Here, our goal is to evaluate to what degree fusing Landsat with MODIS Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) data can improve crop-type classification. Choosing either one or two images from all cloud-free Landsat observations available for the Arlington Agricultural Research Station area in Wisconsin from 2010 to 2014, we generated 87 combinations of images, and used each combination as input into the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) algorithm to predict Landsat-like images at the nominal dates of each 8-day MODIS NBAR product. Both the original Landsat and STARFM-predicted images were then classified with a support vector machine (SVM), and we compared the classification errors of three scenarios: 1) classifying the one or two original Landsat images of each combination only, 2) classifying the one or two original Landsat images plus all STARFM-predicted images, and 3) classifying the one or two original Landsat images together with STARFM-predicted images for key dates. Our results indicated that using two Landsat images as the input of STARFM did not significantly improve the STARFM predictions compared to using only one, and predictions using Landsat images between July and August as input were most accurate. Including all STARFM-predicted images together with the Landsat images significantly increased average classification error by 4% points (from 21% to 25%) compared to using only Landsat images. However, incorporating only STARFM-predicted images for key dates decreased average classification error by 2% points (from 21% to 19%) compared to using only Landsat images. In particular, if only a single Landsat image was available, adding STARFM predictions for key dates significantly decreased the average classification error by 4 percentage points from 30% to 26% (p < 0.05). We conclude that adding STARFM-predicted images can be effective for improving crop-type classification when only limited Landsat observations are available, but carefully selecting images from a full set of STARFM predictions is crucial. We developed an approach to identify the optimal subsets of all STARFM predictions, which gives an alternative method of feature selection for future research.

  6. The 20th-Century Topographic Survey as Source Data for Long-Term Landscape Studies at Local and Regional Scales

    USGS Publications Warehouse

    Varanka, Dalia

    2006-01-01

    Historical topographic maps are the only systematically collected data resource covering the entire nation for long-term landscape change studies over the 20th century for geographical and environmental research. The paper discusses aspects of the historical U.S. Geological Survey topographic maps that present constraints on the design of a database for such studies. Problems involved in this approach include locating the required maps, understanding land feature classification differences between topographic vs. land use/land cover maps, the approximation of error between different map editions of the same area, and the identification of true changes on the landscape between time periods. Suggested approaches to these issues are illustrated using an example of such a study by the author.

  7. Exploring diversity in ensemble classification: Applications in large area land cover mapping

    NASA Astrophysics Data System (ADS)

    Mellor, Andrew; Boukir, Samia

    2017-07-01

    Ensemble classifiers, such as random forests, are now commonly applied in the field of remote sensing, and have been shown to perform better than single classifier systems, resulting in reduced generalisation error. Diversity across the members of ensemble classifiers is known to have a strong influence on classification performance - whereby classifier errors are uncorrelated and more uniformly distributed across ensemble members. The relationship between ensemble diversity and classification performance has not yet been fully explored in the fields of information science and machine learning and has never been examined in the field of remote sensing. This study is a novel exploration of ensemble diversity and its link to classification performance, applied to a multi-class canopy cover classification problem using random forests and multisource remote sensing and ancillary GIS data, across seven million hectares of diverse dry-sclerophyll dominated public forests in Victoria Australia. A particular emphasis is placed on analysing the relationship between ensemble diversity and ensemble margin - two key concepts in ensemble learning. The main novelty of our work is on boosting diversity by emphasizing the contribution of lower margin instances used in the learning process. Exploring the influence of tree pruning on diversity is also a new empirical analysis that contributes to a better understanding of ensemble performance. Results reveal insights into the trade-off between ensemble classification accuracy and diversity, and through the ensemble margin, demonstrate how inducing diversity by targeting lower margin training samples is a means of achieving better classifier performance for more difficult or rarer classes and reducing information redundancy in classification problems. Our findings inform strategies for collecting training data and designing and parameterising ensemble classifiers, such as random forests. This is particularly important in large area remote sensing applications, for which training data is costly and resource intensive to collect.

  8. Analysis of swallowing sounds using hidden Markov models.

    PubMed

    Aboofazeli, Mohammad; Moussavi, Zahra

    2008-04-01

    In recent years, acoustical analysis of the swallowing mechanism has received considerable attention due to its diagnostic potentials. This paper presents a hidden Markov model (HMM) based method for the swallowing sound segmentation and classification. Swallowing sound signals of 15 healthy and 11 dysphagic subjects were studied. The signals were divided into sequences of 25 ms segments each of which were represented by seven features. The sequences of features were modeled by HMMs. Trained HMMs were used for segmentation of the swallowing sounds into three distinct phases, i.e., initial quiet period, initial discrete sounds (IDS) and bolus transit sounds (BTS). Among the seven features, accuracy of segmentation by the HMM based on multi-scale product of wavelet coefficients was higher than that of the other HMMs and the linear prediction coefficient (LPC)-based HMM showed the weakest performance. In addition, HMMs were used for classification of the swallowing sounds of healthy subjects and dysphagic patients. Classification accuracy of different HMM configurations was investigated. When we increased the number of states of the HMMs from 4 to 8, the classification error gradually decreased. In most cases, classification error for N=9 was higher than that of N=8. Among the seven features used, root mean square (RMS) and waveform fractal dimension (WFD) showed the best performance in the HMM-based classification of swallowing sounds. When the sequences of the features of IDS segment were modeled separately, the accuracy reached up to 85.5%. As a second stage classification, a screening algorithm was used which correctly classified all the subjects but one healthy subject when RMS was used as characteristic feature of the swallowing sounds and the number of states was set to N=8.

  9. Determination of Barometric Altimeter Errors for the Orion Exploration Flight Test-1 Entry

    NASA Technical Reports Server (NTRS)

    Brown, Denise L.; Munoz, Jean-Philippe; Gay, Robert

    2011-01-01

    The EFT-1 mission is the unmanned flight test for the upcoming Multi-Purpose Crew Vehicle (MPCV). During entry, the EFT-1 vehicle will trigger several Landing and Recovery System (LRS) events, such as parachute deployment, based on onboard altitude information. The primary altitude source is the filtered navigation solution updated with GPS measurement data. The vehicle also has three barometric altimeters that will be used to measure atmospheric pressure during entry. In the event that GPS data is not available during entry, the altitude derived from the barometric altimeter pressure will be used to trigger chute deployment for the drogues and main parachutes. Therefore it is important to understand the impact of error sources on the pressure measured by the barometric altimeters and on the altitude derived from that pressure. There are four primary error sources impacting the sensed pressure: sensor errors, Analog to Digital conversion errors, aerodynamic errors, and atmosphere modeling errors. This last error source is induced by the conversion from pressure to altitude in the vehicle flight software, which requires an atmosphere model such as the US Standard 1976 Atmosphere model. There are several secondary error sources as well, such as waves, tides, and latencies in data transmission. Typically, for error budget calculations it is assumed that all error sources are independent, normally distributed variables. Thus, the initial approach to developing the EFT-1 barometric altimeter altitude error budget was to create an itemized error budget under these assumptions. This budget was to be verified by simulation using high fidelity models of the vehicle hardware and software. The simulation barometric altimeter model includes hardware error sources and a data-driven model of the aerodynamic errors expected to impact the pressure in the midbay compartment in which the sensors are located. The aerodynamic model includes the pressure difference between the midbay compartment and the free stream pressure as a function of altitude, oscillations in sensed pressure due to wake effects, and an acoustics model capturing fluctuations in pressure due to motion of the passive vents separating the barometric altimeters from the outside of the vehicle.

  10. Spelling in Adolescents with Dyslexia: Errors and Modes of Assessment

    ERIC Educational Resources Information Center

    Tops, Wim; Callens, Maaike; Bijn, Evi; Brysbaert, Marc

    2014-01-01

    In this study we focused on the spelling of high-functioning students with dyslexia. We made a detailed classification of the errors in a word and sentence dictation task made by 100 students with dyslexia and 100 matched control students. All participants were in the first year of their bachelor's studies and had Dutch as mother tongue. Three…

  11. Estimation of a cover-type change matrix from error-prone data

    Treesearch

    Steen Magnussen

    2009-01-01

    Coregistration and classification errors seriously compromise per-pixel estimates of land cover change. A more robust estimation of change is proposed in which adjacent pixels are grouped into 3x3 clusters and treated as a unit of observation. A complete change matrix is recovered in a two-step process. The diagonal elements of a change matrix are recovered from...

  12. Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2010-01-01

    Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…

  13. North American vegetation model for land-use planning in a changing climate: A solution to large classification problems

    Treesearch

    Gerald E. Rehfeldt; Nicholas L. Crookston; Cuauhtemoc Saenz-Romero; Elizabeth M. Campbell

    2012-01-01

    Data points intensively sampling 46 North American biomes were used to predict the geographic distribution of biomes from climate variables using the Random Forests classification tree. Techniques were incorporated to accommodate a large number of classes and to predict the future occurrence of climates beyond the contemporary climatic range of the biomes. Errors of...

  14. Bayes Error Rate Estimation Using Classifier Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2003-01-01

    The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating th is rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error, in general, yield rather weak results for small sample sizes; unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, a recombined through averaging. Second, we bolster this approach by adding an information theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that just looks at the class labels indicated by ensem ble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Problem benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.

  15. Identification of terrain cover using the optimum polarimetric classifier

    NASA Technical Reports Server (NTRS)

    Kong, J. A.; Swartz, A. A.; Yueh, H. A.; Novak, L. M.; Shin, R. T.

    1988-01-01

    A systematic approach for the identification of terrain media such as vegetation canopy, forest, and snow-covered fields is developed using the optimum polarimetric classifier. The covariance matrices for various terrain cover are computed from theoretical models of random medium by evaluating the scattering matrix elements. The optimal classification scheme makes use of a quadratic distance measure and is applied to classify a vegetation canopy consisting of both trees and grass. Experimentally measured data are used to validate the classification scheme. Analytical and Monte Carlo simulated classification errors using the fully polarimetric feature vector are compared with classification based on single features which include the phase difference between the VV and HH polarization returns. It is shown that the full polarimetric results are optimal and provide better classification performance than single feature measurements.

  16. Classifying nursing errors in clinical management within an Australian hospital.

    PubMed

    Tran, D T; Johnson, M

    2010-12-01

    Although many classification systems relating to patient safety exist, no taxonomy was identified that classified nursing errors in clinical management. To develop a classification system for nursing errors relating to clinical management (NECM taxonomy) and to describe contributing factors and patient consequences. We analysed 241 (11%) self-reported incidents relating to clinical management in nursing in a metropolitan hospital. Descriptive analysis of numeric data and content analysis of text data were undertaken to derive the NECM taxonomy, contributing factors and consequences for patients. Clinical management incidents represented 1.63 incidents per 1000 occupied bed days. The four themes of the NECM taxonomy were nursing care process (67%), communication (22%), administrative process (5%), and knowledge and skill (6%). Half of the incidents did not cause any patient harm. Contributing factors (n=111) included the following: patient clinical, social conditions and behaviours (27%); resources (22%); environment and workload (18%); other health professionals (15%); communication (13%); and nurse's knowledge and experience (5%). The NECM taxonomy provides direction to clinicians and managers on areas in clinical management that are most vulnerable to error, and therefore, priorities for system change management. Any nurses who wish to classify nursing errors relating to clinical management could use these types of errors. This study informs further research into risk management behaviour, and self-assessment tools for clinicians. Globally, nurses need to continue to monitor and act upon patient safety issues. © 2010 The Authors. International Nursing Review © 2010 International Council of Nurses.

  17. In-vivo determination of chewing patterns using FBG and artificial neural networks

    NASA Astrophysics Data System (ADS)

    Pegorini, Vinicius; Zen Karam, Leandro; Rocha Pitta, Christiano S.; Ribeiro, Richardson; Simioni Assmann, Tangriani; Cardozo da Silva, Jean Carlos; Bertotti, Fábio L.; Kalinowski, Hypolito J.; Cardoso, Rafael

    2015-09-01

    This paper reports the process of pattern classification of the chewing process of ruminants. We propose a simplified signal processing scheme for optical fiber Bragg grating (FBG) sensors based on machine learning techniques. The FBG sensors measure the biomechanical forces during jaw movements and an artificial neural network is responsible for the classification of the associated chewing pattern. In this study, three patterns associated to dietary supplement, hay and ryegrass were considered. Additionally, two other important events for ingestive behavior studies were monitored, rumination and idle period. Experimental results show that the proposed approach for pattern classification has been capable of differentiating the materials involved in the chewing process with a small classification error.

  18. Hyperspectral image classification based on local binary patterns and PCANet

    NASA Astrophysics Data System (ADS)

    Yang, Huizhen; Gao, Feng; Dong, Junyu; Yang, Yang

    2018-04-01

    Hyperspectral image classification has been well acknowledged as one of the challenging tasks of hyperspectral data processing. In this paper, we propose a novel hyperspectral image classification framework based on local binary pattern (LBP) features and PCANet. In the proposed method, linear prediction error (LPE) is first employed to select a subset of informative bands, and LBP is utilized to extract texture features. Then, spectral and texture features are stacked into a high dimensional vectors. Next, the extracted features of a specified position are transformed to a 2-D image. The obtained images of all pixels are fed into PCANet for classification. Experimental results on real hyperspectral dataset demonstrate the effectiveness of the proposed method.

  19. Optimizing pattern recognition-based control for partial-hand prosthesis application.

    PubMed

    Earley, Eric J; Adewuyi, Adenike A; Hargrove, Levi J

    2014-01-01

    Partial-hand amputees often retain good residual wrist motion, which is essential for functional activities involving use of the hand. Thus, a crucial design criterion for a myoelectric, partial-hand prosthesis control scheme is that it allows the user to retain residual wrist motion. Pattern recognition (PR) of electromyographic (EMG) signals is a well-studied method of controlling myoelectric prostheses. However, wrist motion degrades a PR system's ability to correctly predict hand-grasp patterns. We studied the effects of (1) window length and number of hand-grasps, (2) static and dynamic wrist motion, and (3) EMG muscle source on the ability of a PR-based control scheme to classify functional hand-grasp patterns. Our results show that training PR classifiers with both extrinsic and intrinsic muscle EMG yields a lower error rate than training with either group by itself (p<0.001); and that training in only variable wrist positions, with only dynamic wrist movements, or with both variable wrist positions and movements results in lower error rates than training in only the neutral wrist position (p<0.001). Finally, our results show that both an increase in window length and a decrease in the number of grasps available to the classifier significantly decrease classification error (p<0.001). These results remained consistent whether the classifier selected or maintained a hand-grasp.

  20. Multi-source feature extraction and target recognition in wireless sensor networks based on adaptive distributed wavelet compression algorithms

    NASA Astrophysics Data System (ADS)

    Hortos, William S.

    2008-04-01

    Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at participating nodes. Therefore, the feature-extraction method based on the Haar DWT is presented that employs a maximum-entropy measure to determine significant wavelet coefficients. Features are formed by calculating the energy of coefficients grouped around the competing clusters. A DWT-based feature extraction algorithm used for vehicle classification in WSNs can be enhanced by an added rule for selecting the optimal number of resolution levels to improve the correct classification rate and reduce energy consumption expended in local algorithm computations. Published field trial data for vehicular ground targets, measured with multiple sensor types, are used to evaluate the wavelet-assisted algorithms. Extracted features are used in established target recognition routines, e.g., the Bayesian minimum-error-rate classifier, to compare the effects on the classification performance of the wavelet compression. Simulations of feature sets and recognition routines at different resolution levels in target scenarios indicate the impact on classification rates, while formulas are provided to estimate reduction in resource use due to distributed compression.

  1. Influence of nuclei segmentation on breast cancer malignancy classification

    NASA Astrophysics Data System (ADS)

    Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam

    2009-02-01

    Breast Cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss a role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes four different classifiers were trained and tested with previously extracted features. The compared classifiers are Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results over the three compared approaches and leads to a good feature extraction with a lowest average error rate of 6.51% over four different classifiers. The best performance was recorded for multilayer perceptron with an error rate of 3.07% using fuzzy c-means segmentation.

  2. Differing Procedures for Recording Mortality Statistics in Scandinavia.

    PubMed

    Tøllefsen, Ingvild M; Hem, Erlend; Ekeberg, Øivind; Zahl, Per-Henrik; Helweg-Larsen, Karin

    2017-03-01

    There may be various reasons for differences in suicide rates between countries and over time within a country. One reason can be different registration practices. The purpose of this study was to describe and compare the present procedures for mortality and suicide registration in the three Scandinavian countries and to illustrate potential sources of error in the registration of suicide. Information about registration practices and classification procedures was obtained from the cause of death registers in Norway, Sweden, and Denmark. In addition, we received information from experts in the field in each country. Sweden uses event of undetermined intent more frequently than Denmark does, and Denmark more frequently than Norway. There seems to be somewhat more uncertainty among deaths classified as ill-defined and unknown cause of mortality in Norway, compared with the other two countries. Sweden performs more forensic autopsies than Norway, and Norway more than Denmark. In Denmark, in cases of a suspected unnatural manner of death, a thorough external examination of the deceased is performed. Differences in the classification of causes of death and in postmortem examinations exist in Scandinavian countries. These differences might influence the suicide statistics in Scandinavia.

  3. Sub-pixel image classification for forest types in East Texas

    NASA Astrophysics Data System (ADS)

    Westbrook, Joey

    Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster datasets was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data had the highest (65% overall) out of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent interval each to five classes with 20 percent interval each. When compared to the supervised classification which has a satisfactory overall accuracy of 90%, none of the sub-pixel classification achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.

  4. Automated Classification of ROSAT Sources Using Heterogeneous Multiwavelength Source Catalogs

    NASA Technical Reports Server (NTRS)

    McGlynn, Thomas; Suchkov, A. A.; Winter, E. L.; Hanisch, R. J.; White, R. L.; Ochsenbein, F.; Derriere, S.; Voges, W.; Corcoran, M. F.

    2004-01-01

    We describe an on-line system for automated classification of X-ray sources, ClassX, and present preliminary results of classification of the three major catalogs of ROSAT sources, RASS BSC, RASS FSC, and WGACAT, into six class categories: stars, white dwarfs, X-ray binaries, galaxies, AGNs, and clusters of galaxies. ClassX is based on a machine learning technology. It represents a system of classifiers, each classifier consisting of a considerable number of oblique decision trees. These trees are built as the classifier is 'trained' to recognize various classes of objects using a training sample of sources of known object types. Each source is characterized by a preselected set of parameters, or attributes; the same set is then used as the classifier conducts classification of sources of unknown identity. The ClassX pipeline features an automatic search for X-ray source counterparts among heterogeneous data sets in on-line data archives using Virtual Observatory protocols; it retrieves from those archives all the attributes required by the selected classifier and inputs them to the classifier. The user input to ClassX is typically a file with target coordinates, optionally complemented with target IDs. The output contains the class name, attributes, and class probabilities for all classified targets. We discuss ways to characterize and assess the classifier quality and performance and present the respective validation procedures. Based on both internal and external validation, we conclude that the ClassX classifiers yield reasonable and reliable classifications for ROSAT sources and have the potential to broaden class representation significantly for rare object types.

  5. A real-time heat strain risk classifier using heart rate and skin temperature.

    PubMed

    Buller, Mark J; Latzka, William A; Yokota, Miyo; Tharion, William J; Moran, Daniel S

    2008-12-01

    Heat injury is a real concern to workers engaged in physically demanding tasks in high heat strain environments. Several real-time physiological monitoring systems exist that can provide indices of heat strain, e.g. physiological strain index (PSI), and provide alerts to medical personnel. However, these systems depend on core temperature measurement using expensive, ingestible thermometer pills. Seeking a better solution, we suggest the use of a model which can identify the probability that individuals are 'at risk' from heat injury using non-invasive measures. The intent is for the system to identify individuals who need monitoring more closely or who should apply heat strain mitigation strategies. We generated a model that can identify 'at risk' (PSI 7.5) workers from measures of heart rate and chest skin temperature. The model was built using data from six previously published exercise studies in which some subjects wore chemical protective equipment. The model has an overall classification error rate of 10% with one false negative error (2.7%), and outperforms an earlier model and a least squares regression model with classification errors of 21% and 14%, respectively. Additionally, the model allows the classification criteria to be adjusted based on the task and acceptable level of risk. We conclude that the model could be a valuable part of a multi-faceted heat strain management system.

  6. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases

    PubMed Central

    Janowczyk, Andrew; Madabhushi, Anant

    2016-01-01

    Background: Deep learning (DL) is a representation learning approach ideally suited for image analysis challenges in digital pathology (DP). The variety of image analysis tasks in the context of DP includes detection and counting (e.g., mitotic events), segmentation (e.g., nuclei), and tissue classification (e.g., cancerous vs. non-cancerous). Unfortunately, issues with slide preparation, variations in staining and scanning across sites, and vendor platforms, as well as biological variance, such as the presentation of different grades of disease, make these image analysis tasks particularly challenging. Traditional approaches, wherein domain-specific cues are manually identified and developed into task-specific “handcrafted” features, can require extensive tuning to accommodate these variances. However, DL takes a more domain agnostic approach combining both feature discovery and implementation to maximally discriminate between the classes of interest. While DL approaches have performed well in a few DP related image analysis tasks, such as detection and tissue classification, the currently available open source tools and tutorials do not provide guidance on challenges such as (a) selecting appropriate magnification, (b) managing errors in annotations in the training (or learning) dataset, and (c) identifying a suitable training set containing information rich exemplars. These foundational concepts, which are needed to successfully translate the DL paradigm to DP tasks, are non-trivial for (i) DL experts with minimal digital histology experience, and (ii) DP and image processing experts with minimal DL experience, to derive on their own, thus meriting a dedicated tutorial. Aims: This paper investigates these concepts through seven unique DP tasks as use cases to elucidate techniques needed to produce comparable, and in many cases, superior to results from the state-of-the-art hand-crafted feature-based classification approaches. Results: Specifically, in this tutorial on DL for DP image analysis, we show how an open source framework (Caffe), with a singular network architecture, can be used to address: (a) nuclei segmentation (F-score of 0.83 across 12,000 nuclei), (b) epithelium segmentation (F-score of 0.84 across 1735 regions), (c) tubule segmentation (F-score of 0.83 from 795 tubules), (d) lymphocyte detection (F-score of 0.90 across 3064 lymphocytes), (e) mitosis detection (F-score of 0.53 across 550 mitotic events), (f) invasive ductal carcinoma detection (F-score of 0.7648 on 50 k testing patches), and (g) lymphoma classification (classification accuracy of 0.97 across 374 images). Conclusion: This paper represents the largest comprehensive study of DL approaches in DP to date, with over 1200 DP images used during evaluation. The supplemental online material that accompanies this paper consists of step-by-step instructions for the usage of the supplied source code, trained models, and input data. PMID:27563488

  7. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases.

    PubMed

    Janowczyk, Andrew; Madabhushi, Anant

    2016-01-01

    Deep learning (DL) is a representation learning approach ideally suited for image analysis challenges in digital pathology (DP). The variety of image analysis tasks in the context of DP includes detection and counting (e.g., mitotic events), segmentation (e.g., nuclei), and tissue classification (e.g., cancerous vs. non-cancerous). Unfortunately, issues with slide preparation, variations in staining and scanning across sites, and vendor platforms, as well as biological variance, such as the presentation of different grades of disease, make these image analysis tasks particularly challenging. Traditional approaches, wherein domain-specific cues are manually identified and developed into task-specific "handcrafted" features, can require extensive tuning to accommodate these variances. However, DL takes a more domain agnostic approach combining both feature discovery and implementation to maximally discriminate between the classes of interest. While DL approaches have performed well in a few DP related image analysis tasks, such as detection and tissue classification, the currently available open source tools and tutorials do not provide guidance on challenges such as (a) selecting appropriate magnification, (b) managing errors in annotations in the training (or learning) dataset, and (c) identifying a suitable training set containing information rich exemplars. These foundational concepts, which are needed to successfully translate the DL paradigm to DP tasks, are non-trivial for (i) DL experts with minimal digital histology experience, and (ii) DP and image processing experts with minimal DL experience, to derive on their own, thus meriting a dedicated tutorial. This paper investigates these concepts through seven unique DP tasks as use cases to elucidate techniques needed to produce comparable, and in many cases, superior to results from the state-of-the-art hand-crafted feature-based classification approaches. Specifically, in this tutorial on DL for DP image analysis, we show how an open source framework (Caffe), with a singular network architecture, can be used to address: (a) nuclei segmentation (F-score of 0.83 across 12,000 nuclei), (b) epithelium segmentation (F-score of 0.84 across 1735 regions), (c) tubule segmentation (F-score of 0.83 from 795 tubules), (d) lymphocyte detection (F-score of 0.90 across 3064 lymphocytes), (e) mitosis detection (F-score of 0.53 across 550 mitotic events), (f) invasive ductal carcinoma detection (F-score of 0.7648 on 50 k testing patches), and (g) lymphoma classification (classification accuracy of 0.97 across 374 images). This paper represents the largest comprehensive study of DL approaches in DP to date, with over 1200 DP images used during evaluation. The supplemental online material that accompanies this paper consists of step-by-step instructions for the usage of the supplied source code, trained models, and input data.

  8. Field errors in hybrid insertion devices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schlueter, R.D.

    1995-02-01

    Hybrid magnet theory as applied to the error analyses used in the design of Advanced Light Source (ALS) insertion devices is reviewed. Sources of field errors in hybrid insertion devices are discussed.

  9. Effects of Optical Combiner and IPD Change for Convergence on Near-Field Depth Perception in an Optical See-Through HMD.

    PubMed

    Lee, Sangyoon; Hu, Xinda; Hua, Hong

    2016-05-01

    Many error sources have been explored in regards to the depth perception problem in augmented reality environments using optical see-through head-mounted displays (OST-HMDs). Nonetheless, two error sources are commonly neglected: the ray-shift phenomenon and the change in interpupillary distance (IPD). The first source of error arises from the difference in refraction for virtual and see-through optical paths caused by an optical combiner, which is required of OST-HMDs. The second occurs from the change in the viewer's IPD due to eye convergence. In this paper, we analyze the effects of these two error sources on near-field depth perception and propose methods to compensate for these two types of errors. Furthermore, we investigate their effectiveness through an experiment comparing the conditions with and without our error compensation methods applied. In our experiment, participants estimated the egocentric depth of a virtual and a physical object located at seven different near-field distances (40∼200 cm) using a perceptual matching task. Although the experimental results showed different patterns depending on the target distance, the results demonstrated that the near-field depth perception error can be effectively reduced to a very small level (at most 1 percent error) by compensating for the two mentioned error sources.

  10. Meteorological Error Budget Using Open Source Data

    DTIC Science & Technology

    2016-09-01

    ARL-TR-7831 ● SEP 2016 US Army Research Laboratory Meteorological Error Budget Using Open- Source Data by J Cogan, J Smith, P...needed. Do not return it to the originator. ARL-TR-7831 ● SEP 2016 US Army Research Laboratory Meteorological Error Budget Using...Error Budget Using Open-Source Data 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) J Cogan, J Smith, P Haines

  11. Automated Segmentation Errors When Using Optical Coherence Tomography to Measure Retinal Nerve Fiber Layer Thickness in Glaucoma.

    PubMed

    Mansberger, Steven L; Menda, Shivali A; Fortune, Brad A; Gardiner, Stuart K; Demirel, Shaban

    2017-02-01

    To characterize the error of optical coherence tomography (OCT) measurements of retinal nerve fiber layer (RNFL) thickness when using automated retinal layer segmentation algorithms without manual refinement. Cross-sectional study. This study was set in a glaucoma clinical practice, and the dataset included 3490 scans from 412 eyes of 213 individuals with a diagnosis of glaucoma or glaucoma suspect. We used spectral domain OCT (Spectralis) to measure RNFL thickness in a 6-degree peripapillary circle, and exported the native "automated segmentation only" results. In addition, we exported the results after "manual refinement" to correct errors in the automated segmentation of the anterior (internal limiting membrane) and the posterior boundary of the RNFL. Our outcome measures included differences in RNFL thickness and glaucoma classification (i.e., normal, borderline, or outside normal limits) between scans with automated segmentation only and scans using manual refinement. Automated segmentation only resulted in a thinner global RNFL thickness (1.6 μm thinner, P < .001) when compared to manual refinement. When adjusted by operator, a multivariate model showed increased differences with decreasing RNFL thickness (P < .001), decreasing scan quality (P < .001), and increasing age (P < .03). Manual refinement changed 298 of 3486 (8.5%) of scans to a different global glaucoma classification, wherein 146 of 617 (23.7%) of borderline classifications became normal. Superior and inferior temporal clock hours had the largest differences. Automated segmentation without manual refinement resulted in reduced global RNFL thickness and overestimated the classification of glaucoma. Differences increased in eyes with a thinner RNFL thickness, older age, and decreased scan quality. Operators should inspect and manually refine OCT retinal layer segmentation when assessing RNFL thickness in the management of patients with glaucoma. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Automated feature extraction and classification from image sources

    USGS Publications Warehouse

    ,

    1995-01-01

    The U.S. Department of the Interior, U.S. Geological Survey (USGS), and Unisys Corporation have completed a cooperative research and development agreement (CRADA) to explore automated feature extraction and classification from image sources. The CRADA helped the USGS define the spectral and spatial resolution characteristics of airborne and satellite imaging sensors necessary to meet base cartographic and land use and land cover feature classification requirements and help develop future automated geographic and cartographic data production capabilities. The USGS is seeking a new commercial partner to continue automated feature extraction and classification research and development.

  13. Brain fingerprinting classification concealed information test detects US Navy military medical information with P300

    PubMed Central

    Farwell, Lawrence A.; Richardson, Drew C.; Richardson, Graham M.; Furedy, John J.

    2014-01-01

    A classification concealed information test (CIT) used the “brain fingerprinting” method of applying P300 event-related potential (ERP) in detecting information that is (1) acquired in real life and (2) unique to US Navy experts in military medicine. Military medicine experts and non-experts were asked to push buttons in response to three types of text stimuli. Targets contain known information relevant to military medicine, are identified to subjects as relevant, and require pushing one button. Subjects are told to push another button to all other stimuli. Probes contain concealed information relevant to military medicine, and are not identified to subjects. Irrelevants contain equally plausible, but incorrect/irrelevant information. Error rate was 0%. Median and mean statistical confidences for individual determinations were 99.9% with no indeterminates (results lacking sufficiently high statistical confidence to be classified). We compared error rate and statistical confidence for determinations of both information present and information absent produced by classification CIT (Is a probe ERP more similar to a target or to an irrelevant ERP?) vs. comparison CIT (Does a probe produce a larger ERP than an irrelevant?) using P300 plus the late negative component (LNP; together, P300-MERMER). Comparison CIT produced a significantly higher error rate (20%) and lower statistical confidences: mean 67%; information-absent mean was 28.9%, less than chance (50%). We compared analysis using P300 alone with the P300 + LNP. P300 alone produced the same 0% error rate but significantly lower statistical confidences. These findings add to the evidence that the brain fingerprinting methods as described here provide sufficient conditions to produce less than 1% error rate and greater than 95% median statistical confidence in a CIT on information obtained in the course of real life that is characteristic of individuals with specific training, expertise, or organizational affiliation. PMID:25565941

  14. Error Sources in Asteroid Astrometry

    NASA Technical Reports Server (NTRS)

    Owen, William M., Jr.

    2000-01-01

    Asteroid astrometry, like any other scientific measurement process, is subject to both random and systematic errors, not all of which are under the observer's control. To design an astrometric observing program or to improve an existing one requires knowledge of the various sources of error, how different errors affect one's results, and how various errors may be minimized by careful observation or data reduction techniques.

  15. Morbidity Assessment in Surgery: Refinement Proposal Based on a Concept of Perioperative Adverse Events

    PubMed Central

    Kazaryan, Airazat M.; Røsok, Bård I.; Edwin, Bjørn

    2013-01-01

    Background. Morbidity is a cornerstone assessing surgical treatment; nevertheless surgeons have not reached extensive consensus on this problem. Methods and Findings. Clavien, Dindo, and Strasberg with coauthors (1992, 2004, 2009, and 2010) made significant efforts to the standardization of surgical morbidity (Clavien-Dindo-Strasberg classification, last revision, the Accordion classification). However, this classification includes only postoperative complications and has two principal shortcomings: disregard of intraoperative events and confusing terminology. Postoperative events have a major impact on patient well-being. However, intraoperative events should also be recorded and reported even if they do not evidently affect the patient's postoperative well-being. The term surgical complication applied in the Clavien-Dindo-Strasberg classification may be regarded as an incident resulting in a complication caused by technical failure of surgery, in contrast to the so-called medical complications. Therefore, the term surgical complication contributes to misinterpretation of perioperative morbidity. The term perioperative adverse events comprising both intraoperative unfavourable incidents and postoperative complications could be regarded as better alternative. In 2005, Satava suggested a simple grading to evaluate intraoperative surgical errors. Based on that approach, we have elaborated a 3-grade classification of intraoperative incidents so that it can be used to grade intraoperative events of any type of surgery. Refinements have been made to the Accordion classification of postoperative complications. Interpretation. The proposed systematization of perioperative adverse events utilizing the combined application of two appraisal tools, that is, the elaborated classification of intraoperative incidents on the basis of the Satava approach to surgical error evaluation together with the modified Accordion classification of postoperative complication, appears to be an effective tool for comprehensive assessment of surgical outcomes. This concept was validated in regard to various surgical procedures. Broad implementation of this approach will promote the development of surgical science and practice. PMID:23762627

  16. Determining decision thresholds and evaluating indicators when conservation status is measured as a continuum.

    PubMed

    Connors, B M; Cooper, A B

    2014-12-01

    Categorization of the status of populations, species, and ecosystems underpins most conservation activities. Status is often based on how a system's current indicator value (e.g., change in abundance) relates to some threshold of conservation concern. Receiver operating characteristic (ROC) curves can be used to quantify the statistical reliability of indicators of conservation status and evaluate trade-offs between correct (true positive) and incorrect (false positive) classifications across a range of decision thresholds. However, ROC curves assume a discrete, binary relationship between an indicator and the conservation status it is meant to track, which is a simplification of the more realistic continuum of conservation status, and may limit the applicability of ROC curves in conservation science. We describe a modified ROC curve that treats conservation status as a continuum rather than a discrete state. We explored the influence of this continuum and typical sources of variation in abundance that can lead to classification errors (i.e., random variation and measurement error) on the true and false positive rates corresponding to varying decision thresholds and the reliability of change in abundance as an indicator of conservation status, respectively. We applied our modified ROC approach to an indicator of endangerment in Pacific salmon (Oncorhynchus nerka) (i.e., percent decline in geometric mean abundance) and an indicator of marine ecosystem structure and function (i.e., detritivore biomass). Failure to treat conservation status as a continuum when choosing thresholds for indicators resulted in the misidentification of trade-offs between true and false positive rates and the overestimation of an indicator's reliability. We argue for treating conservation status as a continuum when ROC curves are used to evaluate decision thresholds in indicators for the assessment of conservation status. © 2014 Society for Conservation Biology.

  17. Optimal mapping of terrestrial gamma dose rates using geological parent material and aerogeophysical survey data.

    PubMed

    Rawlins, B G; Scheib, C; Tyler, A N; Beamish, D

    2012-12-01

    Regulatory authorities need ways to estimate natural terrestrial gamma radiation dose rates (nGy h⁻¹) across the landscape accurately, to assess its potential deleterious health effects. The primary method for estimating outdoor dose rate is to use an in situ detector supported 1 m above the ground, but such measurements are costly and cannot capture the landscape-scale variation in dose rates which are associated with changes in soil and parent material mineralogy. We investigate the potential for improving estimates of terrestrial gamma dose rates across Northern Ireland (13,542 km²) using measurements from 168 sites and two sources of ancillary data: (i) a map based on a simplified classification of soil parent material, and (ii) dose estimates from a national-scale, airborne radiometric survey. We used the linear mixed modelling framework in which the two ancillary variables were included in separate models as fixed effects, plus a correlation structure which captures the spatially correlated variance component. We used a cross-validation procedure to determine the magnitude of the prediction errors for the different models. We removed a random subset of 10 terrestrial measurements and formed the model from the remainder (n = 158), and then used the model to predict values at the other 10 sites. We repeated this procedure 50 times. The measurements of terrestrial dose vary between 1 and 103 (nGy h⁻¹). The median absolute model prediction errors (nGy h⁻¹) for the three models declined in the following order: no ancillary data (10.8) > simple geological classification (8.3) > airborne radiometric dose (5.4) as a single fixed effect. Estimates of airborne radiometric gamma dose rate can significantly improve the spatial prediction of terrestrial dose rate.

  18. Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

    NASA Astrophysics Data System (ADS)

    Richards, Joseph W.; Starr, Dan L.; Brink, Henrik; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; James, J. Berian; Long, James P.; Rice, John

    2012-01-01

    Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.

  19. Entropy-based gene ranking without selection bias for the predictive classification of microarray data.

    PubMed

    Furlanello, Cesare; Serafini, Maria; Merler, Stefano; Jurman, Giuseppe

    2003-11-06

    We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.

  20. DMSP SSJ4 Data Restoration, Classification, and On-Line Data Access

    NASA Technical Reports Server (NTRS)

    Wing, Simon; Bredekamp, Joseph H. (Technical Monitor)

    2000-01-01

    Compress and clean raw data file for permanent storage We have identified various error conditions/types and developed algorithms to get rid of these errors/noises, including the more complicated noise in the newer data sets. (status = 100% complete). Internet access of compacted raw data. It is now possible to access the raw data via our web site, http://www.jhuapl.edu/Aurora/index.html. The software to read and plot the compacted raw data is also available from the same web site. The users can now download the raw data, read, plot, or manipulate the data as they wish on their own computer. The users are able to access the cleaned data sets. Internet access of the color spectrograms. This task has also been completed. It is now possible to access the spectrograms from the web site mentioned above. Improve the particle precipitation region classification. The algorithm for doing this task has been developed and implemented. As a result, the accuracies improved. Now the web site routinely distributes the results of applying the new algorithm to the cleaned data set. Mark the classification region on the spectrograms. The software to mark the classification region in the spectrograms has been completed. This is also available from our web site.

  1. Malingering in Toxic Exposure. Classification Accuracy of Reliable Digit Span and WAIS-III Digit Span Scaled Scores

    ERIC Educational Resources Information Center

    Greve, Kevin W.; Springer, Steven; Bianchini, Kevin J.; Black, F. William; Heinly, Matthew T.; Love, Jeffrey M.; Swift, Douglas A.; Ciota, Megan A.

    2007-01-01

    This study examined the sensitivity and false-positive error rate of reliable digit span (RDS) and the WAIS-III Digit Span (DS) scaled score in persons alleging toxic exposure and determined whether error rates differed from published rates in traumatic brain injury (TBI) and chronic pain (CP). Data were obtained from the files of 123 persons…

  2. The use of source memory to identify one's own episodic confusion errors.

    PubMed

    Smith, S M; Tindell, D R; Pierce, B H; Gilliland, T R; Gerkens, D R

    2001-03-01

    In 4 category cued recall experiments, participants falsely recalled nonlist common members, a semantic confusion error. Errors were more likely if critical nonlist words were presented on an incidental task, causing source memory failures called episodic confusion errors. Participants could better identify the source of falsely recalled words if they had deeply processed the words on the incidental task. For deep but not shallow processing, participants could reliably include or exclude incidentally shown category members in recall. The illusion that critical items actually appeared on categorized lists was diminished but not eradicated when participants identified episodic confusion errors post hoc among their own recalled responses; participants often believed that critical items had been on both the incidental task and the study list. Improved source monitoring can potentially mitigate episodic (but not semantic) confusion errors.

  3. Software errors and complexity: An empirical investigation

    NASA Technical Reports Server (NTRS)

    Basili, Victor R.; Perricone, Berry T.

    1983-01-01

    The distributions and relationships derived from the change data collected during the development of a medium scale satellite software project show that meaningful results can be obtained which allow an insight into software traits and the environment in which it is developed. Modified and new modules were shown to behave similarly. An abstract classification scheme for errors which allows a better understanding of the overall traits of a software project is also shown. Finally, various size and complexity metrics are examined with respect to errors detected within the software yielding some interesting results.

  4. Software errors and complexity: An empirical investigation

    NASA Technical Reports Server (NTRS)

    Basili, V. R.; Perricone, B. T.

    1982-01-01

    The distributions and relationships derived from the change data collected during the development of a medium scale satellite software project show that meaningful results can be obtained which allow an insight into software traits and the environment in which it is developed. Modified and new modules were shown to behave similarly. An abstract classification scheme for errors which allows a better understanding of the overall traits of a software project is also shown. Finally, various size and complexity metrics are examined with respect to errors detected within the software yielding some interesting results.

  5. Video compression of coronary angiograms based on discrete wavelet transform with block classification.

    PubMed

    Ho, B T; Tsai, M J; Wei, J; Ma, M; Saipetch, P

    1996-01-01

    A new method of video compression for angiographic images has been developed to achieve high compression ratio (~20:1) while eliminating block artifacts which leads to loss of diagnostic accuracy. This method adopts motion picture experts group's (MPEGs) motion compensated prediction to takes advantage of frame to frame correlation. However, in contrast to MPEG, the error images arising from mismatches in the motion estimation are encoded by discrete wavelet transform (DWT) rather than block discrete cosine transform (DCT). Furthermore, the authors developed a classification scheme which label each block in an image as intra, error, or background type and encode it accordingly. This hybrid coding can significantly improve the compression efficiency in certain eases. This method can be generalized for any dynamic image sequences applications sensitive to block artifacts.

  6. Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure

    PubMed Central

    Berisha, Visar; Wisler, Alan; Hero, Alfred O.; Spanias, Andreas

    2015-01-01

    Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between training and test distributions. We confirm the theoretical results by designing feature selection algorithms using the criteria from these bounds and by evaluating the algorithms on a series of pathological speech classification tasks. PMID:26807014

  7. Comparisons of neural networks to standard techniques for image classification and correlation

    NASA Technical Reports Server (NTRS)

    Paola, Justin D.; Schowengerdt, Robert A.

    1994-01-01

    Neural network techniques for multispectral image classification and spatial pattern detection are compared to the standard techniques of maximum-likelihood classification and spatial correlation. The neural network produced a more accurate classification than maximum-likelihood of a Landsat scene of Tucson, Arizona. Some of the errors in the maximum-likelihood classification are illustrated using decision region and class probability density plots. As expected, the main drawback to the neural network method is the long time required for the training stage. The network was trained using several different hidden layer sizes to optimize both the classification accuracy and training speed, and it was found that one node per class was optimal. The performance improved when 3x3 local windows of image data were entered into the net. This modification introduces texture into the classification without explicit calculation of a texture measure. Larger windows were successfully used for the detection of spatial features in Landsat and Magellan synthetic aperture radar imagery.

  8. 28 CFR 17.26 - Derivative classification.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 28 Judicial Administration 1 2014-07-01 2014-07-01 false Derivative classification. 17.26 Section... ACCESS TO CLASSIFIED INFORMATION Classified Information § 17.26 Derivative classification. (a) Persons need not possess original classification authority to derivatively classify information based on source...

  9. 28 CFR 17.26 - Derivative classification.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 28 Judicial Administration 1 2013-07-01 2013-07-01 false Derivative classification. 17.26 Section... ACCESS TO CLASSIFIED INFORMATION Classified Information § 17.26 Derivative classification. (a) Persons need not possess original classification authority to derivatively classify information based on source...

  10. 28 CFR 17.26 - Derivative classification.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 28 Judicial Administration 1 2012-07-01 2012-07-01 false Derivative classification. 17.26 Section... ACCESS TO CLASSIFIED INFORMATION Classified Information § 17.26 Derivative classification. (a) Persons need not possess original classification authority to derivatively classify information based on source...

  11. 28 CFR 17.26 - Derivative classification.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 28 Judicial Administration 1 2011-07-01 2011-07-01 false Derivative classification. 17.26 Section... ACCESS TO CLASSIFIED INFORMATION Classified Information § 17.26 Derivative classification. (a) Persons need not possess original classification authority to derivatively classify information based on source...

  12. Analysis and application of classification methods of complex carbonate reservoirs

    NASA Astrophysics Data System (ADS)

    Li, Xiongyan; Qin, Ruibao; Ping, Haitao; Wei, Dan; Liu, Xiaomei

    2018-06-01

    There are abundant carbonate reservoirs from the Cenozoic to Mesozoic era in the Middle East. Due to variation in sedimentary environment and diagenetic process of carbonate reservoirs, several porosity types coexist in carbonate reservoirs. As a result, because of the complex lithologies and pore types as well as the impact of microfractures, the pore structure is very complicated. Therefore, it is difficult to accurately calculate the reservoir parameters. In order to accurately evaluate carbonate reservoirs, based on the pore structure evaluation of carbonate reservoirs, the classification methods of carbonate reservoirs are analyzed based on capillary pressure curves and flow units. Based on the capillary pressure curves, although the carbonate reservoirs can be classified, the relationship between porosity and permeability after classification is not ideal. On the basis of the flow units, the high-precision functional relationship between porosity and permeability after classification can be established. Therefore, the carbonate reservoirs can be quantitatively evaluated based on the classification of flow units. In the dolomite reservoirs, the average absolute error of calculated permeability decreases from 15.13 to 7.44 mD. Similarly, the average absolute error of calculated permeability of limestone reservoirs is reduced from 20.33 to 7.37 mD. Only by accurately characterizing pore structures and classifying reservoir types, reservoir parameters could be calculated accurately. Therefore, characterizing pore structures and classifying reservoir types are very important to accurate evaluation of complex carbonate reservoirs in the Middle East.

  13. Study on Classification Accuracy Inspection of Land Cover Data Aided by Automatic Image Change Detection Technology

    NASA Astrophysics Data System (ADS)

    Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.

    2018-04-01

    The purpose of carrying out national geographic conditions monitoring is to obtain information of surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for the government, enterprise and public. Land cover data contains detailed geographic conditions information, thus has been listed as one of the important achievements in the national geographic conditions monitoring project. At present, the main issue of the production of the land cover data is about how to improve the classification accuracy. For the land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, the classification accuracy inspection is mainly based on human-computer interaction or manual inspection in the project, which are time consuming and laborious. By harnessing the automatic high-resolution remote sensing image change detection technology based on the ERDAS IMAGINE platform, this paper carried out the classification accuracy inspection test of land cover data in the project, and presented a corresponding technical route, which includes data pre-processing, change detection, result output and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missing and incorrect update error, and effectively reduces the work intensity of human-computer interaction inspection for quality inspectors, and also provides a technical reference for the data production and quality control of the land cover data.

  14. Operational monitoring of land-cover change using multitemporal remote sensing data

    NASA Astrophysics Data System (ADS)

    Rogan, John

    2005-11-01

    Land-cover change, manifested as either land-cover modification and/or conversion, can occur at all spatial scales, and changes at local scales can have profound, cumulative impacts at broader scales. The implication of operational land-cover monitoring is that researchers have access to a continuous stream of remote sensing data, with the long term goal of providing for consistent and repetitive mapping. Effective large area monitoring of land-cover (i.e., >1000 km2) can only be accomplished by using remotely sensed images as an indirect data source in land-cover change mapping and as a source for land-cover change model projections. Large area monitoring programs face several challenges: (1) choice of appropriate classification scheme/map legend over large, topographically and phenologically diverse areas; (2) issues concerning data consistency and map accuracy (i.e., calibration and validation); (3) very large data volumes; (4) time consuming data processing and interpretation. Therefore, this dissertation research broadly addresses these challenges in the context of examining state-of-the-art image pre-processing, spectral enhancement, classification, and accuracy assessment techniques to assist the California Land-cover Mapping and Monitoring Program (LCMMP). The results of this dissertation revealed that spatially varying haze can be effectively corrected from Landsat data for the purposes of change detection. The Multitemporal Spectral Mixture Analysis (MSMA) spectral enhancement technique produced more accurate land-cover maps than those derived from the Multitemporal Kauth Thomas (MKT) transformation in northern and southern California study areas. A comparison of machine learning classifiers showed that Fuzzy ARTMAP outperformed two classification tree algorithms, based on map accuracy and algorithm robustness. Variation in spatial data error (positional and thematic) was explored in relation to environmental variables using geostatistical interpolation techniques. Finally, the land-cover modification maps generated for three time intervals (1985--1990--1996--2000), with nine change-classes revealed important variations in land-cover gain and loss between northern and southern California study areas.

  15. Improved EEG Event Classification Using Differential Energy.

    PubMed

    Harati, A; Golmohammadi, M; Lopez, S; Obeid, I; Picone, J

    2015-12-01

    Feature extraction for automatic classification of EEG signals typically relies on time frequency representations of the signal. Techniques such as cepstral-based filter banks or wavelets are popular analysis techniques in many signal processing applications including EEG classification. In this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. To further aid in discrimination of periodic signals from aperiodic signals, we add a differential energy term. We evaluate our approaches on the TUH EEG Corpus, which is the largest publicly available EEG corpus and an exceedingly challenging task due to the clinical nature of the data. We demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. The combination of differential energy and derivatives produces a 24 % absolute reduction in the error rate and improves our ability to discriminate between signal events and background noise. This relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, but is much more computationally efficient.

  16. Comparison of remote sensing image processing techniques to identify tornado damage areas from Landsat TM data

    USGS Publications Warehouse

    Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.

    2008-01-01

    Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and objectoriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. ?? 2008 by MDPI.

  17. Benchmark data on the separability among crops in the southern San Joaquin Valley of California

    NASA Technical Reports Server (NTRS)

    Morse, A.; Card, D. H.

    1984-01-01

    Landsat MSS data were input to a discriminant analysis of 21 crops on each of eight dates in 1979 using a total of 4,142 fields in southern Fresno County, California. The 21 crops, which together account for over 70 percent of the agricultural acreage in the southern San Joaquin Valley, were analyzed to quantify the spectral separability, defined as omission error, between all pairs of crops. On each date the fields were segregated into six groups based on the mean value of the MSS7/MSS5 ratio, which is correlated with green biomass. Discriminant analysis was run on each group on each date. The resulting contingency tables offer information that can be profitably used in conjunction with crop calendars to pick the best dates for a classification. The tables show expected percent correct classification and error rates for all the crops. The patterns in the contingency tables show that the percent correct classification for crops generally increases with the amount of greenness in the fields being classified. However, there are exceptions to this general rule, notably grain.

  18. Comparison of Remote Sensing Image Processing Techniques to Identify Tornado Damage Areas from Landsat TM Data

    PubMed Central

    Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.

    2008-01-01

    Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. PMID:27879757

  19. Multicategory nets of single-layer perceptrons: complexity and sample-size issues.

    PubMed

    Raudys, Sarunas; Kybartas, Rimantas; Zavadskas, Edmundas Kazimieras

    2010-05-01

    The standard cost function of multicategory single-layer perceptrons (SLPs) does not minimize the classification error rate. In order to reduce classification error, it is necessary to: 1) refuse the traditional cost function, 2) obtain near to optimal pairwise linear classifiers by specially organized SLP training and optimal stopping, and 3) fuse their decisions properly. To obtain better classification in unbalanced training set situations, we introduce the unbalance correcting term. It was found that fusion based on the Kulback-Leibler (K-L) distance and the Wu-Lin-Weng (WLW) method result in approximately the same performance in situations where sample sizes are relatively small. The explanation for this observation is by theoretically known verity that an excessive minimization of inexact criteria becomes harmful at times. Comprehensive comparative investigations of six real-world pattern recognition (PR) problems demonstrated that employment of SLP-based pairwise classifiers is comparable and as often as not outperforming the linear support vector (SV) classifiers in moderate dimensional situations. The colored noise injection used to design pseudovalidation sets proves to be a powerful tool for facilitating finite sample problems in moderate-dimensional PR tasks.

  20. Remote sensing of submerged aquatic vegetation in lower Chesapeake Bay - A comparison of Landsat MSS to TM imagery

    NASA Technical Reports Server (NTRS)

    Ackleson, S. G.; Klemas, V.

    1987-01-01

    Landsat MSS and TM imagery, obtained simultaneously over Guinea Marsh, VA, as analyzed and compares for its ability to detect submerged aquatic vegetation (SAV). An unsupervised clustering algorithm was applied to each image, where the input classification parameters are defined as functions of apparent sensor noise. Class confidence and accuracy were computed for all water areas by comparing the classified images, pixel-by-pixel, to rasterized SAV distributions derived from color aerial photography. To illustrate the effect of water depth on classification error, areas of depth greater than 1.9 m were masked, and class confidence and accuracy recalculated. A single-scattering radiative-transfer model is used to illustrate how percent canopy cover and water depth affect the volume reflectance from a water column containing SAV. For a submerged canopy that is morphologically and optically similar to Zostera marina inhabiting Lower Chesapeake Bay, dense canopies may be isolated by masking optically deep water. For less dense canopies, the effect of increasing water depth is to increase the apparent percent crown cover, which may result in classification error.

  1. 32 CFR 2001.22 - Derivative classification.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 32 National Defense 6 2012-07-01 2012-07-01 false Derivative classification. 2001.22 Section 2001... Identification and Markings § 2001.22 Derivative classification. (a) General. Information classified derivatively on the basis of source documents or classification guides shall bear all markings prescribed in...

  2. 32 CFR 2001.22 - Derivative classification.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 32 National Defense 6 2013-07-01 2013-07-01 false Derivative classification. 2001.22 Section 2001... Identification and Markings § 2001.22 Derivative classification. (a) General. Information classified derivatively on the basis of source documents or classification guides shall bear all markings prescribed in...

  3. 32 CFR 2001.22 - Derivative classification.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 32 National Defense 6 2014-07-01 2014-07-01 false Derivative classification. 2001.22 Section 2001... Identification and Markings § 2001.22 Derivative classification. (a) General. Information classified derivatively on the basis of source documents or classification guides shall bear all markings prescribed in...

  4. 32 CFR 2001.22 - Derivative classification.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 32 National Defense 6 2011-07-01 2011-07-01 false Derivative classification. 2001.22 Section 2001... Identification and Markings § 2001.22 Derivative classification. (a) General. Information classified derivatively on the basis of source documents or classification guides shall bear all markings prescribed in...

  5. Taxonaut: an application software for comparative display of multiple taxonomies with a use case of GBIF Species API.

    PubMed

    Ytow, Nozomi

    2016-01-01

    The Species API of the Global Biodiversity Information Facility (GBIF) provides public access to taxonomic data aggregated from multiple data sources. Each data source follows its own classification which can be inconsistent with classifications from other sources. Even with a reference classification e.g. the GBIF Backbone taxonomy, a comprehensive method to compare classifications in the data aggregation is essential, especially for non-expert users. A Java application was developed to compare multiple taxonomies graphically using classification data acquired from GBIF's ChecklistBank via the GBIF Species API. It uses a table to display taxonomies where each column represents a taxonomy under comparison, with an aligner column to organise taxa by name. Each cell contains the name of a taxon if the classification in that column contains the name. Each column also has a cell showing the hierarchy of the taxonomy by a folder metaphor where taxa are aligned and synchronised in the aligner column. A set of those comparative tables shows taxa categorised by relationship between taxonomies. The result set is also available as tables in an Excel format file.

  6. Error Analyses of the North Alabama Lightning Mapping Array (LMA)

    NASA Technical Reports Server (NTRS)

    Koshak, W. J.; Solokiewicz, R. J.; Blakeslee, R. J.; Goodman, S. J.; Christian, H. J.; Hall, J. M.; Bailey, J. C.; Krider, E. P.; Bateman, M. G.; Boccippio, D. J.

    2003-01-01

    Two approaches are used to characterize how accurately the North Alabama Lightning Mapping Array (LMA) is able to locate lightning VHF sources in space and in time. The first method uses a Monte Carlo computer simulation to estimate source retrieval errors. The simulation applies a VHF source retrieval algorithm that was recently developed at the NASA-MSFC and that is similar, but not identical to, the standard New Mexico Tech retrieval algorithm. The second method uses a purely theoretical technique (i.e., chi-squared Curvature Matrix theory) to estimate retrieval errors. Both methods assume that the LMA system has an overall rms timing error of 50ns, but all other possible errors (e.g., multiple sources per retrieval attempt) are neglected. The detailed spatial distributions of retrieval errors are provided. Given that the two methods are completely independent of one another, it is shown that they provide remarkably similar results, except that the chi-squared theory produces larger altitude error estimates than the (more realistic) Monte Carlo simulation.

  7. NeMO-Net & Fluid Lensing: The Neural Multi-Modal Observation & Training Network for Global Coral Reef Assessment Using Fluid Lensing Augmentation of NASA EOS Data

    NASA Technical Reports Server (NTRS)

    Chirayath, Ved

    2018-01-01

    We present preliminary results from NASA NeMO-Net, the first neural multi-modal observation and training network for global coral reef assessment. NeMO-Net is an open-source deep convolutional neural network (CNN) and interactive active learning training software in development which will assess the present and past dynamics of coral reef ecosystems. NeMO-Net exploits active learning and data fusion of mm-scale remotely sensed 3D images of coral reefs captured using fluid lensing with the NASA FluidCam instrument, presently the highest-resolution remote sensing benthic imaging technology capable of removing ocean wave distortion, as well as hyperspectral airborne remote sensing data from the ongoing NASA CORAL mission and lower-resolution satellite data to determine coral reef ecosystem makeup globally at unprecedented spatial and temporal scales. Aquatic ecosystems, particularly coral reefs, remain quantitatively misrepresented by low-resolution remote sensing as a result of refractive distortion from ocean waves, optical attenuation, and remoteness. Machine learning classification of coral reefs using FluidCam mm-scale 3D data show that present satellite and airborne remote sensing techniques poorly characterize coral reef percent living cover, morphology type, and species breakdown at the mm, cm, and meter scales. Indeed, current global assessments of coral reef cover and morphology classification based on km-scale satellite data alone can suffer from segmentation errors greater than 40%, capable of change detection only on yearly temporal scales and decameter spatial scales, significantly hindering our understanding of patterns and processes in marine biodiversity at a time when these ecosystems are experiencing unprecedented anthropogenic pressures, ocean acidification, and sea surface temperature rise. NeMO-Net leverages our augmented machine learning algorithm that demonstrates data fusion of regional FluidCam (mm, cm-scale) airborne remote sensing with global low-resolution (m, km-scale) airborne and spaceborne imagery to reduce classification errors up to 80% over regional scales. Such technologies can substantially enhance our ability to assess coral reef ecosystems dynamics.

  8. Development and validation of Aviation Causal Contributors for Error Reporting Systems (ACCERS).

    PubMed

    Baker, David P; Krokos, Kelley J

    2007-04-01

    This investigation sought to develop a reliable and valid classification system for identifying and classifying the underlying causes of pilot errors reported under the Aviation Safety Action Program (ASAP). ASAP is a voluntary safety program that air carriers may establish to study pilot and crew performance on the line. In ASAP programs, similar to the Aviation Safety Reporting System, pilots self-report incidents by filing a short text description of the event. The identification of contributors to errors is critical if organizations are to improve human performance, yet it is difficult for analysts to extract this information from text narratives. A taxonomy was needed that could be used by pilots to classify the causes of errors. After completing a thorough literature review, pilot interviews and a card-sorting task were conducted in Studies 1 and 2 to develop the initial structure of the Aviation Causal Contributors for Event Reporting Systems (ACCERS) taxonomy. The reliability and utility of ACCERS was then tested in studies 3a and 3b by having pilots independently classify the primary and secondary causes of ASAP reports. The results provided initial evidence for the internal and external validity of ACCERS. Pilots were found to demonstrate adequate levels of agreement with respect to their category classifications. ACCERS appears to be a useful system for studying human error captured under pilot ASAP reports. Future work should focus on how ACCERS is organized and whether it can be used or modified to classify human error in ASAP programs for other aviation-related job categories such as dispatchers. Potential applications of this research include systems in which individuals self-report errors and that attempt to extract and classify the causes of those events.

  9. Understanding EFL Students' Errors in Writing

    ERIC Educational Resources Information Center

    Phuket, Pimpisa Rattanadilok Na; Othman, Normah Binti

    2015-01-01

    Writing is the most difficult skill in English, so most EFL students tend to make errors in writing. In assisting the learners to successfully acquire writing skill, the analysis of errors and the understanding of their sources are necessary. This study attempts to explore the major sources of errors occurred in the writing of EFL students. It…

  10. Target/error overlap in jargonaphasia: The case for a one-source model, lexical and non-lexical summation, and the special status of correct responses.

    PubMed

    Olson, Andrew; Halloran, Elizabeth; Romani, Cristina

    2015-12-01

    We present three jargonaphasic patients who made phonological errors in naming, repetition and reading. We analyse target/response overlap using statistical models to answer three questions: 1) Is there a single phonological source for errors or two sources, one for target-related errors and a separate source for abstruse errors? 2) Can correct responses be predicted by the same distribution used to predict errors or do they show a completion boost (CB)? 3) Is non-lexical and lexical information summed during reading and repetition? The answers were clear. 1) Abstruse errors did not require a separate distribution created by failure to access word forms. Abstruse and target-related errors were the endpoints of a single overlap distribution. 2) Correct responses required a special factor, e.g., a CB or lexical/phonological feedback, to preserve their integrity. 3) Reading and repetition required separate lexical and non-lexical contributions that were combined at output. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Measurement-device-independent quantum key distribution with source state errors and statistical fluctuation

    NASA Astrophysics Data System (ADS)

    Jiang, Cong; Yu, Zong-Wen; Wang, Xiang-Bin

    2017-03-01

    We show how to calculate the secure final key rate in the four-intensity decoy-state measurement-device-independent quantum key distribution protocol with both source errors and statistical fluctuations with a certain failure probability. Our results rely only on the range of only a few parameters in the source state. All imperfections in this protocol have been taken into consideration without assuming any specific error patterns of the source.

  12. Automated spectral classification and the GAIA project

    NASA Technical Reports Server (NTRS)

    Lasala, Jerry; Kurtz, Michael J.

    1995-01-01

    Two dimensional spectral types for each of the stars observed in the global astrometric interferometer for astrophysics (GAIA) mission would provide additional information for the galactic structure and stellar evolution studies, as well as helping in the identification of unusual objects and populations. The classification of the large quantity generated spectra requires that automated techniques are implemented. Approaches for the automatic classification are reviewed, and a metric-distance method is discussed. In tests, the metric-distance method produced spectral types with mean errors comparable to those of human classifiers working at similar resolution. Data and equipment requirements for an automated classification survey, are discussed. A program of auxiliary observations is proposed to yield spectral types and radial velocities for the GAIA-observed stars.

  13. Speaker normalization and adaptation using second-order connectionist networks.

    PubMed

    Watrous, R L

    1993-01-01

    A method for speaker normalization and adaption using connectionist networks is developed. A speaker-specific linear transformation of observations of the speech signal is computed using second-order network units. Classification is accomplished by a multilayer feedforward network that operates on the normalized speech data. The network is adapted for a new talker by modifying the transformation parameters while leaving the classifier fixed. This is accomplished by backpropagating classification error through the classifier to the second-order transformation units. This method was evaluated for the classification of ten vowels for 76 speakers using the first two formant values of the Peterson-Barney data. The results suggest that rapid speaker adaptation resulting in high classification accuracy can be accomplished by this method.

  14. Errors in imaging patients in the emergency setting

    PubMed Central

    Reginelli, Alfonso; Lo Re, Giuseppe; Midiri, Federico; Muzj, Carlo; Romano, Luigia; Brunese, Luca

    2016-01-01

    Emergency and trauma care produces a “perfect storm” for radiological errors: uncooperative patients, inadequate histories, time-critical decisions, concurrent tasks and often junior personnel working after hours in busy emergency departments. The main cause of diagnostic errors in the emergency department is the failure to correctly interpret radiographs, and the majority of diagnoses missed on radiographs are fractures. Missed diagnoses potentially have important consequences for patients, clinicians and radiologists. Radiologists play a pivotal role in the diagnostic assessment of polytrauma patients and of patients with non-traumatic craniothoracoabdominal emergencies, and key elements to reduce errors in the emergency setting are knowledge, experience and the correct application of imaging protocols. This article aims to highlight the definition and classification of errors in radiology, the causes of errors in emergency radiology and the spectrum of diagnostic errors in radiography, ultrasonography and CT in the emergency setting. PMID:26838955

  15. Errors in imaging patients in the emergency setting.

    PubMed

    Pinto, Antonio; Reginelli, Alfonso; Pinto, Fabio; Lo Re, Giuseppe; Midiri, Federico; Muzj, Carlo; Romano, Luigia; Brunese, Luca

    2016-01-01

    Emergency and trauma care produces a "perfect storm" for radiological errors: uncooperative patients, inadequate histories, time-critical decisions, concurrent tasks and often junior personnel working after hours in busy emergency departments. The main cause of diagnostic errors in the emergency department is the failure to correctly interpret radiographs, and the majority of diagnoses missed on radiographs are fractures. Missed diagnoses potentially have important consequences for patients, clinicians and radiologists. Radiologists play a pivotal role in the diagnostic assessment of polytrauma patients and of patients with non-traumatic craniothoracoabdominal emergencies, and key elements to reduce errors in the emergency setting are knowledge, experience and the correct application of imaging protocols. This article aims to highlight the definition and classification of errors in radiology, the causes of errors in emergency radiology and the spectrum of diagnostic errors in radiography, ultrasonography and CT in the emergency setting.

  16. Acoustic holography as a metrological tool for characterizing medical ultrasound sources and fields

    PubMed Central

    Sapozhnikov, Oleg A.; Tsysar, Sergey A.; Khokhlova, Vera A.; Kreider, Wayne

    2015-01-01

    Acoustic holography is a powerful technique for characterizing ultrasound sources and the fields they radiate, with the ability to quantify source vibrations and reduce the number of required measurements. These capabilities are increasingly appealing for meeting measurement standards in medical ultrasound; however, associated uncertainties have not been investigated systematically. Here errors associated with holographic representations of a linear, continuous-wave ultrasound field are studied. To facilitate the analysis, error metrics are defined explicitly, and a detailed description of a holography formulation based on the Rayleigh integral is provided. Errors are evaluated both for simulations of a typical therapeutic ultrasound source and for physical experiments with three different ultrasound sources. Simulated experiments explore sampling errors introduced by the use of a finite number of measurements, geometric uncertainties in the actual positions of acquired measurements, and uncertainties in the properties of the propagation medium. Results demonstrate the theoretical feasibility of keeping errors less than about 1%. Typical errors in physical experiments were somewhat larger, on the order of a few percent; comparison with simulations provides specific guidelines for improving the experimental implementation to reduce these errors. Overall, results suggest that holography can be implemented successfully as a metrological tool with small, quantifiable errors. PMID:26428789

  17. Common but unappreciated sources of error in one, two, and multiple-color pyrometry

    NASA Technical Reports Server (NTRS)

    Spjut, R. Erik

    1988-01-01

    The most common sources of error in optical pyrometry are examined. They can be classified as either noise and uncertainty errors, stray radiation errors, or speed-of-response errors. Through judicious choice of detectors and optical wavelengths the effect of noise errors can be minimized, but one should strive to determine as many of the system properties as possible. Careful consideration of the optical-collection system can minimize stray radiation errors. Careful consideration must also be given to the slowest elements in a pyrometer when measuring rapid phenomena.

  18. Reliability, Validity, and Classification Accuracy of the DSM-5 Diagnostic Criteria for Gambling Disorder and Comparison to DSM-IV.

    PubMed

    Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C

    2016-09-01

    The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was to twofold. First, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD. Second, to compare the DSM-5-DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.

  19. Multiple category-lot quality assurance sampling: a new classification system with application to schistosomiasis control.

    PubMed

    Olives, Casey; Valadez, Joseph J; Brooker, Simon J; Pagano, Marcello

    2012-01-01

    Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n=15 and n=25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n=15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalance of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools.

  20. The presence of English and Spanish dyslexia in the Web

    NASA Astrophysics Data System (ADS)

    Rello, Luz; Baeza-Yates, Ricardo

    2012-09-01

    In this study we present a lower bound of the prevalence of dyslexia in the Web for English and Spanish. On the basis of analysis of corpora written by dyslexic people, we propose a classification of the different kinds of dyslexic errors. A representative data set of dyslexic words is used to calculate this lower bound in web pages containing English and Spanish dyslexic errors. We also present an analysis of dyslexic errors in major Internet domains, social media sites, and throughout English- and Spanish-speaking countries. To show the independence of our estimations from the presence of other kinds of errors, we compare them with the overall lexical quality of the Web and with the error rate of noncorrected corpora. The presence of dyslexic errors in the Web motivates work in web accessibility for dyslexic users.

  1. Syndrome-source-coding and its universal generalization. [error correcting codes for data compression

    NASA Technical Reports Server (NTRS)

    Ancheta, T. C., Jr.

    1976-01-01

    A method of using error-correcting codes to obtain data compression, called syndrome-source-coding, is described in which the source sequence is treated as an error pattern whose syndrome forms the compressed data. It is shown that syndrome-source-coding can achieve arbitrarily small distortion with the number of compressed digits per source digit arbitrarily close to the entropy of a binary memoryless source. A 'universal' generalization of syndrome-source-coding is formulated which provides robustly effective distortionless coding of source ensembles. Two examples are given, comparing the performance of noiseless universal syndrome-source-coding to (1) run-length coding and (2) Lynch-Davisson-Schalkwijk-Cover universal coding for an ensemble of binary memoryless sources.

  2. Nineteen hundred seventy three significant accomplishments. [Landsat satellite data applications

    NASA Technical Reports Server (NTRS)

    1974-01-01

    Data collected by the Skylab remote sensing satellites was used to develop applications techniques and to combine automatic data classification with statistical clustering methods. Continuing research was concentrated in the correlation and registration of data products and in the definition of the atmospheric effects on remote sensing. The causes of errors encountered in the automated classification of agricultural data are identified. Other applications in forestry, geography, environmental geology, and land use are discussed.

  3. A Comprehensive Radial Velocity Error Budget for Next Generation Doppler Spectrometers

    NASA Technical Reports Server (NTRS)

    Halverson, Samuel; Ryan, Terrien; Mahadevan, Suvrath; Roy, Arpita; Bender, Chad; Stefansson, Guomundur Kari; Monson, Andrew; Levi, Eric; Hearty, Fred; Blake, Cullen; hide

    2016-01-01

    We describe a detailed radial velocity error budget for the NASA-NSF Extreme Precision Doppler Spectrometer instrument concept NEID (NN-explore Exoplanet Investigations with Doppler spectroscopy). Such an instrument performance budget is a necessity for both identifying the variety of noise sources currently limiting Doppler measurements, and estimating the achievable performance of next generation exoplanet hunting Doppler spectrometers. For these instruments, no single source of instrumental error is expected to set the overall measurement floor. Rather, the overall instrumental measurement precision is set by the contribution of many individual error sources. We use a combination of numerical simulations, educated estimates based on published materials, extrapolations of physical models, results from laboratory measurements of spectroscopic subsystems, and informed upper limits for a variety of error sources to identify likely sources of systematic error and construct our global instrument performance error budget. While natively focused on the performance of the NEID instrument, this modular performance budget is immediately adaptable to a number of current and future instruments. Such an approach is an important step in charting a path towards improving Doppler measurement precisions to the levels necessary for discovering Earth-like planets.

  4. The Accuracy of Webcams in 2D Motion Analysis: Sources of Error and Their Control

    ERIC Educational Resources Information Center

    Page, A.; Moreno, R.; Candelas, P.; Belmar, F.

    2008-01-01

    In this paper, we show the potential of webcams as precision measuring instruments in a physics laboratory. Various sources of error appearing in 2D coordinate measurements using low-cost commercial webcams are discussed, quantifying their impact on accuracy and precision, and simple procedures to control these sources of error are presented.…

  5. Five-way smoking status classification using text hot-spot identification and error-correcting output codes.

    PubMed

    Cohen, Aaron M

    2008-01-01

    We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2b2 task organizers, using micro- and macro-averaged F1 as the primary performance metric. Our best performing system achieved a micro-F1 of 0.9000 on the test collection, equivalent to the best performing system submitted to the i2b2 challenge. Hot-spot identification, zero-vector filtering, classifier weighting, and error correcting output coding contributed additively to increased performance, with hot-spot identification having by far the largest positive effect. High performance on automatic identification of patient smoking status from discharge summaries is achievable with the efficient and straightforward machine learning techniques studied here.

  6. Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

    PubMed Central

    Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

    2016-01-01

    Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832

  7. The foodscape: classification and field validation of secondary data sources.

    PubMed

    Lake, Amelia A; Burgoine, Thomas; Greenhalgh, Fiona; Stamp, Elaine; Tyrrell, Rachel

    2010-07-01

    The aims were to: develop a food environment classification tool and to test the acceptability and validity of three secondary sources of food environment data within a defined urban area of Newcastle-Upon-Tyne, using a field validation method. A 21 point (with 77 sub-categories) classification tool was developed. The fieldwork recorded 617 establishments selling food and/or food products. The sensitivity analysis of the secondary sources against fieldwork for the Newcastle City Council data was good (83.6%), while Yell.com and the Yellow Pages were low (51.2% and 50.9%, respectively). To improve the quality of secondary data, multiple sources should be used in order to achieve a realistic picture of the foodscape. 2010 Elsevier Ltd. All rights reserved.

  8. 46 CFR 503.55 - Derivative classification.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 46 Shipping 9 2011-10-01 2011-10-01 false Derivative classification. 503.55 Section 503.55... Security Program § 503.55 Derivative classification. (a) In accordance with Part 2 of Executive Order 13526... developed material consistent with the classification markings that apply to the source information, is...

  9. 46 CFR 503.55 - Derivative classification.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 46 Shipping 9 2010-10-01 2010-10-01 false Derivative classification. 503.55 Section 503.55... Security Program § 503.55 Derivative classification. (a) In accordance with Part 2 of Executive Order 12958... developed material consistent with the classification markings that apply to the source information, is...

  10. 46 CFR 503.55 - Derivative classification.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 46 Shipping 9 2014-10-01 2014-10-01 false Derivative classification. 503.55 Section 503.55... Security Program § 503.55 Derivative classification. (a) In accordance with Part 2 of Executive Order 13526... developed material consistent with the classification markings that apply to the source information, is...

  11. 46 CFR 503.55 - Derivative classification.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 46 Shipping 9 2013-10-01 2013-10-01 false Derivative classification. 503.55 Section 503.55... Security Program § 503.55 Derivative classification. (a) In accordance with Part 2 of Executive Order 13526... developed material consistent with the classification markings that apply to the source information, is...

  12. 46 CFR 503.55 - Derivative classification.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 46 Shipping 9 2012-10-01 2012-10-01 false Derivative classification. 503.55 Section 503.55... Security Program § 503.55 Derivative classification. (a) In accordance with Part 2 of Executive Order 13526... developed material consistent with the classification markings that apply to the source information, is...

  13. Verification and classification bias interactions in diagnostic test accuracy studies for fine-needle aspiration biopsy.

    PubMed

    Schmidt, Robert L; Walker, Brandon S; Cohen, Michael B

    2015-03-01

    Reliable estimates of accuracy are important for any diagnostic test. Diagnostic accuracy studies are subject to unique sources of bias. Verification bias and classification bias are 2 sources of bias that commonly occur in diagnostic accuracy studies. Statistical methods are available to estimate the impact of these sources of bias when they occur alone. The impact of interactions when these types of bias occur together has not been investigated. We developed mathematical relationships to show the combined effect of verification bias and classification bias. A wide range of case scenarios were generated to assess the impact of bias components and interactions on total bias. Interactions between verification bias and classification bias caused overestimation of sensitivity and underestimation of specificity. Interactions had more effect on sensitivity than specificity. Sensitivity was overestimated by at least 7% in approximately 6% of the tested scenarios. Specificity was underestimated by at least 7% in less than 0.1% of the scenarios. Interactions between verification bias and classification bias create distortions in accuracy estimates that are greater than would be predicted from each source of bias acting independently. © 2014 American Cancer Society.

  14. 28 CFR 17.28 - Automatic declassification.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... declassified not later than 25 years after the date of its original classification with the exception of... information exempted does not identify a confidential human source or human intelligence source. (c) Proposed... submit it to the Executive Secretary of the Interagency Security Classification Appeals Panel. (d...

  15. 28 CFR 17.28 - Automatic declassification.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... declassified not later than 25 years after the date of its original classification with the exception of... information exempted does not identify a confidential human source or human intelligence source. (c) Proposed... submit it to the Executive Secretary of the Interagency Security Classification Appeals Panel. (d...

  16. 28 CFR 17.28 - Automatic declassification.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... declassified not later than 25 years after the date of its original classification with the exception of... information exempted does not identify a confidential human source or human intelligence source. (c) Proposed... submit it to the Executive Secretary of the Interagency Security Classification Appeals Panel. (d...

  17. 28 CFR 17.28 - Automatic declassification.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... declassified not later than 25 years after the date of its original classification with the exception of... information exempted does not identify a confidential human source or human intelligence source. (c) Proposed... submit it to the Executive Secretary of the Interagency Security Classification Appeals Panel. (d...

  18. 28 CFR 17.28 - Automatic declassification.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... declassified not later than 25 years after the date of its original classification with the exception of... information exempted does not identify a confidential human source or human intelligence source. (c) Proposed... submit it to the Executive Secretary of the Interagency Security Classification Appeals Panel. (d...

  19. The Data Release of the Sloan Digital Sky Survey-II Supernova Survey

    NASA Astrophysics Data System (ADS)

    Sako, Masao; Bassett, Bruce; Becker, Andrew C.; Brown, Peter J.; Campbell, Heather; Wolf, Rachel; Cinabro, David; D’Andrea, Chris B.; Dawson, Kyle S.; DeJongh, Fritz; Depoy, Darren L.; Dilday, Ben; Doi, Mamoru; Filippenko, Alexei V.; Fischer, John A.; Foley, Ryan J.; Frieman, Joshua A.; Galbany, Lluis; Garnavich, Peter M.; Goobar, Ariel; Gupta, Ravi R.; Hill, Gary J.; Hayden, Brian T.; Hlozek, Renée; Holtzman, Jon A.; Hopp, Ulrich; Jha, Saurabh W.; Kessler, Richard; Kollatschny, Wolfram; Leloudas, Giorgos; Marriner, John; Marshall, Jennifer L.; Miquel, Ramon; Morokuma, Tomoki; Mosher, Jennifer; Nichol, Robert C.; Nordin, Jakob; Olmstead, Matthew D.; Östman, Linda; Prieto, Jose L.; Richmond, Michael; Romani, Roger W.; Sollerman, Jesper; Stritzinger, Max; Schneider, Donald P.; Smith, Mathew; Wheeler, J. Craig; Yasuda, Naoki; Zheng, Chen

    2018-06-01

    This paper describes the data release of the Sloan Digital Sky Survey-II (SDSS-II) Supernova Survey conducted between 2005 and 2007. Light curves, spectra, classifications, and ancillary data are presented for 10,258 variable and transient sources discovered through repeat ugriz imaging of SDSS Stripe 82, a 300 deg2 area along the celestial equator. This data release is comprised of all transient sources brighter than r ≃ 22.5 mag with no history of variability prior to 2004. Dedicated spectroscopic observations were performed on a subset of 889 transients, as well as spectra for thousands of transient host galaxies using the SDSS-III BOSS spectrographs. Photometric classifications are provided for the candidates with good multi-color light curves that were not observed spectroscopically, using host galaxy redshift information when available. From these observations, 4607 transients are either spectroscopically confirmed, or likely to be, supernovae, making this the largest sample of supernova candidates ever compiled. We present a new method for SN host-galaxy identification and derive host-galaxy properties including stellar masses, star formation rates, and the average stellar population ages from our SDSS multi-band photometry. We derive SALT2 distance moduli for a total of 1364 SN Ia with spectroscopic redshifts as well as photometric redshifts for a further 624 purely photometric SN Ia candidates. Using the spectroscopically confirmed subset of the three-year SDSS-II SN Ia sample and assuming a flat ΛCDM cosmology, we determine Ω M = 0.315 ± 0.093 (statistical error only) and detect a non-zero cosmological constant at 5.7σ.

  20. Mindcontrol: A web application for brain segmentation quality control.

    PubMed

    Keshavan, Anisha; Datta, Esha; M McDonough, Ian; Madan, Christopher R; Jordan, Kesshi; Henry, Roland G

    2018-04-15

    Tissue classification plays a crucial role in the investigation of normal neural development, brain-behavior relationships, and the disease mechanisms of many psychiatric and neurological illnesses. Ensuring the accuracy of tissue classification is important for quality research and, in particular, the translation of imaging biomarkers to clinical practice. Assessment with the human eye is vital to correct various errors inherent to all currently available segmentation algorithms. Manual quality assurance becomes methodologically difficult at a large scale - a problem of increasing importance as the number of data sets is on the rise. To make this process more efficient, we have developed Mindcontrol, an open-source web application for the collaborative quality control of neuroimaging processing outputs. The Mindcontrol platform consists of a dashboard to organize data, descriptive visualizations to explore the data, an imaging viewer, and an in-browser annotation and editing toolbox for data curation and quality control. Mindcontrol is flexible and can be configured for the outputs of any software package in any data organization structure. Example configurations for three large, open-source datasets are presented: the 1000 Functional Connectomes Project (FCP), the Consortium for Reliability and Reproducibility (CoRR), and the Autism Brain Imaging Data Exchange (ABIDE) Collection. These demo applications link descriptive quality control metrics, regional brain volumes, and thickness scalars to a 3D imaging viewer and editing module, resulting in an easy-to-implement quality control protocol that can be scaled for any size and complexity of study. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  1. A classification on human factor accident/incident of China civil aviation in recent twelve years.

    PubMed

    Luo, Xiao-li

    2004-10-01

    To study human factor accident/incident occurred during 1990-2001 using new classification standard. The human factor accident/incident classification standard is developed on the basis of Reason's Model, combining with CAAC's traditional classifying method, and applied to the classified statistical analysis for 361 flying incidents and 35 flight accidents of China civil aviation, which is induced by human factors and occurred from 1990 to 2001. 1) the incident percentage of taxi and cruise is higher than that of takeoff, climb and descent. 2) The dominating type of flight incidents is diverging of runway, overrunning, near-miss, tail/wingtip/engine strike and ground obstacle impacting. 3) The top three accidents are out of control caused by crew, mountain collision and over runway. 4) Crew's basic operating skill is lower than what we imagined, the mostly representation is poor correcting ability when flight error happened. 5) Crew errors can be represented by incorrect control, regulation and procedure violation, disorientation and diverging percentage of correct flight level. The poor CRM skill is the dominant factor impacting China civil aviation safety, this result has a coincidence with previous study, but there is much difference and distinct characteristic in top incident phase, the type of crew error and behavior performance compared with that of advanced countries. We should strengthen CRM training for all of pilots aiming at the Chinese pilot behavior characteristic in order to improve the safety level of China civil aviation.

  2. Prediction of discretization error using the error transport equation

    NASA Astrophysics Data System (ADS)

    Celik, Ismail B.; Parsons, Don Roscoe

    2017-06-01

    This study focuses on an approach to quantify the discretization error associated with numerical solutions of partial differential equations by solving an error transport equation (ETE). The goal is to develop a method that can be used to adequately predict the discretization error using the numerical solution on only one grid/mesh. The primary problem associated with solving the ETE is the formulation of the error source term which is required for accurately predicting the transport of the error. In this study, a novel approach is considered which involves fitting the numerical solution with a series of locally smooth curves and then blending them together with a weighted spline approach. The result is a continuously differentiable analytic expression that can be used to determine the error source term. Once the source term has been developed, the ETE can easily be solved using the same solver that is used to obtain the original numerical solution. The new methodology is applied to the two-dimensional Navier-Stokes equations in the laminar flow regime. A simple unsteady flow case is also considered. The discretization error predictions based on the methodology presented in this study are in good agreement with the 'true error'. While in most cases the error predictions are not quite as accurate as those from Richardson extrapolation, the results are reasonable and only require one numerical grid. The current results indicate that there is much promise going forward with the newly developed error source term evaluation technique and the ETE.

  3. Characterization of Escherichia coli isolates from different fecal sources by means of classification tree analysis of fatty acid methyl ester (FAME) profiles.

    PubMed

    Seurinck, Sylvie; Deschepper, Ellen; Deboch, Bishaw; Verstraete, Willy; Siciliano, Steven

    2006-03-01

    Microbial source tracking (MST) methods need to be rapid, inexpensive and accurate. Unfortunately, many MST methods provide a wealth of information that is difficult to interpret by the regulators who use this information to make decisions. This paper describes the use of classification tree analysis to interpret the results of a MST method based on fatty acid methyl ester (FAME) profiles of Escherichia coli isolates, and to present results in a format readily interpretable by water quality managers. Raw sewage E. coli isolates and animal E. coli isolates from cow, dog, gull, and horse were isolated and their FAME profiles collected. Correct classification rates determined with leaveone-out cross-validation resulted in an overall low correct classification rate of 61%. A higher overall correct classification rate of 85% was obtained when the animal isolates were pooled together and compared to the raw sewage isolates. Bootstrap aggregation or adaptive resampling and combining of the FAME profile data increased correct classification rates substantially. Other MST methods may be better suited to differentiate between different fecal sources but classification tree analysis has enabled us to distinguish raw sewage from animal E. coli isolates, which previously had not been possible with other multivariate methods such as principal component analysis and cluster analysis.

  4. Robust Image Regression Based on the Extended Matrix Variate Power Exponential Distribution of Dependent Noise.

    PubMed

    Luo, Lei; Yang, Jian; Qian, Jianjun; Tai, Ying; Lu, Gui-Fu

    2017-09-01

    Dealing with partial occlusion or illumination is one of the most challenging problems in image representation and classification. In this problem, the characterization of the representation error plays a crucial role. In most current approaches, the error matrix needs to be stretched into a vector and each element is assumed to be independently corrupted. This ignores the dependence between the elements of error. In this paper, it is assumed that the error image caused by partial occlusion or illumination changes is a random matrix variate and follows the extended matrix variate power exponential distribution. This has the heavy tailed regions and can be used to describe a matrix pattern of l×m dimensional observations that are not independent. This paper reveals the essence of the proposed distribution: it actually alleviates the correlations between pixels in an error matrix E and makes E approximately Gaussian. On the basis of this distribution, we derive a Schatten p -norm-based matrix regression model with L q regularization. Alternating direction method of multipliers is applied to solve this model. To get a closed-form solution in each step of the algorithm, two singular value function thresholding operators are introduced. In addition, the extended Schatten p -norm is utilized to characterize the distance between the test samples and classes in the design of the classifier. Extensive experimental results for image reconstruction and classification with structural noise demonstrate that the proposed algorithm works much more robustly than some existing regression-based methods.

  5. Assessing the accuracy of the International Classification of Diseases codes to identify abusive head trauma: a feasibility study.

    PubMed

    Berger, Rachel P; Parks, Sharyn; Fromkin, Janet; Rubin, Pamela; Pecora, Peter J

    2015-04-01

    To assess the accuracy of an International Classification of Diseases (ICD) code-based operational case definition for abusive head trauma (AHT). Subjects were children <5 years of age evaluated for AHT by a hospital-based Child Protection Team (CPT) at a tertiary care paediatric hospital with a completely electronic medical record (EMR) system. Subjects were designated as non-AHT traumatic brain injury (TBI) or AHT based on whether the CPT determined that the injuries were due to AHT. The sensitivity and specificity of the ICD-based definition were calculated. There were 223 children evaluated for AHT: 117 AHT and 106 non-AHT TBI. The sensitivity and specificity of the ICD-based operational case definition were 92% (95% CI 85.8 to 96.2) and 96% (95% CI 92.3 to 99.7), respectively. All errors in sensitivity and three of the four specificity errors were due to coder error; one specificity error was a physician error. In a paediatric tertiary care hospital with an EMR system, the accuracy of an ICD-based case definition for AHT was high. Additional studies are needed to assess the accuracy of this definition in all types of hospitals in which children with AHT are cared for. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  6. Modeling habitat dynamics accounting for possible misclassification

    USGS Publications Warehouse

    Veran, Sophie; Kleiner, Kevin J.; Choquet, Remi; Collazo, Jaime; Nichols, James D.

    2012-01-01

    Land cover data are widely used in ecology as land cover change is a major component of changes affecting ecological systems. Landscape change estimates are characterized by classification errors. Researchers have used error matrices to adjust estimates of areal extent, but estimation of land cover change is more difficult and more challenging, with error in classification being confused with change. We modeled land cover dynamics for a discrete set of habitat states. The approach accounts for state uncertainty to produce unbiased estimates of habitat transition probabilities using ground information to inform error rates. We consider the case when true and observed habitat states are available for the same geographic unit (pixel) and when true and observed states are obtained at one level of resolution, but transition probabilities estimated at a different level of resolution (aggregations of pixels). Simulation results showed a strong bias when estimating transition probabilities if misclassification was not accounted for. Scaling-up does not necessarily decrease the bias and can even increase it. Analyses of land cover data in the Southeast region of the USA showed that land change patterns appeared distorted if misclassification was not accounted for: rate of habitat turnover was artificially increased and habitat composition appeared more homogeneous. Not properly accounting for land cover misclassification can produce misleading inferences about habitat state and dynamics and also misleading predictions about species distributions based on habitat. Our models that explicitly account for state uncertainty should be useful in obtaining more accurate inferences about change from data that include errors.

  7. Underwater target classification using wavelet packets and neural networks.

    PubMed

    Azimi-Sadjadi, M R; Yao, D; Huang, Q; Dobeck, G J

    2000-01-01

    In this paper, a new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals. The system consists of a feature extractor using wavelet packets in conjunction with linear predictive coding (LPC), a feature selection scheme, and a backpropagation neural-network classifier. The data set used for this study consists of the backscattered signals from six different objects: two mine-like targets and four nontargets for several aspect angles. Simulation results on ten different noisy realizations and for signal-to-noise ratio (SNR) of 12 dB are presented. The receiver operating characteristic (ROC) curve of the classifier generated based on these results demonstrated excellent classification performance of the system. The generalization ability of the trained network was demonstrated by computing the error and classification rate statistics on a large data set. A multiaspect fusion scheme was also adopted in order to further improve the classification performance.

  8. The effect of the atmosphere on the classification of satellite observations to identify surface features

    NASA Technical Reports Server (NTRS)

    Fraser, R. S.; Bahethi, O. P.; Al-Abbas, A. H.

    1977-01-01

    The effect of differences in atmospheric turbidity on the classification of Landsat 1 observations of a rural scene is presented. The observations are classified by an unsupervised clustering technique. These clusters serve as a training set for use of a maximum-likelihood algorithm. The measured radiances in each of the four spectral bands are then changed by amounts measured by Landsat 1. These changes can be associated with a decrease in atmospheric turbidity by a factor of 1.3. The classification of 22% of the pixels changes as a result of the modification. The modified observations are then reclassified as an independent set. Only 3% of the pixels have a different classification than the unmodified set. Hence, if classification errors of rural areas are not to exceed 15%, a new training set has to be developed whenever the difference in turbidity between the training and test sets reaches unity.

  9. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. ?? 2010 Springer Science+Business Media, LLC.

  10. Space-Borne Laser Altimeter Geolocation Error Analysis

    NASA Astrophysics Data System (ADS)

    Wang, Y.; Fang, J.; Ai, Y.

    2018-05-01

    This paper reviews the development of space-borne laser altimetry technology over the past 40 years. Taking the ICESAT satellite as an example, a rigorous space-borne laser altimeter geolocation model is studied, and an error propagation equation is derived. The influence of the main error sources, such as the platform positioning error, attitude measurement error, pointing angle measurement error and range measurement error, on the geolocation accuracy of the laser spot are analysed by simulated experiments. The reasons for the different influences on geolocation accuracy in different directions are discussed, and to satisfy the accuracy of the laser control point, a design index for each error source is put forward.

  11. Stretchy binary classification.

    PubMed

    Toh, Kar-Ann; Lin, Zhiping; Sun, Lei; Li, Zhengguo

    2018-01-01

    In this article, we introduce an analytic formulation for compressive binary classification. The formulation seeks to solve the least ℓ p -norm of the parameter vector subject to a classification error constraint. An analytic and stretchable estimation is conjectured where the estimation can be viewed as an extension of the pseudoinverse with left and right constructions. Our variance analysis indicates that the estimation based on the left pseudoinverse is unbiased and the estimation based on the right pseudoinverse is biased. Sparseness can be obtained for the biased estimation under certain mild conditions. The proposed estimation is investigated numerically using both synthetic and real-world data. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. [Effects of residents' care needs classification (and misclassification) in nursing homes: the example of SOSIA classification].

    PubMed

    Nebuloni, G; Di Giulio, P; Gregori, D; Sandonà, P; Berchialla, P; Foltran, F; Renga, G

    2011-01-01

    Since 2003, the Lombardy region has introduced a case-mix reimbursement system for nursing homes based on the SOSIA form which classifies residents into eight classes of frailty. In the present study the agreement between SOSIA classification and other well documented instruments, including Barthel Index, Mini Mental State Examination and Clinical Dementia Rating Scale is evaluated in 100 nursing home residents. Only 50% of residents with severe dementia have been recognized as seriously impaired when assessed with SOSIA form; since misclassification errors underestimate residents' care needs, they determine an insufficient reimbursement limiting nursing home possibility to offer care appropriate for the case-mix.

  13. Effect of filtration of signals of brain activity on quality of recognition of brain activity patterns using artificial intelligence methods

    NASA Astrophysics Data System (ADS)

    Hramov, Alexander E.; Frolov, Nikita S.; Musatov, Vyachaslav Yu.

    2018-02-01

    In present work we studied features of the human brain states classification, corresponding to the real movements of hands and legs. For this purpose we used supervised learning algorithm based on feed-forward artificial neural networks (ANNs) with error back-propagation along with the support vector machine (SVM) method. We compared the quality of operator movements classification by means of EEG signals obtained experimentally in the absence of preliminary processing and after filtration in different ranges up to 25 Hz. It was shown that low-frequency filtering of multichannel EEG data significantly improved accuracy of operator movements classification.

  14. Ground truth management system to support multispectral scanner /MSS/ digital analysis

    NASA Technical Reports Server (NTRS)

    Coiner, J. C.; Ungar, S. G.

    1977-01-01

    A computerized geographic information system for management of ground truth has been designed and implemented to relate MSS classification results to in situ observations. The ground truth system transforms, generalizes and rectifies ground observations to conform to the pixel size and shape of high resolution MSS aircraft data. These observations can then be aggregated for comparison to lower resolution sensor data. Construction of a digital ground truth array allows direct pixel by pixel comparison between classification results of MSS data and ground truth. By making comparisons, analysts can identify spatial distribution of error within the MSS data as well as usual figures of merit for the classifications. Use of the ground truth system permits investigators to compare a variety of environmental or anthropogenic data, such as soil color or tillage patterns, with classification results and allows direct inclusion of such data into classification operations. To illustrate the system, examples from classification of simulated Thematic Mapper data for agricultural test sites in North Dakota and Kansas are provided.

  15. Determination of Barometric Altimeter Errors for the Orion Exploration Flight Test-1 Entry

    NASA Technical Reports Server (NTRS)

    Brown, Denise L.; Bunoz, Jean-Philippe; Gay, Robert

    2012-01-01

    The Exploration Flight Test 1 (EFT-1) mission is the unmanned flight test for the upcoming Multi-Purpose Crew Vehicle (MPCV). During entry, the EFT-1 vehicle will trigger several Landing and Recovery System (LRS) events, such as parachute deployment, based on on-board altitude information. The primary altitude source is the filtered navigation solution updated with GPS measurement data. The vehicle also has three barometric altimeters that will be used to measure atmospheric pressure during entry. In the event that GPS data is not available during entry, the altitude derived from the barometric altimeter pressure will be used to trigger chute deployment for the drogues and main parachutes. Therefore it is important to understand the impact of error sources on the pressure measured by the barometric altimeters and on the altitude derived from that pressure. The error sources for the barometric altimeters are not independent, and many error sources result in bias in a specific direction. Therefore conventional error budget methods could not be applied. Instead, high fidelity Monte-Carlo simulation was performed and error bounds were determined based on the results of this analysis. Aerodynamic errors were the largest single contributor to the error budget for the barometric altimeters. The large errors drove a change to the altitude trigger setpoint for FBC jettison deploy.

  16. Global terrain classification using Multiple-Error-Removed Improved-Terrain (MERIT) to address susceptibility of landslides and other geohazards

    NASA Astrophysics Data System (ADS)

    Iwahashi, J.; Yamazaki, D.; Matsuoka, M.; Thamarux, P.; Herrick, J.; Yong, A.; Mital, U.

    2017-12-01

    A seamless model of landform classifications with regional accuracy will be a powerful platform for geophysical studies that forecast geologic hazards. Spatial variability as a function of landform on a global scale was captured in the automated classifications of Iwahashi and Pike (2007) and additional developments are presented here that incorporate more accurate depictions using higher-resolution elevation data than the original 1-km scale Shuttle Radar Topography Mission digital elevation model (DEM). We create polygon-based terrain classifications globally by using the 280-m DEM interpolated from the Multi-Error-Removed Improved-Terrain DEM (MERIT; Yamazaki et al., 2017). The multi-scale pixel-image analysis method, known as Multi-resolution Segmentation (Baatz and Schäpe, 2000), is first used to classify the terrains based on geometric signatures (slope and local convexity) calculated from the 280-m DEM. Next, we apply the machine learning method of "k-means clustering" to prepare the polygon-based classification at the globe-scale using slope, local convexity and surface texture. We then group the divisions with similar properties by hierarchical clustering and other statistical analyses using geological and geomorphological data of the area where landslides and earthquakes are frequent (e.g. Japan and California). We find the 280-m DEM resolution is only partially sufficient for classifying plains. We nevertheless observe that the categories correspond to reported landslide and liquefaction features at the global scale, suggesting that our model is an appropriate platform to forecast ground failure. To predict seismic amplification, we estimate site conditions using the time-averaged shear-wave velocity in the upper 30-m (VS30) measurements compiled by Yong et al. (2016) and the terrain model developed by Yong (2016; Y16). We plan to test our method on finer resolution DEMs and report our findings to obtain a more globally consistent terrain model as there are known errors in DEM derivatives at higher-resolutions. We expect the improvement in DEM resolution (4 times greater detail) and the combination of regional and global coverage will yield a consistent dataset of polygons that have the potential to improve relations to the Y16 estimates significantly.

  17. Systemic inaccuracies in the National Surgical Quality Improvement Program database: Implications for accuracy and validity for neurosurgery outcomes research.

    PubMed

    Rolston, John D; Han, Seunggu J; Chang, Edward F

    2017-03-01

    The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) provides a rich database of North American surgical procedures and their complications. Yet no external source has validated the accuracy of the information within this database. Using records from the 2006 to 2013 NSQIP database, we used two methods to identify errors: (1) mismatches between the Current Procedural Terminology (CPT) code that was used to identify the surgical procedure, and the International Classification of Diseases (ICD-9) post-operative diagnosis: i.e., a diagnosis that is incompatible with a certain procedure. (2) Primary anesthetic and CPT code mismatching: i.e., anesthesia not indicated for a particular procedure. Analyzing data for movement disorders, epilepsy, and tumor resection, we found evidence of CPT code and postoperative diagnosis mismatches in 0.4-100% of cases, depending on the CPT code examined. When analyzing anesthetic data from brain tumor, epilepsy, trauma, and spine surgery, we found evidence of miscoded anesthesia in 0.1-0.8% of cases. National databases like NSQIP are an important tool for quality improvement. Yet all databases are subject to errors, and measures of internal consistency show that errors affect up to 100% of case records for certain procedures in NSQIP. Steps should be taken to improve data collection on the frontend of NSQIP, and also to ensure that future studies with NSQIP take steps to exclude erroneous cases from analysis. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Disregarding population specificity: its influence on the sex assessment methods from the tibia.

    PubMed

    Kotěrová, Anežka; Velemínská, Jana; Dupej, Ján; Brzobohatá, Hana; Pilný, Aleš; Brůžek, Jaroslav

    2017-01-01

    Forensic anthropology has developed classification techniques for sex estimation of unknown skeletal remains, for example population-specific discriminant function analyses. These methods were designed for populations that lived mostly in the late nineteenth and twentieth centuries. Their level of reliability or misclassification is important for practical use in today's forensic practice; it is, however, unknown. We addressed the question of what the likelihood of errors would be if population specificity of discriminant functions of the tibia were disregarded. Moreover, five classification functions in a Czech sample were proposed (accuracies 82.1-87.5 %, sex bias ranged from -1.3 to -5.4 %). We measured ten variables traditionally used for sex assessment of the tibia on a sample of 30 male and 26 female models from recent Czech population. To estimate the classification accuracy and error (misclassification) rates ignoring population specificity, we selected published classification functions of tibia for the Portuguese, south European, and the North American populations. These functions were applied on the dimensions of the Czech population. Comparing the classification success of the reference and the tested Czech sample showed that females from Czech population were significantly overestimated and mostly misclassified as males. Overall accuracy of sex assessment significantly decreased (53.6-69.7 %), sex bias -29.4-100 %, which is most probably caused by secular trend and the generally high variability of body size. Results indicate that the discriminant functions, developed for skeletal series representing geographically and chronologically diverse populations, are not applicable in current forensic investigations. Finally, implications and recommendations for future research are discussed.

  19. Taxonaut: an application software for comparative display of multiple taxonomies with a use case of GBIF Species API

    PubMed Central

    2016-01-01

    Abstract Background The Species API of the Global Biodiversity Information Facility (GBIF) provides public access to taxonomic data aggregated from multiple data sources. Each data source follows its own classification which can be inconsistent with classifications from other sources. Even with a reference classification e.g. the GBIF Backbone taxonomy, a comprehensive method to compare classifications in the data aggregation is essential, especially for non-expert users. New information A Java application was developed to compare multiple taxonomies graphically using classification data acquired from GBIF’s ChecklistBank via the GBIF Species API. It uses a table to display taxonomies where each column represents a taxonomy under comparison, with an aligner column to organise taxa by name. Each cell contains the name of a taxon if the classification in that column contains the name. Each column also has a cell showing the hierarchy of the taxonomy by a folder metaphor where taxa are aligned and synchronised in the aligner column. A set of those comparative tables shows taxa categorised by relationship between taxonomies. The result set is also available as tables in an Excel format file. PMID:27932916

  20. Decision Making for Borderline Cases in Pass/Fail Clinical Anatomy Courses: The Practical Value of the Standard Error of Measurement and Likelihood Ratio in a Diagnostic Test

    ERIC Educational Resources Information Center

    Severo, Milton; Silva-Pereira, Fernanda; Ferreira, Maria Amelia

    2013-01-01

    Several studies have shown that the standard error of measurement (SEM) can be used as an additional “safety net” to reduce the frequency of false-positive or false-negative student grading classifications. Practical examinations in clinical anatomy are often used as diagnostic tests to admit students to course final examinations. The aim of this…

  1. TOWARD ERROR ANALYSIS OF LARGE-SCALE FOREST CARBON BUDGETS

    EPA Science Inventory

    Quantification of forest carbon sources and sinks is an important part of national inventories of net greenhouse gas emissions. Several such forest carbon budgets have been constructed, but little effort has been made to analyse the sources of error and how these errors propagate...

  2. Task-dependent signal variations in EEG error-related potentials for brain-computer interfaces.

    PubMed

    Iturrate, I; Montesano, L; Minguez, J

    2013-04-01

    A major difficulty of brain-computer interface (BCI) technology is dealing with the noise of EEG and its signal variations. Previous works studied time-dependent non-stationarities for BCIs in which the user's mental task was independent of the device operation (e.g., the mental task was motor imagery and the operational task was a speller). However, there are some BCIs, such as those based on error-related potentials, where the mental and operational tasks are dependent (e.g., the mental task is to assess the device action and the operational task is the device action itself). The dependence between the mental task and the device operation could introduce a new source of signal variations when the operational task changes, which has not been studied yet. The aim of this study is to analyse task-dependent signal variations and their effect on EEG error-related potentials. The work analyses the EEG variations on the three design steps of BCIs: an electrophysiology study to characterize the existence of these variations, a feature distribution analysis and a single-trial classification analysis to measure the impact on the final BCI performance. The results demonstrate that a change in the operational task produces variations in the potentials, even when EEG activity exclusively originated in brain areas related to error processing is considered. Consequently, the extracted features from the signals vary, and a classifier trained with one operational task presents a significant loss of performance for other tasks, requiring calibration or adaptation for each new task. In addition, a new calibration for each of the studied tasks rapidly outperforms adaptive techniques designed in the literature to mitigate the EEG time-dependent non-stationarities.

  3. Task-dependent signal variations in EEG error-related potentials for brain-computer interfaces

    NASA Astrophysics Data System (ADS)

    Iturrate, I.; Montesano, L.; Minguez, J.

    2013-04-01

    Objective. A major difficulty of brain-computer interface (BCI) technology is dealing with the noise of EEG and its signal variations. Previous works studied time-dependent non-stationarities for BCIs in which the user’s mental task was independent of the device operation (e.g., the mental task was motor imagery and the operational task was a speller). However, there are some BCIs, such as those based on error-related potentials, where the mental and operational tasks are dependent (e.g., the mental task is to assess the device action and the operational task is the device action itself). The dependence between the mental task and the device operation could introduce a new source of signal variations when the operational task changes, which has not been studied yet. The aim of this study is to analyse task-dependent signal variations and their effect on EEG error-related potentials.Approach. The work analyses the EEG variations on the three design steps of BCIs: an electrophysiology study to characterize the existence of these variations, a feature distribution analysis and a single-trial classification analysis to measure the impact on the final BCI performance.Results and significance. The results demonstrate that a change in the operational task produces variations in the potentials, even when EEG activity exclusively originated in brain areas related to error processing is considered. Consequently, the extracted features from the signals vary, and a classifier trained with one operational task presents a significant loss of performance for other tasks, requiring calibration or adaptation for each new task. In addition, a new calibration for each of the studied tasks rapidly outperforms adaptive techniques designed in the literature to mitigate the EEG time-dependent non-stationarities.

  4. Use of scan overlap redundancy to enhance multispectral aircraft scanner data

    NASA Technical Reports Server (NTRS)

    Lindenlaub, J. C.; Keat, J.

    1973-01-01

    Two criteria were suggested for optimizing the resolution error versus signal-to-noise-ratio tradeoff. The first criterion uses equal weighting coefficients and chooses n, the number of lines averaged, so as to make the average resolution error equal to the noise error. The second criterion adjusts both the number and relative sizes of the weighting coefficients so as to minimize the total error (resolution error plus noise error). The optimum set of coefficients depends upon the geometry of the resolution element, the number of redundant scan lines, the scan line increment, and the original signal-to-noise ratio of the channel. Programs were developed to find the optimum number and relative weights of the averaging coefficients. A working definition of signal-to-noise ratio was given and used to try line averaging on a typical set of data. Line averaging was evaluated only with respect to its effect on classification accuracy.

  5. Quantifying errors without random sampling.

    PubMed

    Phillips, Carl V; LaPole, Luwanna M

    2003-06-12

    All quantifications of mortality, morbidity, and other health measures involve numerous sources of error. The routine quantification of random sampling error makes it easy to forget that other sources of error can and should be quantified. When a quantification does not involve sampling, error is almost never quantified and results are often reported in ways that dramatically overstate their precision. We argue that the precision implicit in typical reporting is problematic and sketch methods for quantifying the various sources of error, building up from simple examples that can be solved analytically to more complex cases. There are straightforward ways to partially quantify the uncertainty surrounding a parameter that is not characterized by random sampling, such as limiting reported significant figures. We present simple methods for doing such quantifications, and for incorporating them into calculations. More complicated methods become necessary when multiple sources of uncertainty must be combined. We demonstrate that Monte Carlo simulation, using available software, can estimate the uncertainty resulting from complicated calculations with many sources of uncertainty. We apply the method to the current estimate of the annual incidence of foodborne illness in the United States. Quantifying uncertainty from systematic errors is practical. Reporting this uncertainty would more honestly represent study results, help show the probability that estimated values fall within some critical range, and facilitate better targeting of further research.

  6. 12 CFR 605.502 - Program and procedures.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... procedures. (a) The Farm Credit Administration has no authority for the original classification of... classify information. (b) Derivative classification. “Derivative classification” means the incorporating... developed material consistent with the classification markings that apply to the source information...

  7. 12 CFR 605.502 - Program and procedures.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... procedures. (a) The Farm Credit Administration has no authority for the original classification of... classify information. (b) Derivative classification. “Derivative classification” means the incorporating... developed material consistent with the classification markings that apply to the source information...

  8. 12 CFR 605.502 - Program and procedures.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... procedures. (a) The Farm Credit Administration has no authority for the original classification of... classify information. (b) Derivative classification. “Derivative classification” means the incorporating... developed material consistent with the classification markings that apply to the source information...

  9. 12 CFR 605.502 - Program and procedures.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... procedures. (a) The Farm Credit Administration has no authority for the original classification of... classify information. (b) Derivative classification. “Derivative classification” means the incorporating... developed material consistent with the classification markings that apply to the source information...

  10. Reducing error and improving efficiency during vascular interventional radiology: implementation of a preprocedural team rehearsal.

    PubMed

    Morbi, Abigail H M; Hamady, Mohamad S; Riga, Celia V; Kashef, Elika; Pearch, Ben J; Vincent, Charles; Moorthy, Krishna; Vats, Amit; Cheshire, Nicholas J W; Bicknell, Colin D

    2012-08-01

    To determine the type and frequency of errors during vascular interventional radiology (VIR) and design and implement an intervention to reduce error and improve efficiency in this setting. Ethical guidance was sought from the Research Services Department at Imperial College London. Informed consent was not obtained. Field notes were recorded during 55 VIR procedures by a single observer. Two blinded assessors identified failures from field notes and categorized them into one or more errors by using a 22-part classification system. The potential to cause harm, disruption to procedural flow, and preventability of each failure was determined. A preprocedural team rehearsal (PPTR) was then designed and implemented to target frequent preventable potential failures. Thirty-three procedures were observed subsequently to determine the efficacy of the PPTR. Nonparametric statistical analysis was used to determine the effect of intervention on potential failure rates, potential to cause harm and procedural flow disruption scores (Mann-Whitney U test), and number of preventable failures (Fisher exact test). Before intervention, 1197 potential failures were recorded, of which 54.6% were preventable. A total of 2040 errors were deemed to have occurred to produce these failures. Planning error (19.7%), staff absence (16.2%), equipment unavailability (12.2%), communication error (11.2%), and lack of safety consciousness (6.1%) were the most frequent errors, accounting for 65.4% of the total. After intervention, 352 potential failures were recorded. Classification resulted in 477 errors. Preventable failures decreased from 54.6% to 27.3% (P < .001) with implementation of PPTR. Potential failure rates per hour decreased from 18.8 to 9.2 (P < .001), with no increase in potential to cause harm or procedural flow disruption per failure. Failures during VIR procedures are largely because of ineffective planning, communication error, and equipment difficulties, rather than a result of technical or patient-related issues. Many of these potential failures are preventable. A PPTR is an effective means of targeting frequent preventable failures, reducing procedural delays and improving patient safety.

  11. Bug Distribution and Statistical Pattern Classification.

    ERIC Educational Resources Information Center

    Tatsuoka, Kikumi K.; Tatsuoka, Maurice M.

    1987-01-01

    The rule space model permits measurement of cognitive skill acquisition and error diagnosis. Further discussion introduces Bayesian hypothesis testing and bug distribution. An illustration involves an artificial intelligence approach to testing fractions and arithmetic. (Author/GDC)

  12. Scattering from binary optics

    NASA Technical Reports Server (NTRS)

    Ricks, Douglas W.

    1993-01-01

    There are a number of sources of scattering in binary optics: etch depth errors, line edge errors, quantization errors, roughness, and the binary approximation to the ideal surface. These sources of scattering can be systematic (deterministic) or random. In this paper, scattering formulas for both systematic and random errors are derived using Fourier optics. These formulas can be used to explain the results of scattering measurements and computer simulations.

  13. Psychrometric Measurement of Leaf Water Potential: Lack of Error Attributable to Leaf Permeability.

    PubMed

    Barrs, H D

    1965-07-02

    A report that low permeability could cause gross errors in psychrometric determinations of water potential in leaves has not been confirmed. No measurable error from this source could be detected for either of two types of thermocouple psychrometer tested on four species, each at four levels of water potential. No source of error other than tissue respiration could be demonstrated.

  14. Classification of ASKAP Vast Radio Light Curves

    NASA Technical Reports Server (NTRS)

    Rebbapragada, Umaa; Lo, Kitty; Wagstaff, Kiri L.; Reed, Colorado; Murphy, Tara; Thompson, David R.

    2012-01-01

    The VAST survey is a wide-field survey that observes with unprecedented instrument sensitivity (0.5 mJy or lower) and repeat cadence (a goal of 5 seconds) that will enable novel scientific discoveries related to known and unknown classes of radio transients and variables. Given the unprecedented observing characteristics of VAST, it is important to estimate source classification performance, and determine best practices prior to the launch of ASKAP's BETA in 2012. The goal of this study is to identify light curve characterization and classification algorithms that are best suited for archival VAST light curve classification. We perform our experiments on light curve simulations of eight source types and achieve best case performance of approximately 90% accuracy. We note that classification performance is most influenced by light curve characterization rather than classifier algorithm.

  15. Temporal dynamics of conflict monitoring and the effects of one or two conflict sources on error-(related) negativity.

    PubMed

    Armbrecht, Anne-Simone; Wöhrmann, Anne; Gibbons, Henning; Stahl, Jutta

    2010-09-01

    The present electrophysiological study investigated the temporal development of response conflict and the effects of diverging conflict sources on error(-related) negativity (Ne). Eighteen participants performed a combined stop-signal flanker task, which was comprised of two different conflict sources: a left-right and a go-stop response conflict. It is assumed that the Ne reflects the activity of a conflict monitoring system and thus increases according to (i) the number of conflict sources and (ii) the temporal development of the conflict activity. No increase of the Ne amplitude after double errors (comprising two conflict sources) as compared to hand- and stop-errors (comprising one conflict source) was found, whereas a higher Ne amplitude was observed after a delayed stop-signal onset. The results suggest that the Ne is not sensitive to an increase in the number of conflict sources, but to the temporal dynamics of a go-stop response conflict. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  16. An Optimization-Based Framework for the Transformation of Incomplete Biological Knowledge into a Probabilistic Structure and Its Application to the Utilization of Gene/Protein Signaling Pathways in Discrete Phenotype Classification.

    PubMed

    Esfahani, Mohammad Shahrokh; Dougherty, Edward R

    2015-01-01

    Phenotype classification via genomic data is hampered by small sample sizes that negatively impact classifier design. Utilization of prior biological knowledge in conjunction with training data can improve both classifier design and error estimation via the construction of the optimal Bayesian classifier. In the genomic setting, gene/protein signaling pathways provide a key source of biological knowledge. Although these pathways are neither complete, nor regulatory, with no timing associated with them, they are capable of constraining the set of possible models representing the underlying interaction between molecules. The aim of this paper is to provide a framework and the mathematical tools to transform signaling pathways to prior probabilities governing uncertainty classes of feature-label distributions used in classifier design. Structural motifs extracted from the signaling pathways are mapped to a set of constraints on a prior probability on a Multinomial distribution. Being the conjugate prior for the Multinomial distribution, we propose optimization paradigms to estimate the parameters of a Dirichlet distribution in the Bayesian setting. The performance of the proposed methods is tested on two widely studied pathways: mammalian cell cycle and a p53 pathway model.

  17. Quantifying Uncertainty in Instantaneous Orbital Data Products of TRMM over Indian Subcontinent

    NASA Astrophysics Data System (ADS)

    Jayaluxmi, I.; Nagesh, D.

    2013-12-01

    In the last 20 years, microwave radiometers have taken satellite images of earth's weather proving to be a valuable tool for quantitative estimation of precipitation from space. However, along with the widespread acceptance of microwave based precipitation products, it has also been recognized that they contain large uncertainties. While most of the uncertainty evaluation studies focus on the accuracy of rainfall accumulated over time (e.g., season/year), evaluation of instantaneous rainfall intensities from satellite orbital data products are relatively rare. These instantaneous products are known to potentially cause large uncertainties during real time flood forecasting studies at the watershed scale. Especially over land regions, where the highly varying land surface emissivity offer a myriad of complications hindering accurate rainfall estimation. The error components of orbital data products also tend to interact nonlinearly with hydrologic modeling uncertainty. Keeping these in mind, the present study fosters the development of uncertainty analysis using instantaneous satellite orbital data products (version 7 of 1B11, 2A25, 2A23) derived from the passive and active sensors onboard Tropical Rainfall Measuring Mission (TRMM) satellite, namely TRMM microwave imager (TMI) and Precipitation Radar (PR). The study utilizes 11 years of orbital data from 2002 to 2012 over the Indian subcontinent and examines the influence of various error sources on the convective and stratiform precipitation types. Analysis conducted over the land regions of India investigates three sources of uncertainty in detail. These include 1) Errors due to improper delineation of rainfall signature within microwave footprint (rain/no rain classification), 2) Uncertainty offered by the transfer function linking rainfall with TMI low frequency channels and 3) Sampling errors owing to the narrow swath and infrequent visits of TRMM sensors. Case study results obtained during the Indian summer monsoon months of June-September are presented using contingency table statistics, performance diagram, scatter plots and probability density functions. Our study demonstrates that theory of copula can be efficiently used to represent the highly non linear dependency structure of rainfall with respect to TMI low frequency channels of 19, 21, 37 GHz. This questions the exclusive usage of high frequency 85 GHz channel for TMI overland rainfall retrieval algorithms. Further, the PR sampling errors revealed using a statistical bootstrap technique was found to incur relative sampling errors <30% (for 2 degree grids) over India whose magnitudes were biased towards stratiform rainfall type and sampling technique employed. These findings clearly document that proper characterization of error structure offered by TMI and PR has wider implications for decision making prior to incorporating the resulting orbital products for basin scale hydrologic modeling.

  18. Assessment of Metronidazole Susceptibility in Helicobacter pylori: Statistical Validation and Error Rate Analysis of Breakpoints Determined by the Disk Diffusion Test

    PubMed Central

    Chaves, Sandra; Gadanho, Mário; Tenreiro, Rogério; Cabrita, José

    1999-01-01

    Metronidazole susceptibility of 100 Helicobacter pylori strains was assessed by determining the inhibition zone diameters by disk diffusion test and the MICs by agar dilution and PDM Epsilometer test (E test). Linear regression analysis was performed, allowing the definition of significant linear relations, and revealed correlations of disk diffusion results with both E-test and agar dilution results (r2 = 0.88 and 0.81, respectively). No significant differences (P = 0.84) were found between MICs defined by E test and those defined by agar dilution, taken as a standard. Reproducibility comparison between E-test and disk diffusion tests showed that they are equivalent and with good precision. Two interpretative susceptibility schemes (with or without an intermediate class) were compared by an interpretative error rate analysis method. The susceptibility classification scheme that included the intermediate category was retained, and breakpoints were assessed for diffusion assay with 5-μg metronidazole disks. Strains with inhibition zone diameters less than 16 mm were defined as resistant (MIC > 8 μg/ml), those with zone diameters equal to or greater than 16 mm but less than 21 mm were considered intermediate (4 μg/ml < MIC ≤ 8 μg/ml), and those with zone diameters of 21 mm or greater were regarded as susceptible (MIC ≤ 4 μg/ml). Error rate analysis applied to this classification scheme showed occurrence frequencies of 1% for major errors and 7% for minor errors, when the results were compared to those obtained by agar dilution. No very major errors were detected, suggesting that disk diffusion might be a good alternative for determining the metronidazole sensitivity of H. pylori strains. PMID:10203543

  19. [Comparison of Google and Yahoo applications for geocoding of postal addresses in epidemiological studies].

    PubMed

    Quesada, Jose Antonio; Nolasco, Andreu; Moncho, Joaquín

    2013-01-01

    Geocoding is the assignment of geographic coordinates to spatial points, which often are postal addresses. The error made in applying this process can introduce bias in estimates of spatiotemporal models in epidemiological studies. No studies have been found to measure the error made in applying this process in Spanish cities. The objective is to evaluate the errors in magnitude and direction from two free sources (Google and Yahoo) with regard to a GPS in two Spanish cities. 30 addresses were geocoded with those two sources and the GPS in Santa Pola (Alicante) and Alicante city. The distances were calculated in metres (median, CI95%) between the sources and the GPS, globally and according to the status reported by each source. The directionality of the error was evaluated by calculating the location quadrant and applying a Chi-Square test. The GPS error was evaluated by geocoding 11 addresses twice at 4 days interval. The overall median in Google-GPS was 23,2 metres (16,0-32,1) for Santa Pola, and 21,4 meters (14,9-31,1) for Alicante. The overall median in Yahoo was 136,0 meters (19,2-318,5) for Santa Pola, and 23,8 meters (13,6- 29,2) for Alicante. Between the 73% and 90% were geocoded by status as "exact or interpolated" (minor error), where Goggle and Yahoo had a median error between 19 and 23 metres in the two cities. The GPS had a median error of 13.8 meters (6,7-17,8). No error directionality was detected. Google error is acceptable and stable in the two cities, so that it is a reliable source for Para medir elgeocoding addresses in Spain in epidemiological studies.

  20. Nonlinear Conservation Laws and Finite Volume Methods

    NASA Astrophysics Data System (ADS)

    Leveque, Randall J.

    Introduction Software Notation Classification of Differential Equations Derivation of Conservation Laws The Euler Equations of Gas Dynamics Dissipative Fluxes Source Terms Radiative Transfer and Isothermal Equations Multi-dimensional Conservation Laws The Shock Tube Problem Mathematical Theory of Hyperbolic Systems Scalar Equations Linear Hyperbolic Systems Nonlinear Systems The Riemann Problem for the Euler Equations Numerical Methods in One Dimension Finite Difference Theory Finite Volume Methods Importance of Conservation Form - Incorrect Shock Speeds Numerical Flux Functions Godunov's Method Approximate Riemann Solvers High-Resolution Methods Other Approaches Boundary Conditions Source Terms and Fractional Steps Unsplit Methods Fractional Step Methods General Formulation of Fractional Step Methods Stiff Source Terms Quasi-stationary Flow and Gravity Multi-dimensional Problems Dimensional Splitting Multi-dimensional Finite Volume Methods Grids and Adaptive Refinement Computational Difficulties Low-Density Flows Discrete Shocks and Viscous Profiles Start-Up Errors Wall Heating Slow-Moving Shocks Grid Orientation Effects Grid-Aligned Shocks Magnetohydrodynamics The MHD Equations One-Dimensional MHD Solving the Riemann Problem Nonstrict Hyperbolicity Stiffness The Divergence of B Riemann Problems in Multi-dimensional MHD Staggered Grids The 8-Wave Riemann Solver Relativistic Hydrodynamics Conservation Laws in Spacetime The Continuity Equation The 4-Momentum of a Particle The Stress-Energy Tensor Finite Volume Methods Multi-dimensional Relativistic Flow Gravitation and General Relativity References

  1. A 1400-MHz survey of 1478 Abell clusters of galaxies

    NASA Technical Reports Server (NTRS)

    Owen, F. N.; White, R. A.; Hilldrup, K. C.; Hanisch, R. J.

    1982-01-01

    Observations of 1478 Abell clusters of galaxies with the NRAO 91-m telescope at 1400 MHz are reported. The measured beam shape was deconvolved from the measured source Gaussian fits in order to estimate the source size and position angle. All detected sources within 0.5 corrected Abell cluster radii are listed, including the cluster number, richness class, distance class, magnitude of the tenth brightest galaxy, redshift estimate, corrected cluster radius in arcmin, right ascension and error, declination and error, total flux density and error, and angular structure for each source.

  2. Classification of medication incidents associated with information technology.

    PubMed

    Cheung, Ka-Chun; van der Veen, Willem; Bouvy, Marcel L; Wensing, Michel; van den Bemt, Patricia M L A; de Smet, Peter A G M

    2014-02-01

    Information technology (IT) plays a pivotal role in improving patient safety, but can also cause new problems for patient safety. This study analyzed the nature and consequences of a large sample of IT-related medication incidents, as reported by healthcare professionals in community pharmacies and hospitals. The medication incidents submitted to the Dutch central medication incidents registration (CMR) reporting system were analyzed from the perspective of the healthcare professional with the Magrabi classification. During classification new terms were added, if necessary. The principal source of the IT-related problem, nature of error. Additional measures: consequences of incidents, IT systems, phases of the medication process. From March 2010 to February 2011 the CMR received 4161 incidents: 1643 (39.5%) from community pharmacies and 2518 (60.5%) from hospitals. Eventually one of six incidents (16.1%, n=668) were related to IT; in community pharmacies more incidents (21.5%, n=351) were related to IT than in hospitals (12.6%, n=317). In community pharmacies 41.0% (n=150) of the incidents were about choosing the wrong medicine. Most of the erroneous exchanges were associated with confusion of medicine names and poor design of screens. In hospitals 55.3% (n=187) of incidents concerned human-machine interaction-related input during the use of computerized prescriber order entry. These use problems were also a major problem in pharmacy information systems outside the hospital. A large sample of incidents shows that many of the incidents are related to IT, both in community pharmacies and hospitals. The interaction between human and machine plays a pivotal role in IT incidents in both settings.

  3. Reaching nearby sources: comparison between real and virtual sound and visual targets

    PubMed Central

    Parseihian, Gaëtan; Jouffrais, Christophe; Katz, Brian F. G.

    2014-01-01

    Sound localization studies over the past century have predominantly been concerned with directional accuracy for far-field sources. Few studies have examined the condition of near-field sources and distance perception. The current study concerns localization and pointing accuracy by examining source positions in the peripersonal space, specifically those associated with a typical tabletop surface. Accuracy is studied with respect to the reporting hand (dominant or secondary) for auditory sources. Results show no effect on the reporting hand with azimuthal errors increasing equally for the most extreme source positions. Distance errors show a consistent compression toward the center of the reporting area. A second evaluation is carried out comparing auditory and visual stimuli to examine any bias in reporting protocol or biomechanical difficulties. No common bias error was observed between auditory and visual stimuli indicating that reporting errors were not due to biomechanical limitations in the pointing task. A final evaluation compares real auditory sources and anechoic condition virtual sources created using binaural rendering. Results showed increased azimuthal errors, with virtual source positions being consistently overestimated to more lateral positions, while no significant distance perception was observed, indicating a deficiency in the binaural rendering condition relative to the real stimuli situation. Various potential reasons for this discrepancy are discussed with several proposals for improving distance perception in peripersonal virtual environments. PMID:25228855

  4. A SVM framework for fault detection of the braking system in a high speed train

    NASA Astrophysics Data System (ADS)

    Liu, Jie; Li, Yan-Fu; Zio, Enrico

    2017-03-01

    In April 2015, the number of operating High Speed Trains (HSTs) in the world has reached 3603. An efficient, effective and very reliable braking system is evidently very critical for trains running at a speed around 300 km/h. Failure of a highly reliable braking system is a rare event and, consequently, informative recorded data on fault conditions are scarce. This renders the fault detection problem a classification problem with highly unbalanced data. In this paper, a Support Vector Machine (SVM) framework, including feature selection, feature vector selection, model construction and decision boundary optimization, is proposed for tackling this problem. Feature vector selection can largely reduce the data size and, thus, the computational burden. The constructed model is a modified version of the least square SVM, in which a higher cost is assigned to the error of classification of faulty conditions than the error of classification of normal conditions. The proposed framework is successfully validated on a number of public unbalanced datasets. Then, it is applied for the fault detection of braking systems in HST: in comparison with several SVM approaches for unbalanced datasets, the proposed framework gives better results.

  5. Semi-supervised anomaly detection - towards model-independent searches of new physics

    NASA Astrophysics Data System (ADS)

    Kuusela, Mikael; Vatanen, Tommi; Malmi, Eric; Raiko, Tapani; Aaltonen, Timo; Nagai, Yoshikazu

    2012-06-01

    Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors should this training data be systematically inaccurate for example due to the assumed MC model. To complement such model-dependent searches, we propose an algorithm based on semi-supervised anomaly detection techniques, which does not require a MC training sample for the signal data. We first model the background using a multivariate Gaussian mixture model. We then search for deviations from this model by fitting to the observations a mixture of the background model and a number of additional Gaussians. This allows us to perform pattern recognition of any anomalous excess over the background. We show by a comparison to neural network classifiers that such an approach is a lot more robust against misspecification of the signal MC than supervised classification. In cases where there is an unexpected signal, a neural network might fail to correctly identify it, while anomaly detection does not suffer from such a limitation. On the other hand, when there are no systematic errors in the training data, both methods perform comparably.

  6. A Q-backpropagated time delay neural network for diagnosing severity of gait disturbances in Parkinson's disease.

    PubMed

    Nancy Jane, Y; Khanna Nehemiah, H; Arputharaj, Kannan

    2016-04-01

    Parkinson's disease (PD) is a movement disorder that affects the patient's nervous system and health-care applications mostly uses wearable sensors to collect these data. Since these sensors generate time stamped data, analyzing gait disturbances in PD becomes challenging task. The objective of this paper is to develop an effective clinical decision-making system (CDMS) that aids the physician in diagnosing the severity of gait disturbances in PD affected patients. This paper presents a Q-backpropagated time delay neural network (Q-BTDNN) classifier that builds a temporal classification model, which performs the task of classification and prediction in CDMS. The proposed Q-learning induced backpropagation (Q-BP) training algorithm trains the Q-BTDNN by generating a reinforced error signal. The network's weights are adjusted through backpropagating the generated error signal. For experimentation, the proposed work uses a PD gait database, which contains gait measures collected through wearable sensors from three different PD research studies. The experimental result proves the efficiency of Q-BP in terms of its improved classification accuracy of 91.49%, 92.19% and 90.91% with three datasets accordingly compared to other neural network training algorithms. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Bayesian logistic regression approaches to predict incorrect DRG assignment.

    PubMed

    Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural

    2018-05-07

    Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification per- formance by 6% compared to maximum likelihood, with a 34% gain compared to random classification, respectively. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already lead to improved audit efficiency in an operational capacity.

  8. International Test Comparisons: Reviewing Translation Error in Different Source Language-Target Language Combinations

    ERIC Educational Resources Information Center

    Zhao, Xueyu; Solano-Flores, Guillermo; Qian, Ming

    2018-01-01

    This article addresses test translation review in international test comparisons. We investigated the applicability of the theory of test translation error--a theory of the multidimensionality and inevitability of test translation error--across source language-target language combinations in the translation of PISA (Programme of International…

  9. Optical linear algebra processors: noise and error-source modeling.

    PubMed

    Casasent, D; Ghosh, A

    1985-06-01

    The modeling of system and component noise and error sources in optical linear algebra processors (OLAP's) are considered, with attention to the frequency-multiplexed OLAP. General expressions are obtained for the output produced as a function of various component errors and noise. A digital simulator for this model is discussed.

  10. Optical linear algebra processors - Noise and error-source modeling

    NASA Technical Reports Server (NTRS)

    Casasent, D.; Ghosh, A.

    1985-01-01

    The modeling of system and component noise and error sources in optical linear algebra processors (OLAPs) are considered, with attention to the frequency-multiplexed OLAP. General expressions are obtained for the output produced as a function of various component errors and noise. A digital simulator for this model is discussed.

  11. Characterization of groups using composite kernels and multi-source fMRI analysis data: application to schizophrenia

    PubMed Central

    Castro, Eduardo; Martínez-Ramón, Manel; Pearlson, Godfrey; Sui, Jing; Calhoun, Vince D.

    2011-01-01

    Pattern classification of brain imaging data can enable the automatic detection of differences in cognitive processes of specific groups of interest. Furthermore, it can also give neuroanatomical information related to the regions of the brain that are most relevant to detect these differences by means of feature selection procedures, which are also well-suited to deal with the high dimensionality of brain imaging data. This work proposes the application of recursive feature elimination using a machine learning algorithm based on composite kernels to the classification of healthy controls and patients with schizophrenia. This framework, which evaluates nonlinear relationships between voxels, analyzes whole-brain fMRI data from an auditory task experiment that is segmented into anatomical regions and recursively eliminates the uninformative ones based on their relevance estimates, thus yielding the set of most discriminative brain areas for group classification. The collected data was processed using two analysis methods: the general linear model (GLM) and independent component analysis (ICA). GLM spatial maps as well as ICA temporal lobe and default mode component maps were then input to the classifier. A mean classification accuracy of up to 95% estimated with a leave-two-out cross-validation procedure was achieved by doing multi-source data classification. In addition, it is shown that the classification accuracy rate obtained by using multi-source data surpasses that reached by using single-source data, hence showing that this algorithm takes advantage of the complimentary nature of GLM and ICA. PMID:21723948

  12. Association of medication errors with drug classifications, clinical units, and consequence of errors: Are they related?

    PubMed

    Muroi, Maki; Shen, Jay J; Angosta, Alona

    2017-02-01

    Registered nurses (RNs) play an important role in safe medication administration and patient safety. This study examined a total of 1276 medication error (ME) incident reports made by RNs in hospital inpatient settings in the southwestern region of the United States. The most common drug class associated with MEs was cardiovascular drugs (24.7%). Among this class, anticoagulants had the most errors (11.3%). The antimicrobials was the second most common drug class associated with errors (19.1%) and vancomycin was the most common antimicrobial that caused errors in this category (6.1%). MEs occurred more frequently in the medical-surgical and intensive care units than any other hospital units. Ten percent of MEs reached the patients with harm and 11% reached the patients with increased monitoring. Understanding the contributing factors related to MEs, addressing and eliminating risk of errors across hospital units, and providing education and resources for nurses may help reduce MEs. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. Classification and spatial mapping of riparian habitat with applications toward management of streams impacted by nonpoint source pollution

    NASA Astrophysics Data System (ADS)

    Delong, Michael D.; Brusven, Merlyn A.

    1991-07-01

    Management of riparian habitats has been recognized for its importance in reducing instream effects of agricultural nonpoint source pollution. By serving as a buffer, well structured riparian habitats can reduce nonpoint source impacts by filtering surface runoff from field to stream. A system has been developed where key characteristics of riparian habitat, vegetation type, height, width, riparian and shoreline bank slope, and land use are classified as discrete categorical units. This classification system recognizes seven riparian vegetation types, which are determined by dominant plant type. Riparian and shoreline bank slope, in addition to riparian width and height, each consist of five categories. Classification by discrete units allows for ready digitizing of information for production of spatial maps using a geographic information system (GIS). The classification system was tested for field efficiency on Tom Beall Creek watershed, an agriculturally impacted third-order stream in the Clearwater River drainage, Nez Perce County, Idaho, USA. The classification system was simple to use during field applications and provided a good inventory of riparian habitat. After successful field tests, spatial maps were produced for each component using the Professional Map Analysis Package (pMAP), a GIS program. With pMAP, a map describing general riparian habitat condition was produced by combining the maps of components of riparian habitat, and the condition map was integrated with a map of soil erosion potential in order to determine areas along the stream that are susceptible to nonpoint source pollution inputs. Integration of spatial maps of riparian classification and watershed characteristics has great potential as a tool for aiding in making management decisions for mitigating off-site impacts of agricultural nonpoint source pollution.

  14. Classification Management. Journal of the National Classification Management Society, Volume 18, 1982,

    DTIC Science & Technology

    1983-01-01

    changes. Concurrently, CIA formed and AD HOC esting to step back and look at the U.S. security Intelligence Community Working Group to re...administrative error; to prevent embarrassment to expected damage will be. If you foresee the dam- a person, organization, or agency; to restrain com- age...the decision will be to classify the informa- petition; or to pTevent or delay the public release of tion. But note that in this thought process, you

  15. Validation of tool mark analysis of cut costal cartilage.

    PubMed

    Love, Jennifer C; Derrick, Sharon M; Wiersema, Jason M; Peters, Charles

    2012-03-01

    This study was designed to establish the potential error rate associated with the generally accepted method of tool mark analysis of cut marks in costal cartilage. Three knives with different blade types were used to make experimental cut marks in costal cartilage of pigs. Each cut surface was cast, and each cast was examined by three analysts working independently. The presence of striations, regularity of striations, and presence of a primary and secondary striation pattern were recorded for each cast. The distance between each striation was measured. The results showed that striations were not consistently impressed on the cut surface by the blade's cutting edge. Also, blade type classification by the presence or absence of striations led to a 65% misclassification rate. Use of the classification tree and cross-validation methods and inclusion of the mean interstriation distance decreased the error rate to c. 50%. © 2011 American Academy of Forensic Sciences.

  16. Automatic and semi-automatic approaches for arteriolar-to-venular computation in retinal photographs

    NASA Astrophysics Data System (ADS)

    Mendonça, Ana Maria; Remeseiro, Beatriz; Dashtbozorg, Behdad; Campilho, Aurélio

    2017-03-01

    The Arteriolar-to-Venular Ratio (AVR) is a popular dimensionless measure which allows the assessment of patients' condition for the early diagnosis of different diseases, including hypertension and diabetic retinopathy. This paper presents two new approaches for AVR computation in retinal photographs which include a sequence of automated processing steps: vessel segmentation, caliber measurement, optic disc segmentation, artery/vein classification, region of interest delineation, and AVR calculation. Both approaches have been tested on the INSPIRE-AVR dataset, and compared with a ground-truth provided by two medical specialists. The obtained results demonstrate the reliability of the fully automatic approach which provides AVR ratios very similar to at least one of the observers. Furthermore, the semi-automatic approach, which includes the manual modification of the artery/vein classification if needed, allows to significantly reduce the error to a level below the human error.

  17. Bayes classification of terrain cover using normalized polarimetric data

    NASA Technical Reports Server (NTRS)

    Yueh, H. A.; Swartz, A. A.; Kong, J. A.; Shin, R. T.; Novak, L. M.

    1988-01-01

    The normalized polarimetric classifier (NPC) which uses only the relative magnitudes and phases of the polarimetric data is proposed for discrimination of terrain elements. The probability density functions (PDFs) of polarimetric data are assumed to have a complex Gaussian distribution, and the marginal PDF of the normalized polarimetric data is derived by adopting the Euclidean norm as the normalization function. The general form of the distance measure for the NPC is also obtained. It is demonstrated that for polarimetric data with an arbitrary PDF, the distance measure of NPC will be independent of the normalization function selected even when the classifier is mistrained. A complex Gaussian distribution is assumed for the polarimetric data consisting of grass and tree regions. The probability of error for the NPC is compared with those of several other single-feature classifiers. The classification error of NPCs is shown to be independent of the normalization function.

  18. Physician Preferences to Communicate Neuropsychological Results: Comparison of Qualitative Descriptors and a Proposal to Reduce Communication Errors.

    PubMed

    Schoenberg, Mike R; Osborn, Katie E; Mahone, E Mark; Feigon, Maia; Roth, Robert M; Pliskin, Neil H

    2017-11-08

    Errors in communication are a leading cause of medical errors. A potential source of error in communicating neuropsychological results is confusion in the qualitative descriptors used to describe standardized neuropsychological data. This study sought to evaluate the extent to which medical consumers of neuropsychological assessments believed that results/findings were not clearly communicated. In addition, preference data for a variety of qualitative descriptors commonly used to communicate normative neuropsychological test scores were obtained. Preference data were obtained for five qualitative descriptor systems as part of a larger 36-item internet-based survey of physician satisfaction with neuropsychological services. A new qualitative descriptor system termed the Simplified Qualitative Classification System (Q-Simple) was proposed to reduce the potential for communication errors using seven terms: very superior, superior, high average, average, low average, borderline, and abnormal/impaired. A non-random convenience sample of 605 clinicians identified from four United States academic medical centers from January 1, 2015 through January 7, 2016 were invited to participate. A total of 182 surveys were completed. A minority of clinicians (12.5%) indicated that neuropsychological study results were not clearly communicated. When communicating neuropsychological standardized scores, the two most preferred qualitative descriptor systems were by Heaton and colleagues (26%) and a newly proposed Q-simple system (22%). Comprehensive norms for an extended Halstead-Reitan battery: Demographic corrections, research findings, and clinical applications. Odessa, TX: Psychological Assessment Resources) (26%) and the newly proposed Q-Simple system (22%). Initial findings highlight the need to improve and standardize communication of neuropsychological results. These data offer initial guidance for preferred terms to communicate test results and form a foundation for more standardized practice among neuropsychologists. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Search for the Identification of 3EG J1835+5918: Evidence for a New Type of High-Energy Gamma-Ray Source

    NASA Technical Reports Server (NTRS)

    Mirabal, N.; Halpern, Jules P.; Eracleous, M.; Becker, R. H.; Oliversen, Ronald (Technical Monitor)

    2001-01-01

    The EGRET source 3EG J1835+5918 is the brightest and most accurately positioned of the as-yet unidentified high-energy gamma-ray sources at high Galactic latitude (l, b = 89 deg, 25 deg). We present a multiwavelength study of the region around it, including X-ray, radio, and optical imaging surveys, as well as optical spectroscopic classification of most of the active objects in this area. Identifications are made of all but one of the ROSAT and ASCA sources in this region to a flux limit of approximately 5 x 10(exp -14) erg/sq cm s, which is 10(exp -4) of the gamma-ray flux. The identified X-ray sources in or near the EGRET error ellipse are radio-quiet QSOs, a galaxy cluster, and coronal emitting stars. We also find eight quasars using purely optical color selection, and we have monitored the entire field for variable optical objects on short and long time scales without any notable discoveries. The radio sources inside the error ellipse are all fainter than 4 mJy at 1.4 GHz. There are no flat-spectrum radio sources in the vicinity; the brightest neighboring radio sources are steep-spectrum radio galaxies or quasars. Since no blazar-like or pulsar-like candidate has been found as a result of these searches, 3EG J1835+5918 must be lacking one or more of the physically essential attributes of these known classes of gamma-ray emitters. If it is an AGN it lacks the beamed emission radio of blazars by at least a factor of 100 relative to identified EGRET blazars. If it is an isolated neutron star, it lacks the steady thermal X-rays from a cooling surface and the magnetospheric non-thermal X-ray emission that is characteristic of all EGRET pulsars. If a pulsar, 3EG J1835+5918 must be either older or more distant than Geminga, and probably an even more efficient or beamed gamma-ray engine. One intermittent ROSA T source falls on a blank optical field to a limit of B greater than 23.4, V greater than 23.3, and R greater than 22.5. In view of this conspicuous absence, RX J1836-2+5925 should be examined further as a candidate for identification with 3EG J1835+5918 and possibly the prototype of a new class of high-energy gamma-ray source.

  20. ROSAT X-Ray Observation of the Second Error Box for SGR 1900+14

    NASA Technical Reports Server (NTRS)

    Li, P.; Hurley, K.; Vrba, F.; Kouveliotou, C.; Meegan, C. A.; Fishman, G. J.; Kulkarni, S.; Frail, D.

    1997-01-01

    The positions of the two error boxes for the soft gamma repeater (SGR) 1900+14 were determined by the "network synthesis" method, which employs observations by the Ulysses gamma-ray burst and CGRO BATSE instruments. The location of the first error box has been observed at optical, infrared, and X-ray wavelengths, resulting in the discovery of a ROSAT X-ray point source and a curious double infrared source. We have recently used the ROSAT HRI to observe the second error box to complete the counterpart search. A total of six X-ray sources were identified within the field of view. None of them falls within the network synthesis error box, and a 3 sigma upper limit to any X-ray counterpart was estimated to be 6.35 x 10(exp -14) ergs/sq cm/s. The closest source is approximately 3 min. away, and has an estimated unabsorbed flux of 1.5 x 10(exp -12) ergs/sq cm/s. Unlike the first error box, there is no supernova remnant near the second error box. The closest one, G43.9+1.6, lies approximately 2.dg6 away. For these reasons, we believe that the first error box is more likely to be the correct one.

  1. Sources of variability and systematic error in mouse timing behavior.

    PubMed

    Gallistel, C R; King, Adam; McDonald, Robert

    2004-01-01

    In the peak procedure, starts and stops in responding bracket the target time at which food is expected. The variability in start and stop times is proportional to the target time (scalar variability), as is the systematic error in the mean center (scalar error). The authors investigated the source of the error and the variability, using head poking in the mouse, with target intervals of 5 s, 15 s, and 45 s, in the standard procedure, and in a variant with 3 different target intervals at 3 different locations in a single trial. The authors conclude that the systematic error is due to the asymmetric location of start and stop decision criteria, and the scalar variability derives primarily from sources other than memory.

  2. 40 CFR 432.1 - General Applicability.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... STANDARDS MEAT AND POULTRY PRODUCTS POINT SOURCE CATEGORY § 432.1 General Applicability. As defined more... the following industrial classification codes: Standard industrial classification 1 North Americanindustrial classification system 2 SIC 0751 NAICS 311611. SIC 2011 NAICS 311612. SIC 2013 NAICS 311615. SIC...

  3. The intercrater plains of Mercury and the Moon: Their nature, origin and role in terrestrial planet evolution. Measurement and errors of crater statistics. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Leake, M. A.

    1982-01-01

    Planetary imagery techniques, errors in measurement or degradation assignment, and statistical formulas are presented with respect to cratering data. Base map photograph preparation, measurement of crater diameters and sampled area, and instruments used are discussed. Possible uncertainties, such as Sun angle, scale factors, degradation classification, and biases in crater recognition are discussed. The mathematical formulas used in crater statistics are presented.

  4. Magnetic Nanoparticle Thermometer: An Investigation of Minimum Error Transmission Path and AC Bias Error

    PubMed Central

    Du, Zhongzhou; Su, Rijian; Liu, Wenzhong; Huang, Zhixing

    2015-01-01

    The signal transmission module of a magnetic nanoparticle thermometer (MNPT) was established in this study to analyze the error sources introduced during the signal flow in the hardware system. The underlying error sources that significantly affected the precision of the MNPT were determined through mathematical modeling and simulation. A transfer module path with the minimum error in the hardware system was then proposed through the analysis of the variations of the system error caused by the significant error sources when the signal flew through the signal transmission module. In addition, a system parameter, named the signal-to-AC bias ratio (i.e., the ratio between the signal and AC bias), was identified as a direct determinant of the precision of the measured temperature. The temperature error was below 0.1 K when the signal-to-AC bias ratio was higher than 80 dB, and other system errors were not considered. The temperature error was below 0.1 K in the experiments with a commercial magnetic fluid (Sample SOR-10, Ocean Nanotechnology, Springdale, AR, USA) when the hardware system of the MNPT was designed with the aforementioned method. PMID:25875188

  5. Radiomic machine-learning classifiers for prognostic biomarkers of advanced nasopharyngeal carcinoma.

    PubMed

    Zhang, Bin; He, Xin; Ouyang, Fusheng; Gu, Dongsheng; Dong, Yuhao; Zhang, Lu; Mo, Xiaokai; Huang, Wenhui; Tian, Jie; Zhang, Shuixing

    2017-09-10

    We aimed to identify optimal machine-learning methods for radiomics-based prediction of local failure and distant failure in advanced nasopharyngeal carcinoma (NPC). We enrolled 110 patients with advanced NPC. A total of 970 radiomic features were extracted from MRI images for each patient. Six feature selection methods and nine classification methods were evaluated in terms of their performance. We applied the 10-fold cross-validation as the criterion for feature selection and classification. We repeated each combination for 50 times to obtain the mean area under the curve (AUC) and test error. We observed that the combination methods Random Forest (RF) + RF (AUC, 0.8464 ± 0.0069; test error, 0.3135 ± 0.0088) had the highest prognostic performance, followed by RF + Adaptive Boosting (AdaBoost) (AUC, 0.8204 ± 0.0095; test error, 0.3384 ± 0.0097), and Sure Independence Screening (SIS) + Linear Support Vector Machines (LSVM) (AUC, 0.7883 ± 0.0096; test error, 0.3985 ± 0.0100). Our radiomics study identified optimal machine-learning methods for the radiomics-based prediction of local failure and distant failure in advanced NPC, which could enhance the applications of radiomics in precision oncology and clinical practice. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Light curves of 213 Type Ia supernovae from the Essence survey

    DOE PAGES

    Narayan, G.; Rest, A.; Tucker, B. E.; ...

    2016-05-06

    The ESSENCE survey discovered 213 Type Ia supernovae at redshiftsmore » $$0.1\\lt z\\lt 0.81$$ between 2002 and 2008. We present their R- and I-band photometry, measured from images obtained using the MOSAIC II camera at the CTIO Blanco, along with rapid-response spectroscopy for each object. We use our spectroscopic follow-up observations to determine an accurate, quantitative classification, and precise redshift. Through an extensive calibration program we have improved the precision of the CTIO Blanco natural photometric system. We use several empirical metrics to measure our internal photometric consistency and our absolute calibration of the survey. Here, we assess the effect of various potential sources of systematic bias on our measured fluxes, and estimate the dominant term in the systematic error budget from the photometric calibration on our absolute fluxes is ~1%.« less

  7. Light curves of 213 Type Ia supernovae from the Essence survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Narayan, G.; Rest, A.; Tucker, B. E.

    The ESSENCE survey discovered 213 Type Ia supernovae at redshiftsmore » $$0.1\\lt z\\lt 0.81$$ between 2002 and 2008. We present their R- and I-band photometry, measured from images obtained using the MOSAIC II camera at the CTIO Blanco, along with rapid-response spectroscopy for each object. We use our spectroscopic follow-up observations to determine an accurate, quantitative classification, and precise redshift. Through an extensive calibration program we have improved the precision of the CTIO Blanco natural photometric system. We use several empirical metrics to measure our internal photometric consistency and our absolute calibration of the survey. Here, we assess the effect of various potential sources of systematic bias on our measured fluxes, and estimate the dominant term in the systematic error budget from the photometric calibration on our absolute fluxes is ~1%.« less

  8. Multi-source remotely sensed data fusion for improving land cover classification

    NASA Astrophysics Data System (ADS)

    Chen, Bin; Huang, Bo; Xu, Bing

    2017-02-01

    Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection (ASTER) digital elevation model (DEM) data, showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.

  9. Automated structural classification of lipids by machine learning.

    PubMed

    Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T

    2015-03-01

    Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. Altimeter error sources at the 10-cm performance level

    NASA Technical Reports Server (NTRS)

    Martin, C. F.

    1977-01-01

    Error sources affecting the calibration and operational use of a 10 cm altimeter are examined to determine the magnitudes of current errors and the investigations necessary to reduce them to acceptable bounds. Errors considered include those affecting operational data pre-processing, and those affecting altitude bias determination, with error budgets developed for both. The most significant error sources affecting pre-processing are bias calibration, propagation corrections for the ionosphere, and measurement noise. No ionospheric models are currently validated at the required 10-25% accuracy level. The optimum smoothing to reduce the effects of measurement noise is investigated and found to be on the order of one second, based on the TASC model of geoid undulations. The 10 cm calibrations are found to be feasible only through the use of altimeter passes that are very high elevation for a tracking station which tracks very close to the time of altimeter track, such as a high elevation pass across the island of Bermuda. By far the largest error source, based on the current state-of-the-art, is the location of the island tracking station relative to mean sea level in the surrounding ocean areas.

  11. High Dimensional Classification Using Features Annealed Independence Rules.

    PubMed

    Fan, Jianqing; Fan, Yingying

    2008-01-01

    Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classifications is largely poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as the random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as bad as the random guessing. Thus, it is paramountly important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics are proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.

  12. A label distance maximum-based classifier for multi-label learning.

    PubMed

    Liu, Xiaoli; Bao, Hang; Zhao, Dazhe; Cao, Peng

    2015-01-01

    Multi-label classification is useful in many bioinformatics tasks such as gene function prediction and protein site localization. This paper presents an improved neural network algorithm, Max Label Distance Back Propagation Algorithm for Multi-Label Classification. The method was formulated by modifying the total error function of the standard BP by adding a penalty term, which was realized by maximizing the distance between the positive and negative labels. Extensive experiments were conducted to compare this method against state-of-the-art multi-label methods on three popular bioinformatic benchmark datasets. The results illustrated that this proposed method is more effective for bioinformatic multi-label classification compared to commonly used techniques.

  13. Lacie phase 1 Classification and Mensuration Subsystem (CAMS) rework experiment

    NASA Technical Reports Server (NTRS)

    Chhikara, R. S.; Hsu, E. M.; Liszcz, C. J.

    1976-01-01

    An experiment was designed to test the ability of the Classification and Mensuration Subsystem rework operations to improve wheat proportion estimates for segments that had been processed previously. Sites selected for the experiment included three in Kansas and three in Texas, with the remaining five distributed in Montana and North and South Dakota. The acquisition dates were selected to be representative of imagery available in actual operations. No more than one acquisition per biophase were used, and biophases were determined by actual crop calendars. All sites were worked by each of four Analyst-Interpreter/Data Processing Analyst Teams who reviewed the initial processing of each segment and accepted or reworked it for an estimate of the proportion of small grains in the segment. Classification results, acquisitions and classification errors and performance results between CAMS regular and ITS rework are tabulated.

  14. Minimization of model representativity errors in identification of point source emission from atmospheric concentration measurements

    NASA Astrophysics Data System (ADS)

    Sharan, Maithili; Singh, Amit Kumar; Singh, Sarvesh Kumar

    2017-11-01

    Estimation of an unknown atmospheric release from a finite set of concentration measurements is considered an ill-posed inverse problem. Besides ill-posedness, the estimation process is influenced by the instrumental errors in the measured concentrations and model representativity errors. The study highlights the effect of minimizing model representativity errors on the source estimation. This is described in an adjoint modelling framework and followed in three steps. First, an estimation of point source parameters (location and intensity) is carried out using an inversion technique. Second, a linear regression relationship is established between the measured concentrations and corresponding predicted using the retrieved source parameters. Third, this relationship is utilized to modify the adjoint functions. Further, source estimation is carried out using these modified adjoint functions to analyse the effect of such modifications. The process is tested for two well known inversion techniques, called renormalization and least-square. The proposed methodology and inversion techniques are evaluated for a real scenario by using concentrations measurements from the Idaho diffusion experiment in low wind stable conditions. With both the inversion techniques, a significant improvement is observed in the retrieval of source estimation after minimizing the representativity errors.

  15. Sample Errors Call Into Question Conclusions Regarding Same-Sex Married Parents: A Comment on "Family Structure and Child Health: Does the Sex Composition of Parents Matter?"

    PubMed

    Paul Sullins, D

    2017-12-01

    Because of classification errors reported by the National Center for Health Statistics, an estimated 42 % of the same-sex married partners in the sample for this study are misclassified different-sex married partners, thus calling into question findings regarding same-sex married parents. Including biological parentage as a control variable suppresses same-sex/different-sex differences, thus obscuring the data error. Parentage is not appropriate as a control because it correlates nearly perfectly (+.97, gamma) with the same-sex/different-sex distinction and is invariant for the category of joint biological parents.

  16. Image Augmentation for Object Image Classification Based On Combination of Pre-Trained CNN and SVM

    NASA Astrophysics Data System (ADS)

    Shima, Yoshihiro

    2018-04-01

    Neural networks are a powerful means of classifying object images. The proposed image category classification method for object images combines convolutional neural networks (CNNs) and support vector machines (SVMs). A pre-trained CNN, called Alex-Net, is used as a pattern-feature extractor. Alex-Net is pre-trained for the large-scale object-image dataset ImageNet. Instead of training, Alex-Net, pre-trained for ImageNet is used. An SVM is used as trainable classifier. The feature vectors are passed to the SVM from Alex-Net. The STL-10 dataset are used as object images. The number of classes is ten. Training and test samples are clearly split. STL-10 object images are trained by the SVM with data augmentation. We use the pattern transformation method with the cosine function. We also apply some augmentation method such as rotation, skewing and elastic distortion. By using the cosine function, the original patterns were left-justified, right-justified, top-justified, or bottom-justified. Patterns were also center-justified and enlarged. Test error rate is decreased by 0.435 percentage points from 16.055% by augmentation with cosine transformation. Error rates are increased by other augmentation method such as rotation, skewing and elastic distortion, compared without augmentation. Number of augmented data is 30 times that of the original STL-10 5K training samples. Experimental test error rate for the test 8k STL-10 object images was 15.620%, which shows that image augmentation is effective for image category classification.

  17. Automated detection of cloud and cloud-shadow in single-date Landsat imagery using neural networks and spatial post-processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hughes, Michael J.; Hayes, Daniel J

    2014-01-01

    Use of Landsat data to answer ecological questions is contingent on the effective removal of cloud and cloud shadow from satellite images. We develop a novel algorithm to identify and classify clouds and cloud shadow, \\textsc{sparcs}: Spacial Procedures for Automated Removal of Cloud and Shadow. The method uses neural networks to determine cloud, cloud-shadow, water, snow/ice, and clear-sky membership of each pixel in a Landsat scene, and then applies a set of procedures to enforce spatial rules. In a comparison to FMask, a high-quality cloud and cloud-shadow classification algorithm currently available, \\textsc{sparcs} performs favorably, with similar omission errors for cloudsmore » (0.8% and 0.9%, respectively), substantially lower omission error for cloud-shadow (8.3% and 1.1%), and fewer errors of commission (7.8% and 5.0%). Additionally, textsc{sparcs} provides a measure of uncertainty in its classification that can be exploited by other processes that use the cloud and cloud-shadow detection. To illustrate this, we present an application that constructs obstruction-free composites of images acquired on different dates in support of algorithms detecting vegetation change.« less

  18. Discrimination of Aspergillus isolates at the species and strain level by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry fingerprinting.

    PubMed

    Hettick, Justin M; Green, Brett J; Buskirk, Amanda D; Kashon, Michael L; Slaven, James E; Janotka, Erika; Blachere, Francoise M; Schmechel, Detlef; Beezhold, Donald H

    2008-09-15

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) was used to generate highly reproducible mass spectral fingerprints for 12 species of fungi of the genus Aspergillus and 5 different strains of Aspergillus flavus. Prior to MALDI-TOF MS analysis, the fungi were subjected to three 1-min bead beating cycles in an acetonitrile/trifluoroacetic acid solvent. The mass spectra contain abundant peaks in the range of 5 to 20kDa and may be used to discriminate between species unambiguously. A discriminant analysis using all peaks from the MALDI-TOF MS data yielded error rates for classification of 0 and 18.75% for resubstitution and cross-validation methods, respectively. If a subset of 28 significant peaks is chosen, resubstitution and cross-validation error rates are 0%. Discriminant analysis of the MALDI-TOF MS data for 5 strains of A. flavus using all peaks yielded error rates for classification of 0 and 5% for resubstitution and cross-validation methods, respectively. These data indicate that MALDI-TOF MS data may be used for unambiguous identification of members of the genus Aspergillus at both the species and strain levels.

  19. Syndrome source coding and its universal generalization

    NASA Technical Reports Server (NTRS)

    Ancheta, T. C., Jr.

    1975-01-01

    A method of using error-correcting codes to obtain data compression, called syndrome-source-coding, is described in which the source sequence is treated as an error pattern whose syndrome forms the compressed data. It is shown that syndrome-source-coding can achieve arbitrarily small distortion with the number of compressed digits per source digit arbitrarily close to the entropy of a binary memoryless source. A universal generalization of syndrome-source-coding is formulated which provides robustly-effective, distortionless, coding of source ensembles.

  20. Identifying medication error chains from critical incident reports: a new analytic approach.

    PubMed

    Huckels-Baumgart, Saskia; Manser, Tanja

    2014-10-01

    Research into the distribution of medication errors usually focuses on isolated stages within the medication use process. Our study aimed to provide a novel process-oriented approach to medication incident analysis focusing on medication error chains. Our study was conducted across a 900-bed teaching hospital in Switzerland. All reported 1,591 medication errors 2009-2012 were categorized using the Medication Error Index NCC MERP and the WHO Classification for Patient Safety Methodology. In order to identify medication error chains, each reported medication incident was allocated to the relevant stage of the hospital medication use process. Only 25.8% of the reported medication errors were detected before they propagated through the medication use process. The majority of medication errors (74.2%) formed an error chain encompassing two or more stages. The most frequent error chain comprised preparation up to and including medication administration (45.2%). "Non-consideration of documentation/prescribing" during the drug preparation was the most frequent contributor for "wrong dose" during the administration of medication. Medication error chains provide important insights for detecting and stopping medication errors before they reach the patient. Existing and new safety barriers need to be extended to interrupt error chains and to improve patient safety. © 2014, The American College of Clinical Pharmacology.

  1. Parameter optimization in biased decoy-state quantum key distribution with both source errors and statistical fluctuations

    NASA Astrophysics Data System (ADS)

    Zhu, Jian-Rong; Li, Jian; Zhang, Chun-Mei; Wang, Qin

    2017-10-01

    The decoy-state method has been widely used in commercial quantum key distribution (QKD) systems. In view of the practical decoy-state QKD with both source errors and statistical fluctuations, we propose a universal model of full parameter optimization in biased decoy-state QKD with phase-randomized sources. Besides, we adopt this model to carry out simulations of two widely used sources: weak coherent source (WCS) and heralded single-photon source (HSPS). Results show that full parameter optimization can significantly improve not only the secure transmission distance but also the final key generation rate. And when taking source errors and statistical fluctuations into account, the performance of decoy-state QKD using HSPS suffered less than that of decoy-state QKD using WCS.

  2. A model for the statistical description of analytical errors occurring in clinical chemical laboratories with time.

    PubMed

    Hyvärinen, A

    1985-01-01

    The main purpose of the present study was to describe the statistical behaviour of daily analytical errors in the dimensions of place and time, providing a statistical basis for realistic estimates of the analytical error, and hence allowing the importance of the error and the relative contributions of its different sources to be re-evaluated. The observation material consists of creatinine and glucose results for control sera measured in daily routine quality control in five laboratories for a period of one year. The observation data were processed and computed by means of an automated data processing system. Graphic representations of time series of daily observations, as well as their means and dispersion limits when grouped over various time intervals, were investigated. For partition of the total variation several two-way analyses of variance were done with laboratory and various time classifications as factors. Pooled sets of observations were tested for normality of distribution and for consistency of variances, and the distribution characteristics of error variation in different categories of place and time were compared. Errors were found from the time series to vary typically between days. Due to irregular fluctuations in general and particular seasonal effects in creatinine, stable estimates of means or of dispersions for errors in individual laboratories could not be easily obtained over short periods of time but only from data sets pooled over long intervals (preferably at least one year). Pooled estimates of proportions of intralaboratory variation were relatively low (less than 33%) when the variation was pooled within days. However, when the variation was pooled over longer intervals this proportion increased considerably, even to a maximum of 89-98% (95-98% in each method category) when an outlying laboratory in glucose was omitted, with a concomitant decrease in the interaction component (representing laboratory-dependent variation with time). This indicates that a substantial part of the variation comes from intralaboratory variation with time rather than from constant interlaboratory differences. Normality and consistency of statistical distributions were best achieved in the long-term intralaboratory sets of the data, under which conditions the statistical estimates of error variability were also most characteristic of the individual laboratories rather than necessarily being similar to one another. Mixing of data from different laboratories may give heterogeneous and nonparametric distributions and hence is not advisable.(ABSTRACT TRUNCATED AT 400 WORDS)

  3. Methods for data classification

    DOEpatents

    Garrity, George [Okemos, MI; Lilburn, Timothy G [Front Royal, VA

    2011-10-11

    The present invention provides methods for classifying data and uncovering and correcting annotation errors. In particular, the present invention provides a self-organizing, self-correcting algorithm for use in classifying data. Additionally, the present invention provides a method for classifying biological taxa.

  4. Decoding of Ankle Flexion and Extension from Cortical Current Sources Estimated from Non-invasive Brain Activity Recording Methods.

    PubMed

    Mejia Tobar, Alejandra; Hyoudou, Rikiya; Kita, Kahori; Nakamura, Tatsuhiro; Kambara, Hiroyuki; Ogata, Yousuke; Hanakawa, Takashi; Koike, Yasuharu; Yoshimura, Natsue

    2017-01-01

    The classification of ankle movements from non-invasive brain recordings can be applied to a brain-computer interface (BCI) to control exoskeletons, prosthesis, and functional electrical stimulators for the benefit of patients with walking impairments. In this research, ankle flexion and extension tasks at two force levels in both legs, were classified from cortical current sources estimated by a hierarchical variational Bayesian method, using electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) recordings. The hierarchical prior for the current source estimation from EEG was obtained from activated brain areas and their intensities from an fMRI group (second-level) analysis. The fMRI group analysis was performed on regions of interest defined over the primary motor cortex, the supplementary motor area, and the somatosensory area, which are well-known to contribute to movement control. A sparse logistic regression method was applied for a nine-class classification (eight active tasks and a resting control task) obtaining a mean accuracy of 65.64% for time series of current sources, estimated from the EEG and the fMRI signals using a variational Bayesian method, and a mean accuracy of 22.19% for the classification of the pre-processed of EEG sensor signals, with a chance level of 11.11%. The higher classification accuracy of current sources, when compared to EEG classification accuracy, was attributed to the high number of sources and the different signal patterns obtained in the same vertex for different motor tasks. Since the inverse filter estimation for current sources can be done offline with the present method, the present method is applicable to real-time BCIs. Finally, due to the highly enhanced spatial distribution of current sources over the brain cortex, this method has the potential to identify activation patterns to design BCIs for the control of an affected limb in patients with stroke, or BCIs from motor imagery in patients with spinal cord injury.

  5. Impact of numerical choices on water conservation in the E3SM Atmosphere Model Version 1 (EAM V1)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Kai; Rasch, Philip J.; Taylor, Mark A.

    The conservation of total water is an important numerical feature for global Earth system models. Even small conservation problems in the water budget can lead to systematic errors in century-long simulations for sea level rise projection. This study quantifies and reduces various sources of water conservation error in the atmosphere component of the Energy Exascale Earth System Model. Several sources of water conservation error have been identified during the development of the version 1 (V1) model. The largest errors result from the numerical coupling between the resolved dynamics and the parameterized sub-grid physics. A hybrid coupling using different methods formore » fluid dynamics and tracer transport provides a reduction of water conservation error by a factor of 50 at 1° horizontal resolution as well as consistent improvements at other resolutions. The second largest error source is the use of an overly simplified relationship between the surface moisture flux and latent heat flux at the interface between the host model and the turbulence parameterization. This error can be prevented by applying the same (correct) relationship throughout the entire model. Two additional types of conservation error that result from correcting the surface moisture flux and clipping negative water concentrations can be avoided by using mass-conserving fixers. With all four error sources addressed, the water conservation error in the V1 model is negligible and insensitive to the horizontal resolution. The associated changes in the long-term statistics of the main atmospheric features are small. A sensitivity analysis is carried out to show that the magnitudes of the conservation errors decrease strongly with temporal resolution but increase with horizontal resolution. The increased vertical resolution in the new model results in a very thin model layer at the Earth’s surface, which amplifies the conservation error associated with the surface moisture flux correction. We note that for some of the identified error sources, the proposed fixers are remedies rather than solutions to the problems at their roots. Future improvements in time integration would be beneficial for this model.« less

  6. Impact of numerical choices on water conservation in the E3SM Atmosphere Model version 1 (EAMv1)

    NASA Astrophysics Data System (ADS)

    Zhang, Kai; Rasch, Philip J.; Taylor, Mark A.; Wan, Hui; Leung, Ruby; Ma, Po-Lun; Golaz, Jean-Christophe; Wolfe, Jon; Lin, Wuyin; Singh, Balwinder; Burrows, Susannah; Yoon, Jin-Ho; Wang, Hailong; Qian, Yun; Tang, Qi; Caldwell, Peter; Xie, Shaocheng

    2018-06-01

    The conservation of total water is an important numerical feature for global Earth system models. Even small conservation problems in the water budget can lead to systematic errors in century-long simulations. This study quantifies and reduces various sources of water conservation error in the atmosphere component of the Energy Exascale Earth System Model. Several sources of water conservation error have been identified during the development of the version 1 (V1) model. The largest errors result from the numerical coupling between the resolved dynamics and the parameterized sub-grid physics. A hybrid coupling using different methods for fluid dynamics and tracer transport provides a reduction of water conservation error by a factor of 50 at 1° horizontal resolution as well as consistent improvements at other resolutions. The second largest error source is the use of an overly simplified relationship between the surface moisture flux and latent heat flux at the interface between the host model and the turbulence parameterization. This error can be prevented by applying the same (correct) relationship throughout the entire model. Two additional types of conservation error that result from correcting the surface moisture flux and clipping negative water concentrations can be avoided by using mass-conserving fixers. With all four error sources addressed, the water conservation error in the V1 model becomes negligible and insensitive to the horizontal resolution. The associated changes in the long-term statistics of the main atmospheric features are small. A sensitivity analysis is carried out to show that the magnitudes of the conservation errors in early V1 versions decrease strongly with temporal resolution but increase with horizontal resolution. The increased vertical resolution in V1 results in a very thin model layer at the Earth's surface, which amplifies the conservation error associated with the surface moisture flux correction. We note that for some of the identified error sources, the proposed fixers are remedies rather than solutions to the problems at their roots. Future improvements in time integration would be beneficial for V1.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yan, H; Chen, Z; Nath, R

    Purpose: kV fluoroscopic imaging combined with MV treatment beam imaging has been investigated for intrafractional motion monitoring and correction. It is, however, subject to additional kV imaging dose to normal tissue. To balance tracking accuracy and imaging dose, we previously proposed an adaptive imaging strategy to dynamically decide future imaging type and moments based on motion tracking uncertainty. kV imaging may be used continuously for maximal accuracy or only when the position uncertainty (probability of out of threshold) is high if a preset imaging dose limit is considered. In this work, we propose more accurate methods to estimate tracking uncertaintymore » through analyzing acquired data in real-time. Methods: We simulated motion tracking process based on a previously developed imaging framework (MV + initial seconds of kV imaging) using real-time breathing data from 42 patients. Motion tracking errors for each time point were collected together with the time point’s corresponding features, such as tumor motion speed and 2D tracking error of previous time points, etc. We tested three methods for error uncertainty estimation based on the features: conditional probability distribution, logistic regression modeling, and support vector machine (SVM) classification to detect errors exceeding a threshold. Results: For conditional probability distribution, polynomial regressions on three features (previous tracking error, prediction quality, and cosine of the angle between the trajectory and the treatment beam) showed strong correlation with the variation (uncertainty) of the mean 3D tracking error and its standard deviation: R-square = 0.94 and 0.90, respectively. The logistic regression and SVM classification successfully identified about 95% of tracking errors exceeding 2.5mm threshold. Conclusion: The proposed methods can reliably estimate the motion tracking uncertainty in real-time, which can be used to guide adaptive additional imaging to confirm the tumor is within the margin or initialize motion compensation if it is out of the margin.« less

  8. Detection of long duration cloud contamination in hyper-temporal NDVI imagery

    NASA Astrophysics Data System (ADS)

    Ali, A.; de Bie, C. A. J. M.; Skidmore, A. K.; Scarrott, R. G.

    2012-04-01

    NDVI time series imagery are commonly used as a reliable source for land use and land cover mapping and monitoring. However long duration cloud can significantly influence its precision in areas where persistent clouds prevails. Therefore quantifying errors related to cloud contamination are essential for accurate land cover mapping and monitoring. This study aims to detect long duration cloud contamination in hyper-temporal NDVI imagery based land cover mapping and monitoring. MODIS-Terra NDVI imagery (250 m; 16-day; Feb'03-Dec'09) were used after necessary pre-processing using quality flags and upper envelope filter (ASAVOGOL). Subsequently stacked MODIS-Terra NDVI image (161 layers) was classified for 10 to 100 clusters using ISODATA. After classifications, 97 clusters image was selected as best classified with the help of divergence statistics. To detect long duration cloud contamination, mean NDVI class profiles of 97 clusters image was analyzed for temporal artifacts. Results showed that long duration clouds affect the normal temporal progression of NDVI and caused anomalies. Out of total 97 clusters, 32 clusters were found with cloud contamination. Cloud contamination was found more prominent in areas where high rainfall occurs. This study can help to stop error propagation in regional land cover mapping and monitoring, caused by long duration cloud contamination.

  9. Implications of Spatial Data Variations for Protected Areas Management: An Example from East Africa

    NASA Astrophysics Data System (ADS)

    Dowhaniuk, Nicholas; Hartter, Joel; Ryan, Sadie J.

    2014-09-01

    Geographic information systems and remote sensing technologies have become an important tool for visualizing conservation management and developing solutions to problems associated with conservation. When multiple organizations separately develop spatial data representations of protected areas, implicit error arises due to variation between data sets. We used boundary data produced by three conservation organizations (International Union for the Conservation of Nature, World Resource Institute, and Uganda Wildlife Authority), for seven Ugandan parks, to study variation in the size represented and the location of boundaries. We found variation in the extent of overlapping total area encompassed by the three data sources, ranging from miniscule (0.4 %) differences to quite large ones (9.0 %). To underscore how protected area boundary discrepancies may have implications to protected area management, we used a landcover classification, defining crop, shrub, forest, savanna, and grassland. The total area in the different landcover classes varied most in smaller protected areas (those less than 329 km2), with forest and cropland area estimates varying up to 65 %. The discrepancies introduced by boundary errors could, in this hypothetical case, generate erroneous findings and could have a significant impact on conservation, such as local-scale management for encroachment and larger-scale assessments of deforestation.

  10. Community assessment of tropical tree biomass: challenges and opportunities for REDD.

    PubMed

    Theilade, Ida; Rutishauser, Ervan; Poulsen, Michael K

    2015-12-01

    REDD+ programs rely on accurate forest carbon monitoring. Several REDD+ projects have recently shown that local communities can monitor above ground biomass as well as external professionals, but at lower costs. However, the precision and accuracy of carbon monitoring conducted by local communities have rarely been assessed in the tropics. The aim of this study was to investigate different sources of error in tree biomass measurements conducted by community monitors and determine the effect on biomass estimates. Furthermore, we explored the potential of local ecological knowledge to assess wood density and botanical identification of trees. Community monitors were able to measure tree DBH accurately, but some large errors were found in girth measurements of large and odd-shaped trees. Monitors with experience from the logging industry performed better than monitors without previous experience. Indeed, only experienced monitors were able to discriminate trees with low wood densities. Local ecological knowledge did not allow consistent tree identification across monitors. Future REDD+ programmes may benefit from the systematic training of local monitors in tree DBH measurement, with special attention given to large and odd-shaped trees. A better understanding of traditional classification systems and concepts is required for local tree identifications and wood density estimates to become useful in monitoring of biomass and tree diversity.

  11. Implications of spatial data variations for protected areas management: an example from East Africa.

    PubMed

    Dowhaniuk, Nicholas; Hartter, Joel; Ryan, Sadie J

    2014-09-01

    Geographic information systems and remote sensing technologies have become an important tool for visualizing conservation management and developing solutions to problems associated with conservation. When multiple organizations separately develop spatial data representations of protected areas, implicit error arises due to variation between data sets. We used boundary data produced by three conservation organizations (International Union for the Conservation of Nature, World Resource Institute, and Uganda Wildlife Authority), for seven Ugandan parks, to study variation in the size represented and the location of boundaries. We found variation in the extent of overlapping total area encompassed by the three data sources, ranging from miniscule (0.4 %) differences to quite large ones (9.0 %). To underscore how protected area boundary discrepancies may have implications to protected area management, we used a landcover classification, defining crop, shrub, forest, savanna, and grassland. The total area in the different landcover classes varied most in smaller protected areas (those less than 329 km(2)), with forest and cropland area estimates varying up to 65 %. The discrepancies introduced by boundary errors could, in this hypothetical case, generate erroneous findings and could have a significant impact on conservation, such as local-scale management for encroachment and larger-scale assessments of deforestation.

  12. Elastic scattering spectroscopy for detection of cancer risk in Barrett's esophagus: experimental and clinical validation of error removal by orthogonal subtraction for increasing accuracy

    NASA Astrophysics Data System (ADS)

    Zhu, Ying; Fearn, Tom; MacKenzie, Gary; Clark, Ben; Dunn, Jason M.; Bigio, Irving J.; Bown, Stephen G.; Lovat, Laurence B.

    2009-07-01

    Elastic scattering spectroscopy (ESS) may be used to detect high-grade dysplasia (HGD) or cancer in Barrett's esophagus (BE). When spectra are measured in vivo by a hand-held optical probe, variability among replicated spectra from the same site can hinder the development of a diagnostic model for cancer risk. An experiment was carried out on excised tissue to investigate how two potential sources of this variability, pressure and angle, influence spectral variability, and the results were compared with the variations observed in spectra collected in vivo from patients with Barrett's esophagus. A statistical method called error removal by orthogonal subtraction (EROS) was applied to model and remove this measurement variability, which accounted for 96.6% of the variation in the spectra, from the in vivo data. Its removal allowed the construction of a diagnostic model with specificity improved from 67% to 82% (with sensitivity fixed at 90%). The improvement was maintained in predictions on an independent in vivo data set. EROS works well as an effective pretreatment for Barrett's in vivo data by identifying measurement variability and ameliorating its effect. The procedure reduces the complexity and increases the accuracy and interpretability of the model for classification and detection of cancer risk in Barrett's esophagus.

  13. Application of the AMBUR R package for spatio-temporal analysis of shoreline change: Jekyll Island, Georgia, USA

    NASA Astrophysics Data System (ADS)

    Jackson, Chester W.; Alexander, Clark R.; Bush, David M.

    2012-04-01

    The AMBUR (Analyzing Moving Boundaries Using R) package for the R software environment provides a collection of functions for assisting with analyzing and visualizing historical shoreline change. The package allows import and export of geospatial data in ESRI shapefile format, which is compatible with most commercial and open-source GIS software. The "baseline and transect" method is the primary technique used to quantify distances and rates of shoreline movement, and to detect classification changes across time. Along with the traditional "perpendicular" transect method, two new transect methods, "near" and "filtered," assist with quantifying changes along curved shorelines that are problematic for perpendicular transect methods. Output from the analyses includes data tables, graphics, and geospatial data, which are useful in rapidly assessing trends and potential errors in the dataset. A forecasting function also allows the user to estimate the future location of the shoreline and store the results in a shapefile. Other utilities and tools provided in the package assist with preparing and manipulating geospatial data, error checking, and generating supporting graphics and shapefiles. The package can be customized to perform additional statistical, graphical, and geospatial functions, and, it is capable of analyzing the movement of any boundary (e.g., shorelines, glacier terminus, fire edge, and marine and terrestrial ecozones).

  14. Analysis of feature selection with Probabilistic Neural Network (PNN) to classify sources influencing indoor air quality

    NASA Astrophysics Data System (ADS)

    Saad, S. M.; Shakaff, A. Y. M.; Saad, A. R. M.; Yusof, A. M.; Andrew, A. M.; Zakaria, A.; Adom, A. H.

    2017-03-01

    There are various sources influencing indoor air quality (IAQ) which could emit dangerous gases such as carbon monoxide (CO), carbon dioxide (CO2), ozone (O3) and particulate matter. These gases are usually safe for us to breathe in if they are emitted in safe quantity but if the amount of these gases exceeded the safe level, they might be hazardous to human being especially children and people with asthmatic problem. Therefore, a smart indoor air quality monitoring system (IAQMS) is needed that able to tell the occupants about which sources that trigger the indoor air pollution. In this project, an IAQMS that able to classify sources influencing IAQ has been developed. This IAQMS applies a classification method based on Probabilistic Neural Network (PNN). It is used to classify the sources of indoor air pollution based on five conditions: ambient air, human activity, presence of chemical products, presence of food and beverage, and presence of fragrance. In order to get good and best classification accuracy, an analysis of several feature selection based on data pre-processing method is done to discriminate among the sources. The output from each data pre-processing method has been used as the input for the neural network. The result shows that PNN analysis with the data pre-processing method give good classification accuracy of 99.89% and able to classify the sources influencing IAQ high classification rate.

  15. AGN classification for X-ray sources in the 105 month Swift/BAT survey

    NASA Astrophysics Data System (ADS)

    Masetti, N.; Bassani, L.; Palazzi, E.; Malizia, A.; Stephen, J. B.; Ubertini, P.

    2018-03-01

    We here provide classifications for 8 hard X-ray sources listed as 'unknown AGN' in the 105 month Swift/BAT all-sky survey catalogue (Oh et al. 2018, ApJS, 235, 4). The corresponding optical spectra were extracted from the 6dF Galaxy Survey (Jones et al. 2009, MNRAS, 399, 683).

  16. Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology.

    PubMed

    Zheng, Ling; Yumak, Hasan; Chen, Ling; Ochs, Christopher; Geller, James; Kapusnik-Uner, Joan; Perl, Yehoshua

    2017-09-01

    The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of action, dosage form and physiological effects. Within NDF-RT such information is represented using tens of thousands of roles connecting drugs to classifications. In previous studies, we have introduced various kinds of Abstraction Networks to summarize the content and structure of terminologies in order to facilitate their visual comprehension, and support quality assurance of terminologies. However, these previous kinds of Abstraction Networks are not appropriate for summarizing the NDF-RT classification hierarchies, due to its unique structure. In this paper, we present the novel Ingredient Abstraction Network (IAbN) to summarize, visualize and support the audit of NDF-RT's Chemical Ingredients hierarchy and its associated drugs. A common theme in our quality assurance framework is to use characterizations of sets of concepts, revealed by the Abstraction Network structure, to capture concepts, the modeling of which is more complex than for other concepts. For the IAbN, we characterize drug ingredient concepts as more complex if they belong to IAbN groups with multiple parent groups. We show that such concepts have a statistically significantly higher rate of errors than a control sample and identify two especially common patterns of errors. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. Improvement in defect classification efficiency by grouping disposition for reticle inspection

    NASA Astrophysics Data System (ADS)

    Lai, Rick; Hsu, Luke T. H.; Chang, Peter; Ho, C. H.; Tsai, Frankie; Long, Garrett; Yu, Paul; Miller, John; Hsu, Vincent; Chen, Ellison

    2005-11-01

    As the lithography design rule of IC manufacturing continues to migrate toward more advanced technology nodes, the mask error enhancement factor (MEEF) increases and necessitates the use of aggressive OPC features. These aggressive OPC features pose challenges to reticle inspection due to high false detection, which is time-consuming for defect classification and impacts the throughput of mask manufacturing. Moreover, higher MEEF leads to stricter mask defect capture criteria so that new generation reticle inspection tool is equipped with better detection capability. Hence, mask process induced defects, which were once undetectable, are now detected and results in the increase of total defect count. Therefore, how to review and characterize reticle defects efficiently is becoming more significant. A new defect review system called ReviewSmart has been developed based on the concept of defect grouping disposition. The review system intelligently bins repeating or similar defects into defect groups and thus allows operators to review massive defects more efficiently. Compared to the conventional defect review method, ReviewSmart not only reduces defect classification time and human judgment error, but also eliminates desensitization that is formerly inevitable. In this study, we attempt to explore the most efficient use of ReviewSmart by evaluating various defect binning conditions. The optimal binning conditions are obtained and have been verified for fidelity qualification through inspection reports (IRs) of production masks. The experiment results help to achieve the best defect classification efficiency when using ReviewSmart in the mask manufacturing and development.

  18. A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield

    PubMed Central

    Ringard, Justine; Seyler, Frederique; Linguet, Laurent

    2017-01-01

    Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two set approach, without taking into account the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale. PMID:28621723

  19. A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield.

    PubMed

    Ringard, Justine; Seyler, Frederique; Linguet, Laurent

    2017-06-16

    Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two set approach, without taking into account the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale.

  20. Her2Net: A Deep Framework for Semantic Segmentation and Classification of Cell Membranes and Nuclei in Breast Cancer Evaluation.

    PubMed

    Saha, Monjoy; Chakraborty, Chandan

    2018-05-01

    We present an efficient deep learning framework for identifying, segmenting, and classifying cell membranes and nuclei from human epidermal growth factor receptor-2 (HER2)-stained breast cancer images with minimal user intervention. This is a long-standing issue for pathologists because the manual quantification of HER2 is error-prone, costly, and time-consuming. Hence, we propose a deep learning-based HER2 deep neural network (Her2Net) to solve this issue. The convolutional and deconvolutional parts of the proposed Her2Net framework consisted mainly of multiple convolution layers, max-pooling layers, spatial pyramid pooling layers, deconvolution layers, up-sampling layers, and trapezoidal long short-term memory (TLSTM). A fully connected layer and a softmax layer were also used for classification and error estimation. Finally, HER2 scores were calculated based on the classification results. The main contribution of our proposed Her2Net framework includes the implementation of TLSTM and a deep learning framework for cell membrane and nucleus detection, segmentation, and classification and HER2 scoring. Our proposed Her2Net achieved 96.64% precision, 96.79% recall, 96.71% F-score, 93.08% negative predictive value, 98.33% accuracy, and a 6.84% false-positive rate. Our results demonstrate the high accuracy and wide applicability of the proposed Her2Net in the context of HER2 scoring for breast cancer evaluation.

Top