Sample records for lowest classification error

  1. A research of selected textural features for detection of asbestos-cement roofing sheets using orthoimages

    NASA Astrophysics Data System (ADS)

    Książek, Judyta

    2015-10-01

    At present, there is great interest in the development of texture-based image classification methods in many different areas. This study presents the results of research carried out to assess the usefulness of selected textural features for the detection of asbestos-cement roofs in orthophotomap classification. Two different orthophotomaps of southern Poland (with ground resolutions of 5 cm and 25 cm) were used. On both orthoimages, representative samples for two classes, asbestos-cement roofing sheets and other roofing materials, were selected. The usefulness of texture analysis was assessed using machine learning methods based on decision trees (the C5.0 algorithm). For this purpose, various sets of texture parameters were calculated in MaZda software, and different numbers of texture parameter groups were considered when building the decision trees. Cross-validation was performed to obtain the best settings for the decision tree models, and the models with the lowest mean classification error were selected. Classification accuracy was assessed on validation data sets that were not used for training. For the 5 cm ground resolution samples, the lowest mean classification error was 15.6%; for the 25 cm ground resolution it was 20.0%. The obtained results confirm the potential usefulness of texture-based image processing for the detection of asbestos-cement roofing sheets. To improve accuracy, an extended study analysing additional textural features as well as spectral characteristics should be considered.
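
    A minimal sketch of the model-selection step described above, assuming the texture parameters have already been computed (e.g. exported from MaZda) into a feature matrix; scikit-learn's CART decision tree stands in for the C5.0 algorithm used in the study, and the data below are synthetic placeholders.

    ```python
    # Sketch: pick the decision-tree setting with the lowest mean cross-validation
    # error on texture features. sklearn's CART tree is a stand-in for C5.0; the
    # feature matrix and labels below are synthetic placeholders.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 30))       # 200 roof samples x 30 texture parameters
    y = rng.integers(0, 2, size=200)     # 1 = asbestos-cement, 0 = other roofing

    best = None
    for max_depth in (3, 5, 8, None):    # candidate tree settings
        tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
        err = 1.0 - cross_val_score(tree, X, y, cv=10).mean()   # mean classification error
        if best is None or err < best[1]:
            best = (max_depth, err)

    print(f"lowest mean CV error: {best[1]:.3f} at max_depth={best[0]}")
    ```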

  2. Factors That Affect Large Subunit Ribosomal DNA Amplicon Sequencing Studies of Fungal Communities: Classification Method, Primer Choice, and Error

    PubMed Central

    Porter, Teresita M.; Golding, G. Brian

    2012-01-01

    Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and, to an increasing extent, also in amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, make primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: 1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); 2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and 3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross-validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and NBC ran significantly faster than the other tested methods. All methods performed poorly with the shortest 50–100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short-read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys. PMID:22558215
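
    A hedged sketch of the leave-one-out evaluation strategy used above to score classifiers at the genus rank; a multinomial naive Bayes on k-mer counts is only a generic stand-in for the RDP classifier, and the arrays are random placeholders, not LSU sequence data.

    ```python
    # Sketch: leave-one-out cross-validation of a composition-based classifier.
    # MultinomialNB on k-mer counts is a generic stand-in for the RDP NBC;
    # X (k-mer count matrix) and the genus labels y are random placeholders.
    import numpy as np
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.integers(0, 20, size=(60, 256))   # 60 reference sequences, 4-mer counts
    y = rng.integers(0, 5, size=60)           # genus labels (5 genera)

    acc = cross_val_score(MultinomialNB(), X, y, cv=LeaveOneOut()).mean()
    print(f"leave-one-out genus-rank accuracy: {acc:.2%}")
    ```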

  3. The Relationship between Occurrence Timing of Dispensing Errors and Subsequent Danger to Patients under the Situation According to the Classification of Drugs by Efficacy.

    PubMed

    Tsuji, Toshikazu; Nagata, Kenichiro; Kawashiri, Takehiro; Yamada, Takaaki; Irisa, Toshihiro; Murakami, Yuko; Kanaya, Akiko; Egashira, Nobuaki; Masuda, Satohiro

    2016-01-01

    There are many reports regarding various medical institutions' attempts at the prevention of dispensing errors. However, the relationship between the occurrence timing of dispensing errors and the subsequent danger to patients has not been studied with respect to the classification of drugs by efficacy. Therefore, we analyzed the relationship between position and time regarding the occurrence of dispensing errors, and we investigated the relationship between their occurrence timing and the danger to patients. In this study, dispensing errors and incidents in three categories (drug name errors, drug strength errors, drug count errors) were classified into two groups in terms of drug efficacy (efficacy similarity (-) group, efficacy similarity (+) group) and into three classes in terms of the occurrence timing of dispensing errors (initial phase errors, middle phase errors, final phase errors). Then, the rates of damage shifting from "dispensing errors" to "damage to patients" were compared as an index of danger between the two groups and among the three classes. Consequently, the rate of damage in the "efficacy similarity (-) group" was significantly higher than that in the "efficacy similarity (+) group". Furthermore, the rate of damage was the highest for "initial phase errors" and the lowest for "final phase errors" among the three classes. From the results of this study, it became clear that the earlier dispensing errors occur, the more severe the damage to patients becomes.

  4. Unbiased Taxonomic Annotation of Metagenomic Samples

    PubMed Central

    Fosso, Bruno; Pesole, Graziano; Rosselló, Francesc

    2018-01-01

    The classification of reads from a metagenomic sample using a reference taxonomy is usually based on first mapping the reads to the reference sequences and then classifying each read at a node under the lowest common ancestor of the candidate sequences in the reference taxonomy with the least classification error. However, this taxonomic annotation can be biased by an imbalanced taxonomy and also by the presence of multiple nodes in the taxonomy with the least classification error for a given read. In this article, we show that the Rand index is a better indicator of classification error than the often-used area under the receiver operating characteristic (ROC) curve and the F-measure, for both balanced and imbalanced reference taxonomies. We also address the second source of bias by reducing the taxonomic annotation problem for a whole metagenomic sample to a set cover problem, for which a logarithmic approximation can be obtained in linear time and an exact solution can be obtained by integer linear programming. Experimental results with a proof-of-concept implementation of the set cover approach to taxonomic annotation in a next release of the TANGO software show that the set cover approach further reduces ambiguity in the taxonomic annotation obtained with TANGO without distorting the relative abundance profile of the metagenomic sample. PMID:29028181
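
    The set cover reduction mentioned above can be illustrated with the standard greedy approximation, which achieves a logarithmic bound; the reads-to-taxa mapping below is hypothetical and only stands in for the candidate assignments a tool such as TANGO would produce.

    ```python
    # Sketch: greedy logarithmic approximation of set cover for taxonomic annotation.
    # Each candidate taxon "covers" the reads it could explain; taxa are picked until
    # all reads are covered. The read and taxon identifiers are hypothetical.
    def greedy_set_cover(reads, candidates):
        """candidates: dict mapping taxon -> set of read ids it can explain."""
        uncovered = set(reads)
        chosen = []
        while uncovered:
            # taxon explaining the largest number of still-uncovered reads
            taxon = max(candidates, key=lambda t: len(candidates[t] & uncovered))
            gained = candidates[taxon] & uncovered
            if not gained:           # remaining reads cannot be explained
                break
            chosen.append(taxon)
            uncovered -= gained
        return chosen, uncovered

    reads = {"r1", "r2", "r3", "r4", "r5"}
    candidates = {
        "Escherichia": {"r1", "r2", "r3"},
        "Shigella":    {"r2", "r3"},
        "Bacillus":    {"r4", "r5"},
    }
    chosen, leftover = greedy_set_cover(reads, candidates)
    print(chosen, leftover)   # ['Escherichia', 'Bacillus'] set()
    ```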

  5. Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography

    PubMed Central

    Umut, İlhan; Çentik, Güven

    2016-01-01

    The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. It also increases the risk of problems during the recording process and increases the storage volume. In this study, we aim to detect periodic leg movement (PLM) in sleep using channels other than leg electromyography (EMG), by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with a PLM disorder diagnosis were examined retrospectively. Novel software was developed for the analysis of the PSG records; it utilizes machine learning algorithms, statistical methods, and DSP methods. To classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of the classification results showed that the K-nearest neighbour algorithm had a higher average classification rate (91.87%) and a lower average classification error value (RMSE = 0.2850), while the multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). The results show that PLM can be classified with high accuracy (91.87%) without a leg EMG record being present. PMID:27213008
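
    A hedged sketch of the classifier comparison reported above (K-nearest neighbour versus multilayer perceptron), scored by classification rate and RMSE; the synthetic features only stand in for the PSG-derived features and PLM labels used in the study.

    ```python
    # Sketch: compare KNN and MLP for PLM detection by classification rate and by
    # RMSE of the predicted class probabilities against the 0/1 labels.
    # The features and labels are synthetic placeholders.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_predict
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import accuracy_score

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    models = [("KNN", KNeighborsClassifier(n_neighbors=5)),
              ("MLP", MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000,
                                    random_state=0))]
    for name, clf in models:
        proba = cross_val_predict(clf, X, y, cv=10, method="predict_proba")[:, 1]
        pred = (proba >= 0.5).astype(int)
        rmse = np.sqrt(np.mean((proba - y) ** 2))
        print(f"{name}: classification rate={accuracy_score(y, pred):.2%}, RMSE={rmse:.4f}")
    ```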

  7. Intelligent quotient estimation of mental retarded people from different psychometric instruments using artificial neural networks.

    PubMed

    Di Nuovo, Alessandro G; Di Nuovo, Santo; Buono, Serafino

    2012-02-01

    The estimation of a person's intelligence quotient (IQ) by means of psychometric tests is indispensable in the application of psychological assessment to several fields. When complex tests such as the Wechsler scales, which are the most commonly used and universally recognized instruments for the diagnosis of degrees of retardation, are not applicable, it is necessary to use other psycho-diagnostic tools better suited to the subject's specific condition. To ensure a homogeneous diagnosis, however, a common metric is needed; thus, the aim of our work is to build models able to estimate the Wechsler IQ accurately and reliably, starting from different psycho-diagnostic tools. Four different psychometric tests (the Leiter international performance scale; the coloured progressive matrices test; the mental development scale; the psycho-educational profile), along with the Wechsler scale, were administered to a group of 40 mentally retarded subjects with various pathologies, and to control persons. The obtained database was used to evaluate Wechsler IQ estimation models starting from the scores obtained in the other tests. Five modelling methods, two statistical and three from machine learning belonging to the family of artificial neural networks (ANNs), were employed to build the estimator. Several error metrics for estimated IQ and for retardation-level classification were defined to compare the performance of the various models with univariate and multivariate analyses. Eight empirical studies show that, after ten-fold cross-validation, the best average estimation error is 3.37 IQ points and the mental retardation level classification error is 7.5%. Furthermore, our experiments demonstrate the superior performance of ANN methods over statistical regression, because in all cases considered the ANN models show the lowest estimation error (from 0.12 to 0.9 IQ points) and the lowest classification error (from 2.5% to 10%). Since the estimation performance is better than the confidence interval of the Wechsler scales (five IQ points), we consider the models built to be very accurate and reliable, and they can be used to help clinical diagnosis. Accordingly, a computer program based on the results of our work is currently used in a clinical centre, and empirical trials confirm its validity. Furthermore, the positive results of our multivariate studies suggest new approaches for clinicians. Copyright © 2011 Elsevier B.V. All rights reserved.

  8. Chemometric study of Andalusian extra virgin olive oils Raman spectra: Qualitative and quantitative information.

    PubMed

    Sánchez-López, E; Sánchez-Rodríguez, M I; Marinas, A; Marinas, J M; Urbano, F J; Caridad, J M; Moalem, M

    2016-08-15

    Authentication of extra virgin olive oil (EVOO) is an important topic for the olive oil industry. Fraudulent practices in this sector are a major problem affecting both producers and consumers. This study analyzes the capability of FT-Raman spectroscopy combined with chemometric treatments to predict fatty acid contents (quantitative information), using gas chromatography as the reference technique, and to classify diverse EVOOs as a function of harvest year, olive variety, geographical origin and Andalusian PDO (qualitative information). The optimal number of PLS components that summarizes the spectral information was introduced progressively. For the estimation of the fatty acid composition, the lowest error (both in fitting and prediction) corresponded to MUFA, followed by SAFA and PUFA, though such errors were close to zero in all cases. As regards the qualitative variables, discriminant analysis allowed a correct classification of 94.3%, 84.0%, 89.0% and 86.6% of samples for harvest year, olive variety, geographical origin and PDO, respectively. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Detection of eardrum abnormalities using ensemble deep learning approaches

    NASA Astrophysics Data System (ADS)

    Senaras, Caglar; Moberly, Aaron C.; Teknos, Theodoros; Essig, Garth; Elmaraghy, Charles; Taj-Schaal, Nazhat; Yua, Lianbo; Gurcan, Metin N.

    2018-02-01

    In this study, we propose an approach to report the condition of the eardrum as "normal" or "abnormal" by ensembling two different deep learning architectures. In the first network (Network 1), we applied transfer learning to the Inception V3 network using 409 labeled samples. As a second network (Network 2), we designed a convolutional neural network that takes advantage of auto-encoders, using an additional 673 unlabeled eardrum samples. The individual classification accuracies of Network 1 and Network 2 were 84.4% (+/- 12.1%) and 82.6% (+/- 11.3%), respectively. Only 32% of the errors of the two networks were the same, making it possible to combine the two approaches to achieve better classification accuracy. The proposed ensemble method allows us to achieve robust classification because it has high accuracy (84.4%) with the lowest standard deviation (+/- 10.3%).
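
    A minimal sketch of combining two classifiers whose errors only partly overlap, by averaging their predicted probabilities (soft voting). The probability arrays are placeholders, not the outputs of the Inception V3 or auto-encoder networks described above.

    ```python
    # Sketch: ensemble two eardrum classifiers by averaging their predicted
    # probabilities of "abnormal". p1, p2 and the labels are placeholder arrays
    # standing in for the outputs of Network 1 and Network 2.
    import numpy as np

    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=200)                   # 0 = normal, 1 = abnormal
    p1 = np.clip(labels + rng.normal(0, 0.45, 200), 0, 1)   # Network 1 probabilities
    p2 = np.clip(labels + rng.normal(0, 0.45, 200), 0, 1)   # Network 2 probabilities

    def accuracy(p):
        return np.mean((p >= 0.5).astype(int) == labels)

    p_ens = (p1 + p2) / 2.0                                 # simple soft-voting ensemble
    print(f"net1={accuracy(p1):.2%}  net2={accuracy(p2):.2%}  ensemble={accuracy(p_ens):.2%}")
    ```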

  10. Influence of nuclei segmentation on breast cancer malignancy classification

    NASA Astrophysics Data System (ADS)

    Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam

    2009-02-01

    Breast cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss the role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade, which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features, and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on the co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes, four different classifiers were trained and tested with the previously extracted features. The compared classifiers are the Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results of the three compared approaches and leads to good feature extraction, with the lowest average error rate of 6.51% over the four classifiers. The best single performance was recorded for the multilayer perceptron, with an error rate of 3.07% using fuzzy c-means segmentation.

  11. Time-Frequency Distribution of Seismocardiographic Signals: A Comparative Study

    PubMed Central

    Taebi, Amirtaha; Mansy, Hansen A.

    2017-01-01

    Accurate estimation of seismocardiographic (SCG) signal features can help successful signal characterization and classification in health and disease, which may lead to new methods for diagnosing and monitoring heart function. Time-frequency distributions (TFDs) are often used to estimate spectrotemporal signal features. In this study, the performance of different TFDs (e.g., the short-time Fourier transform (STFT), polynomial chirplet transform (PCT), and continuous wavelet transform (CWT) with different mother functions) was assessed using simulated signals and then utilized to analyze actual SCGs. The instantaneous frequency (IF) was determined from the TFD, and the error in estimating IF was calculated for the simulated signals. Results suggested that the lowest IF error depended on both the TFD and the test signal. STFT had lower error than the CWT methods for most test signals. For a simulated SCG, the Morlet CWT estimated IF more accurately than other CWTs, but did not provide noticeable advantages over STFT or PCT. PCT had the most consistently accurate IF estimations and appeared better suited for estimating the IF of actual SCG signals. PCT analysis showed that actual SCGs from eight healthy subjects had multiple spectral peaks at 9.20 ± 0.48, 25.84 ± 0.77, and 50.71 ± 1.83 Hz (mean ± SEM). These may prove to be useful features for SCG characterization and classification. PMID:28952511
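
    A hedged sketch of one way to estimate instantaneous frequency from an STFT-based TFD, checked against a synthetic linear chirp with known IF; the study's polynomial chirplet and wavelet transforms are not shown, and this only illustrates the IF-error calculation.

    ```python
    # Sketch: estimate instantaneous frequency (IF) as the peak frequency of each
    # STFT time slice and measure the IF error on a synthetic linear chirp.
    import numpy as np
    from scipy.signal import chirp, stft

    fs = 500.0                                  # sampling rate, Hz
    t = np.arange(0, 10, 1 / fs)
    x = chirp(t, f0=5, t1=10, f1=45, method="linear")   # known IF: 5 -> 45 Hz

    f, tt, Z = stft(x, fs=fs, nperseg=256)
    if_est = f[np.argmax(np.abs(Z), axis=0)]    # ridge of the spectrogram
    if_true = 5 + (45 - 5) * tt / 10            # analytic IF of the linear chirp
    print(f"mean |IF error| = {np.mean(np.abs(if_est - if_true)):.2f} Hz")
    ```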

  12. Preselection statistics and Random Forest classification identify population informative single nucleotide polymorphisms in cosmopolitan and autochthonous cattle breeds.

    PubMed

    Bertolini, F; Galimberti, G; Schiavo, G; Mastrangelo, S; Di Gerlando, R; Strillacci, M G; Bagnato, A; Portolano, B; Fontanesi, L

    2018-01-01

    Commercial single nucleotide polymorphism (SNP) arrays have recently been developed for several species and can be used to identify informative markers to differentiate breeds or populations for several downstream applications. A few statistical approaches have been proposed to identify the most discriminating genetic markers among thousands of genotyped SNPs. In this work, we compared several methods of SNP preselection (Delta, Fst and principal component analysis (PCA)), in addition to Random Forest classifications, to analyse SNP data from six dairy cattle breeds, including cosmopolitan (Holstein, Brown and Simmental) and autochthonous Italian breeds raised in two different regions and subjected to limited or no breeding programmes (Cinisara and Modicana, raised only in Sicily, and Reggiana, raised only in Emilia Romagna). From these classifications, two panels of 96 and 48 SNPs containing the most discriminant SNPs were created for each preselection method. These panels were evaluated in terms of their ability to discriminate the breeds as a whole and breed by breed, as well as the linkage disequilibrium within each panel. The results showed that for the 48-SNP panels the error rate increased mainly for the autochthonous breeds, probably as a consequence of their admixed origin, lower selection pressure, and ascertainment bias in the construction of the SNP chip. The 96-SNP panels were generally better able to discriminate all breeds. The panel derived by PCA-chrom (obtained by preselection chromosome by chromosome) could identify informative SNPs that were particularly useful for the assignment of the minor breeds, reaching the lowest Out Of Bag error even in the Cinisara, whose error was quite high with all other panels. Moreover, this panel also contained the lowest number of SNPs in linkage disequilibrium. Several selected SNPs are located near genes affecting breed-specific phenotypic traits (coat colour and stature) or associated with production traits. In general, our results demonstrate the usefulness of Random Forest in combination with other reduction techniques to identify population-informative SNPs.
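
    A minimal sketch of scoring a preselected SNP panel by Random Forest Out Of Bag (OOB) error, as done above for the 96- and 48-SNP panels; the genotype matrix, breed labels and the "preselected" SNP indices are synthetic placeholders.

    ```python
    # Sketch: score a preselected SNP panel by Random Forest Out Of Bag (OOB) error.
    # Genotypes (0/1/2 allele counts) and breed labels are synthetic placeholders
    # for the cattle SNP-chip data used in the study.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    genotypes = rng.integers(0, 3, size=(300, 5000))   # 300 animals x 5000 SNPs
    breeds = rng.integers(0, 6, size=300)              # 6 breeds

    panel = rng.choice(5000, size=96, replace=False)   # stand-in for a preselected panel
    rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
    rf.fit(genotypes[:, panel], breeds)
    print(f"OOB error of the 96-SNP panel: {1.0 - rf.oob_score_:.2%}")
    ```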

  13. [Detection and classification of medication errors at Joan XXIII University Hospital].

    PubMed

    Jornet Montaña, S; Canadell Vilarrasa, L; Calabuig Muñoz, M; Riera Sendra, G; Vuelta Arce, M; Bardají Ruiz, A; Gallart Mora, M J

    2004-01-01

    Medication errors are multifactorial and multidisciplinary, and may originate in processes such as drug prescription, transcription, dispensation, preparation and administration. The goal of this work was to measure the incidence of detectable medication errors that arise within a unit dose drug distribution and control system, from drug prescription to drug administration, by means of an observational method confined to the Pharmacy Department, as well as a voluntary, anonymous report system. The acceptance of this voluntary report system's implementation was also assessed. A prospective descriptive study was conducted. Data collection was performed at the Pharmacy Department from a review of prescribed medical orders, a review of pharmaceutical transcriptions, a review of dispensed medication and a review of medication returned in unit dose medication carts. A voluntary, anonymous report system centralized in the Pharmacy Department was also set up to detect medication errors. Prescription errors were the most frequent (1.12%), closely followed by dispensation errors (1.04%). Transcription errors (0.42%) and administration errors (0.69%) had the lowest overall incidence. Voluntary report involved only 4.25% of all detected errors, whereas unit dose medication cart review contributed the most to error detection. Recognizing the incidence and types of medication errors that occur in a health-care setting allows us to analyze their causes and effect changes in different stages of the process in order to ensure maximal patient safety.

  14. Development of a methodology for classifying software errors

    NASA Technical Reports Server (NTRS)

    Gerhart, S. L.

    1976-01-01

    A mathematical formalization of the intuition behind classification of software errors is devised and then extended to a classification discipline: Every classification scheme should have an easily discernible mathematical structure and certain properties of the scheme should be decidable (although whether or not these properties hold is relative to the intended use of the scheme). Classification of errors then becomes an iterative process of generalization from actual errors to terms defining the errors together with adjustment of definitions according to the classification discipline. Alternatively, whenever possible, small scale models may be built to give more substance to the definitions. The classification discipline and the difficulties of definition are illustrated by examples of classification schemes from the literature and a new study of observed errors in published papers of programming methodologies.

  15. Classification and reduction of pilot error

    NASA Technical Reports Server (NTRS)

    Rogers, W. H.; Logan, A. L.; Boley, G. D.

    1989-01-01

    Human error is a primary or contributing factor in about two-thirds of commercial aviation accidents worldwide. With the ultimate goal of reducing pilot error accidents, this contract effort is aimed at understanding the factors underlying error events and reducing the probability of certain types of errors by modifying underlying factors such as flight deck design and procedures. A review of the literature relevant to error classification was conducted. Classification includes categorizing types of errors, the information processing mechanisms and factors underlying them, and identifying factor-mechanism-error relationships. The classification scheme developed by Jens Rasmussen was adopted because it provided a comprehensive yet basic error classification shell or structure that could easily accommodate the addition of details on domain-specific factors. For these purposes, factors specific to the aviation environment were incorporated. Hypotheses concerning the relationship of a small number of underlying factors, information processing mechanisms, and error types identified in the classification scheme were formulated. ASRS data were reviewed and a simulation experiment was performed to evaluate and quantify the hypotheses.

  16. Automatic detection of malaria parasite in blood images using two parameters.

    PubMed

    Kim, Jong-Dae; Nam, Kyeong-Min; Park, Chan-Young; Kim, Yu-Seop; Song, Hye-Jeong

    2015-01-01

    Malaria must be diagnosed quickly and accurately at the initial infection stage and treated early to be cured properly. The malaria diagnosis method using a microscope requires much labor and time from a skilled expert, and the diagnosis results vary greatly between individual diagnosticians. Therefore, to measure malaria parasite infection quickly and accurately, studies have been conducted on automated classification techniques using various parameters. In this study, by measuring classification performance according to changes in two parameters, we determined the parameter values that best distinguish normal from plasmodium-infected red blood cells. To reduce the stain deviation of the acquired images, a principal component analysis (PCA) grayscale conversion method was used, and as parameters we used the malaria-infected area and a threshold value used in binarization. The best classification performance was obtained by selecting the malaria threshold value (72) corresponding to the lowest error rate at a cell threshold value of 128 for detecting plasmodium-infected red blood cells.
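
    A hedged sketch of the two-parameter search described above: sweep the malaria threshold at a fixed cell threshold and keep the value giving the lowest error rate. The classify_cells() routine and the labelled image set are hypothetical placeholders, not the actual segmentation pipeline.

    ```python
    # Sketch: grid search over the binarization parameters (cell threshold, malaria
    # threshold) for the setting with the lowest classification error rate.
    # classify_cells() and the labelled cell images are hypothetical placeholders.
    import numpy as np

    def classify_cells(images, cell_thresh, malaria_thresh):
        """Placeholder rule: label a cell as infected (1) based on a simple threshold."""
        return [int(img.mean() > malaria_thresh) for img in images]

    def error_rate(images, labels, cell_thresh, malaria_thresh):
        pred = classify_cells(images, cell_thresh, malaria_thresh)
        return np.mean(np.array(pred) != np.array(labels))

    rng = np.random.default_rng(0)
    images = [rng.integers(0, 256, size=(64, 64)) for _ in range(100)]  # fake cell crops
    labels = rng.integers(0, 2, size=100)

    best = min((error_rate(images, labels, 128, m), m) for m in range(40, 120, 4))
    print(f"lowest error {best[0]:.2%} at malaria threshold {best[1]} (cell threshold 128)")
    ```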

  17. Quantification of Coffea arabica and Coffea canephora var. robusta concentration in blends by means of synchronous fluorescence and UV-Vis spectroscopies.

    PubMed

    Dankowska, A; Domagała, A; Kowalewski, W

    2017-09-01

    The potential of fluorescence and UV-Vis spectroscopies, as well as the low- and mid-level data fusion of both spectroscopies, for quantifying the concentrations of roasted Coffea arabica and Coffea canephora var. robusta in coffee blends was investigated. Principal component analysis was used to reduce data multidimensionality. To calculate the level of undeclared addition, multiple linear regression (PCA-MLR) models were used, with a lowest root mean square error of calibration (RMSEC) of 3.6% and root mean square error of cross-validation (RMSECV) of 7.9%. LDA was applied to fluorescence intensities and UV spectra of Coffea arabica and Coffea canephora samples and their mixtures in order to examine classification ability. The best performance of PCA-LDA analysis was observed for data fusion of UV and fluorescence intensity measurements at a wavelength interval of 60 nm. LDA showed that data fusion can achieve over 96% correct classifications (sensitivity) in the test set and 100% correct classifications in the training set with low-level data fusion. The corresponding results for the individual spectroscopies ranged from 90% (UV-Vis spectroscopy) to 77% (synchronous fluorescence) in the test set, and from 93% to 97% in the training set. The results demonstrate that fluorescence, UV, and visible spectroscopies complement each other for the quantification of roasted Coffea arabica and Coffea canephora var. robusta concentrations in blends. Copyright © 2017 Elsevier B.V. All rights reserved.
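
    A minimal sketch of the PCA-MLR step with RMSEC and RMSECV, assuming spectra and reference concentration values are already available as arrays; the data below are synthetic stand-ins, not the fluorescence or UV-Vis measurements of the study.

    ```python
    # Sketch: PCA scores fed to multiple linear regression (PCA-MLR), scored by the
    # root mean square error of calibration (RMSEC) and of cross-validation (RMSECV).
    # The spectra and robusta-concentration reference values are synthetic placeholders.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)
    spectra = rng.normal(size=(80, 1200))                            # 80 coffee-blend spectra
    ref = 50 + spectra[:, :5].sum(axis=1) + rng.normal(0, 0.5, 80)   # reference robusta content (%)

    model = make_pipeline(PCA(n_components=8), LinearRegression())
    model.fit(spectra, ref)
    rmsec = np.sqrt(np.mean((model.predict(spectra) - ref) ** 2))
    cv_pred = cross_val_predict(model, spectra, ref, cv=10)
    rmsecv = np.sqrt(np.mean((cv_pred - ref) ** 2))
    print(f"RMSEC = {rmsec:.2f} %   RMSECV = {rmsecv:.2f} %")
    ```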

  18. Fully Convolutional Networks for Ground Classification from LIDAR Point Clouds

    NASA Astrophysics Data System (ADS)

    Rizaldy, A.; Persello, C.; Gevaert, C. M.; Oude Elberink, S. J.

    2018-05-01

    Deep learning has been used extensively for image classification in recent years, and its use for ground classification from LIDAR point clouds has also been studied recently. However, point clouds need to be converted into an image in order to use Convolutional Neural Networks (CNNs). In state-of-the-art techniques, this conversion is slow because each point is converted into a separate image, which leads to highly redundant computation during conversion and classification. The goal of this study is to design a more efficient data conversion and ground classification. This goal is achieved by first converting the whole point cloud into a single image. The classification is then performed by a Fully Convolutional Network (FCN), a modified version of a CNN designed for pixel-wise image classification. The proposed method is significantly faster than state-of-the-art techniques: on the ISPRS Filter Test dataset, it is 78 times faster for conversion and 16 times faster for classification. Our experimental analysis on the same dataset shows that the proposed method results in 5.22% total error, 4.10% type I error, and 15.07% type II error. Compared to the previous CNN-based technique and the LAStools software, the proposed method reduces the total error and type I error (while the type II error is slightly higher). The method was also tested on a very high point density LIDAR point cloud, resulting in 4.02% total error, 2.15% type I error and 6.14% type II error.
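
    The three error figures quoted above follow the error definitions commonly used in the ISPRS filter test: type I is a ground point rejected as non-ground, type II is a non-ground point accepted as ground. A minimal sketch of computing them from reference and predicted labels is shown below; the label arrays are placeholders.

    ```python
    # Sketch: total error, type I error (ground rejected as non-ground) and type II
    # error (non-ground accepted as ground) from reference and predicted labels.
    # 1 = ground, 0 = non-ground; the arrays are placeholders.
    import numpy as np

    ref = np.array([1, 1, 1, 0, 0, 1, 0, 1, 0, 0])
    pred = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 0])

    type1 = np.mean(pred[ref == 1] == 0)    # ground points misclassified
    type2 = np.mean(pred[ref == 0] == 1)    # non-ground points misclassified
    total = np.mean(pred != ref)
    print(f"total={total:.2%}  type I={type1:.2%}  type II={type2:.2%}")
    ```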

  19. Seasonal trends in separability of leaf reflectance spectra for Ailanthus altissima and four other tree species

    NASA Astrophysics Data System (ADS)

    Burkholder, Aaron

    This project investigated the spectral separability of the invasive species Ailanthus altissima, commonly called tree of heaven, and four other native species. Leaves were collected from Ailanthus and four native tree species from May 13 through August 24, 2008, and spectral reflectance factor measurements were gathered for each tree using an ASD (Boulder, Colorado) FieldSpec Pro full-range spectroradiometer. The original data covered the range from 350-2500 nm, with one reflectance measurement collected per one nm wavelength. To reduce dimensionality, the measurements were resampled to the actual resolution of the spectrometer's sensors, and regions of atmospheric absorption were removed. Continuum removal was performed on the reflectance data, resulting in a second dataset. For both the reflectance and continuum removed datasets, least angle regression (LARS) and random forest classification were used to identify a single set of optimal wavelengths across all sampled dates, a set of optimal wavelengths for each date, and the dates for which Ailanthus is most separable from other species. It was found that classification accuracy varies both with dates and bands used. Contrary to expectations that early spring would provide the best separability, the lowest classification error was observed on July 22 for the reflectance data, and on May 13, July 11 and August 1 for the continuum removed data. This suggests that July and August are also potentially good months for species differentiation. Applying continuum removal in many cases reduced classification error, although not consistently. Band selection seems to be more important for reflectance data in that it results in greater improvement in classification accuracy, and LARS appears to be an effective band selection tool. The optimal spectral bands were selected from across the spectrum, often with bands from the blue (401-431 nm), NIR (1115 nm) and SWIR (1985-1995 nm), suggesting that hyperspectral sensors with broad wavelength sensitivity are important for mapping and identification of Ailanthus.

  20. Patient identification using a near-infrared laser scanner

    NASA Astrophysics Data System (ADS)

    Manit, Jirapong; Bremer, Christina; Schweikard, Achim; Ernst, Floris

    2017-03-01

    We propose a new biometric approach in which the tissue thickness of a person's forehead is used as a biometric feature. Given that the spatial registration of two 3D laser scans of the same human face usually produces a low error value, the principle of point cloud registration and its error metric can be applied to human classification techniques. However, by only considering the spatial error, it is not possible to reliably verify a person's identity. We propose to use a novel near-infrared laser-based head tracking system to determine an additional feature, the tissue thickness, and include this in the error metric. Using MRI as a ground truth, data from the foreheads of 30 subjects were collected, from which a 4D reference point cloud was created for each subject. The measurements from the near-infrared system were registered with all reference point clouds using the ICP algorithm. Afterwards, the spatial and tissue-thickness errors were extracted, forming a 2D feature space. For all subjects, the lowest feature distance resulted from the registration of a measurement and the reference point cloud of the same person. The combined registration error features yielded two clusters in the feature space, one from the same subject and another from the other subjects. When only the tissue-thickness error was considered, these clusters were less distinct but still present. These findings could help to raise safety standards for head and neck cancer patients and lay the foundation for a future human identification technique.
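
    A hedged sketch of the identification step described above: a measurement is registered against each reference point cloud and assigned to the reference with the smallest combined distance in the 2D space of (spatial error, tissue-thickness error). The error values below are placeholders for the ICP outputs.

    ```python
    # Sketch: identify a subject by the smallest distance in the 2D feature space of
    # (spatial registration error, tissue-thickness error). The per-reference error
    # pairs are placeholders for ICP registration outputs.
    import numpy as np

    # rows: one measurement registered against each of four reference point clouds
    # columns: (spatial RMS error, tissue-thickness RMS error), arbitrary units
    errors = np.array([
        [1.8, 0.6],   # reference of subject A
        [0.4, 0.2],   # reference of subject B  <- true identity
        [2.1, 0.9],   # reference of subject C
        [1.5, 0.7],   # reference of subject D
    ])

    subjects = ["A", "B", "C", "D"]
    distances = np.linalg.norm(errors, axis=1)   # combined 2D error distance
    print("identified as subject", subjects[int(np.argmin(distances))])
    ```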

  1. Source Attribution of Cyanides Using Anionic Impurity Profiling, Stable Isotope Ratios, Trace Elemental Analysis and Chemometrics.

    PubMed

    Mirjankar, Nikhil S; Fraga, Carlos G; Carman, April J; Moran, James J

    2016-02-02

    Chemical attribution signatures (CAS) for chemical threat agents (CTAs), such as cyanides, are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. Herein, stocks of KCN and NaCN were analyzed for trace anions by high performance ion chromatography (HPIC), carbon stable isotope ratio (δ(13)C) by isotope ratio mass spectrometry (IRMS), and trace elements by inductively coupled plasma optical emission spectroscopy (ICP-OES). The collected analytical data were evaluated using hierarchical cluster analysis (HCA), Fisher-ratio (F-ratio), interval partial least-squares (iPLS), genetic algorithm-based partial least-squares (GAPLS), partial least-squares discriminant analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminant analysis (SVMDA). HCA of anion impurity profiles from multiple cyanide stocks from six reported countries of origin resulted in cyanide samples clustering into three groups, independent of the associated alkali metal (K or Na). The three groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries each having one known solid cyanide factory: Czech Republic, Germany, and United States. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability associated with using anion impurities for matching a cyanide sample to its factory using our current cyanide stocks. Variable selection methods reduced errors for those classification methods having errors greater than zero; iPLS-forward selection and F-ratio typically provided the lowest errors. Finally, using anion profiles to classify cyanides to a specific stock or stock group for a subset of United States stocks resulted in cross-validation errors ranging from 0 to 5.3%.

  2. Classification-Based Spatial Error Concealment for Visual Communications

    NASA Astrophysics Data System (ADS)

    Chen, Meng; Zheng, Yefeng; Wu, Min

    2006-12-01

    In an error-prone transmission environment, error concealment is an effective technique to reconstruct damaged visual content. Due to large variations in image characteristics, different concealment approaches are necessary to accommodate the different nature of the lost image content. In this paper, we address this issue and propose using classification to integrate state-of-the-art error concealment techniques. The proposed approach takes advantage of multiple concealment algorithms and adaptively selects the suitable algorithm for each damaged image area. With growing awareness that the design of sender and receiver systems should be jointly considered for efficient and reliable multimedia communications, we propose a set of classification-based block concealment schemes, including receiver-side classification, sender-side attachment, and sender-side embedding. Our experimental results provide extensive performance comparisons and demonstrate that the proposed classification-based error concealment approaches outperform conventional approaches.

  3. C-fuzzy variable-branch decision tree with storage and classification error rate constraints

    NASA Astrophysics Data System (ADS)

    Yang, Shiueng-Bien

    2009-10-01

    The C-fuzzy decision tree (CFDT), which is based on the fuzzy C-means algorithm, has recently been proposed. The CFDT is grown by selecting the nodes to be split according to their classification error rate. However, the CFDT design does not consider the time taken to classify the input vector, so the CFDT can be improved. We propose a new C-fuzzy variable-branch decision tree (CFVBDT) with storage and classification error rate constraints. The design of the CFVBDT consists of two phases: growing and pruning. The CFVBDT is grown by selecting the nodes to be split according to the classification error rate and the classification time in the decision tree. Additionally, the pruning method selects the nodes to prune based on the storage requirement and the classification time of the CFVBDT. Furthermore, the number of branches of each internal node is variable in the CFVBDT. Experimental results indicate that the proposed CFVBDT outperforms the CFDT and other methods.

  4. Multiple-rule bias in the comparison of classification rules

    PubMed Central

    Yousefi, Mohammadmahdi R.; Hua, Jianping; Dougherty, Edward R.

    2011-01-01

    Motivation: There is growing discussion in the bioinformatics community concerning overoptimism of reported results. Two approaches contributing to overoptimism in classification are (i) the reporting of results on datasets for which a proposed classification rule performs well and (ii) the comparison of multiple classification rules on a single dataset that purports to show the advantage of a certain rule. Results: This article provides a careful probabilistic analysis of the second issue and the ‘multiple-rule bias’, resulting from choosing a classification rule having minimum estimated error on the dataset. It quantifies this bias corresponding to estimating the expected true error of the classification rule possessing minimum estimated error and it characterizes the bias from estimating the true comparative advantage of the chosen classification rule relative to the others by the estimated comparative advantage on the dataset. The analysis is applied to both synthetic and real data using a number of classification rules and error estimators. Availability: We have implemented in C code the synthetic data distribution model, classification rules, feature selection routines and error estimation methods. The code for multiple-rule analysis is implemented in MATLAB. The source code is available at http://gsp.tamu.edu/Publications/supplementary/yousefi11a/. Supplementary simulation results are also included. Contact: edward@ece.tamu.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21546390
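
    The selection effect analysed above can be reproduced with a small simulation; the sketch below uses synthetic data, three common scikit-learn rules and 5-fold cross-validation as the error estimator, so it is only a toy illustration of the multiple-rule bias, not the paper's model or code.

    ```python
    # Sketch: choose the rule with minimum estimated (cross-validated) error on a
    # small dataset, then compare that estimate with the rule's true error measured
    # on a large held-out set. A positive mean gap indicates optimistic bias.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.naive_bayes import GaussianNB

    rules = [KNeighborsClassifier(3), DecisionTreeClassifier(random_state=0), GaussianNB()]
    gaps = []
    for seed in range(30):
        X, y = make_classification(n_samples=2000, n_features=10, random_state=seed)
        Xtr, Xte, ytr, yte = train_test_split(X, y, train_size=100, random_state=seed)
        est_err = [1 - cross_val_score(r, Xtr, ytr, cv=5).mean() for r in rules]
        best = int(np.argmin(est_err))                      # rule with minimum estimated error
        true_err = 1 - rules[best].fit(Xtr, ytr).score(Xte, yte)
        gaps.append(true_err - est_err[best])
    print(f"mean (true - estimated) error gap for the chosen rule: {np.mean(gaps):+.3f}")
    ```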

  5. Clarification of terminology in medication errors: definitions and classification.

    PubMed

    Ferner, Robin E; Aronson, Jeffrey K

    2006-01-01

    We have previously described and analysed some terms that are used in drug safety and have proposed definitions. Here we discuss and define terms that are used in the field of medication errors, particularly terms that are sometimes misunderstood or misused. We also discuss the classification of medication errors. A medication error is a failure in the treatment process that leads to, or has the potential to lead to, harm to the patient. Errors can be classified according to whether they are mistakes, slips, or lapses. Mistakes are errors in the planning of an action. They can be knowledge based or rule based. Slips and lapses are errors in carrying out an action - a slip through an erroneous performance and a lapse through an erroneous memory. Classification of medication errors is important because the probabilities of errors of different classes are different, as are the potential remedies.

  6. Increasing accuracy of vehicle detection from conventional vehicle detectors - counts, speeds, classification, and travel time.

    DOT National Transportation Integrated Search

    2014-09-01

    Vehicle classification is an important traffic parameter for transportation planning and infrastructure : management. Length-based vehicle classification from dual loop detectors is among the lowest cost : technologies commonly used for collecting th...

  7. Fault detection and diagnosis of diesel engine valve trains

    NASA Astrophysics Data System (ADS)

    Flett, Justin; Bone, Gary M.

    2016-05-01

    This paper presents the development of a fault detection and diagnosis (FDD) system for use with a diesel internal combustion engine (ICE) valve train. A novel feature is generated for each of the valve closing and combustion impacts. Deformed valve spring faults and abnormal valve clearance faults were seeded on a diesel engine instrumented with one accelerometer. Five classification methods were implemented experimentally and compared. The FDD system using the Naïve-Bayes classification method produced the best overall performance, with a lowest detection accuracy (DA) of 99.95% and a lowest classification accuracy (CA) of 99.95% for the spring faults occurring on individual valves. The lowest DA and CA values for multiple faults occurring simultaneously were 99.95% and 92.45%, respectively. The DA and CA results demonstrate the accuracy of our FDD system for diesel ICE valve train fault scenarios not previously addressed in the literature.

  8. A method for classification of transient events in EEG recordings: application to epilepsy diagnosis.

    PubMed

    Tzallas, A T; Karvelis, P S; Katsis, C D; Fotiadis, D I; Giannopoulos, S; Konitsiotis, S

    2006-01-01

    The aim of the paper is to analyze transient events in inter-ictal EEG recordings and classify epileptic activity as focal or generalized epilepsy using an automated method. A two-stage approach is proposed. In the first stage, the observed transient events of a single channel are classified into four categories: epileptic spike (ES), muscle activity (EMG), eye blinking activity (EOG), and sharp alpha activity (SAA). The process is based on an artificial neural network. Different artificial neural network architectures were tried, and the network with the lowest error was selected using the hold-out approach. In the second stage, a knowledge-based system is used to produce a diagnosis of focal or generalized epileptic activity. The classification of transient events achieved a high overall accuracy (84.48%), while the knowledge-based system for epilepsy diagnosis correctly classified nine out of ten cases. The proposed method is advantageous since it effectively detects and classifies the undesirable activity into appropriate categories and produces a final outcome related to the existence of epilepsy.

  9. Determining the optimal window length for pattern recognition-based myoelectric control: balancing the competing effects of classification error and controller delay.

    PubMed

    Smith, Lauren H; Hargrove, Levi J; Lock, Blair A; Kuiken, Todd A

    2011-04-01

    Pattern recognition-based control of myoelectric prostheses has shown great promise in research environments, but has not been optimized for use in a clinical setting. To explore the relationship between classification error, controller delay, and real-time controllability, 13 able-bodied subjects were trained to operate a virtual upper-limb prosthesis using pattern recognition of electromyogram (EMG) signals. Classification error and controller delay were varied by training different classifiers with a variety of analysis window lengths ranging from 50 to 550 ms and either two or four EMG input channels. Offline analysis showed that classification error decreased with longer window lengths (p < 0.01). Real-time controllability was evaluated with the target achievement control (TAC) test, which prompted users to maneuver the virtual prosthesis into various target postures. The results indicated that user performance improved with lower classification error (p < 0.01) and was reduced with longer controller delay (p < 0.01), as determined by the window length. Therefore, both of these effects should be considered when choosing a window length; it may be beneficial to increase the window length if this results in a reduced classification error, despite the corresponding increase in controller delay. For the system employed in this study, the optimal window length was found to be between 150 and 250 ms, which is within acceptable controller delays for conventional multistate amplitude controllers.

  10. The use of a contextual, modal and psychological classification of medication errors in the emergency department: a retrospective descriptive study.

    PubMed

    Cabilan, C J; Hughes, James A; Shannon, Carl

    2017-12-01

    To describe the contextual, modal and psychological classification of medication errors in the emergency department, in order to identify the factors associated with the reported medication errors. The causes of medication errors are unique to every clinical setting; hence, error minimisation strategies are not always effective. For this reason, it is fundamental to understand the causes specific to the emergency department so that targeted strategies can be implemented. Retrospective analysis of reported medication errors in the emergency department. All voluntarily staff-reported medication-related incidents from 2010-2015 were retrieved from the hospital's electronic incident management system for analysis. Contextual classification involved the time, place and the type of medications involved. Modal classification pertained to the stage and issue (e.g. wrong medication, wrong patient). Psychological classification categorised the errors in planning (knowledge-based and rule-based errors) and skill (slips and lapses). There were 405 errors reported. Most errors occurred in the acute care area, short-stay unit and resuscitation area, during the busiest shifts (0800-1559, 1600-2259). Half of the errors involved high-alert medications. Many of the errors occurred during administration (62·7%) and prescribing (28·6%), and commonly during both stages (18·5%). Wrong dose, wrong medication and omission were the issues that dominated. Knowledge-based errors characterised the errors that occurred in prescribing and administration. The highest proportions of slips (79·5%) and lapses (76·1%) occurred during medication administration. It is likely that some of the errors occurred due to a lack of adherence to safety protocols. Technology such as computerised prescribing, barcode medication administration and reminder systems could potentially decrease medication errors in the emergency department. It is possible that some of the errors could have been prevented if safety protocols had been adhered to, which highlights the need to also address clinicians' attitudes towards safety. Technology can be implemented to help minimise errors in the ED, but this must be coupled with efforts to enhance the culture of safety. © 2017 John Wiley & Sons Ltd.

  11. Effects of uncertainty and variability on population declines and IUCN Red List classifications.

    PubMed

    Rueda-Cediel, Pamela; Anderson, Kurt E; Regan, Tracey J; Regan, Helen M

    2018-01-22

    The International Union for Conservation of Nature (IUCN) Red List Categories and Criteria is a quantitative framework for classifying species according to extinction risk. Population models may be used to estimate extinction risk or population declines. Uncertainty and variability arise in threat classifications through measurement and process error in empirical data and through uncertainty in the models used to estimate extinction risk and population declines. Furthermore, species traits are known to affect extinction risk. We investigated the effects of measurement and process error, model type, population growth rate, and age at first reproduction on the reliability of IUCN Red List classifications based on projected population declines. We used an age-structured population model to simulate true population trajectories with different growth rates, reproductive ages and levels of variation, and subjected them to measurement error. We evaluated the ability of scalar and matrix models parameterized with these simulated time series to accurately capture the IUCN Red List classification generated with the true population declines. Under all levels of measurement error tested and low process error, classifications were reasonably accurate; scalar and matrix models yielded roughly the same rate of misclassification, but the distribution of errors differed; matrix models overestimated extinction risk more often than they underestimated it; process error tended to contribute to misclassifications to a greater extent than measurement error; and more misclassifications occurred for fast, rather than slow, life histories. These results indicate that classifications of highly threatened taxa (i.e., taxa with low growth rates) under criterion A are more likely to be reliable than those for less threatened taxa when assessed with population models. Greater scrutiny needs to be placed on data used to parameterize population models for species with high growth rates, particularly when available evidence indicates a potential transition to higher risk categories. © 2018 Society for Conservation Biology.

  12. Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

    NASA Technical Reports Server (NTRS)

    Alexander, Tiffaney Miller

    2017-01-01

    Research results have shown that more than half of aviation, aerospace and aeronautics mishaps/incidents are attributed to human error. As a part of Safety within space exploration ground processing operations, the underlying contributors and causes of human error must be identified and/or classified in order to manage human error. This research provides a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS) as an analysis tool to identify contributing factors, their impact on human error events, and to predict the Human Error Probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.

  14. Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

    NASA Technical Reports Server (NTRS)

    Alexander, Tiffaney Miller

    2017-01-01

    Research results have shown that more than half of aviation, aerospace and aeronautics mishaps/incidents are attributed to human error. As a part of Quality within space exploration ground processing operations, the underlying contributors and causes of human error must be identified and/or classified in order to manage human error. This presentation will provide a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS) as an analysis tool to identify contributing factors, their impact on human error events, and to predict the Human Error Probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mirjankar, Nikhil S.; Fraga, Carlos G.; Carman, April J.

    Chemical attribution signatures (CAS) for chemical threat agents (CTAs) are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. In a previous study, anionic impurity profiles developed using high performance ion chromatography (HPIC) were demonstrated as CAS for matching samples from eight potassium cyanide (KCN) stocks to their reported countries of origin. Herein, a larger number of solid KCN stocks (n = 13) and, for the first time, solid sodium cyanide (NaCN) stocks (n = 15) were examined to determine what additional sourcing information can be obtained through anion, carbon stable isotope, and elemental analyses of cyanide stocks by HPIC, isotope ratio mass spectrometry (IRMS), and inductively coupled plasma optical emission spectroscopy (ICP-OES), respectively. The HPIC anion data were evaluated using the variable selection methods of Fisher-ratio (F-ratio), interval partial least squares (iPLS), and genetic algorithm-based partial least squares (GAPLS), and the classification methods of partial least squares discriminant analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminant analysis (SVMDA). In summary, hierarchical cluster analysis (HCA) of anion impurity profiles from multiple cyanide stocks from six reported countries of origin resulted in cyanide samples clustering into three groups: Czech Republic, Germany, and United States, independent of the associated alkali metal (K or Na). The three country groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries with known solid cyanide factories. Both the anion and elemental CAS are believed to originate from the aqueous alkali hydroxides used in cyanide manufacture. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). The carbon isotope CAS is believed to originate from the carbon source and process used to make the HCN utilized in cyanide synthesis. Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability (so far) of using anion impurities for matching a cyanide sample to its country of manufacture (i.e., factory). Variable selection reduced errors for those classification methods having errors greater than zero, with iPLS-forward selection and F-ratio typically providing the lowest errors. Finally, using anion profiles to match cyanides to a specific stock or stock group resulted in cross-validation errors ranging from zero to 5.3%.

  16. Masked and unmasked error-related potentials during continuous control and feedback

    NASA Astrophysics Data System (ADS)

    Lopes Dias, Catarina; Sburlea, Andreea I.; Müller-Putz, Gernot R.

    2018-06-01

    The detection of error-related potentials (ErrPs) in tasks with discrete feedback is well established in the brain–computer interface (BCI) field. However, the decoding of ErrPs in tasks with continuous feedback is still in its early stages. Objective. We developed a task in which subjects have continuous control of a cursor’s position by means of a joystick. The cursor’s position was shown to the participants in two different modalities of continuous feedback: normal and jittered. The jittered feedback was created to mimic the instability that could exist if participants controlled the trajectory directly with brain signals. Approach. This paper studies the electroencephalographic (EEG)-measurable signatures caused by a loss of control over the cursor’s trajectory, causing a target miss. Main results. In both feedback modalities, time-locked potentials revealed the typical frontal-central components of error-related potentials. Errors occurring during the jittered feedback (masked errors) were delayed in comparison to errors occurring during normal feedback (unmasked errors). Masked errors displayed lower peak amplitudes than unmasked errors. Time-locked classification analysis allowed a good distinction between correct and error classes (average Cohen's kappa, average TPR = 81.8% and average TNR = 96.4%). Time-locked classification analysis between masked error and unmasked error classes revealed results at chance level (average Cohen's kappa, average TPR = 60.9% and average TNR = 58.3%). Afterwards, we performed asynchronous detection of ErrPs, combining both masked and unmasked trials. The asynchronous detection of ErrPs in a simulated online scenario resulted in an average TNR of 84.0% and in an average TPR of 64.9%. Significance. The time-locked classification results suggest that the masked and unmasked errors were indistinguishable in terms of classification. The asynchronous classification results suggest that the feedback modality did not hinder the asynchronous detection of ErrPs.

  17. Quantification of errors in ordinal outcome scales using shannon entropy: effect on sample size calculations.

    PubMed

    Mandava, Pitchaiah; Krumpelman, Chase S; Shah, Jharna N; White, Donna L; Kent, Thomas A

    2013-01-01

    Clinical trial outcomes often involve an ordinal scale of subjective functional assessments, but the optimal way to quantify results is not clear. In stroke, for the most commonly used scale, the modified Rankin Score (mRS), analysis over a range of scores ("Shift") is proposed as superior to dichotomization because of greater information transfer. The influence of known uncertainties in mRS assessment has not been quantified. We hypothesized that errors caused by uncertainties could be quantified by applying information theory. Using Shannon's model, we quantified errors of the "Shift" compared to dichotomized outcomes using published distributions of mRS uncertainties and applied this model to clinical trials. We identified 35 randomized stroke trials that met inclusion criteria. Each trial's mRS distribution was multiplied with the noise distribution from published mRS inter-rater variability to generate an error percentage for "shift" and dichotomized cut-points. For the SAINT I neuroprotectant trial, considered positive by "shift" mRS while the larger follow-up SAINT II trial was negative, we recalculated the sample size required if classification uncertainty was taken into account. Considering the full mRS range, the error rate was 26.1%±5.31 (Mean±SD). Error rates were lower for all dichotomizations tested using cut-points (e.g. mRS 1; 6.8%±2.89; overall p<0.001). Taking errors into account, SAINT I would have required 24% more subjects than were randomized. We show that when uncertainty in assessments is considered, the lowest error rates are with dichotomization. While using the full range of mRS is conceptually appealing, a gain of information is counter-balanced by a decrease in reliability. The resultant errors need to be considered since sample size may otherwise be underestimated. In principle, we have outlined an approach to error estimation for any condition in which there are uncertainties in outcome assessment. We provide the user with programs to calculate and incorporate errors into sample size estimation.
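
    As a rough illustration of the error model described above (with made-up numbers in place of the published mRS distributions and inter-rater data), the sketch below combines an outcome distribution with a rater noise matrix and shows why a dichotomized cut-point accumulates fewer misclassifications than the full ordinal scale.

```python
import numpy as np

# Illustrative sketch (not the paper's actual data): combine a trial's mRS
# outcome distribution with an inter-rater "noise" matrix to estimate the
# probability that a recorded score differs from the true score.

# Hypothetical distribution over mRS grades 0..6 for one trial arm.
p_true = np.array([0.10, 0.15, 0.15, 0.20, 0.20, 0.10, 0.10])

# Hypothetical inter-rater confusion matrix: noise[i, j] = P(rated j | true i).
# Rows sum to 1; most mass sits on the diagonal.
noise = np.full((7, 7), 0.02)
np.fill_diagonal(noise, 1.0 - 0.02 * 6)

# Error over the full ordinal scale: any off-diagonal assignment is an error.
full_scale_error = np.sum(p_true[:, None] * noise * (1 - np.eye(7)))

# Error for a dichotomized outcome at cut-point mRS <= 1 versus mRS >= 2:
# only misratings that cross the cut-point count as errors.
good = np.arange(7) <= 1
crosses_cut = good[:, None] != good[None, :]
dichotomized_error = np.sum(p_true[:, None] * noise * crosses_cut)

print(f"full-scale error:   {full_scale_error:.3f}")
print(f"dichotomized error: {dichotomized_error:.3f}")
```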

  18. Prediction of wastewater quality indicators at the inflow to the wastewater treatment plant using data mining methods

    NASA Astrophysics Data System (ADS)

    Szeląg, Bartosz; Barbusiński, Krzysztof; Studziński, Jan; Bartkiewicz, Lidia

    2017-11-01

    In the study, models developed using data mining methods are proposed for predicting wastewater quality indicators: biochemical and chemical oxygen demand, total suspended solids, total nitrogen and total phosphorus at the inflow to a wastewater treatment plant (WWTP). The models are based on values measured in previous time steps and daily wastewater inflows. Also, independent prediction systems that can be used in case of monitoring device malfunction are provided. Models of wastewater quality indicators were developed using the MARS (multivariate adaptive regression spline) method, artificial neural networks (ANN) of the multilayer perceptron type combined with a self-organizing map (SOM) classification model, and cascade neural networks (CNN). The lowest values of absolute and relative errors were obtained using ANN+SOM, whereas the MARS method produced the highest error values. It was shown that for the analysed WWTP it is possible to obtain continuous prediction of selected wastewater quality indicators using the two developed independent prediction systems. Such models can ensure reliable WWTP operation when wastewater quality monitoring systems become inoperable or are under maintenance.
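
    The sketch below illustrates the general prediction set-up described above on synthetic data (the indicator series, inflows, network size, and error measures are placeholders, not the study's): a multilayer perceptron is trained on lagged indicator values and daily inflow, and absolute and relative errors are reported.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Minimal sketch (synthetic series, not the study's measurements): predict a
# wastewater quality indicator from its previous time steps and the daily
# inflow, then report the absolute and relative prediction errors.
rng = np.random.default_rng(0)
n = 500
inflow = rng.normal(10_000, 2_000, n)              # daily inflow, m3/d
cod = 600 + 0.01 * inflow + rng.normal(0, 30, n)   # synthetic COD series, mg/l

# Features: indicator at t-1 and t-2 plus inflow at t; target: indicator at t.
X = np.column_stack([cod[1:-1], cod[:-2], inflow[2:]])
y = cod[2:]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000,
                                   random_state=0))
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
print(f"MAE: {mean_absolute_error(y_te, pred):.1f} mg/l, "
      f"mean relative error: {np.mean(np.abs(pred - y_te) / y_te) * 100:.1f}%")
```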

  19. Detection of Butter Adulteration with Lard by Employing (1)H-NMR Spectroscopy and Multivariate Data Analysis.

    PubMed

    Fadzillah, Nurrulhidayah Ahmad; Man, Yaakob bin Che; Rohman, Abdul; Rosman, Arieff Salleh; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi

    2015-01-01

    The authentication of food products with respect to the presence of components not allowed for certain religions, such as lard, is very important. In this study, we used proton Nuclear Magnetic Resonance ((1)H-NMR) spectroscopy for the analysis of butter adulterated with lard by simultaneous quantification of all proton-bearing compounds, and consequently all relevant sample classes. Since the spectra obtained were too complex to be analyzed visually by the naked eye, classification of the spectra was carried out. The multivariate calibration of partial least squares (PLS) regression was used for modelling the relationship between the actual and predicted values of lard. The model yielded the highest regression coefficient (R(2)) of 0.998 and the lowest root mean square error of calibration (RMSEC) of 0.0091% and root mean square error of prediction (RMSEP) of 0.0090, respectively. Cross-validation testing evaluated the predictive power of the model. The PLS model was shown to be a good model, as the intercepts of R(2)Y and Q(2)Y were 0.0853 and -0.309, respectively.
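
    A minimal sketch of the PLS calibration/validation workflow described above, using synthetic "spectra" rather than the paper's (1)H-NMR data; RMSEC is computed on the calibration set and RMSEP on a held-out set.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Minimal sketch: regress adulterant level on spectral intensities with PLS
# and report the root mean square errors of calibration (RMSEC) and
# prediction (RMSEP). All numbers here are synthetic placeholders.
rng = np.random.default_rng(1)
n_samples, n_points = 60, 200
conc = rng.uniform(0, 10, n_samples)                   # % lard in butter (toy)
signature = rng.normal(size=n_points)                  # spectral response of lard
spectra = np.outer(conc, signature) + rng.normal(scale=0.5,
                                                 size=(n_samples, n_points))

X_cal, X_val, y_cal, y_val = train_test_split(spectra, conc,
                                              test_size=0.3, random_state=1)
pls = PLSRegression(n_components=5).fit(X_cal, y_cal)

rmsec = mean_squared_error(y_cal, pls.predict(X_cal).ravel()) ** 0.5
rmsep = mean_squared_error(y_val, pls.predict(X_val).ravel()) ** 0.5
print(f"R2 (calibration): {r2_score(y_cal, pls.predict(X_cal).ravel()):.3f}")
print(f"RMSEC: {rmsec:.3f}  RMSEP: {rmsep:.3f}")
```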

  20. On the use of interaction error potentials for adaptive brain computer interfaces.

    PubMed

    Llera, A; van Gerven, M A J; Gómez, V; Jensen, O; Kappen, H J

    2011-12-01

    We propose an adaptive classification method for the Brain Computer Interfaces (BCI) which uses Interaction Error Potentials (IErrPs) as a reinforcement signal and adapts the classifier parameters when an error is detected. We analyze the quality of the proposed approach in relation to the misclassification of the IErrPs. In addition we compare static versus adaptive classification performance using artificial and MEG data. We show that the proposed adaptive framework significantly improves the static classification methods. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. A neural network for noise correlation classification

    NASA Astrophysics Data System (ADS)

    Paitz, Patrick; Gokhberg, Alexey; Fichtner, Andreas

    2018-02-01

    We present an artificial neural network (ANN) for the classification of ambient seismic noise correlations into two categories, suitable and unsuitable for noise tomography. By using only a small manually classified data subset for network training, the ANN allows us to classify large data volumes with low human effort and to encode the valuable subjective experience of data analysts that cannot be captured by a deterministic algorithm. Based on a new feature extraction procedure that exploits the wavelet-like nature of seismic time-series, we efficiently reduce the dimensionality of noise correlation data while still keeping the relevant features needed for automated classification. Using global- and regional-scale data sets, we show that classification errors of 20 per cent or less can be achieved when the network training is performed with as little as 3.5 per cent and 16 per cent of the data sets, respectively. Furthermore, the ANN trained on the regional data can be applied to the global data, and vice versa, without a significant increase of the classification error. An experiment where four students manually classified the data revealed that the classification error they would assign to each other is substantially larger than the classification error of the ANN (>35 per cent). This indicates that reproducibility would be hampered more by human subjectivity than by imperfections of the ANN.

  2. Analyzing thematic maps and mapping for accuracy

    USGS Publications Warehouse

    Rosenfield, G.H.

    1982-01-01

    Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both of these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given classification category. The results of an accuracy test are presented in a contingency table, sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors of commission, and the remaining elements of the columns represent errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map, or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step in data analysis techniques would be to use the entire classification error matrices with the methods of discrete multivariate analysis or of multivariate analysis of variance.
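
    The sketch below illustrates the error-matrix bookkeeping described above with made-up counts: overall accuracy from the diagonal, and per-category errors of commission (rows) and omission (columns).

```python
import numpy as np

# Minimal sketch: given a classification error matrix (rows = interpretation,
# columns = verification), compute overall accuracy and the per-category
# errors of commission and omission. Counts are illustrative only.
error_matrix = np.array([
    [50,  3,  2],   # category A as interpreted
    [ 4, 40,  6],   # category B
    [ 1,  5, 39],   # category C
])

correct = np.diag(error_matrix)
overall_accuracy = correct.sum() / error_matrix.sum()

# Commission: off-diagonal counts in each row (interpreted as the category
# but verified as something else). Omission: off-diagonal counts in each
# column (verified as the category but interpreted as something else).
commission = 1 - correct / error_matrix.sum(axis=1)
omission = 1 - correct / error_matrix.sum(axis=0)

print(f"overall accuracy: {overall_accuracy:.2%}")
print("commission error per category:", np.round(commission, 3))
print("omission error per category:  ", np.round(omission, 3))
```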

  3. Analysis of DSN software anomalies

    NASA Technical Reports Server (NTRS)

    Galorath, D. D.; Hecht, H.; Hecht, M.; Reifer, D. J.

    1981-01-01

    A categorized data base of software errors which were discovered during the various stages of development and operational use of the Deep Space Network DSN/Mark 3 System was developed. A study team identified several existing error classification schemes (taxonomies), prepared a detailed annotated bibliography of the error taxonomy literature, and produced a new classification scheme which was tuned to the DSN anomaly reporting system and encapsulated the work of others. Based upon the DSN/RCI error taxonomy, error data on approximately 1000 reported DSN/Mark 3 anomalies were analyzed, interpreted and classified. Next, the error data were summarized and histograms were produced highlighting key tendencies.

  4. A Modified MinMax k-Means Algorithm Based on PSO.

    PubMed

    Wang, Xiaoyan; Bai, Yanping

    The MinMax k-means algorithm is widely used to tackle the effect of bad initialization by minimizing the maximum intraclustering errors. Two parameters, including the exponent parameter and memory parameter, are involved in the executive process. Since different parameters have different clustering errors, it is crucial to choose appropriate parameters. In the original algorithm, a practical framework is given. Such a framework extends the MinMax k-means to automatically adapt the exponent parameter to the data set. It has been believed that if the maximum exponent parameter has been set, then the programme can reach the lowest intraclustering errors. However, our experiments show that this is not always correct. In this paper, we modified the MinMax k-means algorithm by PSO to determine the proper values of parameters which enable the algorithm to attain the lowest clustering errors. The proposed clustering method is tested on several widely used data sets in several different initial situations and is compared to the k-means algorithm and the original MinMax k-means algorithm. The experimental results indicate that our proposed algorithm can reach the lowest clustering errors automatically.

  5. Neyman-Pearson classification algorithms and NP receiver operating characteristics

    PubMed Central

    Tong, Xin; Feng, Yang; Li, Jingyi Jessica

    2018-01-01

    In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies. PMID:29423442
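
    A hedged sketch of the order-statistic idea behind the umbrella algorithm (the authors' reference implementation is the R package nproc; the logistic-regression scorer, the synthetic data, and the exact bound below are illustrative assumptions, not the paper's code): a threshold is chosen from left-out class 0 scores so that the type I error exceeds alpha with probability at most delta.

```python
import numpy as np
from scipy.stats import binom
from sklearn.linear_model import LogisticRegression

# Sketch only: scores of a held-out class-0 sample calibrate a threshold
# whose type I error exceeds alpha with probability at most delta.
rng = np.random.default_rng(2)
alpha, delta = 0.05, 0.05

# Synthetic two-class data: class 0 (e.g. healthy) and class 1 (e.g. disease).
X0 = rng.normal(0.0, 1.0, (400, 5))
X1 = rng.normal(0.8, 1.0, (400, 5))

# Split class 0: one half trains the scorer, the other calibrates the cutoff.
X0_train, X0_cal = X0[:200], X0[200:]
clf = LogisticRegression().fit(
    np.vstack([X0_train, X1]), np.r_[np.zeros(200), np.ones(400)])

scores0 = np.sort(clf.predict_proba(X0_cal)[:, 1])
n = len(scores0)

# If the threshold is the k-th smallest of n i.i.d. class-0 scores, the
# probability that the type I error exceeds alpha is P(Binom(n, 1-alpha) >= k);
# pick the smallest k for which this is at most delta.
ks = np.arange(1, n + 1)
violation = binom.sf(ks - 1, n, 1 - alpha)
k_star = ks[violation <= delta][0]
threshold = scores0[k_star - 1]
print(f"chosen rank {k_star} of {n}, threshold = {threshold:.3f}")
```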

  6. Neyman-Pearson classification algorithms and NP receiver operating characteristics.

    PubMed

    Tong, Xin; Feng, Yang; Li, Jingyi Jessica

    2018-02-01

    In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies.

  7. Evaluation of normalization methods for cDNA microarray data by k-NN classification

    PubMed Central

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-01-01

    Background Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Conclusion Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics. PMID:16045803
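
    A minimal sketch of the evaluation end-point used above, on synthetic stand-ins for normalized expression profiles: the LOOCV error of a k-NN classifier, which would be computed once per candidate normalization method and compared.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Minimal sketch: leave-one-out cross-validation error of a k-NN classifier.
# The arrays below are synthetic stand-ins for normalized microarray profiles.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (30, 50)), rng.normal(0.6, 1, (30, 50))])
y = np.r_[np.zeros(30), np.ones(30)]

def loocv_error(X, y, k=3):
    """LOOCV error rate of a k-nearest-neighbor classifier."""
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y,
                          cv=LeaveOneOut())
    return 1.0 - acc.mean()

# A normalization method would be applied to X before this call; comparing
# the resulting error rates ranks the candidate methods.
print(f"LOOCV error: {loocv_error(X, y):.3f}")
```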

  8. Evaluation of normalization methods for cDNA microarray data by k-NN classification.

    PubMed

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-07-26

    Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.

  9. Bayesian learning for spatial filtering in an EEG-based brain-computer interface.

    PubMed

    Zhang, Haihong; Yang, Huijuan; Guan, Cuntai

    2013-07-01

    Spatial filtering for EEG feature extraction and classification is an important tool in brain-computer interface. However, there is generally no established theory that links spatial filtering directly to Bayes classification error. To address this issue, this paper proposes and studies a Bayesian analysis theory for spatial filtering in relation to Bayes error. Following the maximum entropy principle, we introduce a gamma probability model for describing single-trial EEG power features. We then formulate and analyze the theoretical relationship between Bayes classification error and the so-called Rayleigh quotient, which is a function of spatial filters and basically measures the ratio in power features between two classes. This paper also reports our extensive study that examines the theory and its use in classification, using three publicly available EEG data sets and state-of-the-art spatial filtering techniques and various classifiers. Specifically, we validate the positive relationship between Bayes error and Rayleigh quotient in real EEG power features. Finally, we demonstrate that the Bayes error can be practically reduced by applying a new spatial filter with lower Rayleigh quotient.
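
    The sketch below illustrates the Rayleigh quotient on synthetic class covariance matrices; the CSP-style generalized eigendecomposition is a common way to obtain such filters and is an assumption here, not necessarily the paper's procedure.

```python
import numpy as np
from scipy.linalg import eigh

# Minimal sketch (synthetic covariances, not the paper's EEG data): the
# Rayleigh quotient of a spatial filter w is the ratio of band power it
# passes for class 1 versus class 2; more extreme quotients separate the
# classes better, which the paper links to a lower Bayes error.
rng = np.random.default_rng(4)
A = rng.normal(size=(8, 8))
C1 = A @ A.T + 8 * np.eye(8)          # class-1 spatial covariance
B = rng.normal(size=(8, 8))
C2 = B @ B.T + np.eye(8)              # class-2 spatial covariance

def rayleigh_quotient(w, C1, C2):
    return (w @ C1 @ w) / (w @ C2 @ w)

# CSP-style filters: generalized eigenvectors of (C1, C1 + C2); extreme
# eigenvalues correspond to the most discriminative Rayleigh quotients.
eigvals, eigvecs = eigh(C1, C1 + C2)
w_best = eigvecs[:, -1]               # filter maximizing the class-1 share
print(f"Rayleigh quotient of best filter: "
      f"{rayleigh_quotient(w_best, C1, C2):.2f}")
```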

  10. Sources of error in estimating truck traffic from automatic vehicle classification data

    DOT National Transportation Integrated Search

    1998-10-01

    Truck annual average daily traffic estimation errors resulting from sample classification counts are computed in this paper under two scenarios. One scenario investigates an improper factoring procedure that may be used by highway agencies. The study...

  11. Assessing the statistical significance of the achieved classification error of classifiers constructed using serum peptide profiles, and a prescription for random sampling repeated studies for massive high-throughput genomic and proteomic studies.

    PubMed

    Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos

    2005-01-01

    Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.
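
    A hedged sketch of the permutation idea behind PACE (not the authors' implementation): the achieved classification error on the true labels is compared with the errors achieved after the labels are randomly reshuffled, giving an empirical p-value.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Sketch only: synthetic stand-ins for peptide profiles and class labels.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 1, (40, 100)), rng.normal(0.4, 1, (40, 100))])
y = np.r_[np.zeros(40), np.ones(40)]

def cv_error(X, y):
    """Cross-validated classification error of an SVM classifier."""
    return 1.0 - cross_val_score(SVC(), X, y, cv=5).mean()

ace = cv_error(X, y)   # achieved classification error on the true labels
perm_errors = np.array([cv_error(X, rng.permutation(y)) for _ in range(200)])

# Empirical p-value: how often does a random labelling do at least as well?
p_value = (np.sum(perm_errors <= ace) + 1) / (len(perm_errors) + 1)
print(f"ACE = {ace:.3f}, permutation p-value = {p_value:.3f}")
```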

  12. Automated Classification of Phonological Errors in Aphasic Language

    PubMed Central

    Ahuja, Sanjeev B.; Reggia, James A.; Berndt, Rita S.

    1984-01-01

    Using heuristically-guided state space search, a prototype program has been developed to simulate and classify phonemic errors occurring in the speech of neurologically-impaired patients. Simulations are based on an interchangeable rule/operator set of elementary errors which represent a theory of phonemic processing faults. This work introduces and evaluates a novel approach to error simulation and classification, it provides a prototype simulation tool for neurolinguistic research, and it forms the initial phase of a larger research effort involving computer modelling of neurolinguistic processes.

  13. ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.

    USGS Publications Warehouse

    Rosenfield, George H.; Fitzpatrick-Lins, Katherine

    1984-01-01

    Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the correct counts, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0.51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.
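
    The sketch below computes an overall coefficient of agreement (Cohen's kappa) and one common formulation of the conditional, per-category coefficient from an illustrative error matrix; the counts are not the study's data.

```python
import numpy as np

# Minimal sketch: overall and conditional agreement from a classification
# error matrix (rows = interpretation, columns = verification).
m = np.array([
    [35,  4,  1,  0],
    [ 6, 30,  3,  2],
    [ 2,  5, 28,  4],
    [ 1,  2,  3, 24],
], dtype=float)

n = m.sum()
po = np.trace(m) / n                          # observed proportion correct
pe = (m.sum(axis=1) @ m.sum(axis=0)) / n**2   # chance-expected proportion
kappa = (po - pe) / (1 - pe)

# Conditional kappa for each interpreted (row) category, one common form:
# (p_ii - p_i. * p_.i) / (p_i. - p_i. * p_.i)
row, col = m.sum(axis=1) / n, m.sum(axis=0) / n
cond_kappa = (np.diag(m) / n - row * col) / (row - row * col)

print(f"overall kappa: {kappa:.3f}")
print("conditional kappa per category:", np.round(cond_kappa, 3))
```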

  14. New wideband radar target classification method based on neural learning and modified Euclidean metric

    NASA Astrophysics Data System (ADS)

    Jiang, Yicheng; Cheng, Ping; Ou, Yangkui

    2001-09-01

    A new method for target classification with high-range-resolution radar is proposed. It tries to use neural learning to obtain invariant subclass features of training range profiles. A modified Euclidean metric based on the Box-Cox transformation technique is investigated for improving nearest-neighbor target classification. Classification experiments using real radar data from three different aircraft demonstrated that the classification error can be reduced by 8% if the method proposed in this paper is chosen instead of the conventional method. The results show that, by choosing an optimized metric, it is indeed possible to reduce the classification error without increasing the number of samples.

  15. What Do Spelling Errors Tell Us? Classification and Analysis of Errors Made by Greek Schoolchildren with and without Dyslexia

    ERIC Educational Resources Information Center

    Protopapas, Athanassios; Fakou, Aikaterini; Drakopoulou, Styliani; Skaloumbakas, Christos; Mouzaki, Angeliki

    2013-01-01

    In this study we propose a classification system for spelling errors and determine the most common spelling difficulties of Greek children with and without dyslexia. Spelling skills of 542 children from the general population and 44 children with dyslexia, Grades 3-4 and 7, were assessed with a dictated common word list and age-appropriate…

  16. PCA feature extraction for change detection in multidimensional unlabeled data.

    PubMed

    Kuncheva, Ludmila I; Faithfull, William J

    2014-01-01

    When classifiers are deployed in real-world applications, it is assumed that the distribution of the incoming data matches the distribution of the data used to train the classifier. This assumption is often incorrect, which necessitates some form of change detection or adaptive classification. While there has been a lot of work on change detection based on the classification error monitored over the course of the operation of the classifier, finding changes in multidimensional unlabeled data is still a challenge. Here, we propose to apply principal component analysis (PCA) for feature extraction prior to the change detection. Supported by a theoretical example, we argue that the components with the lowest variance should be retained as the extracted features because they are more likely to be affected by a change. We chose a recently proposed semiparametric log-likelihood change detection criterion that is sensitive to changes in both mean and variance of the multidimensional distribution. An experiment with 35 datasets and an illustration with a simple video segmentation demonstrate the advantage of using extracted features compared to raw data. Further analysis shows that feature extraction through PCA is beneficial, specifically for data with multiple balanced classes.
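
    A hedged sketch of the proposal above on synthetic data: principal components are fitted on a reference window, the lowest-variance components are retained, and a simple normalized-power statistic (standing in for the semiparametric log-likelihood criterion used in the paper) is monitored on new windows.

```python
import numpy as np
from sklearn.decomposition import PCA

# Sketch only: keep the lowest-variance principal components of a reference
# window and monitor a simple statistic on new windows.
rng = np.random.default_rng(6)
scales = np.array([5, 5, 3, 3, 2, 2, 1, 1, 0.5, 0.2])
reference = rng.normal(size=(500, 10)) * scales
new_same = rng.normal(size=(500, 10)) * scales     # same distribution
new_shifted = rng.normal(size=(500, 10)) * scales
new_shifted[:, -1] += 1.0                          # change in a low-variance direction

pca = PCA().fit(reference)
low_var = pca.components_[-3:]                     # three lowest-variance directions

def change_statistic(window):
    """Mean squared projection onto the retained low-variance components,
    normalized by their reference variances (about 1 when nothing changed)."""
    proj = (window - pca.mean_) @ low_var.T
    return float(np.mean(proj**2 / pca.explained_variance_[-3:]))

print(f"statistic, unchanged window: {change_statistic(new_same):.2f}")
print(f"statistic, shifted window:   {change_statistic(new_shifted):.2f}")
```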

  17. Effectiveness of Global Features for Automatic Medical Image Classification and Retrieval – the experiences of OHSU at ImageCLEFmed

    PubMed Central

    Kalpathy-Cramer, Jayashree; Hersh, William

    2008-01-01

    In 2006 and 2007, Oregon Health & Science University (OHSU) participated in the automatic image annotation task for medical images at ImageCLEF, an annual international benchmarking event that is part of the Cross Language Evaluation Forum (CLEF). The goal of the automatic annotation task was to classify 1000 test images based on the Image Retrieval in Medical Applications (IRMA) code, given a set of 10,000 training images. There were 116 distinct classes in 2006 and 2007. We evaluated the efficacy of a variety of primarily global features for this classification task. These included features based on histograms, gray level correlation matrices and the gist technique. A multitude of classifiers including k-nearest neighbors, two-level neural networks, support vector machines, and maximum likelihood classifiers were evaluated. Our official error rate for the 1000 test images was 26% in 2006 using the flat classification structure. The error count in 2007 was 67.8 using the hierarchical classification error computation based on the IRMA code. Confusion matrices as well as clustering experiments were used to identify visually similar classes. The use of the IRMA code did not help us in the classification task, as the semantic hierarchy of the IRMA classes did not correspond well with the hierarchy based on clustering of image features that we used. Our most frequent misclassification errors were along the view axis. Subsequent experiments based on a two-stage classification system decreased our error rate to 19.8% for the 2006 dataset and our error count to 55.4 for the 2007 data. PMID:19884953

  18. Error-related brain activity and error awareness in an error classification paradigm.

    PubMed

    Di Gregorio, Francesco; Steinhauser, Marco; Maier, Martin E

    2016-10-01

    Error-related brain activity has been linked to error detection enabling adaptive behavioral adjustments. However, it is still unclear which role error awareness plays in this process. Here, we show that the error-related negativity (Ne/ERN), an event-related potential reflecting early error monitoring, is dissociable from the degree of error awareness. Participants responded to a target while ignoring two different incongruent distractors. After responding, they indicated whether they had committed an error, and if so, whether they had responded to one or to the other distractor. This error classification paradigm allowed distinguishing partially aware errors, (i.e., errors that were noticed but misclassified) and fully aware errors (i.e., errors that were correctly classified). The Ne/ERN was larger for partially aware errors than for fully aware errors. Whereas this speaks against the idea that the Ne/ERN foreshadows the degree of error awareness, it confirms the prediction of a computational model, which relates the Ne/ERN to post-response conflict. This model predicts that stronger distractor processing - a prerequisite of error classification in our paradigm - leads to lower post-response conflict and thus a smaller Ne/ERN. This implies that the relationship between Ne/ERN and error awareness depends on how error awareness is related to response conflict in a specific task. Our results further indicate that the Ne/ERN but not the degree of error awareness determines adaptive performance adjustments. Taken together, we conclude that the Ne/ERN is dissociable from error awareness and foreshadows adaptive performance adjustments. Our results suggest that the relationship between the Ne/ERN and error awareness is correlative and mediated by response conflict. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Toward attenuating the impact of arm positions on electromyography pattern-recognition based motion classification in transradial amputees

    PubMed Central

    2012-01-01

    Background Electromyography (EMG) pattern-recognition based control strategies for multifunctional myoelectric prosthesis systems have been studied commonly in a controlled laboratory setting. Before these myoelectric prosthesis systems are clinically viable, it will be necessary to assess the effect of some disparities between the ideal laboratory setting and practical use on the control performance. One important obstacle is the impact of arm position variation that causes changes in the EMG pattern when performing identical motions in different arm positions. This study aimed to investigate the impacts of arm position variation on EMG pattern-recognition based motion classification in upper-limb amputees and the solutions for reducing these impacts. Methods With five unilateral transradial (TR) amputees, the EMG signals and tri-axial accelerometer mechanomyography (ACC-MMG) signals were simultaneously collected from both amputated and intact arms when performing six classes of arm and hand movements in each of five arm positions that were considered in the study. The effect of the arm position changes was estimated in terms of motion classification error and compared between amputated and intact arms. Then the performance of three proposed methods in attenuating the impact of arm positions was evaluated. Results With EMG signals, the average intra-position and inter-position classification errors across all five arm positions and five subjects were around 7.3% and 29.9% from amputated arms, respectively, about 1.0% and 10% lower, respectively, than those from intact arms. While ACC-MMG signals could yield an intra-position classification error (9.9%) similar to that of EMG, they had a much higher inter-position classification error with an average value of 81.1% over the arm positions and the subjects. When the EMG data from all five arm positions were involved in the training set, the average classification error reached a value of around 10.8% for amputated arms. Using a two-stage cascade classifier, the average classification error was around 9.0% over all five arm positions. Reducing ACC-MMG channels from 8 to 2 only increased the average position classification error across all five arm positions from 0.7% to 1.0% in amputated arms. Conclusions The performance of the EMG pattern-recognition based method in classifying movements strongly depends on arm positions. This dependency is a little stronger in the intact arm than in the amputated arm, which suggests that investigations associated with practical use of a myoelectric prosthesis should use limb amputees as subjects instead of able-bodied subjects. The two-stage cascade classifier mode with ACC-MMG for limb position identification and EMG for limb motion classification may be a promising way to reduce the effect of limb position variation on classification performance. PMID:23036049

  20. A Modified MinMax k-Means Algorithm Based on PSO

    PubMed Central

    2016-01-01

    The MinMax k-means algorithm is widely used to tackle the effect of bad initialization by minimizing the maximum intraclustering errors. Two parameters, including the exponent parameter and memory parameter, are involved in the executive process. Since different parameters have different clustering errors, it is crucial to choose appropriate parameters. In the original algorithm, a practical framework is given. Such a framework extends the MinMax k-means to automatically adapt the exponent parameter to the data set. It has been believed that if the maximum exponent parameter has been set, then the programme can reach the lowest intraclustering errors. However, our experiments show that this is not always correct. In this paper, we modified the MinMax k-means algorithm by PSO to determine the proper values of parameters which enable the algorithm to attain the lowest clustering errors. The proposed clustering method is tested on several widely used data sets in several different initial situations and is compared to the k-means algorithm and the original MinMax k-means algorithm. The experimental results indicate that our proposed algorithm can reach the lowest clustering errors automatically. PMID:27656201

  1. Algorithmic Classification of Five Characteristic Types of Paraphasias.

    PubMed

    Fergadiotis, Gerasimos; Gorman, Kyle; Bedrick, Steven

    2016-12-01

    This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.
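
    A hedged sketch of the three-step decision logic described above, with toy stand-ins: a tiny frequency list in place of SUBTLEXus, generic string similarity in place of the phonological criterion, and hand-made vectors in place of word2vec embeddings; the thresholds and helper names are illustrative only, not the authors' tool.

```python
import numpy as np
from difflib import SequenceMatcher

# Toy lexicon and toy "embeddings" standing in for SUBTLEXus and word2vec.
word_freq = {"cat": 100, "hat": 80, "dog": 90, "animal": 60}
vectors = {
    "cat": np.array([0.9, 0.1, 0.2]),
    "dog": np.array([0.8, 0.2, 0.3]),
    "hat": np.array([0.1, 0.9, 0.1]),
}

def classify_paraphasia(target, response,
                        phon_threshold=0.5, sem_threshold=0.7):
    if response not in word_freq:                  # lexicality check
        return "neologistic/nonword"
    phon = SequenceMatcher(None, target, response).ratio()
    a, b = vectors.get(target), vectors.get(response)
    sem = 0.0
    if a is not None and b is not None:            # cosine similarity
        sem = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    if phon >= phon_threshold and sem >= sem_threshold:
        return "mixed"
    if phon >= phon_threshold:
        return "formal"
    if sem >= sem_threshold:
        return "semantic"
    return "unrelated"

print(classify_paraphasia("cat", "hat"))   # form-related real word -> formal
print(classify_paraphasia("cat", "dog"))   # meaning-related word -> semantic
print(classify_paraphasia("cat", "cag"))   # nonword -> neologistic/nonword
```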

  2. Human error analysis of commercial aviation accidents using the human factors analysis and classification system (HFACS)

    DOT National Transportation Integrated Search

    2001-02-01

    The Human Factors Analysis and Classification System (HFACS) is a general human error framework : originally developed and tested within the U.S. military as a tool for investigating and analyzing the human : causes of aviation accidents. Based upon ...

  3. Current Assessment and Classification of Suicidal Phenomena using the FDA 2012 Draft Guidance Document on Suicide Assessment: A Critical Review.

    PubMed

    Sheehan, David V; Giddens, Jennifer M; Sheehan, Kathy Harnett

    2014-09-01

    Standard international classification criteria require that classification categories be comprehensive to avoid type II error. Categories should be mutually exclusive and definitions should be clear and unambiguous (to avoid type I and type II errors). In addition, the classification system should be robust enough to last over time and provide comparability between data collections. This article was designed to evaluate the extent to which the classification system contained in the United States Food and Drug Administration 2012 Draft Guidance for the prospective assessment and classification of suicidal ideation and behavior in clinical trials meets these criteria. A critical review is used to assess the extent to which the proposed categories contained in the Food and Drug Administration 2012 Draft Guidance are comprehensive, unambiguous, and robust. Assumptions that underlie the classification system are also explored. The Food and Drug Administration classification system contained in the 2012 Draft Guidance does not capture the full range of suicidal ideation and behavior (type II error). Definitions, moreover, are frequently ambiguous (susceptible to multiple interpretations), and the potential for misclassification (type I and type II errors) is compounded by frequent mismatches in category titles and definitions. These issues have the potential to compromise data comparability within clinical trial sites, across sites, and over time. These problems need to be remedied because of the potential for flawed data output and consequent threats to public health, to research on the safety of medications, and to the search for effective medication treatments for suicidality.

  4. An experimental study of interstitial lung tissue classification in HRCT images using ANN and role of cost functions

    NASA Astrophysics Data System (ADS)

    Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen

    2017-03-01

    In this paper, we investigate the effect of the error criterion used during the training phase of an artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected by Interstitial Lung Diseases (ILD). The mean square error (MSE) and cross-entropy (CE) criteria are chosen, being the most popular choices in state-of-the-art implementations. The classification experiment was performed on six interstitial lung disease (ILD) patterns, viz. consolidation, emphysema, ground glass opacity, micronodules, fibrosis, and healthy tissue, from the MedGIFT database. The texture features from an arbitrary region of interest (AROI) are extracted using Gabor filters. Two different neural networks are trained with the scaled conjugate gradient backpropagation algorithm, with the MSE and CE error criterion functions, respectively, used for weight updating. Performance is evaluated in terms of the average accuracy of these classifiers using 4-fold cross-validation. Each network is trained five times for each fold with randomly initialized weight vectors and the accuracies are computed. A significant improvement in classification accuracy is observed when the ANN is trained using CE (67.27%) as the error function compared to MSE (63.60%). Moreover, the standard deviation of the classification accuracy for the network trained with the CE (6.69) error criterion is found to be lower than for the network trained with the MSE (10.32) criterion.
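
    The textbook gradient comparison below (not the paper's experiment) illustrates one reason the error criterion matters: for a sigmoid output unit, the MSE gradient carries an extra y(1-y) factor that vanishes when the unit is confidently wrong, whereas the cross-entropy gradient stays proportional to the error.

```python
import numpy as np

# For a sigmoid output y = sigmoid(z) with target t:
#   d/dz 0.5*(y - t)^2                      = (y - t) * y * (1 - y)
#   d/dz -[t*log(y) + (1 - t)*log(1 - y)]   = (y - t)
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

t = 1.0                                   # target class
for z in [-4.0, -2.0, 0.0, 2.0]:          # pre-activation of the output unit
    y = sigmoid(z)
    grad_mse = (y - t) * y * (1 - y)
    grad_ce = y - t
    print(f"z={z:+.1f}  y={y:.3f}  dMSE/dz={grad_mse:+.4f}  dCE/dz={grad_ce:+.4f}")
```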

  5. Particle Swarm Optimization approach to defect detection in armour ceramics.

    PubMed

    Kesharaju, Manasa; Nagarajah, Romesh

    2017-03-01

    In this research, various extracted features were used in the development of an automated ultrasonic sensor based inspection system that enables defect classification in each ceramic component prior to despatch to the field. Classification is an important task, and the large number of irrelevant, redundant features commonly introduced to a dataset reduces the classifier's performance. Feature selection aims to reduce the dimensionality of the dataset while improving the performance of a classification system. In the context of a multi-criteria optimization problem (i.e. to minimize the classification error rate and reduce the number of features) such as the one discussed in this research, the literature suggests that evolutionary algorithms offer good results. In addition, Particle Swarm Optimization (PSO) has been little explored in the field of classification of high-frequency ultrasonic signals. Hence, a binary coded Particle Swarm Optimization (BPSO) technique is investigated for the implementation of feature subset selection and to optimize the classification error rate. In the proposed method, the population data is used as input to an Artificial Neural Network (ANN) based classification system to obtain the error rate, as the ANN serves as the evaluator of the PSO fitness function. Copyright © 2016. Published by Elsevier B.V.

  6. The impact of OCR accuracy on automated cancer classification of pathology reports.

    PubMed

    Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle

    2012-01-01

    To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classifications from a human-amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to have minimal impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of free-text pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.

  7. Error Detection in Mechanized Classification Systems

    ERIC Educational Resources Information Center

    Hoyle, W. G.

    1976-01-01

    When documentary material is indexed by a mechanized classification system, and the results judged by trained professionals, the number of documents in disagreement, after suitable adjustment, defines the error rate of the system. In a test case disagreement was 22 percent and, of this 22 percent, the computer correctly identified two-thirds of…

  8. Privacy-Preserving Evaluation of Generalization Error and Its Application to Model and Attribute Selection

    NASA Astrophysics Data System (ADS)

    Sakuma, Jun; Wright, Rebecca N.

    Privacy-preserving classification is the task of learning or training a classifier on the union of privately distributed datasets without sharing the datasets. The emphasis of existing studies in privacy-preserving classification has primarily been put on the design of privacy-preserving versions of particular data mining algorithms. However, in classification problems, preprocessing and postprocessing, such as model selection or attribute selection, play a prominent role in achieving higher classification accuracy. In this paper, we show that the generalization error of classifiers in privacy-preserving classification can be securely evaluated without sharing prediction results. Our main technical contribution is a new generalized Hamming distance protocol that is universally applicable to the preprocessing and postprocessing of various privacy-preserving classification problems, such as model selection in support vector machines and attribute selection in naive Bayes classification.

  9. Information analysis of a spatial database for ecological land classification

    NASA Technical Reports Server (NTRS)

    Davis, Frank W.; Dozier, Jeff

    1990-01-01

    An ecological land classification was developed for a complex region in southern California using geographic information system techniques of map overlay and contingency table analysis. Land classes were identified by mutual information analysis of vegetation pattern in relation to other mapped environmental variables. The analysis was weakened by map errors, especially errors in the digital elevation data. Nevertheless, the resulting land classification was ecologically reasonable and performed well when tested with higher quality data from the region.
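
    A minimal sketch of the mutual information measure used above, on synthetic categorical maps rather than the study's GIS layers: a variable that structures the vegetation pattern yields a much higher mutual information than an unrelated one.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

# Sketch only: two synthetic categorical "maps" flattened to label vectors.
rng = np.random.default_rng(8)
elevation_class = rng.integers(0, 4, 10_000)             # e.g. 4 elevation bands
vegetation = np.where(elevation_class >= 2,
                      rng.integers(2, 4, 10_000),         # high-elevation types
                      rng.integers(0, 2, 10_000))         # low-elevation types

aspect_class = rng.integers(0, 4, 10_000)                 # unrelated variable
print(f"MI(vegetation, elevation): "
      f"{mutual_info_score(vegetation, elevation_class):.3f} nats")
print(f"MI(vegetation, aspect):    "
      f"{mutual_info_score(vegetation, aspect_class):.3f} nats")
```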

  10. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    PubMed

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to the Kyoto protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI), were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv) the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in the built pruned trees. The three best classification tree models with the lowest misclassification error (ME) and the lowest number of nodes (N) are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39). The SOC maps produced at a 1:50,000 cartographic scale using these trees match closely, with coincidence values of 90.5% (Map T1/Map T2), 95% (Map T1/Map T3) and 91% (Map T2/Map T3). The overall accuracies of these maps, when compared with field observations, were estimated to be 69.54% (Map T1), 68.87% (Map T2) and 69.41% (Map T3). The proposed tree models are relatively simple and may also be applied to other areas. Copyright 2010 Elsevier Ltd. All rights reserved.
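
    The sketch below reproduces the ME/N trade-off on synthetic predictors (not the Danish SOC data), comparing an un-pruned classification tree with a cost-complexity pruned one; the pruning parameter is an arbitrary illustrative value.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Sketch only: misclassification error (ME) and number of nodes (N) for an
# un-pruned versus a cost-complexity pruned classification tree.
X, y = make_classification(n_samples=1500, n_features=17, n_informative=6,
                           n_classes=3, n_clusters_per_class=1, random_state=9)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=9)

for name, ccp_alpha in [("un-pruned", 0.0), ("pruned", 0.005)]:
    tree = DecisionTreeClassifier(ccp_alpha=ccp_alpha, random_state=9)
    tree.fit(X_tr, y_tr)
    me = 1 - tree.score(X_te, y_te)
    print(f"{name:9s}  ME = {me:.1%}  N = {tree.tree_.node_count}")
```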

  11. 40 CFR 51.900 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... higher or lower, classifications are ranked from lowest to highest as follows: classification under... National Ambient Air Quality Standard § 51.900 Definitions. The following definitions apply for purposes of... 42 U.S.C. 7401-7671q (2003). (f) Applicable requirements means for an area the following requirements...

  12. 40 CFR 51.900 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... higher or lower, classifications are ranked from lowest to highest as follows: classification under... National Ambient Air Quality Standard § 51.900 Definitions. The following definitions apply for purposes of... 42 U.S.C. 7401-7671q (2003). (f) Applicable requirements means for an area the following requirements...

  13. Structural MRI-based detection of Alzheimer's disease using feature ranking and classification error.

    PubMed

    Beheshti, Iman; Demirel, Hasan; Farokhian, Farnaz; Yang, Chunlan; Matsuda, Hiroshi

    2016-12-01

    This paper presents an automatic computer-aided diagnosis (CAD) system based on feature ranking for the detection of Alzheimer's disease (AD) using structural magnetic resonance imaging (sMRI) data. The proposed CAD system is composed of four systematic stages. First, global and local differences in the gray matter (GM) of AD patients compared to the GM of healthy controls (HCs) are analyzed using a voxel-based morphometry technique. The aim is to identify significant local differences in the volume of GM as volumes of interest (VOIs). Second, the voxel intensity values of the VOIs are extracted as raw features. Third, the raw features are ranked using seven feature-ranking methods, namely, statistical dependency (SD), mutual information (MI), information gain (IG), Pearson's correlation coefficient (PCC), t-test score (TS), Fisher's criterion (FC), and the Gini index (GI). The features with higher scores are more discriminative. To determine the number of top features, the estimated classification error based on a training set made up of the AD and HC groups is calculated, and the feature vector size that minimizes this error is selected as the set of top discriminative features. Fourth, the classification is performed using a support vector machine (SVM). In addition, a data fusion approach among the feature ranking methods is introduced to improve the classification performance. The proposed method is evaluated using a data set from ADNI (130 AD and 130 HC) with 10-fold cross-validation. The classification accuracy of the proposed automatic system for the diagnosis of AD is up to 92.48% using the sMRI data. An automatic CAD system for the classification of AD based on feature-ranking methods and classification error is proposed. In this regard, seven feature-ranking methods (i.e., SD, MI, IG, PCC, TS, FC, and GI) are evaluated. The optimal size of the top discriminative features is determined by the classification error estimation in the training phase. The experimental results indicate that the performance of the proposed system is comparable to that of state-of-the-art classification models. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
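
    A hedged sketch of the top-feature selection step described above, using synthetic data and only one of the seven ranking criteria (the t-test score): features are ranked, the estimated error is swept over candidate subset sizes, and the size with the lowest error feeds an SVM.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Sketch only: synthetic stand-ins for voxel-intensity features.
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0, 1, (60, 200)), rng.normal(0, 1, (60, 200))])
X[60:, :10] += 0.8                         # only the first 10 features carry signal
y = np.r_[np.zeros(60), np.ones(60)]

# Rank features by absolute t-test score (most discriminative first).
t_scores, _ = ttest_ind(X[y == 0], X[y == 1], axis=0)
ranking = np.argsort(-np.abs(t_scores))

# Sweep the number of top features and keep the size with the lowest
# cross-validated error of an SVM classifier.
best_k, best_err = None, np.inf
for k in range(5, 101, 5):
    err = 1 - cross_val_score(SVC(), X[:, ranking[:k]], y, cv=10).mean()
    if err < best_err:
        best_k, best_err = k, err

print(f"selected top-{best_k} features, estimated error {best_err:.3f}")
```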

  14. Application of texture analysis method for classification of benign and malignant thyroid nodules in ultrasound images.

    PubMed

    Abbasian Ardakani, Ali; Gharbali, Akbar; Mohammadi, Afshin

    2015-01-01

    The aim of this study was to evaluate a computer-aided diagnosis (CAD) system with texture analysis (TA) to improve radiologists' accuracy in identifying thyroid nodules as malignant or benign. A total of 70 cases (26 benign and 44 malignant) were analyzed in this study. We extracted up to 270 statistical texture features as a descriptor for each selected region of interest (ROI) under three normalization schemes (default, 3s and 1%-99%). The feature set was then reduced to the 10 best and most effective features using the lowest probability of classification error combined with average correlation coefficients (POE+ACC) and the Fisher coefficient (Fisher). These features were analyzed under standard and nonstandard states. For TA of the thyroid nodules, Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Non-Linear Discriminant Analysis (NDA) were applied. A first nearest-neighbour (1-NN) classifier was applied to the features resulting from PCA and LDA; NDA features were classified by an artificial neural network (A-NN). Receiver operating characteristic (ROC) curve analysis was used to examine the performance of the TA methods. The best results were obtained with 1%-99% normalization, features selected by the POE+ACC algorithm and analysis by NDA, giving an area under the ROC curve (Az) of 0.9722, corresponding to a sensitivity of 94.45%, specificity of 100% and accuracy of 97.14%. Our results indicate that TA is a reliable method that can provide useful information to help radiologists detect and classify benign and malignant thyroid nodules.
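
    A minimal sketch of one branch of such a pipeline (a linear discriminant projection followed by a 1-NN classifier, evaluated by the area under the ROC curve) follows; it runs on synthetic placeholder features rather than the study's MaZda texture features.

    ```python
    # Hedged sketch: LDA projection + 1-NN classification of benign/malignant ROIs,
    # evaluated with ROC AUC (Az). Features and labels are toy placeholders.
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import cross_val_predict
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=70, n_features=10, n_informative=5,
                               weights=[0.37, 0.63], random_state=0)   # ~26 vs ~44 cases

    clf = make_pipeline(StandardScaler(),
                        LinearDiscriminantAnalysis(n_components=1),
                        KNeighborsClassifier(n_neighbors=1))
    proba = cross_val_predict(clf, X, y, cv=10, method="predict_proba")[:, 1]
    print("Az (ROC AUC):", round(roc_auc_score(y, proba), 4))
    ```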

  15. Documentation of procedures for textural/spatial pattern recognition techniques

    NASA Technical Reports Server (NTRS)

    Haralick, R. M.; Bryant, W. F.

    1976-01-01

    A C-130 aircraft was flown over the Sam Houston National Forest on March 21, 1973 at 10,000 feet altitude to collect multispectral scanner (MSS) data. Existing textural and spatial automatic processing techniques were used to classify the MSS imagery into specified timber categories. Several classification experiments were performed on these data using features selected from the spectral bands and a textural transform band. The results indicate that (1) spatial post-processing of a classified image can cut the classification error to 1/2 or 1/3 of its initial value, (2) spatial post-processing of an image classified using combined spectral and textural features produces a resulting image with less error than post-processing of an image classified using only spectral features, and (3) classification without spatial post-processing using the combined spectral and textural features tends to produce about the same error rate as classification without spatial post-processing using only spectral features.

  16. Human error analysis of commercial aviation accidents: application of the Human Factors Analysis and Classification system (HFACS).

    PubMed

    Wiegmann, D A; Shappell, S A

    2001-11-01

    The Human Factors Analysis and Classification System (HFACS) is a general human error framework originally developed and tested within the U.S. military as a tool for investigating and analyzing the human causes of aviation accidents. Based on Reason's (1990) model of latent and active failures, HFACS addresses human error at all levels of the system, including the condition of aircrew and organizational factors. The purpose of the present study was to assess the utility of the HFACS framework as an error analysis and classification tool outside the military. The HFACS framework was used to analyze human error data associated with aircrew-related commercial aviation accidents that occurred between January 1990 and December 1996 using database records maintained by the NTSB and the FAA. Investigators were able to reliably accommodate all the human causal factors associated with the commercial aviation accidents examined in this study using the HFACS system. In addition, the classification of data using HFACS highlighted several critical safety issues in need of intervention research. These results demonstrate that the HFACS framework can be a viable tool for use within the civil aviation arena. However, additional research is needed to examine its applicability to areas outside the flight deck, such as aircraft maintenance and air traffic control domains.

  17. Practical Procedures for Constructing Mastery Tests to Minimize Errors of Classification and to Maximize or Optimize Decision Reliability.

    ERIC Educational Resources Information Center

    Byars, Alvin Gregg

    The objectives of this investigation are to develop, describe, assess, and demonstrate procedures for constructing mastery tests to minimize errors of classification and to maximize decision reliability. The guidelines are based on conditions where item exchangeability is a reasonable assumption and the test constructor can control the number of…

  18. Soil Genesis and Development, Lesson 5 - Soil Geography and Classification

    USDA-ARS?s Scientific Manuscript database

    The system of soil classification developed by the United States Department of Agriculture (USDA) is called Soil Taxonomy. Soil Taxonomy consists of a hierarchy of six levels which, from highest to lowest, are: Order, Suborder, Great Group, Subgroup, family, and series. This lesson will focus on bro...

  19. Classification of stillbirths is an ongoing dilemma.

    PubMed

    Nappi, Luigi; Trezza, Federica; Bufo, Pantaleo; Riezzo, Irene; Turillazzi, Emanuela; Borghi, Chiara; Bonaccorsi, Gloria; Scutiero, Gennaro; Fineschi, Vittorio; Greco, Pantaleo

    2016-10-01

    To compare different classification systems in a cohort of stillbirths undergoing a comprehensive workup, and to establish whether a particular classification system is most suitable and useful in determining the cause of death, yielding the lowest percentage of unexplained deaths. Cases of stillbirth at gestational age 22-41 weeks occurring at the Department of Gynecology and Obstetrics of Foggia University during a 4-year period were collected. The World Health Organization (WHO) diagnosis of stillbirth was used. All data collection was based on the recommendations of an Italian diagnostic workup for stillbirth. Two expert obstetricians reviewed all cases and classified causes according to five classification systems. The Relevant Condition at Death (ReCoDe) and Causes Of Death and Associated Conditions (CODAC) classification systems performed best in retaining information. The ReCoDe system provided the lowest rate of unexplained stillbirth (14%) compared to de Galan-Roosen (16%), CODAC (16%), Tulip (18%) and Wigglesworth (62%). Classification of stillbirth is influenced by the multiplicity of possible causes and factors related to fetal death. Fetal autopsy, placental histology and cytogenetic analysis are strongly recommended for a complete diagnostic evaluation. Commonly employed classification systems performed differently in our experience, the most satisfactory being ReCoDe. Given the rate of "unexplained" cases, none can be considered optimal and further efforts are necessary to work out a clinically useful system.

  20. Comparing K-mer based methods for improved classification of 16S sequences.

    PubMed

    Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars

    2015-07-01

    The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison of five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other both in data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length. The differences in classification error obtained by the methods seemed to be small, but they were stable across both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all five methods reached an error plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau, indicating that improved training data are needed to improve classification from here. Classification errors occur most frequently for genera with few sequences present. For improving the taxonomy and testing new classification methods, a better, more universal and more robust training data set is crucial.
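
    A hedged sketch of the basic K-mer approach (overlapping K-mer counts from a sliding window fed to a multinomial naïve Bayes classifier) is shown below; the sequences, genus labels and K=4 are toy assumptions, not a reimplementation of any of the compared methods.

    ```python
    # Hedged sketch: sliding-window K-mer counts + multinomial naive Bayes.
    from collections import Counter
    from itertools import product

    from sklearn.naive_bayes import MultinomialNB

    K = 4
    KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

    def kmer_counts(seq):
        """Count overlapping K-mers (sliding window of width K) in a DNA sequence."""
        counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
        return [counts.get(kmer, 0) for kmer in KMERS]

    train_seqs = ["ACGTACGTGGCCAAGT", "ACGTTTGGCCAAGTAC",
                  "GGGGCCCCTTTTAAAA", "GGGCCCTTTAAAGGGC"]
    train_labels = ["GenusA", "GenusA", "GenusB", "GenusB"]

    clf = MultinomialNB().fit([kmer_counts(s) for s in train_seqs], train_labels)
    print(clf.predict([kmer_counts("ACGTACGGGCCAAGTT")]))   # classify a new read
    ```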

  1. An experimental evaluation of the incidence of fitness-function/search-algorithm combinations on the classification performance of myoelectric control systems with iPCA tuning

    PubMed Central

    2013-01-01

    Background The information in electromyographic signals can be used by Myoelectric Control Systems (MCSs) to actuate prostheses. These devices allow persons with amputated limbs to perform movements that they could not otherwise carry out. The state of the art in the development of MCSs is based on the use of individual principal component analysis (iPCA) as a pre-processing stage for the classifiers. The iPCA pre-processing implies an optimization stage which has not yet been deeply explored. Methods The present study considers two factors in the iPCA stage: namely A (the fitness function) and B (the search algorithm). The A factor comprises two levels, namely A1 (the classification error) and A2 (the correlation factor). The B factor, in turn, has four levels, specifically B1 (the Sequential Forward Selection, SFS), B2 (the Sequential Floating Forward Selection, SFFS), B3 (Artificial Bee Colony, ABC), and B4 (Particle Swarm Optimization, PSO). This work evaluates the incidence of each of the eight possible combinations of the A and B factors on the classification error of the MCS. Results A two-factor ANOVA was performed on the computed classification errors and determined that: (1) the interaction effects on the classification error are not significant (F(0.01; 3, 72) = 4.0659 > f_AB = 0.09), (2) the levels of factor A have significant effects on the classification error (F(0.02; 1, 72) = 5.0162 < f_A = 6.56), and (3) the levels of factor B do not have significant effects on the classification error (F(0.01; 3, 72) = 4.0659 > f_B = 0.08). Conclusions In terms of classification performance, factor level A2 was superior in combination with any of the levels of factor B. With respect to time performance, the analysis suggests that the PSO algorithm is at least 14 percent better than its best competitor. The latter behavior has been observed for a particular configuration of parameters in the search algorithms. Future work will investigate the effect of these parameters on classification performance, such as the length of the reduced-size vector, the number of particles and bees used during the optimal search, the cognitive parameters in the PSO algorithm, and the limit of cycles to improve a solution in the ABC algorithm. PMID:24369728
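
    A hedged illustration of the two-factor ANOVA layout used in such an analysis (fitness function A crossed with search algorithm B, classification error as the response) is sketched below with statsmodels on simulated errors; the effect sizes and replicate counts are invented for the example.

    ```python
    # Hedged sketch: two-factor ANOVA (A x B) on simulated classification errors.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    rng = np.random.default_rng(0)
    rows = [{"A": a, "B": b,
             "error": 0.15 + (0.02 if a == "A1" else 0.0) + rng.normal(0, 0.01)}
            for a in ("A1", "A2")
            for b in ("SFS", "SFFS", "ABC", "PSO")
            for _ in range(10)]                      # 10 simulated replicates per cell
    df = pd.DataFrame(rows)

    model = smf.ols("error ~ C(A) * C(B)", data=df).fit()   # main effects + interaction
    print(anova_lm(model, typ=2))                            # F tests for A, B and A:B
    ```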

  2. Exploring human error in military aviation flight safety events using post-incident classification systems.

    PubMed

    Hooper, Brionny J; O'Hare, David P A

    2013-08-01

    Human error classification systems theoretically allow researchers to analyze postaccident data in an objective and consistent manner. The Human Factors Analysis and Classification System (HFACS) framework is one such practical analysis tool that has been widely used to classify human error in aviation. The Cognitive Error Taxonomy (CET) is another. It has been postulated that the focus on interrelationships within HFACS can facilitate the identification of the underlying causes of pilot error. The CET provides increased granularity at the level of unsafe acts. The aim was to analyze the influence of factors at higher organizational levels on the unsafe acts of front-line operators and to compare the errors of fixed-wing and rotary-wing operations. This study analyzed 288 aircraft incidents involving human error from an Australasian military organization occurring between 2001 and 2008. Action errors accounted for almost twice (44%) the proportion of rotary wing compared to fixed wing (23%) incidents. Both classificatory systems showed significant relationships between precursor factors such as the physical environment, mental and physiological states, crew resource management, training and personal readiness, and skill-based, but not decision-based, acts. The CET analysis showed different predisposing factors for different aspects of skill-based behaviors. Skill-based errors in military operations are more prevalent in rotary wing incidents and are related to higher level supervisory processes in the organization. The Cognitive Error Taxonomy provides increased granularity to HFACS analyses of unsafe acts.

  3. How should children with speech sound disorders be classified? A review and critical evaluation of current classification systems.

    PubMed

    Waring, R; Knight, R

    2013-01-01

    Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of theoretically differing classification systems have been proposed based on either an aetiological (medical) approach, a descriptive-linguistic approach or a processing approach. To describe and review the supporting evidence, and to provide a critical evaluation of the current childhood SSD classification systems. Descriptions of the major specific approaches to classification are reviewed and research papers supporting the reliability and validity of the systems are evaluated. Three specific paediatric SSD classification systems; the aetiologic-based Speech Disorders Classification System, the descriptive-linguistic Differential Diagnosis system, and the processing-based Psycholinguistic Framework are identified as potentially useful in classifying children with SSD into homogeneous subgroups. The Differential Diagnosis system has a growing body of empirical support from clinical population studies, across language error pattern studies and treatment efficacy studies. The Speech Disorders Classification System is currently a research tool with eight proposed subgroups. The Psycholinguistic Framework is a potential bridge to linking cause and surface level speech errors. There is a need for a universally agreed-upon classification system that is useful to clinicians and researchers. The resulting classification system needs to be robust, reliable and valid. A universal classification system would allow for improved tailoring of treatments to subgroups of SSD which may, in turn, lead to improved treatment efficacy. © 2012 Royal College of Speech and Language Therapists.

  4. The application of Aronson's taxonomy to medication errors in nursing.

    PubMed

    Johnson, Maree; Young, Helen

    2011-01-01

    Medication administration is a frequent nursing activity that is prone to error. In this study of 318 self-reported medication incidents (including near misses), very few resulted in patient harm: only 7% required intervention or prolonged hospitalization, or caused temporary harm. Aronson's classification system provided an excellent framework for analysis of the incidents, with a close connection between the type of error and the change strategy adopted to minimize medication incidents. Taking a behavioral approach to medication error classification has provided helpful strategies for nurses, such as placing nurse-call cards on patient lockers when patients are absent and having outgoing and incoming staff check medication sign-off at handover.

  5. Center for Seismic Studies Final Technical Report, October 1992 through October 1993

    DTIC Science & Technology

    1994-02-07

    [Report documentation page and figure residue; recoverable content: Figure 42 caption - upper limit of depth error as a function of mb for estimates based on P and S waves for three networks: GSETT-2, ALPHA, and ALPHA + a 50 station…]

  6. Assimilation of a knowledge base and physical models to reduce errors in passive-microwave classifications of sea ice

    NASA Technical Reports Server (NTRS)

    Maslanik, J. A.; Key, J.

    1992-01-01

    An expert system framework has been developed to classify sea ice types using satellite passive microwave data, an operational classification algorithm, spatial and temporal information, ice types estimated from a dynamic-thermodynamic model, output from a neural network that detects the onset of melt, and knowledge about season and region. The rule base imposes boundary conditions upon the ice classification, modifies parameters in the ice algorithm, determines a `confidence' measure for the classified data, and under certain conditions, replaces the algorithm output with model output. Results demonstrate the potential power of such a system for minimizing overall error in the classification and for providing non-expert data users with a means of assessing the usefulness of the classification results for their applications.

  7. Successional stage of biological soil crusts: an accurate indicator of ecohydrological condition

    USGS Publications Warehouse

    Belnap, Jayne; Wilcox, Bradford P.; Van Scoyoc, Matthew V.; Phillips, Susan L.

    2013-01-01

    Biological soil crusts are a key component of many dryland ecosystems. Following disturbance, biological soil crusts will recover in stages. Recently, a simple classification of these stages has been developed, largely on the basis of external features of the crusts, which reflects their level of development (LOD). The classification system has six LOD classes, from low (1) to high (6). To determine whether the LOD of a crust is related to its ecohydrological function, we used rainfall simulation to evaluate differences in infiltration, runoff, and erosion among crusts in the various LODs, across a range of soil depths and with different wetting pre-treatments. We found large differences between the lowest and highest LODs, with runoff and erosion being greatest from the lowest LOD. Under dry antecedent conditions, about 50% of the water applied ran off the lowest LOD plots, whereas less than 10% ran off the plots of the two highest LODs. Similarly, sediment loss was 400 g m-2 from the lowest LOD and almost zero from the higher LODs. We scaled up the results from these simulations using the Rangeland Hydrology and Erosion Model. Modelling results indicate that erosion increases dramatically as slope length and gradient increase, especially beyond the threshold values of 10 m for slope length and 10% for slope gradient. Our findings confirm that the LOD classification is a quick, easy, nondestructive, and accurate index of hydrological condition and should be incorporated in field and modelling assessments of ecosystem health.

  8. Spelling in adolescents with dyslexia: errors and modes of assessment.

    PubMed

    Tops, Wim; Callens, Maaike; Bijn, Evi; Brysbaert, Marc

    2014-01-01

    In this study we focused on the spelling of high-functioning students with dyslexia. We made a detailed classification of the errors in a word and sentence dictation task made by 100 students with dyslexia and 100 matched control students. All participants were in the first year of their bachelor's studies and had Dutch as mother tongue. Three main error categories were distinguished: phonological, orthographic, and grammatical errors (on the basis of morphology and language-specific spelling rules). The results indicated that higher-education students with dyslexia made on average twice as many spelling errors as the controls, with effect sizes of d ≥ 2. When the errors were classified as phonological, orthographic, or grammatical, we found a slight dominance of phonological errors in students with dyslexia. Sentence dictation did not provide more information than word dictation in the correct classification of students with and without dyslexia. © Hammill Institute on Disabilities 2012.

  9. A Guide for Setting the Cut-Scores to Minimize Weighted Classification Errors in Test Batteries

    ERIC Educational Resources Information Center

    Grabovsky, Irina; Wainer, Howard

    2017-01-01

    In this article, we extend the methodology of the Cut-Score Operating Function that we introduced previously and apply it to a testing scenario with multiple independent components and different testing policies. We derive analytically the overall classification error rate for a test battery under the policy when several retakes are allowed for…

  10. [Classifications in forensic medicine and their logical basis].

    PubMed

    Kovalev, A V; Shmarov, L A; Ten'kov, A A

    2014-01-01

    The objective of the present study was to characterize the main requirements for the correct construction of classifications used in forensic medicine, with special reference to the errors that occur in the relevant text-books, guidelines, and manuals and the ways to avoid them. This publication continues the series of thematic articles of the authors devoted to the logical errors in the expert conclusions. The preparation of further publications is underway to report the results of the in-depth analysis of the logical errors encountered in expert conclusions, text-books, guidelines, and manuals.

  11. Exciplex Formation between Silver Ions and the Lowest MLCT Excited State of the Tris(Bipyrazine)Ruthenium(2) Cation

    DTIC Science & Technology

    1988-07-11

    [Report documentation page residue; recoverable content: Office of Naval Research Contract N00014-84-G-0201, Task No. 0051-865, Technical Report #21, "Exciplex Formation Between Silver Ions and the Lowest MLCT Excited State of…". Abstract fragment: exciplexes form with up to six silver ions per excited cation; lifetime and wavelength data are presented as a function of the [Ag]/[Ru] ratio. An excited state…]

  12. Sensitivity analysis of the GEMS soil organic carbon model to land cover land use classification uncertainties under different climate scenarios in Senegal

    USGS Publications Warehouse

    Dieye, A.M.; Roy, David P.; Hanan, N.P.; Liu, S.; Hansen, M.; Toure, A.

    2012-01-01

    Spatially explicit land cover land use (LCLU) change information is needed to drive biogeochemical models that simulate soil organic carbon (SOC) dynamics. Such information is increasingly being mapped using remotely sensed satellite data, with classification schemes and uncertainties constrained by the sensing system, classification algorithms and land cover schemes. In this study, automated LCLU classifications of multi-temporal Landsat satellite data were used to assess the sensitivity of SOC modeled by the Global Ensemble Biogeochemical Modeling System (GEMS). The GEMS was run for an area of 1560 km2 in Senegal under three climate change scenarios with LCLU maps generated using different Landsat classification approaches. This research provides a method to estimate the variability of SOC, specifically the SOC uncertainty due to satellite classification errors, which we show is dependent not only on the LCLU classification errors but also on where the LCLU classes occur relative to the other GEMS model inputs.

  13. The Soil Series in Soil Classifications of the United States

    NASA Astrophysics Data System (ADS)

    Indorante, Samuel; Beaudette, Dylan; Brevik, Eric C.

    2014-05-01

    Organized national soil survey began in the United States in 1899, with soil types as the units being mapped. The soil series concept was introduced into the U.S. soil survey in 1903 as a way to relate soils being mapped in one area to the soils of other areas. The original concept of a soil series was all soil types formed in the same parent materials that were of the same geologic age. However, within about 15 years soil series became the primary units being mapped in U.S. soil survey. Soil types became subdivisions of soil series, with the subdivisions based on changes in texture. As the soil series became the primary mapping unit the concept of what a soil series was also changed. Instead of being based on parent materials and geologic age, the soil series of the 1920s was based on the morphology and composition of the soil profile. Another major change in the concept of soil series occurred when U.S. Soil Taxonomy was released in 1975. Under Soil Taxonomy, the soil series subdivisions were based on the uses the soils might be put to, particularly their agricultural uses (Simonson, 1997). While the concept of the soil series has changed over the years, the term soil series has been the longest-lived term in U.S. soil classification. It has appeared in every official classification system used by the U.S. soil survey (Brevik and Hartemink, 2013). The first classification system was put together by Milton Whitney in 1909 and had soil series at its second lowest level, with soil type at the lowest level. The second classification system used by the U.S. soil survey was developed by C.F. Marbut, H.H. Bennett, J.E. Lapham, and M.H. Lapham in 1913. It had soil series at the second highest level, with soil classes and soil types at more detailed levels. This was followed by another system in 1938 developed by M. Baldwin, C.E. Kellogg, and J. Thorp. In this system soil series were again at the second lowest level with soil types at the lowest level. The soil type concept was dropped and replaced by the soil phase in the 1950s in a modification of the 1938 Baldwin et al. classification (Simonson, 1997). When Soil Taxonomy was released in 1975, soil series became the most detailed (lowest) level of the classification system, and the only term maintained throughout all U.S. classifications to date. While the number of recognized soil series have increased steadily throughout the history of U.S. soil survey, there was a rapid increase in the recognition of new soil series following the introduction of Soil Taxonomy (Brevik and Hartemink, 2013). References Brevik, E.C., and A.E. Hartemink. 2013. Soil maps of the United States of America. Soil Science Society of America Journal 77:1117-1132. doi:10.2136/sssaj2012.0390. Simonson, R.W. 1997. Evolution of soil series and type concepts in the United States. Advances in Geoecology 29:79-108.

  14. Attitudes of Mashhad Public Hospital's Nurses and Midwives toward the Causes and Rates of Medical Errors Reporting.

    PubMed

    Mobarakabadi, Sedigheh Sedigh; Ebrahimipour, Hosein; Najar, Ali Vafaie; Janghorban, Roksana; Azarkish, Fatemeh

    2017-03-01

    Patient safety is one of the main objectives of healthcare services; however, medical errors are a prevalent potential occurrence for patients in treatment systems. Medical errors lead to an increase in patient mortality and to challenges such as prolonged inpatient stays and increased costs. Controlling medical errors is very important because, besides being costly, they threaten patient safety. To evaluate the attitudes of nurses and midwives toward the causes and rates of medical error reporting. It was a cross-sectional observational study. The study population was 140 midwives and nurses employed in Mashhad public hospitals. Data were collected with the revised Goldstone (2001) questionnaire. SPSS 11.5 software was used for data analysis. Descriptive statistics (means, standard deviations and relative frequency distributions, presented as tables and charts) and inferential statistics (chi-square tests) were used. Most of the midwives and nurses (39.4%) were in the 25-34 year age range, and the smallest proportion (2.2%) were aged 55-59 years. The highest average number of medical errors was reported by employees with three to four years of work experience, while the lowest was reported by those with one to two years of experience. The highest average number of medical errors occurred during the evening shift and the lowest during the night shift. Three main causes of medical errors were identified: illegible physician prescription orders, similarity of drug names and nurse fatigue. The most important causes of medical errors from the viewpoint of nurses and midwives are illegible physician orders, drug-name similarity, nurse fatigue and damaged drug labels or packaging, respectively. Head nurse feedback, peer feedback, and fear of punishment or job loss were considered reasons for under-reporting of medical errors. This research demonstrates the need for greater attention to be paid to the causes of medical errors.

  15. Evaluating data mining algorithms using molecular dynamics trajectories.

    PubMed

    Tatsis, Vasileios A; Tjortjis, Christos; Tzirakis, Panagiotis

    2013-01-01

    Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the μs time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets which resulted from computer simulations of a potential enzyme-mimetic biomolecule. We evaluated 65 classifiers available in the well-known data mining toolkit Weka, using classification errors to assess algorithmic performance. Results suggest that: (i) 'meta' classifiers perform better than the other groups when applied to molecular dynamics data sets; (ii) Random Forest and Rotation Forest are the best classifiers for all three data sets; and (iii) classification via clustering yields the highest classification error. Our findings are consistent with bibliographic evidence, suggesting a 'roadmap' for dealing with such data.

  16. Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts

    NASA Astrophysics Data System (ADS)

    Balasubramanian, A.; Shamsuddin, R.; Prabhakaran, B.; Sawant, A.

    2017-03-01

    Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn, causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of the different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the receiver operating characteristic (ROC) curve (AUC). The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5-91.4%); (ii) the predictive modeling yields the lowest accuracies (50-60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive to real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96-0.98). It also confirms the utility of prior data from previous patients, and the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient lookahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors.

  17. Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts

    PubMed Central

    Balasubramanian, A; Shamsuddin, R; Prabhakaran, B; Sawant, A

    2017-01-01

    Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn, causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of the different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the receiver operating characteristic (ROC) curve (AUC). The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5–91.4%); (ii) the predictive modeling yields the lowest accuracies (50–60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive to real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96–0.98). It also confirms the utility of prior data from previous patients, and the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient lookahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors. PMID:28075331

  18. Predictive modeling of respiratory tumor motion for real-time prediction of baseline shifts.

    PubMed

    Balasubramanian, A; Shamsuddin, R; Prabhakaran, B; Sawant, A

    2017-03-07

    Baseline shifts in respiratory patterns can result in significant spatiotemporal changes in patient anatomy (compared to that captured during simulation), in turn, causing geometric and dosimetric errors in the administration of thoracic and abdominal radiotherapy. We propose predictive modeling of the tumor motion trajectories for predicting a baseline shift ahead of its occurrence. The key idea is to use the features of the tumor motion trajectory over a 1 min window, and predict the occurrence of a baseline shift in the 5 s that immediately follow (lookahead window). In this study, we explored a preliminary trend-based analysis with multi-class annotations as well as a more focused binary classification analysis. In both analyses, a number of different inter-fraction and intra-fraction training strategies were studied, both offline as well as online, along with data sufficiency and skew compensation for class imbalances. The performance of the different training strategies was compared across multiple machine learning classification algorithms, including nearest neighbor, Naïve Bayes, linear discriminant and ensemble Adaboost. The prediction performance is evaluated using metrics such as accuracy, precision, recall and the area under the receiver operating characteristic (ROC) curve (AUC). The key results of the trend-based analysis indicate that (i) intra-fraction training strategies achieve the highest prediction accuracies (90.5-91.4%); (ii) the predictive modeling yields the lowest accuracies (50-60%) when the training data does not include any information from the test patient; (iii) the prediction latencies are as low as a few hundred milliseconds, and thus conducive to real-time prediction. The binary classification performance is promising, indicated by high AUCs (0.96-0.98). It also confirms the utility of prior data from previous patients, and the necessity of training the classifier on some initial data from the new patient for reasonable prediction performance. The ability to predict a baseline shift with a sufficient lookahead window will enable clinical systems or even human users to hold the treatment beam in such situations, thereby reducing the probability of serious geometric and dosimetric errors.

  19. Evaluation criteria for software classification inventories, accuracies, and maps

    NASA Technical Reports Server (NTRS)

    Jayroe, R. R., Jr.

    1976-01-01

    Statistical criteria are presented for modifying the contingency table used to evaluate tabular classification results obtained from remote sensing and ground truth maps. This classification technique contains information on the spatial complexity of the test site, on the relative location of classification errors, on agreement of the classification maps with ground truth maps, and reduces back to the original information normally found in a contingency table.
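
    As a hedged reminder of the baseline that such criteria extend, the snippet below builds an ordinary contingency table (confusion matrix) between a classified map and ground truth and reports overall accuracy and kappa; the class labels are toy placeholders.

    ```python
    # Hedged sketch: basic contingency-table summary of a classification result.
    import numpy as np
    from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

    ground_truth = np.array(["forest", "forest", "water", "crop",
                             "crop", "forest", "water", "crop"])
    classified   = np.array(["forest", "crop",   "water", "crop",
                             "forest", "forest", "water", "crop"])

    labels = ["forest", "water", "crop"]
    table = confusion_matrix(ground_truth, classified, labels=labels)
    print(table)                            # rows: ground truth, columns: classification
    print("overall accuracy:", accuracy_score(ground_truth, classified))
    print("kappa:", round(cohen_kappa_score(ground_truth, classified), 3))
    ```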

  20. The search for structure - Object classification in large data sets. [for astronomers

    NASA Technical Reports Server (NTRS)

    Kurtz, Michael J.

    1988-01-01

    Research concerning object classification schemes is reviewed, focusing on large data sets. Classification techniques are discussed, including syntactic and decision-theoretic methods, fuzzy techniques, and stochastic and fuzzy grammars. Consideration is given to the automation of MK classification (Morgan and Keenan, 1973) and other problems associated with the classification of spectra. In addition, the classification of galaxies is examined, including the problems of systematic errors, blended objects, galaxy types, and galaxy clusters.

  1. Procedural Error and Task Interruption

    DTIC Science & Technology

    2016-09-30

    [Report documentation page residue; recoverable fragments: "…for research on errors and individual differences. Results indicate predictive validity for fluid intelligence and specific forms of work…"; "…and individual differences. It generates rich data on several kinds of errors, including procedural errors in which steps are skipped or repeated." Subject terms: procedural error, task interruption, individual differences, fluid intelligence, sleep deprivation.]

  2. The Sources of Error in Spanish Writing.

    ERIC Educational Resources Information Center

    Justicia, Fernando; Defior, Sylvia; Pelegrina, Santiago; Martos, Francisco J.

    1999-01-01

    Determines the pattern of errors in Spanish spelling. Analyzes and proposes a classification system for the errors made by children in the initial stages of the acquisition of spelling skills. Finds that the diverse forms of only 20 Spanish words produce 36% of the spelling errors in Spanish and that substitution is the most frequent type of error. (RS)

  3. Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements.

    PubMed

    Caprihan, A; Pearlson, G D; Calhoun, V D

    2008-08-15

    Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of the data into two groups, then this set of components need not have the most discriminatory power. We measured the distance between two such populations using the Mahalanobis distance and chose the eigenvectors to maximize it; we call this modified PCA method the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the leave-one-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA as with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified the left superior longitudinal fasciculus as the tract which gave the lowest classification error. In addition, with six optimally chosen tracts the classification error was zero.
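
    A simplified, hedged sketch of the underlying idea (ranking principal components by between-group separation rather than by eigenvalue) follows; the per-component standardized mean difference used here is a stand-in for the full Mahalanobis-distance criterion, and both groups are simulated.

    ```python
    # Hedged sketch: choose PCA components by group separation, not by variance.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    controls = rng.normal(0.0, 1.0, size=(40, 50))          # simulated healthy controls
    patients = rng.normal(0.3, 1.0, size=(40, 50))          # simulated patients
    X = np.vstack([controls, patients])
    y = np.array([0] * 40 + [1] * 40)

    scores = PCA(n_components=20).fit_transform(X)           # component scores

    # Per-component separation: squared mean difference over pooled variance.
    m0, m1 = scores[y == 0].mean(axis=0), scores[y == 1].mean(axis=0)
    pooled = (scores[y == 0].var(axis=0, ddof=1) + scores[y == 1].var(axis=0, ddof=1)) / 2
    separation = (m0 - m1) ** 2 / pooled

    print("components kept by eigenvalue:  ", list(range(5)))
    print("components kept by separation: ", list(np.argsort(separation)[::-1][:5]))
    ```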

  4. Pareto-optimal multi-objective dimensionality reduction deep auto-encoder for mammography classification.

    PubMed

    Taghanaki, Saeid Asgari; Kawahara, Jeremy; Miles, Brandon; Hamarneh, Ghassan

    2017-07-01

    Feature reduction is an essential stage in computer aided breast cancer diagnosis systems. Multilayer neural networks can be trained to extract relevant features by encoding high-dimensional data into low-dimensional codes. Optimizing traditional auto-encoders works well only if the initial weights are close to a proper solution. They are also trained to only reduce the mean squared reconstruction error (MRE) between the encoder inputs and the decoder outputs, but do not address the classification error. The goal of the current work is to test the hypothesis that extending traditional auto-encoders (which only minimize reconstruction error) to multi-objective optimization for finding Pareto-optimal solutions provides more discriminative features that will improve classification performance when compared to single-objective and other multi-objective approaches (i.e. scalarized and sequential). In this paper, we introduce a novel multi-objective optimization of deep auto-encoder networks, in which the auto-encoder optimizes two objectives: MRE and mean classification error (MCE) for Pareto-optimal solutions, rather than just MRE. These two objectives are optimized simultaneously by a non-dominated sorting genetic algorithm. We tested our method on 949 X-ray mammograms categorized into 12 classes. The results show that the features identified by the proposed algorithm allow a classification accuracy of up to 98.45%, demonstrating favourable accuracy over the results of state-of-the-art methods reported in the literature. We conclude that adding the classification objective to the traditional auto-encoder objective and optimizing for finding Pareto-optimal solutions, using evolutionary multi-objective optimization, results in producing more discriminative features. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Wildlife management by habitat units: A preliminary plan of action

    NASA Technical Reports Server (NTRS)

    Frentress, C. D.; Frye, R. G.

    1975-01-01

    Procedures for yielding vegetation type maps were developed using LANDSAT data and a computer assisted classification analysis (LARSYS) to assist in managing populations of wildlife species by defined area units. Ground cover in Travis County, Texas was classified on two occasions using a modified version of the unsupervised approach to classification. The first classification produced a total of 17 classes. Examination revealed that further grouping was justified. A second analysis produced 10 classes which were displayed on printouts which were later color-coded. The final classification was 82 percent accurate. While the classification map appeared to satisfactorily depict the existing vegetation, two classes were determined to contain significant error. The major sources of error could have been eliminated by stratifying cluster sites more closely among previously mapped soil associations that are identified with particular plant associations and by precisely defining class nomenclature using established criteria early in the analysis.

  6. Classification Model for Forest Fire Hotspot Occurrences Prediction Using ANFIS Algorithm

    NASA Astrophysics Data System (ADS)

    Wijayanto, A. K.; Sani, O.; Kartika, N. D.; Herdiyeni, Y.

    2017-01-01

    This study proposed the application of a data mining technique, the Adaptive Neuro-Fuzzy Inference System (ANFIS), to forest fire hotspot data to develop classification models for hotspot occurrence in Central Kalimantan. A hotspot is a point indicated as the location of a fire. In this study, hotspot detections are categorized as true alarms or false alarms. ANFIS is a soft computing method in which a given input-output data set is expressed as a fuzzy inference system (FIS). The FIS implements a nonlinear mapping from its input space to the output space. The method classified hotspots as target objects by correlating spatial attribute data, using three folds in the ANFIS algorithm to obtain the best model. The best result, obtained from the 3rd fold, provided a low training error (error = 0.0093676) and also a low testing error (error = 0.0093676). Distance to road is the most important attribute influencing the probability of true and false alarms, as the level of human activity is higher along roads. This classification model can be used to develop an early warning system for forest fires.

  7. Classification based upon gene expression data: bias and precision of error rates.

    PubMed

    Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L

    2007-06-01

    Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp
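
    A hedged scikit-learn sketch of the two-level (nested) external cross-validation strategy discussed is shown below (it is not the paper's PAMR/R workflow): the inner loop tunes the classifier and the outer loop estimates its error, so the reported estimate does not inherit the optimization bias of the inner search.

    ```python
    # Hedged sketch: two-level (nested) cross-validation to avoid optimization bias.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=500, n_informative=10,
                               random_state=0)        # toy stand-in for expression data

    inner = GridSearchCV(SVC(kernel="linear"),         # inner level: tune C
                         {"C": [0.01, 0.1, 1, 10]}, cv=5)
    outer_scores = cross_val_score(inner, X, y, cv=5)  # outer level: estimate error
    print("two-level CV error estimate:", round(1 - outer_scores.mean(), 3))
    ```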

  8. LACIE performance predictor FOC users manual

    NASA Technical Reports Server (NTRS)

    1976-01-01

    The LACIE Performance Predictor (LPP) is a computer simulation of the LACIE process for predicting worldwide wheat production. The simulation provides for the introduction of various errors into the system and provides estimates based on these errors, thus allowing the user to determine the impact of selected error sources. The FOC LPP simulates the acquisition of the sample segment data by the LANDSAT Satellite (DAPTS), the classification of the agricultural area within the sample segment (CAMS), the estimation of the wheat yield (YES), and the production estimation and aggregation (CAS). These elements include data acquisition characteristics, environmental conditions, classification algorithms, the LACIE aggregation and data adjustment procedures. The operational structure for simulating these elements consists of the following key programs: (1) LACIE Utility Maintenance Process, (2) System Error Executive, (3) Ephemeris Generator, (4) Access Generator, (5) Acquisition Selector, (6) LACIE Error Model (LEM), and (7) Post Processor.

  9. Sensitivity of geographic information system outputs to errors in remotely sensed data

    NASA Technical Reports Server (NTRS)

    Ramapriyan, H. K.; Boyd, R. K.; Gunther, F. J.; Lu, Y. C.

    1981-01-01

    The sensitivity of the outputs of a geographic information system (GIS) to errors in inputs derived from remotely sensed data (RSD) is investigated using a suitability model with per-cell decisions and a gridded geographic data base whose cells are larger than the RSD pixels. The process of preparing RSD as input to a GIS is analyzed, and the errors associated with classification and registration are examined. In the case of the model considered, it is found that the errors caused during classification and registration are partially compensated by the aggregation of pixels. The compensation is quantified by means of an analytical model, a Monte Carlo simulation, and experiments with Landsat data. The results show that error reductions of the order of 50% occur because of aggregation when 25 pixels of RSD are used per cell in the geographic data base.

  10. Error Analysis in Composition of Iranian Lower Intermediate Students

    ERIC Educational Resources Information Center

    Taghavi, Mehdi

    2012-01-01

    Learners make errors during the process of learning languages. This study examines errors in a writing task completed by twenty Iranian lower-intermediate male students aged between 13 and 15. The topic given to the participants was a composition about the seasons of the year. All of the errors were identified and classified. Corder's classification (1967)…

  11. Combining multiple decisions: applications to bioinformatics

    NASA Astrophysics Data System (ADS)

    Yukinawa, N.; Takenouchi, T.; Oba, S.; Ishii, S.

    2008-01-01

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods.
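
    A generic, hedged illustration of the ECOC idea (not the article's weighted or probabilistic variants) using scikit-learn's built-in output-code wrapper: each class is assigned a binary code word, one binary classifier is trained per code bit, and a sample is assigned to the class whose code word is closest to the predicted bit string.

    ```python
    # Hedged sketch: generic error-correcting output coding for multi-class data.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.multiclass import OutputCodeClassifier

    X, y = load_iris(return_X_y=True)
    ecoc = OutputCodeClassifier(LogisticRegression(max_iter=1000),
                                code_size=2.0,        # 2 * n_classes code bits
                                random_state=0)
    print("ECOC accuracy:", round(cross_val_score(ecoc, X, y, cv=5).mean(), 3))
    ```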

  12. Credit Risk Evaluation Using a C-Variable Least Squares Support Vector Classification Model

    NASA Astrophysics Data System (ADS)

    Yu, Lean; Wang, Shouyang; Lai, K. K.

    Credit risk evaluation is one of the most important issues in financial risk management. In this paper, a C-variable least squares support vector classification (C-VLSSVC) model is proposed for credit risk analysis. The main idea of this model is based on the prior knowledge that different classes may have different importance for modeling and that more weight should be given to the classes with more importance. The C-VLSSVC model can be constructed by a simple modification of the regularization parameter in LSSVC, whereby more weight is given to the least squares classification errors of important classes than to those of unimportant classes, while keeping the regularized terms in their original form. For illustration purposes, a real-world credit dataset is used to test the effectiveness of the C-VLSSVC model.
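
    As a loose analogue of this class-weighting idea (a standard class-weighted SVM rather than the paper's least squares formulation), the sketch below penalizes errors on the assumed more important class more heavily; the data and weights are invented for illustration.

    ```python
    # Hedged sketch: give more weight to errors on the important (rare) class.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Toy credit data: class 1 ("default") is rarer and assumed more costly to miss.
    X, y = make_classification(n_samples=500, n_features=20, weights=[0.85, 0.15],
                               random_state=0)

    plain = SVC(kernel="rbf")
    weighted = SVC(kernel="rbf", class_weight={0: 1.0, 1: 5.0})   # heavier penalty on class 1
    for name, clf in [("unweighted", plain), ("class-weighted", weighted)]:
        recall = cross_val_score(clf, X, y, cv=5, scoring="recall").mean()
        print(f"{name}: recall on the important class = {recall:.3f}")
    ```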

  13. Attitude Error Representations for Kalman Filtering

    NASA Technical Reports Server (NTRS)

    Markley, F. Landis; Bauer, Frank H. (Technical Monitor)

    2002-01-01

    The quaternion has the lowest dimensionality possible for a globally nonsingular attitude representation. The quaternion must obey a unit norm constraint, though, which has led to the development of an extended Kalman filter using a quaternion for the global attitude estimate and a three-component representation for attitude errors. We consider various attitude error representations for this Multiplicative Extended Kalman Filter and its second-order extension.
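
    As a hedged illustration of the multiplicative error convention this line of work uses (standard MEKF notation with scalar-first quaternions assumed here, not quoted from the report), the true attitude quaternion is written as an error quaternion composed with the estimate, and for small errors a three-component vector parameterizes that error quaternion:

    ```latex
    q_{\mathrm{true}} = \delta q(\boldsymbol{\theta}) \otimes \hat{q},
    \qquad
    \delta q(\boldsymbol{\theta}) \approx
    \begin{bmatrix} 1 \\ \boldsymbol{\theta}/2 \end{bmatrix},
    \qquad \lVert \boldsymbol{\theta} \rVert \ll 1 .
    ```

    The filter then propagates and updates the unconstrained three-vector while the unit-norm quaternion carries the global attitude estimate, which is how the norm constraint and the three-parameter error representation coexist.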

  14. Exception handling for sensor fusion

    NASA Astrophysics Data System (ADS)

    Chavez, G. T.; Murphy, Robin R.

    1993-08-01

    This paper presents a control scheme for handling sensing failures (sensor malfunctions, significant degradations in performance due to changes in the environment, and errant expectations) in sensor fusion for autonomous mobile robots. The advantages of the exception handling mechanism are that it emphasizes a fast response to sensing failures, is able to use only a partial causal model of sensing failure, and leads to a graceful degradation of sensing if the sensing failure cannot be compensated for. The exception handling mechanism consists of two modules: error classification and error recovery. The error classification module in the exception handler attempts to classify the type and source(s) of the error using a modified generate-and-test procedure. If the source of the error is isolated, the error recovery module examines its cache of recovery schemes, which either repair or replace the current sensing configuration. If the failure is due to an error in expectation or cannot be identified, the planner is alerted. Experiments using actual sensor data collected by the CSM Mobile Robotics/Machine Perception Laboratory's Denning mobile robot demonstrate the operation of the exception handling mechanism.

  15. Using beta binomials to estimate classification uncertainty for ensemble models.

    PubMed

    Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin

    2014-01-01

    Quantitative structure-activity (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
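
    A hedged illustration of the core statistical step (fitting a beta-binomial distribution to the positive vote tallies of an ensemble and reading off tail probabilities) is sketched below with SciPy; the ensemble size, simulated tallies and threshold are assumptions, not the authors' ANNE models.

    ```python
    # Hedged sketch: fit a beta-binomial to ensemble vote tallies by maximum likelihood.
    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import betabinom

    n = 33                                                   # submodels voting per compound
    tallies = betabinom.rvs(n, 2.0, 5.0, size=2000, random_state=0)   # simulated tallies

    def neg_log_lik(params):
        a, b = np.exp(params)                                # keep alpha, beta positive
        return -betabinom.logpmf(tallies, n, a, b).sum()

    fit = minimize(neg_log_lik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
    alpha, beta = np.exp(fit.x)
    print(f"fitted beta-binomial: alpha={alpha:.2f}, beta={beta:.2f}")
    print("P(tally >= 25):", round(1 - betabinom.cdf(24, n, alpha, beta), 4))
    ```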

  16. Original and Mirror Face Images and Minimum Squared Error Classification for Visible Light Face Recognition.

    PubMed

    Wang, Rong

    2015-01-01

    In real-world applications, face images vary with illumination, facial expression, and pose, so additional training samples can help capture the possible appearances of a face. Although minimum squared error classification (MSEC) is a widely used method, its application to face recognition usually suffers from the problem of a limited number of training samples. In this paper, we improve MSEC by using mirror faces as virtual training samples. Mirror faces are generated from the original training samples, and both kinds of samples are combined into a new training set. The face recognition experiments show that our method achieves high classification accuracy.
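
    A brief numpy sketch, under assumed image sizes and a ridge regularization term, of the two ingredients described above: horizontally mirrored images are added as virtual training samples, and a minimum squared error (regularized least-squares) classifier is trained by regressing one-hot class targets on the flattened images. The data are synthetic stand-ins for face images.

```python
# Sketch: augment a face training set with horizontally mirrored images and train
# a minimum squared error (regularized least-squares) classifier on one-hot targets.
# Image size, class count and the ridge term are assumptions; data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
h, w, n_classes, per_class = 16, 16, 5, 8

# Synthetic "faces": one noisy template per class.
templates = rng.normal(size=(n_classes, h, w))
X_img = np.concatenate([t + 0.3 * rng.normal(size=(per_class, h, w)) for t in templates])
y = np.repeat(np.arange(n_classes), per_class)

# Virtual samples: mirror each training image about its vertical axis.
X_aug = np.concatenate([X_img, X_img[:, :, ::-1]])
y_aug = np.concatenate([y, y])

def msec_fit(X, y, lam=1e-2):
    """Regress one-hot class targets on flattened images (minimum squared error)."""
    A = X.reshape(len(X), -1)
    T = np.eye(y.max() + 1)[y]
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ T)

def msec_predict(W, X):
    return np.argmax(X.reshape(len(X), -1) @ W, axis=1)

W = msec_fit(X_aug, y_aug)
X_test = np.concatenate([t + 0.3 * rng.normal(size=(4, h, w)) for t in templates])
y_test = np.repeat(np.arange(n_classes), 4)
print("test accuracy:", (msec_predict(W, X_test) == y_test).mean())
```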

  17. Rank preserving sparse learning for Kinect based scene classification.

    PubMed

    Tao, Dapeng; Jin, Lianwen; Yang, Zhao; Li, Xuelong

    2013-10-01

    With the rapid development of RGB-D sensors and the fast-growing popularity of the low-cost Microsoft Kinect sensor, scene classification, which is a hard, yet important, problem in computer vision, has gained a resurgence of interest recently. That is because the depth information provided by the Kinect sensor opens an effective and innovative way for scene classification. In this paper, we propose a new scheme for scene classification, which applies locality-constrained linear coding (LLC) to local SIFT features for representing the RGB-D samples and classifies scenes through the cooperation between a new rank preserving sparse learning (RPSL) based dimension reduction and a simple classification method. RPSL considers four aspects: 1) it preserves the rank order information of the within-class samples in a local patch; 2) it maximizes the margin between the between-class samples on the local patch; 3) the L1-norm penalty is introduced to obtain the parsimony property; and 4) it models the classification error minimization by utilizing the least-squares error minimization. Experiments are conducted on the NYU Depth V1 dataset and demonstrate the robustness and effectiveness of RPSL for scene classification.

  18. Comparison of two Classification methods (MLC and SVM) to extract land use and land cover in Johor Malaysia

    NASA Astrophysics Data System (ADS)

    Rokni Deilmai, B.; Ahmad, B. Bin; Zabihi, H.

    2014-06-01

    Mapping is essential for the analysis of land use and land cover, which influence many environmental processes and properties. For the purpose of creating land cover maps, it is important to minimize error, because errors will propagate into later analyses based on these maps. The reliability of land cover maps derived from remotely sensed data depends on an accurate classification. In this study, we analyzed multispectral data using two different classifiers: the Maximum Likelihood Classifier (MLC) and the Support Vector Machine (SVM). To pursue this aim, Landsat Thematic Mapper data and identical field-based training sample datasets for Johor, Malaysia were used with each classification method, yielding five land cover classes: forest, oil palm, urban area, water, and rubber. Classification results indicate that SVM was more accurate than MLC. With its demonstrated capability to produce reliable results, the SVM method should be especially useful for land cover classification.
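
    A hedged illustration of the comparison: a Gaussian maximum likelihood classifier (approximated here by quadratic discriminant analysis) versus an RBF support vector machine on synthetic six-band pixels. The band count, class structure, and SVM settings are assumptions; the Landsat data used in the study are not reproduced.

```python
# Sketch: Gaussian maximum likelihood classification (approximated by QDA) versus an
# RBF SVM on synthetic six-band "pixels" for five cover classes. All settings are
# illustrative assumptions, not the Landsat TM data of the study.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=3000, n_features=6, n_informative=5,
                           n_redundant=0, n_classes=5, n_clusters_per_class=1,
                           class_sep=1.5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

mlc = QuadraticDiscriminantAnalysis().fit(X_tr, y_tr)         # per-class Gaussian model
svm = SVC(kernel="rbf", C=10, gamma="scale").fit(X_tr, y_tr)  # RBF support vector machine

print("MLC accuracy:", accuracy_score(y_te, mlc.predict(X_te)))
print("SVM accuracy:", accuracy_score(y_te, svm.predict(X_te)))
```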

  19. On the Discriminant Analysis in the 2-Populations Case

    NASA Astrophysics Data System (ADS)

    Rublík, František

    2008-01-01

    The empirical Bayes Gaussian rule, which in the normal case yields good values of the probability of total error, may yield high values of the maximum probability of error. From this point of view the presented modified version of the classification rule of Broffitt, Randles and Hogg appears to be superior. The modification included in this paper is termed the WR method, and the choice of its weights is discussed. These methods are also compared with the K-nearest-neighbours classification rule.

  20. A classification of errors in lay comprehension of medical documents.

    PubMed

    Keselman, Alla; Smith, Catherine Arnott

    2012-12-01

    Emphasis on participatory medicine requires that patients and consumers participate in tasks traditionally reserved for healthcare providers. This includes reading and comprehending medical documents, often but not necessarily in the context of interacting with Personal Health Records (PHRs). Research suggests that while giving patients access to medical documents has many benefits (e.g., improved patient-provider communication), lay people often have difficulty understanding medical information. Informatics can address the problem by developing tools that support comprehension; this requires in-depth understanding of the nature and causes of errors that lay people make when comprehending clinical documents. The objective of this study was to develop a classification scheme of comprehension errors, based on lay individuals' retellings of two documents containing clinical text: a description of a clinical trial and a typical office visit note. While not comprehensive, the scheme can serve as a foundation of further development of a taxonomy of patients' comprehension errors. Eighty participants, all healthy volunteers, read and retold two medical documents. A data-driven content analysis procedure was used to extract and classify retelling errors. The resulting hierarchical classification scheme contains nine categories and 23 subcategories. The most common error made by the participants involved incorrectly recalling brand names of medications. Other common errors included misunderstanding clinical concepts, misreporting the objective of a clinical research study and physician's findings during a patient's visit, and confusing and misspelling clinical terms. A combination of informatics support and health education is likely to improve the accuracy of lay comprehension of medical documents. Published by Elsevier Inc.

  1. An extension of the receiver operating characteristic curve and AUC-optimal classification.

    PubMed

    Takenouchi, Takashi; Komori, Osamu; Eguchi, Shinto

    2012-10-01

    While most proposed methods for solving classification problems focus on minimization of the classification error rate, we are interested in the receiver operating characteristic (ROC) curve, which provides more information about classification performance than the error rate does. The area under the ROC curve (AUC) is a natural measure for overall assessment of a classifier based on the ROC curve. We discuss a class of concave functions for AUC maximization in which a boosting-type algorithm including RankBoost is considered, and the Bayesian risk consistency and the lower bound of the optimum function are discussed. A procedure derived by maximizing a specific optimum function has high robustness, based on gross error sensitivity. Additionally, we focus on the partial AUC, which is the partial area under the ROC curve. For example, in medical screening, a high true-positive rate to the fixed lower false-positive rate is preferable and thus the partial AUC corresponding to lower false-positive rates is much more important than the remaining AUC. We extend the class of concave optimum functions for partial AUC optimality with the boosting algorithm. We investigated the validity of the proposed method through several experiments with data sets in the UCI repository.
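
    The full-versus-partial AUC contrast can be shown in a few lines; scikit-learn's max_fpr argument returns the McClish-standardized partial AUC over the low false-positive range. The data and classifier below are illustrative only and unrelated to the boosting algorithm proposed in the paper.

```python
# Sketch: full AUC versus partial AUC restricted to low false-positive rates
# (sklearn's max_fpr gives the McClish-standardized partial AUC). Data and the
# logistic scorer are illustrative only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=4000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).decision_function(X_te)

print("full AUC:                ", roc_auc_score(y_te, scores))
print("partial AUC (FPR <= 0.1):", roc_auc_score(y_te, scores, max_fpr=0.1))
```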

  2. Review of medication errors that are new or likely to occur more frequently with electronic medication management systems.

    PubMed

    Van de Vreede, Melita; McGrath, Anne; de Clifford, Jan

    2018-05-14

    Objective. The aim of the present study was to identify and quantify medication errors reportedly related to electronic medication management systems (eMMS) and those considered likely to occur more frequently with eMMS. This included developing a new classification system relevant to eMMS errors. Methods. Eight Victorian hospitals with eMMS participated in a retrospective audit of reported medication incidents from their incident reporting databases between May and July 2014. Site-appointed project officers submitted deidentified incidents they deemed new or likely to occur more frequently due to eMMS, together with the Incident Severity Rating (ISR). The authors reviewed and classified incidents. Results. There were 5826 medication-related incidents reported. In total, 93 (47 prescribing errors, 46 administration errors) were identified as new or potentially related to eMMS. Only one ISR2 (moderate) and no ISR1 (severe or death) errors were reported, so harm to patients in this 3-month period was minimal. The most commonly reported error types were 'human factors' and 'unfamiliarity or training' (70%) and 'cross-encounter or hybrid system errors' (22%). Conclusions. Although the results suggest that the errors reported were of low severity, organisations must remain vigilant to the risk of new errors and avoid the assumption that eMMS is the panacea to all medication error issues. What is known about the topic? eMMS have been shown to reduce some types of medication errors, but it has been reported that some new medication errors have been identified and some are likely to occur more frequently with eMMS. There are few published Australian studies that have reported on medication error types that are likely to occur more frequently with eMMS in more than one organisation and that include administration and prescribing errors. What does this paper add? This paper proposes a new, simple classification system for eMMS medication errors, outlines the most commonly reported incident types, and can inform organisations and vendors on possible eMMS improvements. What are the implications for practitioners? The results of the present study highlight to organisations the need for ongoing review of system design, refinement of workflow issues, staff education and training, and reporting and monitoring of errors.

  3. EEG error potentials detection and classification using time-frequency features for robot reinforcement learning.

    PubMed

    Boubchir, Larbi; Touati, Youcef; Daachi, Boubaker; Chérif, Arab Ali

    2015-08-01

    In thought-based steering of robots, error potentials (ErrP) can appear when the action resulting from the brain-machine interface (BMI) classifier/controller does not correspond to the user's thought. Using the Steady State Visual Evoked Potentials (SSVEP) techniques, ErrP, which appear when a classification error occurs, are not easily recognizable by only examining the temporal or frequency characteristics of EEG signals. A supplementary classification process is therefore needed to identify them in order to stop the course of the action and back up to a recovery state. This paper presents a set of time-frequency (t-f) features for the detection and classification of EEG ErrP in extra-brain activities due to misclassification observed by a user exploiting non-invasive BMI and robot control in the task space. The proposed features are able to characterize and detect ErrP activities in the t-f domain. These features are derived from the information embedded in the t-f representation of EEG signals, and include the Instantaneous Frequency (IF), t-f information complexity, SVD information, energy concentration and sub-bands' energies. The experiment results on real EEG data show that the use of the proposed t-f features for detecting and classifying EEG ErrP achieved an overall classification accuracy up to 97% for 50 EEG segments using 2-class SVM classifier.
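
    A simplified sketch of the pipeline: per-band spectrogram energies and a crude energy-concentration measure are extracted from synthetic EEG-like segments and fed to a 2-class SVM. The authors' full feature set (instantaneous frequency, t-f complexity, SVD information) is not reproduced; signals, band edges, and the simulated ErrP bump are assumptions.

```python
# Sketch: simple time-frequency features (per-band spectrogram energies plus a crude
# energy-concentration measure) classified with a 2-class SVM. Signals are synthetic;
# the ErrP-like bump, band edges and spectrogram settings are assumptions.
import numpy as np
from scipy.signal import spectrogram
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
fs, n_seg, seg_len = 256, 100, 256            # sampling rate, number of segments, samples

def make_segment(errp):
    t = np.arange(seg_len) / fs
    x = rng.normal(size=seg_len)                              # background EEG-like noise
    if errp:                                                  # low-frequency "ErrP-like" bump
        x += 2.0 * np.exp(-((t - 0.35) ** 2) / 0.01) * np.sin(2 * np.pi * 5 * t)
    return x

def tf_features(x):
    f, t, S = spectrogram(x, fs=fs, nperseg=128, noverlap=96)  # t-f energy distribution
    bands = [(0, 4), (4, 8), (8, 13), (13, 30)]                # delta/theta/alpha/beta
    band_energy = [S[(f >= lo) & (f < hi)].sum() for lo, hi in bands]
    p = S.ravel() / S.sum()
    concentration = (p ** 2).sum()                             # crude energy concentration
    return np.array(band_energy + [concentration])

labels = np.array([0, 1] * (n_seg // 2))
X = np.array([tf_features(make_segment(bool(lbl))) for lbl in labels])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
print("cross-validated accuracy: %.2f" % cross_val_score(clf, X, labels, cv=5).mean())
```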

  4. Lowest Energy Electronic States of Rare Earth Monoxides.

    DTIC Science & Technology

    1980-09-01

    [Report documentation front matter; OCR-garbled in the source record.] Recoverable details: final scientific report covering 1 January 1977 - 15 August 1980, issued 1 September 1980, Cambridge, Massachusetts 02139. Approved for public release; distribution unlimited. Security classification of this page: Unclassified.

  5. Evaluation of spatial filtering on the accuracy of wheat area estimate

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Moreira, M. A.; Chen, S. C.; Delima, A. M.

    1982-01-01

    A 3 x 3 pixel spatial filter for postclassification was used for wheat classification to evaluate the effects of this procedure on the accuracy of area estimation using LANDSAT digital data obtained from a single pass. Quantitative analyses were carried out in five test sites (approx 40 sq km each) and t tests showed that filtering with threshold values significantly decreased errors of commission and omission. In area estimation, filtering reduced the overestimate from 4.5% to 2.7%, and the root-mean-square error decreased from 126.18 ha to 107.02 ha. Extrapolating the same procedure of automatic classification with spatial filtering for postclassification to the whole study area, the overestimate in the area estimate was reduced from 10.9% to 9.7%. It is concluded that when single-pass LANDSAT data are used for crop identification and area estimation, the postclassification procedure using a spatial filter provides a more accurate area estimate by reducing classification errors.
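
    A small sketch of a 3 x 3 majority filter with a vote threshold applied to a classified raster as a postclassification step; the synthetic map, class count, and the minimum-vote rule are assumptions standing in for the thresholded filtering described above.

```python
# Sketch: 3 x 3 majority filter with a vote threshold for postclassification smoothing
# of a classified raster. The raster, class count and threshold are synthetic assumptions.
import numpy as np
from scipy.ndimage import generic_filter

rng = np.random.default_rng(3)
classified = rng.integers(0, 3, size=(60, 60))           # noisy 3-class map
classified[10:40, 10:40] = 1                             # a coherent "wheat" patch
noise = rng.random(classified.shape) < 0.15              # salt-and-pepper class noise
classified[noise] = rng.integers(0, 3, size=int(noise.sum()))

def majority_with_threshold(window, min_votes=5):
    counts = np.bincount(window.astype(int), minlength=3)
    winner = counts.argmax()
    centre = int(window[window.size // 2])
    return winner if counts[winner] >= min_votes else centre  # keep centre below threshold

filtered = generic_filter(classified, majority_with_threshold, size=3, mode="nearest")
print(f"pixels relabelled by the filter: {(filtered != classified).mean():.1%}")
```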

  6. Regression Equations for Monthly and Annual Mean and Selected Percentile Streamflows for Ungaged Rivers in Maine

    USGS Publications Warehouse

    Dudley, Robert W.

    2015-12-03

    The largest average errors of prediction are associated with regression equations for the lowest streamflows derived for months during which the lowest streamflows of the year occur (such as the 5 and 1 monthly percentiles for August and September). The regression equations have been derived on the basis of streamflow and basin characteristics data for unregulated, rural drainage basins without substantial streamflow or drainage modifications (for example, diversions and (or) regulation by dams or reservoirs, tile drainage, irrigation, channelization, and impervious paved surfaces); therefore, using the equations for regulated or urbanized basins with substantial streamflow or drainage modifications will yield results of unknown error. Input basin characteristics derived using techniques or datasets other than those documented in this report, or using values outside the ranges used to develop these regression equations, also will yield results of unknown error.

  7. Maximum likelihood estimation of label imperfections and its use in the identification of mislabeled patterns

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.

  8. Misclassification Errors in Unsupervised Classification Methods. Comparison Based on the Simulation of Targeted Proteomics Data

    PubMed Central

    Andreev, Victor P; Gillespie, Brenda W; Helfand, Brian T; Merion, Robert M

    2016-01-01

    Unsupervised classification methods are gaining acceptance in omics studies of complex common diseases, which are often vaguely defined and are likely collections of disease subtypes. Unsupervised classification based on the molecular signatures identified in omics studies has the potential to reflect molecular mechanisms of the subtypes of the disease and to lead to more targeted and successful interventions for the identified subtypes. Multiple classification algorithms exist but none is ideal for all types of data. Importantly, there are no established methods to estimate sample size in unsupervised classification (unlike power analysis in hypothesis testing). Therefore, we developed a simulation approach allowing comparison of misclassification errors and estimation of the required sample size for a given effect size, number, and correlation matrix of the differentially abundant proteins in targeted proteomics studies. All the experiments were performed in silico. The simulated data imitated data expected from the study of the plasma of patients with lower urinary tract dysfunction using the aptamer proteomics assay Somascan (SomaLogic Inc, Boulder, CO), which targeted 1129 proteins, including 330 involved in inflammation, 180 in stress response, and 80 in aging. Three popular clustering methods (hierarchical, k-means, and k-medoids) were compared. K-means clustering performed much better for the simulated data than the other two methods and enabled classification with misclassification error below 5% in the simulated cohort of 100 patients based on the molecular signatures of 40 differentially abundant proteins (effect size 1.5) from among the 1129-protein panel. PMID:27524871
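
    An in-silico sketch in the spirit of the simulation above, on a much smaller scale than the 1129-protein panel: two subtypes differ in 40 of 200 simulated proteins by 1.5 SD, and the misclassification error of k-means and hierarchical clustering is computed under the best label permutation (Hungarian matching). All sizes and the error definition are assumptions.

```python
# Sketch: compare misclassification error of k-means and hierarchical clustering on
# simulated "proteomics" data where only 40 of 200 features shift by 1.5 SD between
# two subtypes. Sizes and the Hungarian-matching error definition are assumptions.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(4)
n_per_group, n_proteins, n_informative, effect = 50, 200, 40, 1.5

X = rng.normal(size=(2 * n_per_group, n_proteins))
X[n_per_group:, :n_informative] += effect                # subtype 2: shifted proteins
truth = np.repeat([0, 1], n_per_group)
X = StandardScaler().fit_transform(X)

def misclassification_error(truth, labels):
    """Misclassification rate under the best cluster-to-class permutation."""
    C = confusion_matrix(truth, labels)
    rows, cols = linear_sum_assignment(-C)               # maximize matched counts
    return 1.0 - C[rows, cols].sum() / C.sum()

for name, model in [("k-means", KMeans(n_clusters=2, n_init=10, random_state=0)),
                    ("hierarchical", AgglomerativeClustering(n_clusters=2))]:
    err = misclassification_error(truth, model.fit_predict(X))
    print(f"{name:12s} misclassification error: {err:.1%}")
```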

  9. Computer discrimination procedures applicable to aerial and ERTS multispectral data

    NASA Technical Reports Server (NTRS)

    Richardson, A. J.; Torline, R. J.; Allen, W. A.

    1970-01-01

    Two statistical models are compared in the classification of crops recorded on color aerial photographs. A theory of error ellipses is applied to the pattern recognition problem. An elliptical boundary condition classification model (EBC), useful for recognition of candidate patterns, evolves out of error ellipse theory. The EBC model is compared with the minimum distance to the mean (MDM) classification model in terms of pattern recognition ability. The pattern recognition results of both models are interpreted graphically using scatter diagrams to represent measurement space. Measurement space, for this report, is determined by optical density measurements collected from Kodak Ektachrome Infrared Aero Film 8443 (EIR). The EBC model is shown to be a significant improvement over the MDM model.
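
    A compact numpy sketch contrasting a minimum-distance-to-the-mean (MDM) rule with an elliptical-boundary rule that uses each class's covariance via the Mahalanobis distance, in the spirit of the EBC/MDM comparison above. The two-band synthetic data are stand-ins for the film optical-density measurements.

```python
# Sketch: minimum-distance-to-mean versus an elliptical (Mahalanobis) boundary rule
# on two-band synthetic data; means and covariances are estimated from the samples.
import numpy as np

rng = np.random.default_rng(5)
means = [np.array([0.0, 0.0]), np.array([2.5, 1.0])]
covs = [np.array([[1.0, 0.6], [0.6, 0.5]]), np.array([[0.4, -0.2], [-0.2, 1.2]])]
X = np.vstack([rng.multivariate_normal(m, c, size=200) for m, c in zip(means, covs)])
y = np.repeat([0, 1], 200)

mu = [X[y == k].mean(axis=0) for k in (0, 1)]
Sinv = [np.linalg.inv(np.cov(X[y == k], rowvar=False)) for k in (0, 1)]

def predict_mdm(x):
    return np.argmin([np.linalg.norm(x - m) for m in mu])        # Euclidean distance

def predict_elliptical(x):
    return np.argmin([(x - m) @ S @ (x - m) for m, S in zip(mu, Sinv)])  # Mahalanobis

acc_mdm = np.mean([predict_mdm(x) == t for x, t in zip(X, y)])
acc_ebc = np.mean([predict_elliptical(x) == t for x, t in zip(X, y)])
print(f"MDM resubstitution accuracy: {acc_mdm:.2%}   elliptical boundary: {acc_ebc:.2%}")
```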

  10. Statistical learning from nonrecurrent experience with discrete input variables and recursive-error-minimization equations

    NASA Astrophysics Data System (ADS)

    Carter, Jeffrey R.; Simon, Wayne E.

    1990-08-01

    Neural networks are trained using Recursive Error Minimization (REM) equations to perform statistical classification. Using REM equations with continuous input variables reduces the required number of training experiences by factors of one to two orders of magnitude over standard back propagation. Replacing the continuous input variables with discrete binary representations reduces the number of connections by a factor proportional to the number of variables, reducing the required number of experiences by another order of magnitude. Undesirable effects of using recurrent experience to train neural networks for statistical classification problems are demonstrated, and nonrecurrent experience is used to avoid these undesirable effects. The I-4I problem: the statistical classification problem which we address is that of assigning points in d-dimensional space to one of two classes. The first class has a covariance matrix of I (the identity matrix); the covariance matrix of the second class is 4I. For this reason the problem is known as the I-4I problem. Both classes have equal probability of occurrence and samples from both classes may appear anywhere throughout the d-dimensional space. Most samples near the origin of the coordinate system will be from the first class while most samples away from the origin will be from the second class. Since the two classes completely overlap it is impossible to have a classifier with zero error. The minimum possible error is known as the Bayes error and
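
    Assuming the garbled "1-41" denotes covariances I and 4I, the Bayes error of this problem can be estimated by Monte Carlo: the likelihood ratio is monotone in the squared norm of the sample, so the Bayes rule is a threshold on ||x||². The dimension and sample count below are arbitrary choices.

```python
# Sketch: Monte Carlo estimate of the Bayes error for two equiprobable d-dimensional
# Gaussians centred at the origin with covariances I and 4I. The Bayes rule reduces
# to a threshold on ||x||^2; d and the sample count are arbitrary choices.
import numpy as np

rng = np.random.default_rng(6)
d, n = 8, 200_000

x0 = rng.normal(scale=1.0, size=(n, d))            # class 0: covariance I
x1 = rng.normal(scale=2.0, size=(n, d))            # class 1: covariance 4I

# Likelihood ratio test: choose class 1 when (3/8) * ||x||^2 > d * ln 2.
threshold = 8.0 * d * np.log(2.0) / 3.0

err0 = np.mean(np.sum(x0 ** 2, axis=1) > threshold)    # class 0 called class 1
err1 = np.mean(np.sum(x1 ** 2, axis=1) <= threshold)   # class 1 called class 0
print(f"estimated Bayes error for d = {d}: {(err0 + err1) / 2:.3f}")
```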

  11. Calibration of remotely sensed proportion or area estimates for misclassification error

    Treesearch

    Raymond L. Czaplewski; Glenn P. Catts

    1992-01-01

    Classifications of remotely sensed data contain misclassification errors that bias areal estimates. Monte Carlo techniques were used to compare two statistical methods that correct or calibrate remotely sensed areal estimates for misclassification bias using reference data from an error matrix. The inverse calibration estimator was consistently superior to the...
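
    A hedged numpy sketch of the inverse-calibration idea: if the error matrix estimates P(classified as j | true class i), the expected map proportions satisfy p = Mᵀθ, and the calibrated areal estimate inverts that relation. The matrix and map proportions below are invented for illustration and are not from the cited study.

```python
# Sketch: inverse (classical) calibration of map proportions with an error matrix.
# M[i, j] = P(classified as j | true class i); expected map proportions are M^T theta.
# The numbers are invented for illustration.
import numpy as np

M = np.array([[0.90, 0.07, 0.03],      # rows: true class, columns: classified class
              [0.10, 0.85, 0.05],
              [0.05, 0.10, 0.85]])

p_map = np.array([0.42, 0.35, 0.23])   # class proportions in the classified map

theta = np.linalg.solve(M.T, p_map)    # invert p_map = M^T theta
theta = np.clip(theta, 0.0, None)
theta /= theta.sum()                   # renormalize to valid proportions

print("calibrated class proportions:", np.round(theta, 3))
```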

  12. Mapping gully-affected areas in the region of Taroudannt, Morocco based on Object-Based Image Analysis (OBIA)

    NASA Astrophysics Data System (ADS)

    d'Oleire-Oltmanns, Sebastian; Marzolff, Irene; Tiede, Dirk; Blaschke, Thomas

    2015-04-01

    The need for area-wide landform mapping approaches, especially in terms of land degradation, can be ascribed to the fact that within area-wide landform mapping approaches, the (spatial) context of erosional landforms is considered by providing additional information on the physiography neighboring the distinct landform. This study presents an approach for the detection of gully-affected areas by applying object-based image analysis in the region of Taroudannt, Morocco, which is highly affected by gully erosion while simultaneously representing a major region of agro-industry with a high demand for arable land. Various sensors provide readily available high-resolution optical satellite data with a much better temporal resolution than 3D terrain data, which led to the development of an area-wide mapping approach to extract gully-affected areas using only optical satellite imagery. The classification rule-set was developed with a clear focus on virtual spatial independence within the software environment of eCognition Developer. This allows the incorporation of knowledge about the target objects under investigation. Only optical QuickBird-2 satellite data and freely available OpenStreetMap (OSM) vector data were used as input data. The OSM vector data were incorporated in order to mask out plantations and residential areas. Optical input data are more readily available for a broad range of users compared to terrain data, which is considered to be a major advantage. The methodology additionally incorporates expert knowledge and freely available vector data in a cyclic object-based image analysis approach. This connects the two fields of geomorphology and remote sensing. The classification results allow conclusions on the current distribution of gullies. The results of the classification were checked against manually delineated reference data incorporating expert knowledge based on several field campaigns in the area, resulting in an overall classification accuracy of 62%. The error of omission accounts for 38% and the error of commission for 16%. Additionally, a manual assessment was carried out to assess the quality of the applied classification algorithm. The limited error of omission contributes 23% to the overall error of omission and the limited error of commission contributes 98% to the overall error of commission. This assessment improves the results and confirms the high quality of the developed approach for area-wide mapping of gully-affected areas in larger regions. In the field of landform mapping, the overall quality of the classification results is often assessed with more than one method to incorporate all aspects adequately.

  13. A Novel Methodology to Validate the Accuracy of Extraoral Dental Scanners and Digital Articulation Systems.

    PubMed

    Ellakwa, A; Elnajar, S; Littlefair, D; Sara, G

    2018-05-03

    The aim of the current study is to develop a novel method to investigate the accuracy of 3D scanners and digital articulation systems. Upper and lower poured stone models were created by taking impressions of a fully dentate male participant (fifty years old). Titanium spheres were added to the models to allow for an easily recognisable geometric shape for measurement after scanning and digital articulation. Measurements were obtained using a Coordinate Measuring Machine (CMM) to record volumetric error, articulation error and clinical effect error. Three scanners were compared, including the Imetric 3D iScan d104i, Shining 3D AutoScan-DS100 and 3Shape D800, as well as their respective digital articulation software packages. The Stoneglass Industries PDC digital articulation system was also applied to the Imetric scans for comparison with the CMM measurements. All the scans displayed low volumetric error (p > 0.05), indicating that the scanners themselves had a minor contribution to the articulation and clinical effect errors. The PDC digital articulation system was found to deliver the lowest average errors, with good repeatability of results. The new measuring technique in the current study was able to assess the scanning and articulation accuracy of the four systems investigated. The PDC digital articulation system using Imetric scans was recommended as it displayed the lowest articulation error and clinical effect error with good repeatability. The low errors from the PDC system may have been due to its use of a 3D axis for alignment rather than the use of a best fit. Copyright © 2018 Dennis Barber Ltd.

  14. Demarcation of potentially mineral-deficient areas in central and northern Namibia by means of natural classification systems.

    PubMed

    Grant, C C; Biggs, H C; Meissner, H H

    1996-06-01

    Mineral deficiencies that lead to production losses often occur concurrently with climatic and management changes. To diagnose these deficiencies in time to prevent production losses, long-term monitoring of mineral status is advisable. Different classification systems were examined to determine whether areas of possible mineral deficiencies could be identified, so that those which were promising could then be selected for further monitoring purposes. The classification systems addressed differences in soil, vegetation and geology, and were used to define the cattle-ranching areas in the central and northern districts of Namibia. Copper (Cu), iron (Fe), zinc (Zn), manganese (Mn) and cobalt (Co) concentrations were determined in cattle livers collected at abattoirs. Pooled faecal grab samples and milk samples were collected by farmers and used to determine phosphorus (P) and calcium (Ca), and iodine (I) status, respectively. Areas of low P concentrations could be identified by all classification systems. The lowest P concentrations were recorded in samples from the Kalahari-sand area, whereas faecal samples collected from cattle on farms in the more arid areas, where the harder soils are mostly found, rarely showed low P concentrations. In the north of the country, low iodine levels were found in milk samples collected from cows grazing on farms in the northern Kalahari broad-leaved woodland. Areas supporting animals with marginal Cu status could be effectively identified by the detailed soil-classification system of irrigation potential. Copper concentrations were lowest in areas of arid soils, but no indication of Co, Fe, Zn, or Mn deficiencies was found. For most minerals, the geological classification was the best single indicator of areas of lower concentrations. Significant monthly variation for all minerals could also be detected within the classification system. It is concluded that specific classification systems can be useful as indicators of areas with lower mineral concentrations or possible deficiencies.

  15. Maximum-likelihood techniques for joint segmentation-classification of multispectral chromosome images.

    PubMed

    Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L

    2005-12-01

    Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.

  16. Classification accuracy on the family planning participation status using kernel discriminant analysis

    NASA Astrophysics Data System (ADS)

    Kurniawan, Dian; Suparti; Sugito

    2018-05-01

    Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia had reached 237.6 million people. Therefore, to control the population growth rate, the government runs the Family Planning, or Keluarga Berencana (KB), program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children and to realize a prosperous society by controlling births while ensuring control of population growth. The data used in this study are the updated family data of Semarang city in 2016 collected by the National Family Planning Coordinating Board (BKKBN). From these data, classifiers based on kernel discriminant analysis are obtained, along with the classification accuracy of that method. The results of the analysis show that the normal kernel discriminant analysis gives 71.05% classification accuracy with 28.95% classification error, whereas the triweight kernel discriminant analysis gives 73.68% classification accuracy with 26.32% classification error. Thus, using the triweight kernel discriminant for the family planning participation data of childbearing-age couples in Semarang City in 2016 can be considered better than using the normal kernel discriminant.
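
    A minimal sketch of kernel discriminant analysis with the two kernels named above: class densities are estimated with a product kernel (normal or triweight) and samples are assigned to the class with the larger prior-weighted density. Bandwidths and the two-group synthetic data are assumptions; the BKKBN family data are not used.

```python
# Sketch: kernel discriminant analysis with normal and triweight kernels. Class
# densities use a product kernel per feature; bandwidths and data are assumptions.
import numpy as np

rng = np.random.default_rng(7)

def normal_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

def triweight_kernel(u):
    return np.where(np.abs(u) <= 1, 35.0 / 32.0 * (1 - u ** 2) ** 3, 0.0)

def kde(x_eval, x_train, h, kernel):
    # Product-kernel density estimate averaged over training points.
    u = (x_eval[:, None, :] - x_train[None, :, :]) / h
    return np.prod(kernel(u), axis=2).mean(axis=1) / h ** x_train.shape[1]

def classify(x_eval, groups, h, kernel):
    priors = np.array([len(g) for g in groups], dtype=float)
    priors /= priors.sum()
    dens = np.stack([kde(x_eval, g, h, kernel) for g in groups], axis=1)
    return np.argmax(dens * priors, axis=1)

# Two synthetic groups standing in for KB participants and non-participants.
g0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(150, 2))
g1 = rng.normal(loc=[1.2, 0.8], scale=1.2, size=(150, 2))
X = np.vstack([g0, g1])
y = np.repeat([0, 1], 150)

for name, kern, h in [("normal", normal_kernel, 0.5), ("triweight", triweight_kernel, 1.2)]:
    acc = (classify(X, [g0, g1], h, kern) == y).mean()
    print(f"{name:9s} kernel: accuracy {acc:.1%}, classification error {1 - acc:.1%}")
```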

  17. A parametric multiclass Bayes error estimator for the multispectral scanner spatial model performance evaluation

    NASA Technical Reports Server (NTRS)

    Mobasseri, B. G.; Mcgillem, C. D.; Anuta, P. E. (Principal Investigator)

    1978-01-01

    The author has identified the following significant results. The probability of correct classification of various populations in the data was defined as the primary performance index. Because the multispectral data are also multiclass in nature, a Bayes error estimation procedure that depends on a set of class statistics alone was required. The classification error was expressed in terms of an N-dimensional integral, where N was the dimensionality of the feature space. The multispectral scanner spatial model was represented by a linear shift-invariant multiple-port system in which the N spectral bands comprised the input processes. The scanner characteristic function, the relationship governing the transformation of the input spatial, and hence spectral, correlation matrices through the system, was developed.

  18. Using Gaussian mixture models to detect and classify dolphin whistles and pulses.

    PubMed

    Peso Parada, Pablo; Cardenal-López, Antonio

    2014-06-01

    In recent years, a number of automatic detection systems for free-ranging cetaceans have been proposed that aim to detect not just surfaced, but also submerged, individuals. These systems are typically based on pattern-recognition techniques applied to underwater acoustic recordings. Using a Gaussian mixture model, a classification system was developed that detects sounds in recordings and classifies them as one of four types: background noise, whistles, pulses, and combined whistles and pulses. The classifier was tested using a database of underwater recordings made off the Spanish coast during 2011. Using cepstral-coefficient-based parameterization, a sound detection rate of 87.5% was achieved for a 23.6% classification error rate. To improve these results, two parameters computed using the multiple signal classification algorithm and an unpredictability measure were included in the classifier. These parameters, which helped to classify the segments containing whistles, increased the detection rate to 90.3% and reduced the classification error rate to 18.1%. Finally, the potential of the multiple signal classification algorithm and unpredictability measure for estimating whistle contours and classifying cetacean species was also explored, with promising results.
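
    A schematic sketch of the GMM approach: one Gaussian mixture per sound class, with a segment assigned to the class whose mixture gives the highest summed log-likelihood over its frames. The cepstral-style features here are synthetic; mixture sizes and covariance type are assumptions.

```python
# Sketch: one Gaussian mixture model per sound class; a segment is assigned to the
# class with the highest summed frame log-likelihood. Features are synthetic
# cepstral-style vectors; mixture size and covariance type are assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(8)
classes = ["noise", "whistle", "pulse", "whistle+pulse"]
n_feat = 12                                     # e.g. cepstral coefficients per frame

# Fixed synthetic class centres (two modes per class) used for train and test data.
centers = {k: rng.normal(scale=3.0, size=(2, n_feat)) + 5 * k for k in range(len(classes))}

def sample_class(k, n=400):
    return np.vstack([rng.normal(c, 1.0, size=(n // 2, n_feat)) for c in centers[k]])

def fit_class_model(k):
    gmm = GaussianMixture(n_components=2, covariance_type="diag", random_state=0)
    return gmm.fit(sample_class(k))

models = {k: fit_class_model(k) for k in range(len(classes))}

def classify_segment(frames):
    scores = [models[k].score_samples(frames).sum() for k in range(len(classes))]
    return classes[int(np.argmax(scores))]

test_segment = sample_class(2, n=40)            # frames drawn from the "pulse" class
print("predicted class:", classify_segment(test_segment))
```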

  19. Global land cover mapping: a review and uncertainty analysis

    USGS Publications Warehouse

    Congalton, Russell G.; Gu, Jianyu; Yadav, Kamini; Thenkabail, Prasad S.; Ozdogan, Mutlu

    2014-01-01

    Given the advances in remotely sensed imagery and associated technologies, several global land cover maps have been produced in recent times including IGBP DISCover, UMD Land Cover, Global Land Cover 2000 and GlobCover 2009. However, the utility of these maps for specific applications has often been hampered due to considerable amounts of uncertainties and inconsistencies. A thorough review of these global land cover projects including evaluating the sources of error and uncertainty is prudent and enlightening. Therefore, this paper describes our work in which we compared, summarized and conducted an uncertainty analysis of the four global land cover mapping projects using an error budget approach. The results showed that the classification scheme and the validation methodology had the highest error contribution and implementation priority. A comparison of the classification schemes showed that there are many inconsistencies between the definitions of the map classes. This is especially true for the mixed type classes for which thresholds vary for the attributes/discriminators used in the classification process. Examination of these four global mapping projects provided quite a few important lessons for the future global mapping projects including the need for clear and uniform definitions of the classification scheme and an efficient, practical, and valid design of the accuracy assessment.

  20. Defining and classifying medical error: lessons for patient safety reporting systems.

    PubMed

    Tamuz, M; Thomas, E J; Franchois, K E

    2004-02-01

    It is important for healthcare providers to report safety related events, but little attention has been paid to how the definition and classification of events affects a hospital's ability to learn from its experience. To examine how the definition and classification of safety related events influences key organizational routines for gathering information, allocating incentives, and analyzing event reporting data. In semi-structured interviews, professional staff and administrators in a tertiary care teaching hospital and its pharmacy were asked to describe the existing programs designed to monitor medication safety, including the reporting systems. With a focus primarily on the pharmacy staff, interviews were audio recorded, transcribed, and analyzed using qualitative research methods. Eighty six interviews were conducted, including 36 in the hospital pharmacy. Examples are presented which show that: (1) the definition of an event could lead to under-reporting; (2) the classification of a medication error into alternative categories can influence the perceived incentives and disincentives for incident reporting; (3) event classification can enhance or impede organizational routines for data analysis and learning; and (4) routines that promote organizational learning within the pharmacy can reduce the flow of medication error data to the hospital. These findings from one hospital raise important practical and research questions about how reporting systems are influenced by the definition and classification of safety related events. By understanding more clearly how hospitals define and classify their experience, we may improve our capacity to learn and ultimately improve patient safety.

  1. Fisher classifier and its probability of error estimation

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    Computationally efficient expressions are derived for estimating the probability of error using the leave-one-out method. The optimal threshold for the classification of patterns projected onto Fisher's direction is derived. A simple generalization of the Fisher classifier to multiple classes is presented. Computational expressions are developed for estimating the probability of error of the multiclass Fisher classifier.
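
    A brief sketch of a two-class Fisher discriminant with a leave-one-out estimate of its probability of error. The midpoint threshold on the projected means is a simplification; the paper derives an optimal threshold, which is not reproduced here.

```python
# Sketch: two-class Fisher linear discriminant with a leave-one-out estimate of the
# probability of error; a midpoint threshold is used (the paper derives an optimal one).
import numpy as np

rng = np.random.default_rng(9)
X = np.vstack([rng.multivariate_normal([0, 0, 0], np.eye(3), 100),
               rng.multivariate_normal([1.5, 1.0, 0.5], np.eye(3), 100)])
y = np.repeat([0, 1], 100)

def fisher_fit(X, y):
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    w = np.linalg.solve(Sw, m1 - m0)              # Fisher direction
    thr = 0.5 * (m0 + m1) @ w                     # midpoint of projected class means
    return w, thr

def fisher_predict(X, w, thr):
    return (X @ w > thr).astype(int)

errors = 0
for i in range(len(X)):                           # leave-one-out error estimation
    mask = np.arange(len(X)) != i
    w, thr = fisher_fit(X[mask], y[mask])
    errors += int(fisher_predict(X[i:i + 1], w, thr)[0] != y[i])
print("leave-one-out error estimate:", errors / len(X))
```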

  2. Human factors analysis and classification system-HFACS.

    DOT National Transportation Integrated Search

    2000-02-01

    Human error has been implicated in 70 to 80% of all civil and military aviation accidents. Yet, most accident : reporting systems are not designed around any theoretical framework of human error. As a result, most : accident databases are not conduci...

  3. IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY

    EPA Science Inventory

    Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
    Currently, most thematic accuracy assessments of classified remotely sensed images only account for errors between the various classes employed, at particular pixels of interest, thu...

  4. Sea ice classification using fast learning neural networks

    NASA Technical Reports Server (NTRS)

    Dawson, M. S.; Fung, A. K.; Manry, M. T.

    1992-01-01

    A fast learning neural network approach to the classification of sea ice is presented. The fast learning (FL) neural network and a multilayer perceptron (MLP) trained with backpropagation learning (BP network) were tested on simulated data sets based on the known dominant scattering characteristics of the target class. Four classes were used in the data simulation: open water, thick lossy saline ice, thin saline ice, and multiyear ice. The BP network was unable to consistently converge to less than 25 percent error while the FL method yielded an average error of approximately 1 percent on the first iteration of training. The fast learning method presented can significantly reduce the CPU time necessary to train a neural network as well as consistently yield higher classification accuracy than BP networks.

  5. High-density force myography: A possible alternative for upper-limb prosthetic control.

    PubMed

    Radmand, Ashkan; Scheme, Erik; Englehart, Kevin

    2016-01-01

    Several multiple degree-of-freedom upper-limb prostheses that have the promise of highly dexterous control have recently been developed. Inadequate controllability, however, has limited adoption of these devices. Introducing more robust control methods will likely result in higher acceptance rates. This work investigates the suitability of using high-density force myography (HD-FMG) for prosthetic control. HD-FMG uses a high-density array of pressure sensors to detect changes in the pressure patterns between the residual limb and socket caused by the contraction of the forearm muscles. In this work, HD-FMG outperforms the standard electromyography (EMG)-based system in detecting different wrist and hand gestures. With the arm in a fixed, static position, eight hand and wrist motions were classified with 0.33% error using the HD-FMG technique. Comparatively, classification errors in the range of 2.2%-11.3% have been reported in the literature for multichannel EMG-based approaches. As with EMG, position variation in HD-FMG can introduce classification error, but incorporating position variation into the training protocol reduces this effect. Channel reduction was also applied to the HD-FMG technique to decrease the dimensionality of the problem as well as the size of the sensorized area. We found that with informed, symmetric channel reduction, classification error could be decreased to 0.02%.

  6. Improved wetland remote sensing in Yellowstone National Park using classification trees to combine TM imagery and ancillary environmental data

    USGS Publications Warehouse

    Wright, C.; Gallant, Alisa L.

    2007-01-01

    The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub–shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. The developed method is portable, relatively easy to implement, and should be applicable in other settings and over larger extents.

  7. Common component classification: what can we learn from machine learning?

    PubMed

    Anderson, Ariana; Labus, Jennifer S; Vianna, Eduardo P; Mayer, Emeran A; Cohen, Mark S

    2011-05-15

    Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classification machines' power is due to artifactual noise versus genuine neurological signal. To examine the true strength and power of machine learning classifiers we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal and show that removal of such artifacts can reduce predictive accuracy even when data has been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error seen in publication to be a biased estimate of the testing error seen in practice and measure this bias by purposefully making flawed models. We discuss other ways to introduce bias and the statistical assumptions lying behind the data and model themselves. Finally we discuss the complications in drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on the Type 1 and Type 2 error rates, and how publication bias can give a false confidence of the power of such methods. Collectively this work identifies challenges specific to fMRI classification and methods affecting the stability of models. Copyright © 2010 Elsevier Inc. All rights reserved.
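
    The feature-selection pitfall described above is easy to demonstrate: on pure-noise data (true accuracy 50%), selecting features on the full data set before cross-validation gives an optimistic estimate, while refitting the selection inside each fold does not. The pipeline remedy shown is a standard construction, not necessarily the authors' procedure.

```python
# Sketch: on pure-noise data, feature selection performed outside cross-validation
# inflates the estimated accuracy; performing it inside each fold does not.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(10)
X = rng.normal(size=(60, 5000))                  # 60 "scans", 5000 noise features
y = rng.integers(0, 2, size=60)                  # labels carry no signal

# Flawed protocol: pick the 20 "best" features on ALL the data, then cross-validate.
X_sel = SelectKBest(f_classif, k=20).fit_transform(X, y)
biased = cross_val_score(SVC(), X_sel, y, cv=5).mean()

# Correct protocol: the selector is refit inside every training fold.
pipe = make_pipeline(SelectKBest(f_classif, k=20), SVC())
unbiased = cross_val_score(pipe, X, y, cv=5).mean()

print(f"selection outside CV: {biased:.2f} accuracy (optimistically biased)")
print(f"selection inside CV:  {unbiased:.2f} accuracy (near chance)")
```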

  8. J-Plus: Morphological Classification Of Compact And Extended Sources By Pdf Analysis

    NASA Astrophysics Data System (ADS)

    López-Sanjuan, C.; Vázquez-Ramió, H.; Varela, J.; Spinoso, D.; Cristóbal-Hornillos, D.; Viironen, K.; Muniesa, D.; J-PLUS Collaboration

    2017-10-01

    We present a morphological classification of J-PLUS EDR sources into compact (i.e. stars) and extended (i.e. galaxies). The classification is based on Bayesian modelling of the concentration distribution, including observational errors and magnitude + sky position priors. We provide the star / galaxy probability of each source computed from the gri images. The comparison with the SDSS number counts supports our classification up to r ∼ 21. The 31.7 deg² analysed comprises 150k stars and 101k galaxies.

  9. Crop identification from radar imagery of the Huntington County, Indiana test site

    NASA Technical Reports Server (NTRS)

    Batlivala, P. P.; Ulaby, F. T. (Principal Investigator)

    1975-01-01

    The author has identified the following significant results. Like polarization was successful in discriminating corn and soybeans; however, pasture and woods were consistently confused as soybeans and corn, respectively. The probability of correct classification was about 65%. The cross polarization component (highest for woods and lowest for pasture) helped in separating the woods from corn, and pasture from soybeans, and when used with the like polarization component, the probability of correct classification increased to 74%.

  10. Halftoning Algorithms and Systems.

    DTIC Science & Technology

    1996-08-01

    [Report documentation text; OCR-garbled in the source record.] Subject terms: halftoning algorithms; error diffusion; color printing; topographic maps. Recoverable content describes a calibration procedure that assigns graylevels to each screen level; for error diffusion algorithms, the calibration uses a novel centering concept for overlap correction on paper and transparency (patent applied for 5/94), with applications to error diffusion and to dithering.

  11. Simulation techniques for estimating error in the classification of normal patterns

    NASA Technical Reports Server (NTRS)

    Whitsitt, S. J.; Landgrebe, D. A.

    1974-01-01

    Methods of efficiently generating and classifying samples with specified multivariate normal distributions were discussed. Conservative confidence tables for sample sizes are given for selective sampling. Simulation results are compared with classified training data. Techniques for comparing error and separability measure for two normal patterns are investigated and used to display the relationship between the error and the Chernoff bound.

  12. Optimal number of features as a function of sample size for various classification rules.

    PubMed

    Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R

    2005-04-15

    Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as a resource for those working with small-sample classification. The companion website is available at http://public.tgen.org/tamu/ofs/. Contact: e-dougherty@ee.tamu.edu.
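
    A small simulation in the spirit of the peaking phenomenon discussed above: for a fixed small training sample, the test error of an LDA classifier first falls and then rises as progressively weaker features are added. The feature model and sample sizes are illustrative assumptions, not the study's Gaussian or patient-data models.

```python
# Sketch: peaking of the test error as features are added for a fixed small training
# sample. Class-mean differences shrink with feature index, so later features are
# weaker; errors are averaged over repeated draws. All settings are illustrative.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(11)
n_train, n_test, d_max = 30, 2000, 30
delta = 1.5 / np.sqrt(np.arange(1, d_max + 1))   # decaying class-mean separation

def sample(n):
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d_max)) + y[:, None] * delta
    return X, y

d_grid = (2, 5, 10, 20, 30)
errs = {d: [] for d in d_grid}
for _ in range(50):                              # average over repeated training samples
    X_tr, y_tr = sample(n_train)
    X_te, y_te = sample(n_test)
    for d in d_grid:
        clf = LinearDiscriminantAnalysis().fit(X_tr[:, :d], y_tr)
        errs[d].append(1 - accuracy_score(y_te, clf.predict(X_te[:, :d])))

for d in d_grid:
    print(f"features = {d:2d}   mean test error = {np.mean(errs[d]):.3f}")
```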

  13. The Human Factors Analysis and Classification System : HFACS : final report.

    DOT National Transportation Integrated Search

    2000-02-01

    Human error has been implicated in 70 to 80% of all civil and military aviation accidents. Yet, most accident reporting systems are not designed around any theoretical framework of human error. As a result, most accident databases are not conducive t...

  14. A Confidence Paradigm for Classification Systems

    DTIC Science & Technology

    2008-09-01

    methodology to determine how much confidence one should have in a classifier output. This research proposes a framework to determine the level of...theoretical framework that attempts to unite the viewpoints of the classification system developer (or engineer) and the classification system user (or...operating point. An algorithm is developed that minimizes a "confidence" measure called Binned Error in the Posterior (BEP). Then, we prove that training a

  15. A False Alarm Reduction Method for a Gas Sensor Based Electronic Nose

    PubMed Central

    Rahman, Mohammad Mizanur; Suksompong, Prapun; Toochinda, Pisanu; Taparugssanagorn, Attaphongse

    2017-01-01

    Electronic noses (E-Noses) are becoming popular for food and fruit quality assessment due to their robustness and repeated usability without fatigue, unlike human experts. An E-Nose equipped with classification algorithms and having open ended classification boundaries such as the k-nearest neighbor (k-NN), support vector machine (SVM), and multilayer perceptron neural network (MLPNN), are found to suffer from false classification errors of irrelevant odor data. To reduce false classification and misclassification errors, and to improve correct rejection performance; algorithms with a hyperspheric boundary, such as a radial basis function neural network (RBFNN) and generalized regression neural network (GRNN) with a Gaussian activation function in the hidden layer should be used. The simulation results presented in this paper show that GRNN has more correct classification efficiency and false alarm reduction capability compared to RBFNN. As the design of a GRNN and RBFNN is complex and expensive due to large numbers of neuron requirements, a simple hyperspheric classification method based on minimum, maximum, and mean (MMM) values of each class of the training dataset was presented. The MMM algorithm was simple and found to be fast and efficient in correctly classifying data of training classes, and correctly rejecting data of extraneous odors, and thereby reduced false alarms. PMID:28895910

  16. A False Alarm Reduction Method for a Gas Sensor Based Electronic Nose.

    PubMed

    Rahman, Mohammad Mizanur; Charoenlarpnopparut, Chalie; Suksompong, Prapun; Toochinda, Pisanu; Taparugssanagorn, Attaphongse

    2017-09-12

    Electronic noses (E-Noses) are becoming popular for food and fruit quality assessment due to their robustness and repeated usability without fatigue, unlike human experts. An E-Nose equipped with classification algorithms and having open ended classification boundaries such as the k-nearest neighbor (k-NN), support vector machine (SVM), and multilayer perceptron neural network (MLPNN), are found to suffer from false classification errors of irrelevant odor data. To reduce false classification and misclassification errors, and to improve correct rejection performance; algorithms with a hyperspheric boundary, such as a radial basis function neural network (RBFNN) and generalized regression neural network (GRNN) with a Gaussian activation function in the hidden layer should be used. The simulation results presented in this paper show that GRNN has more correct classification efficiency and false alarm reduction capability compared to RBFNN. As the design of a GRNN and RBFNN is complex and expensive due to large numbers of neuron requirements, a simple hyperspheric classification method based on minimum, maximum, and mean (MMM) values of each class of the training dataset was presented. The MMM algorithm was simple and found to be fast and efficient in correctly classifying data of training classes, and correctly rejecting data of extraneous odors, and thereby reduced false alarms.
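
    A schematic numpy sketch of the MMM idea described in the two records above: each odor class is summarized by per-feature minimum, maximum, and mean; a sample is accepted only by classes whose min-max bounds contain it (nearest class mean breaking ties, an added assumption), and samples outside every class boundary are rejected as extraneous odors. Sensor data are synthetic.

```python
# Sketch: MMM-style classification. Each class keeps per-feature min, max and mean;
# samples outside every class's bounds are rejected as extraneous. Sensor readings
# are synthetic; nearest-mean tie-breaking is an added assumption.
import numpy as np

rng = np.random.default_rng(12)
n_sensors, n_train = 6, 100

train = {"banana": rng.normal(1.0, 0.2, size=(n_train, n_sensors)),
         "mango": rng.normal(2.0, 0.2, size=(n_train, n_sensors))}

stats = {name: (X.min(axis=0), X.max(axis=0), X.mean(axis=0))
         for name, X in train.items()}

def classify(x):
    candidates = []
    for name, (lo, hi, mean) in stats.items():
        if np.all((x >= lo) & (x <= hi)):                 # inside the class boundaries
            candidates.append((np.linalg.norm(x - mean), name))
    return min(candidates)[1] if candidates else "rejected (extraneous odor)"

print(classify(rng.normal(1.0, 0.1, size=n_sensors)))     # trained odor: likely "banana"
print(classify(rng.normal(5.0, 0.1, size=n_sensors)))     # unknown odor: likely rejected
```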

  17. Differences in chewing sounds of dry-crisp snacks by multivariate data analysis

    NASA Astrophysics Data System (ADS)

    De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.

    2003-09-01

    Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.

  18. Kernel Wiener filter and its application to pattern recognition.

    PubMed

    Yoshino, Hirokazu; Dong, Chen; Washizawa, Yoshikazu; Yamashita, Yukihiko

    2010-11-01

    The Wiener filter (WF) is widely used for inverse problems. From an observed signal, it provides the best estimated signal with respect to the squared error, averaged over the original and the observed signals, among linear operators. The kernel WF (KWF), extended directly from the WF, has the drawback that additive noise has to be handled through samples. Since the computational complexity of kernel methods depends on the number of samples, this incurs a huge computational cost. By using a first-order approximation of the kernel functions, we realize a KWF that can handle such noise not through samples but as a random variable. We also propose an error estimation method for kernel filters using these approximations. To show the advantages of the proposed methods, we conducted experiments on image denoising and error estimation. We also apply KWF to classification, since KWF can provide an approximated result of the maximum a posteriori classifier, which provides the best recognition accuracy. The noise term in the criterion can be used for classification in the presence of noise, or as a new regularization that suppresses changes in the input space, whereas the ordinary regularization for kernel methods suppresses changes in the feature space. To show the advantages of the proposed methods, we conducted experiments on binary and multiclass classification and on classification in the presence of noise.

  19. Consequences of land-cover misclassification in models of impervious surface

    USGS Publications Warehouse

    McMahon, G.

    2007-01-01

    Model estimates of impervious area as a function of land-cover area may be biased and imprecise because of errors in the land-cover classification. This investigation of the effects of land-cover misclassification on impervious surface models that use National Land Cover Data (NLCD) evaluates the consequences of adjusting land cover within a watershed to reflect uncertainty assessment information. Model validation results indicate that using error-matrix information to adjust land-cover values used in impervious surface models does not substantially improve impervious surface predictions. Validation results indicate that the resolution of the land-cover data (Level I and Level II) is more important for predicting impervious surface accurately than whether the land-cover data have been adjusted using information in the error matrix. Level I NLCD, adjusted for land-cover misclassification, is preferable to the other land-cover options for use in models of impervious surface. This result is tied to the lower classification error rates for the Level I NLCD. © 2007 American Society for Photogrammetry and Remote Sensing.

  20. Improved classification accuracy by feature extraction using genetic algorithms

    NASA Astrophysics Data System (ADS)

    Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.

    2003-05-01

    A feature extraction algorithm has been developed for the purpose of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature-extractor-derived features yielded lower error rates than classification using standard pulse sequences or features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.

  1. Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models

    PubMed Central

    Rice, John D.; Taylor, Jeremy M. G.

    2016-01-01

    One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but differs from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492
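
    A rough Python sketch of the idea is given below: a logistic regression is refit with Gaussian kernel weights centred at the classification threshold, so observations whose fitted probabilities lie near the threshold dominate the fit. The paper solves weighted score equations and selects the bandwidth by cross-validation; this alternating fit-and-reweight loop with a fixed bandwidth is only an approximation for illustration.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      def locally_weighted_logistic(X, y, threshold=0.3, bandwidth=0.1, n_iter=10):
          """Approximate locally weighted logistic fit for a given probability threshold."""
          model = LogisticRegression()
          w = np.ones(len(y))
          for _ in range(n_iter):
              model.fit(X, y, sample_weight=w)
              p = model.predict_proba(X)[:, 1]
              # Gaussian kernel weight centred at the threshold
              w = np.exp(-0.5 * ((p - threshold) / bandwidth) ** 2)
              w = np.clip(w, 1e-3, None)  # keep every observation slightly weighted
          return model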

  2. Error detection and reduction in blood banking.

    PubMed

    Motschman, T L; Moore, S B

    1996-12-01

    Error management plays a major role in facility process improvement efforts. By detecting and reducing errors, quality and, therefore, patient care improve. It begins with a strong organizational foundation of management attitude, with clear, consistent employee direction and appropriate physical facilities. Clearly defined critical processes, critical activities, and SOPs act as the framework for operations as well as active quality monitoring. To assure that personnel can detect and report errors, they must be trained in both operational duties and error management practices. Use of simulated/intentional errors and incorporation of error detection into competency assessment keep employees practiced and confident, and diminish fear of the unknown. Personnel can clearly see that errors are indeed used as opportunities for process improvement and not for punishment. The facility must have a clearly defined and consistently used definition for reportable errors. Reportable errors should include those errors with potentially harmful outcomes as well as those errors that are "upstream," and thus further away from the outcome. A well-written error report consists of who, what, when, where, why/how, and follow-up to the error. Before correction can occur, an investigation to determine the underlying cause of the error should be undertaken. Obviously, the best corrective action is prevention. Correction can occur at five different levels; however, only three of these levels are directed at prevention. Prevention requires a method to collect and analyze data concerning errors. In the authors' facility, a functional error classification method and a quality system-based classification have been useful. An active method to search for problems uncovers them further upstream, before they can have disastrous outcomes. In the continual quest to improve processes, an error management program is itself a process that needs improvement, and we must strive to always close the circle of quality assurance. Ultimately, the goal of better patient care will be the reward.

  3. Performance Evaluation of Three Blood Glucose Monitoring Systems Using ISO 15197: 2013 Accuracy Criteria, Consensus and Surveillance Error Grid Analyses, and Insulin Dosing Error Modeling in a Hospital Setting.

    PubMed

    Bedini, José Luis; Wallace, Jane F; Pardo, Scott; Petruschke, Thorsten

    2015-10-07

    Blood glucose monitoring is an essential component of diabetes management. Inaccurate blood glucose measurements can severely impact patients' health. This study evaluated the performance of 3 blood glucose monitoring systems (BGMS), Contour® Next USB, FreeStyle InsuLinx®, and OneTouch® Verio™ IQ, under routine hospital conditions. Venous blood samples (N = 236) obtained for routine laboratory procedures were collected at a Spanish hospital, and blood glucose (BG) concentrations were measured with each BGMS and with the available reference (hexokinase) method. Accuracy of the 3 BGMS was compared according to ISO 15197:2013 accuracy limit criteria, by mean absolute relative difference (MARD), consensus error grid (CEG) and surveillance error grid (SEG) analyses, and an insulin dosing error model. All BGMS met the accuracy limit criteria defined by ISO 15197:2013. While all measurements of the 3 BGMS were within low-risk zones in both error grid analyses, the Contour Next USB showed significantly smaller MARDs from the reference values than the other 2 BGMS. Insulin dosing errors were also lowest for the Contour Next USB compared to the other systems. All BGMS fulfilled the ISO 15197:2013 accuracy limit criteria and the CEG criterion. However, taking all analyses together, differences in performance of potential clinical relevance may be observed. Results showed that the Contour Next USB had the lowest MARD values across the tested glucose range compared with the 2 other BGMS. CEG and SEG analyses as well as calculation of the hypothetical bolus insulin dosing error suggest a high accuracy of the Contour Next USB. © 2015 Diabetes Technology Society.
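
    For reference, the MARD statistic used to compare the meters is just the mean of the absolute relative differences between meter and reference readings, expressed as a percentage; a minimal sketch (with made-up example values) is shown below.

      import numpy as np

      def mard(meter, reference):
          """Mean absolute relative difference (%) between meter and reference values."""
          meter = np.asarray(meter, dtype=float)
          reference = np.asarray(reference, dtype=float)
          return 100.0 * np.mean(np.abs(meter - reference) / reference)

      # e.g. mard([105, 148, 72], [100, 150, 75]) is roughly 3.4 %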

  4. Modified Bat Algorithm for Feature Selection with the Wisconsin Diagnosis Breast Cancer (WDBC) Dataset

    PubMed

    Jeyasingh, Suganthi; Veluchamy, Malathi

    2017-05-01

    Early diagnosis of breast cancer is essential to save the lives of patients. Usually, medical datasets include a large variety of data that can lead to confusion during diagnosis. The Knowledge Discovery in Databases (KDD) process helps to improve efficiency. It requires elimination of inappropriate and repeated data from the dataset before final diagnosis. This can be done using any of the feature selection algorithms available in data mining. Feature selection is considered a vital step to increase classification accuracy. This paper proposes a Modified Bat Algorithm (MBA) for feature selection to eliminate irrelevant features from an original dataset. The Bat algorithm was modified using simple random sampling to select random instances from the dataset. Ranking against the global best features was used to recognize the predominant features in the dataset. The selected features are used to train a Random Forest (RF) classification algorithm. The MBA feature selection algorithm enhanced the classification accuracy of RF in identifying the occurrence of breast cancer. The Wisconsin Diagnosis Breast Cancer (WDBC) dataset was used for the performance analysis of the proposed MBA feature selection algorithm. The proposed algorithm achieved better performance in terms of Kappa statistic, Matthews Correlation Coefficient, Precision, F-measure, Recall, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE).

  5. Simulated rRNA/DNA Ratios Show Potential To Misclassify Active Populations as Dormant

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steven, Blaire; Hesse, Cedar; Soghigian, John

    The use of rRNA/DNA ratios derived from surveys of rRNA sequences in RNA and DNA extracts is an appealing but poorly validated approach to infer the activity status of environmental microbes. To improve the interpretation of rRNA/DNA ratios, we performed simulations to investigate the effects of community structure, rRNA amplification, and sampling depth on the accuracy of rRNA/DNA ratios in classifying bacterial populations as “active” or “dormant.” Community structure was an insignificant factor. In contrast, the extent of rRNA amplification that occurs as cells transition from dormant to growing had a significant effect (P < 0.0001) on classification accuracy, with misclassification errors ranging from 16 to 28%, depending on the rRNA amplification model. The error rate increased to 47% when communities included a mixture of rRNA amplification models, but most of the inflated error was false negatives (i.e., active populations misclassified as dormant). Sampling depth also affected error rates (P < 0.001). Inadequate sampling depth produced various artifacts that are characteristic of rRNA/DNA ratios generated from real communities. These data show important constraints on the use of rRNA/DNA ratios to infer activity status. Whereas classification of populations as active based on rRNA/DNA ratios appears generally valid, classification of populations as dormant is potentially far less accurate.
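
    The classification heuristic being simulated can be written in a couple of lines; the sketch below labels a population "active" when its rRNA:DNA ratio of relative abundances reaches a threshold (1.0 here, an assumption), which is the kind of rule whose error rates the study quantifies.

      import numpy as np

      def classify_activity(rrna_rel_abund, dna_rel_abund, threshold=1.0):
          """Label populations active/dormant from rRNA:DNA relative-abundance ratios."""
          ratio = np.asarray(rrna_rel_abund, float) / np.asarray(dna_rel_abund, float)
          return np.where(ratio >= threshold, "active", "dormant"), ratio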

  6. Simulated rRNA/DNA Ratios Show Potential To Misclassify Active Populations as Dormant

    DOE PAGES

    Steven, Blaire; Hesse, Cedar; Soghigian, John; ...

    2017-03-31

    The use of rRNA/DNA ratios derived from surveys of rRNA sequences in RNA and DNA extracts is an appealing but poorly validated approach to infer the activity status of environmental microbes. To improve the interpretation of rRNA/DNA ratios, we performed simulations to investigate the effects of community structure, rRNA amplification, and sampling depth on the accuracy of rRNA/DNA ratios in classifying bacterial populations as “active” or “dormant.” Community structure was an insignificant factor. In contrast, the extent of rRNA amplification that occurs as cells transition from dormant to growing had a significant effect (P < 0.0001) on classification accuracy, with misclassification errors ranging from 16 to 28%, depending on the rRNA amplification model. The error rate increased to 47% when communities included a mixture of rRNA amplification models, but most of the inflated error was false negatives (i.e., active populations misclassified as dormant). Sampling depth also affected error rates (P < 0.001). Inadequate sampling depth produced various artifacts that are characteristic of rRNA/DNA ratios generated from real communities. These data show important constraints on the use of rRNA/DNA ratios to infer activity status. Whereas classification of populations as active based on rRNA/DNA ratios appears generally valid, classification of populations as dormant is potentially far less accurate.

  7. Regulation of IAP (Inhibitor of Apoptosis) Gene Expression by the p53 Tumor Suppressor Protein

    DTIC Science & Technology

    2005-05-01

    [Report documentation and figure residue; recoverable details: keywords include adenovirus, gene therapy, and polymorphism; figures show averaged results of three independent experiments with standard error, the level of p53 in infected cells detected with the antibody Ab-6 (Calbiochem), and oligomerized BAK in highly purified mitochondria after BMH cross-linking.]

  8. Determination of origin and sugars of citrus fruits using genetic algorithm, correspondence analysis and partial least square combined with fiber optic NIR spectroscopy

    NASA Astrophysics Data System (ADS)

    Tewari, Jagdish C.; Dixit, Vivechana; Cho, Byoung-Kwan; Malik, Kamal A.

    2008-12-01

    The capacity to confirm the variety or origin of citrus fruits and to estimate their sucrose, glucose and fructose content is of major interest to the citrus juice industry. A rapid classification and quantification technique was developed and validated for simultaneously and nondestructively quantifying the sugar constituents' concentrations and determining the origin of citrus fruits using Fourier Transform Near-Infrared (FT-NIR) spectroscopy in conjunction with an Artificial Neural Network (ANN) using a genetic algorithm, chemometrics and Correspondence Analysis (CA). To acquire good classification accuracy and to cover a wide range of concentrations of sucrose, glucose and fructose, we collected 22 different varieties of citrus fruits from the market during the entire citrus season. FT-NIR spectra were recorded in the NIR region from 1100 to 2500 nm using a fiber optic probe, and three types of data analysis were performed. Chemometric analysis using Partial Least Squares (PLS) was performed in order to determine the concentration of the individual sugars. Artificial Neural Network analysis was performed for classification, i.e. origin or variety identification of the citrus fruits, using a genetic algorithm. Correspondence analysis was performed in order to visualize the relationships between the citrus fruits. To compute a PLS model based upon reference values and to validate the developed method, high performance liquid chromatography (HPLC) was performed. The spectral range and the number of PLS factors were optimized for the lowest standard error of calibration (SEC), standard error of prediction (SEP) and the correlation coefficient (R2). The calibration model developed was able to assess the sucrose, glucose and fructose contents in unknown citrus fruit up to an R2 value of 0.996-0.998. Numbers of factors from F1 to F10 were optimized for correspondence analysis to visualize the relationships between citrus fruits based on the output values of the genetic algorithm. The ANN and CA analyses showed excellent classification of the citrus fruits according to the variety to which they belong, and classified them well according to their origin. The technique has potential for rapid determination of sugar content and for identifying different varieties and origins of citrus in the citrus juice industry.
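
    A compact Python sketch of the PLS factor-optimisation step is shown below: the number of latent factors is chosen to minimise a cross-validated root-mean-square error, in the spirit of the SEC/SEP optimisation described. The cross-validation layout and the factor range are assumptions, not the paper's exact protocol.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import cross_val_predict

      def select_pls_factors(X, y, max_factors=10, cv=5):
          """Return the factor count with the lowest cross-validated RMSE, plus all RMSEs."""
          y = np.asarray(y, dtype=float)
          rmse = []
          for k in range(1, max_factors + 1):   # max_factors must not exceed n_features
              y_hat = cross_val_predict(PLSRegression(n_components=k), X, y, cv=cv)
              rmse.append(np.sqrt(np.mean((y - y_hat.ravel()) ** 2)))
          return int(np.argmin(rmse)) + 1, rmse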

  9. Determination of origin and sugars of citrus fruits using genetic algorithm, correspondence analysis and partial least square combined with fiber optic NIR spectroscopy.

    PubMed

    Tewari, Jagdish C; Dixit, Vivechana; Cho, Byoung-Kwan; Malik, Kamal A

    2008-12-01

    The capacity to confirm the variety or origin of citrus fruits and to estimate their sucrose, glucose and fructose content is of major interest to the citrus juice industry. A rapid classification and quantification technique was developed and validated for simultaneously and nondestructively quantifying the sugar constituents' concentrations and determining the origin of citrus fruits using Fourier Transform Near-Infrared (FT-NIR) spectroscopy in conjunction with an Artificial Neural Network (ANN) using a genetic algorithm, chemometrics and Correspondence Analysis (CA). To acquire good classification accuracy and to cover a wide range of concentrations of sucrose, glucose and fructose, we collected 22 different varieties of citrus fruits from the market during the entire citrus season. FT-NIR spectra were recorded in the NIR region from 1,100 to 2,500 nm using a fiber optic probe, and three types of data analysis were performed. Chemometric analysis using Partial Least Squares (PLS) was performed in order to determine the concentration of the individual sugars. Artificial Neural Network analysis was performed for classification, i.e. origin or variety identification of the citrus fruits, using a genetic algorithm. Correspondence analysis was performed in order to visualize the relationships between the citrus fruits. To compute a PLS model based upon reference values and to validate the developed method, high performance liquid chromatography (HPLC) was performed. The spectral range and the number of PLS factors were optimized for the lowest standard error of calibration (SEC), standard error of prediction (SEP) and the correlation coefficient (R2). The calibration model developed was able to assess the sucrose, glucose and fructose contents in unknown citrus fruit up to an R2 value of 0.996-0.998. Numbers of factors from F1 to F10 were optimized for correspondence analysis to visualize the relationships between citrus fruits based on the output values of the genetic algorithm. The ANN and CA analyses showed excellent classification of the citrus fruits according to the variety to which they belong, and classified them well according to their origin. The technique has potential for rapid determination of sugar content and for identifying different varieties and origins of citrus in the citrus juice industry.

  10. An Analysis of U.S. Army Fratricide Incidents during the Global War on Terror (11 September 2001 to 31 March 2008)

    DTIC Science & Technology

    2010-03-15

    [Table-of-contents and figure residue; recoverable details: the analysis classifies fratricide incidents using a framework based on Reason's "Swiss cheese" model of human error causation (1990), under which an accident is likely to occur when all of the errors, or "holes," align; Figure 1 depicts the Swiss cheese model, and a detailed description of HFACS can be found in Wiegmann and Shappell (2003).]

  11. Classification of urban features using airborne hyperspectral data

    NASA Astrophysics Data System (ADS)

    Ganesh Babu, Bharath

    Accurate mapping and modeling of urban environments are critical for their efficient and successful management. Superior understanding of complex urban environments is made possible by using modern geospatial technologies. This research focuses on thematic classification of urban land use and land cover (LULC) using 248 bands of 2.0 meter resolution hyperspectral data acquired from an airborne imaging spectrometer (AISA+) on 24th July 2006 in and near Terre Haute, Indiana. Three distinct study areas including two commercial classes, two residential classes, and two urban parks/recreational classes were selected for classification and analysis. Four commonly used classification methods -- maximum likelihood (ML), extraction and classification of homogeneous objects (ECHO), spectral angle mapper (SAM), and iterative self organizing data analysis (ISODATA) -- were applied to each data set. Accuracy assessment was conducted and overall accuracies were compared between the twenty-four resulting thematic maps. With the exception of SAM and ISODATA in a complex commercial area, all methods employed classified the designated urban features with more than 80% accuracy. The thematic classification from ECHO showed the best agreement with ground reference samples. The residential area with relatively homogeneous composition was classified consistently with the highest accuracy by all four of the classification methods used. The average accuracy amongst the classifiers was 93.60% for this area. When individually observed, the complex recreational area (Deming Park) was classified with the highest accuracy by ECHO, with an accuracy of 96.80% and 96.10% Kappa. The average accuracy amongst all the classifiers was 92.07%. The commercial area with relatively high complexity was classified with the least accuracy by all classifiers. The lowest accuracy was achieved by SAM at 63.90% with 59.20% Kappa. This was also the lowest accuracy in the entire analysis. This study demonstrates the potential for using the visible and near infrared (VNIR) bands from AISA+ hyperspectral data in urban LULC classification. Based on their performance, the need for further research using ECHO and SAM is underscored. The importance of incorporating imaging spectrometer data in high resolution urban feature mapping is emphasized.
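
    Of the classifiers compared above, the spectral angle mapper is the simplest to state: each pixel is assigned to the class whose reference spectrum makes the smallest angle with the pixel spectrum. A minimal Python sketch follows; the reference spectra would normally be class means taken from training samples.

      import numpy as np

      def spectral_angle(pixel, reference):
          """Angle (radians) between a pixel spectrum and a class reference spectrum."""
          cos = np.dot(pixel, reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
          return np.arccos(np.clip(cos, -1.0, 1.0))

      def sam_classify(pixel, references):
          # references: mapping of class name -> reference spectrum (e.g. class mean)
          return min(references, key=lambda c: spectral_angle(pixel, references[c]))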

  12. Object-Based Land Use Classification of Agricultural Land by Coupling Multi-Temporal Spectral Characteristics and Phenological Events in Germany

    NASA Astrophysics Data System (ADS)

    Knoefel, Patrick; Loew, Fabian; Conrad, Christopher

    2015-04-01

    Crop maps based on classification of remotely sensed data are of increasing importance in agricultural management. This calls for more detailed knowledge about the reliability of such spatial information. However, classification of agricultural land use is often limited by high spectral similarities of the studied crop types. Moreover, spatially and temporally varying agro-ecological conditions can introduce confusion in crop mapping. Classification errors in crop maps in turn may influence model outputs, such as agricultural production monitoring. One major goal of the PhenoS project ("Phenological structuring to determine optimal acquisition dates for Sentinel-2 data for field crop classification") is the detection of optimal phenological time windows for land cover classification purposes. Since many crop species are spectrally highly similar, accurate classification requires the right selection of satellite images for a certain classification task. In the course of one growing season, phenological phases exist in which crops are separable with higher accuracy. For this purpose, coupling of multi-temporal spectral characteristics and phenological events is promising. The focus of this study is the separation of spectrally similar cereal crops like winter wheat, barley, and rye at two test sites in Germany called "Harz/Central German Lowland" and "Demmin". The study uses object-based random forest (RF) classification to investigate the impact of image acquisition frequency and timing on crop classification uncertainty by permuting all possible combinations of available RapidEye time series recorded on the test sites between 2010 and 2014. The permutations were applied to different segmentation parameters. Classification uncertainty was then assessed and analysed based on the probabilistic soft output from the RF algorithm at the per-field level. From this soft output, entropy was calculated as a spatial measure of classification uncertainty. The results indicate that uncertainty estimates provide a valuable addition to traditional accuracy assessments and help the user to locate error in crop maps.
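
    The per-field uncertainty measure described (entropy of the random forest's probabilistic soft output) can be computed directly from the class-membership probabilities; a short sketch using scikit-learn's random forest is given below as an illustration, not the study's exact implementation.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      def classification_entropy(fitted_rf: RandomForestClassifier, X):
          """Shannon entropy (bits) of the per-object class probabilities."""
          p = np.clip(fitted_rf.predict_proba(X), 1e-12, 1.0)
          return -np.sum(p * np.log2(p), axis=1)   # higher values = more uncertain objects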

  13. A new classification of glaucomas

    PubMed Central

    Bordeianu, Constantin-Dan

    2014-01-01

    Purpose To suggest a new glaucoma classification that is pathogenic, etiologic, and clinical. Methods After discussing the logical pathway used in criteria selection, the paper presents the new classification and compares it with the classification currently in use, that is, the one issued by the European Glaucoma Society in 2008. Results The paper shows that the new classification is clear (being based on a coherent and consistently followed set of criteria), is comprehensive (framing all forms of glaucoma), and helps in understanding the disease (in that it uses a logical framing system). The great advantage is that it facilitates therapeutic decision making, in that it offers direct therapeutic suggestions and avoids errors leading to disasters. Moreover, the scheme remains open to any new development. Conclusion The suggested classification is a pathogenic, etiologic, and clinical classification that fulfills the conditions of an ideal classification. It is the first classification in which the main criterion is consistently used for the first 5 to 7 crossings until its differentiation capabilities are exhausted. Then, secondary criteria (etiologic and clinical) pick up the relay until each form finds its logical place in the scheme. In order to avoid unclear aspects, the genetic criterion is no longer used, being replaced by age, one of the clinical criteria. The suggested classification brings only benefits to all categories of ophthalmologists: beginners will have a tool to better understand the disease and to ease their decision making, whereas experienced doctors will have their practice simplified. For all doctors, errors leading to therapeutic disasters will be less likely to happen. Finally, researchers will have the object of their work gathered in the group of glaucomas with unknown or uncertain pathogenesis, whereas the results of their work will easily find a logical place in the scheme, as the suggested classification remains open to any new development. PMID:25246759

  14. North error estimation based on solar elevation errors in the third step of sky-polarimetric Viking navigation.

    PubMed

    Száz, Dénes; Farkas, Alexandra; Barta, András; Kretzer, Balázs; Egri, Ádám; Horváth, Gábor

    2016-07-01

    The theory of sky-polarimetric Viking navigation has been widely accepted for decades without any information about the accuracy of this method. Previously, we have measured the accuracy of the first and second steps of this navigation method in psychophysical laboratory and planetarium experiments. Now, we have tested the accuracy of the third step in a planetarium experiment, assuming that the first and second steps are errorless. Using the fists of their outstretched arms, 10 test persons had to estimate the elevation angles (measured in numbers of fists and fingers) of black dots (representing the position of the occluded Sun) projected onto the planetarium dome. The test persons performed 2400 elevation estimations, 48% of which were more accurate than ±1°. We selected three test persons with the (i) largest and (ii) smallest elevation errors and (iii) highest standard deviation of the elevation error. From the errors of these three persons, we calculated their error function, from which the North errors (the angles with which they deviated from the geographical North) were determined for summer solstice and spring equinox, two specific dates of the Viking sailing period. The range of possible North errors ΔωN was the lowest and highest at low and high solar elevations, respectively. At high elevations, the maximal ΔωN was 35.6° and 73.7° at summer solstice and 23.8° and 43.9° at spring equinox for the best and worst test person (navigator), respectively. Thus, the best navigator was twice as good as the worst one. At solstice and equinox, high elevations occur the most frequently during the day, thus high North errors could occur more frequently than expected before. According to our findings, the ideal periods for sky-polarimetric Viking navigation are immediately after sunrise and before sunset, because the North errors are the lowest at low solar elevations.

  15. North error estimation based on solar elevation errors in the third step of sky-polarimetric Viking navigation

    PubMed Central

    Száz, Dénes; Farkas, Alexandra; Barta, András; Kretzer, Balázs; Egri, Ádám

    2016-01-01

    The theory of sky-polarimetric Viking navigation has been widely accepted for decades without any information about the accuracy of this method. Previously, we have measured the accuracy of the first and second steps of this navigation method in psychophysical laboratory and planetarium experiments. Now, we have tested the accuracy of the third step in a planetarium experiment, assuming that the first and second steps are errorless. Using the fists of their outstretched arms, 10 test persons had to estimate the elevation angles (measured in numbers of fists and fingers) of black dots (representing the position of the occluded Sun) projected onto the planetarium dome. The test persons performed 2400 elevation estimations, 48% of which were more accurate than ±1°. We selected three test persons with the (i) largest and (ii) smallest elevation errors and (iii) highest standard deviation of the elevation error. From the errors of these three persons, we calculated their error function, from which the North errors (the angles with which they deviated from the geographical North) were determined for summer solstice and spring equinox, two specific dates of the Viking sailing period. The range of possible North errors ΔωN was the lowest and highest at low and high solar elevations, respectively. At high elevations, the maximal ΔωN was 35.6° and 73.7° at summer solstice and 23.8° and 43.9° at spring equinox for the best and worst test person (navigator), respectively. Thus, the best navigator was twice as good as the worst one. At solstice and equinox, high elevations occur the most frequently during the day, thus high North errors could occur more frequently than expected before. According to our findings, the ideal periods for sky-polarimetric Viking navigation are immediately after sunrise and before sunset, because the North errors are the lowest at low solar elevations. PMID:27493566

  16. Classification of burn wounds using support vector machines

    NASA Astrophysics Data System (ADS)

    Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose

    2004-05-01

    The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds into their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted of segmenting the burn wound from the rest of the image and classifying the burn into its depth. In this paper we focus on the classification problem only. We previously proposed to use a Fuzzy-ARTMAP neural network (NN). However, we may take advantage of newer powerful classification tools such as Support Vector Machines (SVM). We apply a five-fold cross-validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which gives us the set of features that yields the smallest classification error for each classifier. The features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As the data in this problem are not linearly separable, the SVM was trained using several different kernels. The validation process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms the classification results obtained with the rest of the classifiers, yielding a classification error rate of 0.7%, whereas the Fuzzy-ARTMAP NN attained 1.6%.
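
    In scikit-learn terms, the winning configuration roughly corresponds to an RBF-kernel SVM wrapped in cross-validated sequential feature selection, as sketched below. Mapping the reported "Gaussian kernel of variance 1" onto the gamma parameter via gamma = 1/(2*sigma^2), the number of features to keep, the scaling step, and the hypothetical names X_features and y_depths are all assumptions for illustration.

      from sklearn.feature_selection import SequentialFeatureSelector
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      # Gaussian (RBF) kernel SVM; gamma = 0.5 corresponds to a unit-variance kernel
      svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma=0.5))

      # forward selection in the spirit of SFS, scored by 5-fold cross-validation
      selector = SequentialFeatureSelector(svm, n_features_to_select=5,
                                           direction="forward", cv=5)
      # selector.fit(X_features, y_depths)      # hypothetical feature matrix and labels
      # cross_val_score(svm, selector.transform(X_features), y_depths, cv=5)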

  17. Comparative assessment of LANDSAT-D MSS and TM data quality for mapping applications in the Southeast

    NASA Technical Reports Server (NTRS)

    1984-01-01

    Rectifications of multispectral scanner and thematic mapper data sets for full and subscene areas, analyses of planimetric errors, assessments of the number and distribution of ground control points required to minimize errors, and factors contributing to residual errors are examined. Other investigations include the generation of three-dimensional terrain models and the effects of spatial resolution on digital classification accuracies.

  18. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

    PubMed

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-04-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme, to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can dramatically impact the classification error that is associated with LQAS analysis.

  19. Influence of ECG measurement accuracy on ECG diagnostic statements.

    PubMed

    Zywietz, C; Celikag, D; Joseph, G

    1996-01-01

    Computer analysis of electrocardiograms (ECGs) provides a large amount of ECG measurement data, which may be used for diagnostic classification and storage in ECG databases. Until now, neither error limits for ECG measurements have been specified nor has their influence on diagnostic statements been systematically investigated. An analytical method is presented to estimate the influence of measurement errors on the accuracy of diagnostic ECG statements. Systematic (offset) errors will usually result in an increase of false positive or false negative statements, since they cause a shift of the working point on the receiver operating characteristic curve. Measurement error dispersion broadens the distribution function of discriminative measurement parameters and, therefore, usually increases the overlap between discriminative parameters. This results in a flattening of the receiver operating characteristic curve and an increase of false positive and false negative classifications. The method developed has been applied to ECG conduction defect diagnoses by using the proposed International Electrotechnical Commission interval measurement tolerance limits. These limits appear too large because more than 30% of false positive atrial conduction defect statements and 10-18% of false intraventricular conduction defect statements could be expected due to tolerated measurement errors. To assure long-term usability of ECG measurement databases, it is recommended that systems provide their error tolerance limits obtained on a defined test set.

  20. Clinical Implications of Cluster Analysis-Based Classification of Acute Decompensated Heart Failure and Correlation with Bedside Hemodynamic Profiles.

    PubMed

    Ahmad, Tariq; Desai, Nihar; Wilson, Francis; Schulte, Phillip; Dunning, Allison; Jacoby, Daniel; Allen, Larry; Fiuzat, Mona; Rogers, Joseph; Felker, G Michael; O'Connor, Christopher; Patel, Chetan B

    2016-01-01

    Classification of acute decompensated heart failure (ADHF) is based on subjective criteria that crudely capture disease heterogeneity. Improved phenotyping of the syndrome may help improve therapeutic strategies. Our objective was to derive cluster analysis-based groupings for patients hospitalized with ADHF and to compare their prognostic performance to hemodynamic classifications derived at the bedside. We performed a cluster analysis on baseline clinical variables and pulmonary artery catheter (PAC) measurements of 172 ADHF patients from the ESCAPE trial. Employing regression techniques, we examined associations between clusters and clinically determined hemodynamic profiles (warm/cold/wet/dry). We assessed association with clinical outcomes using Cox proportional hazards models. Likelihood ratio tests were used to compare the prognostic value of cluster data to that of hemodynamic data. We identified four advanced HF clusters: 1) male Caucasians with ischemic cardiomyopathy, multiple comorbidities, and the lowest B-type natriuretic peptide (BNP) levels; 2) females with non-ischemic cardiomyopathy, few comorbidities, and the most favorable hemodynamics; 3) young African American males with non-ischemic cardiomyopathy, the most adverse hemodynamics, and advanced disease; and 4) older Caucasians with ischemic cardiomyopathy, concomitant renal insufficiency, and the highest BNP levels. There was no association between clusters and bedside-derived hemodynamic profiles (p = 0.70). For all adverse clinical outcomes, Cluster 4 had the highest risk, and Cluster 2, the lowest. Compared to Cluster 4, Clusters 1-3 had 45-70% lower risk of all-cause mortality. Clusters were significantly associated with clinical outcomes, whereas hemodynamic profiles were not. By clustering patients with similar objective variables, we identified four clinically relevant phenotypes of ADHF patients, with no discernable relationship to hemodynamic profiles, but distinct associations with adverse outcomes. Our analysis suggests that ADHF classification using simultaneous consideration of etiology, comorbid conditions, and biomarker levels may be superior to bedside classifications.

  1. Decision support system for determining the contact lens for refractive errors patients with classification ID3

    NASA Astrophysics Data System (ADS)

    Situmorang, B. H.; Setiawan, M. P.; Tosida, E. T.

    2017-01-01

    Refractive errors are abnormalities in the refraction of light such that images do not focus precisely on the retina, resulting in blurred vision [1]. Refractive errors require the patient to wear glasses or contact lenses so that eyesight returns to normal. The appropriate glasses or contact lenses differ from person to person; the choice is influenced by patient age, tear production, vision prescription, and astigmatism. Because the eye is a vital organ for sight, accuracy in determining which glasses or contact lenses to use is required. This research aims to develop a decision support system that can recommend the right contact lenses for refractive error patients with 100% accuracy. The Iterative Dichotomiser 3 (ID3) classification method generates gain and entropy values for attributes that include the sample code, patient age, astigmatism, tear production rate, vision prescription, and class, which determine the resulting decision tree. The eye specialist's evaluation of the training data gave an accuracy rate of 96.7% and an error rate of 3.3%; the test using a confusion matrix gave an accuracy rate of 96.1% and an error rate of 3.1%; and for the test data the system obtained an accuracy rate of 100% and an error rate of 0.
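
    The gain and entropy values mentioned above are the core of ID3; a small self-contained Python sketch of the two quantities is given below (the attribute encoding and data layout are assumptions for illustration).

      import math
      from collections import Counter

      def entropy(labels):
          """Shannon entropy of a list of class labels."""
          total = len(labels)
          return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

      def information_gain(rows, labels, attribute_index):
          """Entropy reduction from splitting on one attribute, as ID3 uses to pick nodes."""
          remainder = 0.0
          for value in set(row[attribute_index] for row in rows):
              subset = [lab for row, lab in zip(rows, labels) if row[attribute_index] == value]
              remainder += (len(subset) / len(labels)) * entropy(subset)
          return entropy(labels) - remainder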

  2. The generalization ability of online SVM classification based on Markov sampling.

    PubMed

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish a bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present numerical studies of the learning ability of online SVM classification based on Markov sampling on benchmark repository datasets. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of the training set grows.

  3. Aspen, climate, and sudden decline in western USA

    Treesearch

    Gerald E. Rehfeldt; Dennis E. Ferguson; Nicholas L. Crookston

    2009-01-01

    A bioclimate model predicting the presence or absence of aspen, Populus tremuloides, in western USA from climate variables was developed by using the Random Forests classification tree on Forest Inventory data from about 118,000 permanent sample plots. A reasonably parsimonious model used eight predictors to describe aspen's climate profile. Classification errors...

  4. Multi-template tensor-based morphometry: Application to analysis of Alzheimer's disease

    PubMed Central

    Koikkalainen, Juha; Lötjönen, Jyrki; Thurfjell, Lennart; Rueckert, Daniel; Waldemar, Gunhild; Soininen, Hilkka

    2012-01-01

    In this paper methods for using multiple templates in tensor-based morphometry (TBM) are presented and compared to the conventional single-template approach. TBM analysis requires non-rigid registrations which are often subject to registration errors. When using multiple templates and, therefore, multiple registrations, it can be assumed that the registration errors are averaged and eventually compensated. Four different methods are proposed for multi-template TBM. The methods were evaluated using magnetic resonance (MR) images of healthy controls, patients with stable or progressive mild cognitive impairment (MCI), and patients with Alzheimer's disease (AD) from the ADNI database (N=772). The performance of TBM features in classifying images was evaluated both quantitatively and qualitatively. Classification results show that the multi-template methods are statistically significantly better than the single-template method. The overall classification accuracy was 86.0% for the classification of control and AD subjects, and 72.1% for the classification of stable and progressive MCI subjects. The statistical group-level difference maps produced using multi-template TBM were smoother, formed larger continuous regions, and had larger t-values than the maps obtained with single-template TBM. PMID:21419228

  5. Predicting alpine headwater stream intermittency: a case study in the northern Rocky Mountains

    USGS Publications Warehouse

    Sando, Thomas R.; Blasch, Kyle W.

    2015-01-01

    This investigation used climatic, geological, and environmental data coupled with observational stream intermittency data to predict alpine headwater stream intermittency. Prediction was made using a random forest classification model. Results showed that the most important variables in the prediction model were snowpack persistence, represented by average snow extent from March through July, mean annual mean monthly minimum temperature, and surface geology types. For stream catchments with intermittent headwater streams, snowpack, on average, persisted until early June, whereas for stream catchments with perennial headwater streams, snowpack, on average, persisted until early July. Additionally, on average, stream catchments with intermittent headwater streams were about 0.7 °C warmer than stream catchments with perennial headwater streams. Finally, headwater stream catchments primarily underlain by coarse, permeable sediment are significantly more likely to have intermittent headwater streams than those primarily underlain by impermeable bedrock. Comparison of the predicted streamflow classification with observed stream status indicated a four percent classification error for first-order streams and a 21 percent classification error for all stream orders in the study area.

  6. A framework for software fault tolerance in real-time systems

    NASA Technical Reports Server (NTRS)

    Anderson, T.; Knight, J. C.

    1983-01-01

    A classification scheme for errors and a technique for the provision of software fault tolerance in cyclic real-time systems are presented. The technique requires that the process structure of a system be represented by a synchronization graph, which is used by an executive as a specification of the relative times at which processes will communicate during execution. Communication between concurrent processes is severely limited and may only take place between processes engaged in an exchange. A history of error occurrences is maintained by an error handler. When an error is detected, the error handler classifies it using the error history information and then initiates appropriate recovery action.

  7. Software platform for managing the classification of error- related potentials of observers

    NASA Astrophysics Data System (ADS)

    Asvestas, P.; Ventouras, E.-C.; Kostopoulos, S.; Sidiropoulos, K.; Korfiatis, V.; Korda, A.; Uzunolglu, A.; Karanasiou, I.; Kalatzis, I.; Matsopoulos, G.

    2015-09-01

    Human learning is partly based on observation. Electroencephalographic recordings of subjects who perform acts (actors) or observe actors (observers) contain a negative waveform in the Evoked Potentials (EPs) of the actors who commit errors and of the observers who observe the error-committing actors. This waveform is called the Error-Related Negativity (ERN). Its detection has applications in the context of Brain-Computer Interfaces. The present work describes a software system developed for managing EPs of observers, with the aim of classifying them into observations of either correct or incorrect actions. It consists of an integrated platform for the storage, management, processing and classification of EPs recorded during error-observation experiments. The system was developed using C# and the following development tools and frameworks: MySQL, .NET Framework, Entity Framework and Emgu CV, for interfacing with the machine learning library of OpenCV. Up to six features can be computed per EP recording per electrode. The user can select among various feature selection algorithms and then proceed to train one of three types of classifiers: Artificial Neural Networks, Support Vector Machines, or k-nearest neighbour. The classifier can then be used to classify any EP curve that has been entered into the database.

  8. Satellite inventory of Minnesota forest resources

    NASA Technical Reports Server (NTRS)

    Bauer, Marvin E.; Burk, Thomas E.; Ek, Alan R.; Coppin, Pol R.; Lime, Stephen D.; Walsh, Terese A.; Walters, David K.; Befort, William; Heinzen, David F.

    1993-01-01

    The methods and results of using Landsat Thematic Mapper (TM) data to classify and estimate the acreage of forest covertypes in northeastern Minnesota are described. Portions of six TM scenes covering five counties with a total area of 14,679 square miles were classified into six forest and five nonforest classes. The approach involved the integration of cluster sampling, image processing, and estimation. Using cluster sampling, 343 plots, each 88 acres in size, were photo interpreted and field mapped as a source of reference data for classifier training and calibration of the TM data classifications. Classification accuracies of up to 75 percent were achieved; most misclassification was between similar or related classes. An inverse method of calibration, based on the error rates obtained from the classifications of the cluster plots, was used to adjust the classification class proportions for classification errors. The resulting area estimates for total forest land in the five-county area were within 3 percent of the estimate made independently by the USDA Forest Service. Area estimates for conifer and hardwood forest types were within 0.8 and 6.0 percent respectively, of the Forest Service estimates. A trial of a second method of estimating the same classes as the Forest Service resulted in standard errors of 0.002 to 0.015. A study of the use of multidate TM data for change detection showed that forest canopy depletion, canopy increment, and no change could be identified with greater than 90 percent accuracy. The project results have been the basis for the Minnesota Department of Natural Resources and the Forest Service to define and begin to implement an annual system of forest inventory which utilizes Landsat TM data to detect changes in forest cover.
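
    The inverse calibration mentioned above can be sketched as solving a small linear system: if the error matrix estimated from the cluster plots gives the probability of each map label given the true class, the observed class proportions are the true proportions multiplied by that matrix, and inverting the relation recovers calibrated proportions. The Python sketch below illustrates that idea under this assumed matrix convention; it is not the project's exact estimator.

      import numpy as np

      def inverse_calibrate(observed_proportions, error_matrix):
          """Adjust mapped class proportions using an error (confusion) matrix.

          error_matrix[i, j] is assumed to be P(mapped as class j | true class i),
          so observed = true @ error_matrix and the true proportions follow by
          solving the transposed system."""
          p_obs = np.asarray(observed_proportions, dtype=float)
          M = np.asarray(error_matrix, dtype=float)
          p_true = np.linalg.solve(M.T, p_obs)      # requires a non-singular error matrix
          p_true = np.clip(p_true, 0.0, None)       # guard against small negative values
          return p_true / p_true.sum()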

  9. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines

    PubMed Central

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano

    2015-01-01

    Drawing on the results of an acoustic biometric system based on an MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on a Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering; segmentation, based on a Gaussian Mixture Model (GMM), to separate the person from the background; masking, to reduce the dimensions of the images; and binarization, to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392

  10. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

    PubMed

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

    2015-06-17

    Drawing on the results of an acoustic biometric system based on an MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on a Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering; segmentation, based on a Gaussian Mixture Model (GMM), to separate the person from the background; masking, to reduce the dimensions of the images; and binarization, to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.

  11. Multi-factorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules.

    PubMed

    Khondoker, Mizanur R; Bachmann, Till T; Mewissen, Muriel; Dickinson, Paul; Dobrzelecki, Bartosz; Campbell, Colin J; Mount, Andrew R; Walton, Anthony J; Crain, Jason; Schulze, Holger; Giraud, Gerard; Ross, Alan J; Ciani, Ilenia; Ember, Stuart W J; Tlili, Chaker; Terry, Jonathan G; Grant, Eilidh; McDonnell, Nicola; Ghazal, Peter

    2010-12-01

    Machine learning and statistical model based classifiers have increasingly been used with more complex and high dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive, under investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray based data characteristics on the predictive performance for various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. Optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble that of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network (http://www.cran.r-project.org/web/packages/optBiomarker/).

  12. Effects of stress typicality during speeded grammatical classification.

    PubMed

    Arciuli, Joanne; Cupples, Linda

    2003-01-01

    The experiments reported here were designed to investigate the influence of stress typicality during speeded grammatical classification of disyllabic English words by native and non-native speakers. Trochaic nouns and iambic verbs were considered to be typically stressed, whereas iambic nouns and trochaic verbs were considered to be atypically stressed. Experiments 1a and 2a showed that while native speakers classified typically stressed words more quickly and more accurately than atypically stressed words during reading, there were no overall effects during classification of spoken stimuli. However, a subgroup of native speakers with high error rates did show a significant effect during classification of spoken stimuli. Experiments 1b and 2b showed that non-native speakers classified typically stressed words more quickly and more accurately than atypically stressed words during reading. Typically stressed words were also classified more accurately than atypically stressed words when the stimuli were spoken. Importantly, there was a significant relationship between error rates, vocabulary size and the size of the stress typicality effect in each experiment. We conclude that participants use information about lexical stress to help them distinguish between disyllabic nouns and verbs during speeded grammatical classification. This is especially so for individuals with a limited vocabulary who lack other knowledge (e.g., semantic knowledge) about the differences between these grammatical categories.

  13. Automatic classification of diseases from free-text death certificates for real-time surveillance.

    PubMed

    Koopman, Bevan; Karimi, Sarvnaz; Nguyen, Anthony; McGuire, Rhydwyn; Muscatello, David; Kemp, Madonna; Truran, Donna; Zhang, Ming; Thackway, Sarah

    2015-07-15

    Death certificates provide an invaluable source for mortality statistics which can be used for surveillance and early warnings of increases in disease activity and to support the development and monitoring of prevention or response strategies. However, their value can be realised only if accurate, quantitative data can be extracted from death certificates, an aim hampered by both the volume and variable nature of certificates written in natural language. This study aims to develop a set of machine learning and rule-based methods to automatically classify death certificates according to four high impact diseases of interest: diabetes, influenza, pneumonia and HIV. Two classification methods are presented: i) a machine learning approach, where detailed features (terms, term n-grams and SNOMED CT concepts) are extracted from death certificates and used to train a set of supervised machine learning models (Support Vector Machines); and ii) a set of keyword-matching rules. These methods were used to identify the presence of diabetes, influenza, pneumonia and HIV in a death certificate. An empirical evaluation was conducted using 340,142 death certificates, divided between training and test sets, covering deaths from 2000-2007 in New South Wales, Australia. Precision and recall (positive predictive value and sensitivity) were used as evaluation measures, with F-measure providing a single, overall measure of effectiveness. A detailed error analysis was performed on classification errors. Classification of diabetes, influenza, pneumonia and HIV was highly accurate (F-measure 0.96). More fine-grained ICD-10 classification effectiveness was more variable but still high (F-measure 0.80). The error analysis revealed that word variations as well as certain word combinations adversely affected classification. In addition, anomalies in the ground truth likely led to an underestimation of the effectiveness. The high accuracy and low cost of the classification methods allow for an effective means for automatic and real-time surveillance of diabetes, influenza, pneumonia and HIV deaths. In addition, the methods are generally applicable to other diseases of interest and to other sources of medical free-text besides death certificates.
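
    A minimal sketch, assuming scikit-learn, of the two approaches described above: an n-gram SVM and a keyword rule, each flagging one disease of interest. The example certificates and keyword list are invented and are not from the study data.

```python
# Hypothetical sketch of (i) an n-gram SVM classifier and (ii) a keyword rule,
# each flagging diabetes mentions in free-text death certificates.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.metrics import precision_recall_fscore_support

certificates = [
    "ischaemic heart disease, type 2 diabetes mellitus",
    "pneumonia following influenza infection",
    "metastatic lung carcinoma",
    "diabetic ketoacidosis, renal failure",
]
has_diabetes = [1, 0, 0, 1]

# (i) supervised model: word unigrams/bigrams + linear SVM
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LinearSVC(),
)
model.fit(certificates, has_diabetes)
pred_ml = model.predict(certificates)

# (ii) keyword-matching rule
def keyword_rule(text, keywords=("diabetes", "diabetic")):
    return int(any(k in text.lower() for k in keywords))

pred_rule = [keyword_rule(t) for t in certificates]

for name, pred in (("SVM", pred_ml), ("rule", pred_rule)):
    p, r, f, _ = precision_recall_fscore_support(has_diabetes, pred, average="binary")
    print(f"{name:>4}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```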

  14. Accuracy in Wrist-Worn, Sensor-Based Measurements of Heart Rate and Energy Expenditure in a Diverse Cohort

    PubMed Central

    Shcherbina, Anna; Mattsson, C. Mikael; Waggott, Daryl; Salisbury, Heidi; Christle, Jeffrey W.; Hastie, Trevor; Wheeler, Matthew T.; Ashley, Euan A.

    2017-01-01

    The ability to measure physical activity through wrist-worn devices provides an opportunity for cardiovascular medicine. However, the accuracy of commercial devices is largely unknown. The aim of this work is to assess the accuracy of seven commercially available wrist-worn devices in estimating heart rate (HR) and energy expenditure (EE) and to propose a wearable sensor evaluation framework. We evaluated the Apple Watch, Basis Peak, Fitbit Surge, Microsoft Band, Mio Alpha 2, PulseOn, and Samsung Gear S2. Participants wore devices while being simultaneously assessed with continuous telemetry and indirect calorimetry while sitting, walking, running, and cycling. Sixty volunteers (29 male, 31 female, age 38 ± 11 years) of diverse age, height, weight, skin tone, and fitness level were selected. Error in HR and EE was computed for each subject/device/activity combination. Devices reported the lowest error for cycling and the highest for walking. Device error was higher for males, greater body mass index, darker skin tone, and walking. Six of the devices achieved a median error for HR below 5% during cycling. No device achieved an error in EE below 20 percent. The Apple Watch achieved the lowest overall error in both HR and EE, while the Samsung Gear S2 reported the highest. In conclusion, most wrist-worn devices adequately measure HR in laboratory-based activities, but poorly estimate EE, suggesting caution in the use of EE measurements as part of health improvement programs. We propose reference standards for the validation of consumer health devices (http://precision.stanford.edu/). PMID:28538708
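
    For illustration, a hypothetical calculation of per-device, per-activity median absolute percentage error in heart rate against a reference signal, which is the style of error summary reported above; the readings below are made up.

```python
# Illustrative only: per-device, per-activity median absolute percentage error
# in heart rate against an ECG reference. The measurements are invented.
import pandas as pd

df = pd.DataFrame({
    "device":    ["A", "A", "B", "B", "A", "B"],
    "activity":  ["cycling", "walking", "cycling", "walking", "cycling", "walking"],
    "hr_device": [141, 95, 150, 88, 120, 101],
    "hr_ecg":    [140, 104, 148, 97, 119, 92],
})

df["abs_pct_error"] = (df["hr_device"] - df["hr_ecg"]).abs() / df["hr_ecg"] * 100
summary = df.groupby(["device", "activity"])["abs_pct_error"].median()
print(summary)
```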

  15. Accuracy in Wrist-Worn, Sensor-Based Measurements of Heart Rate and Energy Expenditure in a Diverse Cohort.

    PubMed

    Shcherbina, Anna; Mattsson, C Mikael; Waggott, Daryl; Salisbury, Heidi; Christle, Jeffrey W; Hastie, Trevor; Wheeler, Matthew T; Ashley, Euan A

    2017-05-24

    The ability to measure physical activity through wrist-worn devices provides an opportunity for cardiovascular medicine. However, the accuracy of commercial devices is largely unknown. The aim of this work is to assess the accuracy of seven commercially available wrist-worn devices in estimating heart rate (HR) and energy expenditure (EE) and to propose a wearable sensor evaluation framework. We evaluated the Apple Watch, Basis Peak, Fitbit Surge, Microsoft Band, Mio Alpha 2, PulseOn, and Samsung Gear S2. Participants wore devices while being simultaneously assessed with continuous telemetry and indirect calorimetry while sitting, walking, running, and cycling. Sixty volunteers (29 male, 31 female, age 38 ± 11 years) of diverse age, height, weight, skin tone, and fitness level were selected. Error in HR and EE was computed for each subject/device/activity combination. Devices reported the lowest error for cycling and the highest for walking. Device error was higher for males, greater body mass index, darker skin tone, and walking. Six of the devices achieved a median error for HR below 5% during cycling. No device achieved an error in EE below 20 percent. The Apple Watch achieved the lowest overall error in both HR and EE, while the Samsung Gear S2 reported the highest. In conclusion, most wrist-worn devices adequately measure HR in laboratory-based activities, but poorly estimate EE, suggesting caution in the use of EE measurements as part of health improvement programs. We propose reference standards for the validation of consumer health devices (http://precision.stanford.edu/).

  16. Performance Evaluation of Three Blood Glucose Monitoring Systems Using ISO 15197

    PubMed Central

    Bedini, José Luis; Wallace, Jane F.; Pardo, Scott; Petruschke, Thorsten

    2015-01-01

    Background: Blood glucose monitoring is an essential component of diabetes management. Inaccurate blood glucose measurements can severely impact patients’ health. This study evaluated the performance of 3 blood glucose monitoring systems (BGMS), Contour® Next USB, FreeStyle InsuLinx®, and OneTouch® Verio™ IQ, under routine hospital conditions. Methods: Venous blood samples (N = 236) obtained for routine laboratory procedures were collected at a Spanish hospital, and blood glucose (BG) concentrations were measured with each BGMS and with the available reference (hexokinase) method. Accuracy of the 3 BGMS was compared according to ISO 15197:2013 accuracy limit criteria, by mean absolute relative difference (MARD), consensus error grid (CEG) and surveillance error grid (SEG) analyses, and an insulin dosing error model. Results: All BGMS met the accuracy limit criteria defined by ISO 15197:2013. While all measurements of the 3 BGMS were within low-risk zones in both error grid analyses, the Contour Next USB showed significantly smaller MARDs relative to the reference values than the other 2 BGMS. Insulin dosing errors were lowest for the Contour Next USB compared with the other systems. Conclusions: All BGMS fulfilled ISO 15197:2013 accuracy limit criteria and the CEG criterion. However, taking all analyses together, differences in performance of potential clinical relevance may be observed. Results showed that the Contour Next USB had the lowest MARD values across the tested glucose range, as compared with the 2 other BGMS. CEG and SEG analyses as well as calculation of the hypothetical bolus insulin dosing error suggest a high accuracy of the Contour Next USB. PMID:26445813
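
    A minimal sketch of the mean absolute relative difference (MARD) statistic used to compare the meters above; the paired readings are hypothetical.

```python
# MARD of a glucose meter against a hexokinase reference (hypothetical values).
import numpy as np

reference = np.array([90.0, 130.0, 180.0, 250.0, 65.0])   # mg/dL, reference method
meter     = np.array([95.0, 124.0, 185.0, 242.0, 70.0])   # mg/dL, BGMS reading

mard = np.mean(np.abs(meter - reference) / reference) * 100
print(f"MARD = {mard:.1f}%")
```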

  17. Measures of Linguistic Accuracy in Second Language Writing Research.

    ERIC Educational Resources Information Center

    Polio, Charlene G.

    1997-01-01

    Investigates the reliability of measures of linguistic accuracy in second language writing. The study uses a holistic scale, error-free T-units, and an error classification system on the essays of English-as-a-Second-Language students and discusses why disagreements arise within a rater and between raters. (24 references) (Author/CK)

  18. [Study of inversion and classification of particle size distribution under dependent model algorithm].

    PubMed

    Sun, Xiao-Gang; Tang, Hong; Yuan, Gui-Bin

    2008-05-01

    For the total light scattering particle sizing technique, an inversion and classification method was proposed with the dependent model algorithm. The measured particle system was inverted simultaneously with different particle distribution functions whose mathematical model was known in advance, and then classified according to the inversion errors. The simulation experiments illustrated that it is feasible to use the inversion errors to determine the particle size distribution. The particle size distribution function was obtained accurately at only three wavelengths in the visible light range with the genetic algorithm, and the inversion results were steady and reliable, which reduced the required number of wavelengths to a minimum and increased the flexibility in the choice of light source. The single-peak distribution inversion error was less than 5% and the bimodal distribution inversion error was less than 10% when 5% stochastic noise was added to the transmission extinction measurement values at two wavelengths. The running time of this method was less than 2 s. The method has advantages of simplicity, rapidity, and suitability for on-line particle size measurement.

  19. Classification accuracy for stratification with remotely sensed data

    Treesearch

    Raymond L. Czaplewski; Paul L. Patterson

    2003-01-01

    Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an “error matrix,” which is...

  20. Comparison of wheat classification accuracy using different classifiers of the image-100 system

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.

    1981-01-01

    Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.

  1. Gas chimney detection based on improving the performance of combined multilayer perceptron and support vector classifier

    NASA Astrophysics Data System (ADS)

    Hashemi, H.; Tax, D. M. J.; Duin, R. P. W.; Javaherian, A.; de Groot, P.

    2008-11-01

    Seismic object detection is a relatively new field in which 3-D bodies are visualized and spatial relationships between objects of different origins are studied in order to extract geologic information. In this paper, we propose a method for finding an optimal classifier with the help of a statistical feature ranking technique and combining different classifiers. The method, which has general applicability, is demonstrated here on a gas chimney detection problem. First, we evaluate a set of input seismic attributes extracted at locations labeled by a human expert using regularized discriminant analysis (RDA). In order to find the RDA score for each seismic attribute, forward and backward search strategies are used. Subsequently, two non-linear classifiers: multilayer perceptron (MLP) and support vector classifier (SVC) are run on the ranked seismic attributes. Finally, to capitalize on the intrinsic differences between both classifiers, the MLP and SVC results are combined using logical rules of maximum, minimum and mean. The proposed method optimizes the ranked feature space size and yields the lowest classification error in the final combined result. We will show that the logical minimum reveals gas chimneys that exhibit both the softness of MLP and the resolution of SVC classifiers.
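
    A hedged sketch, on synthetic data, of the combination step described above: class-probability outputs of an MLP and an SVC are merged with logical minimum, maximum and mean rules. It does not reproduce the seismic attributes or the RDA ranking.

```python
# Sketch (synthetic data): combine MLP and SVC class-probability outputs with
# minimum, maximum and mean rules and compare classification errors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=12, n_informative=6, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0).fit(Xtr, ytr)
svc = SVC(probability=True, random_state=0).fit(Xtr, ytr)

p_mlp = mlp.predict_proba(Xte)[:, 1]
p_svc = svc.predict_proba(Xte)[:, 1]

for name, p in {"min": np.minimum(p_mlp, p_svc),
                "max": np.maximum(p_mlp, p_svc),
                "mean": (p_mlp + p_svc) / 2}.items():
    error = np.mean((p >= 0.5).astype(int) != yte)
    print(f"{name:>4}: classification error = {error:.3f}")
```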

  2. Ensemble of classifiers for confidence-rated classification of NDE signal

    NASA Astrophysics Data System (ADS)

    Banerjee, Portia; Safdarnejad, Seyed; Udpa, Lalita; Udpa, Satish

    2016-02-01

    An ensemble of classifiers, in general, aims to improve classification accuracy by combining results from multiple weak hypotheses into a single strong classifier through weighted majority voting. Improved versions of ensembles of classifiers generate self-rated confidence scores which estimate the reliability of each of their predictions and boost the classifier using these confidence-rated predictions. However, such a confidence metric is based only on the rate of correct classification. In existing works, although ensembles of classifiers have been widely used in computational intelligence, the effect of all factors of unreliability on the confidence of classification is largely overlooked. With relevance to NDE, classification results are affected by inherent ambiguity of classification, non-discriminative features, inadequate training samples and noise due to measurement. In this paper, we extend existing ensemble classification by maximizing the confidence of every classification decision in addition to minimizing the classification error. Initial results of the approach on data from eddy current inspection show improvement in classification performance of defect and non-defect indications.
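
    A simplified illustration of confidence-rated voting, where each committee member's vote is weighted by its own probability estimate; this is a generic sketch on synthetic data, not the authors' exact scheme.

```python
# Confidence-rated voting sketch: each base classifier's vote is weighted by
# its own probability estimate, so uncertain predictions carry less weight.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=10, flip_y=0.1, random_state=1)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=1)

# Train a small committee of shallow trees on bootstrap resamples.
rng = np.random.default_rng(1)
members = []
for _ in range(15):
    idx = rng.integers(0, len(Xtr), size=len(Xtr))
    members.append(DecisionTreeClassifier(max_depth=3).fit(Xtr[idx], ytr[idx]))

# Confidence-weighted vote: sum of P(class=1) minus P(class=0) across members.
score = sum(m.predict_proba(Xte)[:, 1] - m.predict_proba(Xte)[:, 0] for m in members)
pred = (score > 0).astype(int)
print("ensemble error:", np.mean(pred != yte))
```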

  3. On the statistical assessment of classifiers using DNA microarray data

    PubMed Central

    Ancona, N; Maglietta, R; Piepoli, A; D'Addabbo, A; Cotugno, R; Savino, M; Liuni, S; Carella, M; Pesole, G; Perri, F

    2006-01-01

    Background In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia – Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependent on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037) respectively. Moreover, the error rate decreases as the training set size increases, reaching its best performances with 35 training examples. In this case, RLS and SVM have error rates of e = 14% (p = 0.027) and e = 11% (p = 0.019). Concerning the number of genes, we found about 6000 genes (p < 0.05) correlated with the pathology, according to the signal-to-noise statistic. Moreover, the performances of the RLS and SVM classifiers do not change when 74% of the genes are used. Performance progressively declines, reaching e = 16% (p < 0.05) when only 2 genes are employed. The biological relevance of a set of genes determined by our statistical analysis and the major roles they play in colorectal tumorigenesis are discussed. Conclusions The method proposed provides statistically significant answers to precise questions relevant for the diagnosis and prognosis of cancer. We found that, with as few as 15 examples, it is possible to train statistically significant classifiers for colon cancer diagnosis. As for the definition of the number of genes sufficient for a reliable classification of colon cancer, our results suggest that it depends on the accuracy required. PMID:16919171
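
    A small sketch in the spirit of the assessment above: cross-validated accuracy of a linear SVM on simulated "expression" data with a permutation-based p-value. The sample sizes mimic the 22/25 split, but the data and effect sizes are invented.

```python
# Cross-validated error with a permutation-based p-value on simulated data.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, permutation_test_score

rng = np.random.default_rng(0)
n_normal, n_tumor, n_genes = 22, 25, 2000
X = rng.normal(size=(n_normal + n_tumor, n_genes))
y = np.array([0] * n_normal + [1] * n_tumor)
X[y == 1, :50] += 0.8          # 50 genes weakly associated with the "pathology"

score, perm_scores, p_value = permutation_test_score(
    SVC(kernel="linear"), X, y,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    n_permutations=100, scoring="accuracy", random_state=0)

print(f"error = {1 - score:.2f}, permutation p-value = {p_value:.3f}")
```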

  4. Speech variability effects on recognition accuracy associated with concurrent task performance by pilots

    NASA Technical Reports Server (NTRS)

    Simpson, C. A.

    1985-01-01

    In the present study of the responses of pairs of pilots to aircraft warning classification tasks using an isolated word, speaker-dependent speech recognition system, the induced stress was manipulated by means of different scoring procedures for the classification task and by the inclusion of a competitive manual control task. Both speech patterns and recognition accuracy were analyzed, and recognition errors were recorded by type for an isolated word speaker-dependent system and by an offline technique for a connected word speaker-dependent system. While errors increased with task loading for the isolated word system, there was no such effect for task loading in the case of the connected word system.

  5. Robust Transmission of H.264/AVC Streams Using Adaptive Group Slicing and Unequal Error Protection

    NASA Astrophysics Data System (ADS)

    Thomos, Nikolaos; Argyropoulos, Savvas; Boulgouris, Nikolaos V.; Strintzis, Michael G.

    2006-12-01

    We present a novel scheme for the transmission of H.264/AVC video streams over lossy packet networks. The proposed scheme exploits the error-resilient features of H.264/AVC codec and employs Reed-Solomon codes to protect effectively the streams. A novel technique for adaptive classification of macroblocks into three slice groups is also proposed. The optimal classification of macroblocks and the optimal channel rate allocation are achieved by iterating two interdependent steps. Dynamic programming techniques are used for the channel rate allocation process in order to reduce complexity. Simulations clearly demonstrate the superiority of the proposed method over other recent algorithms for transmission of H.264/AVC streams.

  6. The pot calling the kettle black: the extent and type of errors in a computerized immunization registry and by parent report.

    PubMed

    MacDonald, Shannon E; Schopflocher, Donald P; Golonka, Richard P

    2014-01-04

    Accurate classification of children's immunization status is essential for clinical care, administration and evaluation of immunization programs, and vaccine program research. Computerized immunization registries have been proposed as a valuable alternative to provider paper records or parent report, but there is a need to better understand the challenges associated with their use. This study assessed the accuracy of immunization status classification in an immunization registry as compared to parent report and determined the number and type of errors occurring in both sources. This study was a sub-analysis of a larger study which compared the characteristics of children whose immunizations were up to date (UTD) at two years as compared to those not UTD. Children's immunization status was initially determined from a population-based immunization registry, and then compared to parent report of immunization status, as reported in a postal survey. Discrepancies between the two sources were adjudicated by review of immunization providers' hard-copy clinic records. Descriptive analyses included calculating proportions and confidence intervals for errors in classification and reporting of the type and frequency of errors. Among the 461 survey respondents, there were 60 discrepancies in immunization status. The majority of errors were due to parent report (n = 44), but the registry was not without fault (n = 16). Parents tended to erroneously report their child as UTD, whereas the registry was more likely to wrongly classify children as not UTD. Reasons for registry errors included failure to account for varicella disease history, variable number of doses required due to age at series initiation, and doses administered out of the region. These results confirm that parent report is often flawed, but also identify that registries are prone to misclassification of immunization status. Immunization program administrators and researchers need to institute measures to identify and reduce misclassification, in order for registries to play an effective role in the control of vaccine-preventable disease.

  7. The pot calling the kettle black: the extent and type of errors in a computerized immunization registry and by parent report

    PubMed Central

    2014-01-01

    Background Accurate classification of children’s immunization status is essential for clinical care, administration and evaluation of immunization programs, and vaccine program research. Computerized immunization registries have been proposed as a valuable alternative to provider paper records or parent report, but there is a need to better understand the challenges associated with their use. This study assessed the accuracy of immunization status classification in an immunization registry as compared to parent report and determined the number and type of errors occurring in both sources. Methods This study was a sub-analysis of a larger study which compared the characteristics of children whose immunizations were up to date (UTD) at two years as compared to those not UTD. Children’s immunization status was initially determined from a population-based immunization registry, and then compared to parent report of immunization status, as reported in a postal survey. Discrepancies between the two sources were adjudicated by review of immunization providers’ hard-copy clinic records. Descriptive analyses included calculating proportions and confidence intervals for errors in classification and reporting of the type and frequency of errors. Results Among the 461 survey respondents, there were 60 discrepancies in immunization status. The majority of errors were due to parent report (n = 44), but the registry was not without fault (n = 16). Parents tended to erroneously report their child as UTD, whereas the registry was more likely to wrongly classify children as not UTD. Reasons for registry errors included failure to account for varicella disease history, variable number of doses required due to age at series initiation, and doses administered out of the region. Conclusions These results confirm that parent report is often flawed, but also identify that registries are prone to misclassification of immunization status. Immunization program administrators and researchers need to institute measures to identify and reduce misclassification, in order for registries to play an effective role in the control of vaccine-preventable disease. PMID:24387002

  8. Bayesian Network Structure Learning for Urban Land Use Classification from Landsat ETM+ and Ancillary Data

    NASA Astrophysics Data System (ADS)

    Park, M.; Stenstrom, M. K.

    2004-12-01

    Recognizing urban information from the satellite imagery is problematic due to the diverse features and dynamic changes of urban landuse. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed the optimized networks for urban land use classification from Landsat ETM+ images of Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables and added the links to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information of variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced with those input variables important for classification. This minimizes the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation. The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI did not much improve the overall accuracy (72%) and k coefficient (65%). Including Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not useful as locational ancillary data.
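
    As a hedged illustration of the mutual-information ranking used above to seed the tree structure, the sketch below scores synthetic spectral bands and ancillary variables against a land-use label; it is not the authors' Bayesian-network code, and all values are synthetic.

```python
# Rank candidate inputs (bands, coordinates, NDVI) by mutual information with
# the land-use class. Synthetic data only.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 2000
land_use = rng.integers(0, 7, size=n)                 # 7 hypothetical classes

bands = rng.normal(size=(n, 6)) + land_use[:, None] * rng.uniform(0.0, 0.4, size=6)
x_coord = rng.uniform(size=n) + 0.1 * land_use        # location carries class signal
ndvi = rng.normal(size=n)                             # uninformative here

features = np.column_stack([bands, x_coord, ndvi])
names = [f"band{i+1}" for i in range(6)] + ["x_coord", "ndvi"]

mi = mutual_info_classif(features, land_use, random_state=0)
for name, value in sorted(zip(names, mi), key=lambda t: -t[1]):
    print(f"{name:8s} {value:.3f}")
```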

  9. Vector quantizer designs for joint compression and terrain categorization of multispectral imagery

    NASA Technical Reports Server (NTRS)

    Gorman, John D.; Lyons, Daniel F.

    1994-01-01

    Two vector quantizer designs for compression of multispectral imagery and their impact on terrain categorization performance are evaluated. The mean-squared error (MSE) and classification performance of the two quantizers are compared, and it is shown that a simple two-stage design minimizing MSE subject to a constraint on classification performance has a significantly better classification performance than a standard MSE-based tree-structured vector quantizer followed by maximum likelihood classification. This improvement in classification performance is obtained with minimal loss in MSE performance. The results show that it is advantageous to tailor compression algorithm designs to the required data exploitation tasks. Applications of joint compression/classification include compression for the archival or transmission of Landsat imagery that is later used for land utility surveys and/or radiometric analysis.
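
    A generic two-stage (tree-structured) vector quantizer built from k-means, for illustration only; the classification constraint that distinguishes the paper's design is omitted, and the "multispectral" vectors are random.

```python
# Two-level tree-structured vector quantizer from k-means (illustration only).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
pixels = rng.normal(size=(5000, 4))            # synthetic 4-band "multispectral" vectors

# Level 1: coarse codebook; Level 2: refine each coarse cell with its own codebook.
coarse = KMeans(n_clusters=4, n_init=5, random_state=0).fit(pixels)
fine_quantizers = []
for cell in range(4):
    members = pixels[coarse.labels_ == cell]
    fine_quantizers.append(KMeans(n_clusters=8, n_init=5, random_state=0).fit(members))

# Encode/decode: route each vector through the tree and replace it with its codeword.
cells = coarse.predict(pixels)
recon = np.empty_like(pixels)
for cell, q in enumerate(fine_quantizers):
    mask = cells == cell
    recon[mask] = q.cluster_centers_[q.predict(pixels[mask])]

mse = np.mean((pixels - recon) ** 2)
print(f"two-stage TSVQ, 32 codewords: MSE = {mse:.3f}")
```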

  10. Medication errors: definitions and classification

    PubMed Central

    Aronson, Jeffrey K

    2009-01-01

    To understand medication errors and to identify preventive strategies, we need to classify them and define the terms that describe them. The four main approaches to defining technical terms consider etymology, usage, previous definitions, and the Ramsey–Lewis method (based on an understanding of theory and practice). A medication error is ‘a failure in the treatment process that leads to, or has the potential to lead to, harm to the patient’. Prescribing faults, a subset of medication errors, should be distinguished from prescription errors. A prescribing fault is ‘a failure in the prescribing [decision-making] process that leads to, or has the potential to lead to, harm to the patient’. The converse of this, ‘balanced prescribing’ is ‘the use of a medicine that is appropriate to the patient's condition and, within the limits created by the uncertainty that attends therapeutic decisions, in a dosage regimen that optimizes the balance of benefit to harm’. This excludes all forms of prescribing faults, such as irrational, inappropriate, and ineffective prescribing, underprescribing and overprescribing. A prescription error is ‘a failure in the prescription writing process that results in a wrong instruction about one or more of the normal features of a prescription’. The ‘normal features’ include the identity of the recipient, the identity of the drug, the formulation, dose, route, timing, frequency, and duration of administration. Medication errors can be classified, invoking psychological theory, as knowledge-based mistakes, rule-based mistakes, action-based slips, and memory-based lapses. This classification informs preventive strategies. PMID:19594526

  11. Linear and Order Statistics Combiners for Pattern Classification

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)

    2001-01-01

    Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order-statistics-based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
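
    A short numerical check of the 1/N statement above: averaging N unbiased, uncorrelated estimates of a decision boundary reduces their variance by a factor of N.

```python
# Averaging N unbiased, uncorrelated boundary estimates cuts variance by N.
import numpy as np

rng = np.random.default_rng(0)
true_boundary = 0.0
n_trials, sigma = 100_000, 1.0

for N in (1, 2, 5, 10, 25):
    # Each classifier places the boundary at truth + independent noise.
    boundaries = true_boundary + rng.normal(0.0, sigma, size=(n_trials, N))
    combined = boundaries.mean(axis=1)
    print(f"N={N:2d}  var of combined boundary = {combined.var():.4f}  (sigma^2/N = {sigma**2 / N:.4f})")
```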

  12. Why do pathological stage IA lung adenocarcinomas vary from prognosis?: a clinicopathologic study of 176 patients with pathological stage IA lung adenocarcinoma based on the IASLC/ATS/ERS classification.

    PubMed

    Zhang, Jie; Wu, Jie; Tan, Qiang; Zhu, Lei; Gao, Wen

    2013-09-01

    Patients with pathological stage IA adenocarcinoma (AC) have a variable prognosis, even if treated in the same way. The postoperative treatment of pathological stage IA patients is also controversial. We identified 176 patients with pathological stage IA AC who had undergone a lobectomy and mediastinal lymph node dissection at the Shanghai Chest Hospital, Shanghai, China, between 2000 and 2006. No patient had preoperative treatment. The histologic subtypes of all patients were classified according to the 2011 International Association for the Study of Lung Cancer (IASLC)/American Thoracic Society (ATS)/European Respiratory Society (ERS) international multidisciplinary lung AC classification. Patients' 5-year overall survival (OS) and 5-year disease-free survival (DFS) were calculated using Kaplan-Meier and Cox regression analyses. One hundred seventy-six patients with pathological stage IA AC had an 86.6% 5-year OS and 74.6% 5-year DFS. The 10 patients with micropapillary predominant subtype had the lowest 5-year DFS (40.0%). The 12 patients with solid predominant with mucin production subtype had the lowest 5-year OS (66.7%). Univariate and multivariate analysis showed that sex and prognostic groups of the IASLC/ATS/ERS histologic classification were significantly associated with 5-year DFS of pathological stage IA AC. Our study revealed that sex was an independent prognostic factor of pathological stage IA AC. The IASLC/ATS/ERS classification of lung AC identifies histologic categories with prognostic differences that could be helpful in clinical therapy.

  13. Metrics for Analyzing Quantifiable Differentiation of Designs with Varying Integrity for Hardware Assurance

    DTIC Science & Technology

    2017-03-01

    proposed. Expected profiles can incorporate a level of overdesign. Finally, the Design Integrity measuring techniques are applied to five Test Article ...Inserted into Test System Table 2 presents the results of the analysis applied to each of the test article designs. Each of the domains are...the lowest integrities. Based on the analysis, the DI metric shows measurable differentiation between all five Test Article Error Location Error

  14. 48 CFR 52.247-29 - F.o.b. Origin.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ...) To, and placed on, the carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  15. 48 CFR 47.303-1 - F.o.b. origin.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight station; (3) To a U.S... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  16. 48 CFR 47.303-1 - F.o.b. origin.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight station; (3) To a U.S... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  17. 48 CFR 47.303-1 - F.o.b. origin.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight station; (3) To a U.S... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  18. 48 CFR 52.247-29 - F.o.b. Origin.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ...) To, and placed on, the carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  19. 48 CFR 52.247-29 - F.o.b. Origin.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ...) To, and placed on, the carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  20. 48 CFR 52.247-29 - F.o.b. Origin.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ...) To, and placed on, the carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  1. 48 CFR 47.303-1 - F.o.b. origin.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... carrier's wharf (at shipside, within reach of the ship's loading tackle, when the shipping point is within a port area having water transportation service) or the carrier's freight station; (3) To a U.S... terms of the governing freight classification or tariff (or Government rate tender) under which lowest...

  2. Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification.

    PubMed

    Rueckauer, Bodo; Lungu, Iulia-Alexandra; Hu, Yuhuang; Pfeiffer, Michael; Liu, Shih-Chii

    2017-01-01

    Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch-normalization and Inception-modules. This paper presents spiking equivalents of these operations therefore allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs in particular when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications.
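
    A toy illustration (not the authors' conversion toolbox) of the correspondence that makes such conversions possible: the firing rate of a non-leaky integrate-and-fire neuron with reset-by-subtraction approximates a ReLU of its input.

```python
# Firing rate of an integrate-and-fire neuron approximates max(0, input).
import numpy as np

def firing_rate(drive, threshold=1.0, dt=1e-3, duration=10.0):
    """Simulate a non-leaky integrate-and-fire neuron with constant input drive."""
    v, spikes = 0.0, 0
    for _ in range(int(duration / dt)):
        v += drive * dt
        if v >= threshold:
            spikes += 1
            v -= threshold          # reset by subtraction, as in rate-based conversion
    return spikes / duration

for drive in (-0.5, 0.0, 0.5, 1.0, 2.0):
    print(f"input {drive:+.1f} -> rate {firing_rate(drive):5.1f} Hz  (ReLU: {max(0.0, drive):.1f})")
```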

  3. Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification

    PubMed Central

    Rueckauer, Bodo; Lungu, Iulia-Alexandra; Hu, Yuhuang; Pfeiffer, Michael; Liu, Shih-Chii

    2017-01-01

    Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch-normalization and Inception-modules. This paper presents spiking equivalents of these operations therefore allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs in particular when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications. PMID:29375284

  4. 3D multi-view convolutional neural networks for lung nodule classification

    PubMed Central

    Kang, Guixia; Hou, Beibei; Zhang, Ningbo

    2017-01-01

    The 3D convolutional neural network (CNN) is able to make full use of the spatial 3D context information of lung nodules, and the multi-view strategy has been shown to be useful for improving the performance of 2D CNN in classifying lung nodules. In this paper, we explore the classification of lung nodules using the 3D multi-view convolutional neural networks (MV-CNN) with both chain architecture and directed acyclic graph architecture, including 3D Inception and 3D Inception-ResNet. All networks employ the multi-view-one-network strategy. We conduct a binary classification (benign and malignant) and a ternary classification (benign, primary malignant and metastatic malignant) on Computed Tomography (CT) images from Lung Image Database Consortium and Image Database Resource Initiative database (LIDC-IDRI). All results are obtained via 10-fold cross validation. As regards the MV-CNN with chain architecture, results show that the performance of 3D MV-CNN surpasses that of 2D MV-CNN by a significant margin. Finally, a 3D Inception network achieved an error rate of 4.59% for the binary classification and 7.70% for the ternary classification, both of which represent superior results for the corresponding task. We compare the multi-view-one-network strategy with the one-view-one-network strategy. The results reveal that the multi-view-one-network strategy can achieve a lower error rate than the one-view-one-network strategy. PMID:29145492
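
    A minimal volumetric CNN sketch, assuming PyTorch is available; the published networks are far larger and use Inception/ResNet blocks and multiple views, so this only shows that a 3-D nodule patch can be classified end-to-end with Conv3d layers.

```python
# Minimal 3D CNN sketch in PyTorch (not the paper's MV-CNN architecture).
import torch
import torch.nn as nn

class Tiny3DCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Linear(16 * 8 * 8 * 8, n_classes)

    def forward(self, x):                      # x: (batch, 1, 32, 32, 32) CT patch
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = Tiny3DCNN(n_classes=3)                 # e.g. benign / primary / metastatic
logits = model(torch.randn(4, 1, 32, 32, 32))  # four random 32^3 patches
print(logits.shape)                            # torch.Size([4, 3])
```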

  5. Automatic classification for mammogram backgrounds based on bi-rads complexity definition and on a multi content analysis framework

    NASA Astrophysics Data System (ADS)

    Wu, Jie; Besnehard, Quentin; Marchessoux, Cédric

    2011-03-01

    Clinical studies for the validation of new medical imaging devices require hundreds of images. An important step in creating and tuning the study protocol is the classification of images into "difficult" and "easy" cases. This consists of classifying the image based on features like the complexity of the background and the visibility of the disease (lesions). Therefore, an automatic medical background classification tool for mammograms would help with such clinical studies. This classification tool is based on a multi-content analysis framework (MCA) which was first developed to recognize the image content of computer screenshots. With the implementation of new texture features and a defined breast density scale, the MCA framework is able to automatically classify digital mammograms with satisfactory accuracy. The BI-RADS (Breast Imaging Reporting and Data System) density scale is used for grouping the mammograms, which standardizes the mammography reporting terminology and assessment and recommendation categories. Selected features are input into a decision tree classification scheme in the MCA framework, which is the so-called "weak classifier" (any classifier with a global error rate below 50%). With the AdaBoost iteration algorithm, these "weak classifiers" are combined into a "strong classifier" (a classifier with a low global error rate) for classifying one category. The results of classification for one "strong classifier" show good accuracy with high true-positive rates. For the four categories the results are: TP=90.38%, TN=67.88%, FP=32.12% and FN=9.62%.
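
    A hedged sketch of the boosting step described above: AdaBoost combines shallow decision-tree "weak classifiers" into a "strong classifier". The features here are synthetic stand-ins, not MaZda texture outputs.

```python
# AdaBoost: shallow-tree weak classifiers boosted into a strong classifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, n_informative=8, random_state=0)

weak = DecisionTreeClassifier(max_depth=1)               # error below 50%, but not by much
strong = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, clf in (("weak", weak), ("boosted", strong)):
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name:8s} error = {1 - acc:.3f}")
```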

  6. Hydrology, secondary growth, and elevation accuracy in two preliminary Amazon Basin SRTM DEMs

    NASA Astrophysics Data System (ADS)

    Alsdorf, D.; Hess, L.; Sheng, Y.; Souza, C.; Pavelsky, T.; Melack, J.; Dunne, T.; Hendricks, G.; Ballantine, A.; Holmes, K.

    2003-04-01

    Two preliminary Shuttle Radar Topography Mission digital elevation models (SRTM DEMs) of Manaus (1S to 5S and 59W to 63W) and Rondonia (9S to 12S and 61W to 64W) were received from the "PI Processor" at NASA JPL. We compared the Manaus DEM (C-band) with a previously constructed Cabaliana floodplain classification based on Global RainForest Mapping (GRFM) JERS-1 SAR data (L-band) and determined that habitats of open water, bare ground, and flooded shrub contained the lowest elevations; macrophyte and non-flooded shrub habitats are marked by intermediate elevations; and the highest elevations are found within flooded and non-flooded forest. Although the water surface typically produces specular reflections, double-bounce travel paths result from dead, leafless trees found across the Balbina reservoir near Manaus. There (i.e., in Balbina) the water surface is marked by pixel-to-pixel height changes of generally 0 to 1 m and changes across a ˜100 km transect rarely exceed 3 m. Reported SRTM errors throughout the transect range from 1 to 2 m with some errors up to 5 m. The smooth Balbina surface contrasts with the wind-roughened Amazon River surface where SRTM height variations easily range from 1 to 10 m (reported errors often exceed 5 m). Deforestation and subsequent regrowth in the Rondonia DEM is remarkably clear. Our colleagues used a 20 year sequence of Landsat TM/MSS classified imagery to delineate areas in various stages of secondary growth and we find a general trend of increasing vegetation height with increasing age. Flow path networks derived from the Cabaliana floodplain DEM are in general agreement with networks previously extracted from the GRFM mosaics; however, watershed boundaries differ. We have also developed an algorithm for extracting channel widths, which is presently being applied to the DEM and classified imagery to determine morphological variations between reaches.

  7. Classification of electroencephalograph signals using time-frequency decomposition and linear discriminant analysis

    NASA Astrophysics Data System (ADS)

    Szuflitowska, B.; Orlowski, P.

    2017-08-01

    An automated detection system consists of two key steps: extraction of features from EEG signals and classification for the detection of pathological activity. The EEG sequences were analyzed using the Short-Time Fourier Transform and the classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. A classification error below 10% was considered a success. Higher accuracy was obtained for new data of unknown classes than for the testing data. The methodology can be helpful in differentiating epileptic seizures from disturbances in the EEG signal in Alzheimer's Disease.
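
    A sketch of the same pipeline on simulated signals: band-power features from the short-time Fourier transform are fed to a linear discriminant classifier. The "EEG" traces are sinusoids plus noise, not clinical recordings.

```python
# STFT band-power features + LDA classification on simulated signals.
import numpy as np
from scipy.signal import stft
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
fs, n_samples, n_epochs = 256, 1024, 60

def epoch(dominant_freq):
    t = np.arange(n_samples) / fs
    return np.sin(2 * np.pi * dominant_freq * t) + rng.normal(0, 1.0, n_samples)

# Class 0: ~10 Hz dominant activity; class 1: ~4 Hz dominant activity.
signals = [epoch(10) for _ in range(n_epochs)] + [epoch(4) for _ in range(n_epochs)]
labels = np.array([0] * n_epochs + [1] * n_epochs)

def features(sig):
    f, t, Z = stft(sig, fs=fs, nperseg=128)
    power = np.abs(Z) ** 2
    return power.mean(axis=1)        # mean power per frequency bin

X = np.array([features(s) for s in signals])
acc = cross_val_score(LinearDiscriminantAnalysis(), X, labels, cv=5).mean()
print(f"classification error = {1 - acc:.3f}")
```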

  8. The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing

    ERIC Educational Resources Information Center

    Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi

    2013-01-01

    Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…

  9. Lexical Errors in Second Language Scientific Writing: Some Conceptual Implications

    ERIC Educational Resources Information Center

    Carrió Pastor, María Luisa; Mestre-Mestre, Eva María

    2014-01-01

    Nowadays, scientific writers are required not only a thorough knowledge of their subject field, but also a sound command of English as a lingua franca. In this paper, the lexical errors produced in scientific texts written in English by non-native researchers are identified to propose a classification of the categories they contain. This study…

  10. Land use surveys by means of automatic interpretation of LANDSAT system data

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Lombardo, M. A.; Novo, E. M. L. D.; Niero, M.; Foresti, C.

    1981-01-01

    Analyses for seven land-use classes are presented. The classes are: urban area, industrial area, bare soil, cultivated area, pastureland, reforestation, and natural vegetation. The automatic classification of LANDSAT MSS data using a maximum likelihood algorithm shows a 39% average error of omission and a 3.45% error of commission for the seven classes.

  11. Anatomical and/or pathological predictors for the “incorrect” classification of red dot markers on wrist radiographs taken following trauma

    PubMed Central

    Kranz, R

    2015-01-01

    Objective: To establish the prevalence of red dot markers in a sample of wrist radiographs and to identify any anatomical and/or pathological characteristics that predict “incorrect” red dot classification. Methods: Accident and emergency (A&E) wrist cases from a digital imaging and communications in medicine/digital teaching library were examined for red dot prevalence and for the presence of several anatomical and pathological features. Binary logistic regression analyses were run to establish if any of these features were predictors of incorrect red dot classification. Results: 398 cases were analysed. Red dot was “incorrectly” classified in 8.5% of cases; 6.3% were “false negatives” (“FNs”) and 2.3% were false positives (FPs) (to one decimal place). Old fractures [odds ratio (OR), 5.070 (1.256–20.471)] and reported degenerative change [OR, 9.870 (2.300–42.359)] were found to predict FPs. Frykman V [OR, 9.500 (1.954–46.179)], Frykman VI [OR, 6.333 (1.205–33.283)] and non-Frykman positive abnormalities [OR, 4.597 (1.264–16.711)] predict “FNs”. Old fractures and Frykman VI were predictive of error at 90% confidence interval (CI); the rest at 95% CI. Conclusion: The five predictors of incorrect red dot classification may inform the image interpretation training of radiographers and other professionals to reduce diagnostic error. Verification with larger samples would reinforce these findings. Advances in knowledge: All healthcare providers strive to eradicate diagnostic error. By examining specific anatomical and pathological predictors on radiographs for such error, as well as extrinsic factors that may affect reporting accuracy, image interpretation training can focus on these “problem” areas and influence which radiographic abnormality detection schemes are appropriate to implement in A&E departments. PMID:25496373

  12. How Should Children with Speech Sound Disorders be Classified? A Review and Critical Evaluation of Current Classification Systems

    ERIC Educational Resources Information Center

    Waring, R.; Knight, R.

    2013-01-01

    Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…

  13. Evaluation of the confusion matrix method in the validation of an automated system for measuring feeding behaviour of cattle.

    PubMed

    Ruuska, Salla; Hämäläinen, Wilhelmiina; Kajava, Sari; Mughal, Mikaela; Matilainen, Pekka; Mononen, Jaakko

    2018-03-01

    The aim of the present study was to evaluate empirically confusion matrices in device validation. We compared the confusion matrix method to linear regression and error indices in the validation of a device measuring feeding behaviour of dairy cattle. In addition, we studied how to extract additional information on classification errors with confusion probabilities. The data consisted of 12 h behaviour measurements from five dairy cows; feeding and other behaviour were detected simultaneously with a device and from video recordings. The resulting 216 000 pairs of classifications were used to construct confusion matrices and calculate performance measures. In addition, hourly durations of each behaviour were calculated and the accuracy of measurements was evaluated with linear regression and error indices. All three validation methods agreed when the behaviour was detected very accurately or inaccurately. Otherwise, in the intermediate cases, the confusion matrix method and error indices produced relatively concordant results, but the linear regression method often disagreed with them. Our study supports the use of confusion matrix analysis in validation since it is robust to any data distribution and type of relationship, it makes a stringent evaluation of validity, and it offers extra information on the type and sources of errors.
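
    A minimal sketch of the confusion-matrix evaluation described above, using invented second-by-second labels ("feeding" vs "other") and scikit-learn's metrics.

```python
# Confusion matrix between device-detected and video-observed behaviour.
from sklearn.metrics import confusion_matrix, classification_report

video  = ["feeding", "feeding", "other", "other", "feeding", "other", "other", "feeding"]
device = ["feeding", "other",   "other", "other", "feeding", "feeding", "other", "feeding"]

cm = confusion_matrix(video, device, labels=["feeding", "other"])
print(cm)                                             # rows: video truth, cols: device
print(classification_report(video, device, labels=["feeding", "other"]))
```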

  14. Bayes-LQAS: classifying the prevalence of global acute malnutrition

    PubMed Central

    2010-01-01

    Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classifications schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications. PMID:20534159
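
    A toy Beta-binomial version of the idea above: a prior on prevalence is combined with a hypothetical LQAS sample and the lot is classified against a threshold using the posterior probability. The numbers and the 0.5 decision cut-off are illustrative, not the authors' rule.

```python
# Beta-binomial classification of a lot against a prevalence threshold.
from scipy import stats

alpha0, beta0 = 2.0, 18.0        # prior: prevalence centred near 10%
n, cases = 50, 9                 # hypothetical LQAS sample
threshold = 0.15                 # classify "high prevalence" if p > 15%

posterior = stats.beta(alpha0 + cases, beta0 + n - cases)
prob_high = 1.0 - posterior.cdf(threshold)

decision = "high prevalence" if prob_high > 0.5 else "acceptable"
print(f"P(prevalence > {threshold:.0%}) = {prob_high:.2f} -> {decision}")
```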

  15. Bayes-LQAS: classifying the prevalence of global acute malnutrition.

    PubMed

    Olives, Casey; Pagano, Marcello

    2010-06-09

    Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classifications schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications.

  16. Optimization of the ANFIS using a genetic algorithm for physical work rate classification.

    PubMed

    Habibi, Ehsanollah; Salehi, Mina; Yadegarfar, Ghasem; Taheri, Ali

    2018-03-13

    Recently, a new method was proposed for physical work rate classification based on an adaptive neuro-fuzzy inference system (ANFIS). This study aims to present a genetic algorithm (GA)-optimized ANFIS model for a highly accurate classification of physical work rate. Thirty healthy men participated in this study. Directly measured heart rate and oxygen consumption of the participants in the laboratory were used for training the ANFIS classifier model in MATLAB version 8.0.0 using a hybrid algorithm. A similar process was done using the GA as an optimization technique. The accuracy, sensitivity and specificity of the ANFIS classifier model were increased successfully. The mean accuracy of the model was increased from 92.95 to 97.92%. Also, the calculated root mean square error of the model was reduced from 5.4186 to 3.1882. The maximum estimation error of the optimized ANFIS during the network testing process was ± 5%. The GA can be effectively used for ANFIS optimization and leads to an accurate classification of physical work rate. In addition to high accuracy, simple implementation and inter-individual variability consideration are two other advantages of the presented model.

  17. Galaxy Zoo 1: data release of morphological classifications for nearly 900 000 galaxies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lintott, C.; Slosar, A.

    Morphology is a powerful indicator of a galaxy's dynamical and merger history. It is strongly correlated with many physical parameters, including mass, star formation history and the distribution of mass. The Galaxy Zoo project collected simple morphological classifications of nearly 900,000 galaxies drawn from the Sloan Digital Sky Survey, contributed by hundreds of thousands of volunteers. This large number of classifications allows us to exclude classifier error, and measure the influence of subtle biases inherent in morphological classification. This paper presents the data collected by the project, alongside measures of classification accuracy and bias. The data are now publicly available and full catalogues can be downloaded in electronic format from http://data.galaxyzoo.org.

  18. Spotting East African mammals in open savannah from space.

    PubMed

    Yang, Zheng; Wang, Tiejun; Skidmore, Andrew K; de Leeuw, Jan; Said, Mohammed Y; Freer, Jim

    2014-01-01

    Knowledge of population dynamics is essential for managing and conserving wildlife. Traditional methods of counting wild animals such as aerial survey or ground counts not only disturb animals, but also can be labour intensive and costly. New, commercially available very high-resolution satellite images offer great potential for accurate estimates of animal abundance over large open areas. However, little research has been conducted in the area of satellite-aided wildlife census, although computer processing speeds and image analysis algorithms have vastly improved. This paper explores the possibility of detecting large animals in the open savannah of Maasai Mara National Reserve, Kenya from very high-resolution GeoEye-1 satellite images. A hybrid image classification method was employed for this specific purpose by incorporating the advantages of both pixel-based and object-based image classification approaches. This was performed in two steps: firstly, a pixel-based image classification method, i.e., artificial neural network was applied to classify potential targets with similar spectral reflectance at pixel level; and then an object-based image classification method was used to further differentiate animal targets from the surrounding landscapes through the applications of expert knowledge. As a result, the large animals in two pilot study areas were successfully detected with an average count error of 8.2%, omission error of 6.6% and commission error of 13.7%. The results of the study show for the first time that it is feasible to perform automated detection and counting of large wild animals in open savannahs from space, and therefore provide a complementary and alternative approach to the conventional wildlife survey techniques.

  19. Comparison of three methods for long-term monitoring of boreal lake area using Landsat TM and ETM+ imagery

    USGS Publications Warehouse

    Roach, Jennifer K.; Griffith, Brad; Verbyla, David

    2012-01-01

    Programs to monitor lake area change are becoming increasingly important in high latitude regions, and their development often requires evaluating tradeoffs among different approaches in terms of accuracy of measurement, consistency across multiple users over long time periods, and efficiency. We compared three supervised methods for lake classification from Landsat imagery (density slicing, classification trees, and feature extraction). The accuracy of lake area and number estimates was evaluated relative to high-resolution aerial photography acquired within two days of satellite overpasses. The shortwave infrared band 5 was better at separating surface water from nonwater when used alone than when combined with other spectral bands. The simplest of the three methods, density slicing, performed best overall. The classification tree method resulted in the most omission errors (approx. 2x), feature extraction resulted in the most commission errors (approx. 4x), and density slicing had the least directional bias (approx. half of the lakes with overestimated area and half of the lakes with underestimated area). Feature extraction was the least consistent across training sets (i.e., large standard error among different training sets). Density slicing was the best of the three at classifying small lakes as evidenced by its lower optimal minimum lake size criterion of 5850 m2 compared with the other methods (8550 m2). Contrary to conventional wisdom, the use of additional spectral bands and a more sophisticated method not only required additional processing effort but also had a cost in terms of the accuracy and consistency of lake classifications.
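
    The best-performing method here, density slicing, is essentially a single threshold on the shortwave infrared band followed by a minimum-size filter. The sketch below is illustrative only (synthetic reflectances, hypothetical threshold and lake-size values), not the USGS processing chain.

```python
# Illustrative density slicing (synthetic reflectances, not the USGS workflow):
# water is mapped where TM band 5 (SWIR) falls below a threshold, then candidate
# lakes smaller than a minimum size are discarded.
import numpy as np
from scipy import ndimage

def density_slice_lakes(band5, threshold, min_area_px):
    water = band5 < threshold                               # SWIR is strongly absorbed by water
    labels, n = ndimage.label(water)                        # connected lake candidates
    sizes = ndimage.sum(water, labels, index=np.arange(1, n + 1))
    keep_ids = np.nonzero(sizes >= min_area_px)[0] + 1      # labels meeting the size criterion
    return np.isin(labels, keep_ids)

band5 = np.random.default_rng(1).random((500, 500)) * 0.4 + 0.1  # bright (dry) background
band5[200:240, 300:360] = 0.02                                   # one dark synthetic lake
lakes = density_slice_lakes(band5, threshold=0.05, min_area_px=7)  # ~5850 m^2 at 30 m pixels
print("lake pixels:", int(lakes.sum()))
```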

  20. Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

    PubMed Central

    Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

    2009-01-01

    Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis. PMID:20011037

  1. Electroencephalography epilepsy classifications using hybrid cuckoo search and neural network

    NASA Astrophysics Data System (ADS)

    Pratiwi, A. B.; Damayanti, A.; Miswanto

    2017-07-01

    Epilepsy is a condition that affects the brain and causes repeated seizures. These seizures are episodes that can range from brief, nearly undetectable events to long periods of vigorous shaking or contractions. Epilepsy can often be confirmed with electroencephalography (EEG). Neural networks have been used in biomedical signal analysis and have successfully classified biomedical signals such as the EEG. In this paper, a hybrid of cuckoo search and a neural network is used to recognize EEG signals for epilepsy classification. The weights of the multilayer perceptron are optimized by the cuckoo search algorithm based on its error. The aim of this method is to make the network reach a local or global optimum faster, so that the classification process becomes more accurate. Compared with the traditional multilayer perceptron, the hybrid cuckoo search and multilayer perceptron provides better performance in terms of error convergence and accuracy. The proposed method gives an MSE of 0.001 and an accuracy of 90.0%.

  2. Selective Weighted Least Squares Method for Fourier Transform Infrared Quantitative Analysis.

    PubMed

    Wang, Xin; Li, Yan; Wei, Haoyun; Chen, Xia

    2017-06-01

    Classical least squares (CLS) regression is a popular multivariate statistical method used frequently for quantitative analysis using Fourier transform infrared (FT-IR) spectrometry. Classical least squares provides the best unbiased estimator for uncorrelated residual errors with zero mean and equal variance. However, the noise in FT-IR spectra, which accounts for a large portion of the residual errors, is heteroscedastic. Thus, if this noise with zero mean dominates in the residual errors, the weighted least squares (WLS) regression method described in this paper is a better estimator than CLS. However, if bias errors, such as the residual baseline error, are significant, WLS may perform worse than CLS. In this paper, we compare the effect of noise and bias error in using CLS and WLS in quantitative analysis. Results indicated that for wavenumbers with low absorbance, the bias error significantly affected the error, such that the performance of CLS is better than that of WLS. However, for wavenumbers with high absorbance, the noise significantly affected the error, and WLS proves to be better than CLS. Thus, we propose a selective weighted least squares (SWLS) regression that processes data with different wavenumbers using either CLS or WLS based on a selection criterion, i.e., lower or higher than an absorbance threshold. The effects of various factors on the optimal threshold value (OTV) for SWLS have been studied through numerical simulations. These studies reported that: (1) the concentration and the analyte type had minimal effect on OTV; and (2) the major factor that influences OTV is the ratio between the bias error and the standard deviation of the noise. The last part of this paper is dedicated to quantitative analysis of methane gas spectra, and methane/toluene mixtures gas spectra as measured using FT-IR spectrometry and CLS, WLS, and SWLS. The standard error of prediction (SEP), bias of prediction (bias), and the residual sum of squares of the errors (RSS) from the three quantitative analyses were compared. In methane gas analysis, SWLS yielded the lowest SEP and RSS among the three methods. In methane/toluene mixture gas analysis, a modification of the SWLS has been presented to tackle the bias error from other components. The SWLS without modification presents the lowest SEP in all cases but not bias and RSS. The modification of SWLS reduced the bias, which showed a lower RSS than CLS, especially for small components.
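
    One plausible reading of the selective weighting scheme, sketched with made-up spectra: wavenumbers whose absorbance falls below the threshold keep uniform (CLS) weights, while high-absorbance, noise-dominated wavenumbers receive inverse-variance (WLS) weights, and the concentrations are then estimated in a single weighted least-squares solve. The threshold and noise model below are assumptions for illustration, not the paper's optimal threshold value.

```python
# One plausible SWLS sketch (made-up spectra): CLS weights for low-absorbance
# wavenumbers, inverse-variance WLS weights above the absorbance threshold,
# then a single weighted least-squares solve for the concentrations.
import numpy as np

rng = np.random.default_rng(2)
K = np.abs(rng.normal(size=(400, 2)))              # pure-component spectra: 400 wavenumbers, 2 analytes
c_true = np.array([0.7, 0.3])
clean = K @ c_true
A = clean + rng.normal(scale=0.01 * (1 + clean))   # measured spectrum, heteroscedastic noise

def swls_concentrations(A, K, noise_sd, otv):
    """Solve A ~ K @ c with CLS weights below the threshold `otv`, WLS weights above."""
    w = np.ones_like(A)                            # uniform (CLS) weights
    high = A > otv                                 # noise-dominated, high-absorbance points
    w[high] = 1.0 / noise_sd[high] ** 2            # inverse-variance (WLS) weights
    ws = np.sqrt(w)
    c, *_ = np.linalg.lstsq(K * ws[:, None], A * ws, rcond=None)
    return c

noise_sd = 0.01 * (1 + A)                          # assumed noise model
print("true:", c_true, "estimated:", swls_concentrations(A, K, noise_sd, otv=np.median(A)))
```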

  3. Distributions in the error space: goal-directed movements described in time and state-space representations.

    PubMed

    Fisher, Moria E; Huang, Felix C; Wright, Zachary A; Patton, James L

    2014-01-01

    Manipulation of error feedback has been of great interest to recent studies in motor control and rehabilitation. Typically, motor adaptation is shown as a change in performance with a single scalar metric for each trial, yet such an approach might overlook details about how error evolves through the movement. We believe that statistical distributions of movement error through the extent of the trajectory can reveal unique patterns of adaptation and possibly offer clues to how the motor system processes information about error. This paper describes different possible ordinate domains, focusing on representations in time and state-space, used to quantify reaching errors. We hypothesized that the domain with the lowest amount of variability would lead to a predictive model of reaching error with the highest accuracy. Here we showed that errors represented in a time domain demonstrate the least variance and allow for the most accurate predictive model of reaching errors. These predictive models will give rise to more specialized methods of robotic feedback and improve previous techniques of error augmentation.

  4. Peculiarities of use of ECOC and AdaBoost based classifiers for thematic processing of hyperspectral data

    NASA Astrophysics Data System (ADS)

    Dementev, A. O.; Dmitriev, E. V.; Kozoderov, V. V.; Egorov, V. D.

    2017-10-01

    Hyperspectral imaging is an up-to-date, promising technology widely applied for accurate thematic mapping. The presence of a large number of narrow survey channels allows us to use subtle differences in the spectral characteristics of objects and to make a more detailed classification than in the case of standard multispectral data. The difficulties encountered in the processing of hyperspectral images are usually associated with the redundancy of spectral information, which leads to the curse of dimensionality. Methods currently used for recognizing objects in multispectral and hyperspectral images are usually based on standard supervised base classification algorithms of various complexity. The accuracy of these algorithms can differ significantly depending on the classification task considered. In this paper we study the performance of ensemble classification methods for the problem of classifying forest vegetation. Error-correcting output codes and boosting are tested on artificial data and real hyperspectral images. It is demonstrated that boosting gives a more significant improvement when used with simple base classifiers. The accuracy in this case is comparable to that of the error-correcting output code (ECOC) classifier with a Gaussian-kernel SVM base algorithm, so the necessity of boosting ECOC with a Gaussian-kernel SVM is questionable. It is also demonstrated that the selected ensemble classifiers allow us to recognize forest species with accuracy high enough to be compared with ground-based forest inventory data.
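
    The two ensemble strategies compared in the abstract can be prototyped directly in scikit-learn. The sketch below uses synthetic multi-class data rather than the authors' hyperspectral imagery, with an RBF-kernel SVM as the ECOC base classifier and shallow decision trees as the boosted base classifiers.

```python
# Synthetic-data sketch (not the authors' hyperspectral pipeline) of the two
# ensemble strategies: ECOC with an RBF-kernel SVM base classifier versus
# AdaBoost over shallow decision trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1500, n_features=60, n_informative=30,
                           n_classes=6, random_state=0)

models = {
    "ECOC + RBF-kernel SVM": OutputCodeClassifier(SVC(kernel="rbf", gamma="scale"),
                                                  code_size=2.0, random_state=0),
    "AdaBoost + shallow trees": AdaBoostClassifier(DecisionTreeClassifier(max_depth=2),
                                                   n_estimators=200, random_state=0),
}
for name, clf in models.items():
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f} cross-validated accuracy")
```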

  5. Error, Power, and Blind Sentinels: The Statistics of Seagrass Monitoring

    PubMed Central

    Schultz, Stewart T.; Kruschel, Claudia; Bakran-Petricioli, Tatjana; Petricioli, Donat

    2015-01-01

    We derive statistical properties of standard methods for monitoring of habitat cover worldwide, and criticize them in the context of mandated seagrass monitoring programs, as exemplified by Posidonia oceanica in the Mediterranean Sea. We report the novel result that cartographic methods with non-trivial classification errors are generally incapable of reliably detecting habitat cover losses less than about 30 to 50%, and the field labor required to increase their precision can be orders of magnitude higher than that required to estimate habitat loss directly in a field campaign. We derive a universal utility threshold of classification error in habitat maps that represents the minimum habitat map accuracy above which direct methods are superior. Widespread government reliance on blind-sentinel methods for monitoring seafloor can obscure the gradual and currently ongoing losses of benthic resources until the time has long passed for meaningful management intervention. We find two classes of methods with very high statistical power for detecting small habitat cover losses: 1) fixed-plot direct methods, which are over 100 times as efficient as direct random-plot methods in a variable habitat mosaic; and 2) remote methods with very low classification error such as geospatial underwater videography, which is an emerging, low-cost, non-destructive method for documenting small changes at millimeter visual resolution. General adoption of these methods and their further development will require a fundamental cultural change in conservation and management bodies towards the recognition and promotion of requirements of minimal statistical power and precision in the development of international goals for monitoring these valuable resources and the ecological services they provide. PMID:26367863

  6. Land use in the Paraiba Valley through remotely sensed data. [Brazil

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Lombardo, M. A.; Novo, E. M. L. D.; Niero, M.; Foresti, C.

    1980-01-01

    A methodology for land use survey was developed and land use modification rates were determined using LANDSAT imagery of the Paraiba Valley (state of Sao Paulo). Both visual and automatic interpretation methods were employed to analyze seven land use classes: urban area, industrial area, bare soil, cultivated area, pastureland, reforestation and natural vegetation. By means of visual interpretation, little spectral difference is observed among these classes. The automatic classification of LANDSAT MSS data using the maximum likelihood algorithm shows a 39% average error of omission and a 3.4% error of inclusion for the seven classes. The complexity of land uses in the study area, the large spectral variations of the analyzed classes, and the low resolution of LANDSAT data influenced the classification results.

  7. Multilayer perceptron, fuzzy sets, and classification

    NASA Technical Reports Server (NTRS)

    Pal, Sankar K.; Mitra, Sushmita

    1992-01-01

    A fuzzy neural network model based on the multilayer perceptron, using the back-propagation algorithm, and capable of fuzzy classification of patterns is described. The input vector consists of membership values to linguistic properties while the output vector is defined in terms of fuzzy class membership values. This allows efficient modeling of fuzzy or uncertain patterns with appropriate weights being assigned to the backpropagated errors depending upon the membership values at the corresponding outputs. During training, the learning rate is gradually decreased in discrete steps until the network converges to a minimum error solution. The effectiveness of the algorithm is demonstrated on a speech recognition problem. The results are compared with those of the conventional MLP, the Bayes classifier, and the other related models.

  8. One-Class Classification-Based Real-Time Activity Error Detection in Smart Homes.

    PubMed

    Das, Barnan; Cook, Diane J; Krishnan, Narayanan C; Schmitter-Edgecombe, Maureen

    2016-08-01

    Caring for individuals with dementia is frequently associated with extreme physical and emotional stress, which often leads to depression. Smart home technology and advances in machine learning techniques can provide innovative solutions to reduce caregiver burden. One key service that caregivers provide is prompting individuals with memory limitations to initiate and complete daily activities. We hypothesize that sensor technologies combined with machine learning techniques can automate the process of providing reminder-based interventions. The first step towards automated interventions is to detect when an individual faces difficulty with activities. We propose machine learning approaches based on one-class classification that learn normal activity patterns. When we apply these classifiers to activity patterns that were not seen before, the classifiers are able to detect activity errors, which represent potential prompt situations. We validate our approaches on smart home sensor data obtained from older adult participants, some of whom faced difficulties performing routine activities and thus committed errors.
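
    A minimal sketch of the one-class idea, assuming synthetic feature vectors rather than the smart-home sensor data: the detector is fit only on normal activity patterns, and anything it treats as an outlier is flagged as a potential activity error, i.e. a prompt situation.

```python
# Minimal one-class sketch (synthetic features, not the smart-home sensor data):
# fit only on normal activity patterns; outliers are flagged as potential errors.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(3)
normal_train = rng.normal(0.0, 1.0, size=(500, 10))    # normal activity feature vectors
normal_test = rng.normal(0.0, 1.0, size=(50, 10))
error_test = rng.normal(4.0, 1.0, size=(50, 10))       # anomalous (erroneous) activity

detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal_train)

# predict() returns +1 for inliers (normal) and -1 for outliers (potential prompt situations)
print("flagged among normal activities:   ", int((detector.predict(normal_test) == -1).sum()))
print("flagged among erroneous activities:", int((detector.predict(error_test) == -1).sum()))
```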

  9. Improving the mapping of crop types in the Midwestern U.S. by fusing Landsat and MODIS satellite data

    NASA Astrophysics Data System (ADS)

    Zhu, Likai; Radeloff, Volker C.; Ives, Anthony R.

    2017-06-01

    Mapping crop types is of great importance for assessing agricultural production, land-use patterns, and the environmental effects of agriculture. Indeed, both the radiometric and spatial resolution of Landsat's sensors are optimized for cropland monitoring. However, accurate mapping of crop types requires frequent cloud-free images during the growing season, which are often not available, and this raises the question of whether Landsat data can be combined with data from other satellites. Here, our goal is to evaluate to what degree fusing Landsat with MODIS Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) data can improve crop-type classification. Choosing either one or two images from all cloud-free Landsat observations available for the Arlington Agricultural Research Station area in Wisconsin from 2010 to 2014, we generated 87 combinations of images, and used each combination as input into the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) algorithm to predict Landsat-like images at the nominal dates of each 8-day MODIS NBAR product. Both the original Landsat and STARFM-predicted images were then classified with a support vector machine (SVM), and we compared the classification errors of three scenarios: 1) classifying the one or two original Landsat images of each combination only, 2) classifying the one or two original Landsat images plus all STARFM-predicted images, and 3) classifying the one or two original Landsat images together with STARFM-predicted images for key dates. Our results indicated that using two Landsat images as the input of STARFM did not significantly improve the STARFM predictions compared to using only one, and predictions using Landsat images between July and August as input were most accurate. Including all STARFM-predicted images together with the Landsat images significantly increased average classification error by 4 percentage points (from 21% to 25%) compared to using only Landsat images. However, incorporating only STARFM-predicted images for key dates decreased average classification error by 2 percentage points (from 21% to 19%) compared to using only Landsat images. In particular, if only a single Landsat image was available, adding STARFM predictions for key dates significantly decreased the average classification error by 4 percentage points from 30% to 26% (p < 0.05). We conclude that adding STARFM-predicted images can be effective for improving crop-type classification when only limited Landsat observations are available, but carefully selecting images from a full set of STARFM predictions is crucial. We developed an approach to identify the optimal subsets of all STARFM predictions, which gives an alternative method of feature selection for future research.

  10. Empirical performance of interpolation techniques in risk-neutral density (RND) estimation

    NASA Astrophysics Data System (ADS)

    Bahaludin, H.; Abdullah, M. H.

    2017-03-01

    The objective of this study is to evaluate the empirical performance of interpolation techniques in risk-neutral density (RND) estimation. Firstly, the empirical performance is evaluated by using statistical analysis based on the implied mean and the implied variance of RND. Secondly, the interpolation performance is measured based on pricing error. We propose using the leave-one-out cross-validation (LOOCV) pricing error for interpolation selection purposes. The statistical analyses indicate that there are statistical differences between the interpolation techniques: second-order polynomial, fourth-order polynomial and smoothing spline. The LOOCV pricing error results show that interpolation using the fourth-order polynomial provides the best fit to option prices, as it has the lowest error value.
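
    A sketch of the LOOCV selection procedure on a synthetic volatility smile (interpolating implied volatilities rather than repricing options, to keep the example short): each quote is left out in turn, the remaining points are fitted with each candidate interpolant, and the scheme with the lowest leave-one-out error is preferred.

```python
# LOOCV interpolation selection on a synthetic smile (illustrative only): leave
# each quote out, refit the candidate interpolant, and score the held-out error.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(4)
strikes = np.linspace(80, 120, 15)
iv = 0.2 + 0.0004 * (strikes - 100) ** 2 + rng.normal(scale=0.004, size=strikes.size)

def loocv_mse(fit_predict):
    errors = []
    for i in range(strikes.size):
        mask = np.arange(strikes.size) != i
        pred = fit_predict(strikes[mask], iv[mask], strikes[i])
        errors.append((pred - iv[i]) ** 2)
    return float(np.mean(errors))

candidates = {
    "2nd-order polynomial": lambda x, y, x0: np.polyval(np.polyfit(x, y, 2), x0),
    "4th-order polynomial": lambda x, y, x0: np.polyval(np.polyfit(x, y, 4), x0),
    "smoothing spline":     lambda x, y, x0: UnivariateSpline(x, y, k=3, s=1e-3)(x0),
}
for name, f in candidates.items():
    print(f"{name}: LOOCV MSE = {loocv_mse(f):.2e}")
```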

  11. The Passive Series Stiffness That Optimizes Torque Tracking for a Lower-Limb Exoskeleton in Human Walking

    PubMed Central

    Zhang, Juanjuan; Collins, Steven H.

    2017-01-01

    This study uses theory and experiments to investigate the relationship between the passive stiffness of series elastic actuators and torque tracking performance in lower-limb exoskeletons during human walking. Through theoretical analysis with our simplified system model, we found that the optimal passive stiffness matches the slope of the desired torque-angle relationship. We also conjectured that a bandwidth limit resulted in a maximum rate of change in torque error that can be commanded through control input, which is fixed across desired and passive stiffness conditions. This led to hypotheses about the interactions among optimal control gains, passive stiffness and desired quasi-stiffness. Walking experiments were conducted with multiple angle-based desired torque curves. The lowest torque tracking errors observed for each combination of desired and passive stiffnesses were shown to be linearly proportional to the magnitude of the difference between the two stiffnesses. The proportional gains corresponding to the lowest observed errors were seen to be inversely proportional to passive stiffness values and to desired stiffness. These findings supported our hypotheses, and provide guidance for application-specific hardware customization as well as controller design for torque-controlled robotic legged locomotion. PMID:29326580

  12. Combining inferences from models of capture efficiency, detectability, and suitable habitat to classify landscapes for conservation of threatened bull trout

    USGS Publications Warehouse

    Peterson, J.; Dunham, J.B.

    2003-01-01

    Effective conservation efforts for at-risk species require knowledge of the locations of existing populations. Species presence can be estimated directly by conducting field-sampling surveys or alternatively by developing predictive models. Direct surveys can be expensive and inefficient, particularly for rare and difficult-to-sample species, and models of species presence may produce biased predictions. We present a Bayesian approach that combines sampling and model-based inferences for estimating species presence. The accuracy and cost-effectiveness of this approach were compared to those of sampling surveys and predictive models for estimating the presence of the threatened bull trout ( Salvelinus confluentus ) via simulation with existing models and empirical sampling data. Simulations indicated that a sampling-only approach would be the most effective and would result in the lowest presence and absence misclassification error rates for three thresholds of detection probability. When sampling effort was considered, however, the combined approach resulted in the lowest error rates per unit of sampling effort. Hence, lower probability-of-detection thresholds can be specified with the combined approach, resulting in lower misclassification error rates and improved cost-effectiveness.
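
    The combined inference can be illustrated with a simple Bayes update (the prior, detection probability and survey counts below are hypothetical, not the published bull trout models): a habitat model supplies the prior probability that a patch is occupied, and repeated surveys with imperfect detection and no fish found progressively lower it.

```python
# Illustrative combined inference (hypothetical numbers, not the bull trout models):
# a habitat model gives the prior occupancy probability; surveys with imperfect
# detection and no detections progressively lower the posterior.
def posterior_presence(prior, p_detect, n_surveys, n_detections):
    """Posterior probability that the species is present in the sampled patch."""
    if n_detections > 0:
        return 1.0                                   # assumes no false-positive detections
    miss_if_present = (1.0 - p_detect) ** n_surveys
    evidence = prior * miss_if_present + (1.0 - prior)
    return prior * miss_if_present / evidence

prior = 0.4       # hypothetical habitat-model prediction for a stream patch
p_detect = 0.3    # hypothetical single-pass detection probability
for k in range(5):
    post = posterior_presence(prior, p_detect, n_surveys=k, n_detections=0)
    print(f"{k} passes, no fish detected -> P(present) = {post:.2f}")
```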

  13. Exploring diversity in ensemble classification: Applications in large area land cover mapping

    NASA Astrophysics Data System (ADS)

    Mellor, Andrew; Boukir, Samia

    2017-07-01

    Ensemble classifiers, such as random forests, are now commonly applied in the field of remote sensing, and have been shown to perform better than single classifier systems, resulting in reduced generalisation error. Diversity across the members of ensemble classifiers is known to have a strong influence on classification performance - whereby classifier errors are uncorrelated and more uniformly distributed across ensemble members. The relationship between ensemble diversity and classification performance has not yet been fully explored in the fields of information science and machine learning and has never been examined in the field of remote sensing. This study is a novel exploration of ensemble diversity and its link to classification performance, applied to a multi-class canopy cover classification problem using random forests and multisource remote sensing and ancillary GIS data, across seven million hectares of diverse dry-sclerophyll dominated public forests in Victoria Australia. A particular emphasis is placed on analysing the relationship between ensemble diversity and ensemble margin - two key concepts in ensemble learning. The main novelty of our work is on boosting diversity by emphasizing the contribution of lower margin instances used in the learning process. Exploring the influence of tree pruning on diversity is also a new empirical analysis that contributes to a better understanding of ensemble performance. Results reveal insights into the trade-off between ensemble classification accuracy and diversity, and through the ensemble margin, demonstrate how inducing diversity by targeting lower margin training samples is a means of achieving better classifier performance for more difficult or rarer classes and reducing information redundancy in classification problems. Our findings inform strategies for collecting training data and designing and parameterising ensemble classifiers, such as random forests. This is particularly important in large area remote sensing applications, for which training data is costly and resource intensive to collect.
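
    The ensemble margin at the centre of this analysis can be computed directly from the per-tree votes of a random forest. The sketch below uses synthetic data rather than the multisource Victorian forest data; the low-margin instances it identifies are the difficult samples that the authors propose emphasising during learning.

```python
# Ensemble-margin sketch (synthetic data, not the Victorian forest mapping): the
# margin of each instance is the vote fraction for its true class minus the
# largest vote fraction for any other class; low margins flag difficult samples.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

votes = np.stack([tree.predict(X) for tree in rf.estimators_])        # (n_trees, n_samples)
fractions = np.stack([(votes == c).mean(axis=0) for c in range(len(rf.classes_))])

idx = np.arange(len(y))
true_fraction = fractions[y, idx]
others = fractions.copy()
others[y, idx] = -1.0                                                 # mask the true class
margin = true_fraction - others.max(axis=0)

print("mean ensemble margin:", round(float(margin.mean()), 3))
print("ten lowest-margin (hardest) training samples:", np.argsort(margin)[:10])
```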

  14. Analysis of swallowing sounds using hidden Markov models.

    PubMed

    Aboofazeli, Mohammad; Moussavi, Zahra

    2008-04-01

    In recent years, acoustical analysis of the swallowing mechanism has received considerable attention due to its diagnostic potential. This paper presents a hidden Markov model (HMM)-based method for swallowing sound segmentation and classification. Swallowing sound signals of 15 healthy and 11 dysphagic subjects were studied. The signals were divided into sequences of 25 ms segments, each of which was represented by seven features. The sequences of features were modeled by HMMs. Trained HMMs were used for segmentation of the swallowing sounds into three distinct phases, i.e., initial quiet period, initial discrete sounds (IDS) and bolus transit sounds (BTS). Among the seven features, the accuracy of segmentation by the HMM based on the multi-scale product of wavelet coefficients was higher than that of the other HMMs, and the linear prediction coefficient (LPC)-based HMM showed the weakest performance. In addition, HMMs were used for classification of the swallowing sounds of healthy subjects and dysphagic patients. Classification accuracy of different HMM configurations was investigated. When we increased the number of states of the HMMs from 4 to 8, the classification error gradually decreased. In most cases, classification error for N=9 was higher than that of N=8. Among the seven features used, root mean square (RMS) and waveform fractal dimension (WFD) showed the best performance in the HMM-based classification of swallowing sounds. When the sequences of the features of the IDS segment were modeled separately, the accuracy reached up to 85.5%. As a second-stage classification, a screening algorithm was used that correctly classified all the subjects but one healthy subject when RMS was used as the characteristic feature of the swallowing sounds and the number of states was set to N=8.
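
    A sketch of the two-class likelihood comparison, assuming the hmmlearn package and synthetic seven-dimensional feature sequences in place of the swallowing-sound features: one GaussianHMM per group is trained, and a new sequence is assigned to the class whose model yields the higher log-likelihood.

```python
# HMM classification sketch (assumes the hmmlearn package; synthetic 7-dimensional
# feature sequences stand in for the swallowing-sound features): train one
# GaussianHMM per class, assign new sequences to the higher-likelihood model.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(5)

def make_sequences(offset, n_seq=20, length=120, dim=7):
    return [rng.normal(loc=offset, scale=1.0, size=(length, dim)) for _ in range(n_seq)]

def fit_class_hmm(sequences, n_states=8):
    X = np.concatenate(sequences)
    lengths = [len(s) for s in sequences]
    return GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=30).fit(X, lengths)

healthy_model = fit_class_hmm(make_sequences(offset=0.0))
dysphagic_model = fit_class_hmm(make_sequences(offset=0.8))

test_seq = rng.normal(loc=0.8, scale=1.0, size=(120, 7))
scores = {"healthy": healthy_model.score(test_seq),
          "dysphagic": dysphagic_model.score(test_seq)}
print(scores, "->", max(scores, key=scores.get))
```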

  15. Spelling in Adolescents with Dyslexia: Errors and Modes of Assessment

    ERIC Educational Resources Information Center

    Tops, Wim; Callens, Maaike; Bijn, Evi; Brysbaert, Marc

    2014-01-01

    In this study we focused on the spelling of high-functioning students with dyslexia. We made a detailed classification of the errors in a word and sentence dictation task made by 100 students with dyslexia and 100 matched control students. All participants were in the first year of their bachelor's studies and had Dutch as mother tongue. Three…

  16. Estimation of a cover-type change matrix from error-prone data

    Treesearch

    Steen Magnussen

    2009-01-01

    Coregistration and classification errors seriously compromise per-pixel estimates of land cover change. A more robust estimation of change is proposed in which adjacent pixels are grouped into 3x3 clusters and treated as a unit of observation. A complete change matrix is recovered in a two-step process. The diagonal elements of a change matrix are recovered from...

  17. Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2010-01-01

    Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…

  18. North American vegetation model for land-use planning in a changing climate: A solution to large classification problems

    Treesearch

    Gerald E. Rehfeldt; Nicholas L. Crookston; Cuauhtemoc Saenz-Romero; Elizabeth M. Campbell

    2012-01-01

    Data points intensively sampling 46 North American biomes were used to predict the geographic distribution of biomes from climate variables using the Random Forests classification tree. Techniques were incorporated to accommodate a large number of classes and to predict the future occurrence of climates beyond the contemporary climatic range of the biomes. Errors of...

  19. Feasibility of Equivalent Dipole Models for Electroencephalogram-Based Brain Computer Interfaces.

    PubMed

    Schimpf, Paul H

    2017-09-15

    This article examines the localization errors of equivalent dipolar sources inverted from the surface electroencephalogram in order to determine the feasibility of using their location as classification parameters for non-invasive brain computer interfaces. Inverse localization errors are examined for two head models: a model represented by four concentric spheres and a realistic model based on medical imagery. It is shown that the spherical model results in localization ambiguity such that a number of dipolar sources, with different azimuths and varying orientations, provide a near match to the electroencephalogram of the best equivalent source. No such ambiguity exists for the elevation of inverted sources, indicating that for spherical head models, only the elevation of inverted sources (and not the azimuth) can be expected to provide meaningful classification parameters for brain-computer interfaces. In a realistic head model, all three parameters of the inverted source location are found to be reliable, providing a more robust set of parameters. In both cases, the residual error hypersurfaces demonstrate local minima, indicating that a search for the best-matching sources should be global. Source localization error vs. signal-to-noise ratio is also demonstrated for both head models.

  20. Bayes Error Rate Estimation Using Classifier Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep

    2003-01-01

    The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error, in general, yield rather weak results for small sample sizes unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that just looks at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Proben1 benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
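
    A simplified sketch of the first of the three methods (not the paper's exact estimator): the posterior estimates of several classifiers are averaged and the plug-in error of the averaged posteriors, the mean over samples of 1 - max_c p(c|x), is taken as the Bayes-error estimate. The data are two overlapping Gaussians whose true Bayes error is known in closed form, so the estimate can be checked.

```python
# Simplified version of the first method (not the paper's exact estimator): average
# the posterior estimates of several classifiers and take the plug-in error of the
# averaged posteriors as the Bayes-error estimate. Two overlapping Gaussians are
# used so that the true Bayes error, Phi(-1) ~ 0.159, is known for comparison.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(6)
n = 4000
y = rng.integers(0, 2, n)
X = rng.normal(loc=2.0 * y, scale=1.0).reshape(-1, 1)     # class 0 ~ N(0,1), class 1 ~ N(2,1)

ensemble = [LogisticRegression(), GaussianNB(), KNeighborsClassifier(n_neighbors=50)]
avg_posterior = np.mean([clf.fit(X, y).predict_proba(X) for clf in ensemble], axis=0)

estimate = float(np.mean(1.0 - avg_posterior.max(axis=1)))   # plug-in Bayes-error estimate
print(f"true Bayes error = {norm.cdf(-1.0):.3f}, ensemble estimate = {estimate:.3f}")
```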

  1. Identification of terrain cover using the optimum polarimetric classifier

    NASA Technical Reports Server (NTRS)

    Kong, J. A.; Swartz, A. A.; Yueh, H. A.; Novak, L. M.; Shin, R. T.

    1988-01-01

    A systematic approach for the identification of terrain media such as vegetation canopy, forest, and snow-covered fields is developed using the optimum polarimetric classifier. The covariance matrices for various terrain cover are computed from theoretical models of random medium by evaluating the scattering matrix elements. The optimal classification scheme makes use of a quadratic distance measure and is applied to classify a vegetation canopy consisting of both trees and grass. Experimentally measured data are used to validate the classification scheme. Analytical and Monte Carlo simulated classification errors using the fully polarimetric feature vector are compared with classification based on single features which include the phase difference between the VV and HH polarization returns. It is shown that the full polarimetric results are optimal and provide better classification performance than single feature measurements.

  2. Classifying nursing errors in clinical management within an Australian hospital.

    PubMed

    Tran, D T; Johnson, M

    2010-12-01

    Although many classification systems relating to patient safety exist, no taxonomy was identified that classified nursing errors in clinical management. To develop a classification system for nursing errors relating to clinical management (NECM taxonomy) and to describe contributing factors and patient consequences. We analysed 241 (11%) self-reported incidents relating to clinical management in nursing in a metropolitan hospital. Descriptive analysis of numeric data and content analysis of text data were undertaken to derive the NECM taxonomy, contributing factors and consequences for patients. Clinical management incidents represented 1.63 incidents per 1000 occupied bed days. The four themes of the NECM taxonomy were nursing care process (67%), communication (22%), administrative process (5%), and knowledge and skill (6%). Half of the incidents did not cause any patient harm. Contributing factors (n=111) included the following: patient clinical, social conditions and behaviours (27%); resources (22%); environment and workload (18%); other health professionals (15%); communication (13%); and nurse's knowledge and experience (5%). The NECM taxonomy provides direction to clinicians and managers on areas in clinical management that are most vulnerable to error, and therefore, priorities for system change management. Any nurses who wish to classify nursing errors relating to clinical management could use these types of errors. This study informs further research into risk management behaviour, and self-assessment tools for clinicians. Globally, nurses need to continue to monitor and act upon patient safety issues. © 2010 The Authors. International Nursing Review © 2010 International Council of Nurses.

  3. In-vivo determination of chewing patterns using FBG and artificial neural networks

    NASA Astrophysics Data System (ADS)

    Pegorini, Vinicius; Zen Karam, Leandro; Rocha Pitta, Christiano S.; Ribeiro, Richardson; Simioni Assmann, Tangriani; Cardozo da Silva, Jean Carlos; Bertotti, Fábio L.; Kalinowski, Hypolito J.; Cardoso, Rafael

    2015-09-01

    This paper reports the process of pattern classification of the chewing process of ruminants. We propose a simplified signal processing scheme for optical fiber Bragg grating (FBG) sensors based on machine learning techniques. The FBG sensors measure the biomechanical forces during jaw movements and an artificial neural network is responsible for the classification of the associated chewing pattern. In this study, three patterns associated to dietary supplement, hay and ryegrass were considered. Additionally, two other important events for ingestive behavior studies were monitored, rumination and idle period. Experimental results show that the proposed approach for pattern classification has been capable of differentiating the materials involved in the chewing process with a small classification error.

  4. Accuracy assessment, using stratified plurality sampling, of portions of a LANDSAT classification of the Arctic National Wildlife Refuge Coastal Plain

    NASA Technical Reports Server (NTRS)

    Card, Don H.; Strong, Laurence L.

    1989-01-01

    An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.

  5. Hyperspectral image classification based on local binary patterns and PCANet

    NASA Astrophysics Data System (ADS)

    Yang, Huizhen; Gao, Feng; Dong, Junyu; Yang, Yang

    2018-04-01

    Hyperspectral image classification has been well acknowledged as one of the challenging tasks of hyperspectral data processing. In this paper, we propose a novel hyperspectral image classification framework based on local binary pattern (LBP) features and PCANet. In the proposed method, linear prediction error (LPE) is first employed to select a subset of informative bands, and LBP is utilized to extract texture features. Then, spectral and texture features are stacked into a high dimensional vectors. Next, the extracted features of a specified position are transformed to a 2-D image. The obtained images of all pixels are fed into PCANet for classification. Experimental results on real hyperspectral dataset demonstrate the effectiveness of the proposed method.

  6. Reliability study of biometrics "do not contact" in myopia.

    PubMed

    Migliorini, R; Fratipietro, M; Comberiati, A M; Pattavina, L; Arrico, L

    The aim of the study is to compare the refractive condition of the eye actually achieved after surgery with the expected refractive condition calculated via a biometer. The study was conducted in a random group of 38 eyes of patients undergoing surgery by phacoemulsification. The mean absolute error between the values predicted from the optical biometer measurements and those obtained post-operatively was around 0.47%. Our study shows results not far from those reported in the literature; with respect to the mean absolute error, our value of 0.47 ± 0.11 (SEM) is among the lowest.

  7. Sub-pixel image classification for forest types in East Texas

    NASA Astrophysics Data System (ADS)

    Westbrook, Joey

    Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster dataset was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The sub-pixel classification using the aerial photo for both training and reference data had the highest overall accuracy (65%) of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent intervals to five classes with 20 percent intervals. When compared to the supervised classification, which had a satisfactory overall accuracy of 90%, none of the sub-pixel classifications achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.

  8. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier.

    PubMed

    Kambhampati, Satya Samyukta; Singh, Vishal; Manikandan, M Sabarimalai; Ramkumar, Barathram

    2015-08-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%.

  9. A real-time heat strain risk classifier using heart rate and skin temperature.

    PubMed

    Buller, Mark J; Latzka, William A; Yokota, Miyo; Tharion, William J; Moran, Daniel S

    2008-12-01

    Heat injury is a real concern to workers engaged in physically demanding tasks in high heat strain environments. Several real-time physiological monitoring systems exist that can provide indices of heat strain, e.g. physiological strain index (PSI), and provide alerts to medical personnel. However, these systems depend on core temperature measurement using expensive, ingestible thermometer pills. Seeking a better solution, we suggest the use of a model which can identify the probability that individuals are 'at risk' from heat injury using non-invasive measures. The intent is for the system to identify individuals who need monitoring more closely or who should apply heat strain mitigation strategies. We generated a model that can identify 'at risk' (PSI 7.5) workers from measures of heart rate and chest skin temperature. The model was built using data from six previously published exercise studies in which some subjects wore chemical protective equipment. The model has an overall classification error rate of 10% with one false negative error (2.7%), and outperforms an earlier model and a least squares regression model with classification errors of 21% and 14%, respectively. Additionally, the model allows the classification criteria to be adjusted based on the task and acceptable level of risk. We conclude that the model could be a valuable part of a multi-faceted heat strain management system.
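
    A sketch of the general approach rather than the published model or its coefficients: a probabilistic classifier maps heart rate and chest skin temperature to the probability of being 'at risk', and the decision threshold can be shifted to match the acceptable level of risk for a given task (all data below are simulated).

```python
# Sketch of the general approach (simulated data, logistic regression; not the
# published model or its coefficients): map heart rate and chest skin temperature
# to a risk probability and apply an adjustable decision threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 600
heart_rate = rng.normal(130.0, 25.0, n)              # beats per minute
skin_temp = rng.normal(36.5, 1.0, n)                 # chest skin temperature, deg C
logit = 0.08 * (heart_rate - 140.0) + 1.5 * (skin_temp - 37.0)
at_risk = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)  # simulated high-PSI label

model = LogisticRegression().fit(np.column_stack([heart_rate, skin_temp]), at_risk)

worker = np.array([[150.0, 37.8]])                   # hypothetical new observation
p = model.predict_proba(worker)[0, 1]
threshold = 0.3                                      # conservative cut-off: fewer missed cases
print(f"P(at risk) = {p:.2f} -> {'monitor closely' if p > threshold else 'ok'}")
```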

  10. Automated Segmentation Errors When Using Optical Coherence Tomography to Measure Retinal Nerve Fiber Layer Thickness in Glaucoma.

    PubMed

    Mansberger, Steven L; Menda, Shivali A; Fortune, Brad A; Gardiner, Stuart K; Demirel, Shaban

    2017-02-01

    To characterize the error of optical coherence tomography (OCT) measurements of retinal nerve fiber layer (RNFL) thickness when using automated retinal layer segmentation algorithms without manual refinement. Cross-sectional study. This study was set in a glaucoma clinical practice, and the dataset included 3490 scans from 412 eyes of 213 individuals with a diagnosis of glaucoma or glaucoma suspect. We used spectral domain OCT (Spectralis) to measure RNFL thickness in a 6-degree peripapillary circle, and exported the native "automated segmentation only" results. In addition, we exported the results after "manual refinement" to correct errors in the automated segmentation of the anterior (internal limiting membrane) and the posterior boundary of the RNFL. Our outcome measures included differences in RNFL thickness and glaucoma classification (i.e., normal, borderline, or outside normal limits) between scans with automated segmentation only and scans using manual refinement. Automated segmentation only resulted in a thinner global RNFL thickness (1.6 μm thinner, P < .001) when compared to manual refinement. When adjusted by operator, a multivariate model showed increased differences with decreasing RNFL thickness (P < .001), decreasing scan quality (P < .001), and increasing age (P < .03). Manual refinement changed 298 of 3486 (8.5%) of scans to a different global glaucoma classification, wherein 146 of 617 (23.7%) of borderline classifications became normal. Superior and inferior temporal clock hours had the largest differences. Automated segmentation without manual refinement resulted in reduced global RNFL thickness and overestimated the classification of glaucoma. Differences increased in eyes with a thinner RNFL thickness, older age, and decreased scan quality. Operators should inspect and manually refine OCT retinal layer segmentation when assessing RNFL thickness in the management of patients with glaucoma. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. Acquiring Research-grade ALSM Data in the Commercial Marketplace

    NASA Astrophysics Data System (ADS)

    Haugerud, R. A.; Harding, D. J.; Latypov, D.; Martinez, D.; Routh, S.; Ziegler, J.

    2003-12-01

    The Puget Sound Lidar Consortium, working with TerraPoint, LLC, has procured a large volume of ALSM (topographic lidar) data for scientific research. Research-grade ALSM data can be characterized by their completeness, density, and accuracy. Complete data include-at a minimum-X, Y, Z, time, and classification (ground, vegetation, structure, blunder) for each laser reflection. Off-nadir angle and return number for multiple returns are also useful. We began with a pulse density of 1/sq m, and after limited experiments still find this density satisfactory in the dense second-growth forests of western Washington. Lower pulse densities would have produced unacceptably limited sampling in forested areas and aliased some topographic features. Higher pulse densities do not produce markedly better topographic models, in part because of limitations of reproducibility between the overlapping survey swaths used to achieve higher density. Our experience in a variety of forest types demonstrates that the fraction of pulses that produce ground returns varies with vegetation cover, laser beam divergence, laser power, and detector sensitivity, but have not quantified this relationship. The most significant operational limits on vertical accuracy of ALSM appear to be instrument calibration and the accuracy with which returns are classified as ground or vegetation. TerraPoint has recently implemented in-situ calibration using overlapping swaths (Latypov and Zosse, 2002, see http://www.terrapoint.com/News_damirACSM_ASPRS2002.html). On the consumer side, we routinely perform a similar overlap analysis to produce maps of relative Z error between swaths; we find that in bare, low-slope regions the in-situ calibration has reduced this internal Z error to 6-10 cm RMSE. Comparison with independent ground control points commonly illuminates inconsistencies in how GPS heights have been reduced to orthometric heights. Once these inconsistencies are resolved, it appears that the internal errors are the bulk of the error of the survey. The error maps suggest that with in-situ calibration, minor time-varying errors with a period of circa 1 sec are the largest remaining source of survey error. For forested terrain, limited ground penetration and errors in return classification can severely limit the accuracy of resulting topographic models. Initial work by Haugerud and Harding demonstrated the feasibility of fully-automatic return classification; however, TerraPoint has found that better results can be obtained more effectively with 3rd-party classification software that allows a mix of automated routines and human intervention. Our relationship has been evolving since early 2000. Important aspects of this relationship include close communication between data producer and consumer, a willingness to learn from each other, significant technical expertise and resources on the consumer side, and continued refinement of achievable, quantitative performance and accuracy specifications. Most recently we have instituted a slope-dependent Z accuracy specification that TerraPoint first developed as a heuristic for surveying mountainous terrain in Switzerland. We are now working on quantifying the internal consistency of topographic models in forested areas, using a variant of overlap analysis, and standards for the spatial distribution of internal errors.

  12. Brain fingerprinting classification concealed information test detects US Navy military medical information with P300

    PubMed Central

    Farwell, Lawrence A.; Richardson, Drew C.; Richardson, Graham M.; Furedy, John J.

    2014-01-01

    A classification concealed information test (CIT) used the “brain fingerprinting” method of applying P300 event-related potential (ERP) in detecting information that is (1) acquired in real life and (2) unique to US Navy experts in military medicine. Military medicine experts and non-experts were asked to push buttons in response to three types of text stimuli. Targets contain known information relevant to military medicine, are identified to subjects as relevant, and require pushing one button. Subjects are told to push another button to all other stimuli. Probes contain concealed information relevant to military medicine, and are not identified to subjects. Irrelevants contain equally plausible, but incorrect/irrelevant information. Error rate was 0%. Median and mean statistical confidences for individual determinations were 99.9% with no indeterminates (results lacking sufficiently high statistical confidence to be classified). We compared error rate and statistical confidence for determinations of both information present and information absent produced by classification CIT (Is a probe ERP more similar to a target or to an irrelevant ERP?) vs. comparison CIT (Does a probe produce a larger ERP than an irrelevant?) using P300 plus the late negative component (LNP; together, P300-MERMER). Comparison CIT produced a significantly higher error rate (20%) and lower statistical confidences: mean 67%; information-absent mean was 28.9%, less than chance (50%). We compared analysis using P300 alone with the P300 + LNP. P300 alone produced the same 0% error rate but significantly lower statistical confidences. These findings add to the evidence that the brain fingerprinting methods as described here provide sufficient conditions to produce less than 1% error rate and greater than 95% median statistical confidence in a CIT on information obtained in the course of real life that is characteristic of individuals with specific training, expertise, or organizational affiliation. PMID:25565941

  13. Morbidity Assessment in Surgery: Refinement Proposal Based on a Concept of Perioperative Adverse Events

    PubMed Central

    Kazaryan, Airazat M.; Røsok, Bård I.; Edwin, Bjørn

    2013-01-01

    Background. Morbidity is a cornerstone of assessing surgical treatment; nevertheless, surgeons have not reached broad consensus on this problem. Methods and Findings. Clavien, Dindo, and Strasberg with coauthors (1992, 2004, 2009, and 2010) made significant efforts toward the standardization of surgical morbidity (the Clavien-Dindo-Strasberg classification; last revision, the Accordion classification). However, this classification includes only postoperative complications and has two principal shortcomings: disregard of intraoperative events and confusing terminology. Postoperative events have a major impact on patient well-being. However, intraoperative events should also be recorded and reported even if they do not evidently affect the patient's postoperative well-being. The term surgical complication as applied in the Clavien-Dindo-Strasberg classification may be read as an incident resulting in a complication caused by technical failure of surgery, in contrast to the so-called medical complications. Therefore, the term surgical complication contributes to misinterpretation of perioperative morbidity. The term perioperative adverse events, comprising both intraoperative unfavourable incidents and postoperative complications, could be regarded as a better alternative. In 2005, Satava suggested a simple grading to evaluate intraoperative surgical errors. Based on that approach, we have elaborated a 3-grade classification of intraoperative incidents so that it can be used to grade intraoperative events of any type of surgery. Refinements have been made to the Accordion classification of postoperative complications. Interpretation. The proposed systematization of perioperative adverse events, utilizing the combined application of two appraisal tools, that is, the elaborated classification of intraoperative incidents based on the Satava approach to surgical error evaluation together with the modified Accordion classification of postoperative complications, appears to be an effective tool for comprehensive assessment of surgical outcomes. This concept was validated with regard to various surgical procedures. Broad implementation of this approach will promote the development of surgical science and practice. PMID:23762627

  14. Entropy-based gene ranking without selection bias for the predictive classification of microarray data.

    PubMed

    Furlanello, Cesare; Serafini, Maria; Merler, Stefano; Jurman, Giuseppe

    2003-11-06

    We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
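
    As a rough illustration of the chunk-elimination idea, the sketch below runs RFE with a linear SVM and uses the normalized entropy of the absolute weight distribution to decide how large a chunk of low-weight genes to discard at each step. The entropy threshold and chunk fractions are invented for illustration; the published E-RFE rule, the two-strata resampling scheme, and the Zipf's-law stopping criterion are not reproduced here, and a binary classification problem is assumed.

```python
import numpy as np
from sklearn.svm import SVC

def entropy_rfe(X, y, min_genes=32):
    """Recursive feature elimination that removes chunks of low-weight genes;
    the chunk is large when the SVM weight distribution is close to uniform
    (high normalized entropy) and small when a few genes dominate."""
    genes = np.arange(X.shape[1])
    while len(genes) > min_genes:
        w = np.abs(SVC(kernel="linear", C=1.0).fit(X[:, genes], y).coef_).ravel()
        p = w / w.sum()
        H = -np.sum(p * np.log(p + 1e-12)) / np.log(len(p))   # normalized entropy in [0, 1]
        frac = 0.5 if H > 0.9 else 0.1                        # illustrative chunking rule
        chunk = max(1, int(len(genes) * frac))
        keep = np.argsort(w)[chunk:]                          # drop the lowest-weight chunk
        genes = genes[keep]
    return genes
```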

  15. DMSP SSJ4 Data Restoration, Classification, and On-Line Data Access

    NASA Technical Reports Server (NTRS)

    Wing, Simon; Bredekamp, Joseph H. (Technical Monitor)

    2000-01-01

    Compress and clean raw data files for permanent storage: we have identified various error conditions/types and developed algorithms to remove these errors/noises, including the more complicated noise in the newer data sets (status: 100% complete). Internet access to the compacted raw data: it is now possible to access the raw data via our web site, http://www.jhuapl.edu/Aurora/index.html. The software to read and plot the compacted raw data is also available from the same web site. Users can now download the raw data and read, plot, or manipulate the data as they wish on their own computers, and they are able to access the cleaned data sets. Internet access to the color spectrograms: this task has also been completed, and it is now possible to access the spectrograms from the web site mentioned above. Improve the particle precipitation region classification: the algorithm for this task has been developed and implemented, and as a result the accuracies improved. The web site now routinely distributes the results of applying the new algorithm to the cleaned data set. Mark the classification regions on the spectrograms: the software to mark the classification regions in the spectrograms has been completed and is also available from our web site.

  16. Malingering in Toxic Exposure. Classification Accuracy of Reliable Digit Span and WAIS-III Digit Span Scaled Scores

    ERIC Educational Resources Information Center

    Greve, Kevin W.; Springer, Steven; Bianchini, Kevin J.; Black, F. William; Heinly, Matthew T.; Love, Jeffrey M.; Swift, Douglas A.; Ciota, Megan A.

    2007-01-01

    This study examined the sensitivity and false-positive error rate of reliable digit span (RDS) and the WAIS-III Digit Span (DS) scaled score in persons alleging toxic exposure and determined whether error rates differed from published rates in traumatic brain injury (TBI) and chronic pain (CP). Data were obtained from the files of 123 persons…

  17. Accurate prediction of energy expenditure using a shoe-based activity monitor.

    PubMed

    Sazonova, Nadezhda; Browning, Raymond C; Sazonov, Edward

    2011-07-01

    The aim of this study was to develop and validate a method for predicting energy expenditure (EE) using a footwear-based system with integrated accelerometer and pressure sensors. We developed a footwear-based device with an embedded accelerometer and insole pressure sensors for the prediction of EE. The data from the device can be used to perform accurate recognition of major postures and activities and to estimate EE using the acceleration, pressure, and posture/activity classification information in a branched algorithm without the need for individual calibration. We measured EE via indirect calorimetry as 16 adults (body mass index = 19-39 kg·m(-2)) performed various low- to moderate-intensity activities and compared measured versus predicted EE using several models based on the acceleration and pressure signals. Inclusion of pressure data resulted in better accuracy of EE prediction during static postures such as sitting and standing. The activity-based branched model that included predictors from accelerometer and pressure sensors (BACC-PS) achieved the lowest error (e.g., root mean squared error (RMSE)=0.69 METs) compared with the accelerometer-only-based branched model BACC (RMSE=0.77 METs) and nonbranched model (RMSE=0.94-0.99 METs). Comparison of EE prediction models using data from both legs versus models using data from a single leg indicates that only one shoe needs to be equipped with sensors. These results suggest that foot acceleration combined with insole pressure measurement, when used in an activity-specific branched model, can accurately estimate the EE associated with common daily postures and activities. The accuracy and unobtrusiveness of a footwear-based device may make it an effective physical activity monitoring tool.
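
    The branched structure described above, classifying the posture/activity first and then applying an activity-specific regression to the acceleration and pressure features, can be sketched as below. The classifier, regression form, and feature layout are placeholders rather than the published BACC-PS algorithm.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression

class BranchedEE:
    """Branched energy-expenditure predictor: an activity classifier routes
    each feature vector to an activity-specific linear model predicting METs.
    X, activity, and mets are assumed to be NumPy arrays."""

    def fit(self, X, activity, mets):
        self.clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, activity)
        self.models = {a: LinearRegression().fit(X[activity == a], mets[activity == a])
                       for a in np.unique(activity)}
        return self

    def predict(self, X):
        branches = self.clf.predict(X)
        return np.array([self.models[a].predict(x[None, :])[0]
                         for a, x in zip(branches, X)])
```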

  18. Qualitative properties of roasting defect beans and development of its classification methods by hyperspectral imaging technology.

    PubMed

    Cho, Jeong-Seok; Bae, Hyung-Jin; Cho, Byoung-Kwan; Moon, Kwang-Deog

    2017-04-01

    Qualitative properties of roasting defect coffee beans and their classification methods were studied using hyperspectral imaging (HSI). The roasting defect beans were divided into 5 groups: medium roasting (Cont), under developed (RD-1), over roasting (RD-2), interior under developed (RD-3), and interior scorching (RD-4). The following qualitative properties were assayed: browning index (BI), moisture content (MC), chlorogenic acid (CA), trigonelline (TG), and caffeine (CF) content. Their HSI spectra (1000-1700 nm) were also analysed to develop classification methods for roasting defect beans. RD-2 showed the highest BI and the lowest MC, CA, and TG content. The accuracy of the partial least-squares discriminant classification model was 86.2%. The most powerful wavelength for classifying the defective beans was approximately 1420 nm (related to the OH bond). The HSI reflectance values at 1420 nm showed a tendency similar to MC, enabling the use of this technology to classify roasting defect beans.

  19. Software errors and complexity: An empirical investigation

    NASA Technical Reports Server (NTRS)

    Basili, Victor R.; Perricone, Berry T.

    1983-01-01

    The distributions and relationships derived from the change data collected during the development of a medium scale satellite software project show that meaningful results can be obtained which allow an insight into software traits and the environment in which it is developed. Modified and new modules were shown to behave similarly. An abstract classification scheme for errors which allows a better understanding of the overall traits of a software project is also shown. Finally, various size and complexity metrics are examined with respect to errors detected within the software yielding some interesting results.

  20. Software errors and complexity: An empirical investigation

    NASA Technical Reports Server (NTRS)

    Basili, V. R.; Perricone, B. T.

    1982-01-01

    The distributions and relationships derived from the change data collected during the development of a medium scale satellite software project show that meaningful results can be obtained which allow an insight into software traits and the environment in which it is developed. Modified and new modules were shown to behave similarly. An abstract classification scheme for errors which allows a better understanding of the overall traits of a software project is also shown. Finally, various size and complexity metrics are examined with respect to errors detected within the software yielding some interesting results.

  1. Video compression of coronary angiograms based on discrete wavelet transform with block classification.

    PubMed

    Ho, B T; Tsai, M J; Wei, J; Ma, M; Saipetch, P

    1996-01-01

    A new method of video compression for angiographic images has been developed to achieve a high compression ratio (~20:1) while eliminating the block artifacts that lead to loss of diagnostic accuracy. This method adopts the Moving Picture Experts Group (MPEG) motion-compensated prediction to take advantage of frame-to-frame correlation. However, in contrast to MPEG, the error images arising from mismatches in the motion estimation are encoded by the discrete wavelet transform (DWT) rather than the block discrete cosine transform (DCT). Furthermore, the authors developed a classification scheme which labels each block in an image as intra, error, or background type and encodes it accordingly. This hybrid coding can significantly improve the compression efficiency in certain cases. The method can be generalized to any dynamic image sequence application that is sensitive to block artifacts.

  2. Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure

    PubMed Central

    Berisha, Visar; Wisler, Alan; Hero, Alfred O.; Spanias, Andreas

    2015-01-01

    Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between training and test distributions. We confirm the theoretical results by designing feature selection algorithms using the criteria from these bounds and by evaluating the algorithms on a series of pathological speech classification tasks. PMID:26807014
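
    For intuition, a divergence of this kind can be estimated nonparametrically from data with a minimal-spanning-tree (Friedman-Rafsky) construction: pool the two samples, build the Euclidean MST, and count edges that join points from different classes. The sketch below follows that recipe and, for equal class priors, applies one commonly cited form of the resulting bound on the Bayes error; the exact estimator and bound used in the paper may differ, so treat both as assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import minimum_spanning_tree

def dp_divergence(X0, X1):
    """MST-based estimate of a nonparametric divergence between two samples
    (assumes distinct points so that all pairwise distances are positive)."""
    X = np.vstack([X0, X1])
    y = np.r_[np.zeros(len(X0)), np.ones(len(X1))]
    mst = minimum_spanning_tree(cdist(X, X)).tocoo()
    R = np.sum(y[mst.row] != y[mst.col])        # cross-class MST edges
    m, n = len(X0), len(X1)
    return max(0.0, 1.0 - R * (m + n) / (2.0 * m * n))

def bayes_error_bounds(dp):
    """Lower and upper bounds on the minimum classification error for equal
    priors, in one commonly cited form (an assumption, not the paper's exact
    statement)."""
    return 0.5 * (1.0 - np.sqrt(dp)), 0.5 * (1.0 - dp)
```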

  3. Comparisons of neural networks to standard techniques for image classification and correlation

    NASA Technical Reports Server (NTRS)

    Paola, Justin D.; Schowengerdt, Robert A.

    1994-01-01

    Neural network techniques for multispectral image classification and spatial pattern detection are compared to the standard techniques of maximum-likelihood classification and spatial correlation. The neural network produced a more accurate classification than maximum-likelihood of a Landsat scene of Tucson, Arizona. Some of the errors in the maximum-likelihood classification are illustrated using decision region and class probability density plots. As expected, the main drawback to the neural network method is the long time required for the training stage. The network was trained using several different hidden layer sizes to optimize both the classification accuracy and training speed, and it was found that one node per class was optimal. The performance improved when 3x3 local windows of image data were entered into the net. This modification introduces texture into the classification without explicit calculation of a texture measure. Larger windows were successfully used for the detection of spatial features in Landsat and Magellan synthetic aperture radar imagery.
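
    The 3x3-window trick mentioned above, feeding each pixel's neighbourhood into the network so texture enters implicitly, amounts to a simple feature-construction step. The sketch below builds those window features for a multispectral image and sets up a one-hidden-node-per-class MLP; the image shape, class count, and scikit-learn network are stand-ins for the original backpropagation network.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def window_features(img, r=1):
    """Stack each pixel's (2r+1)x(2r+1) neighbourhood across all bands into a
    single feature vector; img has shape (rows, cols, bands)."""
    H, W, B = img.shape
    pad = np.pad(img, ((r, r), (r, r), (0, 0)), mode="reflect")
    shifts = [pad[i:i + H, j:j + W] for i in range(2 * r + 1) for j in range(2 * r + 1)]
    return np.stack(shifts, axis=-1).reshape(H * W, B * (2 * r + 1) ** 2)

n_classes = 5  # hypothetical number of land-cover classes
clf = MLPClassifier(hidden_layer_sizes=(n_classes,), max_iter=500)  # one hidden node per class
# clf.fit(window_features(training_image), training_labels)  # placeholders, not real data
```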

  4. Analysis and application of classification methods of complex carbonate reservoirs

    NASA Astrophysics Data System (ADS)

    Li, Xiongyan; Qin, Ruibao; Ping, Haitao; Wei, Dan; Liu, Xiaomei

    2018-06-01

    There are abundant carbonate reservoirs from the Cenozoic to the Mesozoic era in the Middle East. Because of variation in the sedimentary environment and diagenetic processes of carbonate reservoirs, several porosity types coexist in these reservoirs. As a result of the complex lithologies and pore types, as well as the impact of microfractures, the pore structure is very complicated, and it is therefore difficult to calculate reservoir parameters accurately. In order to evaluate carbonate reservoirs accurately, and building on pore structure evaluation, classification methods for carbonate reservoirs based on capillary pressure curves and on flow units are analyzed. Although carbonate reservoirs can be classified on the basis of capillary pressure curves, the relationship between porosity and permeability after such a classification is not ideal. On the basis of flow units, a high-precision functional relationship between porosity and permeability can be established after classification. Therefore, carbonate reservoirs can be quantitatively evaluated based on the classification of flow units. In the dolomite reservoirs, the average absolute error of calculated permeability decreases from 15.13 to 7.44 mD. Similarly, the average absolute error of calculated permeability of the limestone reservoirs is reduced from 20.33 to 7.37 mD. Only by accurately characterizing pore structures and classifying reservoir types can reservoir parameters be calculated accurately. Therefore, characterizing pore structures and classifying reservoir types are very important for the accurate evaluation of complex carbonate reservoirs in the Middle East.
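
    The abstract does not state which flow-unit definition was used; one common choice is the flow zone indicator (FZI) of Amaefule and coworkers, and the sketch below assumes that approach: compute FZI from core porosity and permeability, bin samples into flow units, and fit a separate log-log porosity-permeability relationship per unit. The bin edges, and everything except the standard 0.0314 conversion factor, are illustrative assumptions.

```python
import numpy as np

def flow_zone_indicator(phi, k_md):
    """FZI = RQI / phi_z with RQI = 0.0314*sqrt(k/phi) (k in mD, phi as a
    fraction) and phi_z = phi / (1 - phi)."""
    rqi = 0.0314 * np.sqrt(k_md / phi)
    return rqi * (1.0 - phi) / phi

def poroperm_by_flow_unit(phi, k_md, edges=(-0.5, 0.0, 0.5, 1.0)):
    """Assign each sample to a flow unit by log10(FZI) and fit log10(k) as a
    linear function of log10(phi) within each unit."""
    unit = np.digitize(np.log10(flow_zone_indicator(phi, k_md)), edges)
    fits = {u: np.polyfit(np.log10(phi[unit == u]), np.log10(k_md[unit == u]), 1)
            for u in np.unique(unit) if np.sum(unit == u) > 2}
    return unit, fits
```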

  5. Study on Classification Accuracy Inspection of Land Cover Data Aided by Automatic Image Change Detection Technology

    NASA Astrophysics Data System (ADS)

    Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.

    2018-04-01

    The purpose of carrying out national geographic conditions monitoring is to obtain information on surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for government, enterprises, and the public. Land cover data contain detailed geographic conditions information and have therefore been listed as one of the important products of the national geographic conditions monitoring project. At present, the main issue in the production of land cover data is how to improve classification accuracy. For land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, classification accuracy inspection in the project has mainly been based on human-computer interaction or manual inspection, which is time consuming and laborious. By harnessing automatic high-resolution remote sensing image change detection based on the ERDAS IMAGINE platform, this paper carried out a classification accuracy inspection test of land cover data in the project and presents a corresponding technical route, which includes data pre-processing, change detection, result output, and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missed updates and incorrect updates. It effectively reduces the intensity of human-computer interaction inspection for quality inspectors and also provides a technical reference for the data production and quality control of land cover data.

  6. Analyzing communication errors in an air medical transport service.

    PubMed

    Dalto, Joseph D; Weir, Charlene; Thomas, Frank

    2013-01-01

    Poor communication can result in adverse events. Presently, no standards exist for classifying and analyzing air medical communication errors. This study sought to determine the frequency and types of communication errors reported within an air medical quality and safety assurance reporting system. Of 825 quality assurance reports submitted in 2009, 278 were randomly selected and analyzed for communication errors. Each communication error was classified and mapped to Clark's communication level hierarchy (ie, levels 1-4). Descriptive statistics were performed, and comparisons were evaluated using chi-square analysis. Sixty-four communication errors were identified in 58 reports (21% of 278). Of the 64 identified communication errors, only 18 (28%) were classified by the staff as communication errors. Communication errors occurred most often at level 1 (n = 42/64, 66%), followed by level 4 (21/64, 33%). Level 2 and 3 communication failures were rare (<1%). Communication errors were found in a fifth of quality and safety assurance reports. The reporting staff identified less than a third of these errors. Nearly all communication errors (99%) occurred at either the lowest level of communication (level 1, 66%) or the highest level (level 4, 33%). An air medical communication ontology is necessary to improve the recognition and analysis of communication errors.

  7. Does a better model yield a better argument? An info-gap analysis

    NASA Astrophysics Data System (ADS)

    Ben-Haim, Yakov

    2017-04-01

    Theories, models and computations underlie reasoned argumentation in many areas. The possibility of error in these arguments, though of low probability, may be highly significant when the argument is used in predicting the probability of rare high-consequence events. This implies that the choice of a theory, model or computational method for predicting rare high-consequence events must account for the probability of error in these components. However, error may result from lack of knowledge or surprises of various sorts, and predicting the probability of error is highly uncertain. We show that the putatively best, most innovative and sophisticated argument may not actually have the lowest probability of error. Innovative arguments may entail greater uncertainty than more standard but less sophisticated methods, creating an innovation dilemma in formulating the argument. We employ info-gap decision theory to characterize and support the resolution of this problem and present several examples.

  8. Resolving Mixed Algal Species in Hyperspectral Images

    PubMed Central

    Mehrubeoglu, Mehrube; Teng, Ming Y.; Zimba, Paul V.

    2014-01-01

    We investigated a lab-based hyperspectral imaging system's response to pure (single) and mixed (two-species) algal cultures containing known algae types and volumetric combinations, in order to characterize the system's performance. The spectral response to volumetric changes in single algal cultures and in combinations with known mixing ratios was tested. Constrained linear spectral unmixing was applied to extract the algal content of the mixtures based on the abundances that produced the lowest root mean square (RMS) error. Percent prediction error was computed as the difference between the actual percent volumetric content and the abundances at minimum RMS error. The best prediction errors were 0.4%, 0.4%, and 6.3% for the mixed spectra from three independent experiments. The worst prediction errors were 5.6%, 5.4%, and 13.4% for the same order of experiments. Additionally, Beer-Lambert's law was utilized to relate transmittance to different volumes of pure algal suspensions, demonstrating linear logarithmic trends for the optical property measurements. PMID:24451451
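
    Constrained linear unmixing of the kind described above can be written as a non-negative least-squares problem with an (approximate) sum-to-one constraint. The sketch below uses the standard trick of appending a heavily weighted row of ones to the endmember matrix; the endmember spectra and the weighting constant are assumptions, not values from the paper.

```python
import numpy as np
from scipy.optimize import nnls

def constrained_unmix(endmembers, spectrum, delta=1e3):
    """Estimate non-negative, approximately sum-to-one abundances for one
    measured spectrum. endmembers: (bands, n_endmembers); spectrum: (bands,)."""
    A = np.vstack([endmembers, delta * np.ones((1, endmembers.shape[1]))])
    b = np.append(spectrum, delta)
    abundances, _ = nnls(A, b)
    rms = np.sqrt(np.mean((spectrum - endmembers @ abundances) ** 2))
    return abundances, rms
```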

  9. Improved EEG Event Classification Using Differential Energy.

    PubMed

    Harati, A; Golmohammadi, M; Lopez, S; Obeid, I; Picone, J

    2015-12-01

    Feature extraction for automatic classification of EEG signals typically relies on time-frequency representations of the signal. Techniques such as cepstral-based filter banks or wavelets are popular in many signal processing applications, including EEG classification. In this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. To further aid in discriminating periodic signals from aperiodic signals, we add a differential energy term. We evaluate our approaches on the TUH EEG Corpus, which is the largest publicly available EEG corpus and an exceedingly challenging task due to the clinical nature of the data. We demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. The combination of differential energy and derivatives produces a 24% absolute reduction in the error rate and improves our ability to discriminate between signal events and background noise. This relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, but is much more computationally efficient.
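
    To make the feature recipe concrete, the sketch below computes per-frame log energy, its first and second derivatives, and a differential-energy term taken here as the spread (max minus min) of log energy over a local window of frames. That definition of differential energy, and the frame sizes, are assumptions made for illustration; the paper's exact formulation and its cepstral filter-bank features are not reproduced.

```python
import numpy as np

def frame_energy_features(x, frame_len=256, hop=128, ctx=9):
    """Log frame energy, first/second derivatives, and a windowed
    max-minus-min 'differential energy' term for a 1-D signal x."""
    n = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n)])
    log_e = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    d1 = np.gradient(log_e)                     # first derivative (delta)
    d2 = np.gradient(d1)                        # second derivative (delta-delta)
    padded = np.pad(log_e, ctx, mode="edge")
    diff_e = np.array([np.ptp(padded[i:i + 2 * ctx + 1]) for i in range(n)])
    return np.column_stack([log_e, d1, d2, diff_e])
```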

  10. Comparison of remote sensing image processing techniques to identify tornado damage areas from Landsat TM data

    USGS Publications Warehouse

    Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.

    2008-01-01

    Remote sensing techniques have been shown to be effective for large-scale damage surveys after a hazardous event in both near real-time and post-event analyses. This paper aims to compare the accuracy of common image processing techniques for detecting tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite image and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and an object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on the Kappa coefficient calculated from error matrices which cross-tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy by 5 to 10%, the Object-oriented Approach performs significantly better, with 15-20% higher accuracy than the other two techniques.
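
    The accuracy assessment above rests on the Kappa coefficient computed from an error (confusion) matrix. A minimal implementation is shown below; the matrix layout (reference classes in rows, mapped classes in columns) is the usual convention and is assumed here, and the example counts are hypothetical.

```python
import numpy as np

def kappa_coefficient(error_matrix):
    """Cohen's Kappa from a confusion matrix: (observed - chance agreement)
    divided by (1 - chance agreement)."""
    cm = np.asarray(error_matrix, dtype=float)
    n = cm.sum()
    p_observed = np.trace(cm) / n
    p_chance = (cm.sum(axis=0) @ cm.sum(axis=1)) / n ** 2
    return (p_observed - p_chance) / (1.0 - p_chance)

# example: 2-class damage/no-damage error matrix (hypothetical counts)
print(kappa_coefficient([[86, 14], [9, 91]]))
```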

  11. Benchmark data on the separability among crops in the southern San Joaquin Valley of California

    NASA Technical Reports Server (NTRS)

    Morse, A.; Card, D. H.

    1984-01-01

    Landsat MSS data were input to a discriminant analysis of 21 crops on each of eight dates in 1979 using a total of 4,142 fields in southern Fresno County, California. The 21 crops, which together account for over 70 percent of the agricultural acreage in the southern San Joaquin Valley, were analyzed to quantify the spectral separability, defined as omission error, between all pairs of crops. On each date the fields were segregated into six groups based on the mean value of the MSS7/MSS5 ratio, which is correlated with green biomass. Discriminant analysis was run on each group on each date. The resulting contingency tables offer information that can be profitably used in conjunction with crop calendars to pick the best dates for a classification. The tables show expected percent correct classification and error rates for all the crops. The patterns in the contingency tables show that the percent correct classification for crops generally increases with the amount of greenness in the fields being classified. However, there are exceptions to this general rule, notably grain.

  12. Comparison of Remote Sensing Image Processing Techniques to Identify Tornado Damage Areas from Landsat TM Data

    PubMed Central

    Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.

    2008-01-01

    Remote sensing techniques have been shown to be effective for large-scale damage surveys after a hazardous event in both near real-time and post-event analyses. This paper aims to compare the accuracy of common image processing techniques for detecting tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite image and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and an object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on the Kappa coefficient calculated from error matrices which cross-tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy by 5 to 10%, the Object-oriented Approach performs significantly better, with 15-20% higher accuracy than the other two techniques. PMID:27879757

  13. Multicategory nets of single-layer perceptrons: complexity and sample-size issues.

    PubMed

    Raudys, Sarunas; Kybartas, Rimantas; Zavadskas, Edmundas Kazimieras

    2010-05-01

    The standard cost function of multicategory single-layer perceptrons (SLPs) does not minimize the classification error rate. In order to reduce classification error, it is necessary to: 1) abandon the traditional cost function, 2) obtain near-optimal pairwise linear classifiers by specially organized SLP training and optimal stopping, and 3) fuse their decisions properly. To obtain better classification in unbalanced training set situations, we introduce an unbalance-correcting term. It was found that fusion based on the Kullback-Leibler (K-L) distance and the Wu-Lin-Weng (WLW) method result in approximately the same performance in situations where sample sizes are relatively small. The explanation for this observation lies in the theoretically known fact that excessive minimization of inexact criteria becomes harmful at times. Comprehensive comparative investigations of six real-world pattern recognition (PR) problems demonstrated that SLP-based pairwise classifiers are comparable to, and as often as not outperform, linear support vector (SV) classifiers in moderate-dimensional situations. The colored noise injection used to design pseudovalidation sets proves to be a powerful tool for handling finite sample problems in moderate-dimensional PR tasks.

  14. Remote sensing of submerged aquatic vegetation in lower Chesapeake Bay - A comparison of Landsat MSS to TM imagery

    NASA Technical Reports Server (NTRS)

    Ackleson, S. G.; Klemas, V.

    1987-01-01

    Landsat MSS and TM imagery, obtained simultaneously over Guinea Marsh, VA, was analyzed and compared for its ability to detect submerged aquatic vegetation (SAV). An unsupervised clustering algorithm was applied to each image, with the input classification parameters defined as functions of apparent sensor noise. Class confidence and accuracy were computed for all water areas by comparing the classified images, pixel by pixel, to rasterized SAV distributions derived from color aerial photography. To illustrate the effect of water depth on classification error, areas with depth greater than 1.9 m were masked, and class confidence and accuracy were recalculated. A single-scattering radiative-transfer model is used to illustrate how percent canopy cover and water depth affect the volume reflectance of a water column containing SAV. For a submerged canopy that is morphologically and optically similar to the Zostera marina inhabiting Lower Chesapeake Bay, dense canopies may be isolated by masking optically deep water. For less dense canopies, the effect of increasing water depth is to increase the apparent percent crown cover, which may result in classification error.

  15. Development and validation of Aviation Causal Contributors for Error Reporting Systems (ACCERS).

    PubMed

    Baker, David P; Krokos, Kelley J

    2007-04-01

    This investigation sought to develop a reliable and valid classification system for identifying and classifying the underlying causes of pilot errors reported under the Aviation Safety Action Program (ASAP). ASAP is a voluntary safety program that air carriers may establish to study pilot and crew performance on the line. In ASAP programs, similar to the Aviation Safety Reporting System, pilots self-report incidents by filing a short text description of the event. The identification of contributors to errors is critical if organizations are to improve human performance, yet it is difficult for analysts to extract this information from text narratives. A taxonomy was needed that could be used by pilots to classify the causes of errors. After completing a thorough literature review, pilot interviews and a card-sorting task were conducted in Studies 1 and 2 to develop the initial structure of the Aviation Causal Contributors for Event Reporting Systems (ACCERS) taxonomy. The reliability and utility of ACCERS were then tested in Studies 3a and 3b by having pilots independently classify the primary and secondary causes of ASAP reports. The results provided initial evidence for the internal and external validity of ACCERS. Pilots were found to demonstrate adequate levels of agreement with respect to their category classifications. ACCERS appears to be a useful system for studying human error captured under pilot ASAP reports. Future work should focus on how ACCERS is organized and whether it can be used or modified to classify human error in ASAP programs for other aviation-related job categories such as dispatchers. Potential applications of this research include systems in which individuals self-report errors and that attempt to extract and classify the causes of those events.

  16. Automated spectral classification and the GAIA project

    NASA Technical Reports Server (NTRS)

    Lasala, Jerry; Kurtz, Michael J.

    1995-01-01

    Two-dimensional spectral types for each of the stars observed in the global astrometric interferometer for astrophysics (GAIA) mission would provide additional information for galactic structure and stellar evolution studies, as well as helping in the identification of unusual objects and populations. The classification of the large quantity of spectra generated requires that automated techniques be implemented. Approaches for automatic classification are reviewed, and a metric-distance method is discussed. In tests, the metric-distance method produced spectral types with mean errors comparable to those of human classifiers working at similar resolution. Data and equipment requirements for an automated classification survey are discussed. A program of auxiliary observations is proposed to yield spectral types and radial velocities for the GAIA-observed stars.
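
    At its core, a metric-distance classifier of the kind discussed above assigns each observed spectrum the type of the nearest reference template under some distance measure. The toy sketch below uses a Euclidean distance on continuum-normalized spectra; the distance metric, normalization, and template library are assumptions rather than the paper's specific choices.

```python
import numpy as np

def classify_spectrum(spectrum, templates):
    """Return the spectral type whose template lies closest (Euclidean
    distance) to the observed spectrum. 'templates' maps type -> template
    array sampled on the same wavelength grid as 'spectrum'."""
    distances = {stype: np.linalg.norm(spectrum - ref) for stype, ref in templates.items()}
    return min(distances, key=distances.get)
```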

  17. Speaker normalization and adaptation using second-order connectionist networks.

    PubMed

    Watrous, R L

    1993-01-01

    A method for speaker normalization and adaptation using connectionist networks is developed. A speaker-specific linear transformation of observations of the speech signal is computed using second-order network units. Classification is accomplished by a multilayer feedforward network that operates on the normalized speech data. The network is adapted for a new talker by modifying the transformation parameters while leaving the classifier fixed. This is accomplished by backpropagating the classification error through the classifier to the second-order transformation units. This method was evaluated on the classification of ten vowels for 76 speakers using the first two formant values of the Peterson-Barney data. The results suggest that rapid speaker adaptation resulting in high classification accuracy can be accomplished by this method.

  18. Errors in imaging patients in the emergency setting

    PubMed Central

    Reginelli, Alfonso; Lo Re, Giuseppe; Midiri, Federico; Muzj, Carlo; Romano, Luigia; Brunese, Luca

    2016-01-01

    Emergency and trauma care produces a “perfect storm” for radiological errors: uncooperative patients, inadequate histories, time-critical decisions, concurrent tasks and often junior personnel working after hours in busy emergency departments. The main cause of diagnostic errors in the emergency department is the failure to correctly interpret radiographs, and the majority of diagnoses missed on radiographs are fractures. Missed diagnoses potentially have important consequences for patients, clinicians and radiologists. Radiologists play a pivotal role in the diagnostic assessment of polytrauma patients and of patients with non-traumatic craniothoracoabdominal emergencies, and key elements to reduce errors in the emergency setting are knowledge, experience and the correct application of imaging protocols. This article aims to highlight the definition and classification of errors in radiology, the causes of errors in emergency radiology and the spectrum of diagnostic errors in radiography, ultrasonography and CT in the emergency setting. PMID:26838955

  19. Errors in imaging patients in the emergency setting.

    PubMed

    Pinto, Antonio; Reginelli, Alfonso; Pinto, Fabio; Lo Re, Giuseppe; Midiri, Federico; Muzj, Carlo; Romano, Luigia; Brunese, Luca

    2016-01-01

    Emergency and trauma care produces a "perfect storm" for radiological errors: uncooperative patients, inadequate histories, time-critical decisions, concurrent tasks and often junior personnel working after hours in busy emergency departments. The main cause of diagnostic errors in the emergency department is the failure to correctly interpret radiographs, and the majority of diagnoses missed on radiographs are fractures. Missed diagnoses potentially have important consequences for patients, clinicians and radiologists. Radiologists play a pivotal role in the diagnostic assessment of polytrauma patients and of patients with non-traumatic craniothoracoabdominal emergencies, and key elements to reduce errors in the emergency setting are knowledge, experience and the correct application of imaging protocols. This article aims to highlight the definition and classification of errors in radiology, the causes of errors in emergency radiology and the spectrum of diagnostic errors in radiography, ultrasonography and CT in the emergency setting.

  20. Feasibility of using near infrared spectroscopy to detect and quantify an adulterant in high quality sandalwood oil.

    PubMed

    Kuriakose, Saji; Joe, I Hubert

    2013-11-01

    Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R2 = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model.
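
    The model selection described above, picking the number of PLS components giving the lowest RMSEC and then reporting RMSEP on the test set, can be sketched with scikit-learn as follows. The train/test split, component range, and data arrays are placeholders; note that selecting purely on RMSEC tends to favour larger models, which is why RMSEP and R2 are reported alongside it.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

def select_pls_model(X, y, max_components=15):
    """Fit PLSR models with 1..max_components latent variables and return the
    one with the lowest calibration RMSE, together with its prediction RMSE."""
    X_cal, X_test, y_cal, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    best = None
    for k in range(1, max_components + 1):
        pls = PLSRegression(n_components=k).fit(X_cal, y_cal)
        rmsec = np.sqrt(np.mean((y_cal - pls.predict(X_cal).ravel()) ** 2))
        rmsep = np.sqrt(np.mean((y_test - pls.predict(X_test).ravel()) ** 2))
        if best is None or rmsec < best["rmsec"]:
            best = {"n_components": k, "rmsec": rmsec, "rmsep": rmsep}
    return best
```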

  1. Feasibility of using near infrared spectroscopy to detect and quantify an adulterant in high quality sandalwood oil

    NASA Astrophysics Data System (ADS)

    Kuriakose, Saji; Joe, I. Hubert

    2013-11-01

    Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R2 = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model.

  2. A combined M5P tree and hazard-based duration model for predicting urban freeway traffic accident durations.

    PubMed

    Lin, Lei; Wang, Qian; Sadek, Adel W

    2016-06-01

    The duration of freeway traffic accidents is an important factor, which affects traffic congestion, environmental pollution, and secondary accidents. Among previous studies, the M5P algorithm has been shown to be an effective tool for predicting incident duration. M5P builds a tree-based model, like the traditional classification and regression tree (CART) method, but with multiple linear regression models as its leaves. The problem with M5P for accident duration prediction, however, is that whereas linear regression assumes that the conditional distribution of accident durations is normally distributed, the distribution for a "time-to-an-event" is almost certainly nonsymmetrical. A hazard-based duration model (HBDM) is a better choice for this kind of "time-to-event" modeling scenario, and given this, HBDMs have previously been applied to analyze and predict traffic accident duration. Previous research, however, has not yet applied HBDMs for accident duration prediction in association with clustering or classification of the dataset to minimize data heterogeneity. The current paper proposes a novel approach for accident duration prediction, which improves on the original M5P tree algorithm through the construction of an M5P-HBDM model, in which the leaves of the M5P tree model are HBDMs instead of linear regression models. Such a model offers the advantage of minimizing data heterogeneity through dataset classification, and avoids the need for the incorrect assumption of normality for traffic accident durations. The proposed model was then tested on two freeway accident datasets. For each dataset, the first 500 records were used to train the following three models: (1) an M5P tree; (2) an HBDM; and (3) the proposed M5P-HBDM; the remainder of the data were used for testing. The results show that the proposed M5P-HBDM managed to identify more significant and meaningful variables than either M5P or HBDMs. Moreover, the M5P-HBDM had the lowest overall mean absolute percentage error (MAPE).
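
    The core idea, splitting the data with a tree to reduce heterogeneity and then fitting a duration distribution in each leaf rather than a linear model, can be illustrated with the simplified stand-in below, which uses a CART regression tree and a covariate-free Weibull fit per leaf. The published M5P-HBDM uses an M5P model tree and covariate-dependent hazard-based duration models, so this is only a sketch of the structure, not the paper's estimator.

```python
import numpy as np
from scipy import stats
from sklearn.tree import DecisionTreeRegressor

class TreeDurationModel:
    """Regression tree whose leaves each carry a fitted Weibull duration
    distribution; predictions return the leaf's median duration."""

    def fit(self, X, durations):
        self.tree = DecisionTreeRegressor(min_samples_leaf=30, random_state=0).fit(X, durations)
        leaf_ids = self.tree.apply(X)
        self.leaf_params = {leaf: stats.weibull_min.fit(durations[leaf_ids == leaf], floc=0)
                            for leaf in np.unique(leaf_ids)}
        return self

    def predict_median(self, X):
        return np.array([stats.weibull_min.median(*self.leaf_params[leaf])
                         for leaf in self.tree.apply(X)])
```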

  3. Reliability, Validity, and Classification Accuracy of the DSM-5 Diagnostic Criteria for Gambling Disorder and Comparison to DSM-IV.

    PubMed

    Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C

    2016-09-01

    The DSM-5 was published in 2013 and included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and the elimination of the illegal activities criterion. The purpose of this study was twofold: first, to assess the reliability, validity, and classification accuracy of the DSM-5 diagnostic criteria for GD; second, to compare the DSM-5 and DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of eliminating the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity, and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity, and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly a reduction in false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four, and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.

  4. Multiple category-lot quality assurance sampling: a new classification system with application to schistosomiasis control.

    PubMed

    Olives, Casey; Valadez, Joseph J; Brooker, Simon J; Pagano, Marcello

    2012-01-01

    Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings of Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n=15 and n=25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n=15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. This work provides the analytics needed to understand the properties of MC-LQAS for assessing the prevalence of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools.
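
    Operating characteristic curves for a three-category LQAS rule follow directly from binomial probabilities: for a given true prevalence, the chance of landing in each category is determined by the two decision thresholds applied to the number of positive children in the sample. The sketch below computes those curves; the thresholds d_low and d_high are illustrative choices, not the designs evaluated in the paper.

```python
import numpy as np
from scipy.stats import binom

def mc_lqas_oc(n=15, d_low=2, d_high=8, prevalence=np.linspace(0.01, 0.9, 90)):
    """Probability of classifying a school as low (<=10%), moderate, or high
    (>=50%) prevalence as a function of the true prevalence, for the rule:
    x < d_low -> low, x >= d_high -> high, otherwise moderate, where x is the
    number of positives among n sampled children."""
    p_low = binom.cdf(d_low - 1, n, prevalence)
    p_high = binom.sf(d_high - 1, n, prevalence)
    p_moderate = 1.0 - p_low - p_high
    return prevalence, p_low, p_moderate, p_high
```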

  5. The presence of English and Spanish dyslexia in the Web

    NASA Astrophysics Data System (ADS)

    Rello, Luz; Baeza-Yates, Ricardo

    2012-09-01

    In this study we present a lower bound of the prevalence of dyslexia in the Web for English and Spanish. On the basis of analysis of corpora written by dyslexic people, we propose a classification of the different kinds of dyslexic errors. A representative data set of dyslexic words is used to calculate this lower bound in web pages containing English and Spanish dyslexic errors. We also present an analysis of dyslexic errors in major Internet domains, social media sites, and throughout English- and Spanish-speaking countries. To show the independence of our estimations from the presence of other kinds of errors, we compare them with the overall lexical quality of the Web and with the error rate of noncorrected corpora. The presence of dyslexic errors in the Web motivates work in web accessibility for dyslexic users.

  6. Influence of survey strategy and interpolation model on DEM quality

    NASA Astrophysics Data System (ADS)

    Heritage, George L.; Milan, David J.; Large, Andrew R. G.; Fuller, Ian C.

    2009-11-01

    Accurate characterisation of morphology is critical to many studies in the field of geomorphology, particularly those dealing with changes over time. Digital elevation models (DEMs) are commonly used to represent morphology in three dimensions. The quality of the DEM is largely a function of the accuracy of individual survey points, the field survey strategy, and the method of interpolation. Recommendations concerning field survey strategy and appropriate methods of interpolation are currently lacking. Furthermore, the majority of studies to date consider error to be uniform across a surface. This study quantifies survey strategy and interpolation error for a gravel bar on the River Nent, Blagill, Cumbria, UK. Five sampling strategies were compared: (i) cross section; (ii) bar outline only; (iii) bar and chute outline; (iv) bar and chute outline with spot heights; and (v) aerial LiDAR equivalent, derived from degraded terrestrial laser scan (TLS) data. Digital elevation models were then produced using five different common interpolation algorithms. Each resultant DEM was differenced from a terrestrial laser scan of the gravel bar surface in order to define the spatial distribution of vertical and volumetric error. Overall, triangulation with linear interpolation (TIN) or point kriging appeared to provide the best interpolators for the bar surface. The lowest error on average was found for the simulated aerial LiDAR survey strategy, regardless of interpolation technique. However, comparably low errors were also found for the bar-chute-spot sampling strategy when TINs or point kriging were used as the interpolator. The magnitude of the errors between survey strategies exceeded that found between interpolation techniques for a specific survey strategy. Strong relationships between local surface topographic variation (defined as the standard deviation of vertical elevations in a 0.2-m diameter moving window) and DEM errors were also found, with much greater errors at slope breaks such as bank edges. A series of curves is presented that demonstrate these relationships for each interpolation and survey strategy. The simulated aerial LiDAR data set displayed the lowest errors across the flatter surfaces; however, sharp slope breaks are better modelled by the morphologically based survey strategy. The curves presented have general application to spatially distributed data of river beds and may be applied to standard deviation grids to predict spatial error within a surface, depending upon sampling strategy and interpolation algorithm.
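
    The DEM-differencing step above (interpolate the sampled points onto the reference grid, subtract, and summarize the vertical error) can be sketched as below. The interpolation methods available through scipy's griddata ('linear', which is roughly TIN-like, plus 'nearest' and 'cubic') stand in for the five algorithms compared in the study.

```python
import numpy as np
from scipy.interpolate import griddata

def dem_vertical_error(survey_xyz, ref_xy, ref_z, method="linear"):
    """Interpolate survey points onto the reference (TLS) grid locations and
    return the vertical error surface and its RMSE (NaNs outside the survey
    hull are ignored)."""
    z_interp = griddata(survey_xyz[:, :2], survey_xyz[:, 2], ref_xy, method=method)
    dz = z_interp - ref_z
    valid = ~np.isnan(dz)
    return dz, float(np.sqrt(np.mean(dz[valid] ** 2)))
```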

  7. Nineteen hundred seventy three significant accomplishments. [Landsat satellite data applications

    NASA Technical Reports Server (NTRS)

    1974-01-01

    Data collected by the Skylab remote sensing satellites were used to develop application techniques and to combine automatic data classification with statistical clustering methods. Continuing research was concentrated on the correlation and registration of data products and on the definition of atmospheric effects on remote sensing. The causes of errors encountered in the automated classification of agricultural data are identified. Other applications in forestry, geography, environmental geology, and land use are discussed.

  8. Atmospheric Correction of Satellite Imagery Using Modtran 3.5 Code

    NASA Technical Reports Server (NTRS)

    Gonzales, Fabian O.; Velez-Reyes, Miguel

    1997-01-01

    When performing satellite remote sensing of the earth in the solar spectrum, atmospheric scattering and absorption effects provide the sensors with corrupted information about the target's radiance characteristics. We are faced with the problem of reconstructing the signal that was reflected from the target from the data sensed by the remote sensing instrument. This article presents a method for simulating radiance characteristic curves of satellite images using the MODTRAN 3.5 band model (BM) code to solve the radiative transfer equation (RTE), and proposes a method for the implementation of an adaptive system for automated atmospheric corrections. The simulation procedure is carried out as follows: (1) for each satellite digital image a radiance characteristic curve is obtained by performing a digital number (DN) to radiance conversion, (2) using MODTRAN 3.5 a simulation of the image's characteristic curves is generated, (3) the output of the code is processed to generate radiance characteristic curves for the simulated cases. The simulation algorithm was used to simulate Landsat Thematic Mapper (TM) images for two types of locations: the ocean surface and a forest surface. The simulation procedure was validated by computing the error between the empirical and simulated radiance curves. While results in the visible region of the spectrum were not very accurate, those for the infrared region of the spectrum were encouraging. This information can be used for correction of the atmospheric effects. For the simulation over ocean, the lowest error produced in this region was of the order of 10^-5, up to 14 times smaller than errors in the visible region. For the same spectral region in the forest case, the lowest error produced was of the order of 10^-4, up to 41 times smaller than errors in the visible region.

  9. Assessment of Automatically Exported Clinical Data from a Hospital Information System for Clinical Research in Multiple Myeloma.

    PubMed

    Torres, Viviana; Cerda, Mauricio; Knaup, Petra; Löpprich, Martin

    2016-01-01

    An important part of the electronic information available in a Hospital Information System (HIS) has the potential to be automatically exported to Electronic Data Capture (EDC) platforms to improve clinical research. This automation has the advantage of reducing manual data transcription, a time-consuming and error-prone process. However, quantitative evaluations of the process of exporting data from a HIS to an EDC system have not been reported extensively, in particular in comparison with manual transcription. In this work, an assessment of the quality of an automatic export process, focused on laboratory data from a HIS, is presented. The quality of the laboratory data was assessed for two types of processes: (1) a manual process of data transcription, and (2) an automatic process of data transference. The automatic transference was implemented as an Extract, Transform and Load (ETL) process. A comparison was then carried out between the manual and automatic data collection methods. The criteria used to measure data quality were correctness and completeness. The manual process had a general error rate of 2.6% to 7.1%, with the lowest error rate obtained when data fields without a clear definition were removed from the analysis (p < 10^-3). In the case of the automatic process, the general error rate was 1.9% to 12.1%, where the lowest error rate is obtained when excluding information missing in the HIS but transcribed to the EDC from other physical sources. The automatic ETL process can be used to collect laboratory data for clinical research if the data in the HIS, as well as physical documentation not included in the HIS, are identified beforehand and follow a standardized data collection protocol.

  10. The QUIET Instrument

    NASA Technical Reports Server (NTRS)

    Gaier, T.; Kangaslahti, P.; Lawrence, C. R.; Leitch, E. M.; Wollack, E. J.

    2012-01-01

    The Q/U Imaging ExperimenT (QUIET) is designed to measure polarization in the Cosmic Microwave Background, targeting the imprint of inflationary gravitational waves at large angular scales (approx. 1 deg). Between 2008 October and 2010 December, two independent receiver arrays were deployed sequentially on a 1.4 m side-fed Dragonian telescope. The polarimeters which form the focal planes use a highly compact design based on High Electron Mobility Transistors (HEMTs) that provides simultaneous measurements of the Stokes parameters Q, U, and I in a single module. The 17-element Q-band polarimeter array, with a central frequency of 43.1 GHz, has the best sensitivity (69 micro K s(exp 1/2)) and the lowest instrumental systematic errors ever achieved in this band, contributing to the tensor-to-scalar ratio at r < 0.1. The 84-element W-band polarimeter array has a sensitivity of 87 micro K s(exp 1/2) at a central frequency of 94.5 GHz. It has the lowest systematic errors to date, contributing at r < 0.01 (QUIET Collaboration 2012). The two arrays together cover multipoles in the range l approximately equals 25-975. These are the largest HEMT-based arrays deployed to date. This article describes the design, calibration, performance of, and sources of systematic error for the instrument.

  11. Five-way smoking status classification using text hot-spot identification and error-correcting output codes.

    PubMed

    Cohen, Aaron M

    2008-01-01

    We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2b2 task organizers, using micro- and macro-averaged F1 as the primary performance metric. Our best performing system achieved a micro-F1 of 0.9000 on the test collection, equivalent to the best performing system submitted to the i2b2 challenge. Hot-spot identification, zero-vector filtering, classifier weighting, and error correcting output coding contributed additively to increased performance, with hot-spot identification having by far the largest positive effect. High performance on automatic identification of patient smoking status from discharge summaries is achievable with the efficient and straightforward machine learning techniques studied here.
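
    Of the techniques listed above, error-correcting output codes are available off the shelf in scikit-learn, so the essential pipeline (text features plus an ECOC wrapper around a binary classifier) can be sketched as follows. The vectorizer settings, base classifier, code size, and the placeholder documents and labels are illustrative and omit the hot-spot identification, zero-vector filtering, class weighting, and post-processing rules of the submitted system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier
from sklearn.pipeline import make_pipeline

# Five-way smoking-status classifier using error-correcting output codes.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    OutputCodeClassifier(LogisticRegression(max_iter=1000), code_size=4, random_state=0),
)

# usage (train_docs: list of discharge-summary texts; train_labels: one of the
# five smoking-status classes per document; both are placeholders here):
# model.fit(train_docs, train_labels)
# predictions = model.predict(test_docs)
```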

  12. Describing Peripancreatic Collections According to the Revised Atlanta Classification of Acute Pancreatitis: An International Interobserver Agreement Study.

    PubMed

    Bouwense, Stefan A; van Brunschot, Sandra; van Santvoort, Hjalmar C; Besselink, Marc G; Bollen, Thomas L; Bakker, Olaf J; Banks, Peter A; Boermeester, Marja A; Cappendijk, Vincent C; Carter, Ross; Charnley, Richard; van Eijck, Casper H; Freeny, Patrick C; Hermans, John J; Hough, David M; Johnson, Colin D; Laméris, Johan S; Lerch, Markus M; Mayerle, Julia; Mortele, Koenraad J; Sarr, Michael G; Stedman, Brian; Vege, Santhi Swaroop; Werner, Jens; Dijkgraaf, Marcel G; Gooszen, Hein G; Horvath, Karen D

    2017-08-01

    Severe acute pancreatitis is associated with peripancreatic morphologic changes as seen on imaging. Uniform communication regarding these morphologic findings is crucial for accurate diagnosis and treatment. For the original 1992 Atlanta classification, interobserver agreement is poor. We hypothesized that for the revised Atlanta classification, interobserver agreement will be better. An international interobserver agreement study was performed among expert and nonexpert radiologists (n = 14), surgeons (n = 15), and gastroenterologists (n = 8). Representative computed tomography scans of all stages of acute pancreatitis were selected from 55 patients and were assessed according to the revised Atlanta classification. The interobserver agreement was calculated among all reviewers and subgroups, that is, expert and nonexpert reviewers; interobserver agreement was defined as poor (≤0.20), fair (0.21-0.40), moderate (0.41-0.60), good (0.61-0.80), or very good (0.81-1.00). Interobserver agreement among all reviewers was good (0.75 [standard deviation, 0.21]) for describing the type of acute pancreatitis and good (0.62 [standard deviation, 0.19]) for the type of peripancreatic collection. Expert radiologists showed the best and nonexpert clinicians the lowest interobserver agreement. Interobserver agreement was good for the revised Atlanta classification, supporting the importance of widespread adoption of this revised classification for clinical and research communications.
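
    Interobserver agreement of this kind is commonly summarized with a kappa statistic; the sketch below uses a pairwise Cohen's kappa on made-up collection labels (not the study's multi-rater data) and maps the value onto the qualitative bands quoted above.

      # Minimal sketch: pairwise Cohen's kappa between two readers and the
      # poor/fair/moderate/good/very good bands quoted in the abstract.
      from sklearn.metrics import cohen_kappa_score

      reader_a = ["APFC", "ANC", "WON", "pseudocyst", "APFC", "ANC", "WON", "APFC"]
      reader_b = ["APFC", "ANC", "WON", "pseudocyst", "ANC",  "ANC", "WON", "WON"]

      kappa = cohen_kappa_score(reader_a, reader_b)

      def agreement_band(k):
          # Bands follow the cut-offs quoted in the abstract.
          if k <= 0.20:
              return "poor"
          if k <= 0.40:
              return "fair"
          if k <= 0.60:
              return "moderate"
          if k <= 0.80:
              return "good"
          return "very good"

      print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")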

  13. Failure analysis and modeling of a VAXcluster system

    NASA Technical Reports Server (NTRS)

    Tang, Dong; Iyer, Ravishankar K.; Subramani, Sujatha S.

    1990-01-01

    This paper discusses the results of a measurement-based analysis of real error data collected from a DEC VAXcluster multicomputer system. In addition to evaluating basic system dependability characteristics such as error and failure distributions and hazard rates for both individual machines and for the VAXcluster, reward models were developed to analyze the impact of failures on the system as a whole. The results show that more than 46 percent of all failures were due to errors in shared resources. This is despite the fact that these errors have a recovery probability greater than 0.99. The hazard rate calculations show that not only errors, but also failures occur in bursts. Approximately 40 percent of all failures occurred in bursts and involved multiple machines, indicating that correlated failures are significant. Analysis of rewards shows that software errors have the lowest reward (0.05 vs 0.74 for disk errors). The expected reward rate (reliability measure) of the VAXcluster drops to 0.5 in 18 hours for the 7-out-of-7 model and in 80 days for the 3-out-of-7 model.

  14. A classification on human factor accident/incident of China civil aviation in recent twelve years.

    PubMed

    Luo, Xiao-li

    2004-10-01

    To study human factor accidents/incidents that occurred during 1990-2001 using a new classification standard. The human factor accident/incident classification standard was developed on the basis of Reason's model, combined with CAAC's traditional classification method, and applied to a classified statistical analysis of 361 flight incidents and 35 flight accidents of China civil aviation that were induced by human factors and occurred from 1990 to 2001. (1) The incident percentage for taxi and cruise is higher than that for takeoff, climb and descent. (2) The dominant types of flight incident are runway deviation, runway overrun, near-miss, tail/wingtip/engine strike and collision with ground obstacles. (3) The top three accident types are loss of control caused by the crew, collision with mountains, and runway overrun. (4) Crews' basic operating skills are lower than expected, most evident as a poor ability to correct flight errors when they occur. (5) Crew errors mainly take the form of incorrect control inputs, violation of regulations and procedures, disorientation, and deviation from the correct flight level. Poor CRM skill is the dominant factor affecting China civil aviation safety. This result is consistent with previous studies, but the top incident phases, the types of crew error and the behavioral performance differ markedly from those reported for advanced countries. CRM training tailored to the behavioral characteristics of Chinese pilots should be strengthened for all pilots in order to improve the safety level of China civil aviation.

  15. Southwestern Cooperative Educational Laboratory Interaction Observation Schedule (SCIOS): A System for Analyzing Teacher-Pupil Interaction in the Affective Domain.

    ERIC Educational Resources Information Center

    Bemis, Katherine A.; Liberty, Paul G.

    The Southwestern Cooperative Interaction Observation Schedule (SCIOS) is a classroom observation instrument designed to record pupil-teacher interaction. The classification of pupil behavior is based on Krathwohl's (1964) theory of the three lowest levels of the affective domain. The levels are (1) receiving: the learner should be sensitized to…

  16. Solar dynamic heat receiver thermal characteristics in low earth orbit

    NASA Technical Reports Server (NTRS)

    Wu, Y. C.; Roschke, E. J.; Birur, G. C.

    1988-01-01

    A simplified system model is under development for evaluating the thermal characteristics and thermal performance of a solar dynamic spacecraft energy system's heat receiver. Results based on baseline orbit, power system configuration, and operational conditions are generated for three basic receiver concepts and three concentrator surface slope errors. Receiver thermal characteristics and thermal behavior in LEO conditions are presented. The configuration in which heat is directly transferred to the working fluid is noted to generate the best system and thermal characteristics, as well as the lowest performance degradation with increasing slope error.

  17. Content-based multiple bitstream image transmission over noisy channels.

    PubMed

    Cao, Lei; Chen, Chang Wen

    2002-01-01

    In this paper, we propose a novel combined source and channel coding scheme for image transmission over noisy channels. The main feature of the proposed scheme is a systematic decomposition of image sources so that unequal error protection can be applied according to not only bit error sensitivity but also visual content importance. The wavelet transform is adopted to hierarchically decompose the image. The association between the wavelet coefficients and what they represent spatially in the original image is fully exploited so that wavelet blocks are classified based on their corresponding image content. The classification produces wavelet blocks in each class with similar content and statistics, and therefore enables high-performance source compression using the set partitioning in hierarchical trees (SPIHT) algorithm. To combat the channel noise, an unequal error protection strategy with rate-compatible punctured convolutional/cyclic redundancy check (RCPC/CRC) codes is implemented based on the bit contribution to both peak signal-to-noise ratio (PSNR) and visual quality. At the receiving end, a postprocessing method making use of the SPIHT decoding structure and the classification map is developed to restore the degradation due to the residual error after channel decoding. Experimental results show that the proposed scheme is indeed able to provide protection both for the bits that are more sensitive to errors and for the more important visual content under a noisy transmission environment. In particular, the reconstructed images illustrate consistently better visual quality than using the single-bitstream-based schemes.

  18. Robust Image Regression Based on the Extended Matrix Variate Power Exponential Distribution of Dependent Noise.

    PubMed

    Luo, Lei; Yang, Jian; Qian, Jianjun; Tai, Ying; Lu, Gui-Fu

    2017-09-01

    Dealing with partial occlusion or illumination is one of the most challenging problems in image representation and classification. In this problem, the characterization of the representation error plays a crucial role. In most current approaches, the error matrix needs to be stretched into a vector and each element is assumed to be independently corrupted. This ignores the dependence between the elements of error. In this paper, it is assumed that the error image caused by partial occlusion or illumination changes is a random matrix variate and follows the extended matrix variate power exponential distribution. This distribution has heavy-tailed regions and can be used to describe a matrix pattern of l×m-dimensional observations that are not independent. This paper reveals the essence of the proposed distribution: it actually alleviates the correlations between pixels in an error matrix E and makes E approximately Gaussian. On the basis of this distribution, we derive a Schatten p-norm-based matrix regression model with Lq regularization. The alternating direction method of multipliers is applied to solve this model. To get a closed-form solution in each step of the algorithm, two singular value function thresholding operators are introduced. In addition, the extended Schatten p-norm is utilized to characterize the distance between the test samples and classes in the design of the classifier. Extensive experimental results for image reconstruction and classification with structural noise demonstrate that the proposed algorithm works much more robustly than some existing regression-based methods.

  19. Assessing the accuracy of the International Classification of Diseases codes to identify abusive head trauma: a feasibility study.

    PubMed

    Berger, Rachel P; Parks, Sharyn; Fromkin, Janet; Rubin, Pamela; Pecora, Peter J

    2015-04-01

    To assess the accuracy of an International Classification of Diseases (ICD) code-based operational case definition for abusive head trauma (AHT). Subjects were children <5 years of age evaluated for AHT by a hospital-based Child Protection Team (CPT) at a tertiary care paediatric hospital with a completely electronic medical record (EMR) system. Subjects were designated as non-AHT traumatic brain injury (TBI) or AHT based on whether the CPT determined that the injuries were due to AHT. The sensitivity and specificity of the ICD-based definition were calculated. There were 223 children evaluated for AHT: 117 AHT and 106 non-AHT TBI. The sensitivity and specificity of the ICD-based operational case definition were 92% (95% CI 85.8 to 96.2) and 96% (95% CI 92.3 to 99.7), respectively. All errors in sensitivity and three of the four specificity errors were due to coder error; one specificity error was a physician error. In a paediatric tertiary care hospital with an EMR system, the accuracy of an ICD-based case definition for AHT was high. Additional studies are needed to assess the accuracy of this definition in all types of hospitals in which children with AHT are cared for. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
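
    The sensitivity and specificity figures above follow from the 2x2 table of the ICD-based definition against the CPT reference; the sketch below recomputes them with normal-approximation confidence intervals, using illustrative cell counts consistent with the reported totals rather than the study's exact table.

      # Minimal sketch: sensitivity and specificity of a code-based case
      # definition against a reference standard. Cell counts are illustrative
      # assumptions in the ballpark of the abstract.
      from math import sqrt

      def prop_ci(successes, n, z=1.96):
          """Normal-approximation 95% CI for a proportion."""
          p = successes / n
          half = z * sqrt(p * (1 - p) / n)
          return p, max(0.0, p - half), min(1.0, p + half)

      tp, fn = 108, 9      # AHT cases flagged / missed by the ICD definition
      tn, fp = 102, 4      # non-AHT TBI correctly excluded / wrongly flagged

      sens, s_lo, s_hi = prop_ci(tp, tp + fn)
      spec, p_lo, p_hi = prop_ci(tn, tn + fp)
      print(f"sensitivity = {sens:.1%} (95% CI {s_lo:.1%}-{s_hi:.1%})")
      print(f"specificity = {spec:.1%} (95% CI {p_lo:.1%}-{p_hi:.1%})")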

  20. Modeling habitat dynamics accounting for possible misclassification

    USGS Publications Warehouse

    Veran, Sophie; Kleiner, Kevin J.; Choquet, Remi; Collazo, Jaime; Nichols, James D.

    2012-01-01

    Land cover data are widely used in ecology as land cover change is a major component of changes affecting ecological systems. Landscape change estimates are characterized by classification errors. Researchers have used error matrices to adjust estimates of areal extent, but estimation of land cover change is more challenging because error in classification can be confused with true change. We modeled land cover dynamics for a discrete set of habitat states. The approach accounts for state uncertainty to produce unbiased estimates of habitat transition probabilities, using ground information to inform error rates. We consider the case when true and observed habitat states are available for the same geographic unit (pixel) and when true and observed states are obtained at one level of resolution, but transition probabilities are estimated at a different level of resolution (aggregations of pixels). Simulation results showed a strong bias when estimating transition probabilities if misclassification was not accounted for. Scaling-up does not necessarily decrease the bias and can even increase it. Analyses of land cover data in the Southeast region of the USA showed that land change patterns appeared distorted if misclassification was not accounted for: the rate of habitat turnover was artificially increased and habitat composition appeared more homogeneous. Not properly accounting for land cover misclassification can produce misleading inferences about habitat state and dynamics and also misleading predictions about species distributions based on habitat. Our models that explicitly account for state uncertainty should be useful in obtaining more accurate inferences about change from data that include errors.
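
    The bias from ignoring misclassification can be reproduced with a very small simulation: observe a two-state habitat through a confusion matrix and estimate a transition probability naively. The settings below are arbitrary illustrations and only demonstrate the bias; the paper's model additionally corrects for the state uncertainty.

      # Minimal sketch: naive transition estimates are inflated when the
      # observed habitat states include classification error.
      import numpy as np

      rng = np.random.default_rng(0)
      n_pixels = 50000
      P_true = np.array([[0.95, 0.05],   # true habitat transition probabilities
                         [0.10, 0.90]])
      C = np.array([[0.90, 0.10],        # P(observed class | true class)
                    [0.15, 0.85]])

      true_t1 = rng.integers(0, 2, n_pixels)
      true_t2 = (rng.random(n_pixels) < P_true[true_t1, 1]).astype(int)
      obs_t1 = (rng.random(n_pixels) < C[true_t1, 1]).astype(int)
      obs_t2 = (rng.random(n_pixels) < C[true_t2, 1]).astype(int)

      # Naive transition estimate from observed states, ignoring misclassification
      naive = obs_t2[obs_t1 == 0].mean()
      print(f"true P(0->1) = {P_true[0, 1]:.2f}, naive estimate = {naive:.2f}")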

  1. Underwater target classification using wavelet packets and neural networks.

    PubMed

    Azimi-Sadjadi, M R; Yao, D; Huang, Q; Dobeck, G J

    2000-01-01

    In this paper, a new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals. The system consists of a feature extractor using wavelet packets in conjunction with linear predictive coding (LPC), a feature selection scheme, and a backpropagation neural-network classifier. The data set used for this study consists of the backscattered signals from six different objects: two mine-like targets and four nontargets for several aspect angles. Simulation results on ten different noisy realizations and for signal-to-noise ratio (SNR) of 12 dB are presented. The receiver operating characteristic (ROC) curve of the classifier generated based on these results demonstrated excellent classification performance of the system. The generalization ability of the trained network was demonstrated by computing the error and classification rate statistics on a large data set. A multiaspect fusion scheme was also adopted in order to further improve the classification performance.

  2. The effect of the atmosphere on the classification of satellite observations to identify surface features

    NASA Technical Reports Server (NTRS)

    Fraser, R. S.; Bahethi, O. P.; Al-Abbas, A. H.

    1977-01-01

    The effect of differences in atmospheric turbidity on the classification of Landsat 1 observations of a rural scene is presented. The observations are classified by an unsupervised clustering technique. These clusters serve as a training set for use of a maximum-likelihood algorithm. The measured radiances in each of the four spectral bands are then changed by amounts measured by Landsat 1. These changes can be associated with a decrease in atmospheric turbidity by a factor of 1.3. The classification of 22% of the pixels changes as a result of the modification. The modified observations are then reclassified as an independent set. Only 3% of the pixels have a different classification than the unmodified set. Hence, if classification errors of rural areas are not to exceed 15%, a new training set has to be developed whenever the difference in turbidity between the training and test sets reaches unity.

  3. Multinomial mixture model with heterogeneous classification probabilities

    USGS Publications Warehouse

    Holland, M.D.; Gray, B.R.

    2011-01-01

    Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. © 2010 Springer Science+Business Media, LLC.

  4. Stretchy binary classification.

    PubMed

    Toh, Kar-Ann; Lin, Zhiping; Sun, Lei; Li, Zhengguo

    2018-01-01

    In this article, we introduce an analytic formulation for compressive binary classification. The formulation seeks to solve the least ℓp-norm of the parameter vector subject to a classification error constraint. An analytic and stretchable estimation is conjectured where the estimation can be viewed as an extension of the pseudoinverse with left and right constructions. Our variance analysis indicates that the estimation based on the left pseudoinverse is unbiased and the estimation based on the right pseudoinverse is biased. Sparseness can be obtained for the biased estimation under certain mild conditions. The proposed estimation is investigated numerically using both synthetic and real-world data. Copyright © 2017 Elsevier Ltd. All rights reserved.
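
    The left and right pseudoinverse constructions mentioned above can be illustrated in a few lines; the sketch below shows the two cases on a toy problem with +/-1 labels and is not the paper's stretchable ℓp-norm estimator.

      # Minimal sketch of the left (overdetermined) and right (underdetermined)
      # pseudoinverse constructions on a toy +/-1 labelled problem.
      import numpy as np

      rng = np.random.default_rng(1)
      y = np.sign(rng.standard_normal(20))          # +/-1 class labels

      # Overdetermined: more samples than features -> left pseudoinverse,
      # i.e. the ordinary least-squares solution.
      X_tall = rng.standard_normal((20, 5))
      w_left = np.linalg.inv(X_tall.T @ X_tall) @ X_tall.T @ y

      # Underdetermined (compressive): fewer samples than features -> right
      # pseudoinverse, the minimum l2-norm interpolating solution.
      X_wide = rng.standard_normal((20, 50))
      w_right = X_wide.T @ np.linalg.inv(X_wide @ X_wide.T) @ y

      print("training error (left): ", np.mean(np.sign(X_tall @ w_left) != y))
      print("training error (right):", np.mean(np.sign(X_wide @ w_right) != y))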

  5. [Effects of residents' care needs classification (and misclassification) in nursing homes: the example of SOSIA classification].

    PubMed

    Nebuloni, G; Di Giulio, P; Gregori, D; Sandonà, P; Berchialla, P; Foltran, F; Renga, G

    2011-01-01

    Since 2003, the Lombardy region has introduced a case-mix reimbursement system for nursing homes based on the SOSIA form, which classifies residents into eight classes of frailty. In the present study the agreement between the SOSIA classification and other well-documented instruments, including the Barthel Index, the Mini Mental State Examination and the Clinical Dementia Rating Scale, is evaluated in 100 nursing home residents. Only 50% of residents with severe dementia were recognized as seriously impaired when assessed with the SOSIA form; since misclassification errors underestimate residents' care needs, they result in insufficient reimbursement, limiting the nursing home's ability to offer care appropriate to the case-mix.

  6. Effect of filtration of signals of brain activity on quality of recognition of brain activity patterns using artificial intelligence methods

    NASA Astrophysics Data System (ADS)

    Hramov, Alexander E.; Frolov, Nikita S.; Musatov, Vyachaslav Yu.

    2018-02-01

    In the present work we studied the classification of human brain states corresponding to real movements of the hands and legs. For this purpose we used a supervised learning algorithm based on feed-forward artificial neural networks (ANNs) with error back-propagation, along with the support vector machine (SVM) method. We compared the quality of operator movement classification using EEG signals obtained experimentally, both without preliminary processing and after filtering in different frequency ranges up to 25 Hz. It was shown that low-frequency filtering of multichannel EEG data significantly improved the accuracy of operator movement classification.
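
    A minimal sketch of the preprocessing idea, assuming synthetic trials in place of real EEG: low-pass filter the signals below 25 Hz and train a small feed-forward network on the raw and on the filtered versions.

      # Minimal sketch: low-pass filtering before feed-forward classification.
      # Signals are synthetic (a slow class-dependent rhythm plus broadband
      # noise), not real EEG recordings.
      import numpy as np
      from scipy.signal import butter, filtfilt
      from sklearn.neural_network import MLPClassifier
      from sklearn.model_selection import train_test_split

      fs = 250                                   # sampling rate, Hz
      t = np.arange(0, 2.0, 1 / fs)              # 2-second trials
      rng = np.random.default_rng(0)

      def make_trial(label):
          freq = 8 if label == 0 else 12         # class-dependent slow rhythm
          clean = np.sin(2 * np.pi * freq * t)
          noise = 2.0 * rng.standard_normal(t.size)
          return clean + noise

      y = np.arange(200) % 2
      X_raw = np.array([make_trial(lbl) for lbl in y])

      b, a = butter(4, 25 / (fs / 2), btype="low")    # 25 Hz low-pass filter
      X_filt = filtfilt(b, a, X_raw, axis=1)

      for name, X in [("raw", X_raw), ("filtered", X_filt)]:
          Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
          clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
          acc = clf.fit(Xtr, ytr).score(Xte, yte)
          print(f"{name}: test accuracy = {acc:.2f}")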

  7. Ice crystals classification using airborne measurements in mixing phase

    NASA Astrophysics Data System (ADS)

    Sorin Vajaiac, Nicolae; Boscornea, Andreea

    2017-04-01

    This paper presents a case study of ice crystal classification from airborne measurements in mixed-phase clouds. The ice crystal shadow is recorded with the CIP (Cloud Imaging Probe) component of the CAPS (Cloud, Aerosol, and Precipitation Spectrometer) system. The analyzed flight was performed in the south-western part of Romania (between Pietrosani, Ramnicu Valcea, Craiova and Targu Jiu), with a Beechcraft C90 GTX specially equipped with a CAPS system. The temperature during the flight reached its lowest value of -35 °C. These low temperatures allow the formation of ice crystals and influence their form. For the ice crystal classification presented here, a dedicated software package, OASIS (Optical Array Shadow Imaging Software), developed by DMT (Droplet Measurement Technologies), was used. The obtained results are, as expected, influenced by the atmospheric and microphysical parameters. The recorded particles were classified into four groups: edge, irregular, round and small.

  8. GDF v2.0, an enhanced version of GDF

    NASA Astrophysics Data System (ADS)

    Tsoulos, Ioannis G.; Gavrilis, Dimitris; Dermatas, Evangelos

    2007-12-01

    An improved version of the function estimation program GDF is presented. The main enhancements of the new version include: multi-output function estimation, capability of defining custom functions in the grammar, and selection of the error function. The new version has been evaluated on a series of classification and regression datasets that are widely used for the evaluation of such methods. It is compared to two known neural networks and outperforms them in 5 (out of 10) datasets. Program summary: Title of program: GDF v2.0. Catalogue identifier: ADXC_v2_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADXC_v2_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 98 147. No. of bytes in distributed program, including test data, etc.: 2 040 684. Distribution format: tar.gz. Programming language: GNU C++. Computer: the program is designed to be portable to all systems running the GNU C++ compiler. Operating system: Linux, Solaris, FreeBSD. RAM: 200000 bytes. Classification: 4.9. Does the new version supersede the previous version?: Yes. Nature of problem: the technique of function estimation tries to discover, from a series of input data, a functional form that best describes them; this can be performed with the use of parametric models, whose parameters can adapt according to the input data. Solution method: functional forms are created by genetic programming as approximations for the symbolic regression problem. Reasons for new version: the GDF package was extended in order to be more flexible and user-customizable than the old package; the user can extend the package by defining custom error functions and can extend the grammar by adding new functions to the function repertoire, and the new version can perform estimation of multi-output functions and can therefore be used for classification problems. Summary of revisions: (1) Multi-output function approximation: the package can now approximate multi-output functions, which also gives it the capability of performing classification and not only regression. (2) User-defined functions can be added to the repertoire of the grammar, extending the regression capabilities of the package; this feature is limited to 3 functions, but this number can easily be increased. (3) Capability of selecting the error function: apart from the mean square error, the package now offers other error functions such as the mean absolute error and the maximum square error, and user-defined error functions can be added to the set of error functions. (4) More verbose output: the main program displays more information to the user as well as the default values for the parameters, and the package allows the user to define an output file where the output of the gdf program for the testing set will be stored after the termination of the process. Additional comments: a technical report describing the revisions, experiments and test runs is packaged with the source code. Running time: depending on the training data.

  9. Ground truth management system to support multispectral scanner /MSS/ digital analysis

    NASA Technical Reports Server (NTRS)

    Coiner, J. C.; Ungar, S. G.

    1977-01-01

    A computerized geographic information system for management of ground truth has been designed and implemented to relate MSS classification results to in situ observations. The ground truth system transforms, generalizes and rectifies ground observations to conform to the pixel size and shape of high resolution MSS aircraft data. These observations can then be aggregated for comparison to lower resolution sensor data. Construction of a digital ground truth array allows direct pixel by pixel comparison between classification results of MSS data and ground truth. By making comparisons, analysts can identify spatial distribution of error within the MSS data as well as usual figures of merit for the classifications. Use of the ground truth system permits investigators to compare a variety of environmental or anthropogenic data, such as soil color or tillage patterns, with classification results and allows direct inclusion of such data into classification operations. To illustrate the system, examples from classification of simulated Thematic Mapper data for agricultural test sites in North Dakota and Kansas are provided.

  10. Low-lying electronic states of Li2+ and Li2-

    NASA Astrophysics Data System (ADS)

    Konowalow, Daniel D.; Fish, James L.

    1984-02-01

    Potential curves for the eight lowest-lying electronic states of Li2+ and the two lowest-lying states of Li2- are obtained by valence configuration calculations which utilize an effective core potential. The calculated ionization potential of the ground state of Li2 is found to be 5.16 eV and its electron affinity is 0.429 eV. Both values are in excellent agreement with recent experimental values and with values deduced from other high-quality ab initio quantum mechanical treatments. When our potential curve for the Li2+ (1²Σg+) state is corrected for the core-valence correlation error we obtain spectroscopic constants which agree nicely with the experimental values of Bernheim, Gold and Tipton (BGT). For example, we find De = 10460 ± 140 cm^-1 while BGT report De = 10469 ± 6 cm^-1.

  11. Global terrain classification using Multiple-Error-Removed Improved-Terrain (MERIT) to address susceptibility of landslides and other geohazards

    NASA Astrophysics Data System (ADS)

    Iwahashi, J.; Yamazaki, D.; Matsuoka, M.; Thamarux, P.; Herrick, J.; Yong, A.; Mital, U.

    2017-12-01

    A seamless model of landform classifications with regional accuracy will be a powerful platform for geophysical studies that forecast geologic hazards. Spatial variability as a function of landform on a global scale was captured in the automated classifications of Iwahashi and Pike (2007) and additional developments are presented here that incorporate more accurate depictions using higher-resolution elevation data than the original 1-km scale Shuttle Radar Topography Mission digital elevation model (DEM). We create polygon-based terrain classifications globally by using the 280-m DEM interpolated from the Multi-Error-Removed Improved-Terrain DEM (MERIT; Yamazaki et al., 2017). The multi-scale pixel-image analysis method, known as Multi-resolution Segmentation (Baatz and Schäpe, 2000), is first used to classify the terrains based on geometric signatures (slope and local convexity) calculated from the 280-m DEM. Next, we apply the machine learning method of "k-means clustering" to prepare the polygon-based classification at the globe-scale using slope, local convexity and surface texture. We then group the divisions with similar properties by hierarchical clustering and other statistical analyses using geological and geomorphological data of the area where landslides and earthquakes are frequent (e.g. Japan and California). We find the 280-m DEM resolution is only partially sufficient for classifying plains. We nevertheless observe that the categories correspond to reported landslide and liquefaction features at the global scale, suggesting that our model is an appropriate platform to forecast ground failure. To predict seismic amplification, we estimate site conditions using the time-averaged shear-wave velocity in the upper 30-m (VS30) measurements compiled by Yong et al. (2016) and the terrain model developed by Yong (2016; Y16). We plan to test our method on finer resolution DEMs and report our findings to obtain a more globally consistent terrain model as there are known errors in DEM derivatives at higher-resolutions. We expect the improvement in DEM resolution (4 times greater detail) and the combination of regional and global coverage will yield a consistent dataset of polygons that have the potential to improve relations to the Y16 estimates significantly.
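
    The clustering step described above can be sketched as k-means over simple geometric signatures; in the sketch below the DEM is synthetic and the slope, convexity and texture measures are simplified stand-ins for those computed from the MERIT-derived 280-m DEM.

      # Minimal sketch: k-means terrain classes from slope, local convexity and
      # surface texture computed on a synthetic DEM array.
      import numpy as np
      from scipy import ndimage
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(0)
      dem = ndimage.gaussian_filter(rng.standard_normal((200, 200)), sigma=8) * 500.0

      gy, gx = np.gradient(dem)                       # cell-to-cell elevation change
      slope = np.hypot(gx, gy)                        # slope magnitude
      convexity = ndimage.laplace(dem)                # positive where locally convex
      texture = ndimage.generic_filter(slope, np.std, size=5)   # local roughness

      features = np.column_stack([slope.ravel(), convexity.ravel(), texture.ravel()])
      features = (features - features.mean(0)) / features.std(0)

      labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(features)
      terrain_classes = labels.reshape(dem.shape)
      print(np.bincount(labels))                      # pixels per terrain class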

  12. Disregarding population specificity: its influence on the sex assessment methods from the tibia.

    PubMed

    Kotěrová, Anežka; Velemínská, Jana; Dupej, Ján; Brzobohatá, Hana; Pilný, Aleš; Brůžek, Jaroslav

    2017-01-01

    Forensic anthropology has developed classification techniques for sex estimation of unknown skeletal remains, for example population-specific discriminant function analyses. These methods were designed for populations that lived mostly in the late nineteenth and twentieth centuries. Their level of reliability or misclassification is important for practical use in today's forensic practice; it is, however, unknown. We addressed the question of what the likelihood of errors would be if population specificity of discriminant functions of the tibia were disregarded. Moreover, five classification functions in a Czech sample were proposed (accuracies 82.1-87.5%, sex bias ranging from -1.3 to -5.4%). We measured ten variables traditionally used for sex assessment of the tibia on a sample of 30 male and 26 female models from a recent Czech population. To estimate the classification accuracy and error (misclassification) rates when ignoring population specificity, we selected published classification functions of the tibia for the Portuguese, south European, and North American populations. These functions were applied to the dimensions of the Czech population. Comparing the classification success of the reference and the tested Czech sample showed that females from the Czech population were significantly overestimated and mostly misclassified as males. Overall accuracy of sex assessment significantly decreased (53.6-69.7%) and sex bias ranged from -29.4% to 100%, which is most probably caused by secular trends and the generally high variability of body size. The results indicate that discriminant functions developed for skeletal series representing geographically and chronologically diverse populations are not applicable in current forensic investigations. Finally, implications and recommendations for future research are discussed.

  13. Decision Making for Borderline Cases in Pass/Fail Clinical Anatomy Courses: The Practical Value of the Standard Error of Measurement and Likelihood Ratio in a Diagnostic Test

    ERIC Educational Resources Information Center

    Severo, Milton; Silva-Pereira, Fernanda; Ferreira, Maria Amelia

    2013-01-01

    Several studies have shown that the standard error of measurement (SEM) can be used as an additional “safety net” to reduce the frequency of false-positive or false-negative student grading classifications. Practical examinations in clinical anatomy are often used as diagnostic tests to admit students to course final examinations. The aim of this…

  14. Use of scan overlap redundancy to enhance multispectral aircraft scanner data

    NASA Technical Reports Server (NTRS)

    Lindenlaub, J. C.; Keat, J.

    1973-01-01

    Two criteria were suggested for optimizing the resolution error versus signal-to-noise-ratio tradeoff. The first criterion uses equal weighting coefficients and chooses n, the number of lines averaged, so as to make the average resolution error equal to the noise error. The second criterion adjusts both the number and relative sizes of the weighting coefficients so as to minimize the total error (resolution error plus noise error). The optimum set of coefficients depends upon the geometry of the resolution element, the number of redundant scan lines, the scan line increment, and the original signal-to-noise ratio of the channel. Programs were developed to find the optimum number and relative weights of the averaging coefficients. A working definition of signal-to-noise ratio was given and used to try line averaging on a typical set of data. Line averaging was evaluated only with respect to its effect on classification accuracy.

  15. Reducing error and improving efficiency during vascular interventional radiology: implementation of a preprocedural team rehearsal.

    PubMed

    Morbi, Abigail H M; Hamady, Mohamad S; Riga, Celia V; Kashef, Elika; Pearch, Ben J; Vincent, Charles; Moorthy, Krishna; Vats, Amit; Cheshire, Nicholas J W; Bicknell, Colin D

    2012-08-01

    To determine the type and frequency of errors during vascular interventional radiology (VIR) and design and implement an intervention to reduce error and improve efficiency in this setting. Ethical guidance was sought from the Research Services Department at Imperial College London. Informed consent was not obtained. Field notes were recorded during 55 VIR procedures by a single observer. Two blinded assessors identified failures from field notes and categorized them into one or more errors by using a 22-part classification system. The potential to cause harm, disruption to procedural flow, and preventability of each failure was determined. A preprocedural team rehearsal (PPTR) was then designed and implemented to target frequent preventable potential failures. Thirty-three procedures were observed subsequently to determine the efficacy of the PPTR. Nonparametric statistical analysis was used to determine the effect of intervention on potential failure rates, potential to cause harm and procedural flow disruption scores (Mann-Whitney U test), and number of preventable failures (Fisher exact test). Before intervention, 1197 potential failures were recorded, of which 54.6% were preventable. A total of 2040 errors were deemed to have occurred to produce these failures. Planning error (19.7%), staff absence (16.2%), equipment unavailability (12.2%), communication error (11.2%), and lack of safety consciousness (6.1%) were the most frequent errors, accounting for 65.4% of the total. After intervention, 352 potential failures were recorded. Classification resulted in 477 errors. Preventable failures decreased from 54.6% to 27.3% (P < .001) with implementation of PPTR. Potential failure rates per hour decreased from 18.8 to 9.2 (P < .001), with no increase in potential to cause harm or procedural flow disruption per failure. Failures during VIR procedures are largely because of ineffective planning, communication error, and equipment difficulties, rather than a result of technical or patient-related issues. Many of these potential failures are preventable. A PPTR is an effective means of targeting frequent preventable failures, reducing procedural delays and improving patient safety.

  16. Bug Distribution and Statistical Pattern Classification.

    ERIC Educational Resources Information Center

    Tatsuoka, Kikumi K.; Tatsuoka, Maurice M.

    1987-01-01

    The rule space model permits measurement of cognitive skill acquisition and error diagnosis. Further discussion introduces Bayesian hypothesis testing and bug distribution. An illustration involves an artificial intelligence approach to testing fractions and arithmetic. (Author/GDC)

  17. Assessment of Metronidazole Susceptibility in Helicobacter pylori: Statistical Validation and Error Rate Analysis of Breakpoints Determined by the Disk Diffusion Test

    PubMed Central

    Chaves, Sandra; Gadanho, Mário; Tenreiro, Rogério; Cabrita, José

    1999-01-01

    Metronidazole susceptibility of 100 Helicobacter pylori strains was assessed by determining the inhibition zone diameters by disk diffusion test and the MICs by agar dilution and PDM Epsilometer test (E test). Linear regression analysis was performed, allowing the definition of significant linear relations, and revealed correlations of disk diffusion results with both E-test and agar dilution results (r2 = 0.88 and 0.81, respectively). No significant differences (P = 0.84) were found between MICs defined by E test and those defined by agar dilution, taken as a standard. Reproducibility comparison between E-test and disk diffusion tests showed that they are equivalent and with good precision. Two interpretative susceptibility schemes (with or without an intermediate class) were compared by an interpretative error rate analysis method. The susceptibility classification scheme that included the intermediate category was retained, and breakpoints were assessed for diffusion assay with 5-μg metronidazole disks. Strains with inhibition zone diameters less than 16 mm were defined as resistant (MIC > 8 μg/ml), those with zone diameters equal to or greater than 16 mm but less than 21 mm were considered intermediate (4 μg/ml < MIC ≤ 8 μg/ml), and those with zone diameters of 21 mm or greater were regarded as susceptible (MIC ≤ 4 μg/ml). Error rate analysis applied to this classification scheme showed occurrence frequencies of 1% for major errors and 7% for minor errors, when the results were compared to those obtained by agar dilution. No very major errors were detected, suggesting that disk diffusion might be a good alternative for determining the metronidazole sensitivity of H. pylori strains. PMID:10203543
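
    The interpretative scheme and error-rate analysis above amount to comparing two classifications of each isolate; the sketch below encodes the quoted breakpoints and the usual very major/major/minor error definitions, with invented (zone diameter, MIC) pairs rather than the study's isolates.

      # Minimal sketch: disk diffusion breakpoints versus agar dilution reference.
      def class_by_zone(d_mm):
          return "R" if d_mm < 16 else ("I" if d_mm < 21 else "S")

      def class_by_mic(mic):
          return "S" if mic <= 4 else ("I" if mic <= 8 else "R")

      def error_type(reference, test):
          if reference == test:
              return None
          if reference == "R" and test == "S":
              return "very major"
          if reference == "S" and test == "R":
              return "major"
          return "minor"

      isolates = [(25, 1), (14, 32), (18, 6), (22, 8), (15, 16), (20, 2)]
      for zone, mic in isolates:
          ref, test = class_by_mic(mic), class_by_zone(zone)
          print(f"zone {zone} mm, MIC {mic}: disk={test}, agar dilution={ref}, "
                f"error={error_type(ref, test)}")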

  18. Quality of healthcare services and its relationship with patient safety culture and nurse-physician professional communication

    PubMed Central

    Ghahramanian, Akram; Rezaei, Tayyebeh; Abdullahzadeh, Farahnaz; Sheikhalipour, Zahra; Dianat, Iman

    2017-01-01

    Background: This study investigated quality of healthcare services from patients’ perspectives and its relationship with patient safety culture and nurse-physician professional communication. Methods: A cross-sectional study was conducted among 300 surgery patients and 101 nurses caring for them in a public hospital in Tabriz, Iran. Data were collected using the service quality measurement scale (SERVQUAL), the hospital survey on patient safety culture (HSOPSC) and the nurse-physician professional communication questionnaire. Results: The highest and lowest mean (±SD) scores of the patients’ perception of healthcare service quality belonged to the assurance (13.92 ± 3.55) and empathy (6.78 ± 1.88) domains, respectively. With regard to the patient safety culture, the mean percentage of positive answers ranged from 45.87% for the "non-punitive response to errors" domain to 68.21% for the "organizational continuous learning" domain. The highest and lowest mean (±SD) scores for nurse-physician professional communication were obtained for the "cooperation" (3.44 ± 0.35) and "non-participative decision-making" (2.84 ± 0.34) domains, respectively. The "frequency of reported errors by healthcare professionals" (B = -4.20, 95% CI = -7.14 to -1.27, P < 0.01) and "respect and sharing of information" (B = 7.69, 95% CI = 4.01 to 11.36, P < 0.001) predicted the patients’ perceptions of the quality of healthcare services. Conclusion: Organizational culture in dealing with medical error should be changed to a non-punitive response. Changes in safety culture towards reporting of errors, effective communication and teamwork between healthcare professionals are recommended. PMID:28695106

  19. Mammogram classification scheme using 2D-discrete wavelet and local binary pattern for detection of breast cancer

    NASA Astrophysics Data System (ADS)

    Adi Putra, Januar

    2018-04-01

    In this paper, we propose a new mammogram classification scheme to classify the breast tissues as normal or abnormal. A feature matrix is generated by applying the Local Binary Pattern to all the detail coefficients of the 2D-DWT of the region of interest (ROI) of a mammogram. Feature selection is done by selecting the relevant features that affect the classification; it is used to reduce the dimensionality of the data and discard features that are not relevant. In this paper the F-test and T-test are applied to the feature extraction results to reduce and select the relevant features. The best features are used in a Neural Network classifier for classification. In this research we use the MIAS and DDSM databases. In addition to the suggested scheme, competing schemes are also simulated for comparative analysis. It is observed that the proposed scheme performs better with respect to accuracy, specificity and sensitivity. Based on the experiments, the proposed scheme achieves an accuracy of up to 92.71%, while the lowest accuracy obtained is 77.08%.
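
    A minimal sketch of the described pipeline, assuming random arrays in place of MIAS/DDSM ROIs: LBP histograms computed on the 2D-DWT detail sub-bands, F-test feature selection, and a neural network classifier.

      # Minimal sketch: DWT detail sub-bands -> LBP histograms -> F-test feature
      # selection -> neural network. ROIs and labels are random placeholders.
      import numpy as np
      import pywt
      from skimage.feature import local_binary_pattern
      from sklearn.feature_selection import SelectKBest, f_classif
      from sklearn.neural_network import MLPClassifier
      from sklearn.pipeline import make_pipeline

      def roi_features(roi):
          _, (cH, cV, cD) = pywt.dwt2(roi, "db1")          # detail coefficients
          feats = []
          for band in (cH, cV, cD):
              lbp = local_binary_pattern(band, P=8, R=1, method="uniform")
              hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
              feats.extend(hist)
          return feats

      rng = np.random.default_rng(0)
      rois = rng.random((60, 64, 64))                      # 60 fake 64x64 ROIs
      y = rng.integers(0, 2, 60)                           # normal / abnormal labels

      X = np.array([roi_features(r) for r in rois])
      model = make_pipeline(SelectKBest(f_classif, k=15),
                            MLPClassifier(hidden_layer_sizes=(30,), max_iter=2000,
                                          random_state=0))
      model.fit(X, y)
      print("training accuracy:", model.score(X, y))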

  20. Dietary assessment of British police force employees: a description of diet record coding procedures and cross-sectional evaluation of dietary energy intake reporting (The Airwave Health Monitoring Study)

    PubMed Central

    Gibson, Rachel; Eriksen, Rebeca; Lamb, Kathryn; McMeel, Yvonne; Vergnaud, Anne-Claire; Spear, Jeanette; Aresu, Maria; Chan, Queenie; Elliott, Paul; Frost, Gary

    2017-01-01

    Objectives: Dietary intake is a key aspect of occupational health. To capture the characteristics of dietary behaviour that is affected by occupational environment that may affect disease risk, a collection of prospective multiday dietary records is required. The aims of this paper are to: (1) collect multiday dietary data in the Airwave Health Monitoring Study, (2) describe the dietary coding procedures applied and (3) investigate the plausibility of dietary reporting in this occupational cohort. Design: A dietary coding protocol for this large-scale study was developed to minimise coding error rate. Participants (n = 4412) who completed 7-day food records were included for cross-sectional analyses. Energy intake (EI) misreporting was estimated using the Goldberg method. Multivariate logistic regression models were applied to determine participant characteristics associated with EI misreporting. Setting: British police force employees enrolled (2007–2012) into the Airwave Health Monitoring Study. Results: The mean code error rate per food diary was 3.7% (SD 3.2%). The strongest predictors of EI under-reporting were body mass index (BMI) and physical activity. Compared with participants with BMI<25 kg/m2, those with BMI>30 kg/m2 had increased odds of being classified as under-reporting EI (men: OR 5.20, 95% CI 3.92 to 6.89; women: OR 2.66, 95% CI 1.85 to 3.83). Men and women in the highest physical activity category compared with the lowest were also more likely to be classified as under-reporting (men: OR 3.33, 95% CI 2.46 to 4.50; women: OR 4.34, 95% CI 2.91 to 6.55). Conclusions: A reproducible dietary record coding procedure has been developed to minimise coding error in complex 7-day diet diaries. The prevalence of EI under-reporting is comparable with existing national UK cohorts and, in agreement with previous studies, classification of under-reporting was biased towards specific subgroups of participants. PMID:28377391
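
    Goldberg-type screening reduces to comparing the ratio of reported energy intake to basal metabolic rate against cut-offs; in the sketch below the BMR values and the cut-off constants are illustrative assumptions, not the values used in the study.

      # Minimal sketch: flag implausible energy intake (EI) reports from the
      # EI/BMR ratio. Cut-offs and BMR values here are illustrative assumptions.
      def ei_reporting_status(ei_kcal, bmr_kcal, lower_cut=1.05, upper_cut=2.28):
          ratio = ei_kcal / bmr_kcal
          if ratio < lower_cut:
              return "under-reporter", ratio
          if ratio > upper_cut:
              return "over-reporter", ratio
          return "plausible", ratio

      participants = [
          {"id": "A", "ei": 1450, "bmr": 1600},   # likely under-report
          {"id": "B", "ei": 2300, "bmr": 1500},
          {"id": "C", "ei": 4200, "bmr": 1700},   # likely over-report
      ]
      for p in participants:
          status, ratio = ei_reporting_status(p["ei"], p["bmr"])
          print(f'{p["id"]}: EI/BMR = {ratio:.2f} -> {status}')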

  1. Prevalence of refractive error in Malay primary school children in suburban area of Kota Bharu, Kelantan, Malaysia.

    PubMed

    Hashim, Syaratul-Emma; Tan, Hui-Ken; Wan-Hazabbah, W H; Ibrahim, Mohtar

    2008-11-01

    Refractive error remains one of the primary causes of visual impairment in children worldwide, and the prevalence of refractive error varies widely. The objective of this study was to determine the prevalence of refractive error and study the possible associated factors inducing refractive error among primary school children of Malay ethnicity in the suburban area of Kota Bharu, Kelantan, Malaysia. A school-based cross-sectional study was performed from January to July 2006 by random selection of Standard 1 to Standard 6 students of 10 primary schools in the Kota Bharu district. Visual acuity was measured using the logMAR ETDRS chart. An uncorrected visual acuity equal to or worse than 20/40 was used as the cut-off point for further evaluation by automated refraction and retinoscopic refraction. A total of 840 students were enumerated but only 705 were examined. Uncorrected visual impairment was seen in 54 (7.7%) children. The main cause of uncorrected visual impairment was refractive error, which accounted for 90.7% of the total, giving a prevalence of 7.0% for the studied population. Myopia is the most common type of refractive error among children aged 6 to 12 years, with a prevalence of 5.4%, followed by hyperopia at 1.0% and astigmatism at 0.6%. A significant positive association was noted between myopia development and increasing age (P < 0.005), more hours spent reading books (P < 0.005), a history of siblings with glasses (P < 0.005) and parents with a higher educational level (P < 0.005). Malays in suburban Kelantan (5.4%) have the lowest prevalence of myopia compared with Malays in the metropolitan cities of Kuala Lumpur (9.2%) and Singapore (22.1%). The ethnicity-specific prevalence of myopia was the lowest among Malays in Kota Bharu, followed by Kuala Lumpur, and was the highest among Singaporean Malays. Better socio-economic factors could have contributed to higher myopia rates in the cities, since the genetic background of these ethnic Malays is similar.

  2. A SVM framework for fault detection of the braking system in a high speed train

    NASA Astrophysics Data System (ADS)

    Liu, Jie; Li, Yan-Fu; Zio, Enrico

    2017-03-01

    By April 2015, the number of operating High Speed Trains (HSTs) in the world had reached 3603. An efficient, effective and very reliable braking system is evidently critical for trains running at speeds around 300 km/h. Failure of a highly reliable braking system is a rare event and, consequently, informative recorded data on fault conditions are scarce. This renders the fault detection problem a classification problem with highly unbalanced data. In this paper, a Support Vector Machine (SVM) framework, including feature selection, feature vector selection, model construction and decision boundary optimization, is proposed for tackling this problem. Feature vector selection can largely reduce the data size and, thus, the computational burden. The constructed model is a modified version of the least squares SVM, in which a higher cost is assigned to misclassification of faulty conditions than to misclassification of normal conditions. The proposed framework is successfully validated on a number of public unbalanced datasets. Then, it is applied to the fault detection of braking systems in HSTs: in comparison with several SVM approaches for unbalanced datasets, the proposed framework gives better results.
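
    The cost-sensitive idea above can be approximated with a class-weighted kernel SVM; the sketch below uses synthetic unbalanced data and a standard SVC rather than the paper's modified least squares SVM.

      # Minimal sketch: penalize errors on the rare faulty class more heavily.
      import numpy as np
      from sklearn.svm import SVC
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import confusion_matrix

      rng = np.random.default_rng(0)
      X_normal = rng.normal(0.0, 1.0, size=(2000, 4))
      X_fault = rng.normal(2.0, 1.0, size=(40, 4))         # rare faulty condition
      X = np.vstack([X_normal, X_fault])
      y = np.array([0] * len(X_normal) + [1] * len(X_fault))

      Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

      # class_weight raises the penalty for misclassifying the minority (fault) class.
      clf = SVC(kernel="rbf", class_weight={0: 1.0, 1: 25.0}).fit(Xtr, ytr)
      print(confusion_matrix(yte, clf.predict(Xte)))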

  3. Semi-supervised anomaly detection - towards model-independent searches of new physics

    NASA Astrophysics Data System (ADS)

    Kuusela, Mikael; Vatanen, Tommi; Malmi, Eric; Raiko, Tapani; Aaltonen, Timo; Nagai, Yoshikazu

    2012-06-01

    Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors should this training data be systematically inaccurate, for example due to the assumed MC model. To complement such model-dependent searches, we propose an algorithm based on semi-supervised anomaly detection techniques, which does not require a MC training sample for the signal data. We first model the background using a multivariate Gaussian mixture model. We then search for deviations from this model by fitting to the observations a mixture of the background model and a number of additional Gaussians. This allows us to perform pattern recognition of any anomalous excess over the background. We show by a comparison to neural network classifiers that such an approach is considerably more robust against misspecification of the signal MC than supervised classification. In cases where there is an unexpected signal, a neural network might fail to correctly identify it, while anomaly detection does not suffer from such a limitation. On the other hand, when there are no systematic errors in the training data, both methods perform comparably.
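
    The background-modelling step can be sketched by fitting a Gaussian mixture to background-only data and flagging low-likelihood observations; the full method above additionally fits extra Gaussians to the observed excess, which this toy example omits.

      # Minimal sketch: Gaussian mixture background model plus likelihood-based
      # anomaly flagging on synthetic data.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(0)
      background = np.vstack([rng.normal(0, 1, (1500, 2)),
                              rng.normal(4, 1, (1500, 2))])   # two background modes
      signal = rng.normal([2.0, -3.0], 0.3, (50, 2))          # unexpected excess
      observed = np.vstack([background[:1000], signal])

      gmm = GaussianMixture(n_components=2, random_state=0).fit(background)
      log_lik = gmm.score_samples(observed)
      threshold = np.quantile(gmm.score_samples(background), 0.01)

      anomalies = observed[log_lik < threshold]
      print(f"flagged {len(anomalies)} of {len(observed)} events as anomalous")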

  4. A Q-backpropagated time delay neural network for diagnosing severity of gait disturbances in Parkinson's disease.

    PubMed

    Nancy Jane, Y; Khanna Nehemiah, H; Arputharaj, Kannan

    2016-04-01

    Parkinson's disease (PD) is a movement disorder that affects the patient's nervous system, and health-care applications mostly use wearable sensors to collect these data. Since these sensors generate time-stamped data, analyzing gait disturbances in PD becomes a challenging task. The objective of this paper is to develop an effective clinical decision-making system (CDMS) that aids the physician in diagnosing the severity of gait disturbances in PD affected patients. This paper presents a Q-backpropagated time delay neural network (Q-BTDNN) classifier that builds a temporal classification model, which performs the task of classification and prediction in the CDMS. The proposed Q-learning induced backpropagation (Q-BP) training algorithm trains the Q-BTDNN by generating a reinforced error signal. The network's weights are adjusted through backpropagating the generated error signal. For experimentation, the proposed work uses a PD gait database, which contains gait measures collected through wearable sensors from three different PD research studies. The experimental results prove the efficiency of Q-BP in terms of improved classification accuracies of 91.49%, 92.19% and 90.91% on the three datasets, respectively, compared to other neural network training algorithms. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Bayesian logistic regression approaches to predict incorrect DRG assignment.

    PubMed

    Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural

    2018-05-07

    Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification performance by 6% compared to maximum likelihood, with a 34% gain compared to random classification. We found that the original DRG, the coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already led to improved audit efficiency in an operational capacity.

  6. A data-driven modeling approach to stochastic computation for low-energy biomedical devices.

    PubMed

    Lee, Kyong Ho; Jang, Kuk Jin; Shoeb, Ali; Verma, Naveen

    2011-01-01

    Low-power devices that can detect clinically relevant correlations in physiologically-complex patient signals can enable systems capable of closed-loop response (e.g., controlled actuation of therapeutic stimulators, continuous recording of disease states, etc.). In ultra-low-power platforms, however, hardware error sources are becoming increasingly limiting. In this paper, we present how data-driven methods, which allow us to accurately model physiological signals, also allow us to effectively model and overcome prominent hardware error sources with nearly no additional overhead. Two applications, EEG-based seizure detection and ECG-based arrhythmia-beat classification, are synthesized to a logic-gate implementation, and two prominent error sources are introduced: (1) SRAM bit-cell errors and (2) logic-gate switching errors ('stuck-at' faults). Using patient data from the CHB-MIT and MIT-BIH databases, performance similar to error-free hardware is achieved even for very high fault rates (up to 0.5 for SRAMs and 7 × 10^-2 for logic) that cause computational bit error rates as high as 50%.

  7. Association of medication errors with drug classifications, clinical units, and consequence of errors: Are they related?

    PubMed

    Muroi, Maki; Shen, Jay J; Angosta, Alona

    2017-02-01

    Registered nurses (RNs) play an important role in safe medication administration and patient safety. This study examined a total of 1276 medication error (ME) incident reports made by RNs in hospital inpatient settings in the southwestern region of the United States. The most common drug class associated with MEs was cardiovascular drugs (24.7%). Among this class, anticoagulants had the most errors (11.3%). Antimicrobials were the second most common drug class associated with errors (19.1%), and vancomycin was the antimicrobial most commonly involved in errors in this category (6.1%). MEs occurred more frequently in the medical-surgical and intensive care units than in any other hospital units. Ten percent of MEs reached the patient and caused harm, and 11% reached the patient and required increased monitoring. Understanding the contributing factors related to MEs, addressing and eliminating the risk of errors across hospital units, and providing education and resources for nurses may help reduce MEs. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Classification Management. Journal of the National Classification Management Society, Volume 18, 1982,

    DTIC Science & Technology

    1983-01-01

    changes. Concurrently, CIA formed an Ad Hoc Intelligence Community Working Group to re... administrative error; to prevent embarrassment to a person, organization, or agency; to restrain competition; or to prevent or delay the public release of information... ...esting to step back and look at the U.S. security... expected damage will be. If you foresee the damage, the decision will be to classify the information. But note that in this thought process, you

  9. Validation of tool mark analysis of cut costal cartilage.

    PubMed

    Love, Jennifer C; Derrick, Sharon M; Wiersema, Jason M; Peters, Charles

    2012-03-01

    This study was designed to establish the potential error rate associated with the generally accepted method of tool mark analysis of cut marks in costal cartilage. Three knives with different blade types were used to make experimental cut marks in costal cartilage of pigs. Each cut surface was cast, and each cast was examined by three analysts working independently. The presence of striations, regularity of striations, and presence of a primary and secondary striation pattern were recorded for each cast. The distance between each striation was measured. The results showed that striations were not consistently impressed on the cut surface by the blade's cutting edge. Also, blade type classification by the presence or absence of striations led to a 65% misclassification rate. Use of the classification tree and cross-validation methods and inclusion of the mean interstriation distance decreased the error rate to c. 50%. © 2011 American Academy of Forensic Sciences.

  10. Automatic and semi-automatic approaches for arteriolar-to-venular computation in retinal photographs

    NASA Astrophysics Data System (ADS)

    Mendonça, Ana Maria; Remeseiro, Beatriz; Dashtbozorg, Behdad; Campilho, Aurélio

    2017-03-01

    The Arteriolar-to-Venular Ratio (AVR) is a popular dimensionless measure which allows the assessment of patients' condition for the early diagnosis of different diseases, including hypertension and diabetic retinopathy. This paper presents two new approaches for AVR computation in retinal photographs which include a sequence of automated processing steps: vessel segmentation, caliber measurement, optic disc segmentation, artery/vein classification, region of interest delineation, and AVR calculation. Both approaches have been tested on the INSPIRE-AVR dataset and compared with a ground truth provided by two medical specialists. The obtained results demonstrate the reliability of the fully automatic approach, which provides AVR ratios very similar to those of at least one of the observers. Furthermore, the semi-automatic approach, which includes the manual modification of the artery/vein classification if needed, allows the error to be reduced significantly, to a level below the human error.

  11. Bayes classification of terrain cover using normalized polarimetric data

    NASA Technical Reports Server (NTRS)

    Yueh, H. A.; Swartz, A. A.; Kong, J. A.; Shin, R. T.; Novak, L. M.

    1988-01-01

    The normalized polarimetric classifier (NPC) which uses only the relative magnitudes and phases of the polarimetric data is proposed for discrimination of terrain elements. The probability density functions (PDFs) of polarimetric data are assumed to have a complex Gaussian distribution, and the marginal PDF of the normalized polarimetric data is derived by adopting the Euclidean norm as the normalization function. The general form of the distance measure for the NPC is also obtained. It is demonstrated that for polarimetric data with an arbitrary PDF, the distance measure of NPC will be independent of the normalization function selected even when the classifier is mistrained. A complex Gaussian distribution is assumed for the polarimetric data consisting of grass and tree regions. The probability of error for the NPC is compared with those of several other single-feature classifiers. The classification error of NPCs is shown to be independent of the normalization function.

  12. Accuracy of heart rate variability estimation by photoplethysmography using a smartphone: Processing optimization and fiducial point selection.

    PubMed

    Ferrer-Mileo, V; Guede-Fernandez, F; Fernandez-Chimeno, M; Ramos-Castro, J; Garcia-Gonzalez, M A

    2015-08-01

    This work compares several fiducial points for detecting the arrival of a new pulse in a photoplethysmographic signal acquired with either the built-in camera of a smartphone or a photoplethysmograph. An optimization of the signal preprocessing stage has also been performed. Finally, we characterize the error produced when using the best cutoff frequencies and fiducial point for the smartphone and the photoplethysmograph, and examine whether the smartphone error can reasonably be explained by variations in pulse transit time. The results reveal that the peak of the first derivative and the minimum of the second derivative of the pulse wave have the lowest error. Moreover, for these points, high-pass filtering the signal with a cutoff between 0.1 and 0.8 Hz and low-pass filtering around 2.7 Hz or 3.5 Hz give the best results. Finally, the error in smartphones is slightly higher than in a photoplethysmograph.

  13. The intercrater plains of Mercury and the Moon: Their nature, origin and role in terrestrial planet evolution. Measurement and errors of crater statistics. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Leake, M. A.

    1982-01-01

    Planetary imagery techniques, errors in measurement or degradation assignment, and statistical formulas are presented with respect to cratering data. Base map photograph preparation, measurement of crater diameters and sampled area, and instruments used are discussed. Possible uncertainties, such as Sun angle, scale factors, degradation classification, and biases in crater recognition are discussed. The mathematical formulas used in crater statistics are presented.

  14. Radiomic machine-learning classifiers for prognostic biomarkers of advanced nasopharyngeal carcinoma.

    PubMed

    Zhang, Bin; He, Xin; Ouyang, Fusheng; Gu, Dongsheng; Dong, Yuhao; Zhang, Lu; Mo, Xiaokai; Huang, Wenhui; Tian, Jie; Zhang, Shuixing

    2017-09-10

    We aimed to identify optimal machine-learning methods for radiomics-based prediction of local failure and distant failure in advanced nasopharyngeal carcinoma (NPC). We enrolled 110 patients with advanced NPC, and a total of 970 radiomic features were extracted from MRI images for each patient. Six feature selection methods and nine classification methods were evaluated in terms of their performance. We applied 10-fold cross-validation as the criterion for feature selection and classification, and repeated each combination 50 times to obtain the mean area under the curve (AUC) and test error. We observed that the combination of Random Forest (RF) feature selection with an RF classifier (RF + RF; AUC, 0.8464 ± 0.0069; test error, 0.3135 ± 0.0088) had the highest prognostic performance, followed by RF + Adaptive Boosting (AdaBoost) (AUC, 0.8204 ± 0.0095; test error, 0.3384 ± 0.0097) and Sure Independence Screening (SIS) + Linear Support Vector Machines (LSVM) (AUC, 0.7883 ± 0.0096; test error, 0.3985 ± 0.0100). Our radiomics study identified optimal machine-learning methods for the radiomics-based prediction of local failure and distant failure in advanced NPC, which could enhance the applications of radiomics in precision oncology and clinical practice. Copyright © 2017 Elsevier B.V. All rights reserved.
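    As an illustration of the kind of pipeline evaluated above, the following is a minimal sketch of random-forest feature selection feeding a random-forest classifier under repeated stratified 10-fold cross-validation. The arrays, feature counts, and hyper-parameters are placeholders, not the study's data or settings, and the number of repeats is reduced for brevity.

```python
# Hypothetical sketch: RF-based feature selection + RF classifier evaluated with
# repeated stratified 10-fold cross-validation (the study repeated 50 times).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(110, 970))     # stand-in for 970 radiomic features per patient
y = rng.integers(0, 2, size=110)    # stand-in for failure / no-failure labels

pipe = Pipeline([
    ("select", SelectFromModel(RandomForestClassifier(n_estimators=200, random_state=0))),
    ("clf", RandomForestClassifier(n_estimators=200, random_state=0)),
])

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)
auc = cross_val_score(pipe, X, y, scoring="roc_auc", cv=cv)
print(f"mean AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```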

  15. Flavour and identification threshold detection overview of Slovak adepts for certified testing.

    PubMed

    Vietoris, Vladimir; Barborova, Petra; Jancovicova, Jana; Eliasova, Lucia; Karvaj, Marian

    2016-07-01

    During the certification process for sensory assessors run by the Slovak certification body, we obtained results on basic taste thresholds and lifestyle habits. Five hundred adults with a food-industry background were screened during the experiment. For analysis of basic and non-basic tastes, we used the standardized procedure of ISO 8586-1:1993. In the flavour test, the 26-35 y.o. group produced the lowest error ratio (1.438), and the 56+ y.o. group the highest (2.0). The average error value was 1.510 for women compared with 1.477 for men. People with allergies had an average error ratio of 1.437 compared with 1.511 for people without allergies. Non-smokers produced fewer errors (1.484) than smokers (1.576). A further flavour threshold identification test detected differences among age groups, with values increasing with age. For the metallic taste, men made errors at a similar rate (24%) to women (22%); for the salty taste, men made more errors (19%) than women (10%). The analysis detected some differences between allergic/non-allergic and smoker/non-smoker groups.

  16. Reliability and validity of a short form household food security scale in a Caribbean community.

    PubMed

    Gulliford, Martin C; Mahabir, Deepak; Rocke, Brian

    2004-06-16

    We evaluated the reliability and validity of the short form household food security scale in a different setting from the one in which it was developed. The scale was interview-administered to 531 subjects from 286 households in north central Trinidad in Trinidad and Tobago, West Indies. We evaluated the six items by fitting item response theory models to estimate item thresholds, estimating agreement among respondents in the same households and estimating the slope index of income-related inequality (SII) after adjusting for age, sex and ethnicity. Item-score correlations ranged from 0.52 to 0.79 and Cronbach's alpha was 0.87. Item responses gave within-household correlation coefficients ranging from 0.70 to 0.78. Estimated item thresholds (standard errors) from the Rasch model ranged from -2.027 (0.063) for the 'balanced meal' item to 2.251 (0.116) for the 'hungry' item. The 'balanced meal' item had the lowest threshold in each ethnic group even though there was evidence of differential functioning for this item by ethnicity. Relative thresholds of other items were generally consistent with US data. Estimation of the SII, comparing those at the bottom with those at the top of the income scale, gave relative odds for an affirmative response of 3.77 (95% confidence interval 1.40 to 10.2) for the lowest severity item, and 20.8 (2.67 to 162.5) for the highest severity item. Food insecurity was associated with reduced consumption of green vegetables after additionally adjusting for income and education (0.52, 0.28 to 0.96). The household food security scale gives reliable and valid responses in this setting. Differing relative item thresholds compared with US data do not require alteration to the cut-points for classification of 'food insecurity without hunger' or 'food insecurity with hunger'. The data provide further evidence that re-evaluation of the 'balanced meal' item is required.

  17. High Dimensional Classification Using Features Annealed Independence Rules.

    PubMed

    Fan, Jianqing; Fan, Yingying

    2008-01-01

    Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra, and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as badly as random guessing. Thus, it is critically important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.

  18. A label distance maximum-based classifier for multi-label learning.

    PubMed

    Liu, Xiaoli; Bao, Hang; Zhao, Dazhe; Cao, Peng

    2015-01-01

    Multi-label classification is useful in many bioinformatics tasks such as gene function prediction and protein site localization. This paper presents an improved neural network algorithm, the Max Label Distance Back Propagation Algorithm, for multi-label classification. The method is formulated by modifying the total error function of standard BP with an added penalty term, realized by maximizing the distance between the positive and negative labels. Extensive experiments were conducted to compare this method against state-of-the-art multi-label methods on three popular bioinformatics benchmark datasets. The results illustrate that the proposed method is more effective for bioinformatics multi-label classification than commonly used techniques.
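    The following is a generic sketch of the idea described above: a standard multi-label back-propagation loss plus a penalty that rewards a large gap between positive-label and negative-label outputs. It is an illustrative reading of the abstract, not the authors' exact error function; the network, the weighting factor lam, and the data are assumptions.

```python
# Illustrative multi-label loss with a "label distance" penalty term (assumed form).
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def max_label_distance_loss(logits, targets, lam=0.1):
    """logits, targets: (batch, n_labels); targets are 0/1 multi-label indicators."""
    base = bce(logits, targets)
    pos = logits[targets > 0.5]
    neg = logits[targets < 0.5]
    if pos.numel() == 0 or neg.numel() == 0:   # degenerate batch: skip the penalty
        return base
    # Penalty: encourage a large gap between positive- and negative-label scores.
    return base - lam * (pos.mean() - neg.mean())

# Example: one backward pass through a tiny two-layer network.
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
x = torch.randn(8, 20)
y = (torch.rand(8, 5) > 0.7).float()
loss = max_label_distance_loss(net(x), y)
loss.backward()
```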

  19. Use of streamflow data to estimate base flow/ground-water recharge for Wisconsin

    USGS Publications Warehouse

    Gebert, W.A.; Radloff, M.J.; Considine, E.J.; Kennedy, J.L.

    2007-01-01

    The average annual base flow/recharge was determined for streamflow-gaging stations throughout Wisconsin by base-flow separation. A map of the State was prepared that shows the average annual base flow for the period 1970-99 for watersheds at 118 gaging stations. Trend analysis was performed on 22 of the 118 streamflow-gaging stations that had long-term records and unregulated flow and that provided areal coverage of the State. The analysis found that a statistically significant increasing trend was occurring for watersheds where the primary land use was agriculture. Most gaging stations where the land cover was forest had no significant trend. A method to estimate the average annual base flow at ungaged sites was developed by multiple-regression analysis using basin characteristics. The equation with the lowest standard error of estimate, 9.5%, has drainage area, soil infiltration, and base-flow factor as independent variables. To determine the average annual base flow for smaller watersheds, estimates were made at low-flow partial-record stations in 3 of the 12 major river basins in Wisconsin. Regression equations were developed for each of the three major river basins using basin characteristics. Drainage area, soil infiltration, basin storage, and base-flow factor were the independent variables in the regression equations with the lowest standard error of estimate. The standard error of estimate ranged from 17% to 52% for the three river basins. © 2007 American Water Resources Association.

  20. Lacie phase 1 Classification and Mensuration Subsystem (CAMS) rework experiment

    NASA Technical Reports Server (NTRS)

    Chhikara, R. S.; Hsu, E. M.; Liszcz, C. J.

    1976-01-01

    An experiment was designed to test the ability of the Classification and Mensuration Subsystem rework operations to improve wheat proportion estimates for segments that had been processed previously. Sites selected for the experiment included three in Kansas and three in Texas, with the remaining five distributed in Montana and North and South Dakota. The acquisition dates were selected to be representative of imagery available in actual operations. No more than one acquisition per biophase was used, and biophases were determined by actual crop calendars. All sites were worked by each of four Analyst-Interpreter/Data Processing Analyst Teams, who reviewed the initial processing of each segment and accepted or reworked it for an estimate of the proportion of small grains in the segment. Classification results, acquisitions, classification errors, and performance results between CAMS regular and ITS rework are tabulated.

  1. Sample Errors Call Into Question Conclusions Regarding Same-Sex Married Parents: A Comment on "Family Structure and Child Health: Does the Sex Composition of Parents Matter?"

    PubMed

    Paul Sullins, D

    2017-12-01

    Because of classification errors reported by the National Center for Health Statistics, an estimated 42 % of the same-sex married partners in the sample for this study are misclassified different-sex married partners, thus calling into question findings regarding same-sex married parents. Including biological parentage as a control variable suppresses same-sex/different-sex differences, thus obscuring the data error. Parentage is not appropriate as a control because it correlates nearly perfectly (+.97, gamma) with the same-sex/different-sex distinction and is invariant for the category of joint biological parents.

  2. Image Augmentation for Object Image Classification Based On Combination of Pre-Trained CNN and SVM

    NASA Astrophysics Data System (ADS)

    Shima, Yoshihiro

    2018-04-01

    Neural networks are a powerful means of classifying object images. The proposed image category classification method for object images combines convolutional neural networks (CNNs) and support vector machines (SVMs). A pre-trained CNN, Alex-Net, is used as a pattern-feature extractor; instead of being trained from scratch, it is used with weights pre-trained on the large-scale object-image dataset ImageNet. An SVM is used as the trainable classifier, and the feature vectors are passed from Alex-Net to the SVM. The STL-10 dataset, with ten classes, is used for the object images, and training and test samples are clearly split. The STL-10 object images are trained by the SVM with data augmentation. We use a pattern transformation method based on the cosine function, and also apply other augmentation methods such as rotation, skewing, and elastic distortion. Using the cosine function, the original patterns were left-justified, right-justified, top-justified, or bottom-justified; patterns were also center-justified and enlarged. The test error rate is decreased by 0.435 percentage points from 16.055% by augmentation with the cosine transformation, whereas error rates increase with the other augmentation methods (rotation, skewing, and elastic distortion) compared with no augmentation. The number of augmented samples is 30 times that of the original 5k STL-10 training samples. The experimental test error rate for the 8k STL-10 test images was 15.620%, which shows that image augmentation is effective for image category classification.
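    A minimal sketch of the pre-trained-CNN-plus-SVM idea described above, assuming torchvision >= 0.13 and scikit-learn: AlexNet (pre-trained on ImageNet) is truncated to its penultimate layer and used as a fixed 4096-dimensional feature extractor for a linear SVM. The dataset loading and augmentation steps are omitted, and the training call is shown only as a commented example.

```python
# Hypothetical sketch: pre-trained AlexNet as a fixed feature extractor feeding a linear SVM.
import torch
import torch.nn as nn
from torchvision import models, transforms
from sklearn.svm import LinearSVC

alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
# Drop the final fully connected layer so the model outputs 4096-d features.
alexnet.classifier = nn.Sequential(*list(alexnet.classifier.children())[:-1])
alexnet.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_features(images):
    """images: list of PIL images; returns an (N, 4096) numpy array of CNN features."""
    batch = torch.stack([preprocess(im) for im in images])
    with torch.no_grad():
        return alexnet(batch).numpy()

# train_images/train_labels and test_images/test_labels are assumed to come from
# STL-10 (e.g. torchvision.datasets.STL10), possibly after augmentation:
# svm = LinearSVC(C=1.0).fit(extract_features(train_images), train_labels)
# accuracy = svm.score(extract_features(test_images), test_labels)
```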

  3. Automated detection of cloud and cloud-shadow in single-date Landsat imagery using neural networks and spatial post-processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hughes, Michael J.; Hayes, Daniel J

    2014-01-01

    Use of Landsat data to answer ecological questions is contingent on the effective removal of cloud and cloud shadow from satellite images. We develop a novel algorithm to identify and classify clouds and cloud shadow, SPARCS: Spatial Procedures for Automated Removal of Cloud and Shadow. The method uses neural networks to determine cloud, cloud-shadow, water, snow/ice, and clear-sky membership of each pixel in a Landsat scene, and then applies a set of procedures to enforce spatial rules. In a comparison to FMask, a high-quality cloud and cloud-shadow classification algorithm currently available, SPARCS performs favorably, with similar omission errors for clouds (0.8% and 0.9%, respectively), substantially lower omission error for cloud-shadow (8.3% and 1.1%), and fewer errors of commission (7.8% and 5.0%). Additionally, SPARCS provides a measure of uncertainty in its classification that can be exploited by other processes that use the cloud and cloud-shadow detection. To illustrate this, we present an application that constructs obstruction-free composites of images acquired on different dates in support of algorithms detecting vegetation change.

  4. Discrimination of Aspergillus isolates at the species and strain level by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry fingerprinting.

    PubMed

    Hettick, Justin M; Green, Brett J; Buskirk, Amanda D; Kashon, Michael L; Slaven, James E; Janotka, Erika; Blachere, Francoise M; Schmechel, Detlef; Beezhold, Donald H

    2008-09-15

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) was used to generate highly reproducible mass spectral fingerprints for 12 species of fungi of the genus Aspergillus and 5 different strains of Aspergillus flavus. Prior to MALDI-TOF MS analysis, the fungi were subjected to three 1-min bead beating cycles in an acetonitrile/trifluoroacetic acid solvent. The mass spectra contain abundant peaks in the range of 5 to 20 kDa and may be used to discriminate between species unambiguously. A discriminant analysis using all peaks from the MALDI-TOF MS data yielded classification error rates of 0% and 18.75% for the resubstitution and cross-validation methods, respectively. If a subset of 28 significant peaks is chosen, resubstitution and cross-validation error rates are 0%. Discriminant analysis of the MALDI-TOF MS data for 5 strains of A. flavus using all peaks yielded classification error rates of 0% and 5% for the resubstitution and cross-validation methods, respectively. These data indicate that MALDI-TOF MS data may be used for unambiguous identification of members of the genus Aspergillus at both the species and strain levels.

  5. Identifying medication error chains from critical incident reports: a new analytic approach.

    PubMed

    Huckels-Baumgart, Saskia; Manser, Tanja

    2014-10-01

    Research into the distribution of medication errors usually focuses on isolated stages within the medication use process. Our study aimed to provide a novel process-oriented approach to medication incident analysis focusing on medication error chains. The study was conducted at a 900-bed teaching hospital in Switzerland. All 1,591 medication errors reported from 2009 to 2012 were categorized using the Medication Error Index NCC MERP and the WHO Classification for Patient Safety Methodology. In order to identify medication error chains, each reported medication incident was allocated to the relevant stage of the hospital medication use process. Only 25.8% of the reported medication errors were detected before they propagated through the medication use process. The majority of medication errors (74.2%) formed an error chain encompassing two or more stages. The most frequent error chain comprised preparation up to and including medication administration (45.2%). "Non-consideration of documentation/prescribing" during drug preparation was the most frequent contributor to "wrong dose" during medication administration. Medication error chains provide important insights for detecting and stopping medication errors before they reach the patient. Existing and new safety barriers need to be extended to interrupt error chains and to improve patient safety. © 2014, The American College of Clinical Pharmacology.

  6. Effects of dynamic aeroelasticity on handling qualities and pilot rating

    NASA Technical Reports Server (NTRS)

    Swaim, R. L.; Yen, W.-Y.

    1978-01-01

    Pilot performance parameters, such as pilot ratings, tracking errors, and pilot comments, were recorded and analyzed for a longitudinal pitch tracking task on a large, flexible aircraft. The tracking task was programmed on a fixed-base simulator with a CRT attitude director display of pitch angle command, pitch angle, and pitch angle error. Parametric variations in the undamped natural frequencies of the two lowest frequency symmetric elastic modes were made to induce varying degrees of rigid body and elastic mode interaction. The results indicate that such mode interaction can drastically affect the handling qualities and pilot ratings of the task.

  7. Methods for data classification

    DOEpatents

    Garrity, George [Okemos, MI; Lilburn, Timothy G [Front Royal, VA

    2011-10-11

    The present invention provides methods for classifying data and uncovering and correcting annotation errors. In particular, the present invention provides a self-organizing, self-correcting algorithm for use in classifying data. Additionally, the present invention provides a method for classifying biological taxa.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yan, H; Chen, Z; Nath, R

    Purpose: kV fluoroscopic imaging combined with MV treatment beam imaging has been investigated for intrafractional motion monitoring and correction. It is, however, subject to additional kV imaging dose to normal tissue. To balance tracking accuracy and imaging dose, we previously proposed an adaptive imaging strategy to dynamically decide future imaging type and moments based on motion tracking uncertainty. kV imaging may be used continuously for maximal accuracy or only when the position uncertainty (probability of out of threshold) is high if a preset imaging dose limit is considered. In this work, we propose more accurate methods to estimate tracking uncertainty through analyzing acquired data in real-time. Methods: We simulated motion tracking process based on a previously developed imaging framework (MV + initial seconds of kV imaging) using real-time breathing data from 42 patients. Motion tracking errors for each time point were collected together with the time point's corresponding features, such as tumor motion speed and 2D tracking error of previous time points, etc. We tested three methods for error uncertainty estimation based on the features: conditional probability distribution, logistic regression modeling, and support vector machine (SVM) classification to detect errors exceeding a threshold. Results: For conditional probability distribution, polynomial regressions on three features (previous tracking error, prediction quality, and cosine of the angle between the trajectory and the treatment beam) showed strong correlation with the variation (uncertainty) of the mean 3D tracking error and its standard deviation: R-square = 0.94 and 0.90, respectively. The logistic regression and SVM classification successfully identified about 95% of tracking errors exceeding the 2.5 mm threshold. Conclusion: The proposed methods can reliably estimate the motion tracking uncertainty in real-time, which can be used to guide adaptive additional imaging to confirm the tumor is within the margin or initialize motion compensation if it is out of the margin.

  9. Pattern Classification of Endocervical Adenocarcinoma: Reproducibility and Review of Criteria

    PubMed Central

    Rutgers, Joanne K.L.; Roma, Andres; Park, Kay; Zaino, Richard J.; Johnson, Abbey; Alvarado, Isabel; Daya, Dean; Rasty, Golnar; Longacre, Teri; Ronnett, Brigitte; Silva, Elvio

    2017-01-01

    Previously, our international team proposed a 3-tiered pattern classification (Pattern Classification) system for endocervical adenocarcinoma of the usual type that correlates with nodal disease and recurrence. Pattern Classification A tumors have well-demarcated glands lacking destructive stromal invasion or lymphovascular invasion, Pattern Classification B tumors show localized, limited destructive invasion arising from A-type glands, and Pattern Classification C tumors have diffuse destructive stromal invasion, significant confluence (filling a 4× field), or solid architecture. Twenty-four Pattern Classification A, 22 Pattern Classification B, and 38 Pattern Classification C cases from the tumor set used in the original description were chosen using the reference diagnosis originally established. One H&E slide per case was reviewed by 7 gynecologic pathologists, 4 from the original study. Kappa statistics were prepared, and cases with discrepancies were reviewed. We found majority agreement with the reference diagnosis in 81% of cases, with complete or near-complete (6 of 7) agreement in 50%. Overall concordance was 74%. The overall kappa (agreement among pathologists) was 0.488 (moderate agreement). Pattern Classification B had the lowest kappa, and agreement was not improved by combining B and C. Six of 7 reviewers had substantial agreement by weighted kappas (>0.6), with one reviewer accounting for the majority of cases under- or overcalled by 2 tiers. Confluence filling a 4× field, labyrinthine glands, or solid architecture accounted for undercalling of other reference diagnosis C cases. Missing a few individually infiltrative cells was the most common cause of undercalling reference diagnosis B. Small foci of inflamed, loose, or desmoplastic stroma lacking infiltrative tumor cells in reference diagnosis A cases appeared to account for those cases upgraded to Pattern Classification B. In summary, an overall concordance of 74% indicates that the criteria can be reproducibly applied by gynecologic pathologists. Further refinement of criteria should allow use of this powerful classification system to delineate which cervical adenocarcinomas can be safely treated conservatively. PMID:27255163

  10. Real-time Neuroimaging and Cognitive Monitoring Using Wearable Dry EEG

    PubMed Central

    Mullen, Tim R.; Kothe, Christian A.E.; Chi, Mike; Ojeda, Alejandro; Kerth, Trevor; Makeig, Scott; Jung, Tzyy-Ping; Cauwenberghs, Gert

    2015-01-01

    Goal We present and evaluate a wearable high-density dry electrode EEG system and an open-source software framework for online neuroimaging and state classification. Methods The system integrates a 64-channel dry EEG form-factor with wireless data streaming for online analysis. A real-time software framework is applied, including adaptive artifact rejection, cortical source localization, multivariate effective connectivity inference, data visualization, and cognitive state classification from connectivity features using a constrained logistic regression approach (ProxConn). We evaluate the system identification methods on simulated 64-channel EEG data. Then we evaluate system performance, using ProxConn and a benchmark ERP method, in classifying response errors in 9 subjects using the dry EEG system. Results Simulations yielded high accuracy (AUC=0.97±0.021) for real-time cortical connectivity estimation. Response error classification using cortical effective connectivity (sdDTF) was significantly above chance with similar performance (AUC) for cLORETA (0.74±0.09) and LCMV (0.72±0.08) source localization. Cortical ERP-based classification was equivalent to ProxConn for cLORETA (0.74±0.16) but significantly better for LCMV (0.82±0.12). Conclusion We demonstrated the feasibility for real-time cortical connectivity analysis and cognitive state classification from high-density wearable dry EEG. Significance This paper is the first validated application of these methods to 64-channel dry EEG. The work addresses a need for robust real-time measurement and interpretation of complex brain activity in the dynamic environment of the wearable setting. Such advances can have broad impact in research, medicine, and brain-computer interfaces. The pipelines are made freely available in the open-source SIFT and BCILAB toolboxes. PMID:26415149

  11. Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology.

    PubMed

    Zheng, Ling; Yumak, Hasan; Chen, Ling; Ochs, Christopher; Geller, James; Kapusnik-Uner, Joan; Perl, Yehoshua

    2017-09-01

    The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of action, dosage form and physiological effects. Within NDF-RT such information is represented using tens of thousands of roles connecting drugs to classifications. In previous studies, we have introduced various kinds of Abstraction Networks to summarize the content and structure of terminologies in order to facilitate their visual comprehension, and support quality assurance of terminologies. However, these previous kinds of Abstraction Networks are not appropriate for summarizing the NDF-RT classification hierarchies, due to its unique structure. In this paper, we present the novel Ingredient Abstraction Network (IAbN) to summarize, visualize and support the audit of NDF-RT's Chemical Ingredients hierarchy and its associated drugs. A common theme in our quality assurance framework is to use characterizations of sets of concepts, revealed by the Abstraction Network structure, to capture concepts, the modeling of which is more complex than for other concepts. For the IAbN, we characterize drug ingredient concepts as more complex if they belong to IAbN groups with multiple parent groups. We show that such concepts have a statistically significantly higher rate of errors than a control sample and identify two especially common patterns of errors. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Improvement in defect classification efficiency by grouping disposition for reticle inspection

    NASA Astrophysics Data System (ADS)

    Lai, Rick; Hsu, Luke T. H.; Chang, Peter; Ho, C. H.; Tsai, Frankie; Long, Garrett; Yu, Paul; Miller, John; Hsu, Vincent; Chen, Ellison

    2005-11-01

    As the lithography design rule of IC manufacturing continues to migrate toward more advanced technology nodes, the mask error enhancement factor (MEEF) increases and necessitates the use of aggressive OPC features. These aggressive OPC features pose challenges to reticle inspection due to high false detection, which is time-consuming for defect classification and impacts the throughput of mask manufacturing. Moreover, higher MEEF leads to stricter mask defect capture criteria so that new generation reticle inspection tool is equipped with better detection capability. Hence, mask process induced defects, which were once undetectable, are now detected and results in the increase of total defect count. Therefore, how to review and characterize reticle defects efficiently is becoming more significant. A new defect review system called ReviewSmart has been developed based on the concept of defect grouping disposition. The review system intelligently bins repeating or similar defects into defect groups and thus allows operators to review massive defects more efficiently. Compared to the conventional defect review method, ReviewSmart not only reduces defect classification time and human judgment error, but also eliminates desensitization that is formerly inevitable. In this study, we attempt to explore the most efficient use of ReviewSmart by evaluating various defect binning conditions. The optimal binning conditions are obtained and have been verified for fidelity qualification through inspection reports (IRs) of production masks. The experiment results help to achieve the best defect classification efficiency when using ReviewSmart in the mask manufacturing and development.

  13. A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield

    PubMed Central

    Ringard, Justine; Seyler, Frederique; Linguet, Laurent

    2017-01-01

    Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two set approach, without taking into account the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale. PMID:28621723

  14. A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield.

    PubMed

    Ringard, Justine; Seyler, Frederique; Linguet, Laurent

    2017-06-16

    Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two set approach, without taking into account the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale.
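    A minimal sketch of empirical quantile mapping, the core correction step named above, assuming daily satellite (SPP) and gauge rainfall series for one hydroclimatic area; the synthetic data and parameters are placeholders, not the Guiana Shield datasets.

```python
# Minimal empirical quantile-mapping sketch for bias-correcting satellite rainfall.
import numpy as np

def quantile_map(spp_train, gauge_train, spp_new, n_quantiles=100):
    """Map new SPP values onto the gauge distribution via matched quantiles."""
    q = np.linspace(0.0, 1.0, n_quantiles)
    spp_q = np.quantile(spp_train, q)      # SPP quantiles over the calibration period
    gauge_q = np.quantile(gauge_train, q)  # gauge quantiles over the same period
    # Each new SPP value is placed on the SPP quantile curve and read off against
    # the corresponding gauge value (linear interpolation between quantiles).
    return np.interp(spp_new, spp_q, gauge_q)

# Example with synthetic data: a wet-biased, noisy satellite product.
rng = np.random.default_rng(1)
gauge = rng.gamma(shape=2.0, scale=5.0, size=3000)
spp = gauge * 1.3 + rng.normal(0.0, 1.0, size=3000)
corrected = quantile_map(spp, gauge, spp)
print("mean bias before:", spp.mean() - gauge.mean(), "after:", corrected.mean() - gauge.mean())
```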

  15. An investigation of the usability of sound recognition for source separation of packaging wastes in reverse vending machines.

    PubMed

    Korucu, M Kemal; Kaplan, Özgür; Büyük, Osman; Güllü, M Kemal

    2016-10-01

    In this study, we investigate the usability of sound recognition for source separation of packaging wastes in reverse vending machines (RVMs). For this purpose, an experimental setup equipped with a sound recording mechanism was prepared. Packaging waste sounds generated by three physical impacts such as free falling, pneumatic hitting and hydraulic crushing were separately recorded using two different microphones. To classify the waste types and sizes based on sound features of the wastes, a support vector machine (SVM) and a hidden Markov model (HMM) based sound classification systems were developed. In the basic experimental setup in which only free falling impact type was considered, SVM and HMM systems provided 100% classification accuracy for both microphones. In the expanded experimental setup which includes all three impact types, material type classification accuracies were 96.5% for dynamic microphone and 97.7% for condenser microphone. When both the material type and the size of the wastes were classified, the accuracy was 88.6% for the microphones. The modeling studies indicated that hydraulic crushing impact type recordings were very noisy for an effective sound recognition application. In the detailed analysis of the recognition errors, it was observed that most of the errors occurred in the hitting impact type. According to the experimental results, it can be said that the proposed novel approach for the separation of packaging wastes could provide a high classification performance for RVMs. Copyright © 2016 Elsevier Ltd. All rights reserved.
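    As an illustration of the sound-classification side of such a system, here is a hedged sketch of an SVM material classifier built on MFCC summaries of impact recordings. The file paths, labels, and feature choices are assumptions for illustration (the study's own SVM/HMM systems and recording setup are not reproduced), and the librosa-based feature extraction is only one reasonable option.

```python
# Illustrative sketch: MFCC features + SVM for classifying waste-impact sounds.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def mfcc_features(path, sr=22050, n_mfcc=13):
    """Load one impact recording and summarize it as mean/std MFCC coefficients."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# wav_paths and material_labels (e.g. "glass", "plastic", "metal") are assumed to
# be provided by the recording setup; the classifier is then cross-validated:
# X = np.vstack([mfcc_features(p) for p in wav_paths])
# acc = cross_val_score(SVC(kernel="rbf", C=10.0), X, material_labels, cv=5)
# print("cross-validated accuracy:", acc.mean())
```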

  16. Her2Net: A Deep Framework for Semantic Segmentation and Classification of Cell Membranes and Nuclei in Breast Cancer Evaluation.

    PubMed

    Saha, Monjoy; Chakraborty, Chandan

    2018-05-01

    We present an efficient deep learning framework for identifying, segmenting, and classifying cell membranes and nuclei from human epidermal growth factor receptor-2 (HER2)-stained breast cancer images with minimal user intervention. This is a long-standing issue for pathologists because the manual quantification of HER2 is error-prone, costly, and time-consuming. Hence, we propose a deep learning-based HER2 deep neural network (Her2Net) to solve this issue. The convolutional and deconvolutional parts of the proposed Her2Net framework consisted mainly of multiple convolution layers, max-pooling layers, spatial pyramid pooling layers, deconvolution layers, up-sampling layers, and trapezoidal long short-term memory (TLSTM). A fully connected layer and a softmax layer were also used for classification and error estimation. Finally, HER2 scores were calculated based on the classification results. The main contribution of our proposed Her2Net framework includes the implementation of TLSTM and a deep learning framework for cell membrane and nucleus detection, segmentation, and classification and HER2 scoring. Our proposed Her2Net achieved 96.64% precision, 96.79% recall, 96.71% F-score, 93.08% negative predictive value, 98.33% accuracy, and a 6.84% false-positive rate. Our results demonstrate the high accuracy and wide applicability of the proposed Her2Net in the context of HER2 scoring for breast cancer evaluation.

  17. Estimating workload using EEG spectral power and ERPs in the n-back task

    NASA Astrophysics Data System (ADS)

    Brouwer, Anne-Marie; Hogervorst, Maarten A.; van Erp, Jan B. F.; Heffelaar, Tobias; Zimmerman, Patrick H.; Oostenveld, Robert

    2012-08-01

    Previous studies indicate that both electroencephalogram (EEG) spectral power (in particular the alpha and theta band) and event-related potentials (ERPs) (in particular the P300) can be used as a measure of mental work or memory load. We compare their ability to estimate workload level in a well-controlled task. In addition, we combine both types of measures in a single classification model to examine whether this results in higher classification accuracy than either one alone. Participants watched a sequence of visually presented letters and indicated whether or not the current letter was the same as the one (n instances) before. Workload was varied by varying n. We developed different classification models using ERP features, frequency power features or a combination (fusion). Training and testing of the models simulated an online workload estimation situation. All our ERP, power and fusion models provide classification accuracies between 80% and 90% when distinguishing between the highest and the lowest workload condition after 2 min. For 32 out of 35 participants, classification was significantly higher than chance level after 2.5 s (or one letter) as estimated by the fusion model. Differences between the models are rather small, though the fusion model performs better than the other models when only short data segments are available for estimating workload.
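    To make the spectral-power side of such a workload classifier concrete, the sketch below computes theta- and alpha-band power per channel with Welch's method and feeds it to a regularized linear classifier. Epoch shapes, sampling rate, bands, and the choice of logistic regression are assumptions for illustration, not the study's fusion models.

```python
# Toy sketch: EEG band-power features for workload classification.
import numpy as np
from scipy.signal import welch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

FS = 256                                  # sampling rate in Hz (assumed)
BANDS = {"theta": (4, 8), "alpha": (8, 12)}

def band_power_features(epochs):
    """epochs: (n_epochs, n_channels, n_samples) -> (n_epochs, n_channels * n_bands)."""
    freqs, psd = welch(epochs, fs=FS, nperseg=FS, axis=-1)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(psd[..., mask].mean(axis=-1))   # mean power in the band, per channel
    return np.concatenate(feats, axis=-1)

# epochs and workload_labels (low vs. high n-back) are assumed to come from the
# recording pipeline; ERP features could be concatenated here for a fusion model.
# X = band_power_features(epochs)
# print(cross_val_score(LogisticRegression(max_iter=1000), X, workload_labels, cv=5).mean())
```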

  18. Failure analysis and modeling of a multicomputer system. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Subramani, Sujatha Srinivasan

    1990-01-01

    This thesis describes the results of an extensive measurement-based analysis of real error data collected from a 7-machine DEC VaxCluster multicomputer system. In addition to evaluating basic system error and failure characteristics, we develop reward models to analyze the impact of failures and errors on the system. The results show that, although 98 percent of errors in the shared resources recover, they result in 48 percent of all system failures. The analysis of rewards shows that the expected reward rate for the VaxCluster decreases to 0.5 in 100 days for a 3-out-of-7 model, which is well over 100 times that for a 7-out-of-7 model. A comparison of the reward rates for a range of k-out-of-n models indicates that the maximum increase in reward rate (0.25) occurs in going from the 6-out-of-7 model to the 5-out-of-7 model. The analysis also shows that software errors have the lowest reward (0.2 vs. 0.91 for network errors). The large loss in reward rate for software errors is due to the fact that a large proportion (94 percent) of software errors lead to failure. In comparison, the high reward rate for network errors is due to fast recovery from a majority of these errors (median recovery duration is 0 seconds).

  19. A Locality-Constrained and Label Embedding Dictionary Learning Algorithm for Image Classification.

    PubMed

    Zhengming Li; Zhihui Lai; Yong Xu; Jian Yang; Zhang, David

    2017-02-01

    Locality and label information of training samples play an important role in image classification. However, previous dictionary learning algorithms do not take the locality and label information of atoms into account together in the learning process, and thus their performance is limited. In this paper, a discriminative dictionary learning algorithm, called the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, was proposed for image classification. First, the locality information was preserved using the graph Laplacian matrix of the learned dictionary instead of the conventional one derived from the training samples. Then, the label embedding term was constructed using the label information of atoms instead of the classification error term, which contained discriminating information of the learned dictionary. The optimal coding coefficients derived by the locality-based and label-based reconstruction were effective for image classification. Experimental results demonstrated that the LCLE-DL algorithm can achieve better performance than some state-of-the-art algorithms.

  20. Convolutional neural network with transfer learning for rice type classification

    NASA Astrophysics Data System (ADS)

    Patel, Vaibhav Amit; Joshi, Manjunath V.

    2018-04-01

    Presently, rice type is identified manually by humans, which is time-consuming and error-prone. Therefore, there is a need to do this by machine, which makes it faster with greater accuracy. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, using transfer learning, in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2-class model for the classification of broken and normal rice using 1600 training images. We observe that despite having distinct rice images, our architecture, pretrained on ImageNet data, boosts classification accuracy significantly.
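    A hedged sketch of the VGG16 transfer-learning variant described above, assuming torchvision >= 0.13: the pre-trained convolutional base is frozen and a new 5-class output layer is trained. The data pipeline, hyper-parameters, and the choice to train only the final layer are illustrative assumptions, not the paper's exact setup.

```python
# Transfer-learning sketch: frozen VGG16 base with a new 5-class rice-type head.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
for p in vgg.features.parameters():                 # freeze the pre-trained convolutional base
    p.requires_grad = False
vgg.classifier[6] = nn.Linear(4096, num_classes)    # replace the final ImageNet layer

optimizer = torch.optim.Adam(vgg.classifier[6].parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One optimization step on a batch of segmented rice images (tensors)."""
    optimizer.zero_grad()
    loss = criterion(vgg(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```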

  1. Application of partial least squares near-infrared spectral classification in diabetic identification

    NASA Astrophysics Data System (ADS)

    Yan, Wen-juan; Yang, Ming; He, Guo-quan; Qin, Lin; Li, Gang

    2014-11-01

    To identify diabetic patients from tongue near-infrared (NIR) spectra, a spectral classification model of the NIR reflectivity of the tongue tip is proposed, based on the partial least squares (PLS) method. Thirty-nine samples of tongue-tip NIR spectra are collected from healthy people and from diabetic patients, respectively. After pretreatment of the reflectivity, the spectral data are set as the independent variable matrix and the classification information as the dependent variable matrix. The samples were divided into two groups, 53 samples as the calibration set and 25 as the prediction set, and PLS was used to build the classification model. The model constructed from the 53 samples has a correlation of 0.9614 and a root mean square error of cross-validation (RMSECV) of 0.1387. The predictions for the 25 samples have a correlation of 0.9146 and an RMSECV of 0.2122. The experimental results show that the PLS method can achieve good classification of healthy people and diabetic patients.
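    A minimal PLS-based two-class sketch (often called PLS-DA) in the spirit of the model above: a 0/1 class label is regressed on the spectra with scikit-learn's PLSRegression and the cross-validated prediction is thresholded. The synthetic spectra, sample counts, and number of components are stand-ins, not the tongue-tip NIR data.

```python
# PLS-DA sketch: regress a binary label on spectra and threshold the prediction.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(78, 200))        # placeholder spectra: 78 samples x 200 wavelengths
y = rng.integers(0, 2, size=78)       # placeholder labels: 0 = healthy, 1 = diabetic

pls = PLSRegression(n_components=5)
y_cv = cross_val_predict(pls, X, y.astype(float), cv=10).ravel()
y_pred = (y_cv > 0.5).astype(int)     # threshold the continuous PLS output
rmsecv = np.sqrt(np.mean((y_cv - y) ** 2))
print("RMSECV:", rmsecv, "accuracy:", (y_pred == y).mean())
```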

  2. Schizophrenia classification using functional network features

    NASA Astrophysics Data System (ADS)

    Rish, Irina; Cecchi, Guillermo A.; Heuton, Kyle

    2012-03-01

    This paper focuses on discovering statistical biomarkers (features) that are predictive of schizophrenia, with a particular focus on topological properties of fMRI functional networks. We consider several network properties, such as node (voxel) strength, clustering coefficients, local efficiency, as well as just a subset of pairwise correlations. While all types of features demonstrate highly significant statistical differences in several brain areas, and close to 80% classification accuracy, the most remarkable results of 93% accuracy are achieved by using a small subset of only a dozen of most-informative (lowest p-value) correlation features. Our results suggest that voxel-level correlations and functional network features derived from them are highly informative about schizophrenia and can be used as statistical biomarkers for the disease.

  3. Multiple Category-Lot Quality Assurance Sampling: A New Classification System with Application to Schistosomiasis Control

    PubMed Central

    Olives, Casey; Valadez, Joseph J.; Brooker, Simon J.; Pagano, Marcello

    2012-01-01

    Background Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. Methodology We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n = 15 and n = 25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Principal Findings Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n = 15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as much as 0.5 and 3.5 observations per school, respectively, without increasing classification error. Conclusion/Significance This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalence of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools. PMID:22970333

  4. AVNM: A Voting based Novel Mathematical Rule for Image Classification.

    PubMed

    Vidyarthi, Ankit; Mittal, Namita

    2016-12-01

    In machine learning, the accuracy of the system depends upon the classification result, and classification accuracy plays an imperative role in various domains. A non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Despite its simplicity and effectiveness, the main problem associated with the KNN classifier is the selection of the number of nearest neighbors, i.e., "k", used for computation. At present, it is hard to find an optimal value of "k" with any statistical algorithm that gives perfect accuracy in terms of a low misclassification error rate. Motivated by this problem, a new sample-space-reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature, like KNN. AVNM uses a weighted voting mechanism with sample space reduction to learn and examine the predicted class label for an unidentified sample. AVNM is free from any initial selection of a predefined variable or neighbor count, as required in the KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments were made on 10 standard datasets taken from the UCI database and one manually created dataset. The experimental results show that the proposed AVNM rule outperforms the KNN classifier and its variants. Results based on the confusion-matrix accuracy parameter prove the higher accuracy of the AVNM rule. The proposed AVNM rule is based on a sample space reduction mechanism for identification of an optimal number of nearest neighbors. AVNM results in better classification accuracy and a lower error rate compared with the state-of-the-art algorithm, KNN, and its variants. The proposed rule automates nearest-neighbor selection and improves the classification rate for the UCI datasets and the manually created dataset. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  5. Evaluation of recent GRACE monthly solution series with an ice sheet perspective

    NASA Astrophysics Data System (ADS)

    Horwath, Martin; Groh, Andreas

    2016-04-01

    GRACE monthly global gravity field solutions have undergone a remarkable evolution, leading to the latest (Release 5) series by CSR, GFZ, and JPL, to new series by other processing centers, such as ITSG and AIUB, as well as to efforts to derive combined solutions, particularly by the EGSIEM (European Gravity Service for Improved Emergency Management) project. For applications such as GRACE inferences on ice sheet mass balance, the obvious question is which GRACE solution series to base the assessment on. Here we evaluate different GRACE solution series (including the ones listed above) in a unified framework. We concentrate on solutions expanded up to degree 90 or higher, since this is most appropriate for polar applications. We empirically assess the error levels in the spectral as well as in the spatial domain based on the month-to-month scatter in the high spherical harmonic degrees, and include an empirical assessment of error correlations. We then apply all series to infer Antarctic and Greenland mass change time series and compare the results in terms of apparent signal content and noise level. We find that the ITSG solutions show the lowest noise level in the high degrees (above 60). A preliminary combined solution from the EGSIEM project shows the lowest noise in the degrees below 60. This advantage carries over into the derived ice mass time series, where the EGSIEM-based results show the lowest noise in most cases. Meanwhile, there is no indication that any of the considered series systematically dampens actual geophysical signals.

  6. Microscopic saw mark analysis: an empirical approach.

    PubMed

    Love, Jennifer C; Derrick, Sharon M; Wiersema, Jason M; Peters, Charles

    2015-01-01

    Microscopic saw mark analysis is a well published and generally accepted qualitative analytical method. However, little research has focused on identifying and mitigating potential sources of error associated with the method. The present study proposes the use of classification trees and random forest classifiers as an optimal, statistically sound approach to mitigate the potential for variability and outcome error in microscopic saw mark analysis. The statistical model was applied to 58 experimental saw marks created with four types of saws. The saw marks were made in fresh human femurs obtained through anatomical gift and were analyzed using a Keyence digital microscope. The statistical approach weighed the variables based on discriminatory value and produced decision trees with an associated outcome error rate of 8.62-17.82%. © 2014 American Academy of Forensic Sciences.
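    The sketch below illustrates the classification-tree / random-forest idea on tool-mark-style features, using a random forest's out-of-bag error as an analogue of the reported outcome error rate. The feature names, data, and labels are hypothetical stand-ins, not the study's cast measurements.

```python
# Hypothetical random-forest sketch on tool-mark-style features.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

np.random.seed(0)
# Hypothetical per-cast features: striation presence/regularity and spacing.
data = pd.DataFrame({
    "striations_present": np.random.randint(0, 2, 58),
    "regular_striations": np.random.randint(0, 2, 58),
    "mean_interstriation_mm": np.random.uniform(0.05, 0.5, 58),
})
saw_type = np.random.randint(0, 4, 58)   # four saw classes, stand-in labels

rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
rf.fit(data, saw_type)
print("out-of-bag error:", 1 - rf.oob_score_)            # analogue of the outcome error rate
print(dict(zip(data.columns, rf.feature_importances_)))  # discriminatory weight of each variable
```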

  7. Hope Modified the Association between Distress and Incidence of Self-Perceived Medical Errors among Practicing Physicians: Prospective Cohort Study

    PubMed Central

    Hayashino, Yasuaki; Utsugi-Ozaki, Makiko; Feldman, Mitchell D.; Fukuhara, Shunichi

    2012-01-01

    The presence of hope has been found to influence an individual's ability to cope with stressful situations. The objective of this study was to evaluate the relationship between medical errors, hope, and burnout among practicing physicians using validated metrics. A prospective cohort study was conducted among hospital-based physicians practicing in Japan (N = 836). Measures included the validated Burnout Scale, self-assessment of medical errors, and the Herth Hope Index (HHI). The main outcome measure was the frequency of self-perceived medical errors, and Poisson regression analysis was used to evaluate the association between hope and medical error. A total of 361 errors were reported in 836 physician-years. We observed a significant association between hope and self-report of medical errors. Compared with the lowest tertile of HHI, the incidence rate ratios (IRRs) of self-perceived medical errors were 0.44 (95% CI, 0.34 to 0.58) and 0.54 (95% CI, 0.42 to 0.70) for the 2nd and 3rd tertiles, respectively. In stratified analysis by hope score, among physicians with a low hope score, those who experienced higher burnout reported a higher incidence of errors; physicians with high hope scores did not report high incidences of errors, even if they experienced high burnout. Self-perceived medical errors showed a strong association with physicians' hope, and hope modified the association between physicians' burnout and self-perceived medical errors. PMID:22530055

  8. Predicting Increased Blood Pressure Using Machine Learning

    PubMed Central

    Golino, Hudson Fernandes; Amaral, Liliany Souza de Brito; Duarte, Stenio Fernando Pimentel; Soares, Telma de Jesus; dos Reis, Luciana Araujo

    2014-01-01

    The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist-hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The results show that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42) and misclassification (.19) and the highest pseudo R2 (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction, with the lowest deviance (57.25) and misclassification (.16) and the highest pseudo R2 (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power. PMID:24669313
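
    A minimal sketch of this kind of analysis, fitting a classification tree on anthropometric predictors and reporting sensitivity and specificity on a held-out test set, is given below; the simulated data and the tree settings are placeholders, not the study's sample or model.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import confusion_matrix

        rng = np.random.default_rng(2)
        n = 400
        X = np.column_stack([
            rng.normal(25, 4, n),        # BMI
            rng.normal(80, 10, n),       # waist circumference (cm)
            rng.normal(0.85, 0.07, n),   # waist-hip ratio
        ])
        # Simulated outcome: increased blood pressure, loosely tied to BMI
        y = (X[:, 0] + rng.normal(0, 3, n) > 27).astype(int)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=2)
        tree = DecisionTreeClassifier(max_depth=4, random_state=2).fit(X_tr, y_tr)

        tn, fp, fn, tp = confusion_matrix(y_te, tree.predict(X_te)).ravel()
        print("sensitivity:", tp / (tp + fn))
        print("specificity:", tn / (tn + fp))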

  9. Predicting increased blood pressure using machine learning.

    PubMed

    Golino, Hudson Fernandes; Amaral, Liliany Souza de Brito; Duarte, Stenio Fernando Pimentel; Gomes, Cristiano Mauro Assis; Soares, Telma de Jesus; Dos Reis, Luciana Araujo; Santos, Joselito

    2014-01-01

    The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist-hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The results show that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42) and misclassification (.19) and the highest pseudo R2 (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction, with the lowest deviance (57.25) and misclassification (.16) and the highest pseudo R2 (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.

  10. [Medication error management climate and perception for system use according to construction of medication error prevention system].

    PubMed

    Kim, Myoung Soo

    2012-08-01

    The purpose of this cross-sectional study was to examine the current status of IT-based medication error prevention system construction and the relationships among system construction, medication error management climate and perception of system use. The participants were 124 patient safety chief managers working for 124 hospitals with over 300 beds in Korea. The characteristics of the participants, the construction status and perception of systems (electronic pharmacopoeia, electronic drug dosage calculation system, computer-based patient safety reporting and bar-code system) and the medication error management climate were measured in this study. The data were collected between June and August 2011. Descriptive statistics, partial Pearson correlation and MANCOVA were used for data analysis. Electronic pharmacopoeias were constructed in 67.7% of participating hospitals, computer-based patient safety reporting systems in 50.8%, and electronic drug dosage calculation systems were in use in 32.3%. Bar-code systems showed the lowest construction rate, at 16.1% of Korean hospitals. Higher rates of construction of IT-based medication error prevention systems were accompanied by greater safety and a more positive error management climate. Supportive strategies for improving the perception of IT-based system use would encourage further system construction, and a positive error management climate would be more easily promoted.

  11. Identifying chronic errors at freeway loop detectors- splashover, pulse breakup, and sensitivity settings.

    DOT National Transportation Integrated Search

    2011-03-01

    Traffic management applications such as ramp metering, incident detection, travel time prediction, and vehicle classification greatly depend on the accuracy of data collected from inductive loop detectors, but these data are prone to various erro...

  12. Utilizing spatial and spectral features of photoacoustic imaging for ovarian cancer detection and diagnosis

    NASA Astrophysics Data System (ADS)

    Li, Hai; Kumavor, Patrick; Salman Alqasemi, Umar; Zhu, Quing

    2015-01-01

    A composite set of ovarian tissue features extracted from photoacoustic spectral data, beam envelope, and co-registered ultrasound and photoacoustic images is used to characterize malignant and normal ovaries using logistic and support vector machine (SVM) classifiers. Normalized power spectra were calculated from the Fourier transform of the photoacoustic beamformed data, from which the spectral slopes and 0-MHz intercepts were extracted. Five features were extracted from the beam envelope and another 10 features were extracted from the photoacoustic images. These 17 features were ranked by their p-values from t-tests, and a filter-type feature selection method was used to determine the optimal number of features for final classification. A total of 169 samples from 19 ex vivo ovaries were randomly distributed into training and testing groups. Both classifiers achieved a minimum value of the mean misclassification error when the seven features with the lowest p-values were selected. Using these seven features, the logistic and SVM classifiers obtained sensitivities of 96.39±3.35% and 97.82±2.26%, and specificities of 98.92±1.39% and 100%, respectively, for the training group. For the testing group, the logistic and SVM classifiers achieved sensitivities of 92.71±3.55% and 92.64±3.27%, and specificities of 87.52±8.78% and 98.49±2.05%, respectively.
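
    The filter-type feature selection described above, ranking features by t-test p-values and keeping the top-ranked subset before training the classifiers, can be sketched as follows; the arrays stand in for the 17-feature data set and the classifier settings are assumptions, not the authors' exact configuration.

        import numpy as np
        from scipy.stats import ttest_ind
        from sklearn.svm import SVC
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(3)
        X = rng.normal(size=(169, 17))      # 169 samples x 17 features (placeholder)
        y = rng.integers(0, 2, size=169)    # 0 = normal ovary, 1 = malignant

        # Rank features by two-sample t-test p-value between the classes
        pvals = np.array([ttest_ind(X[y == 0, j], X[y == 1, j]).pvalue
                          for j in range(X.shape[1])])
        top = np.argsort(pvals)[:7]         # keep the seven lowest p-values

        for name, clf in [("logistic", LogisticRegression(max_iter=1000)),
                          ("SVM", SVC(kernel="rbf"))]:
            err = 1 - cross_val_score(clf, X[:, top], y, cv=5).mean()
            print(f"{name}: mean misclassification error = {err:.2%}")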

  13. A quantum algorithm for obtaining the lowest eigenstate of a Hamiltonian assisted with an ancillary qubit system

    NASA Astrophysics Data System (ADS)

    Bang, Jeongho; Lee, Seung-Woo; Lee, Chang-Woo; Jeong, Hyunseok

    2015-01-01

    We propose a quantum algorithm to obtain the lowest eigenstate of any Hamiltonian simulated by a quantum computer. The proposed algorithm begins with an arbitrary initial state of the simulated system. A finite series of transforms is iteratively applied to the initial state, assisted with an ancillary qubit. The fraction of the lowest eigenstate in the initial state is then amplified up to 1. We prove in the theoretical analysis that our algorithm can faithfully work for any arbitrary Hamiltonian. Numerical analyses are also carried out. We first provide a numerical proof-of-principle demonstration with a simple Hamiltonian in order to compare our scheme with the so-called "demon-like algorithmic cooling (DLAC)" recently proposed in Xu (Nat Photonics 8:113, 2014). The result shows good agreement with our theoretical analysis, exhibiting behavior comparable to the best "cooling" with the DLAC method. We then consider a random Hamiltonian model for further analysis of our algorithm. By numerical simulations, we show that the total number of iterations scales with the gap between the two lowest eigenvalues and with the tolerated error, defined as the probability that the finally obtained system state is in an unexpected (i.e., not the lowest) eigenstate.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aristophanous, M; Court, L

    Purpose: Despite daily image guidance, setup uncertainties can be high when treating large areas of the body. The aim of this study was to measure local uncertainties inside the PTV for patients receiving IMRT to the mediastinum region. Methods: Eleven lymphoma patients who received radiotherapy (breath-hold) to the mediastinum were included in this study. The treated region could range all the way from the neck to the diaphragm. Each patient had a CT scan with a CT-on-rails system prior to every treatment. The entire PTV region was matched to the planning CT using automatic rigid registration. The PTV was then split into 5 regions: neck, supraclavicular, superior mediastinum, upper heart, and lower heart. Additional auto-registrations for each of the 5 local PTV regions were performed. The residual local setup errors were calculated as the difference between the final global PTV position and the individual final local PTV positions for the AP, SI and RL directions. For each patient 4 CT scans were analyzed (1 per week of treatment). Results: The residual mean group error (M) and the standard deviation of the inter-patient (or systematic) error (Σ) were lowest in the RL direction of the superior mediastinum (0.0mm and 0.5mm) and highest in the RL direction of the lower heart (3.5mm and 2.9mm). The standard deviation of the inter-fraction (or random) error (σ) was lowest in the RL direction of the superior mediastinum (0.5mm) and highest in the SI direction of the lower heart (3.9mm). The directionality of local uncertainties is important; a superior residual error in the lower heart, for example, keeps it inside the global PTV. Conclusion: There is a complex relationship between breath-holding and positioning uncertainties that needs further investigation. Residual setup uncertainties can be significant even under daily CT image guidance when treating large regions of the body.
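
    For readers unfamiliar with the M/Σ/σ notation, one common convention computes the group mean error from the per-patient means, the systematic component as the standard deviation of those per-patient means, and the random component as the root-mean-square of the per-patient standard deviations. A small Python sketch under that convention (the patient-by-fraction error array is hypothetical):

        import numpy as np

        # Hypothetical residual local setup errors (mm): rows = patients, cols = weekly CTs
        errors = np.random.default_rng(4).normal(1.0, 2.0, size=(11, 4))

        patient_means = errors.mean(axis=1)
        M = patient_means.mean()                        # group mean error
        Sigma = patient_means.std(ddof=1)               # systematic (inter-patient) SD
        sigma = np.sqrt((errors.std(axis=1, ddof=1) ** 2).mean())  # random (inter-fraction) SD

        print(f"M = {M:.1f} mm, Sigma = {Sigma:.1f} mm, sigma = {sigma:.1f} mm")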

  15. A feasibility study in adapting Shamos Bickel and Hodges Lehman estimator into T-Method for normalization

    NASA Astrophysics Data System (ADS)

    Harudin, N.; Jamaludin, K. R.; Muhtazaruddin, M. Nabil; Ramlie, F.; Muhamad, Wan Zuki Azman Wan

    2018-03-01

    The T-Method is one of the techniques under the Mahalanobis-Taguchi System, developed specifically for multivariate prediction. Prediction using the T-Method is possible even with a very limited sample size. Users of the T-Method are required to clearly understand the population data trend, since the method does not consider the effect of outliers within it. Outliers may cause apparent non-normality and the breakdown of classical methods. Robust parameter estimates exist that provide satisfactory results both when the data contain outliers and when they are free of them. Among these are the Shamos-Bickel (SB) and Hodges-Lehmann (HL) estimators of scale and location, robust counterparts of the classical standard deviation and mean. Embedding these into the T-Method normalization stage may help enhance the accuracy of the T-Method and allows the robustness of the T-Method itself to be analysed. However, the larger-sample case study shows that the T-Method has the lowest average error percentage (3.09%) on data with extreme outliers, while HL and SB have the lowest error percentage (4.67%) for data without extreme outliers, with only minimal error differences compared to the T-Method. The prediction error trend is reversed for the smaller-sample case study. The results show that with a minimal sample size, where outliers are always at low risk, the T-Method performs better, and with a larger sample size containing extreme outliers the T-Method also shows better prediction than the alternatives. For the case studies conducted in this research, the standard T-Method normalization gives satisfactory results, and it is not worthwhile to adapt HL and SB (or the ordinary mean and standard deviation) into it, since this changes the error percentages only minimally. Normalization using the T-Method is still considered to carry a lower risk from outlier effects.
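
    For reference, the Hodges-Lehmann location estimator is the median of the pairwise (Walsh) averages, and the Shamos scale estimator is based on the median of the absolute pairwise differences, rescaled for consistency with the standard deviation under normality. The sketch below shows both, and how they might replace the mean and standard deviation in a normalization step; the data and the consistency constant used are illustrative assumptions.

        import numpy as np
        from itertools import combinations

        def hodges_lehmann(x):
            """Median of pairwise (Walsh) averages, a robust location estimate."""
            x = np.asarray(x, dtype=float)
            pairs = [(a + b) / 2.0 for a, b in combinations(x, 2)]
            return np.median(np.concatenate([x, pairs]))  # include the points themselves

        def shamos_scale(x):
            """Median of absolute pairwise differences, rescaled (~1.048) so that it
            estimates the standard deviation for normally distributed data."""
            x = np.asarray(x, dtype=float)
            diffs = [abs(a - b) for a, b in combinations(x, 2)]
            return 1.048 * np.median(diffs)

        data = np.array([9.8, 10.1, 10.0, 10.3, 9.9, 25.0])   # one extreme outlier
        print("mean / std     :", data.mean(), data.std(ddof=1))
        print("HL / Shamos    :", hodges_lehmann(data), shamos_scale(data))
        # Robust normalization of a new value under these estimates:
        z = (10.4 - hodges_lehmann(data)) / shamos_scale(data)
        print("robust z-score :", z)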

  16. Wavefront Sensing Analysis of Grazing Incidence Optical Systems

    NASA Technical Reports Server (NTRS)

    Rohrbach, Scott; Saha, Timo

    2012-01-01

    Wavefront sensing is a process by which optical system errors are deduced from the aberrations in the image of an ideal source. The method has been used successfully in near-normal incidence, but not for grazing incidence systems. This innovation highlights the ability to examine out-of-focus images from grazing incidence telescopes (typically operating in the x-ray wavelengths, but integrated using optical wavelengths) and determine the lower-order deformations. This is important because as a metrology tool, this method would allow the integration of high angular resolution optics without the use of normal incidence interferometry, which requires direct access to the front surface of each mirror. Measuring the surface figure of mirror segments in a highly nested x-ray telescope mirror assembly is difficult due to the tight packing of elements and blockage of all but the innermost elements to normal incidence light. While this can be done on an individual basis in a metrology mount, once the element is installed and permanently bonded into the assembly, it is impossible to verify the figure of each element and ensure that the necessary imaging quality will be maintained. By examining on-axis images of an ideal point source, one can gauge the low-order figure errors of individual elements, even when integrated into an assembly. This technique is known as wavefront sensing (WFS). By shining collimated light down the optical axis of the telescope and looking at out-of-focus images, the blur due to low-order figure errors of individual elements can be seen, and the figure error necessary to produce that blur can be calculated. The method avoids the problem of requiring normal incidence access to the surface of each mirror segment. Mirror figure errors span a wide range of spatial frequencies, from the lowest-order bending to the highest order micro-roughness. While all of these can be measured in normal incidence, only the lowest-order contributors can be determined through this WFS technique.

  17. Automated Classification of Selected Data Elements from Free-text Diagnostic Reports for Clinical Research.

    PubMed

    Löpprich, Martin; Krauss, Felix; Ganzinger, Matthias; Senghas, Karsten; Riezler, Stefan; Knaup, Petra

    2016-08-05

    In the Multiple Myeloma clinical registry at Heidelberg University Hospital, most data are extracted from discharge letters. Our aim was to analyze whether it is possible to make the manual documentation process more efficient by using natural language processing (NLP) methods for multiclass classification of free-text diagnostic reports, in order to automatically document the diagnosis and state of disease of myeloma patients. The first objective was to create a corpus consisting of free-text diagnosis paragraphs of patients with multiple myeloma from German diagnostic reports, with manual annotation of relevant data elements by documentation specialists. The second objective was to construct and evaluate a framework using different NLP methods to enable automatic multiclass classification of relevant data elements from free-text diagnostic reports. The main diagnosis paragraph was extracted from the clinical report of one third of randomly selected patients of the multiple myeloma research database from Heidelberg University Hospital (in total 737 selected patients). An EDC system was set up and two data entry specialists independently performed manual documentation of at least nine specific data elements for multiple myeloma characterization. Both data entries were compared and assessed by a third specialist and an annotated text corpus was created. A framework was constructed, consisting of a self-developed package to split multiple diagnosis sequences into several subsequences, four different preprocessing steps to normalize the input data and two classifiers: a maximum entropy classifier (MEC) and a support vector machine (SVM). In total 15 different pipelines were examined and assessed by a ten-fold cross-validation, reiterated 100 times. As quality indicators, the average error rate and the average F1-score were computed. For significance testing the approximate randomization test was used. The created annotated corpus consists of 737 different diagnosis paragraphs with a total of 865 coded diagnoses. The dataset is publicly available in the supplementary online files for training and testing of further NLP methods. Both classifiers showed low average error rates (MEC: 1.05; SVM: 0.84) and high F1-scores (MEC: 0.89; SVM: 0.92). However, the results varied widely depending on the classified data element. Preprocessing methods increased this effect and had a significant impact on the classification, both positive and negative. The automatic diagnosis splitter increased the average error rate significantly, even if the F1-score decreased only slightly. The low average error rates and high average F1-scores of each pipeline demonstrate the suitability of the investigated NLP methods. However, it was also shown that there is no best practice for an automatic classification of data elements from free-text diagnostic reports.
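
    A minimal sketch of the kind of pipeline evaluated above (text normalization, a linear SVM and repeated ten-fold cross-validation reporting error rate and F1-score) is shown below using scikit-learn; the toy corpus, labels and vectorizer settings are placeholders, and the actual study additionally used a maximum entropy classifier and German discharge-letter paragraphs.

        from sklearn.pipeline import make_pipeline
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.svm import LinearSVC
        from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate

        # Placeholder diagnosis snippets and class labels (e.g. ISS stage)
        texts = ["multiples myelom iss stadium i", "multiples myelom iss stadium iii",
                 "plasmozytom iss stadium ii", "multiples myelom iss stadium i",
                 "multiples myelom iss stadium ii", "plasmozytom iss stadium iii"] * 20
        labels = ["I", "III", "II", "I", "II", "III"] * 20

        pipeline = make_pipeline(
            TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),  # preprocessing step
            LinearSVC())                                          # SVM classifier

        cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
        scores = cross_validate(pipeline, texts, labels, cv=cv,
                                scoring=["accuracy", "f1_macro"])
        print("average error rate:", 1 - scores["test_accuracy"].mean())
        print("average F1-score  :", scores["test_f1_macro"].mean())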

  18. Reliability of Semi-Automated Segmentations in Glioblastoma.

    PubMed

    Huber, T; Alber, G; Bette, S; Boeckh-Behrens, T; Gempt, J; Ringel, F; Alberts, E; Zimmer, C; Bauer, J S

    2017-06-01

    In glioblastoma, quantitative volumetric measurements of contrast-enhancing or fluid-attenuated inversion recovery (FLAIR) hyperintense tumor compartments are needed for an objective assessment of therapy response. The aim of this study was to evaluate the reliability of a semi-automated, region-growing segmentation tool for determining tumor volume in patients with glioblastoma among different users of the software. A total of 320 segmentations of tumor-associated FLAIR changes and contrast-enhancing tumor tissue were performed by different raters (neuroradiologists, medical students, and volunteers). All patients underwent high-resolution magnetic resonance imaging including a 3D-FLAIR and a 3D-MPRAGE sequence. Segmentations were done using a semi-automated, region-growing segmentation tool. Intra- and inter-rater reliability were addressed by intraclass correlation (ICC). The root-mean-square error (RMSE) was used to determine the precision error, and the Dice score was calculated to measure the overlap between segmentations. Semi-automated segmentation showed a high ICC (> 0.985) for all groups, indicating excellent intra- and inter-rater reliability. Significantly smaller precision errors and higher Dice scores were observed for FLAIR segmentations compared with segmentations of contrast enhancement. Single-rater segmentations showed the lowest RMSE for FLAIR, of 3.3% (MPRAGE: 8.2%). Both single raters and neuroradiologists had the lowest precision error for longitudinal evaluation of FLAIR changes. Semi-automated volumetry of glioblastoma was reliably performed by all groups of raters, even without neuroradiologic expertise. Interestingly, segmentations of tumor-associated FLAIR changes were more reliable than segmentations of contrast enhancement. In longitudinal evaluations, an experienced rater can reliably detect progressive FLAIR changes of less than 15% in a quantitative way, which could help to detect progressive disease earlier.
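
    The two agreement measures can be stated explicitly: the Dice score is twice the overlap divided by the sum of the two segmentation volumes, and the precision error can be expressed as a root-mean-square of relative volume differences. A short sketch with hypothetical binary masks (the exact precision-error convention of the study may differ):

        import numpy as np

        def dice(mask_a, mask_b):
            """Dice overlap between two boolean segmentation masks."""
            a, b = mask_a.astype(bool), mask_b.astype(bool)
            return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

        def rmse_percent(volumes_a, volumes_b):
            """RMS of pairwise relative volume differences, in percent of the pair mean."""
            a, b = np.asarray(volumes_a, float), np.asarray(volumes_b, float)
            return 100.0 * np.sqrt(np.mean(((a - b) / ((a + b) / 2.0)) ** 2))

        # Hypothetical 3-D masks from two raters segmenting the same FLAIR lesion
        rng = np.random.default_rng(5)
        rater1 = rng.random((32, 32, 32)) > 0.6
        rater2 = rater1.copy()
        rater2[0:2] = False                      # rater 2 misses a small slab

        print("Dice score         :", dice(rater1, rater2))
        print("precision error (%):", rmse_percent([rater1.sum()], [rater2.sum()]))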

  19. Locality-preserving sparse representation-based classification in hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Gao, Lianru; Yu, Haoyang; Zhang, Bing; Li, Qingting

    2016-10-01

    This paper proposes to combine locality-preserving projections (LPP) and sparse representation (SR) for hyperspectral image classification. The LPP is first used to reduce the dimensionality of all the training and testing data by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold, where the high-dimensional data lies. Then, SR codes the projected testing pixels as sparse linear combinations of all the training samples to classify the testing pixels by evaluating which class leads to the minimum approximation error. The integration of LPP and SR represents an innovative contribution to the literature. The proposed approach, called locality-preserving SR-based classification, addresses the imbalance between high dimensionality of hyperspectral data and the limited number of training samples. Experimental results on three real hyperspectral data sets demonstrate that the proposed approach outperforms the original counterpart, i.e., SR-based classification.

  20. A Discriminative Approach to EEG Seizure Detection

    PubMed Central

    Johnson, Ashley N.; Sow, Daby; Biem, Alain

    2011-01-01

    Seizures are abnormal sudden discharges in the brain with signatures represented in electroencephalograms (EEG). The efficacy of applying speech processing techniques to discriminate between seizure and non-seizure states in EEGs is reported. The approach accounts for the challenges of unbalanced datasets (seizure and non-seizure), while also showing a system capable of real-time seizure detection. The Minimum Classification Error (MCE) algorithm, a discriminative learning algorithm in wide use in speech processing, is applied and compared with conventional classification techniques that have already been applied to the discrimination between seizure and non-seizure states in the literature. The system is evaluated on multi-channel EEG recordings from 22 pediatric patients. Experimental results show that the application of speech processing techniques and MCE compares favorably with conventional classification techniques in terms of classification performance, while requiring less computational overhead. The results strongly suggest the possibility of deploying the designed system at the bedside. PMID:22195192

  1. Balancing research and funding using value of information and portfolio tools for nanomaterial risk classification

    NASA Astrophysics Data System (ADS)

    Bates, Matthew E.; Keisler, Jeffrey M.; Zussblatt, Niels P.; Plourde, Kenton J.; Wender, Ben A.; Linkov, Igor

    2016-02-01

    Risk research for nanomaterials is currently prioritized by means of expert workshops and other deliberative processes. However, analytical techniques that quantify and compare alternative research investments are increasingly recommended. Here, we apply value of information and portfolio decision analysis—methods commonly applied in financial and operations management—to prioritize risk research for multiwalled carbon nanotubes and nanoparticulate silver and titanium dioxide. We modify the widely accepted CB Nanotool hazard evaluation framework, which combines nano- and bulk-material properties into a hazard score, to operate probabilistically with uncertain inputs. Literature is reviewed to develop uncertain estimates for each input parameter, and a Monte Carlo simulation is applied to assess how different research strategies can improve hazard classification. The relative cost of each research experiment is elicited from experts, which enables identification of efficient research portfolios—combinations of experiments that lead to the greatest improvement in hazard classification at the lowest cost. Nanoparticle shape, diameter, solubility and surface reactivity were most frequently identified within efficient portfolios in our results.

  2. Classification of dry-cured hams according to the maturation time using near infrared spectra and artificial neural networks.

    PubMed

    Prevolnik, M; Andronikov, D; Žlender, B; Font-i-Furnols, M; Novič, M; Škorjanc, D; Čandek-Potokar, M

    2014-01-01

    The feasibility of classifying dry-cured hams according to maturation time on the basis of near infrared (NIR) spectra was studied. The study comprised 128 samples of biceps femoris (BF) muscle from dry-cured hams matured for 10 (n=32), 12 (n=32), 14 (n=32) or 16 months (n=32). Samples were minced and scanned in the wavelength range from 400 to 2500 nm using a NIR Systems model 6500 spectrometer (Silver Spring, MD, USA). Spectral data were used for i) splitting the samples into training and test sets using 2D Kohonen artificial neural networks (ANN) and ii) construction of classification models using counter-propagation ANN (CP-ANN). Different models were tested, and the one selected was based on the lowest percentage of misclassified test samples (external validation). Overall correctness of the classification was 79.7%, which demonstrates the practical relevance of using NIR spectroscopy and ANN for dry-cured ham processing control. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Balancing research and funding using value of information and portfolio tools for nanomaterial risk classification.

    PubMed

    Bates, Matthew E; Keisler, Jeffrey M; Zussblatt, Niels P; Plourde, Kenton J; Wender, Ben A; Linkov, Igor

    2016-02-01

    Risk research for nanomaterials is currently prioritized by means of expert workshops and other deliberative processes. However, analytical techniques that quantify and compare alternative research investments are increasingly recommended. Here, we apply value of information and portfolio decision analysis, methods commonly applied in financial and operations management, to prioritize risk research for multiwalled carbon nanotubes and nanoparticulate silver and titanium dioxide. We modify the widely accepted CB Nanotool hazard evaluation framework, which combines nano- and bulk-material properties into a hazard score, to operate probabilistically with uncertain inputs. Literature is reviewed to develop uncertain estimates for each input parameter, and a Monte Carlo simulation is applied to assess how different research strategies can improve hazard classification. The relative cost of each research experiment is elicited from experts, which enables identification of efficient research portfolios, i.e., combinations of experiments that lead to the greatest improvement in hazard classification at the lowest cost. Nanoparticle shape, diameter, solubility and surface reactivity were most frequently identified within efficient portfolios in our results.

  4. Analysis of failure in patients with adenoid cystic carcinoma of the head and neck. An international collaborative study.

    PubMed

    Amit, Moran; Binenbaum, Yoav; Sharma, Kanika; Ramer, Naomi; Ramer, Ilana; Agbetoba, Abib; Miles, Brett; Yang, Xinjie; Lei, Delin; Bjøerndal, Kristine; Godballe, Christian; Mücke, Thomas; Wolff, Klaus-Dietrich; Fliss, Dan; Eckardt, André M; Copelli, Chiara; Sesenna, Enrico; Palmer, Frank; Patel, Snehal; Gil, Ziv

    2014-07-01

    Adenoid cystic carcinoma (ACC) is a locally aggressive tumor with a high prevalence of distant metastases. The purpose of this study was to identify independent predictors of outcome and to characterize the patterns of failure. An international retrospective review was conducted of 489 patients with ACC treated between 1985 and 2011 in 9 cancer centers worldwide. Five-year overall-survival (OS), disease-specific survival (DSS), and disease-free survival (DFS) were 76%, 80%, and 68%, respectively. Independent predictors of OS and DSS were: age, site, N classification, and presence of distant metastases. N classification, age, and bone invasion were associated with DFS on multivariate analysis. Age, tumor site, orbital invasion, and N classification were independent predictors of distant metastases. The clinical course of ACC is slow but persistent. Paranasal sinus origin is associated with the lowest distant metastases rate but with the poorest outcome. These prognostic estimates should be considered when tailoring treatment for patients with ACC. Copyright © 2013 Wiley Periodicals, Inc.

  5. The associations of insomnia with costly workplace accidents and errors: results from the America Insomnia Survey.

    PubMed

    Shahly, Victoria; Berglund, Patricia A; Coulouvrat, Catherine; Fitzgerald, Timothy; Hajak, Goeran; Roth, Thomas; Shillington, Alicia C; Stephenson, Judith J; Walsh, James K; Kessler, Ronald C

    2012-10-01

    Insomnia is a common and seriously impairing condition that often goes unrecognized. To examine associations of broadly defined insomnia (ie, meeting inclusion criteria for a diagnosis from International Statistical Classification of Diseases, 10th Revision, DSM-IV, or Research Diagnostic Criteria/International Classification of Sleep Disorders, Second Edition) with costly workplace accidents and errors after excluding other chronic conditions among workers in the America Insomnia Survey (AIS). A national cross-sectional telephone survey (65.0% cooperation rate) of commercially insured health plan members selected from the more than 34 million in the HealthCore Integrated Research Database. Four thousand nine hundred ninety-one employed AIS respondents. Costly workplace accidents or errors in the 12 months before the AIS interview were assessed with one question about workplace accidents "that either caused damage or work disruption with a value of $500 or more" and another about other mistakes "that cost your company $500 or more." Current insomnia with duration of at least 12 months was assessed with the Brief Insomnia Questionnaire, a validated (area under the receiver operating characteristic curve, 0.86 compared with diagnoses based on blinded clinical reappraisal interviews), fully structured diagnostic interview. Eighteen other chronic conditions were assessed with medical/pharmacy claims records and validated self-report scales. Insomnia had a significant odds ratio with workplace accidents and/or errors controlled for other chronic conditions (1.4). The odds ratio did not vary significantly with respondent age, sex, educational level, or comorbidity. The average costs of insomnia-related accidents and errors ($32 062) were significantly higher than those of other accidents and errors ($21 914). Simulations estimated that insomnia was associated with 7.2% of all costly workplace accidents and errors and 23.7% of all the costs of these incidents. These proportions are higher than for any other chronic condition, with annualized US population projections of 274 000 costly insomnia-related workplace accidents and errors having a combined value of US $31.1 billion. Effectiveness trials are needed to determine whether expanded screening, outreach, and treatment of workers with insomnia would yield a positive return on investment for employers.

  6. Consistent latent position estimation and vertex classification for random dot product graphs.

    PubMed

    Sussman, Daniel L; Tang, Minh; Priebe, Carey E

    2014-01-01

    In this work, we show that using the eigen-decomposition of the adjacency matrix, we can consistently estimate latent positions for random dot product graphs provided the latent positions are i.i.d. from some distribution. If class labels are observed for a number of vertices tending to infinity, then we show that the remaining vertices can be classified with error converging to Bayes optimal using the k-nearest-neighbors classification rule. We evaluate the proposed methods on simulated data and a graph derived from Wikipedia.
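
    The procedure described above, estimating latent positions from the eigen-decomposition of the adjacency matrix and then classifying held-out vertices with a k-nearest-neighbors rule, can be sketched briefly; the simulated two-block graph, the embedding dimension and the choice of k are illustrative assumptions.

        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(6)
        n, d = 300, 2
        # Simulated two-block random dot product graph
        latent = np.where(rng.random(n)[:, None] < 0.5,
                          np.array([0.7, 0.2]), np.array([0.2, 0.7]))
        labels = (latent[:, 0] < 0.5).astype(int)
        P = latent @ latent.T
        A = (rng.random((n, n)) < P).astype(float)
        A = np.triu(A, 1)
        A = A + A.T                              # symmetric, hollow adjacency matrix

        # Adjacency spectral embedding: top-d eigenpairs, scaled by sqrt(|eigenvalue|)
        vals, vecs = np.linalg.eigh(A)
        idx = np.argsort(np.abs(vals))[::-1][:d]
        X_hat = vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

        # k-NN classification of the unlabeled vertices from the observed ones
        observed = rng.random(n) < 0.3
        knn = KNeighborsClassifier(n_neighbors=5).fit(X_hat[observed], labels[observed])
        err = (knn.predict(X_hat[~observed]) != labels[~observed]).mean()
        print("held-out classification error:", err)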

  7. Classification of JERS-1 Image Mosaic of Central Africa Using A Supervised Multiscale Classifier of Texture Features

    NASA Technical Reports Server (NTRS)

    Saatchi, Sassan; DeGrandi, Franco; Simard, Marc; Podest, Erika

    1999-01-01

    In this paper, a multiscale approach is introduced to classify the Japanese Earth Resources Satellite-1 (JERS-1) mosaic image over the Central African rainforest. A series of texture maps is generated from the 100 m mosaic image at various scales. Using a quadtree model and relating classes at each scale by a Markovian relationship, the multiscale images are classified from coarse to fine scale. The results are verified at various scales and the evolution of the classification is monitored by calculating the error at each stage.

  8. Comparison of the Pentacam equivalent keratometry reading and IOL Master keratometry measurement in intraocular lens power calculations.

    PubMed

    Karunaratne, Nicholas

    2013-12-01

    To compare the accuracy of the Pentacam Holladay equivalent keratometry readings with IOL Master 500 keratometry in calculating intraocular lens power. Non-randomized, prospective clinical study conducted in private practice. Forty-five consecutive normal patients undergoing cataract surgery. Forty-five consecutive patients had Pentacam equivalent keratometry readings at the 2-, 3- and 4.5-mm corneal zones and IOL Master keratometry measurements prior to cataract surgery. For each Pentacam equivalent keratometry reading zone and IOL Master measurement, the difference between the observed and expected refractive error was calculated using the Holladay 2 and the Sanders, Retzlaff and Kraff theoretic (SRKT) formulas. Mean keratometric value and mean absolute refractive error. There was a statistically significant difference between the mean keratometric values of the IOL Master and the Pentacam equivalent keratometry reading 2-, 3- and 4.5-mm measurements (P < 0.0001, analysis of variance). There was no statistically significant difference between the mean absolute refraction error for the IOL Master and equivalent keratometry readings at the 2-mm, 3-mm and 4.5-mm zones for either the Holladay 2 formula (P = 0.14) or the SRKT formula (P = 0.47). The lowest mean absolute refraction error for the Holladay 2 equivalent keratometry reading was in the 4.5-mm zone (mean 0.25 D ± 0.17 D). The lowest mean absolute refraction error for the SRKT equivalent keratometry reading was in the 4.5-mm zone (mean 0.25 D ± 0.19 D). Comparing the absolute refraction error of the IOL Master and the Pentacam equivalent keratometry reading, the best agreement was with Holladay 2 and the equivalent keratometry reading at 4.5 mm, with a mean difference of 0.02 D and 95% limits of agreement of -0.35 and 0.39 D. The IOL Master keratometry and Pentacam equivalent keratometry reading were not equivalent when used for corneal power measurements alone. However, the keratometry measurements of the IOL Master and the Pentacam equivalent keratometry reading at 4.5 mm may be similarly effective when used in intraocular lens power calculation formulas, following constant optimization. © 2013 Royal Australian and New Zealand College of Ophthalmologists.

  9. Low-dimensional Representation of Error Covariance

    NASA Technical Reports Server (NTRS)

    Tippett, Michael K.; Cohn, Stephen E.; Todling, Ricardo; Marchesin, Dan

    2000-01-01

    Ensemble and reduced-rank approaches to prediction and assimilation rely on low-dimensional approximations of the estimation error covariances. Here stability properties of the forecast/analysis cycle for linear, time-independent systems are used to identify factors that cause the steady-state analysis error covariance to admit a low-dimensional representation. A useful measure of forecast/analysis cycle stability is the bound matrix, a function of the dynamics, observation operator and assimilation method. Upper and lower estimates for the steady-state analysis error covariance matrix eigenvalues are derived from the bound matrix. The estimates generalize to time-dependent systems. If much of the steady-state analysis error variance is due to a few dominant modes, the leading eigenvectors of the bound matrix approximate those of the steady-state analysis error covariance matrix. The analytical results are illustrated in two numerical examples where the Kalman filter is carried to steady state. The first example uses the dynamics of a generalized advection equation exhibiting nonmodal transient growth. Failure to observe growing modes leads to increased steady-state analysis error variances. Leading eigenvectors of the steady-state analysis error covariance matrix are well approximated by leading eigenvectors of the bound matrix. The second example uses the dynamics of a damped baroclinic wave model. The leading eigenvectors of a lowest-order approximation of the bound matrix are shown to approximate well the leading eigenvectors of the steady-state analysis error covariance matrix.

  10. Sensitivity and accuracy of high-throughput metabarcoding methods for early detection of invasive fish species

    EPA Science Inventory

    For early detection biomonitoring of aquatic invasive species, sensitivity to rare individuals and accurate, high-resolution taxonomic classification are critical to minimize detection errors. Given the great expense and effort associated with morphological identification of many...

  11. When do latent class models overstate accuracy for diagnostic and other classifiers in the absence of a gold standard?

    PubMed

    Spencer, Bruce D

    2012-06-01

    Latent class models are increasingly used to assess the accuracy of medical diagnostic tests and other classifications when no gold standard is available and the true state is unknown. When the latent class is treated as the true class, the latent class models provide measures of components of accuracy including specificity and sensitivity and their complements, type I and type II error rates. The error rates according to the latent class model differ from the true error rates, however, and empirical comparisons with a gold standard suggest the true error rates often are larger. We investigate conditions under which the true type I and type II error rates are larger than those provided by the latent class models. Results from Uebersax (1988, Psychological Bulletin 104, 405-416) are extended to accommodate random effects and covariates affecting the responses. The results are important for interpreting the results of latent class analyses. An error decomposition is presented that incorporates an error component from invalidity of the latent class model. © 2011, The International Biometric Society.

  12. Fusing metabolomics data sets with heterogeneous measurement errors

    PubMed Central

    Waaijenborg, Sandra; Korobko, Oksana; Willems van Dijk, Ko; Lips, Mirjam; Hankemeier, Thomas; Wilderjans, Tom F.; Smilde, Age K.

    2018-01-01

    Combining different metabolomics platforms can contribute significantly to the discovery of complementary processes expressed under different conditions. However, analysing the fused data might be hampered by the difference in their quality. In metabolomics data, one often observes that measurement errors increase with increasing measurement level and that different platforms have different measurement error variance. In this paper we compare three different approaches to correct for the measurement error heterogeneity, by transformation of the raw data, by weighted filtering before modelling and by a modelling approach using a weighted sum of residuals. For an illustration of these different approaches we analyse data from healthy obese and diabetic obese individuals, obtained from two metabolomics platforms. Concluding, the filtering and modelling approaches that both estimate a model of the measurement error did not outperform the data transformation approaches for this application. This is probably due to the limited difference in measurement error and the fact that estimation of measurement error models is unstable due to the small number of repeats available. A transformation of the data improves the classification of the two groups. PMID:29698490

  13. Noninvasive forward-scattering system for rapid detection, characterization, and identification of Listeria colonies: image processing and data analysis

    NASA Astrophysics Data System (ADS)

    Rajwa, Bartek; Bayraktar, Bulent; Banada, Padmapriya P.; Huff, Karleigh; Bae, Euiwon; Hirleman, E. Daniel; Bhunia, Arun K.; Robinson, J. Paul

    2006-10-01

    Bacterial contamination by Listeria monocytogenes puts the public at risk and is also costly for the food-processing industry. Traditional methods for pathogen identification require complicated sample preparation for reliable results. Previously, we have reported the development of a noninvasive optical forward-scattering system for rapid identification of Listeria colonies grown on solid surfaces. The presented system included the application of computer-vision and pattern-recognition techniques to classify scatter patterns formed by bacterial colonies irradiated with laser light. This report shows an extension of the proposed method. A new scatterometer equipped with a high-resolution CCD chip and the application of two additional sets of image features for classification allow for higher accuracy and lower error rates. Features based on Zernike moments are supplemented by Tchebichef moments and Haralick texture descriptors in the new version of the algorithm. Fisher's criterion has been used for feature selection to decrease the training time of the machine learning systems. An algorithm based on support vector machines was used for classification of the patterns. Low error rates determined by cross-validation, reproducibility of the measurements, and robustness of the system prove that the proposed technology can be implemented in automated devices for detection and classification of pathogenic bacteria.

  14. Correcting evaluation bias of relational classifiers with network cross validation

    DOE PAGES

    Neville, Jennifer; Gallagher, Brian; Eliassi-Rad, Tina; ...

    2011-01-04

    Recently, a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and identically distributed (i.i.d.). These methods specifically exploit the statistical dependencies among instances in order to improve classification accuracy. However, there has been little focus on how these same dependencies affect our ability to draw accurate conclusions about the performance of the models. More specifically, the complex link structure and attribute dependencies in relational data violate the assumptions of many conventional statistical tests and make it difficult to use these tests to assess the models in an unbiased manner. In this work, we examine the task of within-network classification and the question of whether two algorithms will learn models that will result in significantly different levels of performance. We show that the commonly used form of evaluation (paired t-test on overlapping network samples) can result in an unacceptable level of Type I error. Furthermore, we show that Type I error increases as (1) the correlation among instances increases and (2) the size of the evaluation set increases (i.e., the proportion of labeled nodes in the network decreases). Lastly, we propose a method for network cross-validation that, combined with paired t-tests, produces more acceptable levels of Type I error while still providing reasonable levels of statistical power (i.e., 1 - Type II error).

  15. The efficacy of protoporphyrin as a predictive biomarker for lead exposure in canvasback ducks: effect of sample storage time

    USGS Publications Warehouse

    Franson, J.C.; Hohman, W.L.; Moore, J.L.; Smith, M.R.

    1996-01-01

    We used 363 blood samples collected from wild canvasback ducks (Aythya valisineria) at Catahoula Lake, Louisiana, U.S.A. to evaluate the effect of sample storage time on the efficacy of erythrocytic protoporphyrin as an indicator of lead exposure. The protoporphyrin concentration of each sample was determined by hematofluorometry within 5 min of blood collection and after refrigeration at 4 °C for 24 and 48 h. All samples were analyzed for lead by atomic absorption spectrophotometry. Based on a blood lead concentration of ≥0.2 ppm wet weight as positive evidence for lead exposure, the protoporphyrin technique resulted in overall error rates of 29%, 20%, and 19% and false negative error rates of 47%, 29% and 25% when hematofluorometric determinations were made on blood at 5 min, 24 h, and 48 h, respectively. False positive error rates were less than 10% for all three measurement times. The accuracy of the 24-h erythrocytic protoporphyrin classification of blood samples as positive or negative for lead exposure was significantly greater than that of the 5-min classification, but no improvement in accuracy was gained when samples were tested at 48 h. The false negative errors were probably due, at least in part, to the lag time between lead exposure and the increase of blood protoporphyrin concentrations. False negatives resulted in an underestimation of the true number of canvasbacks exposed to lead, indicating that hematofluorometry provides a conservative estimate of lead exposure.

  16. [Tobacco quality analysis of industrial classification of different years using near-infrared (NIR) spectrum].

    PubMed

    Wang, Yi; Xiang, Ma; Wen, Ya-Dong; Yu, Chun-Xia; Wang, Luo-Ping; Zhao, Long-Lian; Li, Jun-Hui

    2012-11-01

    In this study, the tobacco quality of the main industrial classifications from different years was analyzed using spectrum projection and correlation methods. The data were near-infrared (NIR) spectra from Hongta Tobacco (Group) Co., Ltd. In total, 5730 tobacco leaf industrial classification samples from Yuxi in Yunnan province, collected from 2007 to 2010, were measured by near-infrared spectroscopy; the samples came from different parts and colors and all belong to the tobacco variety HONGDA. The results showed that, when samples from the same year were randomly divided into analysis and verification sets at a 2:1 ratio, the verification set corresponded with the analysis set under spectrum projection, with correlation coefficients above 0.98. The correlation coefficients between different years obtained by spectrum projection were above 0.97; the highest was between 2008 and 2009 and the lowest between 2007 and 2010. The study also discusses a method to obtain quantitative similarity values for different industrial classification samples. The similarity and consistency values are instructive for the blending and replacement of tobacco leaf.

  17. Using the EC decision on case definitions for communicable diseases as a terminology source--lessons learned.

    PubMed

    Balkanyi, Laszlo; Heja, Gergely; Nagy, Attlia

    2014-01-01

    Extracting scientifically accurate terminology from an EU public health regulation is part of the knowledge engineering work at the European Centre for Disease Prevention and Control (ECDC). ECDC operates information systems at the crossroads of many areas, posing a challenge for transparency and consistency. Semantic interoperability is based on the Terminology Server (TS). TS value sets (structured vocabularies) describe shared domains such as "diseases", "organisms", "public health terms", "geo-entities", "organizations", "administrative terms" and others. We extracted information from the relevant EC Implementing Decision on case definitions for reporting communicable diseases, which lists 53 notifiable infectious diseases and contains clinical, diagnostic, laboratory and epidemiological criteria. We performed a consistency check and a simplification/abstraction; we represented laboratory criteria as triplets ('y' procedural result / of 'x' organism or substance / on 'z' specimen) and identified negations. The resulting new case definition value set represents the various formalized criteria, while the existing disease value set has been extended and new signs and symptoms were added. New organisms enriched the organism value set. Other new categories, such as transmission modes, substances, specimens and procedures, were added to the public health value set. We identified problem areas, such as (a) some classification errors; (b) inconsistent granularity of conditions; (c) seemingly nonsensical criteria and medical trivialities; (d) possible logical errors; and (e) seemingly factual errors that might be phrasing errors. We think our hypothesis regarding room for possible improvements is valid: there are some open issues, and a further improved legal text might lead to more precise epidemiologic data collection. It should be noted that formal representation for automatic classification of cases was out of scope; such a task would require other formalisms, e.g., those used by rule-based decision support systems.

  18. The 'Soil Cover App' - a new tool for fast determination of dead and living biomass on soil

    NASA Astrophysics Data System (ADS)

    Bauer, Thomas; Strauss, Peter; Riegler-Nurscher, Peter; Prankl, Johann; Prankl, Heinrich

    2017-04-01

    Worldwide, many agricultural practices aim at soil protection strategies using living or dead biomass as soil cover. Especially when management practices focus on soil erosion mitigation, the effectiveness of these practices is directly driven by the amount of soil cover left on the soil surface. Hence there is a need for quick and reliable methods of soil cover estimation, not only for living biomass but particularly for dead biomass (mulch). Available methods for soil cover measurement are either subjective, depending on an educated guess, or time consuming, e.g., when the image is analysed manually at grid points. We therefore developed a mobile application using an algorithm based on entangled forest classification. The final output of the algorithm gives a class label for each pixel of the input image as well as the percentage of each class: living biomass, dead biomass, stones and soil. Our training dataset consisted of more than 250 different images and their annotated class information. Images were taken under a range of environmental conditions, such as lighting, soil coverage between 0% and 100%, and different materials such as living plants, residues, straw material and stones. We compared the results provided by our mobile application with a data set of 180 images that had been manually annotated. A comparison between both methods revealed a regression slope of 0.964 with a coefficient of determination R2 = 0.92, corresponding to an average error of about 4%. While the average error of living plant classification was about 3%, dead residue classification resulted in an 8% error. Thus the new mobile application offers a fast and easy way to obtain information on the protective potential of a particular agricultural management site.
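
    The validation statistics quoted above (regression slope and coefficient of determination between app estimates and manual annotations) can be reproduced with a simple least-squares fit; the cover fractions below are simulated placeholders, not the 180 annotated images.

        import numpy as np

        rng = np.random.default_rng(10)
        manual = rng.uniform(0, 100, 180)              # manually annotated cover (%)
        app = 0.96 * manual + rng.normal(0, 4, 180)    # hypothetical app estimates (%)

        slope, intercept = np.polyfit(manual, app, 1)
        pred = slope * manual + intercept
        r2 = 1 - np.sum((app - pred) ** 2) / np.sum((app - app.mean()) ** 2)
        mae = np.mean(np.abs(app - manual))
        print(f"slope = {slope:.3f}  R2 = {r2:.2f}  mean absolute error = {mae:.1f}%")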

  19. Systematic bias in genomic classification due to contaminating non-neoplastic tissue in breast tumor samples.

    PubMed

    Elloumi, Fathi; Hu, Zhiyuan; Li, Yan; Parker, Joel S; Gulley, Margaret L; Amos, Keith D; Troester, Melissa A

    2011-06-30

    Genomic tests are available to predict breast cancer recurrence and to guide clinical decision making. These predictors provide recurrence risk scores along with a measure of uncertainty, usually a confidence interval. The confidence interval conveys random error and not systematic bias. Standard tumor sampling methods make this problematic, as it is common to have a substantial proportion (typically 30-50%) of a tumor sample comprised of histologically benign tissue. This "normal" tissue could represent a source of non-random error or systematic bias in genomic classification. To assess the sensitivity of genomic classification to systematic error from normal contamination, we collected 55 tumor samples and paired tumor-adjacent normal tissue. Using genomic signatures from the tumor and paired normal, we evaluated how increasing normal contamination altered recurrence risk scores for various genomic predictors. Simulations of normal tissue contamination caused misclassification of tumors by all predictors evaluated, but different breast cancer predictors showed different types of vulnerability to normal tissue bias. While two predictors had an unpredictable direction of bias (either a higher or lower risk of relapse resulted from normal contamination), one signature showed a predictable direction of normal tissue effects. Due to this predictable direction of effect, this signature (the PAM50) was adjusted for normal tissue contamination and these corrections improved sensitivity and negative predictive value. For all three assays, quality control standards and/or appropriate bias adjustment strategies can be used to improve assay reliability. Normal tissue sampled concurrently with tumor is an important source of bias in breast genomic predictors. All genomic predictors show some sensitivity to normal tissue contamination, and ideal strategies for mitigating this bias vary depending upon the particular genes and computational methods used in the predictor.

  20. Text Classification for Assisting Moderators in Online Health Communities

    PubMed Central

    Huh, Jina; Yetisgen-Yildiz, Meliha; Pratt, Wanda

    2013-01-01

    Objectives Patients increasingly visit online health communities to get help on managing their health. The large scale of these online communities makes it impossible for the moderators to engage in all conversations; yet, some conversations need their expertise. Our work explores low-cost text classification methods for this new domain of determining whether a thread in an online health forum needs moderators' help. Methods We employed a binary classifier on WebMD's online diabetes community data. To train the classifier, we considered three feature types: (1) word unigrams, (2) sentiment analysis features, and (3) thread length. We applied feature selection methods based on χ2 statistics and undersampling to account for unbalanced data. We then performed a qualitative error analysis to investigate the appropriateness of the gold standard. Results Using sentiment analysis features, feature selection methods, and balanced training data increased the AUC value up to 0.75 and the F1-score up to 0.54, compared to the baseline of using word unigrams with no feature selection methods on unbalanced data (0.65 AUC and 0.40 F1-score). The error analysis uncovered additional reasons for why moderators respond to patients' posts. Discussion We showed how feature selection methods and balanced training data can improve the overall classification performance. We present implications of weighing precision versus recall for assisting moderators of online health communities. Our error analysis uncovered social, legal, and ethical issues around addressing community members' needs. We also note challenges in producing a gold standard, and discuss potential solutions for addressing these challenges. Conclusion Social media environments provide popular venues in which patients gain health-related information. Our work contributes to understanding scalable solutions for providing moderators' expertise in these large-scale, social media environments. PMID:24025513
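
    A compact sketch of the classification setup described above, word-unigram features with χ2 feature selection, random undersampling of the majority class, and evaluation by AUC and F1, is given below; the toy forum posts, the logistic-regression classifier and the parameter choices are assumptions for illustration, not WebMD data or the authors' exact configuration.

        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.feature_selection import SelectKBest, chi2
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import roc_auc_score, f1_score

        # Placeholder forum threads; 1 = needs a moderator's help
        texts = (["my sugar is 400 and i feel dizzy what should i do"] * 30 +
                 ["sharing my favourite low carb recipe with the group"] * 120)
        y = np.array([1] * 30 + [0] * 120)

        X_tr_txt, X_te_txt, y_tr, y_te = train_test_split(
            texts, y, test_size=0.3, stratify=y, random_state=7)

        # Random undersampling of the majority class in the training split
        rng = np.random.default_rng(7)
        pos = np.where(y_tr == 1)[0]
        neg = rng.choice(np.where(y_tr == 0)[0], size=len(pos), replace=False)
        keep = np.concatenate([pos, neg])
        X_bal = [X_tr_txt[i] for i in keep]
        y_bal = y_tr[keep]

        vec = CountVectorizer()                  # word unigram features
        sel = SelectKBest(chi2, k=10)            # chi-square feature selection
        clf = LogisticRegression(max_iter=1000)

        X_train = sel.fit_transform(vec.fit_transform(X_bal), y_bal)
        clf.fit(X_train, y_bal)

        X_test = sel.transform(vec.transform(X_te_txt))
        print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_test)[:, 1]))
        print("F1 :", f1_score(y_te, clf.predict(X_test)))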

  1. Entropy of space-time outcome in a movement speed-accuracy task.

    PubMed

    Hsieh, Tsung-Yu; Pacheco, Matheus Maia; Newell, Karl M

    2015-12-01

    The experiment reported here was set up to investigate the space-time entropy of movement outcome as a function of a range of spatial (10, 20 and 30 cm) and temporal (250-2500 ms) criteria in a discrete aiming task. The variability and information entropy of the movement spatial and temporal errors, considered separately, respectively increased and decreased on their dimension as a function of increasing movement velocity. However, the joint space-time entropy was lowest when the relative contribution of spatial and temporal task criteria was comparable (i.e., the mid-range of space-time constraints), and it increased with a greater trade-off between spatial or temporal task demands, revealing a U-shaped function across space-time task criteria. The traditional speed-accuracy functions of spatial error and temporal error considered independently map onto this joint space-time U-shaped entropy function. The trade-off in movement tasks with joint space-time criteria is between spatial error and timing error, rather than between movement speed and accuracy. Copyright © 2015 Elsevier B.V. All rights reserved.
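
    The joint space-time entropy referred to above can be estimated from the discretized joint distribution of spatial and temporal errors. A minimal sketch with simulated error samples (the bin count, the simulated condition and the use of Shannon entropy in bits are illustrative assumptions):

        import numpy as np

        def joint_entropy(spatial_err, temporal_err, bins=12):
            """Shannon entropy (bits) of the discretized joint space-time error distribution."""
            counts, _, _ = np.histogram2d(spatial_err, temporal_err, bins=bins)
            p = counts / counts.sum()
            p = p[p > 0]
            return -(p * np.log2(p)).sum()

        # Simulated aiming outcomes (spatial error in mm, temporal error in ms)
        rng = np.random.default_rng(8)
        h = joint_entropy(rng.normal(0, 5, 500), rng.normal(0, 30, 500))
        print("joint space-time entropy (bits):", h)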

  2. A nonlinear model of gold production in Malaysia

    NASA Astrophysics Data System (ADS)

    Ramli, Norashikin; Muda, Nora; Umor, Mohd Rozi

    2014-06-01

    Malaysia is a country rich in natural resources, and one of them is gold. Gold has already become an important national commodity. This study is conducted to determine a model that fits well the gold production in Malaysia from 1995 to 2010. Five nonlinear models are presented in this study: the Logistic, Gompertz, Richard, Weibull and Chapman-Richard models. These models are used to fit the cumulative gold production in Malaysia. The best model is then selected based on the model performance. The performance of the fitted model is measured by the sum of squared errors, root mean square error, coefficient of determination, mean relative error, mean absolute error and mean absolute percentage error. This study finds that the Weibull model significantly outperforms the other models. To confirm that Weibull is the best model, the latest data are fitted to the model. Once again, the Weibull model gives the lowest values for all error measures. We conclude that future gold production in Malaysia can be predicted with the Weibull model, and this could be an important finding for planning Malaysia's economic activities.
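
    A minimal sketch of fitting a Weibull growth curve to cumulative production and computing the error measures listed above; the parameterization y(t) = a(1 - exp(-(t/b)^c)) and the production figures are assumptions, not the paper's data.

    ```python
    # Minimal sketch with hypothetical data and an assumed Weibull parameterization;
    # the paper's exact functional form and data may differ.
    import numpy as np
    from scipy.optimize import curve_fit

    def weibull_growth(t, a, b, c):
        return a * (1.0 - np.exp(-(t / b) ** c))

    t = np.arange(1, 17)  # years 1995-2010 indexed 1..16
    y = 50 * (1 - np.exp(-(t / 8.0) ** 1.7)) + np.random.default_rng(2).normal(0, 0.8, t.size)  # hypothetical cumulative tonnes

    params, _ = curve_fit(weibull_growth, t, y, p0=[60.0, 8.0, 1.5], maxfev=10000)
    y_hat = weibull_growth(t, *params)

    resid = y - y_hat
    sse = np.sum(resid ** 2)
    rmse = np.sqrt(np.mean(resid ** 2))
    mae = np.mean(np.abs(resid))
    mape = np.mean(np.abs(resid / y)) * 100
    r2 = 1 - sse / np.sum((y - y.mean()) ** 2)
    print(f"SSE={sse:.2f} RMSE={rmse:.2f} MAE={mae:.2f} MAPE={mape:.1f}% R2={r2:.3f}")
    ```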

  3. Reliability analysis of the AOSpine thoracolumbar spine injury classification system by a worldwide group of naïve spinal surgeons.

    PubMed

    Kepler, Christopher K; Vaccaro, Alexander R; Koerner, John D; Dvorak, Marcel F; Kandziora, Frank; Rajasekaran, Shanmuganathan; Aarabi, Bizhan; Vialle, Luiz R; Fehlings, Michael G; Schroeder, Gregory D; Reinhold, Maximilian; Schnake, Klaus John; Bellabarba, Carlo; Cumhur Öner, F

    2016-04-01

    The aims of this study were (1) to demonstrate that the AOSpine thoracolumbar spine injury classification system can be reliably applied by an international group of surgeons and (2) to delineate those injury types which are difficult for spine surgeons to classify reliably. A previously described classification system of thoracolumbar injuries, which consists of a morphologic classification of the fracture, a grading system for the neurologic status and relevant patient-specific modifiers, was applied to 25 cases by 100 spinal surgeons from across the world twice independently, in grading sessions 1 month apart. The results were analyzed for classification reliability using the Kappa coefficient (κ). The overall Kappa coefficient for all cases was 0.56, which represents moderate reliability. Kappa values describing interobserver agreement were 0.80 for type A injuries, 0.68 for type B injuries and 0.72 for type C injuries, all representing substantial reliability. The lowest level of agreement for specific subtypes was for fracture subtype A4 (Kappa = 0.19). Intraobserver analysis demonstrated an overall average Kappa statistic for subtype grading of 0.68, also representing substantial reproducibility. In a worldwide sample of spinal surgeons without previous exposure to the recently described AOSpine Thoracolumbar Spine Injury Classification System, we demonstrated moderate interobserver and substantial intraobserver reliability. These results suggest that most spine surgeons can apply this system to spine trauma patients as reliably as, or more reliably than, previously described systems.
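
    A minimal sketch of the intraobserver part of such an analysis: Cohen's kappa between one surgeon's two grading sessions, using hypothetical subtype gradings.

    ```python
    # Minimal sketch (hypothetical gradings): intraobserver agreement between one
    # surgeon's two grading sessions, one month apart, using Cohen's kappa.
    from sklearn.metrics import cohen_kappa_score

    session_1 = ["A1", "A3", "B2", "A4", "C", "A3", "B1", "A4", "A2", "C"]
    session_2 = ["A1", "A3", "B2", "A3", "C", "A3", "B1", "A4", "A2", "C"]

    kappa = cohen_kappa_score(session_1, session_2)
    print(f"Intraobserver kappa = {kappa:.2f}")
    ```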

  4. Computerized Classification of Pneumoconiosis on Digital Chest Radiography Artificial Neural Network with Three Stages.

    PubMed

    Okumura, Eiichiro; Kawashita, Ikuo; Ishida, Takayuki

    2017-08-01

    It is difficult for radiologists to classify pneumoconiosis from category 0 to category 3 on chest radiographs. Therefore, we have developed a computer-aided diagnosis (CAD) system based on a three-stage artificial neural network (ANN) method for classification based on four texture features. The image database consists of 36 chest radiographs classified as category 0 to category 3. Regions of interest (ROIs) with a matrix size of 32 × 32 were selected from chest radiographs. We obtained a gray-level histogram, histogram of gray-level difference, gray-level run-length matrix (GLRLM) feature image, and gray-level co-occurrence matrix (GLCOM) feature image in each ROI. For ROI-based classification, the first ANN was trained with each texture feature. Next, the second ANN was trained with output patterns obtained from the first ANN. Finally, we obtained a case-based classification for distinguishing among four categories with the third ANN method. We determined the performance of the third ANN by receiver operating characteristic (ROC) analysis. The areas under the ROC curve (AUC) of the highest category (severe pneumoconiosis) case and the lowest category (early pneumoconiosis) case were 0.89 ± 0.09 and 0.84 ± 0.12, respectively. The three-stage ANN with four texture features showed the highest performance for classification among the four categories. Our CAD system would be useful for assisting radiologists in classification of pneumoconiosis from category 0 to category 3.
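
    A minimal sketch, not the authors' three-stage ANN, of one plausible pipeline for the same idea: gray-level co-occurrence texture features from 32 x 32 ROIs feeding a single neural network classifier; the ROIs and category labels are synthetic placeholders.

    ```python
    # Minimal sketch, not the authors' three-stage ANN: GLCM texture features from
    # 32x32 ROIs feeding a single MLP classifier. ROIs/labels are synthetic.
    # Note: older scikit-image versions spell the functions greycomatrix/greycoprops.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(3)
    rois = rng.integers(0, 256, size=(40, 32, 32), dtype=np.uint8)  # synthetic ROIs
    categories = rng.integers(0, 4, size=40)                        # category 0-3 labels

    def texture_features(roi):
        glcm = graycomatrix(roi, distances=[1], angles=[0], levels=256,
                            symmetric=True, normed=True)
        return [graycoprops(glcm, p)[0, 0] for p in
                ("contrast", "homogeneity", "energy", "correlation")]

    X = np.array([texture_features(r) for r in rois])
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    clf.fit(X, categories)
    print(clf.predict(X[:5]))
    ```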

  5. Asteroid thermal modeling in the presence of reflected sunlight

    NASA Astrophysics Data System (ADS)

    Myhrvold, Nathan

    2018-03-01

    A new derivation of simple asteroid thermal models is presented, investigating the need to account correctly for Kirchhoff's law of thermal radiation when IR observations contain substantial reflected sunlight. The framework applies to both the NEATM and related thermal models. A new parameterization of these models eliminates the dependence of thermal modeling on visible absolute magnitude H, which is not always available. Monte Carlo simulations are used to assess the potential impact of violating Kirchhoff's law on estimates of physical parameters such as diameter and IR albedo, with an emphasis on NEOWISE results. The NEOWISE papers use ten different models, applied to 12 different combinations of WISE data bands, in 47 different combinations. The most prevalent combinations are simulated, and the accuracy of diameter estimates is found to depend critically on the model and data band combination. In the best case, full thermal modeling of all four bands with an idealized model gives a 1σ (68.27%) confidence interval of -5% to +6%, but this combination accounts for just 1.9% of NEOWISE results. Other combinations, representing 42% of the NEOWISE results, have about twice the CI at -10% to +12%, before accounting for errors due to irregular shape or other real-world effects that are not simulated. The model and data band combinations found for the majority of NEOWISE results have much larger systematic and random errors. Kirchhoff's law violation by NEOWISE models leads to errors in estimation accuracy that are strongest for asteroids with W1, W2 band emissivity ɛ12 in both the lowest (0.605 ≤ ɛ12 ≤ 0.780) and highest (0.969 ≤ ɛ12 ≤ 0.988) deciles, corresponding to the highest and lowest deciles of near-IR albedo pIR. Systematic accuracy error between deciles ranges from a low of 5% to as much as 45%, and there are also differences in the random errors. Kirchhoff's law effects also produce large errors in NEOWISE estimates of pIR, particularly for high values. IR observations of asteroids in bands that have substantial reflected sunlight can largely avoid these problems by adopting the Kirchhoff law compliant modeling framework presented here, which is conceptually straightforward and comes without computational cost.

  6. Estimating Population Cause-Specific Mortality Fractions from in-Hospital Mortality: Validation of a New Method

    PubMed Central

    Murray, Christopher J. L; Lopez, Alan D; Barofsky, Jeremy T; Bryson-Cahn, Chloe; Lozano, Rafael

    2007-01-01

    Background Cause-of-death data for many developing countries are not available. Information on deaths in hospital by cause is available in many low- and middle-income countries but is not a representative sample of deaths in the population. We propose a method to estimate population cause-specific mortality fractions (CSMFs) using data already collected in many middle-income and some low-income developing nations, yet rarely used: in-hospital death records. Methods and Findings For a given cause of death, a community's hospital deaths are equal to total community deaths multiplied by the proportion of deaths occurring in hospital. If we can estimate the proportion dying in hospital, we can estimate the proportion dying in the population using deaths in hospital. We propose to estimate the proportion of deaths for an age, sex, and cause group that die in hospital from the subset of the population where vital registration systems function or from another population. We evaluated our method using nearly complete vital registration (VR) data from Mexico 1998–2005, which records whether a death occurred in a hospital. In this validation test, we used 45 disease categories. We validated our method in two ways: nationally and between communities. First, we investigated how the method's accuracy changes as we decrease the amount of Mexican VR used to estimate the proportion of each age, sex, and cause group dying in hospital. Decreasing VR data used for this first step from 100% to 9% produces only a 12% maximum relative error between estimated and true CSMFs. Even if Mexico collected full VR information only in its capital city with 9% of its population, our estimation method would produce an average relative error in CSMFs across the 45 causes of just over 10%. Second, we used VR data for the capital zone (Distrito Federal and Estado de Mexico) and estimated CSMFs for the three lowest-development states. Our estimation method gave an average relative error of 20%, 23%, and 31% for Guerrero, Chiapas, and Oaxaca, respectively. Conclusions Where accurate International Classification of Diseases (ICD)-coded cause-of-death data are available for deaths in hospital and for VR covering a subset of the population, we demonstrated that population CSMFs can be estimated with low average error. In addition, we showed in the case of Mexico that this method can substantially reduce error from biased hospital data, even when applied to areas with widely different levels of development. For countries with ICD-coded deaths in hospital, this method potentially allows the use of existing data to inform health policy. PMID:18031195
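
    A minimal sketch of the core estimation step stated above, with hypothetical counts and proportions: population deaths per cause are approximated as hospital deaths divided by the estimated proportion of that group's deaths occurring in hospital, and the results are normalized to CSMFs.

    ```python
    # Minimal sketch of the estimation idea, with hypothetical numbers: population
    # deaths per cause ~ hospital deaths / proportion dying in hospital (from a VR
    # subset); CSMFs are the normalized result.
    import numpy as np

    causes = ["ischemic heart disease", "road traffic injuries", "diarrheal diseases"]
    hospital_deaths = np.array([1200.0, 300.0, 150.0])   # hypothetical counts
    p_in_hospital = np.array([0.60, 0.25, 0.15])         # hypothetical, estimated from VR subset

    estimated_population_deaths = hospital_deaths / p_in_hospital
    csmf = estimated_population_deaths / estimated_population_deaths.sum()

    for cause, fraction in zip(causes, csmf):
        print(f"{cause}: {fraction:.1%}")
    ```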

  7. Phase History Decomposition for efficient Scatterer Classification in SAR Imagery

    DTIC Science & Technology

    2011-09-15


  8. Prevalence of periodontitis according to Hispanic or Latino background among study participants of the Hispanic Community Health Study/Study of Latinos.

    PubMed

    Jiménez, Monik C; Sanders, Anne E; Mauriello, Sally M; Kaste, Linda M; Beck, James D

    2014-08-01

    Hispanics and Latinos are an ethnically heterogeneous population with distinct oral health risk profiles. Few study investigators have examined potential variation in the burden of periodontitis according to Hispanic or Latino background. The authors used a multicenter longitudinal population-based cohort study to examine the periodontal health status at screening (2008-2011) of 14,006 Hispanic and Latino adults, aged 18 to 74 years, from four U.S. communities who self-identified as Cuban, Dominican, Mexican, Puerto Rican, Central American or South American. The authors present weighted, age-standardized prevalence estimates and corrected standard errors of probing depth (PD), attachment loss (AL) and periodontitis classified according to the case definition established by the Centers for Disease Control and Prevention and the American Academy of Periodontology (CDC-AAP). The authors used a Wald χ2 test to compare prevalence estimates across Hispanic or Latino background, age and sex. Fifty-one percent of all participants exhibited total periodontitis (mild, moderate or severe) per the CDC-AAP classification. Cubans and Central Americans exhibited the highest prevalence of moderate periodontitis (39.9 percent and 37.2 percent, respectively). Across all ages, Mexicans had the highest prevalence of PD across severity thresholds. Among those aged 18 through 44 years, Dominicans consistently had the lowest prevalence of AL at all severity thresholds. Measures of periodontitis varied significantly by age, sex and Hispanic or Latino background among the four sampled Hispanic Community Health Study/Study of Latinos communities. Further analyses are needed to account for lifestyle, behavioral, demographic and social factors, including those related to acculturation. Aggregating Hispanics and Latinos or using estimates from Mexicans may lead to substantial underestimation or overestimation of the burden of disease, thus leading to errors in the estimation of needed clinical and public health resources. This information will be useful in informing decisions from public health planning to patient-centered risk assessment.

  9. Short-Term Global Horizontal Irradiance Forecasting Based on Sky Imaging and Pattern Recognition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hodge, Brian S; Feng, Cong; Cui, Mingjian

    Accurate short-term forecasting is crucial for solar integration in the power grid. In this paper, a classification forecasting framework based on pattern recognition is developed for 1-hour-ahead global horizontal irradiance (GHI) forecasting. Three sets of models in the forecasting framework are trained by the data partitioned from the preprocessing analysis. The first two sets of models forecast GHI for the first four daylight hours of each day. Then the GHI values in the remaining hours are forecasted by an optimal machine learning model determined based on a weather pattern classification model in the third model set. The weather pattern is determined by a support vector machine (SVM) classifier. The developed framework is validated by the GHI and sky imaging data from the National Renewable Energy Laboratory (NREL). Results show that the developed short-term forecasting framework outperforms the persistence benchmark by 16% in terms of the normalized mean absolute error and 25% in terms of the normalized root mean square error.
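
    A minimal sketch, with synthetic data and assumed feature/model choices, of the routing idea described above: an SVM weather-pattern classifier selects a per-pattern forecast model, and forecasts are scored with normalized MAE and RMSE.

    ```python
    # Minimal sketch (synthetic data, assumed features/models): an SVM weather-pattern
    # classifier routes each hour to a per-pattern GHI regression model; forecasts are
    # scored with normalized MAE and RMSE.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(4)
    sky_features = rng.normal(size=(300, 5))           # e.g. sky-image statistics (hypothetical)
    weather_pattern = rng.integers(0, 3, size=300)     # clear / partly cloudy / overcast
    lagged_ghi = rng.uniform(0, 1000, size=(300, 3))   # previous-hour GHI values (W/m^2)
    ghi_next_hour = lagged_ghi[:, 0] * 0.9 + rng.normal(0, 40, 300)

    pattern_clf = SVC().fit(sky_features, weather_pattern)
    per_pattern_model = {
        k: LinearRegression().fit(lagged_ghi[weather_pattern == k],
                                  ghi_next_hour[weather_pattern == k])
        for k in np.unique(weather_pattern)
    }

    predicted_pattern = pattern_clf.predict(sky_features)
    forecast = np.array([per_pattern_model[k].predict(x[None, :])[0]
                         for k, x in zip(predicted_pattern, lagged_ghi)])

    capacity = ghi_next_hour.max()                     # normalization choice is an assumption
    nmae = np.mean(np.abs(forecast - ghi_next_hour)) / capacity
    nrmse = np.sqrt(np.mean((forecast - ghi_next_hour) ** 2)) / capacity
    print(f"nMAE={nmae:.3f}  nRMSE={nrmse:.3f}")
    ```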

  10. Quantitative CT based radiomics as predictor of resectability of pancreatic adenocarcinoma

    NASA Astrophysics Data System (ADS)

    van der Putten, Joost; Zinger, Svitlana; van der Sommen, Fons; de With, Peter H. N.; Prokop, Mathias; Hermans, John

    2018-02-01

    In current clinical practice, the resectability of pancreatic ductal adenocarcinoma (PDA) is determined subjectively by a physician, which is an error-prone procedure. In this paper, we present a method for automated determination of resectability of PDA from a routine abdominal CT, to reduce such decision errors. The tumor features are extracted from a group of patients with both hypo- and iso-attenuating tumors, of which 29 were resectable and 21 were not. The tumor contours are supplied by a medical expert. We present an approach that uses intensity, shape, and texture features to determine tumor resectability. The best classification results are obtained with fine Gaussian SVM and the L0 Feature Selection algorithms. Compared to expert predictions made on the same dataset, our method achieves better classification results. We obtain significantly better results on correctly predicting non-resectability (+17%) compared to an expert, which is essential for patient treatment (negative prediction value). Moreover, our predictions of resectability exceed expert predictions by approximately 3% (positive prediction value).

  11. Performance Analysis of Classification Methods for Indoor Localization in Vlc Networks

    NASA Astrophysics Data System (ADS)

    Sánchez-Rodríguez, D.; Alonso-González, I.; Sánchez-Medina, J.; Ley-Bosch, C.; Díaz-Vilariño, L.

    2017-09-01

    Indoor localization has gained considerable attention over the past decade because of the emergence of numerous location-aware services. Research works have proposed solving this problem by using wireless networks. Nevertheless, there is still much room for improvement in the quality of the proposed classification models. In recent years, the emergence of Visible Light Communication (VLC) brings a brand new approach to high quality indoor positioning. Among its advantages, this new technology is immune to electromagnetic interference and has the advantage of having a smaller variance of received signal power compared to RF based technologies. In this paper, a performance analysis of seventeen machine learning classifiers for indoor localization in VLC networks is carried out. The analysis is accomplished in terms of accuracy, average distance error, computational cost, training size, precision and recall measurements. Results show that most of the classifiers achieve an accuracy above 90%. The best tested classifier yielded a 99.0% accuracy, with an average error distance of 0.3 centimetres.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Law, P.R.

    In Matsushita (J. Math. Phys. 22, 979-982 (1981); ibid. 24, 36-40 (1983)), an analog of the Petrov classification for curvature endomorphisms of the pseudo-Euclidean space R^(2,2) is provided as a basis for applications to neutral Einstein metrics on compact, orientable, four-dimensional manifolds. This paper points out flaws in Matsushita's classification and, moreover, that an error in Chern's Gauss-Bonnet formula for pseudo-Riemannian geometry ("Pseudo-Riemannian geometry and the Gauss-Bonnet formula," Acad. Brasileira Ciencias 35, 17-26 (1963) and Shiing-Shen Chern: Selected Papers (Springer-Verlag, New York, 1978)) was incorporated in Matsushita's subsequent analysis. A self-contained account of the subject of the title is presented to correct these errors, including a discussion of the validity of an appropriate analog of the Thorpe-Hitchin inequality of the Riemannian case. When the inequality obtains in the neutral case, the Euler characteristic is nonpositive, in contradistinction to Matsushita's deductions.

  13. A non-contact method based on multiple signal classification algorithm to reduce the measurement time for accurately heart rate detection

    NASA Astrophysics Data System (ADS)

    Bechet, P.; Mitran, R.; Munteanu, M.

    2013-08-01

    Non-contact methods for the assessment of vital signs are of great interest for specialists due to the benefits obtained in both medical and special applications, such as those for surveillance, monitoring, and search and rescue. This paper investigates the possibility of implementing a digital processing algorithm based on the MUSIC (Multiple Signal Classification) parametric spectral estimation in order to reduce the observation time needed to accurately measure the heart rate. It demonstrates that, by properly dimensioning the signal subspace, the MUSIC algorithm can be optimized in order to accurately assess the heart rate during an 8-28 s time interval. The validation of the processing algorithm performance was achieved by minimizing the mean error of the heart rate after performing simultaneous comparative measurements on several subjects. In order to calculate the error, the reference value of the heart rate was measured using a classic measurement system through direct contact.

  14. Evaluating structural pattern recognition for handwritten math via primitive label graphs

    NASA Astrophysics Data System (ADS)

    Zanibbi, Richard; Mouchère, Harold; Viard-Gaudin, Christian

    2013-01-01

    Currently, structural pattern recognizer evaluations compare graphs of detected structure to target structures (i.e. ground truth) using recognition rates, recall and precision for object segmentation, classification and relationships. In document recognition, these target objects (e.g. symbols) are frequently comprised of multiple primitives (e.g. connected components, or strokes for online handwritten data), but current metrics do not characterize errors at the primitive level, from which object-level structure is obtained. Primitive label graphs are directed graphs defined over primitives and primitive pairs. We define new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation and parsing errors to be characterized separately, or using a single measure. Recall and precision for detected objects may also be computed directly from label graphs. We illustrate the new metrics by comparing a new primitive-level evaluation to the symbol-level evaluation performed for the CROHME 2012 handwritten math recognition competition. A Python-based set of utilities for evaluating, visualizing and translating label graphs is publicly available.
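
    A minimal sketch of the label-graph idea: labels over primitives and primitive pairs are compared directly, so symbol-classification and relationship/segmentation disagreements can be counted separately or summed into one Hamming-style distance; the strokes and labels below are hypothetical.

    ```python
    # Minimal sketch of the metric idea: each interpretation labels primitives (nodes)
    # and primitive pairs (edges); node disagreements capture classification errors,
    # edge disagreements capture segmentation/relationship errors, and their sum gives
    # a single Hamming-style distance. Strokes and labels below are hypothetical.
    def label_graph_distance(nodes_a, edges_a, nodes_b, edges_b):
        node_errors = sum(nodes_a[p] != nodes_b[p] for p in nodes_a)
        all_pairs = set(edges_a) | set(edges_b)
        edge_errors = sum(edges_a.get(pair, "none") != edges_b.get(pair, "none")
                          for pair in all_pairs)
        return node_errors, edge_errors, node_errors + edge_errors

    # Two strokes forming "x^2": ground truth vs. a recognizer output.
    truth_nodes = {"s1": "x", "s2": "2"}
    truth_edges = {("s1", "s2"): "superscript"}
    recog_nodes = {"s1": "x", "s2": "z"}                 # symbol misclassified
    recog_edges = {("s1", "s2"): "right"}                # relationship mislabeled

    print(label_graph_distance(truth_nodes, truth_edges, recog_nodes, recog_edges))
    ```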

  15. Compensating for the effects of site and equipment variation on delphinid species identification from their echolocation clicks.

    PubMed

    Roch, Marie A; Stinner-Sloan, Johanna; Baumann-Pickering, Simone; Wiggins, Sean M

    2015-01-01

    A concern for applications of machine learning techniques to bioacoustics is whether or not classifiers learn the categories for which they were trained. Unfortunately, information such as characteristics of specific recording equipment or noise environments can also be learned. This question is examined in the context of identifying delphinid species by their echolocation clicks. To reduce the ambiguity between species classification performance and other confounding factors, species whose clicks can be readily distinguished were used in this study: Pacific white-sided and Risso's dolphins. A subset of data from autonomous acoustic recorders located at seven sites in the Southern California Bight collected between 2006 and 2012 was selected. Cepstral-based features were extracted for each echolocation click and Gaussian mixture models were used to classify groups of 100 clicks. One hundred Monte-Carlo three-fold experiments were conducted to examine classification performance where fold composition was determined by acoustic encounter, recorder characteristics, or recording site. The error rate increased from 6.1% when grouped by acoustic encounter to 18.1%, 46.2%, and 33.2% for grouping by equipment, equipment category, and site, respectively. A noise compensation technique reduced error for these grouping schemes to 2.7%, 4.4%, 6.7%, and 11.4%, respectively, a reduction in error rate of 56%-86%.
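
    A minimal sketch, on synthetic cepstral features, of the classification scheme described above: one Gaussian mixture model per species, with a group of 100 clicks assigned to the species giving the larger summed log-likelihood.

    ```python
    # Minimal sketch (synthetic cepstral features): one GMM per species; groups of 100
    # clicks are classified by summed log-likelihood, in the spirit of the approach above.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(5)
    pwsd_clicks = rng.normal(loc=0.0, scale=1.0, size=(2000, 12))    # Pacific white-sided (synthetic)
    risso_clicks = rng.normal(loc=0.8, scale=1.0, size=(2000, 12))   # Risso's (synthetic)

    gmm_pwsd = GaussianMixture(n_components=8, random_state=0).fit(pwsd_clicks)
    gmm_risso = GaussianMixture(n_components=8, random_state=0).fit(risso_clicks)

    def classify_group(clicks):
        # Sum per-click log-likelihoods over the group and pick the larger total.
        ll_pwsd = gmm_pwsd.score_samples(clicks).sum()
        ll_risso = gmm_risso.score_samples(clicks).sum()
        return "Pacific white-sided" if ll_pwsd > ll_risso else "Risso's"

    test_group = rng.normal(loc=0.8, scale=1.0, size=(100, 12))      # 100 clicks, synthetic Risso's
    print(classify_group(test_group))
    ```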

  16. Review of Significant Incidents and Close Calls in Human Spaceflight from a Human Factors Perspective

    NASA Technical Reports Server (NTRS)

    Silva-Martinez, Jackelynne; Ellenberger, Richard; Dory, Jonathan

    2017-01-01

    This project aims to identify poor human factors design decisions that led to error-prone systems, or that did not facilitate the flight crew in making the right choices, and to verify that NASA is effectively preventing similar incidents from occurring again. This analysis was performed by reviewing significant incidents and close calls in human spaceflight identified by the NASA Johnson Space Center Safety and Mission Assurance Flight Safety Office. The review of incidents shows whether the identified human errors were due to the operational phase (flight crew and ground control) or whether they originated in the design phase (including manufacturing and test). This classification was performed with the aid of the NASA Human Systems Integration domains. This in-depth analysis resulted in a tool that helps with the human factors classification of significant incidents and close calls in human spaceflight, which can be used to identify human errors at the operational level, and how they were or should be minimized. Current governing documents on human systems integration for both government and commercial crew were reviewed to see if current requirements, processes, training, and standard operating procedures protect the crew and ground control against these issues occurring in the future. Based on the findings, recommendations to target those areas are provided.

  17. Routes to failure: analysis of 41 civil aviation accidents from the Republic of China using the human factors analysis and classification system.

    PubMed

    Li, Wen-Chin; Harris, Don; Yu, Chung-San

    2008-03-01

    The human factors analysis and classification system (HFACS) is based upon Reason's organizational model of human error. HFACS was developed as an analytical framework for the investigation of the role of human error in aviation accidents; however, there is little empirical work formally describing the relationship between the components in the model. This research analyses 41 civil aviation accidents involving aircraft registered in the Republic of China (ROC) between 1999 and 2006 using the HFACS framework. The results show statistically significant relationships between errors at the operational level and organizational inadequacies at both the immediately adjacent level (preconditions for unsafe acts) and higher levels in the organization (unsafe supervision and organizational influences). The pattern of the 'routes to failure' observed in the data from this analysis of civil aircraft accidents shows great similarities to that observed in the analysis of military accidents. This research lends further support to Reason's model that suggests that active failures are promoted by latent conditions in the organization. Statistical relationships linking fallible decisions in upper management levels were found to directly affect supervisory practices, thereby creating the psychological preconditions for unsafe acts and hence indirectly impairing the performance of pilots, ultimately leading to accidents.

  18. An evaluation of the underlying mechanisms of bloodstain pattern analysis error.

    PubMed

    Behrooz, Nima; Hulse-Smith, Lee; Chandra, Sanjeev

    2011-09-01

    An experiment was designed to explore the underlying mechanisms of blood disintegration and its subsequent effect on area of origin (AO) calculations. Blood spatter patterns were created through the controlled application of pressurized air (20-80 kPa) for 0.1 msec onto suspended blood droplets (2.7-3.2 mm diameter). The resulting disintegration process was captured using high-speed photography. Straight-line triangulation resulted in a 50% height overestimation, whereas using the lowest calculated height for each spatter pattern reduced this error to 8%. Incorporation of projectile motion resulted in a 28% height underestimation. The AO xy-coordinate was found to be very accurate with a maximum offset of only 4 mm, while AO size calculations were found to be two- to fivefold greater than expected. Subsequently, reverse triangulation analysis revealed the rotational offset for 26% of stains could not be attributed to measurement error, suggesting that some portion of error is inherent in the disintegration process. © 2011 American Academy of Forensic Sciences.

  19. Multipolar Electrostatic Energy Prediction for all 20 Natural Amino Acids Using Kriging Machine Learning.

    PubMed

    Fletcher, Timothy L; Popelier, Paul L A

    2016-06-14

    A machine learning method called kriging is applied to the set of all 20 naturally occurring amino acids. Kriging models are built that predict electrostatic multipole moments for all topological atoms in any amino acid based on molecular geometry only. These models then predict molecular electrostatic interaction energies. On the basis of 200 unseen test geometries for each amino acid, no amino acid shows a mean prediction error above 5.3 kJ mol-1, while the lowest error observed is 2.8 kJ mol-1. The mean error across the entire set is only 4.2 kJ mol-1 (or 1 kcal mol-1). Charged systems are created by protonating or deprotonating selected amino acids, and these show no significant deviation in prediction error over their neutral counterparts. Similarly, the proposed methodology can also handle amino acids with aromatic side chains, without the need for modification. Thus, we present a generic method capable of accurately capturing multipolar polarizable electrostatics in amino acids.
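
    Kriging is closely related to Gaussian process regression, so a minimal sketch of the idea can use a GP with an RBF kernel to map geometric features to a single multipole moment; all features and targets below are synthetic placeholders, not the paper's training data.

    ```python
    # Minimal sketch: Gaussian process regression (closely related to kriging) mapping
    # geometric descriptors to one multipole moment. Data are synthetic placeholders.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(6)
    geometry_features = rng.normal(size=(300, 10))        # e.g. internal coordinates (synthetic)
    atomic_charge = geometry_features[:, 0] * 0.2 + rng.normal(0, 0.01, 300)  # one moment (synthetic)

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-4),
                                  normalize_y=True)
    gp.fit(geometry_features[:250], atomic_charge[:250])

    pred, std = gp.predict(geometry_features[250:], return_std=True)
    mae = np.mean(np.abs(pred - atomic_charge[250:]))
    print(f"Mean absolute prediction error on held-out geometries: {mae:.4f}")
    ```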

  20. Comparing Methods to Assess Intraobserver Measurement Error of 3D Craniofacial Landmarks Using Geometric Morphometrics Through a Digitizer Arm.

    PubMed

    Menéndez, Lumila Paula

    2017-05-01

    Intraobserver error (INTRA-OE) is the difference between repeated measurements of the same variable made by the same observer. The objective of this work was to evaluate INTRA-OE from 3D landmarks registered with a Microscribe, in different datasets: (A) the 3D coordinates, (B) linear measurements calculated from A, and (C) the first six principal component axes. INTRA-OE was analyzed by digitizing 42 landmarks from 23 skulls in three sessions, each two weeks apart. Systematic error was tested through repeated measures ANOVA (ANOVA-RM), while random error was tested through the intraclass correlation coefficient. Results showed that the largest differences between the three observations were found in the first dataset. Some anatomical points like nasion, ectoconchion, temporosphenoparietal, asterion, and temporomandibular presented the highest INTRA-OE. In the second dataset, local distances had higher INTRA-OE than global distances, while the third dataset showed the lowest INTRA-OE. © 2016 American Academy of Forensic Sciences.

  1. Teamwork and clinical error reporting among nurses in Korean hospitals.

    PubMed

    Hwang, Jee-In; Ahn, Jeonghoon

    2015-03-01

    To examine levels of teamwork and its relationships with clinical error reporting among Korean hospital nurses. The study employed a cross-sectional survey design. We distributed a questionnaire to 674 nurses in two teaching hospitals in Korea. The questionnaire included items on teamwork and the reporting of clinical errors. We measured teamwork using the Teamwork Perceptions Questionnaire, which has five subscales including team structure, leadership, situation monitoring, mutual support, and communication. Using logistic regression analysis, we determined the relationships between teamwork and error reporting. The response rate was 85.5%. The mean score of teamwork was 3.5 out of 5. At the subscale level, mutual support was rated highest, while leadership was rated lowest. Of the participating nurses, 522 responded that they had experienced at least one clinical error in the last 6 months. Among those, only 53.0% responded that they always or usually reported clinical errors to their managers and/or the patient safety department. Teamwork was significantly associated with better error reporting. Specifically, nurses with a higher team communication score were more likely to report clinical errors to their managers and the patient safety department (odds ratio = 1.82, 95% confidence intervals [1.05, 3.14]). Teamwork was rated as moderate and was positively associated with nurses' error reporting performance. Hospital executives and nurse managers should make substantial efforts to enhance teamwork, which will contribute to encouraging the reporting of errors and improving patient safety. Copyright © 2015. Published by Elsevier B.V.

  2. Typing mineral deposits using their grades and tonnages in an artificial neural network

    USGS Publications Warehouse

    Singer, Donald A.; Kouda, Ryoichi

    2003-01-01

    A test of the ability of a probabilistic neural network to classify deposits into types on the basis of deposit tonnage and average Cu, Mo, Ag, Au, Zn, and Pb grades is conducted. The purpose is to examine whether this type of system might serve as a basis for integrating geoscience information available in large mineral databases to classify sites by deposit type. Benefits of proper classification of many sites in large regions are relatively rapid identification of terranes permissive for deposit types and recognition of specific sites perhaps worthy of exploring further. Total tonnages and average grades of 1,137 well-explored deposits identified in published grade and tonnage models representing 13 deposit types were used to train and test the network. Tonnages were transformed by logarithms and grades by square roots to reduce effects of skewness. All values were scaled by subtracting the variable's mean and dividing by its standard deviation. Half of the deposits were selected randomly to be used in training the probabilistic neural network and the other half were used for independent testing. Tests were performed with a probabilistic neural network employing a Gaussian kernel and separate sigma weights for each class (type) and each variable (grade or tonnage). Deposit types were selected to challenge the neural network. For many types, tonnages or average grades are significantly different from other types, but individual deposits may plot in the grade and tonnage space of more than one type. Porphyry Cu, porphyry Cu-Au, and porphyry Cu-Mo types have similar tonnages and relatively small differences in grades. Redbed Cu deposits typically have tonnages that could be confused with porphyry Cu deposits, and they also contain Cu and, in some situations, Ag. Cyprus and kuroko massive sulfide types have about the same tonnages and Cu, Zn, Ag, and Au grades. Polymetallic vein, sedimentary exhalative Zn-Pb, and Zn-Pb skarn types contain many of the same metals. Sediment-hosted Au, Comstock Au-Ag, and low-sulfide Au-quartz vein types are principally Au deposits with differing amounts of Ag. Given the intent to test the neural network under the most difficult conditions, an overall 75% agreement between the experts and the neural network is considered excellent. Among the largest classification errors are skarn Zn-Pb and Cyprus massive sulfide deposits classed by the neural network as kuroko massive sulfides (24% and 63% error, respectively). Other large errors are the classification of 92% of porphyry Cu-Mo as porphyry Cu deposits. Most of the larger classification errors involve 25 or fewer training deposits, suggesting that some errors might be the result of small sample size. About 91% of the gold deposit types were classed properly and 98% of porphyry Cu deposits were classed as some type of porphyry Cu deposit. An experienced economic geologist would not make many of the classification errors that were made by the neural network because the geologic settings of deposits would be used to reduce errors. In a separate test, the probabilistic neural network correctly classed 93% of 336 deposits in eight deposit types when trained with presence or absence of 58 minerals and six generalized rock types. The overall success rate of the probabilistic neural network when trained on tonnage and average grades would probably be more than 90% with additional information on the presence of a few rock types.
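
    A minimal sketch of a probabilistic neural network with a Gaussian kernel and separate sigma widths per class and per variable, in the spirit described above; the training data stand in for log-transformed tonnages and square-root-transformed grades and are synthetic.

    ```python
    # Minimal sketch of a probabilistic neural network: Gaussian kernel density per
    # class with per-class, per-variable sigma widths. Data are synthetic stand-ins
    # for transformed tonnage and grade variables.
    import numpy as np

    def pnn_predict(x, train_X, train_y, sigmas):
        """Classify x by the class whose Gaussian-kernel density estimate is largest."""
        scores = {}
        for cls in np.unique(train_y):
            Xc = train_X[train_y == cls]
            s = sigmas[cls]                              # per-class, per-variable widths
            z = (x - Xc) / s                             # broadcast over training points
            scores[cls] = np.mean(np.exp(-0.5 * np.sum(z ** 2, axis=1)))
        return max(scores, key=scores.get)

    rng = np.random.default_rng(7)
    X = np.vstack([rng.normal(0, 1, (50, 7)), rng.normal(1.5, 1, (50, 7))])
    y = np.array([0] * 50 + [1] * 50)                    # two deposit types (synthetic)
    sigmas = {0: np.full(7, 0.8), 1: np.full(7, 0.8)}    # would normally be tuned

    print(pnn_predict(rng.normal(1.4, 1, 7), X, y, sigmas))
    ```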

  3. A novel evaluation of two related and two independent algorithms for eye movement classification during reading.

    PubMed

    Friedman, Lee; Rigas, Ioannis; Abdulin, Evgeny; Komogortsev, Oleg V

    2018-05-15

    Nyström and Holmqvist have published a method for the classification of eye movements during reading (ONH) (Nyström & Holmqvist, 2010). When we applied this algorithm to our data, the results were not satisfactory, so we modified the algorithm (now the MNH) to better classify our data. The changes included: (1) reducing the amount of signal filtering, (2) excluding a new type of noise, (3) removing several adaptive thresholds and replacing them with fixed thresholds, (4) changing the way that the start and end of each saccade was determined, (5) employing a new algorithm for detecting PSOs, and (6) allowing a fixation period to either begin or end with noise. A new method for the evaluation of classification algorithms is presented. It was designed to provide comprehensive feedback to an algorithm developer, in a time-efficient manner, about the types and numbers of classification errors that an algorithm produces. This evaluation was conducted by three expert raters independently, across 20 randomly chosen recordings, each classified by both algorithms. The MNH made many fewer errors in determining when saccades start and end, and it also detected some fixations and saccades that the ONH did not. The MNH fails to detect very small saccades. We also evaluated two additional algorithms: the EyeLink Parser and a more current, machine-learning-based algorithm. The EyeLink Parser tended to find more saccades that ended too early than did the other methods, and we found numerous problems with the output of the machine-learning-based algorithm.

  4. Organic Chemical Attribution Signatures for the Sourcing of a Mustard Agent and Its Starting Materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fraga, Carlos G.; Bronk, Krys; Dockendorff, Brian P.

    Chemical attribution signatures (CAS) are being investigated for the sourcing of chemical warfare (CW) agents and their starting materials that may be implicated in chemical attacks or CW proliferation. The work reported here demonstrates for the first time trace impurities produced during the synthesis of tris(2-chloroethyl)amine (HN3) that point to specific reagent stocks used in the synthesis of this CW agent. Thirty batches of HN3 were synthesized using different combinations of commercial stocks of triethanolamine (TEA), thionyl chloride, chloroform, and acetone. The HN3 batches and reagent stocks were then analyzed for impurities by gas chromatography/mass spectrometry. Reaction-produced impurities indicative of specific TEA and chloroform stocks were exclusively discovered in HN3 batches made with those reagent stocks. In addition, some reagent impurities were found in the HN3 batches that were presumably not altered during synthesis and believed to be indicative of reagent type regardless of stock. Supervised classification using partial least squares discriminant analysis (PLSDA) on the impurity profiles of chloroform samples from seven stocks resulted in an average classification error by cross-validation of 2.4%. A classification error of zero was obtained using the seven-stock PLSDA model on a validation set of samples from an arbitrarily selected chloroform stock. In a separate analysis, all samples from two of seven chloroform stocks that were purposely not modeled had their samples matched to a chloroform stock rather than assigned a “no class” classification.

  5. Detecting single-trial EEG evoked potential using a wavelet domain linear mixed model: application to error potentials classification.

    PubMed

    Spinnato, J; Roubaud, M-C; Burle, B; Torrésani, B

    2015-06-01

    The main goal of this work is to develop a model for multisensor signals, such as magnetoencephalography or electroencephalography (EEG) signals, that accounts for inter-trial variability and is suitable for corresponding binary classification problems. An important constraint is that the model be simple enough to handle small size and unbalanced datasets, as often encountered in BCI-type experiments. The method involves the linear mixed effects statistical model, wavelet transform, and spatial filtering, and aims at the characterization of localized discriminant features in multisensor signals. After discrete wavelet transform and spatial filtering, a projection onto the relevant wavelet and spatial channels subspaces is used for dimension reduction. The projected signals are then decomposed as the sum of a signal of interest (i.e., discriminant) and background noise, using a very simple Gaussian linear mixed model. Thanks to the simplicity of the model, the corresponding parameter estimation problem is simplified. Robust estimates of class-covariance matrices are obtained from small sample sizes and an effective Bayes plug-in classifier is derived. The approach is applied to the detection of error potentials in multichannel EEG data in a very unbalanced situation (detection of rare events). Classification results prove the relevance of the proposed approach in such a context. The combination of the linear mixed model, wavelet transform and spatial filtering for EEG classification is, to the best of our knowledge, an original approach, which is proven to be effective. This paper improves upon earlier results on similar problems, and the three main ingredients all play an important role.
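
    A minimal sketch, not the authors' full mixed-model method: discrete wavelet coefficients per epoch followed by a shrinkage-regularized linear discriminant classifier, reflecting the wavelet-plus-linear-model spirit of the approach; the EEG epochs and labels are synthetic.

    ```python
    # Minimal sketch (synthetic single-channel epochs), not the authors' full method:
    # discrete wavelet features per epoch + shrinkage-regularized LDA.
    import numpy as np
    import pywt
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(13)
    epochs = rng.normal(size=(120, 256))          # 120 epochs, 256 samples each (synthetic)
    labels = rng.integers(0, 2, size=120)         # 1 = error potential present (synthetic)

    def wavelet_features(x, wavelet="db4", level=4):
        coeffs = pywt.wavedec(x, wavelet, level=level)
        return np.concatenate(coeffs)

    X = np.array([wavelet_features(e) for e in epochs])
    clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(X, labels)
    print(clf.score(X, labels))
    ```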

  6. A Method for the Study of Human Factors in Aircraft Operations

    NASA Technical Reports Server (NTRS)

    Barnhart, W.; Billings, C.; Cooper, G.; Gilstrap, R.; Lauber, J.; Orlady, H.; Puskas, B.; Stephens, W.

    1975-01-01

    A method for the study of human factors in the aviation environment is described. A conceptual framework is provided within which pilot and other human errors in aircraft operations may be studied with the intent of finding out how, and why, they occurred. An information processing model of human behavior serves as the basis for the acquisition and interpretation of information relating to occurrences which involve human error. A systematic method of collecting such data is presented and discussed. The classification of the data is outlined.

  7. The nearest neighbor and the bayes error rates.

    PubMed

    Loizou, G; Maybank, S J

    1987-02-01

    The (k, l) nearest neighbor method of pattern classification is compared to the Bayes method. If the two acceptance rates are equal then the asymptotic error rates satisfy the inequalities E(k,l+1) ≤ E*(λ) ≤ E(k,l) ≤ dE*(λ), where d is a function of k, l, and the number of pattern classes, and λ is the reject threshold for the Bayes method. An explicit expression for d is given which is optimal in the sense that for some probability distributions E(k,l) and dE*(λ) are equal.

  8. Multiclass Bayes error estimation by a feature space sampling technique

    NASA Technical Reports Server (NTRS)

    Mobasseri, B. G.; Mcgillem, C. D.

    1979-01-01

    A general Gaussian M-class N-feature classification problem is defined. An algorithm is developed that requires the class statistics as its only input and computes the minimum probability of error through use of a combined analytical and numerical integration over a sequence of simplifying transformations of the feature space. The results are compared with previously reported results obtained by conventional techniques for a 2-class, 4-feature discrimination problem, and with 4-class, 4-feature multispectral scanner Landsat data classified by training and testing on the available data.
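
    As a simpler, hedged alternative to the combined analytical and numerical integration described above, the Bayes (minimum) probability of error for an M-class Gaussian problem with known statistics can also be estimated by Monte Carlo sampling; the class means and covariances below are hypothetical.

    ```python
    # Minimal sketch: Monte Carlo estimate of the Bayes error for an M-class Gaussian
    # problem with known class statistics and equal priors (hypothetical parameters).
    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(8)
    means = [np.zeros(4), np.full(4, 1.5), np.array([2.0, -1.0, 0.5, 0.0])]
    covs = [np.eye(4), 1.5 * np.eye(4), 0.8 * np.eye(4)]

    n_per_class = 20000
    errors = 0
    for true_cls, (m, c) in enumerate(zip(means, covs)):
        samples = rng.multivariate_normal(m, c, size=n_per_class)
        # Bayes rule with equal priors: choose the class with the largest density.
        densities = np.column_stack([multivariate_normal(mu, cov).pdf(samples)
                                     for mu, cov in zip(means, covs)])
        errors += np.sum(densities.argmax(axis=1) != true_cls)

    print(f"Estimated Bayes error: {errors / (n_per_class * len(means)):.4f}")
    ```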

  9. A Simulation Analysis of Errors in the Measurement of Standard Electrochemical Rate Constants from Phase-Selective Impedance Data.

    DTIC Science & Technology

    1987-09-30

    ...of the AC current, including the time dependence at a growing DME, at a given fixed potential either in the presence or the absence of an... the relative error in kob(app) is relatively small for ks(true) ≤ 0.5 cm s-1, and increases rapidly for larger rate constants as kob reaches the...

  10. LACIE performance predictor final operational capability program description, volume 3

    NASA Technical Reports Server (NTRS)

    1976-01-01

    The requirements and processing logic for the LACIE Error Model program (LEM) are described. This program is an integral part of the Large Area Crop Inventory Experiment (LACIE) system. LEM is that portion of the LPP (LACIE Performance Predictor) which simulates the sample segment classification, strata yield estimation, and production aggregation. LEM controls repetitive Monte Carlo trials based on input error distributions to obtain statistical estimates of the wheat area, yield, and production at different levels of aggregation. LEM interfaces with the rest of the LPP through a set of data files.

  11. HIV classification using the coalescent theory

    PubMed Central

    Bulla, Ingo; Schultz, Anne-Kathrin; Schreiber, Fabian; Zhang, Ming; Leitner, Thomas; Korber, Bette; Morgenstern, Burkhard; Stanke, Mario

    2010-01-01

    Motivation: Existing coalescent models and phylogenetic tools based on them are not designed for studying the genealogy of sequences like those of HIV, since in HIV recombinants with multiple cross-over points between the parental strains frequently arise. Hence, ambiguous cases in the classification of HIV sequences into subtypes and circulating recombinant forms (CRFs) have been treated with ad hoc methods for lack of tools based on a comprehensive coalescent model accounting for complex recombination patterns. Results: We developed the program ARGUS that scores classifications of sequences into subtypes and recombinant forms. It reconstructs ancestral recombination graphs (ARGs) that reflect the genealogy of the input sequences given a classification hypothesis. An ARG with maximal probability is approximated using a Markov chain Monte Carlo approach. ARGUS was able to distinguish the correct classification from plausible alternative classifications with a low error rate in simulation studies with realistic parameters. We applied our algorithm to decide between two recently debated alternatives in the classification of CRF02 of HIV-1 and find that CRF02 is indeed a recombinant of Subtypes A and G. Availability: ARGUS is implemented in C++ and the source code is available at http://gobics.de/software Contact: ibulla@uni-goettingen.de Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:20400454

  12. Impact of a reengineered electronic error-reporting system on medication event reporting and care process improvements at an urban medical center.

    PubMed

    McKaig, Donald; Collins, Christine; Elsaid, Khaled A

    2014-09-01

    A study was conducted to evaluate the impact of a reengineered approach to electronic error reporting at a 719-bed multidisciplinary urban medical center. The main outcome of interest was the number of monthly reported medication errors during the preimplementation (20 months) and postimplementation (26 months) phases. An interrupted time series analysis was used to describe baseline errors, immediate change following implementation of the current electronic error-reporting system (e-ERS), and trend of error reporting during postimplementation. Errors were categorized according to severity using the National Coordinating Council for Medication Error Reporting and Prevention (NCC MERP) Medication Error Index classifications. Reported errors were further analyzed by reporter and error site. During preimplementation, the mean number of monthly reported errors was 40.0 (95% confidence interval [CI]: 36.3-43.7). Immediately following e-ERS implementation, monthly reported errors significantly increased by 19.4 errors (95% CI: 8.4-30.5). The change in the slope of the reported-errors trend was estimated at 0.76 (95% CI: 0.07-1.22). Near misses and no-patient-harm errors accounted for 90% of all errors, while errors that caused increased patient monitoring or temporary harm accounted for 9% and 1%, respectively. Nurses were the most frequent reporters, while physicians were more likely to report high-severity errors. Medical care units accounted for approximately half of all reported errors. Following the intervention, there was a significant increase in reporting of prevented errors and errors that reached the patient with no resultant harm. This improvement in reporting was sustained for 26 months and has contributed to designing and implementing quality improvement initiatives to enhance the safety of the medication use process.
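
    A minimal sketch of a segmented-regression interrupted time series analysis of monthly error counts, estimating baseline trend, immediate level change at implementation, and post-implementation slope change; the monthly counts are simulated, not the study's data.

    ```python
    # Minimal sketch (simulated monthly counts): segmented regression for an
    # interrupted time series with baseline trend, level change, and slope change.
    import numpy as np
    import statsmodels.api as sm

    pre, post = 20, 26
    t = np.arange(pre + post)
    after = (t >= pre).astype(float)                # 1 from the implementation month on
    t_after = np.where(t >= pre, t - pre, 0.0)      # months since implementation

    rng = np.random.default_rng(9)
    reported_errors = 40 + 0.1 * t + 19 * after + 0.8 * t_after + rng.normal(0, 4, t.size)

    X = sm.add_constant(np.column_stack([t, after, t_after]))
    fit = sm.OLS(reported_errors, X).fit()
    print(fit.params)   # [baseline level, baseline trend, level change, slope change]
    ```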

  13. Error Rate Comparison during Polymerase Chain Reaction by DNA Polymerase

    DOE PAGES

    McInerney, Peter; Adams, Paul; Hadi, Masood Z.

    2014-01-01

    As larger-scale cloning projects become more prevalent, there is an increasing need for comparisons among high fidelity DNA polymerases used for PCR amplification. All polymerases marketed for PCR applications are tested for fidelity properties (i.e., error rate determination) by vendors, and numerous literature reports have addressed PCR enzyme fidelity. Nonetheless, it is often difficult to make direct comparisons among different enzymes due to numerous methodological and analytical differences from study to study. We have measured the error rates for 6 DNA polymerases commonly used in PCR applications, including 3 polymerases typically used for cloning applications requiring high fidelity. Error rate measurement values reported here were obtained by direct sequencing of cloned PCR products. The strategy employed here allows interrogation of error rate across a very large DNA sequence space, since 94 unique DNA targets were used as templates for PCR cloning. Of the six enzymes included in the study (Taq polymerase, AccuPrime-Taq High Fidelity, KOD Hot Start, cloned Pfu polymerase, Phusion Hot Start, and Pwo polymerase), we find the lowest error rates with Pfu, Phusion, and Pwo polymerases. Error rates are comparable for these 3 enzymes and are >10x lower than the error rate observed with Taq polymerase. Mutation spectra are reported, with the 3 high fidelity enzymes displaying broadly similar types of mutations. For these enzymes, transition mutations predominate, with little bias observed for type of transition.

  14. Quotation accuracy in medical journal articles-a systematic review and meta-analysis.

    PubMed

    Jergas, Hannah; Baethge, Christopher

    2015-01-01

    Background. Quotations and references are an indispensable element of scientific communication. They should support what authors claim or provide important background information for readers. Studies indicate, however, that quotations not serving their purpose (quotation errors) may be prevalent. Methods. We carried out a systematic review, meta-analysis and meta-regression of quotation errors, taking account of differences between studies in error ascertainment. Results. Out of 559 studies screened, we included 28 in the main analysis, and estimated major, minor and total quotation error rates of 11.9% (95% CI [8.4, 16.6]), 11.5% [8.3, 15.7], and 25.4% [19.5, 32.4], respectively. While heterogeneity was substantial, even the lowest estimate of total quotation errors was considerable (6.7%). Indirect references accounted for less than one sixth of all quotation problems. The findings remained robust in a number of sensitivity and subgroup analyses (including risk of bias analysis) and in meta-regression. There was no indication of publication bias. Conclusions. Readers of medical journal articles should be aware of the fact that quotation errors are common. Measures against quotation errors include spot checks by editors and reviewers, correct placement of citations in the text, and declarations by authors that they have checked cited material. Future research should elucidate if and to what degree quotation errors are detrimental to scientific progress.

  15. Computer-aided Prognosis of Neuroblastoma on Whole-slide Images: Classification of Stromal Development

    PubMed Central

    Sertel, O.; Kong, J.; Shimada, H.; Catalyurek, U.V.; Saltz, J.H.; Gurcan, M.N.

    2009-01-01

    We are developing a computer-aided prognosis system for neuroblastoma (NB), a cancer of the nervous system and one of the most malignant tumors affecting children. Histopathological examination is an important stage for further treatment planning in routine clinical diagnosis of NB. According to the International Neuroblastoma Pathology Classification (the Shimada system), NB patients are classified into favorable and unfavorable histology based on the tissue morphology. In this study, we propose an image analysis system that operates on digitized H&E stained whole-slide NB tissue samples and classifies each slide as either stroma-rich or stroma-poor based on the degree of Schwannian stromal development. Our statistical framework performs the classification based on texture features extracted using co-occurrence statistics and local binary patterns. Due to the high resolution of digitized whole-slide images, we propose a multi-resolution approach that mimics the evaluation of a pathologist such that the image analysis starts from the lowest resolution and switches to higher resolutions when necessary. We employ an offline feature selection step, which determines the most discriminative features at each resolution level during the training step. A modified k-nearest neighbor classifier is used to determine the confidence level of the classification to make the decision at a particular resolution level. The proposed approach was independently tested on 43 whole-slide samples and provided an overall classification accuracy of 88.4%. PMID:20161324

  16. [Land cover classification of Four Lakes Region in Hubei Province based on MODIS and ENVISAT data].

    PubMed

    Xue, Lian; Jin, Wei-Bin; Xiong, Qin-Xue; Liu, Zhang-Yong

    2010-03-01

    Based on the differences in backscattering coefficient in ENVISAT ASAR data, a classification was made of the towns, waters, and vegetation-covered areas in the Four Lakes Region of Hubei Province. According to the local cropping systems and phenological characteristics in the region, and by using the discrepancies of the MODIS-NDVI index from late April to early May, the vegetation-covered areas were classified into croplands and non-croplands. The classification results based on the above-mentioned procedure were verified against the classification results based on the ETM data with high spatial resolution. Based on the DEM data, the non-croplands were categorized into forest land and bottomland; and based on the discrepancies of mean NDVI index per month, the crops were identified as mid rice, late rice, and cotton, and the croplands were identified as paddy field and upland field. The land cover classification based on the MODIS data with low spatial resolution was basically consistent with that based on the ETM data with high spatial resolution, and the total error rate was about 13.15% when the classification results based on ETM data were taken as the standard. The above-mentioned procedures could enable fast, large-scale land cover classification and mapping at the regional level.

  17. Use of Binary Partition Tree and energy minimization for object-based classification of urban land cover

    NASA Astrophysics Data System (ADS)

    Li, Mengmeng; Bijker, Wietske; Stein, Alfred

    2015-04-01

    Two main challenges are faced when classifying urban land cover from very high resolution satellite images: obtaining an optimal image segmentation and distinguishing buildings from other man-made objects. For optimal segmentation, this work proposes a hierarchical representation of an image by means of a Binary Partition Tree (BPT) and an unsupervised evaluation of image segmentations by energy minimization. For building extraction, we apply fuzzy sets to create a fuzzy landscape of shadows, which in turn involves a two-step procedure. The first step is a preliminary image classification at a fine segmentation level to generate vegetation and shadow information. The second step models the directional relationship between building and shadow objects to extract building information at the optimal segmentation level. We conducted the experiments on two datasets of Pléiades images from Wuhan City, China. To demonstrate its performance, the proposed classification is compared at the optimal segmentation level with Maximum Likelihood Classification and Support Vector Machine classification. The results show that the proposed classification produced the highest overall accuracies and kappa coefficients, and the smallest over-classification and under-classification geometric errors. We conclude first that integrating BPT with energy minimization offers an effective means for image segmentation. Second, we conclude that the directional relationship between building and shadow objects represented by a fuzzy landscape is important for building extraction.

  18. A Bio-Inspired Herbal Tea Flavour Assessment Technique

    PubMed Central

    Zakaria, Nur Zawatil Isqi; Masnan, Maz Jamilah; Zakaria, Ammar; Shakaff, Ali Yeon Md

    2014-01-01

    Herbal-based products are becoming a widespread production trend among manufacturers for the domestic and international markets. As the production increases to meet the market demand, it is crucial for the manufacturer to ensure that their products have met specific criteria and fulfil the intended quality determined by the quality controller. One famous herbal-based product is herbal tea. This paper investigates bio-inspired flavour assessments in a data fusion framework involving an e-nose and e-tongue. The objectives are to attain good classification of different types and brands of herbal tea, classification of different flavour masking effects and finally classification of different concentrations of herbal tea. Two data fusion levels were employed in this research, low level data fusion and intermediate level data fusion. Four classification approaches (LDA, SVM, KNN and PNN) were examined in search of the best classifier to achieve the research objectives. In order to evaluate the classifiers' performance, error estimators based on k-fold cross-validation and leave-one-out were applied. Classification based on GC-MS TIC data was also included as a comparison to the classification performance using fusion approaches. Generally, KNN outperformed the other classification techniques for the three flavour assessments in the low level data fusion and intermediate level data fusion. However, the classification results based on GC-MS TIC data varied. PMID:25010697
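
    A minimal sketch of low-level (feature-level) data fusion followed by k-fold cross-validated error estimation is given below, assuming a Python/scikit-learn setting; the file names and array shapes are placeholders, and PNN is omitted because scikit-learn has no standard implementation of it.

    ```python
    # Illustrative sketch of low-level data fusion (feature concatenation) followed
    # by k-fold cross-validated error estimation; file names and arrays are
    # placeholder assumptions, not the study's data.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.svm import SVC
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    X_enose = np.load("enose_features.npy")      # assumed: samples x e-nose sensors
    X_etongue = np.load("etongue_features.npy")  # assumed: samples x e-tongue sensors
    y = np.load("labels.npy")                    # tea type / brand / concentration

    X_fused = np.hstack([X_enose, X_etongue])    # low-level (feature-level) fusion

    classifiers = {
        "LDA": LinearDiscriminantAnalysis(),
        "SVM": SVC(kernel="rbf"),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }
    for name, clf in classifiers.items():
        acc = cross_val_score(clf, X_fused, y, cv=10)   # 10-fold cross-validation
        print(f"{name}: estimated error = {1 - acc.mean():.3f}")
    ```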

  19. Condition Monitoring for Helicopter Data. Appendix A

    NASA Technical Reports Server (NTRS)

    Wen, Fang; Willett, Peter; Deb, Somnath

    2000-01-01

    In this paper the classical "Westland" set of empirical accelerometer helicopter data is analyzed with the aim of condition monitoring for diagnostic purposes. The goal is to determine features for failure events from these data, via a proprietary signal processing toolbox, and to weigh these according to a variety of classification algorithms. As regards signal processing, it appears that the autoregressive (AR) coefficients from a simple linear model encapsulate a great deal of information in relatively few measurements; it has also been found that augmentation of these by harmonic and other parameters can improve classification significantly. As regards classification, several techniques have been explored, among them restricted Coulomb energy (RCE) networks, learning vector quantization (LVQ), Gaussian mixture classifiers and decision trees. A problem with these approaches, common to many classification paradigms, is that augmentation of the feature dimension can degrade classification ability. Thus, we also introduce the Bayesian data reduction algorithm (BDRA), which imposes a Dirichlet prior on training data and is thus able to quantify probability of error in an exact manner, such that features may be discarded or coarsened appropriately.
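
    As a point of reference, AR coefficients of the kind used as features here can be estimated from a vibration window by solving the Yule-Walker equations; the sketch below is a generic illustration, not the proprietary toolbox's implementation, and the model order is an arbitrary choice.

    ```python
    # Generic Yule-Walker estimate of AR(p) coefficients for one accelerometer
    # window; the order and windowing are illustrative assumptions.
    import numpy as np

    def ar_coefficients(signal, order=10):
        """Return the AR(order) coefficient vector of a (zero-meaned) signal."""
        x = np.asarray(signal, dtype=float)
        x = x - x.mean()
        n = len(x)
        # biased autocorrelation estimates r[0..order]
        r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
        # Toeplitz system R a = r[1:]
        R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
        return np.linalg.solve(R, r[1:order + 1])   # feature vector for classification
    ```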

  20. Reduction from cost-sensitive ordinal ranking to weighted binary classification.

    PubMed

    Lin, Hsuan-Tien; Li, Ling

    2012-05-01

    We present a reduction framework from ordinal ranking to binary classification. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranker from the binary classifier. Based on the framework, we show that a weighted 0/1 loss of the binary classifier upper-bounds the mislabeling cost of the ranker, both error-wise and regret-wise. Our framework allows not only the design of good ordinal ranking algorithms based on well-tuned binary classification approaches, but also the derivation of new generalization bounds for ordinal ranking from known bounds for binary classification. In addition, our framework unifies many existing ordinal ranking algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms. In addition, the newly designed algorithms lead to better cost-sensitive ordinal ranking performance, as well as improved listwise ranking performance.
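
    A schematic sketch of the reduction's two data-handling steps, extracting extended examples and constructing the ranker from a trained binary classifier, is shown below under the usual formulation with K ordered ranks and a per-example cost matrix; the helper names and the encoding of the threshold index are illustrative assumptions, not the paper's notation.

    ```python
    # Schematic sketch of the ordinal-to-binary reduction: extended examples ->
    # weighted binary classifier -> ranker. Helper names and the cost matrix are
    # illustrative assumptions.
    import numpy as np

    def extend_examples(X, y, C):
        """X: (n, d) features; y: ranks in {0..K-1}; C: (K, K) cost matrix.
        Each example yields K-1 extended examples (x, k) with a binary label
        'is the true rank above threshold k?' and a weight from cost differences."""
        K = C.shape[0]
        Xe, ze, we = [], [], []
        for xi, yi in zip(X, y):
            for k in range(K - 1):
                Xe.append(np.append(xi, k))                 # append threshold index
                ze.append(1 if yi > k else -1)
                we.append(abs(C[yi, k] - C[yi, k + 1]))
        return np.array(Xe), np.array(ze), np.array(we)

    def rank(binary_clf, x, K):
        """Ranker: number of thresholds k for which the classifier says 'rank > k'."""
        ext = np.array([np.append(x, k) for k in range(K - 1)])
        return int(np.sum(binary_clf.predict(ext) == 1))
    ```

    Fitting would use the weights, e.g. clf.fit(Xe, ze, sample_weight=we), with any scikit-learn-style binary classifier that accepts sample weights.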

  1. Frequency of use of the International Classification of Diseases ICD-10 diagnostic categories for mental and behavioural disorders across world regions.

    PubMed

    Faiad, Y; Khoury, B; Daouk, S; Maj, M; Keeley, J; Gureje, O; Reed, G

    2017-11-09

    The study aimed to examine variations in the use of International Classification of Diseases, Tenth Edition (ICD-10) diagnostic categories for mental and behavioural disorders across countries, regions and income levels using data from the online World Psychiatric Association (WPA)-World Health Organization (WHO) Global Survey that examined the attitudes of psychiatrists towards the classification of mental disorders. A survey was sent to 46 psychiatric societies which are members of WPA. A total of 4887 psychiatrists participated in the survey, which asked about their use of classification, their preferred system and the categories that were used most frequently. The majority (70.1%) of participating psychiatrists (out of 4887 psychiatrists) reported using the ICD-10 the most and using at least one diagnostic category once a week. Nine out of 44 diagnostic categories were considerably variable in terms of frequency of use across countries. These were: emotionally unstable personality disorder, borderline type; dissociative (conversion) disorder; somatoform disorders; obsessive-compulsive disorder (OCD); mental and behavioural disorders due to the use of alcohol; adjustment disorder; mental and behavioural disorders due to the use of cannabinoids; dementia in Alzheimer's disease; and acute and transient psychotic disorder. The frequency of use for these nine categories was examined across WHO regions and income levels. The most striking differences across WHO regions were found for five out of these nine categories. For dissociative (conversion) disorder, use was highest for the WHO Eastern Mediterranean Region (EMRO) and non-existent for the WHO African Region. For mental and behavioural disorders due to the use of alcohol, use was lowest for EMRO. For mental and behavioural disorders due to the use of cannabinoids, use was lowest for the WHO European Region and the WHO Western Pacific Region. For OCD and somatoform disorders, use was lowest for EMRO and the WHO Southeast Asian Region. Differences in the frequency of use across income levels were statistically significant for all categories except for mental and behavioural disorders due to the use of alcohol. The most striking variations were found for acute and transient psychotic disorder, which was reported to be more commonly used among psychiatrists from countries with lower income levels. The differences in frequency of use reported in the current study show that cross-cultural variations in psychiatric practice exist. However, whether these differences are due to the variations in prevalence, treatment-seeking behaviour and other factors, such as psychiatrist and patient characteristics as a result of culture, cannot be determined based on the findings of the study. Further research is needed to examine whether these variations are culturally determined and how that would affect the cross-cultural applicability of ICD-10 diagnostic categories.

  2. Analysis of urban area land cover using SEASAT Synthetic Aperture Radar data

    NASA Technical Reports Server (NTRS)

    Henderson, F. M. (Principal Investigator)

    1980-01-01

    Digitally processed SEASAT synthetic aperture radar (SAR) imagery of the Denver, Colorado urban area was examined to explore the potential of SAR data for mapping urban land cover and the compatibility of SAR derived land cover classes with the United States Geological Survey classification system. The imagery is examined at three different scales to determine the effect of image enlargement on accuracy and level of detail extractable. At each scale the value of employing a simple preprocessing smoothing algorithm to improve image interpretation is addressed. A visual interpretation approach and an automated machine/visual approach are employed to evaluate the feasibility of producing a semiautomated land cover classification from SAR data. Confusion matrices of omission and commission errors are employed to define classification accuracies for each interpretation approach and image scale.

  3. Analysis of the Tanana River Basin using LANDSAT data

    NASA Technical Reports Server (NTRS)

    Morrissey, L. A.; Ambrosia, V. G.; Carson-Henry, C.

    1981-01-01

    Digital image classification techniques were used to classify land cover/resource information in the Tanana River Basin of Alaska. Portions of four scenes of LANDSAT digital data were analyzed using computer systems at Ames Research Center in an unsupervised approach to derive cluster statistics. The spectral classes were identified using the IDIMS display and color infrared photography. Classification errors were corrected using stratification procedures. The classification scheme resulted in the following eleven categories: sedimented/shallow water, clear/deep water, coniferous forest, mixed forest, deciduous forest, shrub and grass, bog, alpine tundra, barrens, snow and ice, and cultural features. Color coded maps and acreage summaries of the major land cover categories were generated for selected USGS quadrangles (1:250,000) which lie within the drainage basin. The project was completed within six months.

  4. A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

    PubMed

    Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng

    2017-05-10

    Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .
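
    The flavour of the weighting idea can be illustrated with a hedged sketch: each database hit contributes to a consensus at every taxonomic rank with a weight derived from its alignment similarity, and the assignment stops at the deepest rank that still carries enough weight. The weighting function, cutoff, and data layout below are illustrative assumptions, not the exact BLCA formulas.

    ```python
    # Hedged sketch of similarity-weighted taxonomic consensus; the exponential
    # weighting, cutoff, and lineage layout are illustrative assumptions.
    from collections import defaultdict
    import math

    RANKS = ["phylum", "class", "order", "family", "genus", "species"]

    def weighted_consensus(hits, cutoff=0.8, scale=20.0):
        """hits: list of (similarity in [0, 1], {rank: taxon}) for one query,
        where each lineage dict covers all ranks in RANKS.
        Returns assignments down to the deepest rank whose top taxon carries
        at least `cutoff` of the total weight."""
        weights = [math.exp(scale * s) for s, _ in hits]   # sharper than raw similarity
        total = sum(weights)
        assignment = {}
        for rank in RANKS:
            mass = defaultdict(float)
            for w, (_, lineage) in zip(weights, hits):
                mass[lineage[rank]] += w
            taxon, m = max(mass.items(), key=lambda kv: kv[1])
            if m / total < cutoff:
                break                                      # not confident at this depth
            assignment[rank] = taxon
        return assignment
    ```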

  5. Disorders of Articulation. Prentice-Hall Foundations of Speech Pathology Series.

    ERIC Educational Resources Information Center

    Carrell, James A.

    Designed for students of speech pathology and audiology and for practicing clinicians, the text considers the nature of the articulation process, criteria for diagnosis, and classification and etiology of disorders. Also discussed are phonetic characteristics, including phonemic errors and configurational and contextual defects; and functional…

  6. A PRIOR EVALUATION OF TWO-STAGE CLUSTER SAMPLING FOR ACCURACY ASSESSMENT OF LARGE-AREA LAND-COVER MAPS

    EPA Science Inventory

    Two-stage cluster sampling reduces the cost of collecting accuracy assessment reference data by constraining sample elements to fall within a limited number of geographic domains (clusters). However, because classification error is typically positively spatially correlated, withi...

  7. A review and discussion of flight management system incidents reported to the Aviation Safety Reporting System

    DOT National Transportation Integrated Search

    1992-02-01

    This report covers the activities related to the description, classification and analysis of the types and kinds of flight crew errors, incidents and actions, as reported to the Aviation Safety Reporting System (ASRS) database, that can occur as ...

  8. 21 CFR 886.1770 - Manual refractor.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... DEVICES OPHTHALMIC DEVICES Diagnostic Devices § 886.1770 Manual refractor. (a) Identification. A manual refractor is a device that is a set of lenses of various dioptric powers intended to measure the refractive error of the eye. (b) Classification. Class I (general controls). The device is exempt from the...

  9. An assessment of the cultivated cropland class of NLCD 2006 using a multi-source and multi-criteria approach

    USGS Publications Warehouse

    Danielson, Patrick; Yang, Limin; Jin, Suming; Homer, Collin G.; Napton, Darrell

    2016-01-01

    We developed a method that analyzes the quality of the cultivated cropland class mapped in the USA National Land Cover Database (NLCD) 2006. The method integrates multiple geospatial datasets and a Multi Index Integrated Change Analysis (MIICA) change detection method that captures spectral changes to identify the spatial distribution and magnitude of potential commission and omission errors for the cultivated cropland class in NLCD 2006. The majority of the commission and omission errors in NLCD 2006 are in areas where cultivated cropland is not the most dominant land cover type. The errors are primarily attributed to the less accurate training dataset derived from the National Agricultural Statistics Service Cropland Data Layer dataset. In contrast, error rates are low in areas where cultivated cropland is the dominant land cover. Agreement between model-identified commission errors and independently interpreted reference data was high (79%). Agreement was low (40%) for omission error comparison. The majority of the commission errors in the NLCD 2006 cultivated crops were confused with low-intensity developed classes, while the majority of omission errors were from herbaceous and shrub classes. Some errors were caused by inaccurate land cover change from misclassification in NLCD 2001 and the subsequent land cover post-classification process.

  10. Statistical inference for template aging

    NASA Astrophysics Data System (ADS)

    Schuckers, Michael E.

    2006-04-01

    A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses likelihood ratio test methodology. The focus here is on statistical methods for estimation, not the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institute of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
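
    For the first approach, a binomial GLM with a logit link and a linear time term is a natural formulation; the sketch below uses statsmodels with made-up placeholder counts rather than the NIST Biometric Score Set data.

    ```python
    # Illustrative binomial GLM testing whether the classification error rate
    # drifts with elapsed time; all numbers are placeholders, not the NIST data.
    import numpy as np
    import statsmodels.api as sm

    t = np.array([0, 4, 8, 12, 16, 20], dtype=float)   # weeks since enrollment
    errors = np.array([12, 15, 14, 19, 22, 25])        # misclassifications per session
    trials = np.array([500, 500, 480, 510, 495, 505])  # attempts per session

    X = sm.add_constant(t)                              # intercept + linear time term
    endog = np.column_stack([errors, trials - errors])  # (successes, failures) form
    model = sm.GLM(endog, X, family=sm.families.Binomial()).fit()
    print(model.summary())   # the p-value on the time coefficient tests template aging
    ```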

  11. Evaluation of Natural Language Processing (NLP) Systems to Annotate Drug Product Labeling with MedDRA Terminology.

    PubMed

    Ly, Thomas; Pamer, Carol; Dang, Oanh; Brajovic, Sonja; Haider, Shahrukh; Botsis, Taxiarchis; Milward, David; Winter, Andrew; Lu, Susan; Ball, Robert

    2018-05-31

    The FDA Adverse Event Reporting System (FAERS) is a primary data source for identifying unlabeled adverse events (AEs) in a drug or biologic drug product's postmarketing phase. Many AE reports must be reviewed by drug safety experts to identify unlabeled AEs, even if the reported AEs are previously identified, labeled AEs. Integrating the labeling status of drug product AEs into FAERS could increase report triage and review efficiency. Medical Dictionary for Regulatory Activities (MedDRA) is the standard for coding AE terms in FAERS cases. However, drug manufacturers are not required to use MedDRA to describe AEs in product labels. We hypothesized that natural language processing (NLP) tools could assist in automating the extraction and MedDRA mapping of AE terms in drug product labels. We evaluated the performance of three NLP systems, (ETHER, I2E, MetaMap) for their ability to extract AE terms from drug labels and translate the terms to MedDRA Preferred Terms (PTs). Pharmacovigilance-based annotation guidelines for extracting AE terms from drug labels were developed for this study. We compared each system's output to MedDRA PT AE lists, manually mapped by FDA pharmacovigilance experts using the guidelines, for ten drug product labels known as the "gold standard AE list" (GSL) dataset. Strict time and configuration conditions were imposed in order to test each system's capabilities under conditions of no human intervention and minimal system configuration. Each NLP system's output was evaluated for precision, recall and F measure in comparison to the GSL. A qualitative error analysis (QEA) was conducted to categorize a random sample of each NLP system's false positive and false negative errors. A total of 417, 278, and 250 false positive errors occurred in the ETHER, I2E, and MetaMap outputs, respectively. A total of 100, 80, and 187 false negative errors occurred in ETHER, I2E, and MetaMap outputs, respectively. Precision ranged from 64% to 77%, recall from 64% to 83% and F measure from 67% to 79%. I2E had the highest precision (77%), recall (83%) and F measure (79%). ETHER had the lowest precision (64%). MetaMap had the lowest recall (64%). The QEA found that the most prevalent false positive errors were context errors such as "Context error/General term", "Context error/Instructions or monitoring parameters", "Context error/Medical history preexisting condition underlying condition risk factor or contraindication", and "Context error/AE manifestations or secondary complication". The most prevalent false negative errors were in the "Incomplete or missed extraction" error category. Missing AE terms were typically due to long terms, or terms containing non-contiguous words which do not correspond exactly to MedDRA synonyms. MedDRA mapping errors were a minority of errors for ETHER and I2E but were the most prevalent false positive errors for MetaMap. The results demonstrate that it may be feasible to use NLP tools to extract and map AE terms to MedDRA PTs. However, the NLP tools we tested would need to be modified or reconfigured to lower the error rates to support their use in a regulatory setting. Tools specific for extracting AE terms from drug labels and mapping the terms to MedDRA PTs may need to be developed to support pharmacovigilance. Conducting research using additional NLP systems on a larger, diverse GSL would also be informative. Copyright © 2018. Published by Elsevier Inc.
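
    For reference, the reported metrics follow directly from term-level true positive, false positive, and false negative counts; the counts in the short sketch below are placeholders, not the study's gold-standard comparison.

    ```python
    # Precision, recall, and F measure from term-level counts; the example
    # counts are placeholders, not the GSL evaluation results.
    def precision_recall_f1(tp, fp, fn):
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    print(precision_recall_f1(tp=400, fp=120, fn=90))
    ```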

  12. Evaluation of Classifier Performance for Multiclass Phenotype Discrimination in Untargeted Metabolomics.

    PubMed

    Trainor, Patrick J; DeFilippis, Andrew P; Rai, Shesh N

    2017-06-21

    Statistical classification is a critical component of utilizing metabolomics data for examining the molecular determinants of phenotypes. Despite this, a comprehensive and rigorous evaluation of the accuracy of classification techniques for phenotype discrimination given metabolomics data has not been conducted. We conducted such an evaluation using both simulated and real metabolomics datasets, comparing Partial Least Squares-Discriminant Analysis (PLS-DA), Sparse PLS-DA, Random Forests, Support Vector Machines (SVM), Artificial Neural Network, k-Nearest Neighbors (k-NN), and Naïve Bayes classification techniques for discrimination. We evaluated the techniques on simulated data generated to mimic global untargeted metabolomics data by incorporating realistic block-wise correlation and partial correlation structures for mimicking the correlations and metabolite clustering generated by biological processes. Over the simulation studies, covariance structures, means, and effect sizes were stochastically varied to provide consistent estimates of classifier performance over a wide range of possible scenarios. The effects of the presence of non-normal error distributions, the introduction of biological and technical outliers, unbalanced phenotype allocation, missing values due to abundances below a limit of detection, and the effect of prior-significance filtering (dimension reduction) were evaluated via simulation. In each simulation, classifier parameters, such as the number of hidden nodes in a Neural Network, were optimized by cross-validation to minimize the probability of detecting spurious results due to poorly tuned classifiers. Classifier performance was then evaluated using real metabolomics datasets of varying sample medium, sample size, and experimental design. We report that in the most realistic simulation studies that incorporated non-normal error distributions, unbalanced phenotype allocation, outliers, missing values, and dimension reduction, classifier performance (least to greatest error) was ranked as follows: SVM, Random Forest, Naïve Bayes, sPLS-DA, Neural Networks, PLS-DA and k-NN classifiers. When non-normal error distributions were introduced, the performance of PLS-DA and k-NN classifiers deteriorated further relative to the remaining techniques. Over the real datasets, a trend of better SVM and Random Forest classifier performance was observed.

  13. An integrated method for atherosclerotic carotid plaque segmentation in ultrasound image.

    PubMed

    Qian, Chunjun; Yang, Xiaoping

    2018-01-01

    Carotid artery atherosclerosis is an important cause of stroke. Ultrasound imaging has been widely used in the diagnosis of atherosclerosis. Therefore, segmenting atherosclerotic carotid plaque in ultrasound images is an important task. Accurate plaque segmentation is helpful for the measurement of carotid plaque burden. In this paper, we propose and evaluate a novel learning-based integrated framework for plaque segmentation. In our study, four different classification algorithms, along with the auto-context iterative algorithm, were employed to effectively integrate features from ultrasound images and later also the iteratively estimated and refined probability maps together for pixel-wise classification. The four classification algorithms were support vector machine with linear kernel, support vector machine with radial basis function kernel, AdaBoost and random forest. The plaque segmentation was implemented in the generated probability map. The performance of the four different learning-based plaque segmentation methods was tested on 29 B-mode ultrasound images. The evaluation indices for our proposed methods consisted of sensitivity, specificity, Dice similarity coefficient, overlap index, error of area, absolute error of area, point-to-point distance, and Hausdorff point-to-point distance, along with the area under the ROC curve. The segmentation method that integrated random forest and an auto-context model obtained the best results (sensitivity 80.4 ± 8.4%, specificity 96.5 ± 2.0%, Dice similarity coefficient 81.0 ± 4.1%, overlap index 68.3 ± 5.8%, error of area -1.02 ± 18.3%, absolute error of area 14.7 ± 10.9%, point-to-point distance 0.34 ± 0.10 mm, Hausdorff point-to-point distance 1.75 ± 1.02 mm, and area under the ROC curve 0.897), which were among the best compared with those from existing methods. The proposed learning-based integrated framework could be useful for atherosclerotic carotid plaque segmentation, which will be helpful for the measurement of carotid plaque burden. Copyright © 2017 Elsevier B.V. All rights reserved.
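
    A hedged sketch of the auto-context iteration described here is shown below: at each pass the pixel-wise classifier is retrained on image features concatenated with the probability map from the previous pass. The data layout, number of iterations, and use of a random forest as the base learner are illustrative assumptions, not the authors' implementation.

    ```python
    # Hedged sketch of auto-context training for pixel-wise classification;
    # feature extraction and data layout are illustrative assumptions.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def auto_context_train(F, y, n_iters=3):
        """F: (n_pixels, n_features) image features; y: (n_pixels,) binary labels
        in {0, 1} (plaque vs background)."""
        prob = np.full((len(y), 1), 0.5)                  # uninformative initial map
        models = []
        for _ in range(n_iters):
            clf = RandomForestClassifier(n_estimators=200)
            clf.fit(np.hstack([F, prob]), y)              # features + current probability map
            # refine the probability map using the classifier just trained
            prob = clf.predict_proba(np.hstack([F, prob]))[:, [1]]
            models.append(clf)
        return models
    ```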

  14. Moments and Root-Mean-Square Error of the Bayesian MMSE Estimator of Classification Error in the Gaussian Model.

    PubMed

    Zollanvari, Amin; Dougherty, Edward R

    2014-06-01

    The most important aspect of any classifier is its error rate, because this quantifies its predictive capacity. Thus, the accuracy of error estimation is critical. Error estimation is problematic in small-sample classifier design because the error must be estimated using the same data from which the classifier has been designed. Use of prior knowledge, in the form of a prior distribution on an uncertainty class of feature-label distributions to which the true, but unknown, feature-label distribution belongs, can facilitate accurate error estimation (in the mean-square sense) in circumstances where accurate completely model-free error estimation is impossible. This paper provides analytic asymptotically exact finite-sample approximations for various performance metrics of the resulting Bayesian Minimum Mean-Square-Error (MMSE) error estimator in the case of linear discriminant analysis (LDA) in the multivariate Gaussian model. These performance metrics include the first, second, and cross moments of the Bayesian MMSE error estimator with the true error of LDA, and therefore, the Root-Mean-Square (RMS) error of the estimator. We lay down the theoretical groundwork for Kolmogorov double-asymptotics in a Bayesian setting, which enables us to derive asymptotic expressions of the desired performance metrics. From these we produce analytic finite-sample approximations and demonstrate their accuracy via numerical examples. Various examples illustrate the behavior of these approximations and their use in determining the necessary sample size to achieve a desired RMS. The Supplementary Material contains derivations for some equations and added figures.
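
    For reference, the connection between the listed moments and the RMS can be written explicitly; the notation below is a standard formulation, not reproduced from the paper:

    ```latex
    % RMS of the error estimator \hat{\varepsilon} with respect to the true LDA
    % error \varepsilon_n, decomposed into second and cross moments
    % (standard identity; notation is illustrative, not the paper's).
    \mathrm{RMS}(\hat{\varepsilon})
      = \sqrt{\mathbb{E}\!\left[(\hat{\varepsilon}-\varepsilon_n)^2\right]}
      = \sqrt{\mathbb{E}[\hat{\varepsilon}^{\,2}]
              - 2\,\mathbb{E}[\hat{\varepsilon}\,\varepsilon_n]
              + \mathbb{E}[\varepsilon_n^{2}]}
    ```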

  15. Heat Tolerance and the Peripheral Effects of Anticholinergics

    DTIC Science & Technology

    1988-01-30

    GENDER: Responses of 27 males and 12 females showed that both SR and AGD increased as a function of log MCh dose. SR was affected by gender only at the 2 highest doses (male SR slightly greater). At all but the lowest dose, AGD was significantly greater for females. ACCLIMATION: Responses of 6... males to MCh were measured before and after ...

  16. Choosing the Most Effective Pattern Classification Model under Learning-Time Constraint.

    PubMed

    Saito, Priscila T M; Nakamura, Rodrigo Y M; Amorim, Willian P; Papa, João P; de Rezende, Pedro J; Falcão, Alexandre X

    2015-01-01

    Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presents a methodology to compare classifiers with respect to their ability to learn from classification errors on a large learning set, within a given time limit. Faster techniques may acquire more training samples, but only when they are more effective will they achieve higher performance on unseen testing sets. We demonstrate this result using several techniques, multiple datasets, and typical learning-time limits required by applications.

  17. Environmental Monitoring Networks Optimization Using Advanced Active Learning Algorithms

    NASA Astrophysics Data System (ADS)

    Kanevski, Mikhail; Volpi, Michele; Copa, Loris

    2010-05-01

    The problem of environmental monitoring networks optimization (MNO) belongs to one of the basic and fundamental tasks in spatio-temporal data collection, analysis, and modeling. There are several approaches to this problem, which can be considered as a design or redesign of monitoring network by applying some optimization criteria. The most developed and widespread methods are based on geostatistics (family of kriging models, conditional stochastic simulations). In geostatistics the variance is mainly used as an optimization criterion, which has some advantages and drawbacks. In the present research we study an application of advanced techniques following from the statistical learning theory (SLT) - support vector machines (SVM) - and consider the optimization of monitoring networks when dealing with a classification problem (data are discrete values/classes: hydrogeological units, soil types, pollution decision levels, etc.). SVM is a universal nonlinear modeling tool for classification problems in high dimensional spaces. The SVM solution is maximizing the decision boundary between classes and has a good generalization property for noisy data. The sparse solution of SVM is based on support vectors - data which contribute to the solution with nonzero weights. Fundamentally the MNO for classification problems can be considered as a task of selecting new measurement points which increase the quality of spatial classification and reduce the testing error (error on new independent measurements). In SLT this is a typical problem of active learning - a selection of the new unlabelled points which efficiently reduce the testing error. A classical approach (margin sampling) to active learning is to sample the points closest to the classification boundary. This solution is suboptimal when points (or generally the dataset) are redundant for the same class. In the present research we propose and study two new advanced methods of active learning adapted to the solution of the MNO problem: 1) hierarchical top-down clustering in an input space in order to remove redundancy when data are clustered, and 2) a general method (independent of the classifier) which gives posterior probabilities that can be used to define the classifier confidence and corresponding proposals for new measurement points. The basic ideas and procedures are explained by applying simulated data sets. The real case study deals with the analysis and mapping of soil types, which is a multi-class classification problem. Maps of soil types are important for the analysis and 3D modeling of heavy metal migration in soil and risk prediction mapping. The results obtained demonstrate the high quality of SVM mapping and the efficiency of monitoring network optimization using active learning approaches. The research was partly supported by SNSF projects No. 200021-126505 and 200020-121835.
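
    The classical margin-sampling criterion mentioned here can be sketched in a few lines: among candidate (unlabelled) monitoring locations, propose those closest to the current SVM decision boundary. The sketch below is illustrative; the arrays and the multi-class margin proxy are assumptions, not the authors' implementation.

    ```python
    # Illustrative margin-sampling step for monitoring network optimization;
    # data arrays are placeholders, and the multi-class margin is a simple proxy.
    import numpy as np
    from sklearn.svm import SVC

    def margin_sampling(X_labeled, y_labeled, X_candidates, n_new=10):
        """Return indices of the candidate points closest to the SVM boundary."""
        svm = SVC(kernel="rbf").fit(X_labeled, y_labeled)
        scores = np.abs(svm.decision_function(X_candidates))
        if scores.ndim > 1:                    # multi-class: distance to nearest boundary
            scores = scores.min(axis=1)
        return np.argsort(scores)[:n_new]      # proposed new measurement points
    ```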

  18. Evaluation of drug administration errors in a teaching hospital

    PubMed Central

    2012-01-01

    Background: Medication errors can occur at any of the three steps of the medication use process: prescribing, dispensing and administration. We aimed to determine the incidence, type and clinical importance of drug administration errors and to identify risk factors. Methods: Prospective study based on disguised observation technique in four wards in a teaching hospital in Paris, France (800 beds). A pharmacist accompanied nurses and witnessed the preparation and administration of drugs to all patients during the three drug rounds on each of six days per ward. Main outcomes were number, type and clinical importance of errors and associated risk factors. Drug administration error rate was calculated with and without wrong time errors. Relationship between the occurrence of errors and potential risk factors were investigated using logistic regression models with random effects. Results: Twenty-eight nurses caring for 108 patients were observed. Among 1501 opportunities for error, 415 administrations (430 errors) with one or more errors were detected (27.6%). There were 312 wrong time errors, ten simultaneously with another type of error, resulting in an error rate without wrong time error of 7.5% (113/1501). The most frequently administered drugs were the cardiovascular drugs (425/1501, 28.3%). The highest risks of error in a drug administration were for dermatological drugs. No potentially life-threatening errors were witnessed and 6% of errors were classified as having a serious or significant impact on patients (mainly omission). In multivariate analysis, the occurrence of errors was associated with drug administration route, drug classification (ATC) and the number of patients under the nurse's care. Conclusion: Medication administration errors are frequent. The identification of their determinants helps to undertake designed interventions. PMID:22409837

  19. Evaluation of drug administration errors in a teaching hospital.

    PubMed

    Berdot, Sarah; Sabatier, Brigitte; Gillaizeau, Florence; Caruba, Thibaut; Prognon, Patrice; Durieux, Pierre

    2012-03-12

    Medication errors can occur at any of the three steps of the medication use process: prescribing, dispensing and administration. We aimed to determine the incidence, type and clinical importance of drug administration errors and to identify risk factors. Prospective study based on disguised observation technique in four wards in a teaching hospital in Paris, France (800 beds). A pharmacist accompanied nurses and witnessed the preparation and administration of drugs to all patients during the three drug rounds on each of six days per ward. Main outcomes were number, type and clinical importance of errors and associated risk factors. Drug administration error rate was calculated with and without wrong time errors. Relationship between the occurrence of errors and potential risk factors were investigated using logistic regression models with random effects. Twenty-eight nurses caring for 108 patients were observed. Among 1501 opportunities for error, 415 administrations (430 errors) with one or more errors were detected (27.6%). There were 312 wrong time errors, ten simultaneously with another type of error, resulting in an error rate without wrong time error of 7.5% (113/1501). The most frequently administered drugs were the cardiovascular drugs (425/1501, 28.3%). The highest risks of error in a drug administration were for dermatological drugs. No potentially life-threatening errors were witnessed and 6% of errors were classified as having a serious or significant impact on patients (mainly omission). In multivariate analysis, the occurrence of errors was associated with drug administration route, drug classification (ATC) and the number of patients under the nurse's care. Medication administration errors are frequent. The identification of their determinants helps to undertake designed interventions.

  20. SAR-based change detection using hypothesis testing and Markov random field modelling

    NASA Astrophysics Data System (ADS)

    Cao, W.; Martinis, S.

    2015-04-01

    The objective of this study is to automatically detect changed areas caused by natural disasters from bi-temporal co-registered and calibrated TerraSAR-X data. The technique in this paper consists of two steps. Firstly, an automatic coarse detection step is applied based on a statistical hypothesis test for initializing the classification. The original analytical formula as proposed in the constant false alarm rate (CFAR) edge detector is reviewed and rewritten in a compact form of the incomplete beta function, which is a built-in routine in commercial scientific software such as MATLAB and IDL. Secondly, a post-classification step is introduced to optimize the noisy classification result in the previous step. Generally, an optimization problem can be formulated as a Markov random field (MRF) on which the quality of a classification is measured by an energy function. The optimal classification based on the MRF is related to the lowest energy value. Previous studies provide methods for the optimization problem using MRFs, such as the iterated conditional modes (ICM) algorithm. Recently, a novel algorithm was presented based on graph-cut theory. This method transforms an MRF into an equivalent graph and solves the optimization problem by a max-flow/min-cut algorithm on the graph. In this study this graph-cut algorithm is applied iteratively to improve the coarse classification. At each iteration the parameters of the energy function for the current classification are set by the logarithmic probability density function (PDF). The relevant parameters are estimated by the method of logarithmic cumulants (MoLC). Experiments are performed using two flood events in Germany and Australia in 2011 and a forest fire on La Palma in 2009, using pre- and post-event TerraSAR-X data. The results show convincing coarse classifications and considerable improvement by the graph-cut post-classification step.
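
    The regularized incomplete beta function referred to here is equally available outside MATLAB and IDL, for example in SciPy; the snippet below simply evaluates it (and its equivalent Beta CDF) for arbitrary illustrative parameters and does not reproduce the paper's CFAR expression.

    ```python
    # Evaluating the regularized incomplete beta function I_x(a, b) in SciPy;
    # the parameters are arbitrary illustrations, not the CFAR test statistic.
    from scipy.special import betainc
    from scipy.stats import beta

    a, b, x = 4.0, 9.0, 0.35
    print(betainc(a, b, x))    # regularized incomplete beta I_x(a, b)
    print(beta.cdf(x, a, b))   # identical value: the Beta(a, b) CDF
    ```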

  1. A new framework for analysing automated acoustic species-detection data: occupancy estimation and optimization of recordings post-processing

    USGS Publications Warehouse

    Chambert, Thierry A.; Waddle, J. Hardin; Miller, David A.W.; Walls, Susan; Nichols, James D.

    2018-01-01

    The development and use of automated species-detection technologies, such as acoustic recorders, for monitoring wildlife are rapidly expanding. Automated classification algorithms provide a cost- and time-effective means to process information-rich data, but often at the cost of additional detection errors. Appropriate methods are necessary to analyse such data while dealing with the different types of detection errors. We developed a hierarchical modelling framework for estimating species occupancy from automated species-detection data. We explore design and optimization of data post-processing procedures to account for detection errors and generate accurate estimates. Our proposed method accounts for both imperfect detection and false positive errors and utilizes information about both occurrence and abundance of detections to improve estimation. Using simulations, we show that our method provides much more accurate estimates than models ignoring the abundance of detections. The same findings are reached when we apply the methods to two real datasets on North American frogs surveyed with acoustic recorders. When false positives occur, estimator accuracy can be improved when a subset of detections produced by the classification algorithm is post-validated by a human observer. We use simulations to investigate the relationship between accuracy and effort spent on post-validation, and found that very accurate occupancy estimates can be obtained with as little as 1% of data being validated. Automated monitoring of wildlife provides opportunity and challenges. Our methods for analysing automated species-detection data help to meet key challenges unique to these data and will prove useful for many wildlife monitoring programs.

  2. Segmentation, classification, and pose estimation of military vehicles in low resolution laser radar images

    NASA Astrophysics Data System (ADS)

    Neulist, Joerg; Armbruster, Walter

    2005-05-01

    Model-based object recognition in range imagery typically involves matching the image data to the expected model data for each feasible model and pose hypothesis. Since the matching procedure is computationally expensive, the key to efficient object recognition is the reduction of the set of feasible hypotheses. This is particularly important for military vehicles, which may consist of several large moving parts such as the hull, turret, and gun of a tank, and hence require an eight or higher dimensional pose space to be searched. The presented paper outlines techniques for reducing the set of feasible hypotheses based on an estimation of target dimensions and orientation. Furthermore, the presence of a turret and a main gun and their orientations are determined. The vehicle parts dimensions as well as their error estimates restrict the number of model hypotheses whereas the position and orientation estimates and their error bounds reduce the number of pose hypotheses needing to be verified. The techniques are applied to several hundred laser radar images of eight different military vehicles with various part classifications and orientations. On-target resolution in azimuth, elevation and range is about 30 cm. The range images contain up to 20% dropouts due to atmospheric absorption. Additionally some target retro-reflectors produce outliers due to signal crosstalk. The presented algorithms are extremely robust with respect to these and other error sources. The hypothesis space for hull orientation is reduced to about 5 degrees as is the error for turret rotation and gun elevation, provided the main gun is visible.

  3. Waterbodies Extraction from LANDSAT8-OLI Imagery Using a Water Index-Guided Stochastic Fully-Connected Conditional Random Field Model and the Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Wang, X.; Xu, L.

    2018-04-01

    One of the most important applications of remote sensing classification is water extraction. The water index (WI) based on Landsat images is one of the most common ways to distinguish water bodies from other land surface features. But conventional WI methods take into account spectral information only from a limited number of bands, and therefore the accuracy of those WI methods may be constrained in some areas which are covered with snow/ice, clouds, etc. An accurate and robust water extraction method is therefore essential. The support vector machine (SVM), which uses spectral information from all bands, can reduce these classification errors to some extent. Nevertheless, SVM, which barely considers spatial information, is relatively sensitive to noise in local regions. Conditional random field (CRF), which considers both spatial information and spectral information, has proven to be able to compensate for these limitations. Hence, in this paper, we develop a systematic water extraction method by taking advantage of the complementarity between the SVM and a water index-guided stochastic fully-connected conditional random field (SVM-WIGSFCRF) to address the above issues. In addition, we comprehensively evaluate the reliability and accuracy of the proposed method using Landsat-8 operational land imager (OLI) images of one test site. We assess the method's performance by calculating the following accuracy metrics: Omission Errors (OE) and Commission Errors (CE); Kappa coefficient (KP) and Total Error (TE). Experimental results show that the new method can improve target detection accuracy under complex and changeable environments.
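
    For reference, the listed accuracy metrics follow from a binary water/non-water confusion matrix as sketched below; the counts are placeholders, not the Landsat-8 test-site results.

    ```python
    # Omission error, commission error, and kappa from a binary confusion matrix;
    # the counts are placeholder values, not the study's results.
    import numpy as np

    def accuracy_metrics(cm):
        """cm: 2x2 confusion matrix, rows = reference, cols = classified,
        class order [water, non-water]."""
        cm = np.asarray(cm, dtype=float)
        n = cm.sum()
        oe = cm[0, 1] / cm[0, :].sum()             # omission error for the water class
        ce = cm[1, 0] / cm[:, 0].sum()             # commission error for the water class
        po = np.trace(cm) / n                      # observed agreement
        pe = (cm.sum(0) * cm.sum(1)).sum() / n**2  # chance agreement
        kappa = (po - pe) / (1 - pe)
        return oe, ce, kappa

    print(accuracy_metrics([[950, 50], [30, 970]]))
    ```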

  4. Construction of a Calibrated Probabilistic Classification Catalog: Application to 50k Variable Sources in the All-Sky Automated Survey

    NASA Astrophysics Data System (ADS)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Brink, Henrik; Crellin-Quick, Arien

    2012-12-01

    With growing data volumes from synoptic surveys, astronomers necessarily must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of classification purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In addition to producing accurate classifications, we show how to estimate calibrated class probabilities and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All-Sky Automated Survey (ASAS), and release the Machine-learned ASAS Classification Catalog (MACC), a 28 class probabilistic classification catalog of 50,124 ASAS sources in the ASAS Catalog of Variable Stars. We estimate that MACC achieves a sub-20% classification error rate and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes.

  5. CONSTRUCTION OF A CALIBRATED PROBABILISTIC CLASSIFICATION CATALOG: APPLICATION TO 50k VARIABLE SOURCES IN THE ALL-SKY AUTOMATED SURVEY

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.

    2012-12-15

    With growing data volumes from synoptic surveys, astronomers necessarily must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of classification purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In addition to producing accurate classifications, we show how to estimate calibrated class probabilities and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All-Sky Automated Survey (ASAS), and release the Machine-learned ASAS Classification Catalog (MACC), a 28 class probabilistic classification catalog of 50,124 ASAS sources in the ASAS Catalog of Variable Stars. We estimate that MACC achieves a sub-20% classification error rate and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes.

  6. Integrating molecular markers into the World Health Organization classification of CNS tumors: a survey of the neuro-oncology community.

    PubMed

    Aldape, Kenneth; Nejad, Romina; Louis, David N; Zadeh, Gelareh

    2017-03-01

    Molecular markers provide important biological and clinical information related to the classification of brain tumors, and the integration of relevant molecular parameters into brain tumor classification systems has been a widely discussed topic in neuro-oncology over the past decade. With recent advances in the development of clinically relevant molecular signatures and the 2016 World Health Organization (WHO) update, the views of the neuro-oncology community on such changes would be informative for implementing this process. A survey with 8 questions regarding molecular markers in tumor classification was sent to an email list of Society for Neuro-Oncology members and attendees of prior meetings (n=5065). There were 403 respondents. Analysis was performed using whole group response, based on self-reported subspecialty. The survey results show overall strong support for incorporating molecular knowledge into the classification and clinical management of brain tumors. Across all 7 subspecialty groups, ≥70% of respondents agreed to this integration. Interestingly, some variability is seen among subspecialties, notably with lowest support from neuropathologists, which may reflect their roles in implementing such diagnostic technologies. Based on a survey provided to the neuro-oncology community, we report strong support for the integration of molecular markers into the WHO classification of brain tumors, as well as for using an integrated "layered" diagnostic format. While membership from each specialty showed support, there was variation by specialty in enthusiasm regarding proposed changes. The initial results of this survey influenced the deliberations underlying the 2016 WHO classification of tumors of the central nervous system. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Neuro-Oncology.

  7. Sensitivity and accuracy of high-throughput metabarcoding methods used to describe aquatic communities for early detection of invasive fish species

    EPA Science Inventory

    For early detection biomonitoring of aquatic invasive species, sensitivity to rare individuals and accurate, high-resolution taxonomic classification are critical to minimize Type I and II detection errors. Given the great expense and effort associated with morphological identifi...

  8. The School-Based Multidisciplinary Team and Nondiscriminatory Assessment.

    ERIC Educational Resources Information Center

    Pfeiffer, Steven I.

    The potential of multidisciplinary teams to control for possible errors in diagnosis, classification, and placement and to provide a vehicle for ensuring effective outcomes of diagnostic practices is illustrated. The present functions of the school-based multidisciplinary team (also called, for example, assessment team, child study team, placement…

  9. Competitive Learning Neural Network Ensemble Weighted by Predicted Performance

    ERIC Educational Resources Information Center

    Ye, Qiang

    2010-01-01

    Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different…

  10. On Correlations, Distances and Error Rates.

    ERIC Educational Resources Information Center

    Dorans, Neil J.

    The nature of the criterion (dependent) variable may play a useful role in structuring a list of classification/prediction problems. Such criteria are continuous in nature, binary dichotomous, or multichotomous. In this paper, discussion is limited to the continuous normally distributed criterion scenarios. For both cases, it is assumed that the…

  11. Sleep Patterns and Its Relationship to Schooling and Family.

    ERIC Educational Resources Information Center

    Jones, Franklin Ross

    Diagnostic classifications of sleep and arousal disorders have been categorized in four major areas: disorders of initiating and maintaining sleep, disorders of excessive sleepiness, disorders of the sleep/wake pattern, and the parasomnias such as sleep walking, talking, and night terrors. Another nomenclature classifies them into DIMS (disorders…

  12. Hand posture classification using electrocorticography signals in the gamma band over human sensorimotor brain areas

    NASA Astrophysics Data System (ADS)

    Chestek, Cynthia A.; Gilja, Vikash; Blabe, Christine H.; Foster, Brett L.; Shenoy, Krishna V.; Parvizi, Josef; Henderson, Jaimie M.

    2013-04-01

    Objective. Brain-machine interface systems translate recorded neural signals into command signals for assistive technology. In individuals with upper limb amputation or cervical spinal cord injury, the restoration of a useful hand grasp could significantly improve daily function. We sought to determine if electrocorticographic (ECoG) signals contain sufficient information to select among multiple hand postures for a prosthetic hand, orthotic, or functional electrical stimulation system. Approach. We recorded ECoG signals from subdural macro- and microelectrodes implanted in motor areas of three participants who were undergoing inpatient monitoring for diagnosis and treatment of intractable epilepsy. Participants performed five distinct isometric hand postures, as well as four distinct finger movements. Several control experiments were attempted in order to remove sensory information from the classification results. Online experiments were performed with two participants. Main results. Classification rates were 68%, 84% and 81% for correct identification of 5 isometric hand postures offline. Using 3 potential controls for removing sensory signals, error rates were approximately doubled on average (2.1×). A similar increase in errors (2.6×) was noted when the participant was asked to make simultaneous wrist movements along with the hand postures. In online experiments, fist versus rest was successfully classified on 97% of trials; the classification output drove a prosthetic hand. Online classification performance for a larger number of hand postures remained above chance, but substantially below offline performance. In addition, the long integration windows used would preclude the use of decoded signals for control of a BCI system. Significance. These results suggest that ECoG is a plausible source of command signals for prosthetic grasp selection. Overall, avenues remain for improvement through better electrode designs and placement, better participant training, and characterization of non-stationarities such that ECoG could be a viable signal source for grasp control for amputees or individuals with paralysis.

  13. Pornography classification: The hidden clues in video space-time.

    PubMed

    Moreira, Daniel; Avila, Sandra; Perez, Mauricio; Moraes, Daniel; Testoni, Vanessa; Valle, Eduardo; Goldenstein, Siome; Rocha, Anderson

    2016-11-01

    As web technologies and social networks become part of the general public's life, the problem of automatically detecting pornography is into every parent's mind - nobody feels completely safe when their children go online. In this paper, we focus on video-pornography classification, a hard problem in which traditional methods often employ still-image techniques - labeling frames individually prior to a global decision. Frame-based approaches, however, ignore significant cogent information brought by motion. Here, we introduce a space-temporal interest point detector and descriptor called Temporal Robust Features (TRoF). TRoF was custom-tailored for efficient (low processing time and memory footprint) and effective (high classification accuracy and low false negative rate) motion description, particularly suited to the task at hand. We aggregate local information extracted by TRoF into a mid-level representation using Fisher Vectors, the state-of-the-art model of Bags of Visual Words (BoVW). We evaluate our original strategy, contrasting it both to commercial pornography detection solutions, and to BoVW solutions based upon other space-temporal features from the scientific literature. The performance is assessed using the Pornography-2k dataset, a new challenging pornographic benchmark, comprising 2000 web videos and 140h of video footage. The dataset is also a contribution of this work and is very assorted, including both professional and amateur content, and it depicts several genres of pornography, from cartoon to live action, with diverse behavior and ethnicity. The best approach, based on a dense application of TRoF, yields a classification error reduction of almost 79% when compared to the best commercial classifier. A sparse description relying on TRoF detector is also noteworthy, for yielding a classification error reduction of over 69%, with 19× less memory footprint than the dense solution, and yet can also be implemented to meet real-time requirements. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  14. Large scale topography of Io

    NASA Technical Reports Server (NTRS)

    Gaskell, R. W.; Synnott, S. P.

    1987-01-01

    To investigate the large scale topography of the Jovian satellite Io, both limb observations and stereographic techniques applied to landmarks are used. The raw data for this study consist of Voyager 1 images of Io: 800×800 arrays of picture elements, each of which can take on 256 possible brightness values. In analyzing these data it was necessary to identify and locate landmarks and limb points on the raw images, remove the image distortions caused by the camera electronics, and translate the corrected locations into positions relative to a reference geoid. Minimizing the uncertainty in the corrected locations is crucial to the success of this project. In the highest resolution frames, an error of a tenth of a pixel in image space location can lead to a 300 m error in true location. In the lowest resolution frames, the same error can lead to an uncertainty of several km.
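
    The quoted error figures amount to a simple scaling of image-space error by pixel scale; the back-of-envelope below makes that explicit. The pixel scales are inferred from the numbers in the abstract (0.1 px corresponding to roughly 300 m at best resolution), not taken from the original paper.

```python
# Ground error = image-space error (pixels) x pixel scale (meters per pixel).
pixel_error = 0.1                 # pixels
scale_high_res = 3000.0           # m/px implied by 0.1 px -> ~300 m (assumed)
scale_low_res = 30000.0           # m/px assumed for the lowest-resolution frames

print("high-res ground error: %.0f m" % (pixel_error * scale_high_res))         # ~300 m
print("low-res ground error:  %.0f km" % (pixel_error * scale_low_res / 1000))  # a few km
```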

  15. Delay compensation - Its effect in reducing sampling errors in Fourier spectroscopy

    NASA Technical Reports Server (NTRS)

    Zachor, A. S.; Aaronson, S. M.

    1979-01-01

    An approximate formula is derived for the spectrum ghosts caused by periodic drive speed variations in a Michelson interferometer. The solution represents the case of fringe-controlled sampling and is applicable when the reference fringes are delayed to compensate for the delay introduced by the electrical filter in the signal channel. Numerical results are worked out for several common low-pass filters. It is shown that the maximum relative ghost amplitude over the range of frequencies corresponding to the lower half of the filter band is typically 20 times smaller than the relative zero-to-peak velocity error, when delayed sampling is used. In the lowest quarter of the filter band it is more than 100 times smaller than the relative velocity error. These values are ten and forty times smaller, respectively, than they would be without delay compensation if the filter is a 6-pole Butterworth.
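
    The delay being compensated is essentially the group delay of the signal-channel low-pass filter. As a rough illustration (with an arbitrary cutoff, not a value from the paper), the sketch below computes the group delay of a 6-pole Butterworth filter over the lower half of its band, which is the delay one would apply to the reference fringes.

```python
# Group delay of a 6-pole Butterworth low-pass filter (illustrative cutoff).
import numpy as np
from scipy.signal import butter, group_delay

b, a = butter(6, 0.2)                    # cutoff at 0.2 of Nyquist (assumed value)
w, gd = group_delay((b, a), w=2048)      # w in rad/sample, gd in samples

lower_half = w <= 0.2 * np.pi / 2        # lower half of the filter band
print("mean group delay in the lower half-band: %.1f samples" % gd[lower_half].mean())
```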

  16. Cross-Spectrum PM Noise Measurement, Thermal Energy, and Metamaterial Filters.

    PubMed

    Gruson, Yannick; Giordano, Vincent; Rohde, Ulrich L; Poddar, Ajay K; Rubiola, Enrico

    2017-03-01

    Virtually all commercial instruments for the measurement of oscillator PM noise make use of the cross-spectrum method (arXiv:1004.5539 [physics.ins-det], 2010). High sensitivity is achieved by correlation and averaging on two equal channels, which measure the same input and reject the background of the instrument. We show that a systematic error is always present if the thermal energy of the input power splitter is not accounted for. Such an error can result in noise underestimation of up to a few decibels in the lowest-noise quartz oscillators, and in an invalid measurement in the case of cryogenic oscillators. As another alarming fact, the presence of metamaterial components in the oscillator results in unpredictable behavior and large errors, even under well controlled experimental conditions. We observed a spread of 40 dB in the phase noise spectra of an oscillator, obtained just by replacing the output filter.
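
    The core of the cross-spectrum method is that two channels observing the same oscillator, each with its own independent background, are averaged so that the uncorrelated backgrounds cancel while the common term survives. The toy sketch below (synthetic white noise, arbitrary levels) illustrates the idea; it does not model the thermal-energy effect analyzed in the paper.

```python
# Cross-spectrum averaging: uncorrelated backgrounds are rejected roughly as
# 1/sqrt(number of averaged segments), while the common term remains.
import numpy as np
from scipy.signal import csd, welch

rng = np.random.default_rng(1)
fs, n = 1e4, 2**18
common = 0.01 * rng.standard_normal(n)     # "oscillator" noise seen by both channels
x = common + rng.standard_normal(n)        # channel 1: common term + own background
y = common + rng.standard_normal(n)        # channel 2: independent background

f, pxx = welch(x, fs=fs, nperseg=4096)     # single-channel PSD: background-dominated
f, sxy = csd(x, y, fs=fs, nperseg=4096)    # cross-spectrum: backgrounds average away

print("single-channel PSD level:   %.2e" % pxx[10:].mean())
print("cross-spectrum level |Sxy|: %.2e" % np.abs(sxy[10:]).mean())
```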

  17. Comparison of artificial intelligence classifiers for SIP attack data

    NASA Astrophysics Data System (ADS)

    Safarik, Jakub; Slachta, Jiri

    2016-05-01

    A honeypot application is a source of valuable data about attacks on a network. We run several SIP honeypots in various computer networks, which are separated geographically and logically. Each honeypot runs on a public IP address and uses standard SIP PBX ports. All information gathered via the honeypots is periodically sent to a centralized server, which classifies the attack data with a neural network algorithm. The paper describes optimizations of a neural network classifier that lower the classification error. The article compares two neural network algorithms used for the classification of validation data: the first is the original implementation of the neural network described in recent work; the second uses further optimizations such as input normalization and a cross-entropy cost function. We also use other implementations of neural networks and machine learning classification algorithms, and the comparison tests their capabilities on validation data to find the optimal classifier. The results show promise for further development of an accurate SIP attack classification engine.
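
    The two optimizations named in the abstract, input normalization and a cross-entropy cost function, can be sketched with an off-the-shelf pipeline. The synthetic features below merely stand in for the honeypot attack data, and the network size is an arbitrary choice, not the configuration evaluated in the paper.

```python
# Normalized inputs + an MLP trained with a cross-entropy (log-loss) objective.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for features extracted from SIP honeypot attack records.
X, y = make_classification(n_samples=2000, n_features=20, n_classes=5,
                           n_informative=10, random_state=0)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                                  random_state=0))
print("cross-validated accuracy: %.3f" % cross_val_score(clf, X, y, cv=5).mean())
```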

  18. Automatic Identification of Critical Follow-Up Recommendation Sentences in Radiology Reports

    PubMed Central

    Yetisgen-Yildiz, Meliha; Gunn, Martin L.; Xia, Fei; Payne, Thomas H.

    2011-01-01

    Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. When recommendations are not systematically identified and promptly communicated to referrers, poor patient outcomes can result. Using information technology can improve communication and improve patient safety. In this paper, we describe a text processing approach that uses natural language processing (NLP) and supervised text classification methods to automatically identify critical recommendation sentences in radiology reports. To increase the classification performance we enhanced the simple unigram token representation approach with lexical, semantic, knowledge-base, and structural features. We tested different combinations of those features with the Maximum Entropy (MaxEnt) classification algorithm. Classifiers were trained and tested with a gold standard corpus annotated by a domain expert. We applied 5-fold cross validation and our best performing classifier achieved 95.60% precision, 79.82% recall, 87.0% F-score, and 99.59% classification accuracy in identifying the critical recommendation sentences in radiology reports. PMID:22195225
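
    A maximum-entropy classifier is equivalent to (multinomial) logistic regression, so the baseline unigram approach can be sketched as below. The toy sentences and labels are invented for illustration; the paper's classifier is trained on an expert-annotated corpus and augmented with lexical, semantic, knowledge-base, and structural features.

```python
# Unigram bag-of-words + logistic regression (a maximum-entropy classifier),
# evaluated with 5-fold cross validation on toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

sentences = [
    "Follow-up chest CT in 3 months is recommended to evaluate the nodule.",
    "Recommend correlation with prior imaging and clinical follow-up.",
    "The cardiomediastinal silhouette is within normal limits.",
    "No acute intracranial hemorrhage is identified.",
] * 25
labels = [1, 1, 0, 0] * 25        # 1 = critical recommendation sentence

clf = make_pipeline(CountVectorizer(ngram_range=(1, 1)),
                    LogisticRegression(max_iter=1000))
print("5-fold CV accuracy: %.3f" % cross_val_score(clf, sentences, labels, cv=5).mean())
```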

  19. Automatic identification of critical follow-up recommendation sentences in radiology reports.

    PubMed

    Yetisgen-Yildiz, Meliha; Gunn, Martin L; Xia, Fei; Payne, Thomas H

    2011-01-01

    Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. When recommendations are not systematically identified and promptly communicated to referrers, poor patient outcomes can result. Using information technology can improve communication and improve patient safety. In this paper, we describe a text processing approach that uses natural language processing (NLP) and supervised text classification methods to automatically identify critical recommendation sentences in radiology reports. To increase the classification performance we enhanced the simple unigram token representation approach with lexical, semantic, knowledge-base, and structural features. We tested different combinations of those features with the Maximum Entropy (MaxEnt) classification algorithm. Classifiers were trained and tested with a gold standard corpus annotated by a domain expert. We applied 5-fold cross validation and our best performing classifier achieved 95.60% precision, 79.82% recall, 87.0% F-score, and 99.59% classification accuracy in identifying the critical recommendation sentences in radiology reports.

  20. A Pruning Neural Network Model in Credit Classification Analysis

    PubMed Central

    Tang, Yajiao; Ji, Junkai; Dai, Hongwei; Yu, Yang; Todo, Yuki

    2018-01-01

    Nowadays, credit classification models are widely applied because they can help financial decision-makers to handle credit classification issues. Among them, artificial neural networks (ANNs) have been widely accepted as convincing methods in the credit industry. In this paper, we propose a pruning neural network (PNN) and apply it to the credit classification problem using the well-known Australian and Japanese credit datasets. The model is inspired by the synaptic nonlinearity of a dendritic tree in a biological neural model and is trained by an error back-propagation algorithm. It realizes a neuronal pruning function by removing superfluous synapses and useless dendrites, forming a tidy dendritic morphology at the end of learning. Furthermore, we utilize logic circuits (LCs) to simulate the dendritic structures, which allows PNN to be implemented effectively in hardware. The statistical results of our experiments verify that PNN obtains superior performance in comparison with other classical algorithms in terms of accuracy and computational efficiency. PMID:29606961
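
    The dendritic PNN itself is not spelled out in the abstract, so the sketch below only illustrates the general idea of magnitude-based pruning after error back-propagation: train a single logistic neuron by gradient descent, then zero out the small "superfluous" weights. The synthetic data, threshold, and one-layer model are assumptions and should not be read as the paper's architecture.

```python
# Generic magnitude-based pruning illustration (not the paper's dendritic PNN).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 14))                               # stand-in credit features
true_w = np.concatenate([rng.standard_normal(5), np.zeros(9)])   # only 5 useful features
y = (X @ true_w + 0.1 * rng.standard_normal(200) > 0).astype(float)

# Train a single logistic neuron by gradient descent (back-propagation in the
# one-layer case), then prune weights whose magnitude falls below a threshold.
w, b, lr = np.zeros(14), 0.0, 0.1
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * (p - y).mean()

pruned = np.where(np.abs(w) < 0.2, 0.0, w)                       # remove "superfluous synapses"
acc = ((1.0 / (1.0 + np.exp(-(X @ pruned + b))) > 0.5) == y).mean()
print("kept %d of %d weights, accuracy after pruning: %.2f"
      % ((pruned != 0).sum(), w.size, acc))
```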
