average classification error: Topics by Science.gov

Sample records for average classification error

Masked and unmasked error-related potentials during continuous control and feedback

NASA Astrophysics Data System (ADS)

Lopes Dias, Catarina; Sburlea, Andreea I.; Müller-Putz, Gernot R.

2018-06-01

The detection of error-related potentials (ErrPs) in tasks with discrete feedback is well established in the brain–computer interface (BCI) field. However, the decoding of ErrPs in tasks with continuous feedback is still in its early stages. Objective. We developed a task in which subjects have continuous control of a cursor’s position by means of a joystick. The cursor’s position was shown to the participants in two different modalities of continuous feedback: normal and jittered. The jittered feedback was created to mimic the instability that could exist if participants controlled the trajectory directly with brain signals. Approach. This paper studies the electroencephalographic (EEG)—measurable signatures caused by a loss of control over the cursor’s trajectory, causing a target miss. Main results. In both feedback modalities, time-locked potentials revealed the typical frontal-central components of error-related potentials. Errors occurring during the jittered feedback (masked errors) were delayed in comparison to errors occurring during normal feedback (unmasked errors). Masked errors displayed lower peak amplitudes than unmasked errors. Time-locked classification analysis allowed a good distinction between correct and error classes (average Cohen-, average TPR = 81.8% and average TNR = 96.4%). Time-locked classification analysis between masked error and unmasked error classes revealed results at chance level (average Cohen-, average TPR = 60.9% and average TNR = 58.3%). Afterwards, we performed asynchronous detection of ErrPs, combining both masked and unmasked trials. The asynchronous detection of ErrPs in a simulated online scenario resulted in an average TNR of 84.0% and in an average TPR of 64.9%. Significance. The time-locked classification results suggest that the masked and unmasked errors were indistinguishable in terms of classification. The asynchronous classification results suggest that the feedback modality did not hinder the asynchronous detection of ErrPs.
Toward attenuating the impact of arm positions on electromyography pattern-recognition based motion classification in transradial amputees

PubMed Central

2012-01-01

Background Electromyography (EMG) pattern-recognition based control strategies for multifunctional myoelectric prosthesis systems have been studied commonly in a controlled laboratory setting. Before these myoelectric prosthesis systems are clinically viable, it will be necessary to assess the effect of some disparities between the ideal laboratory setting and practical use on the control performance. One important obstacle is the impact of arm position variation that causes the changes of EMG pattern when performing identical motions in different arm positions. This study aimed to investigate the impacts of arm position variation on EMG pattern-recognition based motion classification in upper-limb amputees and the solutions for reducing these impacts. Methods With five unilateral transradial (TR) amputees, the EMG signals and tri-axial accelerometer mechanomyography (ACC-MMG) signals were simultaneously collected from both amputated and intact arms when performing six classes of arm and hand movements in each of five arm positions that were considered in the study. The effect of the arm position changes was estimated in terms of motion classification error and compared between amputated and intact arms. Then the performance of three proposed methods in attenuating the impact of arm positions was evaluated. Results With EMG signals, the average intra-position and inter-position classification errors across all five arm positions and five subjects were around 7.3% and 29.9% from amputated arms, respectively, about 1.0% and 10% low in comparison with those from intact arms. While ACC-MMG signals could yield a similar intra-position classification error (9.9%) as EMG, they had much higher inter-position classification error with an average value of 81.1% over the arm positions and the subjects. When the EMG data from all five arm positions were involved in the training set, the average classification error reached a value of around 10.8% for amputated arms. Using a two-stage cascade classifier, the average classification error was around 9.0% over all five arm positions. Reducing ACC-MMG channels from 8 to 2 only increased the average position classification error across all five arm positions from 0.7% to 1.0% in amputated arms. Conclusions The performance of EMG pattern-recognition based method in classifying movements strongly depends on arm positions. This dependency is a little stronger in intact arm than in amputated arm, which suggests that the investigations associated with practical use of a myoelectric prosthesis should use the limb amputees as subjects instead of using able-body subjects. The two-stage cascade classifier mode with ACC-MMG for limb position identification and EMG for limb motion classification may be a promising way to reduce the effect of limb position variation on classification performance. PMID:23036049
Sources of error in estimating truck traffic from automatic vehicle classification data

DOT National Transportation Integrated Search

1998-10-01

Truck annual average daily traffic estimation errors resulting from sample classification counts are computed in this paper under two scenarios. One scenario investigates an improper factoring procedure that may be used by highway agencies. The study...
Predicting alpine headwater stream intermittency: a case study in the northern Rocky Mountains

USGS Publications Warehouse

Sando, Thomas R.; Blasch, Kyle W.

2015-01-01

This investigation used climatic, geological, and environmental data coupled with observational stream intermittency data to predict alpine headwater stream intermittency. Prediction was made using a random forest classification model. Results showed that the most important variables in the prediction model were snowpack persistence, represented by average snow extent from March through July, mean annual mean monthly minimum temperature, and surface geology types. For stream catchments with intermittent headwater streams, snowpack, on average, persisted until early June, whereas for stream catchments with perennial headwater streams, snowpack, on average, persisted until early July. Additionally, on average, stream catchments with intermittent headwater streams were about 0.7 °C warmer than stream catchments with perennial headwater streams. Finally, headwater stream catchments primarily underlain by coarse, permeable sediment are significantly more likely to have intermittent headwater streams than those primarily underlain by impermeable bedrock. Comparison of the predicted streamflow classification with observed stream status indicated a four percent classification error for first-order streams and a 21 percent classification error for all stream orders in the study area.
Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography

PubMed Central

Umut, İlhan; Çentik, Güven

2016-01-01

The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present. PMID:27213008
Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography.

PubMed

Umut, İlhan; Çentik, Güven

2016-01-01

The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present.
Use of scan overlap redundancy to enhance multispectral aircraft scanner data

NASA Technical Reports Server (NTRS)

Lindenlaub, J. C.; Keat, J.

1973-01-01

Two criteria were suggested for optimizing the resolution error versus signal-to-noise-ratio tradeoff. The first criterion uses equal weighting coefficients and chooses n, the number of lines averaged, so as to make the average resolution error equal to the noise error. The second criterion adjusts both the number and relative sizes of the weighting coefficients so as to minimize the total error (resolution error plus noise error). The optimum set of coefficients depends upon the geometry of the resolution element, the number of redundant scan lines, the scan line increment, and the original signal-to-noise ratio of the channel. Programs were developed to find the optimum number and relative weights of the averaging coefficients. A working definition of signal-to-noise ratio was given and used to try line averaging on a typical set of data. Line averaging was evaluated only with respect to its effect on classification accuracy.
Spelling in adolescents with dyslexia: errors and modes of assessment.

PubMed

Tops, Wim; Callens, Maaike; Bijn, Evi; Brysbaert, Marc

2014-01-01

In this study we focused on the spelling of high-functioning students with dyslexia. We made a detailed classification of the errors in a word and sentence dictation task made by 100 students with dyslexia and 100 matched control students. All participants were in the first year of their bachelor's studies and had Dutch as mother tongue. Three main error categories were distinguished: phonological, orthographic, and grammatical errors (on the basis of morphology and language-specific spelling rules). The results indicated that higher-education students with dyslexia made on average twice as many spelling errors as the controls, with effect sizes of d ≥ 2. When the errors were classified as phonological, orthographic, or grammatical, we found a slight dominance of phonological errors in students with dyslexia. Sentence dictation did not provide more information than word dictation in the correct classification of students with and without dyslexia. © Hammill Institute on Disabilities 2012.
An experimental study of interstitial lung tissue classification in HRCT images using ANN and role of cost functions

NASA Astrophysics Data System (ADS)

Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen

2017-03-01

In this paper, we investigate the effect of the error criteria used during a training phase of the artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected with Interstitial Lung Diseases (ILD). Mean square error (MSE) and the cross-entropy (CE) criteria are chosen being most popular choice in state-of-the-art implementations. The classification experiment performed on the six interstitial lung disease (ILD) patterns viz. Consolidation, Emphysema, Ground Glass Opacity, Micronodules, Fibrosis and Healthy from MedGIFT database. The texture features from an arbitrary region of interest (AROI) are extracted using Gabor filter. Two different neural networks are trained with the scaled conjugate gradient back propagation algorithm with MSE and CE error criteria function respectively for weight updation. Performance is evaluated in terms of average accuracy of these classifiers using 4 fold cross-validation. Each network is trained for five times for each fold with randomly initialized weight vectors and accuracies are computed. Significant improvement in classification accuracy is observed when ANN is trained by using CE (67.27%) as error function compared to MSE (63.60%). Moreover, standard deviation of the classification accuracy for the network trained with CE (6.69) error criteria is found less as compared to network trained with MSE (10.32) criteria.
Regulation of IAP (Inhibitor of Apoptosis) Gene Expression by the p53 Tumor Suppressor Protein

DTIC Science & Technology

2005-05-01

adenovirus, gene therapy, polymorphism, 31 16. PRICE CODE 17. SECURITY CLASSIFICATION 18. SECURITY CLASSIFICATION 19. SECURITY CLASSIFICATION 20...averaged results of three inde- pendent experiments, with standard error. Right panel: Level of p53 in infected cells using the antibody Ab-6 (Calbiochem...with highly purified mitochondria as described in (2). The arrow marks oligomerized BAK. The right _ -. panel depicts the purity of BMH CrosIinked Mito
Land use surveys by means of automatic interpretation of LANDSAT system data

NASA Technical Reports Server (NTRS)

Dejesusparada, N. (Principal Investigator); Lombardo, M. A.; Novo, E. M. L. D.; Niero, M.; Foresti, C.

1981-01-01

Analyses for seven land-use classes are presented. The classes are: urban area, industrial area, bare soil, cultivated area, pastureland, reforestation, and natural vegetation. The automatic classification of LANDSAT MSS data using a maximum likelihood algorithm shows a 39% average error of emission and a 3.45 error of commission for the seven classes.
Automated Classification of Selected Data Elements from Free-text Diagnostic Reports for Clinical Research.

PubMed

Löpprich, Martin; Krauss, Felix; Ganzinger, Matthias; Senghas, Karsten; Riezler, Stefan; Knaup, Petra

2016-08-05

In the Multiple Myeloma clinical registry at Heidelberg University Hospital, most data are extracted from discharge letters. Our aim was to analyze if it is possible to make the manual documentation process more efficient by using methods of natural language processing for multiclass classification of free-text diagnostic reports to automatically document the diagnosis and state of disease of myeloma patients. The first objective was to create a corpus consisting of free-text diagnosis paragraphs of patients with multiple myeloma from German diagnostic reports, and its manual annotation of relevant data elements by documentation specialists. The second objective was to construct and evaluate a framework using different NLP methods to enable automatic multiclass classification of relevant data elements from free-text diagnostic reports. The main diagnoses paragraph was extracted from the clinical report of one third randomly selected patients of the multiple myeloma research database from Heidelberg University Hospital (in total 737 selected patients). An EDC system was setup and two data entry specialists performed independently a manual documentation of at least nine specific data elements for multiple myeloma characterization. Both data entries were compared and assessed by a third specialist and an annotated text corpus was created. A framework was constructed, consisting of a self-developed package to split multiple diagnosis sequences into several subsequences, four different preprocessing steps to normalize the input data and two classifiers: a maximum entropy classifier (MEC) and a support vector machine (SVM). In total 15 different pipelines were examined and assessed by a ten-fold cross-validation, reiterated 100 times. For quality indication the average error rate and the average F1-score were conducted. For significance testing the approximate randomization test was used. The created annotated corpus consists of 737 different diagnoses paragraphs with a total number of 865 coded diagnosis. The dataset is publicly available in the supplementary online files for training and testing of further NLP methods. Both classifiers showed low average error rates (MEC: 1.05; SVM: 0.84) and high F1-scores (MEC: 0.89; SVM: 0.92). However the results varied widely depending on the classified data element. Preprocessing methods increased this effect and had significant impact on the classification, both positive and negative. The automatic diagnosis splitter increased the average error rate significantly, even if the F1-score decreased only slightly. The low average error rates and high average F1-scores of each pipeline demonstrate the suitability of the investigated NPL methods. However, it was also shown that there is no best practice for an automatic classification of data elements from free-text diagnostic reports.
Sea ice classification using fast learning neural networks

NASA Technical Reports Server (NTRS)

Dawson, M. S.; Fung, A. K.; Manry, M. T.

1992-01-01

A first learning neural network approach to the classification of sea ice is presented. The fast learning (FL) neural network and a multilayer perceptron (MLP) trained with backpropagation learning (BP network) were tested on simulated data sets based on the known dominant scattering characteristics of the target class. Four classes were used in the data simulation: open water, thick lossy saline ice, thin saline ice, and multiyear ice. The BP network was unable to consistently converge to less than 25 percent error while the FL method yielded an average error of approximately 1 percent on the first iteration of training. The fast learning method presented can significantly reduce the CPU time necessary to train a neural network as well as consistently yield higher classification accuracy than BP networks.
Improving the mapping of crop types in the Midwestern U.S. by fusing Landsat and MODIS satellite data

NASA Astrophysics Data System (ADS)

Zhu, Likai; Radeloff, Volker C.; Ives, Anthony R.

2017-06-01

Mapping crop types is of great importance for assessing agricultural production, land-use patterns, and the environmental effects of agriculture. Indeed, both radiometric and spatial resolution of Landsat's sensors images are optimized for cropland monitoring. However, accurate mapping of crop types requires frequent cloud-free images during the growing season, which are often not available, and this raises the question of whether Landsat data can be combined with data from other satellites. Here, our goal is to evaluate to what degree fusing Landsat with MODIS Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) data can improve crop-type classification. Choosing either one or two images from all cloud-free Landsat observations available for the Arlington Agricultural Research Station area in Wisconsin from 2010 to 2014, we generated 87 combinations of images, and used each combination as input into the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) algorithm to predict Landsat-like images at the nominal dates of each 8-day MODIS NBAR product. Both the original Landsat and STARFM-predicted images were then classified with a support vector machine (SVM), and we compared the classification errors of three scenarios: 1) classifying the one or two original Landsat images of each combination only, 2) classifying the one or two original Landsat images plus all STARFM-predicted images, and 3) classifying the one or two original Landsat images together with STARFM-predicted images for key dates. Our results indicated that using two Landsat images as the input of STARFM did not significantly improve the STARFM predictions compared to using only one, and predictions using Landsat images between July and August as input were most accurate. Including all STARFM-predicted images together with the Landsat images significantly increased average classification error by 4% points (from 21% to 25%) compared to using only Landsat images. However, incorporating only STARFM-predicted images for key dates decreased average classification error by 2% points (from 21% to 19%) compared to using only Landsat images. In particular, if only a single Landsat image was available, adding STARFM predictions for key dates significantly decreased the average classification error by 4 percentage points from 30% to 26% (p < 0.05). We conclude that adding STARFM-predicted images can be effective for improving crop-type classification when only limited Landsat observations are available, but carefully selecting images from a full set of STARFM predictions is crucial. We developed an approach to identify the optimal subsets of all STARFM predictions, which gives an alternative method of feature selection for future research.
Analysis and application of classification methods of complex carbonate reservoirs

NASA Astrophysics Data System (ADS)

Li, Xiongyan; Qin, Ruibao; Ping, Haitao; Wei, Dan; Liu, Xiaomei

2018-06-01

There are abundant carbonate reservoirs from the Cenozoic to Mesozoic era in the Middle East. Due to variation in sedimentary environment and diagenetic process of carbonate reservoirs, several porosity types coexist in carbonate reservoirs. As a result, because of the complex lithologies and pore types as well as the impact of microfractures, the pore structure is very complicated. Therefore, it is difficult to accurately calculate the reservoir parameters. In order to accurately evaluate carbonate reservoirs, based on the pore structure evaluation of carbonate reservoirs, the classification methods of carbonate reservoirs are analyzed based on capillary pressure curves and flow units. Based on the capillary pressure curves, although the carbonate reservoirs can be classified, the relationship between porosity and permeability after classification is not ideal. On the basis of the flow units, the high-precision functional relationship between porosity and permeability after classification can be established. Therefore, the carbonate reservoirs can be quantitatively evaluated based on the classification of flow units. In the dolomite reservoirs, the average absolute error of calculated permeability decreases from 15.13 to 7.44 mD. Similarly, the average absolute error of calculated permeability of limestone reservoirs is reduced from 20.33 to 7.37 mD. Only by accurately characterizing pore structures and classifying reservoir types, reservoir parameters could be calculated accurately. Therefore, characterizing pore structures and classifying reservoir types are very important to accurate evaluation of complex carbonate reservoirs in the Middle East.
Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation.

PubMed

Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

2009-04-01

Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67x3 (67 clusters of three observations) and a 33x6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67x3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis.
Performance Analysis of Classification Methods for Indoor Localization in Vlc Networks

NASA Astrophysics Data System (ADS)

Sánchez-Rodríguez, D.; Alonso-González, I.; Sánchez-Medina, J.; Ley-Bosch, C.; Díaz-Vilariño, L.

2017-09-01

Indoor localization has gained considerable attention over the past decade because of the emergence of numerous location-aware services. Research works have been proposed on solving this problem by using wireless networks. Nevertheless, there is still much room for improvement in the quality of the proposed classification models. In the last years, the emergence of Visible Light Communication (VLC) brings a brand new approach to high quality indoor positioning. Among its advantages, this new technology is immune to electromagnetic interference and has the advantage of having a smaller variance of received signal power compared to RF based technologies. In this paper, a performance analysis of seventeen machine leaning classifiers for indoor localization in VLC networks is carried out. The analysis is accomplished in terms of accuracy, average distance error, computational cost, training size, precision and recall measurements. Results show that most of classifiers harvest an accuracy above 90 %. The best tested classifier yielded a 99.0 % accuracy, with an average error distance of 0.3 centimetres.
Evaluation of a New Software Version of the RTVue Optical Coherence Tomograph for Image Segmentation and Detection of Glaucoma in High Myopia.

PubMed

Holló, Gábor; Shu-Wei, Hsu; Naghizadeh, Farzaneh

2016-06-01

To compare the current (6.3) and a novel software version (6.12) of the RTVue-100 optical coherence tomograph (RTVue-OCT) for ganglion cell complex (GCC) and retinal nerve fiber layer thickness (RNFLT) image segmentation and detection of glaucoma in high myopia. RNFLT and GCC scans were acquired with software version 6.3 of the RTVue-OCT on 51 highly myopic eyes (spherical refractive error ≤-6.0 D) of 51 patients, and were analyzed with both the software versions. Twenty-two eyes were nonglaucomatous, 13 were ocular hypertensive and 16 eyes had glaucoma. No difference was seen for any RNFLT, and average GCC parameter between the software versions (paired t test, P≥0.084). Global loss volume was significantly lower (more normal) with version 6.12 than with version 6.3 (Wilcoxon signed-rank test, P<0.001). The percentage agreement (κ) between the clinical (normal and ocular hypertensive vs. glaucoma) and the software-provided classifications (normal and borderline vs. outside normal limits) were 0.3219 and 0.4442 for average RNFLT, and 0.2926 and 0.4977 for average GCC with versions 1 and 2, respectively (McNemar symmetry test, P≥0.289). No difference in average RNFLT and GCC classification (McNemar symmetry test, P≥0.727) and the number of eyes with at least 1 segmentation error (P≥0.109) was found between the software versions, respectively. Although GCC segmentation was improved with software version 6.12 compared with the current version in highly myopic eyes, this did not result in a significant change of the average RNFLT and GCC values, and did not significantly improve the software-provided classification for glaucoma.
Land use in the Paraiba Valley through remotely sensed data. [Brazil

NASA Technical Reports Server (NTRS)

Dejesusparada, N. (Principal Investigator); Lombardo, M. A.; Novo, E. M. L. D.; Niero, M.; Foresti, C.

1980-01-01

A methodology for land use survey was developed and land use modification rates were determined using LANDSAT imagery of the Paraiba Valley (state of Sao Paulo). Both visual and automatic interpretation methods were employed to analyze seven land use classes: urban area, industrial area, bare soil, cultivated area, pastureland, reforestation and natural vegetation. By means of visual interpretation, little spectral differences are observed among those classes. The automatic classification of LANDSAT MSS data using maximum likelihood algorithm shows a 39% average error of omission and a 3.4% error of inclusion for the seven classes. The complexity of land uses in the study area, the large spectral variations of analyzed classes, and the low resolution of LANDSAT data influenced the classification results.
Multi-template tensor-based morphometry: Application to analysis of Alzheimer's disease

PubMed Central

Koikkalainen, Juha; Lötjönen, Jyrki; Thurfjell, Lennart; Rueckert, Daniel; Waldemar, Gunhild; Soininen, Hilkka

2012-01-01

In this paper methods for using multiple templates in tensor-based morphometry (TBM) are presented and comparedtothe conventional single-template approach. TBM analysis requires non-rigid registrations which are often subject to registration errors. When using multiple templates and, therefore, multiple registrations, it can be assumed that the registration errors are averaged and eventually compensated. Four different methods are proposed for multi-template TBM. The methods were evaluated using magnetic resonance (MR) images of healthy controls, patients with stable or progressive mild cognitive impairment (MCI), and patients with Alzheimer's disease (AD) from the ADNI database (N=772). The performance of TBM features in classifying images was evaluated both quantitatively and qualitatively. Classification results show that the multi-template methods are statistically significantly better than the single-template method. The overall classification accuracy was 86.0% for the classification of control and AD subjects, and 72.1%for the classification of stable and progressive MCI subjects. The statistical group-level difference maps produced using multi-template TBM were smoother, formed larger continuous regions, and had larger t-values than the maps obtained with single-template TBM. PMID:21419228

Cluster designs to assess the prevalence of acute malnutrition by lot quality assurance sampling: a validation study by computer simulation

PubMed Central

Olives, Casey; Pagano, Marcello; Deitchler, Megan; Hedt, Bethany L; Egge, Kari; Valadez, Joseph J

2009-01-01

Traditional lot quality assurance sampling (LQAS) methods require simple random sampling to guarantee valid results. However, cluster sampling has been proposed to reduce the number of random starting points. This study uses simulations to examine the classification error of two such designs, a 67×3 (67 clusters of three observations) and a 33×6 (33 clusters of six observations) sampling scheme to assess the prevalence of global acute malnutrition (GAM). Further, we explore the use of a 67×3 sequential sampling scheme for LQAS classification of GAM prevalence. Results indicate that, for independent clusters with moderate intracluster correlation for the GAM outcome, the three sampling designs maintain approximate validity for LQAS analysis. Sequential sampling can substantially reduce the average sample size that is required for data collection. The presence of intercluster correlation can impact dramatically the classification error that is associated with LQAS analysis. PMID:20011037
Kernel Wiener filter and its application to pattern recognition.

PubMed

Yoshino, Hirokazu; Dong, Chen; Washizawa, Yoshikazu; Yamashita, Yukihiko

2010-11-01

The Wiener filter (WF) is widely used for inverse problems. From an observed signal, it provides the best estimated signal with respect to the squared error averaged over the original and the observed signals among linear operators. The kernel WF (KWF), extended directly from WF, has a problem that an additive noise has to be handled by samples. Since the computational complexity of kernel methods depends on the number of samples, a huge computational cost is necessary for the case. By using the first-order approximation of kernel functions, we realize KWF that can handle such a noise not by samples but as a random variable. We also propose the error estimation method for kernel filters by using the approximations. In order to show the advantages of the proposed methods, we conducted the experiments to denoise images and estimate errors. We also apply KWF to classification since KWF can provide an approximated result of the maximum a posteriori classifier that provides the best recognition accuracy. The noise term in the criterion can be used for the classification in the presence of noise or a new regularization to suppress changes in the input space, whereas the ordinary regularization for the kernel method suppresses changes in the feature space. In order to show the advantages of the proposed methods, we conducted experiments of binary and multiclass classifications and classification in the presence of noise.
Typing mineral deposits using their grades and tonnages in an artificial neural network

USGS Publications Warehouse

Singer, Donald A.; Kouda, Ryoichi

2003-01-01

A test of the ability of a probabilistic neural network to classify deposits into types on the basis of deposit tonnage and average Cu, Mo, Ag, Au, Zn, and Pb grades is conducted. The purpose is to examine whether this type of system might serve as a basis for integrating geoscience information available in large mineral databases to classify sites by deposit type. Benefits of proper classification of many sites in large regions are relatively rapid identification of terranes permissive for deposit types and recognition of specific sites perhaps worthy of exploring further.Total tonnages and average grades of 1,137 well-explored deposits identified in published grade and tonnage models representing 13 deposit types were used to train and test the network. Tonnages were transformed by logarithms and grades by square roots to reduce effects of skewness. All values were scaled by subtracting the variable's mean and dividing by its standard deviation. Half of the deposits were selected randomly to be used in training the probabilistic neural network and the other half were used for independent testing. Tests were performed with a probabilistic neural network employing a Gaussian kernel and separate sigma weights for each class (type) and each variable (grade or tonnage).Deposit types were selected to challenge the neural network. For many types, tonnages or average grades are significantly different from other types, but individual deposits may plot in the grade and tonnage space of more than one type. Porphyry Cu, porphyry Cu-Au, and porphyry Cu-Mo types have similar tonnages and relatively small differences in grades. Redbed Cu deposits typically have tonnages that could be confused with porphyry Cu deposits, also contain Cu and, in some situations, Ag. Cyprus and kuroko massive sulfide types have about the same tonnages. Cu, Zn, Ag, and Au grades. Polymetallic vein, sedimentary exhalative Zn-Pb, and Zn-Pb skarn types contain many of the same metals. Sediment-hosted Au, Comstock Au-Ag, and low-sulfide Au-quartz vein types are principally Au deposits with differing amounts of Ag.Given the intent to test the neural network under the most difficult conditions, an overall 75% agreement between the experts and the neural network is considered excellent. Among the largestclassification errors are skarn Zn-Pb and Cyprus massive sulfide deposits classed by the neuralnetwork as kuroko massive sulfides—24 and 63% error respectively. Other large errors are the classification of 92% of porphyry Cu-Mo as porphyry Cu deposits. Most of the larger classification errors involve 25 or fewer training deposits, suggesting that some errors might be the result of small sample size. About 91% of the gold deposit types were classed properly and 98% of porphyry Cu deposits were classes as some type of porphyry Cu deposit. An experienced economic geologist would not make many of the classification errors that were made by the neural network because the geologic settings of deposits would be used to reduce errors. In a separate test, the probabilistic neural network correctly classed 93% of 336 deposits in eight deposit types when trained with presence or absence of 58 minerals and six generalized rock types. The overall success rate of the probabilistic neural network when trained on tonnage and average grades would probably be more than 90% with additional information on the presence of a few rock types.
Development of a methodology for classifying software errors

NASA Technical Reports Server (NTRS)

Gerhart, S. L.

1976-01-01

A mathematical formalization of the intuition behind classification of software errors is devised and then extended to a classification discipline: Every classification scheme should have an easily discernible mathematical structure and certain properties of the scheme should be decidable (although whether or not these properties hold is relative to the intended use of the scheme). Classification of errors then becomes an iterative process of generalization from actual errors to terms defining the errors together with adjustment of definitions according to the classification discipline. Alternatively, whenever possible, small scale models may be built to give more substance to the definitions. The classification discipline and the difficulties of definition are illustrated by examples of classification schemes from the literature and a new study of observed errors in published papers of programming methodologies.
Using beta binomials to estimate classification uncertainty for ensemble models.

PubMed

Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin

2014-01-01

Quantitative structure-activity (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
Classification and reduction of pilot error

NASA Technical Reports Server (NTRS)

Rogers, W. H.; Logan, A. L.; Boley, G. D.

1989-01-01

Human error is a primary or contributing factor in about two-thirds of commercial aviation accidents worldwide. With the ultimate goal of reducing pilot error accidents, this contract effort is aimed at understanding the factors underlying error events and reducing the probability of certain types of errors by modifying underlying factors such as flight deck design and procedures. A review of the literature relevant to error classification was conducted. Classification includes categorizing types of errors, the information processing mechanisms and factors underlying them, and identifying factor-mechanism-error relationships. The classification scheme developed by Jens Rasmussen was adopted because it provided a comprehensive yet basic error classification shell or structure that could easily accommodate addition of details on domain-specific factors. For these purposes, factors specific to the aviation environment were incorporated. Hypotheses concerning the relationship of a small number of underlying factors, information processing mechanisms, and error types types identified in the classification scheme were formulated. ASRS data were reviewed and a simulation experiment was performed to evaluate and quantify the hypotheses.
Improved wetland remote sensing in Yellowstone National Park using classification trees to combine TM imagery and ancillary environmental data

USGS Publications Warehouse

Wright, C.; Gallant, Alisa L.

2007-01-01

The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub–shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. The developed method is portable, relatively easy to implement, and should be applicable in other settings and over larger extents.
The 'Soil Cover App' - a new tool for fast determination of dead and living biomass on soil

NASA Astrophysics Data System (ADS)

Bauer, Thomas; Strauss, Peter; Riegler-Nurscher, Peter; Prankl, Johann; Prankl, Heinrich

2017-04-01

Worldwide many agricultural practices aim on soil protection strategies using living or dead biomass as soil cover. Especially for the case when management practices are focusing on soil erosion mitigation the effectiveness of these practices is directly driven by the amount of soil coverleft on the soil surface. Hence there is a need for quick and reliable methods of soil cover estimation not only for living biomass but particularly for dead biomass (mulch). Available methods for the soil cover measurement are either subjective, depending on an educated guess or time consuming, e.g., if the image is analysed manually at grid points. We therefore developed a mobile application using an algorithm based on entangled forest classification. The final output of the algorithm gives classified labels for each pixel of the input image as well as the percentage of each class which are living biomass, dead biomass, stones and soil. Our training dataset consisted of more than 250 different images and their annotated class information. Images have been taken in a set of different environmental conditions such as light, soil coverages from between 0% to 100%, different materials such as living plants, residues, straw material and stones. We compared the results provided by our mobile application with a data set of 180 images that had been manually annotated A comparison between both methods revealed a regression slope of 0.964 with a coefficient of determination R2 = 0.92, corresponding to an average error of about 4%. While average error of living plant classification was about 3%, dead residue classification resulted in an 8% error. Thus the new mobile application tool offers a fast and easy way to obtain information on the protective potential of a particular agricultural management site.
Improving ECG Classification Accuracy Using an Ensemble of Neural Network Modules

PubMed Central

Javadi, Mehrdad; Ebrahimpour, Reza; Sajedin, Atena; Faridi, Soheil; Zakernejad, Shokoufeh

2011-01-01

This paper illustrates the use of a combined neural network model based on Stacked Generalization method for classification of electrocardiogram (ECG) beats. In conventional Stacked Generalization method, the combiner learns to map the base classifiers' outputs to the target data. We claim adding the input pattern to the base classifiers' outputs helps the combiner to obtain knowledge about the input space and as the result, performs better on the same task. Experimental results support our claim that the additional knowledge according to the input space, improves the performance of the proposed method which is called Modified Stacked Generalization. In particular, for classification of 14966 ECG beats that were not previously seen during training phase, the Modified Stacked Generalization method reduced the error rate for 12.41% in comparison with the best of ten popular classifier fusion methods including Max, Min, Average, Product, Majority Voting, Borda Count, Decision Templates, Weighted Averaging based on Particle Swarm Optimization and Stacked Generalization. PMID:22046232
Multi-factorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules.

PubMed

Khondoker, Mizanur R; Bachmann, Till T; Mewissen, Muriel; Dickinson, Paul; Dobrzelecki, Bartosz; Campbell, Colin J; Mount, Andrew R; Walton, Anthony J; Crain, Jason; Schulze, Holger; Giraud, Gerard; Ross, Alan J; Ciani, Ilenia; Ember, Stuart W J; Tlili, Chaker; Terry, Jonathan G; Grant, Eilidh; McDonnell, Nicola; Ghazal, Peter

2010-12-01

Machine learning and statistical model based classifiers have increasingly been used with more complex and high dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive, under investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray based data characteristics on the predictive performance for various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. Optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble that of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network (http://www.cran.r-project.org/web/packages/optBiomarker/).
Cough event classification by pretrained deep neural network.

PubMed

Liu, Jia-Ming; You, Mingyu; Wang, Zheng; Li, Guo-Zheng; Xu, Xianghuai; Qiu, Zhongmin

2015-01-01

Cough is an essential symptom in respiratory diseases. In the measurement of cough severity, an accurate and objective cough monitor is expected by respiratory disease society. This paper aims to introduce a better performed algorithm, pretrained deep neural network (DNN), to the cough classification problem, which is a key step in the cough monitor. The deep neural network models are built from two steps, pretrain and fine-tuning, followed by a Hidden Markov Model (HMM) decoder to capture tamporal information of the audio signals. By unsupervised pretraining a deep belief network, a good initialization for a deep neural network is learned. Then the fine-tuning step is a back propogation tuning the neural network so that it can predict the observation probability associated with each HMM states, where the HMM states are originally achieved by force-alignment with a Gaussian Mixture Model Hidden Markov Model (GMM-HMM) on the training samples. Three cough HMMs and one noncough HMM are employed to model coughs and noncoughs respectively. The final decision is made based on viterbi decoding algorihtm that generates the most likely HMM sequence for each sample. A sample is labeled as cough if a cough HMM is found in the sequence. The experiments were conducted on a dataset that was collected from 22 patients with respiratory diseases. Patient dependent (PD) and patient independent (PI) experimental settings were used to evaluate the models. Five criteria, sensitivity, specificity, F1, macro average and micro average are shown to depict different aspects of the models. From overall evaluation criteria, the DNN based methods are superior to traditional GMM-HMM based method on F1 and micro average with maximal 14% and 11% error reduction in PD and 7% and 10% in PI, meanwhile keep similar performances on macro average. They also surpass GMM-HMM model on specificity with maximal 14% error reduction on both PD and PI. In this paper, we tried pretrained deep neural network in cough classification problem. Our results showed that comparing with the conventional GMM-HMM framework, the HMM-DNN could get better overall performance on cough classification task.
Linear and Order Statistics Combiners for Pattern Classification

NASA Technical Reports Server (NTRS)

Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)

2001-01-01

Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that to a first order approximation, the error rate obtained over and above the Bayes error rate, is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging. the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
Towards reporting standards for neuropsychological study results: A proposal to minimize communication errors with standardized qualitative descriptors for normalized test scores.

PubMed

Schoenberg, Mike R; Rum, Ruba S

2017-11-01

Rapid, clear and efficient communication of neuropsychological results is essential to benefit patient care. Errors in communication are a lead cause of medical errors; nevertheless, there remains a lack of consistency in how neuropsychological scores are communicated. A major limitation in the communication of neuropsychological results is the inconsistent use of qualitative descriptors for standardized test scores and the use of vague terminology. PubMed search from 1 Jan 2007 to 1 Aug 2016 to identify guidelines or consensus statements for the description and reporting of qualitative terms to communicate neuropsychological test scores was conducted. The review found the use of confusing and overlapping terms to describe various ranges of percentile standardized test scores. In response, we propose a simplified set of qualitative descriptors for normalized test scores (Q-Simple) as a means to reduce errors in communicating test results. The Q-Simple qualitative terms are: 'very superior', 'superior', 'high average', 'average', 'low average', 'borderline' and 'abnormal/impaired'. A case example illustrates the proposed Q-Simple qualitative classification system to communicate neuropsychological results for neurosurgical planning. The Q-Simple qualitative descriptor system is aimed as a means to improve and standardize communication of standardized neuropsychological test scores. Research are needed to further evaluate neuropsychological communication errors. Conveying the clinical implications of neuropsychological results in a manner that minimizes risk for communication errors is a quintessential component of evidence-based practice. Copyright © 2017 Elsevier B.V. All rights reserved.
Five-way smoking status classification using text hot-spot identification and error-correcting output codes.

PubMed

Cohen, Aaron M

2008-01-01

We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2b2 task organizers, using micro- and macro-averaged F1 as the primary performance metric. Our best performing system achieved a micro-F1 of 0.9000 on the test collection, equivalent to the best performing system submitted to the i2b2 challenge. Hot-spot identification, zero-vector filtering, classifier weighting, and error correcting output coding contributed additively to increased performance, with hot-spot identification having by far the largest positive effect. High performance on automatic identification of patient smoking status from discharge summaries is achievable with the efficient and straightforward machine learning techniques studied here.
Spotting East African mammals in open savannah from space.

PubMed

Yang, Zheng; Wang, Tiejun; Skidmore, Andrew K; de Leeuw, Jan; Said, Mohammed Y; Freer, Jim

2014-01-01

Knowledge of population dynamics is essential for managing and conserving wildlife. Traditional methods of counting wild animals such as aerial survey or ground counts not only disturb animals, but also can be labour intensive and costly. New, commercially available very high-resolution satellite images offer great potential for accurate estimates of animal abundance over large open areas. However, little research has been conducted in the area of satellite-aided wildlife census, although computer processing speeds and image analysis algorithms have vastly improved. This paper explores the possibility of detecting large animals in the open savannah of Maasai Mara National Reserve, Kenya from very high-resolution GeoEye-1 satellite images. A hybrid image classification method was employed for this specific purpose by incorporating the advantages of both pixel-based and object-based image classification approaches. This was performed in two steps: firstly, a pixel-based image classification method, i.e., artificial neural network was applied to classify potential targets with similar spectral reflectance at pixel level; and then an object-based image classification method was used to further differentiate animal targets from the surrounding landscapes through the applications of expert knowledge. As a result, the large animals in two pilot study areas were successfully detected with an average count error of 8.2%, omission error of 6.6% and commission error of 13.7%. The results of the study show for the first time that it is feasible to perform automated detection and counting of large wild animals in open savannahs from space, and therefore provide a complementary and alternative approach to the conventional wildlife survey techniques.
Fully Convolutional Networks for Ground Classification from LIDAR Point Clouds

NASA Astrophysics Data System (ADS)

Rizaldy, A.; Persello, C.; Gevaert, C. M.; Oude Elberink, S. J.

2018-05-01

Deep Learning has been massively used for image classification in recent years. The use of deep learning for ground classification from LIDAR point clouds has also been recently studied. However, point clouds need to be converted into an image in order to use Convolutional Neural Networks (CNNs). In state-of-the-art techniques, this conversion is slow because each point is converted into a separate image. This approach leads to highly redundant computation during conversion and classification. The goal of this study is to design a more efficient data conversion and ground classification. This goal is achieved by first converting the whole point cloud into a single image. The classification is then performed by a Fully Convolutional Network (FCN), a modified version of CNN designed for pixel-wise image classification. The proposed method is significantly faster than state-of-the-art techniques. On the ISPRS Filter Test dataset, it is 78 times faster for conversion and 16 times faster for classification. Our experimental analysis on the same dataset shows that the proposed method results in 5.22 % of total error, 4.10 % of type I error, and 15.07 % of type II error. Compared to the previous CNN-based technique and LAStools software, the proposed method reduces the total error and type I error (while type II error is slightly higher). The method was also tested on a very high point density LIDAR point clouds resulting in 4.02 % of total error, 2.15 % of type I error and 6.14 % of type II error.
A multiresolution hierarchical classification algorithm for filtering airborne LiDAR data

NASA Astrophysics Data System (ADS)

Chen, Chuanfa; Li, Yanyan; Li, Wei; Dai, Honglei

2013-08-01

We presented a multiresolution hierarchical classification (MHC) algorithm for differentiating ground from non-ground LiDAR point cloud based on point residuals from the interpolated raster surface. MHC includes three levels of hierarchy, with the simultaneous increase of cell resolution and residual threshold from the low to the high level of the hierarchy. At each level, the surface is iteratively interpolated towards the ground using thin plate spline (TPS) until no ground points are classified, and the classified ground points are used to update the surface in the next iteration. 15 groups of benchmark dataset, provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) commission, were used to compare the performance of MHC with those of the 17 other publicized filtering methods. Results indicated that MHC with the average total error and average Cohen’s kappa coefficient of 4.11% and 86.27% performs better than all other filtering methods.
Influence of nuclei segmentation on breast cancer malignancy classification

NASA Astrophysics Data System (ADS)

Jelen, Lukasz; Fevens, Thomas; Krzyzak, Adam

2009-02-01

Breast Cancer is one of the most deadly cancers affecting middle-aged women. Accurate diagnosis and prognosis are crucial to reduce the high death rate. Nowadays there are numerous diagnostic tools for breast cancer diagnosis. In this paper we discuss a role of nuclear segmentation from fine needle aspiration biopsy (FNA) slides and its influence on malignancy classification. Classification of malignancy plays a very important role during the diagnosis process of breast cancer. Out of all cancer diagnostic tools, FNA slides provide the most valuable information about the cancer malignancy grade which helps to choose an appropriate treatment. This process involves assessing numerous nuclear features and therefore precise segmentation of nuclei is very important. In this work we compare three powerful segmentation approaches and test their impact on the classification of breast cancer malignancy. The studied approaches involve level set segmentation, fuzzy c-means segmentation and textural segmentation based on co-occurrence matrix. Segmented nuclei were used to extract nuclear features for malignancy classification. For classification purposes four different classifiers were trained and tested with previously extracted features. The compared classifiers are Multilayer Perceptron (MLP), Self-Organizing Maps (SOM), Principal Component-based Neural Network (PCA) and Support Vector Machines (SVM). The presented results show that level set segmentation yields the best results over the three compared approaches and leads to a good feature extraction with a lowest average error rate of 6.51% over four different classifiers. The best performance was recorded for multilayer perceptron with an error rate of 3.07% using fuzzy c-means segmentation.
Analysis of using EMG and mechanical sensors to enhance intent recognition in powered lower limb prostheses

NASA Astrophysics Data System (ADS)

Young, A. J.; Kuiken, T. A.; Hargrove, L. J.

2014-10-01

Objective. The purpose of this study was to determine the contribution of electromyography (EMG) data, in combination with a diverse array of mechanical sensors, to locomotion mode intent recognition in transfemoral amputees using powered prostheses. Additionally, we determined the effect of adding time history information using a dynamic Bayesian network (DBN) for both the mechanical and EMG sensors. Approach. EMG signals from the residual limbs of amputees have been proposed to enhance pattern recognition-based intent recognition systems for powered lower limb prostheses, but mechanical sensors on the prosthesis—such as inertial measurement units, position and velocity sensors, and load cells—may be just as useful. EMG and mechanical sensor data were collected from 8 transfemoral amputees using a powered knee/ankle prosthesis over basic locomotion modes such as walking, slopes and stairs. An offline study was conducted to determine the benefit of different sensor sets for predicting intent. Main results. EMG information was not as accurate alone as mechanical sensor information (p < 0.05) for any classification strategy. However, EMG in combination with the mechanical sensor data did significantly reduce intent recognition errors (p < 0.05) both for transitions between locomotion modes and steady-state locomotion. The sensor time history (DBN) classifier significantly reduced error rates compared to a linear discriminant classifier for steady-state steps, without increasing the transitional error, for both EMG and mechanical sensors. Combining EMG and mechanical sensor data with sensor time history reduced the average transitional error from 18.4% to 12.2% and the average steady-state error from 3.8% to 1.0% when classifying level-ground walking, ramps, and stairs in eight transfemoral amputee subjects. Significance. These results suggest that a neural interface in combination with time history methods for locomotion mode classification can enhance intent recognition performance; this strategy should be considered for future real-time experiments.
Bayes Error Rate Estimation Using Classifier Ensembles

NASA Technical Reports Server (NTRS)

Tumer, Kagan; Ghosh, Joydeep

2003-01-01

The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating th is rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or finding bounds for the Bayes error, in general, yield rather weak results for small sample sizes; unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, a recombined through averaging. Second, we bolster this approach by adding an information theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that just looks at the class labels indicated by ensem ble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Problem benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.

EEG Classification with a Sequential Decision-Making Method in Motor Imagery BCI.

PubMed

Liu, Rong; Wang, Yongxuan; Newman, Geoffrey I; Thakor, Nitish V; Ying, Sarah

2017-12-01

To develop subject-specific classifier to recognize mental states fast and reliably is an important issue in brain-computer interfaces (BCI), particularly in practical real-time applications such as wheelchair or neuroprosthetic control. In this paper, a sequential decision-making strategy is explored in conjunction with an optimal wavelet analysis for EEG classification. The subject-specific wavelet parameters based on a grid-search method were first developed to determine evidence accumulative curve for the sequential classifier. Then we proposed a new method to set the two constrained thresholds in the sequential probability ratio test (SPRT) based on the cumulative curve and a desired expected stopping time. As a result, it balanced the decision time of each class, and we term it balanced threshold SPRT (BTSPRT). The properties of the method were illustrated on 14 subjects' recordings from offline and online tests. Results showed the average maximum accuracy of the proposed method to be 83.4% and the average decision time of 2.77[Formula: see text]s, when compared with 79.2% accuracy and a decision time of 3.01[Formula: see text]s for the sequential Bayesian (SB) method. The BTSPRT method not only improves the classification accuracy and decision speed comparing with the other nonsequential or SB methods, but also provides an explicit relationship between stopping time, thresholds and error, which is important for balancing the speed-accuracy tradeoff. These results suggest that BTSPRT would be useful in explicitly adjusting the tradeoff between rapid decision-making and error-free device control.
Are Nomothetic or Ideographic Approaches Superior in Predicting Daily Exercise Behaviors?

PubMed

Cheung, Ying Kuen; Hsueh, Pei-Yun Sabrina; Qian, Min; Yoon, Sunmoo; Meli, Laura; Diaz, Keith M; Schwartz, Joseph E; Kronish, Ian M; Davidson, Karina W

2017-01-01

The understanding of how stress influences health behavior can provide insights into developing healthy lifestyle interventions. This understanding is traditionally attained through observational studies that examine associations at a population level. This nomothetic approach, however, is fundamentally limited by the fact that the environment- person milieu that constitutes stress exposure and experience can vary substantially between individuals, and the modifiable elements of these exposures and experiences are individual-specific. With recent advances in smartphone and sensing technologies, it is now possible to conduct idiographic assessment in users' own environment, leveraging the full-range observations of actions and experiences that result in differential response to naturally occurring events. The aim of this paper is to explore the hypothesis that an ideographic N-of-1 model can better capture an individual's stress- behavior pathway (or the lack thereof) and provide useful person-specific predictors of exercise behavior. This paper used the data collected in an observational study in 79 participants who were followed for up to a 1-year period, wherein their physical activity was continuously and objectively monitored by actigraphy and their stress experience was recorded via ecological momentary assessment on a mobile app. In addition, our analyses considered exogenous and environmental variables retrieved from public archive such as day in a week, daylight time, temperature and precipitation. Leveraging the multiple data sources, we developed prediction algorithms for exercise behavior using random forest and classification tree techniques using a nomothetic approach and an N-of-1 approach. The two approaches were compared based on classification errors in predicting personalized exercise behavior. Eight factors were selected by random forest for the nomothetic decision model, which was used to predict whether a participant would exercise on a particular day. The predictors included previous exercise behavior, emotional factors (e.g., midday stress), external factors such as weather (e.g., temperature), and self-determination factors (e.g., expectation of exercise). The nomothetic model yielded an average classification error of 36%. The ideographic N-of-1 models used on average about two predictors for each individual, and had an average classification error of 25%, which represented an improvement of 11 percentage points. Compared to the traditional one-size-fits-all, nomothetic model that generalizes population-evidence for individuals, the proposed N-of-1 model can better capture the individual difference in their stressbehavior pathways. In this paper, we demonstrate it is feasible to perform personalized exercise behavior prediction, mainly made possible by mobile health technology and machine learning analytics. Schattauer GmbH.
Classification-Based Spatial Error Concealment for Visual Communications

NASA Astrophysics Data System (ADS)

Chen, Meng; Zheng, Yefeng; Wu, Min

2006-12-01

In an error-prone transmission environment, error concealment is an effective technique to reconstruct the damaged visual content. Due to large variations of image characteristics, different concealment approaches are necessary to accommodate the different nature of the lost image content. In this paper, we address this issue and propose using classification to integrate the state-of-the-art error concealment techniques. The proposed approach takes advantage of multiple concealment algorithms and adaptively selects the suitable algorithm for each damaged image area. With growing awareness that the design of sender and receiver systems should be jointly considered for efficient and reliable multimedia communications, we proposed a set of classification-based block concealment schemes, including receiver-side classification, sender-side attachment, and sender-side embedding. Our experimental results provide extensive performance comparisons and demonstrate that the proposed classification-based error concealment approaches outperform the conventional approaches.
C-fuzzy variable-branch decision tree with storage and classification error rate constraints

NASA Astrophysics Data System (ADS)

Yang, Shiueng-Bien

2009-10-01

The C-fuzzy decision tree (CFDT), which is based on the fuzzy C-means algorithm, has recently been proposed. The CFDT is grown by selecting the nodes to be split according to its classification error rate. However, the CFDT design does not consider the classification time taken to classify the input vector. Thus, the CFDT can be improved. We propose a new C-fuzzy variable-branch decision tree (CFVBDT) with storage and classification error rate constraints. The design of the CFVBDT consists of two phases-growing and pruning. The CFVBDT is grown by selecting the nodes to be split according to the classification error rate and the classification time in the decision tree. Additionally, the pruning method selects the nodes to prune based on the storage requirement and the classification time of the CFVBDT. Furthermore, the number of branches of each internal node is variable in the CFVBDT. Experimental results indicate that the proposed CFVBDT outperforms the CFDT and other methods.
Multiple-rule bias in the comparison of classification rules

PubMed Central

Yousefi, Mohammadmahdi R.; Hua, Jianping; Dougherty, Edward R.

2011-01-01

Motivation: There is growing discussion in the bioinformatics community concerning overoptimism of reported results. Two approaches contributing to overoptimism in classification are (i) the reporting of results on datasets for which a proposed classification rule performs well and (ii) the comparison of multiple classification rules on a single dataset that purports to show the advantage of a certain rule. Results: This article provides a careful probabilistic analysis of the second issue and the ‘multiple-rule bias’, resulting from choosing a classification rule having minimum estimated error on the dataset. It quantifies this bias corresponding to estimating the expected true error of the classification rule possessing minimum estimated error and it characterizes the bias from estimating the true comparative advantage of the chosen classification rule relative to the others by the estimated comparative advantage on the dataset. The analysis is applied to both synthetic and real data using a number of classification rules and error estimators. Availability: We have implemented in C code the synthetic data distribution model, classification rules, feature selection routines and error estimation methods. The code for multiple-rule analysis is implemented in MATLAB. The source code is available at http://gsp.tamu.edu/Publications/supplementary/yousefi11a/. Supplementary simulation results are also included. Contact: edward@ece.tamu.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21546390
Multiple category-lot quality assurance sampling: a new classification system with application to schistosomiasis control.

PubMed

Olives, Casey; Valadez, Joseph J; Brooker, Simon J; Pagano, Marcello

2012-01-01

Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n=15 and n=25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n=15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalance of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools.
Forecasting Daily Volume and Acuity of Patients in the Emergency Department.

PubMed

Calegari, Rafael; Fogliatto, Flavio S; Lucini, Filipe R; Neyeloff, Jeruza; Kuchenbecker, Ricardo S; Schaan, Beatriz D

2016-01-01

This study aimed at analyzing the performance of four forecasting models in predicting the demand for medical care in terms of daily visits in an emergency department (ED) that handles high complexity cases, testing the influence of climatic and calendrical factors on demand behavior. We tested different mathematical models to forecast ED daily visits at Hospital de Clínicas de Porto Alegre (HCPA), which is a tertiary care teaching hospital located in Southern Brazil. Model accuracy was evaluated using mean absolute percentage error (MAPE), considering forecasting horizons of 1, 7, 14, 21, and 30 days. The demand time series was stratified according to patient classification using the Manchester Triage System's (MTS) criteria. Models tested were the simple seasonal exponential smoothing (SS), seasonal multiplicative Holt-Winters (SMHW), seasonal autoregressive integrated moving average (SARIMA), and multivariate autoregressive integrated moving average (MSARIMA). Performance of models varied according to patient classification, such that SS was the best choice when all types of patients were jointly considered, and SARIMA was the most accurate for modeling demands of very urgent (VU) and urgent (U) patients. The MSARIMA models taking into account climatic factors did not improve the performance of the SARIMA models, independent of patient classification.
Forecasting Daily Volume and Acuity of Patients in the Emergency Department

PubMed Central

Fogliatto, Flavio S.; Neyeloff, Jeruza; Kuchenbecker, Ricardo S.; Schaan, Beatriz D.

2016-01-01

This study aimed at analyzing the performance of four forecasting models in predicting the demand for medical care in terms of daily visits in an emergency department (ED) that handles high complexity cases, testing the influence of climatic and calendrical factors on demand behavior. We tested different mathematical models to forecast ED daily visits at Hospital de Clínicas de Porto Alegre (HCPA), which is a tertiary care teaching hospital located in Southern Brazil. Model accuracy was evaluated using mean absolute percentage error (MAPE), considering forecasting horizons of 1, 7, 14, 21, and 30 days. The demand time series was stratified according to patient classification using the Manchester Triage System's (MTS) criteria. Models tested were the simple seasonal exponential smoothing (SS), seasonal multiplicative Holt-Winters (SMHW), seasonal autoregressive integrated moving average (SARIMA), and multivariate autoregressive integrated moving average (MSARIMA). Performance of models varied according to patient classification, such that SS was the best choice when all types of patients were jointly considered, and SARIMA was the most accurate for modeling demands of very urgent (VU) and urgent (U) patients. The MSARIMA models taking into account climatic factors did not improve the performance of the SARIMA models, independent of patient classification. PMID:27725842
Error analysis of filtering operations in pixel-duplicated images of diabetic retinopathy

NASA Astrophysics Data System (ADS)

Mehrubeoglu, Mehrube; McLauchlan, Lifford

2010-08-01

In this paper, diabetic retinopathy is chosen for a sample target image to demonstrate the effectiveness of image enlargement through pixel duplication in identifying regions of interest. Pixel duplication is presented as a simpler alternative to data interpolation techniques for detecting small structures in the images. A comparative analysis is performed on different image processing schemes applied to both original and pixel-duplicated images. Structures of interest are detected and and classification parameters optimized for minimum false positive detection in the original and enlarged retinal pictures. The error analysis demonstrates the advantages as well as shortcomings of pixel duplication in image enhancement when spatial averaging operations (smoothing filters) are also applied.
Mathematical foundations of hybrid data assimilation from a synchronization perspective

NASA Astrophysics Data System (ADS)

Penny, Stephen G.

2017-12-01

The state-of-the-art data assimilation methods used today in operational weather prediction centers around the world can be classified as generalized one-way coupled impulsive synchronization. This classification permits the investigation of hybrid data assimilation methods, which combine dynamic error estimates of the system state with long time-averaged (climatological) error estimates, from a synchronization perspective. Illustrative results show how dynamically informed formulations of the coupling matrix (via an Ensemble Kalman Filter, EnKF) can lead to synchronization when observing networks are sparse and how hybrid methods can lead to synchronization when those dynamic formulations are inadequate (due to small ensemble sizes). A large-scale application with a global ocean general circulation model is also presented. Results indicate that the hybrid methods also have useful applications in generalized synchronization, in particular, for correcting systematic model errors.
Mathematical foundations of hybrid data assimilation from a synchronization perspective.

PubMed

Penny, Stephen G

2017-12-01

The state-of-the-art data assimilation methods used today in operational weather prediction centers around the world can be classified as generalized one-way coupled impulsive synchronization. This classification permits the investigation of hybrid data assimilation methods, which combine dynamic error estimates of the system state with long time-averaged (climatological) error estimates, from a synchronization perspective. Illustrative results show how dynamically informed formulations of the coupling matrix (via an Ensemble Kalman Filter, EnKF) can lead to synchronization when observing networks are sparse and how hybrid methods can lead to synchronization when those dynamic formulations are inadequate (due to small ensemble sizes). A large-scale application with a global ocean general circulation model is also presented. Results indicate that the hybrid methods also have useful applications in generalized synchronization, in particular, for correcting systematic model errors.
Clarification of terminology in medication errors: definitions and classification.

PubMed

Ferner, Robin E; Aronson, Jeffrey K

2006-01-01

We have previously described and analysed some terms that are used in drug safety and have proposed definitions. Here we discuss and define terms that are used in the field of medication errors, particularly terms that are sometimes misunderstood or misused. We also discuss the classification of medication errors. A medication error is a failure in the treatment process that leads to, or has the potential to lead to, harm to the patient. Errors can be classified according to whether they are mistakes, slips, or lapses. Mistakes are errors in the planning of an action. They can be knowledge based or rule based. Slips and lapses are errors in carrying out an action - a slip through an erroneous performance and a lapse through an erroneous memory. Classification of medication errors is important because the probabilities of errors of different classes are different, as are the potential remedies.
Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms.

PubMed

Ozcift, Akin; Gulten, Arif

2011-12-01

Improving accuracies of machine learning algorithms is vital in designing high performance computer-aided diagnosis (CADx) systems. Researches have shown that a base classifier performance might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms to evaluate their classification performances using Parkinson's, diabetes and heart diseases from literature. While making experiments, first the feature dimension of three datasets is reduced using correlation based feature selection (CFS) algorithm. Second, classification performances of 30 machine learning algorithms are calculated for three datasets. Third, 30 classifier ensembles are constructed based on RF algorithm to assess performances of respective classifiers with the same disease data. All the experiments are carried out with leave-one-out validation strategy and the performances of the 60 algorithms are evaluated using three metrics; classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). Base classifiers succeeded 72.15%, 77.52% and 84.43% average accuracies for diabetes, heart and Parkinson's datasets, respectively. As for RF classifier ensembles, they produced average accuracies of 74.47%, 80.49% and 87.13% for respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve accuracy of miscellaneous machine learning algorithms to design advanced CADx systems. Copyright Â© 2011 Elsevier Ireland Ltd. All rights reserved.
Tests of a Semi-Analytical Case 1 and Gelbstoff Case 2 SeaWiFS Algorithm with a Global Data Set

NASA Technical Reports Server (NTRS)

Carder, Kendall L.; Hawes, Steve K.; Lee, Zhongping

1997-01-01

A semi-analytical algorithm was tested with a total of 733 points of either unpackaged or packaged-pigment data, with corresponding algorithm parameters for each data type. The 'unpackaged' type consisted of data sets that were generally consistent with the Case 1 CZCS algorithm and other well calibrated data sets. The 'packaged' type consisted of data sets apparently containing somewhat more packaged pigments, requiring modification of the absorption parameters of the model consistent with the CalCOFI study area. This resulted in two equally divided data sets. A more thorough scrutiny of these and other data sets using a semianalytical model requires improved knowledge of the phytoplankton and gelbstoff of the specific environment studied. Since the semi-analytical algorithm is dependent upon 4 spectral channels including the 412 nm channel, while most other algorithms are not, a means of testing data sets for consistency was sought. A numerical filter was developed to classify data sets into the above classes. The filter uses reflectance ratios, which can be determined from space. The sensitivity of such numerical filters to measurement resulting from atmospheric correction and sensor noise errors requires further study. The semi-analytical algorithm performed superbly on each of the data sets after classification, resulting in RMS1 errors of 0.107 and 0.121, respectively, for the unpackaged and packaged data-set classes, with little bias and slopes near 1.0. In combination, the RMS1 performance was 0.114. While these numbers appear rather sterling, one must bear in mind what mis-classification does to the results. Using an average or compromise parameterization on the modified global data set yielded an RMS1 error of 0.171, while using the unpackaged parameterization on the global evaluation data set yielded an RMS1 error of 0.284. So, without classification, the algorithm performs better globally using the average parameters than it does using the unpackaged parameters. Finally, the effects of even more extreme pigment packaging must be examined in order to improve algorithm performance at high latitudes. Note, however, that the North Sea and Mississippi River plume studies contributed data to the packaged and unpackaged classess, respectively, with little effect on algorithm performance. This suggests that gelbstoff-rich Case 2 waters do not seriously degrade performance of the semi-analytical algorithm.
Determining the optimal window length for pattern recognition-based myoelectric control: balancing the competing effects of classification error and controller delay.

PubMed

Smith, Lauren H; Hargrove, Levi J; Lock, Blair A; Kuiken, Todd A

2011-04-01

Pattern recognition-based control of myoelectric prostheses has shown great promise in research environments, but has not been optimized for use in a clinical setting. To explore the relationship between classification error, controller delay, and real-time controllability, 13 able-bodied subjects were trained to operate a virtual upper-limb prosthesis using pattern recognition of electromyogram (EMG) signals. Classification error and controller delay were varied by training different classifiers with a variety of analysis window lengths ranging from 50 to 550 ms and either two or four EMG input channels. Offline analysis showed that classification error decreased with longer window lengths (p < 0.01 ). Real-time controllability was evaluated with the target achievement control (TAC) test, which prompted users to maneuver the virtual prosthesis into various target postures. The results indicated that user performance improved with lower classification error (p < 0.01 ) and was reduced with longer controller delay (p < 0.01 ), as determined by the window length. Therefore, both of these effects should be considered when choosing a window length; it may be beneficial to increase the window length if this results in a reduced classification error, despite the corresponding increase in controller delay. For the system employed in this study, the optimal window length was found to be between 150 and 250 ms, which is within acceptable controller delays for conventional multistate amplitude controllers.
A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies.

PubMed

Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel

2016-10-01

Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply so on an average or on a population level and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. We compare the classification performance of a number of important and widely used machine learning algorithms, namely the Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. For smaller number of correlated features, number of features not exceeding approximately half the sample size, LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and outplays that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provide more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study. © The Author(s) 2013.
The use of a contextual, modal and psychological classification of medication errors in the emergency department: a retrospective descriptive study.

PubMed

Cabilan, C J; Hughes, James A; Shannon, Carl

2017-12-01

To describe the contextual, modal and psychological classification of medication errors in the emergency department to know the factors associated with the reported medication errors. The causes of medication errors are unique in every clinical setting; hence, error minimisation strategies are not always effective. For this reason, it is fundamental to understand the causes specific to the emergency department so that targeted strategies can be implemented. Retrospective analysis of reported medication errors in the emergency department. All voluntarily staff-reported medication-related incidents from 2010-2015 from the hospital's electronic incident management system were retrieved for analysis. Contextual classification involved the time, place and the type of medications involved. Modal classification pertained to the stage and issue (e.g. wrong medication, wrong patient). Psychological classification categorised the errors in planning (knowledge-based and rule-based errors) and skill (slips and lapses). There were 405 errors reported. Most errors occurred in the acute care area, short-stay unit and resuscitation area, during the busiest shifts (0800-1559, 1600-2259). Half of the errors involved high-alert medications. Many of the errors occurred during administration (62·7%), prescribing (28·6%) and commonly during both stages (18·5%). Wrong dose, wrong medication and omission were the issues that dominated. Knowledge-based errors characterised the errors that occurred in prescribing and administration. The highest proportion of slips (79·5%) and lapses (76·1%) occurred during medication administration. It is likely that some of the errors occurred due to the lack of adherence to safety protocols. Technology such as computerised prescribing, barcode medication administration and reminder systems could potentially decrease the medication errors in the emergency department. There was a possibility that some of the errors could be prevented if safety protocols were adhered to, which highlights the need to also address clinicians' attitudes towards safety. Technology can be implemented to help minimise errors in the ED, but this must be coupled with efforts to enhance the culture of safety. © 2017 John Wiley & Sons Ltd.
Effects of uncertainty and variability on population declines and IUCN Red List classifications.

PubMed

Rueda-Cediel, Pamela; Anderson, Kurt E; Regan, Tracey J; Regan, Helen M

2018-01-22

The International Union for Conservation of Nature (IUCN) Red List Categories and Criteria is a quantitative framework for classifying species according to extinction risk. Population models may be used to estimate extinction risk or population declines. Uncertainty and variability arise in threat classifications through measurement and process error in empirical data and uncertainty in the models used to estimate extinction risk and population declines. Furthermore, species traits are known to affect extinction risk. We investigated the effects of measurement and process error, model type, population growth rate, and age at first reproduction on the reliability of risk classifications based on projected population declines on IUCN Red List classifications. We used an age-structured population model to simulate true population trajectories with different growth rates, reproductive ages and levels of variation, and subjected them to measurement error. We evaluated the ability of scalar and matrix models parameterized with these simulated time series to accurately capture the IUCN Red List classification generated with true population declines. Under all levels of measurement error tested and low process error, classifications were reasonably accurate; scalar and matrix models yielded roughly the same rate of misclassifications, but the distribution of errors differed; matrix models led to greater overestimation of extinction risk than underestimations; process error tended to contribute to misclassifications to a greater extent than measurement error; and more misclassifications occurred for fast, rather than slow, life histories. These results indicate that classifications of highly threatened taxa (i.e., taxa with low growth rates) under criterion A are more likely to be reliable than for less threatened taxa when assessed with population models. Greater scrutiny needs to be placed on data used to parameterize population models for species with high growth rates, particularly when available evidence indicates a potential transition to higher risk categories. © 2018 Society for Conservation Biology.
Supervised classification of brain tissues through local multi-scale texture analysis by coupling DIR and FLAIR MR sequences

NASA Astrophysics Data System (ADS)

Poletti, Enea; Veronese, Elisa; Calabrese, Massimiliano; Bertoldo, Alessandra; Grisan, Enrico

2012-02-01

The automatic segmentation of brain tissues in magnetic resonance (MR) is usually performed on T1-weighted images, due to their high spatial resolution. T1w sequence, however, has some major downsides when brain lesions are present: the altered appearance of diseased tissues causes errors in tissues classification. In order to overcome these drawbacks, we employed two different MR sequences: fluid attenuated inversion recovery (FLAIR) and double inversion recovery (DIR). The former highlights both gray matter (GM) and white matter (WM), the latter highlights GM alone. We propose here a supervised classification scheme that does not require any anatomical a priori information to identify the 3 classes, "GM", "WM", and "background". Features are extracted by means of a local multi-scale texture analysis, computed for each pixel of the DIR and FLAIR sequences. The 9 textures considered are average, standard deviation, kurtosis, entropy, contrast, correlation, energy, homogeneity, and skewness, evaluated on a neighborhood of 3x3, 5x5, and 7x7 pixels. Hence, the total number of features associated to a pixel is 56 (9 textures x3 scales x2 sequences +2 original pixel values). The classifier employed is a Support Vector Machine with Radial Basis Function as kernel. From each of the 4 brain volumes evaluated, a DIR and a FLAIR slice have been selected and manually segmented by 2 expert neurologists, providing 1st and 2nd human reference observations which agree with an average accuracy of 99.03%. SVM performances have been assessed with a 4-fold cross-validation, yielding an average classification accuracy of 98.79%.
Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

NASA Technical Reports Server (NTRS)

Alexander, Tiffaney Miller

2017-01-01

Research results have shown that more than half of aviation, aerospace and aeronautics mishaps incidents are attributed to human error. As a part of Safety within space exploration ground processing operations, the identification and/or classification of underlying contributors and causes of human error must be identified, in order to manage human error. This research provides a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS), as an analysis tool to identify contributing factors, their impact on human error events, and predict the Human Error probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.

Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

NASA Technical Reports Server (NTRS)

Alexander, Tiffaney Miller

2017-01-01

Research results have shown that more than half of aviation, aerospace and aeronautics mishaps/incidents are attributed to human error. As a part of Safety within space exploration ground processing operations, the identification and/or classification of underlying contributors and causes of human error must be identified, in order to manage human error. This research provides a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS), as an analysis tool to identify contributing factors, their impact on human error events, and predict the Human Error probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.
Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS)

NASA Technical Reports Server (NTRS)

Alexander, Tiffaney Miller

2017-01-01

Research results have shown that more than half of aviation, aerospace and aeronautics mishaps incidents are attributed to human error. As a part of Quality within space exploration ground processing operations, the identification and or classification of underlying contributors and causes of human error must be identified, in order to manage human error.This presentation will provide a framework and methodology using the Human Error Assessment and Reduction Technique (HEART) and Human Factor Analysis and Classification System (HFACS), as an analysis tool to identify contributing factors, their impact on human error events, and predict the Human Error probabilities (HEPs) of future occurrences. This research methodology was applied (retrospectively) to six (6) NASA ground processing operations scenarios and thirty (30) years of Launch Vehicle related mishap data. This modifiable framework can be used and followed by other space and similar complex operations.
Factors That Affect Large Subunit Ribosomal DNA Amplicon Sequencing Studies of Fungal Communities: Classification Method, Primer Choice, and Error

PubMed Central

Porter, Teresita M.; Golding, G. Brian

2012-01-01

Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and to an increasing extent also amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, makes primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: 1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); 2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and, 3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and, NBC runs significantly faster than the other tested methods. All methods performed poorly with the shortest 50–100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys. PMID:22558215
Classification of breast cancer cytological specimen using convolutional neural network

NASA Astrophysics Data System (ADS)

Żejmo, Michał; Kowal, Marek; Korbicz, Józef; Monczak, Roman

2017-01-01

The paper presents a deep learning approach for automatic classification of breast tumors based on fine needle cytology. The main aim of the system is to distinguish benign from malignant cases based on microscopic images. Experiment was carried out on cytological samples derived from 50 patients (25 benign cases + 25 malignant cases) diagnosed in Regional Hospital in Zielona Góra. To classify microscopic images, we used convolutional neural networks (CNN) of two types: GoogLeNet and AlexNet. Due to the very large size of images of cytological specimen (on average 200000 × 100000 pixels), they were divided into smaller patches of size 256 × 256 pixels. Breast cancer classification usually is based on morphometric features of nuclei. Therefore, training and validation patches were selected using Support Vector Machine (SVM) so that suitable amount of cell material was depicted. Neural classifiers were tuned using GPU accelerated implementation of gradient descent algorithm. Training error was defined as a cross-entropy classification loss. Classification accuracy was defined as the percentage ratio of successfully classified validation patches to the total number of validation patches. The best accuracy rate of 83% was obtained by GoogLeNet model. We observed that more misclassified patches belong to malignant cases.
Supporting diagnosis of attention-deficit hyperactive disorder with novelty detection.

PubMed

Lee, Hyoung-Joo; Cho, Sungzoon; Shin, Min-Sup

2008-03-01

Computerized continuous performance test (CPT) is a widely used diagnostic tool for attention-deficit hyperactivity disorder (ADHD). It measures the number of correctly detected stimuli as well as response times. Typically, when calculating a cut-off score for discriminating between normal and abnormal, only the normal children's data are collected. Then the average and standard deviation of each measure or variable is computed. If any of variables is larger than 2 sigma above the average, that child is diagnosed as abnormal. We will call this approach as "T-score 70" classifier. However, its performance has a lot to be desired due to a high false negative error. In order to improve the classification accuracy we propose to use novelty detection approaches for supporting ADHD diagnosis. Novelty detection is a model building framework where a classifier is constructed using only one class of training data and a new input pattern is classified according to its similarity to the training data. A total of eight novelty detectors are introduced and applied to our ADHD datasets collected from two modes of tests, visual and auditory. They are evaluated and compared with the T-score model on validation datasets in terms of false positive and negative error rates, and area under receiver operating characteristics curve (AuROC). Experimental results show that the cut-off score of 70 is suboptimal which leads to a low false positive error but a very high false negative error. A few novelty detectors such as Parzen density estimators yield much more balanced classification performances. Moreover, most novelty detectors outperform the T-score method for most age groups statistically with a significance level of 1% in terms of AuROC. In particular, we recommend the Parzen and Gaussian density estimators, kernel principal component analysis, one-class support vector machine, and K-means clustering novelty detector which can improve upon the T-score method on average by at least 30% for the visual test and 40% for the auditory test. In addition, their performances are relatively stable over various parameter values as long as they are within reasonable ranges. The proposed novelty detection approaches can replace the T-score method which has been considered the "gold standard" for supporting ADHD diagnosis. Furthermore, they can be applied to other psychological tests where only normal data are available.
A Temporal Mining Framework for Classifying Un-Evenly Spaced Clinical Data: An Approach for Building Effective Clinical Decision-Making System.

PubMed

Jane, Nancy Yesudhas; Nehemiah, Khanna Harichandran; Arputharaj, Kannan

2016-01-01

Clinical time-series data acquired from electronic health records (EHR) are liable to temporal complexities such as irregular observations, missing values and time constrained attributes that make the knowledge discovery process challenging. This paper presents a temporal rough set induced neuro-fuzzy (TRiNF) mining framework that handles these complexities and builds an effective clinical decision-making system. TRiNF provides two functionalities namely temporal data acquisition (TDA) and temporal classification. In TDA, a time-series forecasting model is constructed by adopting an improved double exponential smoothing method. The forecasting model is used in missing value imputation and temporal pattern extraction. The relevant attributes are selected using a temporal pattern based rough set approach. In temporal classification, a classification model is built with the selected attributes using a temporal pattern induced neuro-fuzzy classifier. For experimentation, this work uses two clinical time series dataset of hepatitis and thrombosis patients. The experimental result shows that with the proposed TRiNF framework, there is a significant reduction in the error rate, thereby obtaining the classification accuracy on an average of 92.59% for hepatitis and 91.69% for thrombosis dataset. The obtained classification results prove the efficiency of the proposed framework in terms of its improved classification accuracy.
Approximated mutual information training for speech recognition using myoelectric signals.

PubMed

Guo, Hua J; Chan, A D C

2006-01-01

A new training algorithm called the approximated maximum mutual information (AMMI) is proposed to improve the accuracy of myoelectric speech recognition using hidden Markov models (HMMs). Previous studies have demonstrated that automatic speech recognition can be performed using myoelectric signals from articulatory muscles of the face. Classification of facial myoelectric signals can be performed using HMMs that are trained using the maximum likelihood (ML) algorithm; however, this algorithm maximizes the likelihood of the observations in the training sequence, which is not directly associated with optimal classification accuracy. The AMMI training algorithm attempts to maximize the mutual information, thereby training the HMMs to optimize their parameters for discrimination. Our results show that AMMI training consistently reduces the error rates compared to these by the ML training, increasing the accuracy by approximately 3% on average.
Global terrain classification using Multiple-Error-Removed Improved-Terrain (MERIT) to address susceptibility of landslides and other geohazards

NASA Astrophysics Data System (ADS)

Iwahashi, J.; Yamazaki, D.; Matsuoka, M.; Thamarux, P.; Herrick, J.; Yong, A.; Mital, U.

2017-12-01

A seamless model of landform classifications with regional accuracy will be a powerful platform for geophysical studies that forecast geologic hazards. Spatial variability as a function of landform on a global scale was captured in the automated classifications of Iwahashi and Pike (2007) and additional developments are presented here that incorporate more accurate depictions using higher-resolution elevation data than the original 1-km scale Shuttle Radar Topography Mission digital elevation model (DEM). We create polygon-based terrain classifications globally by using the 280-m DEM interpolated from the Multi-Error-Removed Improved-Terrain DEM (MERIT; Yamazaki et al., 2017). The multi-scale pixel-image analysis method, known as Multi-resolution Segmentation (Baatz and Schäpe, 2000), is first used to classify the terrains based on geometric signatures (slope and local convexity) calculated from the 280-m DEM. Next, we apply the machine learning method of "k-means clustering" to prepare the polygon-based classification at the globe-scale using slope, local convexity and surface texture. We then group the divisions with similar properties by hierarchical clustering and other statistical analyses using geological and geomorphological data of the area where landslides and earthquakes are frequent (e.g. Japan and California). We find the 280-m DEM resolution is only partially sufficient for classifying plains. We nevertheless observe that the categories correspond to reported landslide and liquefaction features at the global scale, suggesting that our model is an appropriate platform to forecast ground failure. To predict seismic amplification, we estimate site conditions using the time-averaged shear-wave velocity in the upper 30-m (VS30) measurements compiled by Yong et al. (2016) and the terrain model developed by Yong (2016; Y16). We plan to test our method on finer resolution DEMs and report our findings to obtain a more globally consistent terrain model as there are known errors in DEM derivatives at higher-resolutions. We expect the improvement in DEM resolution (4 times greater detail) and the combination of regional and global coverage will yield a consistent dataset of polygons that have the potential to improve relations to the Y16 estimates significantly.
On Identifying Clusters Within the C-type Asteroids of the Sloan Digital Sky Survey

NASA Astrophysics Data System (ADS)

Poole, Renae; Ziffer, J.; Harvell, T.

2012-10-01

We applied AutoClass, a data mining technique based upon Bayesian Classification, to C-group asteroid colors in the Sloan Digital Sky Survey (SDSS). Previous taxonomic studies relied mostly on Principal Component Analysis (PCA) to differentiate asteroids within the C-group (e.g. B, G, F, Ch, Cg and Cb). AutoClass's advantage is that it calculates the most probable classification for us, removing the human factor from this part of the analysis. In our results, AutoClass divided the C-groups into two large classes and six smaller classes. The two large classes (n=4974 and 2033, respectively) display distinct regions with some overlap in color-vs-color plots. Each cluster's average spectrum is compared to 'typical' spectra of the C-group subtypes as defined by Tholen (1989) and each cluster's members are evaluated for consistency with previous taxonomies. Of the 117 asteroids classified as B-type in previous taxonomies, only 12 were found with SDSS colors that matched our criteria of having less than 0.1 magnitude error in u and 0.05 magnitude error in g, r, i, and z colors. Although this is a relatively small group, 11 of the 12 B-types were placed by AutoClass in the same cluster. By determining the C-group sub-classifications in the large SDSS database, this research furthers our understanding of the stratigraphy and composition of the main-belt.
Organic Chemical Attribution Signatures for the Sourcing of a Mustard Agent and Its Starting Materials

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fraga, Carlos G.; Bronk, Krys; Dockendorff, Brian P.

Chemical attribution signatures (CAS) are being investigated for the sourcing of chemical warfare (CW) agents and their starting materials that may be implicated in chemical attacks or CW proliferation. The work reported here demonstrates for the first time trace impurities produced during the synthesis of tris(2-chloroethyl)amine (HN3) that point to specific reagent stocks used in the synthesis of this CW agent. Thirty batches of HN3 were synthesized using different combinations of commercial stocks of triethanolamine (TEA), thionyl chloride, chloroform, and acetone. The HN3 batches and reagent stocks were then analyzed for impurities by gas chromatography/mass spectrometry. Reaction-produced impurities indicative ofmore » specific TEA and chloroform stocks were exclusively discovered in HN3 batches made with those reagent stocks. In addition, some reagent impurities were found in the HN3 batches that were presumably not altered during synthesis and believed to be indicative of reagent type regardless of stock. Supervised classification using partial least squares discriminant analysis (PLSDA) on the impurity profiles of chloroform samples from seven stocks resulted in an average classification error by cross-validation of 2.4%. A classification error of zero was obtained using the seven-stock PLSDA model on a validation set of samples from an arbitrarily selected chloroform stock. In a separate analysis, all samples from two of seven chloroform stocks that were purposely not modeled had their samples matched to a chloroform stock rather than assigned a “no class” classification.« less
On the use of interaction error potentials for adaptive brain computer interfaces.

PubMed

Llera, A; van Gerven, M A J; Gómez, V; Jensen, O; Kappen, H J

2011-12-01

We propose an adaptive classification method for the Brain Computer Interfaces (BCI) which uses Interaction Error Potentials (IErrPs) as a reinforcement signal and adapts the classifier parameters when an error is detected. We analyze the quality of the proposed approach in relation to the misclassification of the IErrPs. In addition we compare static versus adaptive classification performance using artificial and MEG data. We show that the proposed adaptive framework significantly improves the static classification methods. Copyright © 2011 Elsevier Ltd. All rights reserved.
A neural network for noise correlation classification

NASA Astrophysics Data System (ADS)

Paitz, Patrick; Gokhberg, Alexey; Fichtner, Andreas

2018-02-01

We present an artificial neural network (ANN) for the classification of ambient seismic noise correlations into two categories, suitable and unsuitable for noise tomography. By using only a small manually classified data subset for network training, the ANN allows us to classify large data volumes with low human effort and to encode the valuable subjective experience of data analysts that cannot be captured by a deterministic algorithm. Based on a new feature extraction procedure that exploits the wavelet-like nature of seismic time-series, we efficiently reduce the dimensionality of noise correlation data, still keeping relevant features needed for automated classification. Using global- and regional-scale data sets, we show that classification errors of 20 per cent or less can be achieved when the network training is performed with as little as 3.5 per cent and 16 per cent of the data sets, respectively. Furthermore, the ANN trained on the regional data can be applied to the global data, and vice versa, without a significant increase of the classification error. An experiment where four students manually classified the data, revealed that the classification error they would assign to each other is substantially larger than the classification error of the ANN (>35 per cent). This indicates that reproducibility would be hampered more by human subjectivity than by imperfections of the ANN.
Medication errors as malpractice-a qualitative content analysis of 585 medication errors by nurses in Sweden.

PubMed

Björkstén, Karin Sparring; Bergqvist, Monica; Andersén-Karlsson, Eva; Benson, Lina; Ulfvarson, Johanna

2016-08-24

Many studies address the prevalence of medication errors but few address medication errors serious enough to be regarded as malpractice. Other studies have analyzed the individual and system contributory factor leading to a medication error. Nurses have a key role in medication administration, and there are contradictory reports on the nurses' work experience in relation to the risk and type for medication errors. All medication errors where a nurse was held responsible for malpractice (n = 585) during 11 years in Sweden were included. A qualitative content analysis and classification according to the type and the individual and system contributory factors was made. In order to test for possible differences between nurses' work experience and associations within and between the errors and contributory factors, Fisher's exact test was used, and Cohen's kappa (k) was performed to estimate the magnitude and direction of the associations. There were a total of 613 medication errors in the 585 cases, the most common being "Wrong dose" (41 %), "Wrong patient" (13 %) and "Omission of drug" (12 %). In 95 % of the cases, an average of 1.4 individual contributory factors was found; the most common being "Negligence, forgetfulness or lack of attentiveness" (68 %), "Proper protocol not followed" (25 %), "Lack of knowledge" (13 %) and "Practice beyond scope" (12 %). In 78 % of the cases, an average of 1.7 system contributory factors was found; the most common being "Role overload" (36 %), "Unclear communication or orders" (30 %) and "Lack of adequate access to guidelines or unclear organisational routines" (30 %). The errors "Wrong patient due to mix-up of patients" and "Wrong route" and the contributory factors "Lack of knowledge" and "Negligence, forgetfulness or lack of attentiveness" were more common in less experienced nurses. The experienced nurses were more prone to "Practice beyond scope of practice" and to make errors in spite of "Lack of adequate access to guidelines or unclear organisational routines". Medication errors regarded as malpractice in Sweden were of the same character as medication errors worldwide. A complex interplay between individual and system factors often contributed to the errors.
Multistrategy Self-Organizing Map Learning for Classification Problems

PubMed Central

Hasan, S.; Shamsuddin, S. M.

2011-01-01

Multistrategy Learning of Self-Organizing Map (SOM) and Particle Swarm Optimization (PSO) is commonly implemented in clustering domain due to its capabilities in handling complex data characteristics. However, some of these multistrategy learning architectures have weaknesses such as slow convergence time always being trapped in the local minima. This paper proposes multistrategy learning of SOM lattice structure with Particle Swarm Optimisation which is called ESOMPSO for solving various classification problems. The enhancement of SOM lattice structure is implemented by introducing a new hexagon formulation for better mapping quality in data classification and labeling. The weights of the enhanced SOM are optimised using PSO to obtain better output quality. The proposed method has been tested on various standard datasets with substantial comparisons with existing SOM network and various distance measurement. The results show that our proposed method yields a promising result with better average accuracy and quantisation errors compared to the other methods as well as convincing significant test. PMID:21876686
Multiple Category-Lot Quality Assurance Sampling: A New Classification System with Application to Schistosomiasis Control

PubMed Central

Olives, Casey; Valadez, Joseph J.; Brooker, Simon J.; Pagano, Marcello

2012-01-01

Background Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. Methodology We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n = 15 and n = 25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Principle Findings Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n = 15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. Conclusion/Significance This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalance of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools. PMID:22970333
Long-term surface EMG monitoring using K-means clustering and compressive sensing

NASA Astrophysics Data System (ADS)

Balouchestani, Mohammadreza; Krishnan, Sridhar

2015-05-01

In this work, we present an advanced K-means clustering algorithm based on Compressed Sensing theory (CS) in combination with the K-Singular Value Decomposition (K-SVD) method for Clustering of long-term recording of surface Electromyography (sEMG) signals. The long-term monitoring of sEMG signals aims at recording of the electrical activity produced by muscles which are very useful procedure for treatment and diagnostic purposes as well as for detection of various pathologies. The proposed algorithm is examined for three scenarios of sEMG signals including healthy person (sEMG-Healthy), a patient with myopathy (sEMG-Myopathy), and a patient with neuropathy (sEMG-Neuropathr), respectively. The proposed algorithm can easily scan large sEMG datasets of long-term sEMG recording. We test the proposed algorithm with Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC) dimensionality reduction methods. Then, the output of the proposed algorithm is fed to K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers in order to calclute the clustering performance. The proposed algorithm achieves a classification accuracy of 99.22%. This ability allows reducing 17% of Average Classification Error (ACE), 9% of Training Error (TE), and 18% of Root Mean Square Error (RMSE). The proposed algorithm also reduces 14% clustering energy consumption compared to the existing K-Means clustering algorithm.
Analyzing thematic maps and mapping for accuracy

USGS Publications Warehouse

Rosenfield, G.H.

1982-01-01

Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given. classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification. The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated step of data analysis techniques would be to use the entire classification error matrices using the methods of discrete multivariate analysis or of multiviariate analysis of variance.
Analysis of DSN software anomalies

NASA Technical Reports Server (NTRS)

Galorath, D. D.; Hecht, H.; Hecht, M.; Reifer, D. J.

1981-01-01

A categorized data base of software errors which were discovered during the various stages of development and operational use of the Deep Space Network DSN/Mark 3 System was developed. A study team identified several existing error classification schemes (taxonomies), prepared a detailed annotated bibliography of the error taxonomy literature, and produced a new classification scheme which was tuned to the DSN anomaly reporting system and encapsulated the work of others. Based upon the DSN/RCI error taxonomy, error data on approximately 1000 reported DSN/Mark 3 anomalies were analyzed, interpreted and classified. Next, error data are summarized and histograms were produced highlighting key tendencies.
Some remarks on using circulation classifications to evaluate circulation model and atmospheric reanalysis data

NASA Astrophysics Data System (ADS)

Stryhal, Jan; Huth, Radan

2017-04-01

Automated classifications of atmospheric circulation patterns represent a tool widely used for studying the circulation in both the real atmosphere, represented by atmospheric reanalyses, and in circulation model outputs. It is well known that the results of studies utilizing one of these methods are influenced by several subjective choices, of which one of the most crucial is the selection of the method itself. Authors of the present study used eight methods from the COST733 classification software (Grosswettertypes, two variants of Jenkinson-Collison, Lund, T-mode PCA with oblique rotation of principal components, k-medoids, k-means with differing starting partitions, and SANDRA) to assess the winter 1961-2000 daily sea level pressure patterns in five reanalysis datasets (ERA-40, NCEP-1, JRA-55, 20CRv2, and ERA-20C), as well as in the historical runs and 21st century projections of an ensemble of CMIP5 GCMs. The classification methods were quite consistent in displaying the strongest biases in GCM simulations. However, the results also showed that multiple classifications are required to quantify the biases in certain types of circulation (e.g., zonal circulation or blocking-like patterns). There was no sign that any method should have a tendency to over- or underestimate the biases in circulation type frequency. The bias found by a particular method for a particular domain clearly reflects the ability of the algorithm to detect groups of similar patterns within the data space, and whether these groups do or do not differ one dataset to another is to a large extend coincidental. There were, nevertheless, systematic differences between groups of methods that use some form of correlation to classify the patterns to circulation types (CTs) and those which use the Euclidean distance. The comparison of reanalyses, which was conducted over eight European domains, showed that there is even a weak negative correlation between the average differences of CT frequency found by cluster analysis methods on one hand, and the remaining methods on the other. This suggests that groups of different methods capture different kinds of errors and that averaging the results obtained by an ensemble of methods very likely leads to an underestimation of the errors actually present in the data.
Neyman-Pearson classification algorithms and NP receiver operating characteristics

PubMed Central

Tong, Xin; Feng, Yang; Li, Jingyi Jessica

2018-01-01

In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies. PMID:29423442

Neyman-Pearson classification algorithms and NP receiver operating characteristics.

PubMed

Tong, Xin; Feng, Yang; Li, Jingyi Jessica

2018-02-01

In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies.
Evaluation of normalization methods for cDNA microarray data by k-NN classification

PubMed Central

Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

2005-01-01

Background Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Conclusion Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics. PMID:16045803
Evaluation of normalization methods for cDNA microarray data by k-NN classification.

PubMed

Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

2005-07-26

Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.
Human Vision-Motivated Algorithm Allows Consistent Retinal Vessel Classification Based on Local Color Contrast for Advancing General Diagnostic Exams.

PubMed

Ivanov, Iliya V; Leitritz, Martin A; Norrenberg, Lars A; Völker, Michael; Dynowski, Marek; Ueffing, Marius; Dietter, Johannes

2016-02-01

Abnormalities of blood vessel anatomy, morphology, and ratio can serve as important diagnostic markers for retinal diseases such as AMD or diabetic retinopathy. Large cohort studies demand automated and quantitative image analysis of vascular abnormalities. Therefore, we developed an analytical software tool to enable automated standardized classification of blood vessels supporting clinical reading. A dataset of 61 images was collected from a total of 33 women and 8 men with a median age of 38 years. The pupils were not dilated, and images were taken after dark adaption. In contrast to current methods in which classification is based on vessel profile intensity averages, and similar to human vision, local color contrast was chosen as a discriminator to allow artery vein discrimination and arterial-venous ratio (AVR) calculation without vessel tracking. With 83% ± 1 standard error of the mean for our dataset, we achieved best classification for weighted lightness information from a combination of the red, green, and blue channels. Tested on an independent dataset, our method reached 89% correct classification, which, when benchmarked against conventional ophthalmologic classification, shows significantly improved classification scores. Our study demonstrates that vessel classification based on local color contrast can cope with inter- or intraimage lightness variability and allows consistent AVR calculation. We offer an open-source implementation of this method upon request, which can be integrated into existing tool sets and applied to general diagnostic exams.
Bayesian learning for spatial filtering in an EEG-based brain-computer interface.

PubMed

Zhang, Haihong; Yang, Huijuan; Guan, Cuntai

2013-07-01

Spatial filtering for EEG feature extraction and classification is an important tool in brain-computer interface. However, there is generally no established theory that links spatial filtering directly to Bayes classification error. To address this issue, this paper proposes and studies a Bayesian analysis theory for spatial filtering in relation to Bayes error. Following the maximum entropy principle, we introduce a gamma probability model for describing single-trial EEG power features. We then formulate and analyze the theoretical relationship between Bayes classification error and the so-called Rayleigh quotient, which is a function of spatial filters and basically measures the ratio in power features between two classes. This paper also reports our extensive study that examines the theory and its use in classification, using three publicly available EEG data sets and state-of-the-art spatial filtering techniques and various classifiers. Specifically, we validate the positive relationship between Bayes error and Rayleigh quotient in real EEG power features. Finally, we demonstrate that the Bayes error can be practically reduced by applying a new spatial filter with lower Rayleigh quotient.
Assessing the statistical significance of the achieved classification error of classifiers constructed using serum peptide profiles, and a prescription for random sampling repeated studies for massive high-throughput genomic and proteomic studies.

PubMed

Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos

2005-01-01

Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.
Automated Classification of Phonological Errors in Aphasic Language

PubMed Central

Ahuja, Sanjeev B.; Reggia, James A.; Berndt, Rita S.

1984-01-01

Using heuristically-guided state space search, a prototype program has been developed to simulate and classify phonemic errors occurring in the speech of neurologically-impaired patients. Simulations are based on an interchangeable rule/operator set of elementary errors which represent a theory of phonemic processing faults. This work introduces and evaluates a novel approach to error simulation and classification, it provides a prototype simulation tool for neurolinguistic research, and it forms the initial phase of a larger research effort involving computer modelling of neurolinguistic processes.
ANALYSIS OF A CLASSIFICATION ERROR MATRIX USING CATEGORICAL DATA TECHNIQUES.

USGS Publications Warehouse

Rosenfield, George H.; Fitzpatrick-Lins, Katherine

1984-01-01

Summary form only given. A classification error matrix typically contains tabulation results of an accuracy evaluation of a thematic classification, such as that of a land use and land cover map. The diagonal elements of the matrix represent the counts corrected, and the usual designation of classification accuracy has been the total percent correct. The nondiagonal elements of the matrix have usually been neglected. The classification error matrix is known in statistical terms as a contingency table of categorical data. As an example, an application of these methodologies to a problem of remotely sensed data concerning two photointerpreters and four categories of classification indicated that there is no significant difference in the interpretation between the two photointerpreters, and that there are significant differences among the interpreted category classifications. However, two categories, oak and cottonwood, are not separable in classification in this experiment at the 0. 51 percent probability. A coefficient of agreement is determined for the interpreted map as a whole, and individually for each of the interpreted categories. A conditional coefficient of agreement for the individual categories is compared to other methods for expressing category accuracy which have already been presented in the remote sensing literature.
New wideband radar target classification method based on neural learning and modified Euclidean metric

NASA Astrophysics Data System (ADS)

Jiang, Yicheng; Cheng, Ping; Ou, Yangkui

2001-09-01

A new method for target classification of high-range resolution radar is proposed. It tries to use neural learning to obtain invariant subclass features of training range profiles. A modified Euclidean metric based on the Box-Cox transformation technique is investigated for Nearest Neighbor target classification improvement. The classification experiments using real radar data of three different aircraft have demonstrated that classification error can reduce 8% if this method proposed in this paper is chosen instead of the conventional method. The results of this paper have shown that by choosing an optimized metric, it is indeed possible to reduce the classification error without increasing the number of samples.
What Do Spelling Errors Tell Us? Classification and Analysis of Errors Made by Greek Schoolchildren with and without Dyslexia

ERIC Educational Resources Information Center

Protopapas, Athanassios; Fakou, Aikaterini; Drakopoulou, Styliani; Skaloumbakas, Christos; Mouzaki, Angeliki

2013-01-01

In this study we propose a classification system for spelling errors and determine the most common spelling difficulties of Greek children with and without dyslexia. Spelling skills of 542 children from the general population and 44 children with dyslexia, Grades 3-4 and 7, were assessed with a dictated common word list and age-appropriate…
n-Iterative Exponential Forgetting Factor for EEG Signals Parameter Estimation

PubMed Central

Palma Orozco, Rosaura

2018-01-01

Electroencephalograms (EEG) signals are of interest because of their relationship with physiological activities, allowing a description of motion, speaking, or thinking. Important research has been developed to take advantage of EEG using classification or predictor algorithms based on parameters that help to describe the signal behavior. Thus, great importance should be taken to feature extraction which is complicated for the Parameter Estimation (PE)–System Identification (SI) process. When based on an average approximation, nonstationary characteristics are presented. For PE the comparison of three forms of iterative-recursive uses of the Exponential Forgetting Factor (EFF) combined with a linear function to identify a synthetic stochastic signal is presented. The one with best results seen through the functional error is applied to approximate an EEG signal for a simple classification example, showing the effectiveness of our proposal. PMID:29568310
Indirect Validation of Probe Speed Data on Arterial Corridors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eshragh, Sepideh; Young, Stanley E.; Sharifi, Elham

This study aimed to estimate the accuracy of probe speed data on arterial corridors on the basis of roadway geometric attributes and functional classification. It was assumed that functional class (medium and low) along with other road characteristics (such as weighted average of the annual average daily traffic, average signal density, average access point density, and average speed) were available as correlation factors to estimate the accuracy of probe traffic data. This study tested these factors as predictors of the fidelity of probe traffic data by using the results of an extensive validation exercise. This study showed strong correlations betweenmore » these geometric attributes and the accuracy of probe data when they were assessed by using average absolute speed error. Linear models were regressed to existing data to estimate appropriate models for medium- and low-type arterial corridors. The proposed models for medium- and low-type arterials were validated further on the basis of the results of a slowdown analysis. These models can be used to predict the accuracy of probe data indirectly in medium and low types of arterial corridors.« less
Effectiveness of Global Features for Automatic Medical Image Classification and Retrieval – the experiences of OHSU at ImageCLEFmed

PubMed Central

Kalpathy-Cramer, Jayashree; Hersh, William

2008-01-01

In 2006 and 2007, Oregon Health & Science University (OHSU) participated in the automatic image annotation task for medical images at ImageCLEF, an annual international benchmarking event that is part of the Cross Language Evaluation Forum (CLEF). The goal of the automatic annotation task was to classify 1000 test images based on the Image Retrieval in Medical Applications (IRMA) code, given a set of 10,000 training images. There were 116 distinct classes in 2006 and 2007. We evaluated the efficacy of a variety of primarily global features for this classification task. These included features based on histograms, gray level correlation matrices and the gist technique. A multitude of classifiers including k-nearest neighbors, two-level neural networks, support vector machines, and maximum likelihood classifiers were evaluated. Our official error rates for the 1000 test images were 26% in 2006 using the flat classification structure. The error count in 2007 was 67.8 using the hierarchical classification error computation based on the IRMA code in 2007. Confusion matrices as well as clustering experiments were used to identify visually similar classes. The use of the IRMA code did not help us in the classification task as the semantic hierarchy of the IRMA classes did not correspond well with the hierarchy based on clustering of image features that we used. Our most frequent misclassification errors were along the view axis. Subsequent experiments based on a two-stage classification system decreased our error rate to 19.8% for the 2006 dataset and our error count to 55.4 for the 2007 data. PMID:19884953
Physician Preferences to Communicate Neuropsychological Results: Comparison of Qualitative Descriptors and a Proposal to Reduce Communication Errors.

PubMed

Schoenberg, Mike R; Osborn, Katie E; Mahone, E Mark; Feigon, Maia; Roth, Robert M; Pliskin, Neil H

2017-11-08

Errors in communication are a leading cause of medical errors. A potential source of error in communicating neuropsychological results is confusion in the qualitative descriptors used to describe standardized neuropsychological data. This study sought to evaluate the extent to which medical consumers of neuropsychological assessments believed that results/findings were not clearly communicated. In addition, preference data for a variety of qualitative descriptors commonly used to communicate normative neuropsychological test scores were obtained. Preference data were obtained for five qualitative descriptor systems as part of a larger 36-item internet-based survey of physician satisfaction with neuropsychological services. A new qualitative descriptor system termed the Simplified Qualitative Classification System (Q-Simple) was proposed to reduce the potential for communication errors using seven terms: very superior, superior, high average, average, low average, borderline, and abnormal/impaired. A non-random convenience sample of 605 clinicians identified from four United States academic medical centers from January 1, 2015 through January 7, 2016 were invited to participate. A total of 182 surveys were completed. A minority of clinicians (12.5%) indicated that neuropsychological study results were not clearly communicated. When communicating neuropsychological standardized scores, the two most preferred qualitative descriptor systems were by Heaton and colleagues (26%) and a newly proposed Q-simple system (22%). Comprehensive norms for an extended Halstead-Reitan battery: Demographic corrections, research findings, and clinical applications. Odessa, TX: Psychological Assessment Resources) (26%) and the newly proposed Q-Simple system (22%). Initial findings highlight the need to improve and standardize communication of neuropsychological results. These data offer initial guidance for preferred terms to communicate test results and form a foundation for more standardized practice among neuropsychologists. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Error-related brain activity and error awareness in an error classification paradigm.

PubMed

Di Gregorio, Francesco; Steinhauser, Marco; Maier, Martin E

2016-10-01

Error-related brain activity has been linked to error detection enabling adaptive behavioral adjustments. However, it is still unclear which role error awareness plays in this process. Here, we show that the error-related negativity (Ne/ERN), an event-related potential reflecting early error monitoring, is dissociable from the degree of error awareness. Participants responded to a target while ignoring two different incongruent distractors. After responding, they indicated whether they had committed an error, and if so, whether they had responded to one or to the other distractor. This error classification paradigm allowed distinguishing partially aware errors, (i.e., errors that were noticed but misclassified) and fully aware errors (i.e., errors that were correctly classified). The Ne/ERN was larger for partially aware errors than for fully aware errors. Whereas this speaks against the idea that the Ne/ERN foreshadows the degree of error awareness, it confirms the prediction of a computational model, which relates the Ne/ERN to post-response conflict. This model predicts that stronger distractor processing - a prerequisite of error classification in our paradigm - leads to lower post-response conflict and thus a smaller Ne/ERN. This implies that the relationship between Ne/ERN and error awareness depends on how error awareness is related to response conflict in a specific task. Our results further indicate that the Ne/ERN but not the degree of error awareness determines adaptive performance adjustments. Taken together, we conclude that the Ne/ERN is dissociable from error awareness and foreshadows adaptive performance adjustments. Our results suggest that the relationship between the Ne/ERN and error awareness is correlative and mediated by response conflict. Copyright © 2016 Elsevier Inc. All rights reserved.
The associations of insomnia with costly workplace accidents and errors: results from the America Insomnia Survey.

PubMed

Shahly, Victoria; Berglund, Patricia A; Coulouvrat, Catherine; Fitzgerald, Timothy; Hajak, Goeran; Roth, Thomas; Shillington, Alicia C; Stephenson, Judith J; Walsh, James K; Kessler, Ronald C

2012-10-01

Insomnia is a common and seriously impairing condition that often goes unrecognized. To examine associations of broadly defined insomnia (ie, meeting inclusion criteria for a diagnosis from International Statistical Classification of Diseases, 10th Revision, DSM-IV, or Research Diagnostic Criteria/International Classification of Sleep Disorders, Second Edition) with costly workplace accidents and errors after excluding other chronic conditions among workers in the America Insomnia Survey (AIS). A national cross-sectional telephone survey (65.0% cooperation rate) of commercially insured health plan members selected from the more than 34 million in the HealthCore Integrated Research Database. Four thousand nine hundred ninety-one employed AIS respondents. Costly workplace accidents or errors in the 12 months before the AIS interview were assessed with one question about workplace accidents "that either caused damage or work disruption with a value of $500 or more" and another about other mistakes "that cost your company $500 or more." Current insomnia with duration of at least 12 months was assessed with the Brief Insomnia Questionnaire, a validated (area under the receiver operating characteristic curve, 0.86 compared with diagnoses based on blinded clinical reappraisal interviews), fully structured diagnostic interview. Eighteen other chronic conditions were assessed with medical/pharmacy claims records and validated self-report scales. Insomnia had a significant odds ratio with workplace accidents and/or errors controlled for other chronic conditions (1.4). The odds ratio did not vary significantly with respondent age, sex, educational level, or comorbidity. The average costs of insomnia-related accidents and errors ($32 062) were significantly higher than those of other accidents and errors ($21 914). Simulations estimated that insomnia was associated with 7.2% of all costly workplace accidents and errors and 23.7% of all the costs of these incidents. These proportions are higher than for any other chronic condition, with annualized US population projections of 274 000 costly insomnia-related workplace accidents and errors having a combined value of US $31.1 billion. Effectiveness trials are needed to determine whether expanded screening, outreach, and treatment of workers with insomnia would yield a positive return on investment for employers.
Algorithmic Classification of Five Characteristic Types of Paraphasias.

PubMed

Fergadiotis, Gerasimos; Gorman, Kyle; Bedrick, Steven

2016-12-01

This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.
Human error analysis of commercial aviation accidents using the human factors analysis and classification system (HFACS)

DOT National Transportation Integrated Search

2001-02-01

The Human Factors Analysis and Classification System (HFACS) is a general human error framework : originally developed and tested within the U.S. military as a tool for investigating and analyzing the human : causes of aviation accidents. Based upon ...
Current Assessment and Classification of Suicidal Phenomena using the FDA 2012 Draft Guidance Document on Suicide Assessment: A Critical Review.

PubMed

Sheehan, David V; Giddens, Jennifer M; Sheehan, Kathy Harnett

2014-09-01

Standard international classification criteria require that classification categories be comprehensive to avoid type II error. Categories should be mutually exclusive and definitions should be clear and unambiguous (to avoid type I and type II errors). In addition, the classification system should be robust enough to last over time and provide comparability between data collections. This article was designed to evaluate the extent to which the classification system contained in the United States Food and Drug Administration 2012 Draft Guidance for the prospective assessment and classification of suicidal ideation and behavior in clinical trials meets these criteria. A critical review is used to assess the extent to which the proposed categories contained in the Food and Drug Administration 2012 Draft Guidance are comprehensive, unambiguous, and robust. Assumptions that underlie the classification system are also explored. The Food and Drug Administration classification system contained in the 2012 Draft Guidance does not capture the full range of suicidal ideation and behavior (type II error). Definitions, moreover, are frequently ambiguous (susceptible to multiple interpretations), and the potential for misclassification (type I and type II errors) is compounded by frequent mismatches in category titles and definitions. These issues have the potential to compromise data comparability within clinical trial sites, across sites, and over time. These problems need to be remedied because of the potential for flawed data output and consequent threats to public health, to research on the safety of medications, and to the search for effective medication treatments for suicidality.
Particle Swarm Optimization approach to defect detection in armour ceramics.

PubMed

Kesharaju, Manasa; Nagarajah, Romesh

2017-03-01

In this research, various extracted features were used in the development of an automated ultrasonic sensor based inspection system that enables defect classification in each ceramic component prior to despatch to the field. Classification is an important task and large number of irrelevant, redundant features commonly introduced to a dataset reduces the classifiers performance. Feature selection aims to reduce the dimensionality of the dataset while improving the performance of a classification system. In the context of a multi-criteria optimization problem (i.e. to minimize classification error rate and reduce number of features) such as one discussed in this research, the literature suggests that evolutionary algorithms offer good results. Besides, it is noted that Particle Swarm Optimization (PSO) has not been explored especially in the field of classification of high frequency ultrasonic signals. Hence, a binary coded Particle Swarm Optimization (BPSO) technique is investigated in the implementation of feature subset selection and to optimize the classification error rate. In the proposed method, the population data is used as input to an Artificial Neural Network (ANN) based classification system to obtain the error rate, as ANN serves as an evaluator of PSO fitness function. Copyright © 2016. Published by Elsevier B.V.

The impact of OCR accuracy on automated cancer classification of pathology reports.

PubMed

Zuccon, Guido; Nguyen, Anthony N; Bergheim, Anton; Wickman, Sandra; Grayson, Narelle

2012-01-01

To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
Hand posture classification using electrocorticography signals in the gamma band over human sensorimotor brain areas

NASA Astrophysics Data System (ADS)

Chestek, Cynthia A.; Gilja, Vikash; Blabe, Christine H.; Foster, Brett L.; Shenoy, Krishna V.; Parvizi, Josef; Henderson, Jaimie M.

2013-04-01

Objective. Brain-machine interface systems translate recorded neural signals into command signals for assistive technology. In individuals with upper limb amputation or cervical spinal cord injury, the restoration of a useful hand grasp could significantly improve daily function. We sought to determine if electrocorticographic (ECoG) signals contain sufficient information to select among multiple hand postures for a prosthetic hand, orthotic, or functional electrical stimulation system.Approach. We recorded ECoG signals from subdural macro- and microelectrodes implanted in motor areas of three participants who were undergoing inpatient monitoring for diagnosis and treatment of intractable epilepsy. Participants performed five distinct isometric hand postures, as well as four distinct finger movements. Several control experiments were attempted in order to remove sensory information from the classification results. Online experiments were performed with two participants. Main results. Classification rates were 68%, 84% and 81% for correct identification of 5 isometric hand postures offline. Using 3 potential controls for removing sensory signals, error rates were approximately doubled on average (2.1×). A similar increase in errors (2.6×) was noted when the participant was asked to make simultaneous wrist movements along with the hand postures. In online experiments, fist versus rest was successfully classified on 97% of trials; the classification output drove a prosthetic hand. Online classification performance for a larger number of hand postures remained above chance, but substantially below offline performance. In addition, the long integration windows used would preclude the use of decoded signals for control of a BCI system. Significance. These results suggest that ECoG is a plausible source of command signals for prosthetic grasp selection. Overall, avenues remain for improvement through better electrode designs and placement, better participant training, and characterization of non-stationarities such that ECoG could be a viable signal source for grasp control for amputees or individuals with paralysis.
Error Detection in Mechanized Classification Systems

ERIC Educational Resources Information Center

Hoyle, W. G.

1976-01-01

When documentary material is indexed by a mechanized classification system, and the results judged by trained professionals, the number of documents in disagreement, after suitable adjustment, defines the error rate of the system. In a test case disagreement was 22 percent and, of this 22 percent, the computer correctly identified two-thirds of…
A dictionary learning approach for human sperm heads classification.

PubMed

Shaker, Fariba; Monadjemi, S Amirhassan; Alirezaie, Javad; Naghsh-Nilchi, Ahmad Reza

2017-12-01

To diagnose infertility in men, semen analysis is conducted in which sperm morphology is one of the factors that are evaluated. Since manual assessment of sperm morphology is time-consuming and subjective, automatic classification methods are being developed. Automatic classification of sperm heads is a complicated task due to the intra-class differences and inter-class similarities of class objects. In this research, a Dictionary Learning (DL) technique is utilized to construct a dictionary of sperm head shapes. This dictionary is used to classify the sperm heads into four different classes. Square patches are extracted from the sperm head images. Columnized patches from each class of sperm are used to learn class-specific dictionaries. The patches from a test image are reconstructed using each class-specific dictionary and the overall reconstruction error for each class is used to select the best matching class. Average accuracy, precision, recall, and F-score are used to evaluate the classification method. The method is evaluated using two publicly available datasets of human sperm head shapes. The proposed DL based method achieved an average accuracy of 92.2% on the HuSHeM dataset, and an average recall of 62% on the SCIAN-MorphoSpermGS dataset. The results show a significant improvement compared to a previously published shape-feature-based method. We have achieved high-performance results. In addition, our proposed approach offers a more balanced classifier in which all four classes are recognized with high precision and recall. In this paper, we use a Dictionary Learning approach in classifying human sperm heads. It is shown that the Dictionary Learning method is far more effective in classifying human sperm heads than classifiers using shape-based features. Also, a dataset of human sperm head shapes is introduced to facilitate future research. Copyright © 2017 Elsevier Ltd. All rights reserved.
Privacy-Preserving Evaluation of Generalization Error and Its Application to Model and Attribute Selection

NASA Astrophysics Data System (ADS)

Sakuma, Jun; Wright, Rebecca N.

Privacy-preserving classification is the task of learning or training a classifier on the union of privately distributed datasets without sharing the datasets. The emphasis of existing studies in privacy-preserving classification has primarily been put on the design of privacy-preserving versions of particular data mining algorithms, However, in classification problems, preprocessing and postprocessing— such as model selection or attribute selection—play a prominent role in achieving higher classification accuracy. In this paper, we show generalization error of classifiers in privacy-preserving classification can be securely evaluated without sharing prediction results. Our main technical contribution is a new generalized Hamming distance protocol that is universally applicable to preprocessing and postprocessing of various privacy-preserving classification problems, such as model selection in support vector machine and attribute selection in naive Bayes classification.
Information analysis of a spatial database for ecological land classification

NASA Technical Reports Server (NTRS)

Davis, Frank W.; Dozier, Jeff

1990-01-01

An ecological land classification was developed for a complex region in southern California using geographic information system techniques of map overlay and contingency table analysis. Land classes were identified by mutual information analysis of vegetation pattern in relation to other mapped environmental variables. The analysis was weakened by map errors, especially errors in the digital elevation data. Nevertheless, the resulting land classification was ecologically reasonable and performed well when tested with higher quality data from the region.
Classification of radiological errors in chest radiographs, using support vector machine on the spatial frequency features of false- negative and false-positive regions

NASA Astrophysics Data System (ADS)

Pietrzyk, Mariusz W.; Donovan, Tim; Brennan, Patrick C.; Dix, Alan; Manning, David J.

2011-03-01

Aim: To optimize automated classification of radiological errors during lung nodule detection from chest radiographs (CxR) using a support vector machine (SVM) run on the spatial frequency features extracted from the local background of selected regions. Background: The majority of the unreported pulmonary nodules are visually detected but not recognized; shown by the prolonged dwell time values at false-negative regions. Similarly, overestimated nodule locations are capturing substantial amounts of foveal attention. Spatial frequency properties of selected local backgrounds are correlated with human observer responses either in terms of accuracy in indicating abnormality position or in the precision of visual sampling the medical images. Methods: Seven radiologists participated in the eye tracking experiments conducted under conditions of pulmonary nodule detection from a set of 20 postero-anterior CxR. The most dwelled locations have been identified and subjected to spatial frequency (SF) analysis. The image-based features of selected ROI were extracted with un-decimated Wavelet Packet Transform. An analysis of variance was run to select SF features and a SVM schema was implemented to classify False-Negative and False-Positive from all ROI. Results: A relative high overall accuracy was obtained for each individually developed Wavelet-SVM algorithm, with over 90% average correct ratio for errors recognition from all prolonged dwell locations. Conclusion: The preliminary results show that combined eye-tracking and image-based features can be used for automated detection of radiological error with SVM. The work is still in progress and not all analytical procedures have been completed, which might have an effect on the specificity of the algorithm.
A new interferential multispectral image compression algorithm based on adaptive classification and curve-fitting

NASA Astrophysics Data System (ADS)

Wang, Ke-Yan; Li, Yun-Song; Liu, Kai; Wu, Cheng-Ke

2008-08-01

A novel compression algorithm for interferential multispectral images based on adaptive classification and curve-fitting is proposed. The image is first partitioned adaptively into major-interference region and minor-interference region. Different approximating functions are then constructed for two kinds of regions respectively. For the major interference region, some typical interferential curves are selected to predict other curves. These typical curves are then processed by curve-fitting method. For the minor interference region, the data of each interferential curve are independently approximated. Finally the approximating errors of two regions are entropy coded. The experimental results show that, compared with JPEG2000, the proposed algorithm not only decreases the average output bit-rate by about 0.2 bit/pixel for lossless compression, but also improves the reconstructed images and reduces the spectral distortion greatly, especially at high bit-rate for lossy compression.
Evaluation of algorithms for estimating wheat acreage from multispectral scanner data. [Kansas and Texas

NASA Technical Reports Server (NTRS)

Nalepka, R. F. (Principal Investigator); Richardson, W.; Pentland, A. P.

1976-01-01

The author has identified the following significant results. Fourteen different classification algorithms were tested for their ability to estimate the proportion of wheat in an area. For some algorithms, accuracy of classification in field centers was observed. The data base consisted of ground truth and LANDSAT data from 55 sections (1 x 1 mile) from five LACIE intensive test sites in Kansas and Texas. Signatures obtained from training fields selected at random from the ground truth were generally representative of the data distribution patterns. LIMMIX, an algorithm that chooses a pure signature when the data point is close enough to a signature mean and otherwise chooses the best mixture of a pair of signatures, reduced the average absolute error to 6.1% and the bias to 1.0%. QRULE run with a null test achieved a similar reduction.
Spatial methods for deriving crop rotation history

NASA Astrophysics Data System (ADS)

Mueller-Warrant, George W.; Trippe, Kristin M.; Whittaker, Gerald W.; Anderson, Nicole P.; Sullivan, Clare S.

2017-08-01

Benefits of converting 11 years of remote sensing classification data into cropping history of agricultural fields included measuring lengths of rotation cycles and identifying specific sequences of intervening crops grown between final years of old grass seed stands and establishment of new ones. Spatial and non-spatial methods were complementary. Individual-year classification errors were often correctable in spreadsheet-based non-spatial analysis, whereas their presence in spatial data generally led to exclusion of fields from further analysis. Markov-model testing of non-spatial data revealed that year-to-year cropping sequences did not match average frequencies for transitions among crops grown in western Oregon, implying that rotations into new grass seed stands were influenced by growers' desires to achieve specific objectives. Moran's I spatial analysis of length of time between consecutive grass seed stands revealed that clustering of fields was relatively uncommon, with high and low value clusters only accounting for 7.1 and 6.2% of fields.
Selecting a Classification Ensemble and Detecting Process Drift in an Evolving Data Stream

DOE Office of Scientific and Technical Information (OSTI.GOV)

Heredia-Langner, Alejandro; Rodriguez, Luke R.; Lin, Andy

2015-09-30

We characterize the commercial behavior of a group of companies in a common line of business using a small ensemble of classifiers on a stream of records containing commercial activity information. This approach is able to effectively find a subset of classifiers that can be used to predict company labels with reasonable accuracy. Performance of the ensemble, its error rate under stable conditions, can be characterized using an exponentially weighted moving average (EWMA) statistic. The behavior of the EWMA statistic can be used to monitor a record stream from the commercial network and determine when significant changes have occurred. Resultsmore » indicate that larger classification ensembles may not necessarily be optimal, pointing to the need to search the combinatorial classifier space in a systematic way. Results also show that current and past performance of an ensemble can be used to detect when statistically significant changes in the activity of the network have occurred. The dataset used in this work contains tens of thousands of high level commercial activity records with continuous and categorical variables and hundreds of labels, making classification challenging.« less
Structural MRI-based detection of Alzheimer's disease using feature ranking and classification error.

PubMed

Beheshti, Iman; Demirel, Hasan; Farokhian, Farnaz; Yang, Chunlan; Matsuda, Hiroshi

2016-12-01

This paper presents an automatic computer-aided diagnosis (CAD) system based on feature ranking for detection of Alzheimer's disease (AD) using structural magnetic resonance imaging (sMRI) data. The proposed CAD system is composed of four systematic stages. First, global and local differences in the gray matter (GM) of AD patients compared to the GM of healthy controls (HCs) are analyzed using a voxel-based morphometry technique. The aim is to identify significant local differences in the volume of GM as volumes of interests (VOIs). Second, the voxel intensity values of the VOIs are extracted as raw features. Third, the raw features are ranked using a seven-feature ranking method, namely, statistical dependency (SD), mutual information (MI), information gain (IG), Pearson's correlation coefficient (PCC), t-test score (TS), Fisher's criterion (FC), and the Gini index (GI). The features with higher scores are more discriminative. To determine the number of top features, the estimated classification error based on training set made up of the AD and HC groups is calculated, with the vector size that minimized this error selected as the top discriminative feature. Fourth, the classification is performed using a support vector machine (SVM). In addition, a data fusion approach among feature ranking methods is introduced to improve the classification performance. The proposed method is evaluated using a data-set from ADNI (130 AD and 130 HC) with 10-fold cross-validation. The classification accuracy of the proposed automatic system for the diagnosis of AD is up to 92.48% using the sMRI data. An automatic CAD system for the classification of AD based on feature-ranking method and classification errors is proposed. In this regard, seven-feature ranking methods (i.e., SD, MI, IG, PCC, TS, FC, and GI) are evaluated. The optimal size of top discriminative features is determined by the classification error estimation in the training phase. The experimental results indicate that the performance of the proposed system is comparative to that of state-of-the-art classification models. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Detecting Pilot's Engagement Using fNIRS Connectivity Features in an Automated vs. Manual Landing Scenario

PubMed Central

Verdière, Kevin J.; Roy, Raphaëlle N.; Dehais, Frédéric

2018-01-01

Monitoring pilot's mental states is a relevant approach to mitigate human error and enhance human machine interaction. A promising brain imaging technique to perform such a continuous measure of human mental state under ecological settings is Functional Near-InfraRed Spectroscopy (fNIRS). However, to our knowledge no study has yet assessed the potential of fNIRS connectivity metrics as long as passive Brain Computer Interfaces (BCI) are concerned. Therefore, we designed an experimental scenario in a realistic simulator in which 12 pilots had to perform landings under two contrasted levels of engagement (manual vs. automated). The collected data were used to benchmark the performance of classical oxygenation features (i.e., Average, Peak, Variance, Skewness, Kurtosis, Area Under the Curve, and Slope) and connectivity features (i.e., Covariance, Pearson's, and Spearman's Correlation, Spectral Coherence, and Wavelet Coherence) to discriminate these two landing conditions. Classification performance was obtained by using a shrinkage Linear Discriminant Analysis (sLDA) and a stratified cross validation using each feature alone or by combining them. Our findings disclosed that the connectivity features performed significantly better than the classical concentration metrics with a higher accuracy for the wavelet coherence (average: 65.3/59.9 %, min: 45.3/45.0, max: 80.5/74.7 computed for HbO/HbR signals respectively). A maximum classification performance was obtained by combining the area under the curve with the wavelet coherence (average: 66.9/61.6 %, min: 57.3/44.8, max: 80.0/81.3 computed for HbO/HbR signals respectively). In a general manner all connectivity measures allowed an efficient classification when computed over HbO signals. Those promising results provide methodological cues for further implementation of fNIRS-based passive BCIs. PMID:29422841
Intelligent quotient estimation of mental retarded people from different psychometric instruments using artificial neural networks.

PubMed

Di Nuovo, Alessandro G; Di Nuovo, Santo; Buono, Serafino

2012-02-01

The estimation of a person's intelligence quotient (IQ) by means of psychometric tests is indispensable in the application of psychological assessment to several fields. When complex tests as the Wechsler scales, which are the most commonly used and universally recognized parameter for the diagnosis of degrees of retardation, are not applicable, it is necessary to use other psycho-diagnostic tools more suited for the subject's specific condition. But to ensure a homogeneous diagnosis it is necessary to reach a common metric, thus, the aim of our work is to build models able to estimate accurately and reliably the Wechsler IQ, starting from different psycho-diagnostic tools. Four different psychometric tests (Leiter international performance scale; coloured progressive matrices test; the mental development scale; psycho educational profile), along with the Wechsler scale, were administered to a group of 40 mentally retarded subjects, with various pathologies, and control persons. The obtained database is used to evaluate Wechsler IQ estimation models starting from the scores obtained in the other tests. Five modelling methods, two statistical and three from machine learning, that belong to the family of artificial neural networks (ANNs) are employed to build the estimator. Several error metrics for estimated IQ and for retardation level classification are defined to compare the performance of the various models with univariate and multivariate analyses. Eight empirical studies show that, after ten-fold cross-validation, best average estimation error is of 3.37 IQ points and mental retardation level classification error of 7.5%. Furthermore our experiments prove the superior performance of ANN methods over statistical regression ones, because in all cases considered ANN models show the lowest estimation error (from 0.12 to 0.9 IQ points) and the lowest classification error (from 2.5% to 10%). Since the estimation performance is better than the confidence interval of Wechsler scales (five IQ points), we consider models built very accurate and reliable and they can be used into help clinical diagnosis. Therefore a computer software based on the results of our work is currently used in a clinical center and empirical trails confirm its validity. Furthermore positive results in our multivariate studies suggest new approaches for clinicians. Copyright © 2011 Elsevier B.V. All rights reserved.
Single Versus Multiple Events Error Potential Detection in a BCI-Controlled Car Game With Continuous and Discrete Feedback.

PubMed

Kreilinger, Alex; Hiebel, Hannah; Müller-Putz, Gernot R

2016-03-01

This work aimed to find and evaluate a new method for detecting errors in continuous brain-computer interface (BCI) applications. Instead of classifying errors on a single-trial basis, the new method was based on multiple events (MEs) analysis to increase the accuracy of error detection. In a BCI-driven car game, based on motor imagery (MI), discrete events were triggered whenever subjects collided with coins and/or barriers. Coins counted as correct events, whereas barriers were errors. This new method, termed ME method, combined and averaged the classification results of single events (SEs) and determined the correctness of MI trials, which consisted of event sequences instead of SEs. The benefit of this method was evaluated in an offline simulation. In an online experiment, the new method was used to detect erroneous MI trials. Such MI trials were discarded and could be repeated by the users. We found that, even with low SE error potential (ErrP) detection rates, feasible accuracies can be achieved when combining MEs to distinguish erroneous from correct MI trials. Online, all subjects reached higher scores with error detection than without, at the cost of longer times needed for completing the game. Findings suggest that ErrP detection may become a reliable tool for monitoring continuous states in BCI applications when combining MEs. This paper demonstrates a novel technique for detecting errors in online continuous BCI applications, which yields promising results even with low single-trial detection rates.
Documentation of procedures for textural/spatial pattern recognition techniques

NASA Technical Reports Server (NTRS)

Haralick, R. M.; Bryant, W. F.

1976-01-01

A C-130 aircraft was flown over the Sam Houston National Forest on March 21, 1973 at 10,000 feet altitude to collect multispectral scanner (MSS) data. Existing textural and spatial automatic processing techniques were used to classify the MSS imagery into specified timber categories. Several classification experiments were performed on this data using features selected from the spectral bands and a textural transform band. The results indicate that (1) spatial post-processing a classified image can cut the classification error to 1/2 or 1/3 of its initial value, (2) spatial post-processing the classified image using combined spectral and textural features produces a resulting image with less error than post-processing a classified image using only spectral features and (3) classification without spatial post processing using the combined spectral textural features tends to produce about the same error rate as a classification without spatial post processing using only spectral features.
Class-specific Error Bounds for Ensemble Classifiers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prenger, R; Lemmond, T; Varshney, K

2009-10-06

The generalization error, or probability of misclassification, of ensemble classifiers has been shown to be bounded above by a function of the mean correlation between the constituent (i.e., base) classifiers and their average strength. This bound suggests that increasing the strength and/or decreasing the correlation of an ensemble's base classifiers may yield improved performance under the assumption of equal error costs. However, this and other existing bounds do not directly address application spaces in which error costs are inherently unequal. For applications involving binary classification, Receiver Operating Characteristic (ROC) curves, performance curves that explicitly trade off false alarms and missedmore » detections, are often utilized to support decision making. To address performance optimization in this context, we have developed a lower bound for the entire ROC curve that can be expressed in terms of the class-specific strength and correlation of the base classifiers. We present empirical analyses demonstrating the efficacy of these bounds in predicting relative classifier performance. In addition, we specify performance regions of the ROC curve that are naturally delineated by the class-specific strengths of the base classifiers and show that each of these regions can be associated with a unique set of guidelines for performance optimization of binary classifiers within unequal error cost regimes.« less
Selecting a restoration technique to minimize OCR error.

PubMed

Cannon, M; Fugate, M; Hush, D R; Scovel, C

2003-01-01

This paper introduces a learning problem related to the task of converting printed documents to ASCII text files. The goal of the learning procedure is to produce a function that maps documents to restoration techniques in such a way that on average the restored documents have minimum optical character recognition error. We derive a general form for the optimal function and use it to motivate the development of a nonparametric method based on nearest neighbors. We also develop a direct method of solution based on empirical error minimization for which we prove a finite sample bound on estimation error that is independent of distribution. We show that this empirical error minimization problem is an extension of the empirical optimization problem for traditional M-class classification with general loss function and prove computational hardness for this problem. We then derive a simple iterative algorithm called generalized multiclass ratchet (GMR) and prove that it produces an optimal function asymptotically (with probability 1). To obtain the GMR algorithm we introduce a new data map that extends Kesler's construction for the multiclass problem and then apply an algorithm called Ratchet to this mapped data, where Ratchet is a modification of the Pocket algorithm . Finally, we apply these methods to a collection of documents and report on the experimental results.
Sequential Probability Ratio Testing with Power Projective Base Method Improves Decision-Making for BCI

PubMed Central

Liu, Rong

2017-01-01

Obtaining a fast and reliable decision is an important issue in brain-computer interfaces (BCI), particularly in practical real-time applications such as wheelchair or neuroprosthetic control. In this study, the EEG signals were firstly analyzed with a power projective base method. Then we were applied a decision-making model, the sequential probability ratio testing (SPRT), for single-trial classification of motor imagery movement events. The unique strength of this proposed classification method lies in its accumulative process, which increases the discriminative power as more and more evidence is observed over time. The properties of the method were illustrated on thirteen subjects' recordings from three datasets. Results showed that our proposed power projective method outperformed two benchmark methods for every subject. Moreover, with sequential classifier, the accuracies across subjects were significantly higher than that with nonsequential ones. The average maximum accuracy of the SPRT method was 84.1%, as compared with 82.3% accuracy for the sequential Bayesian (SB) method. The proposed SPRT method provides an explicit relationship between stopping time, thresholds, and error, which is important for balancing the time-accuracy trade-off. These results suggest SPRT would be useful in speeding up decision-making while trading off errors in BCI. PMID:29348781
Automated Grouping of Opportunity Rover Alpha Particle X-Ray Spectrometer Compositional Data

NASA Technical Reports Server (NTRS)

VanBommel, S. J.; Gellert, R.; Clark, B. C.; Ming, D. W.; Mittlefehldt, D. W.; Schroder, C.; Yen, A. S.

2016-01-01

The Alpha Particle X-ray Spectrometer (APXS) conducts high-precision in situ measurements of rocks and soils on both active NASA Mars rovers. Since 2004 the rover Opportunity has acquired around 440 unique APXS measurements, including a wide variety of compositions, during its 42+ kilometers traverse across several geological formations. Here we discuss an analytical comparison algorithm providing a means to cluster samples due to compositional similarity and the resulting automated classification scheme. Due to the inherent variance of elements in the APXS data set, each element has an associated weight that is inversely proportional to the variance. Thus, the more consistent the abundance of an element in the data set, the more it contributes to the classification. All 16 elements standard to the APXS data set are considered. Careful attention is also given to the errors associated with the composition measured by the APXS - larger uncertainties reduce the weighting of the element accordingly. The comparison of two targets, i and j, generates a similarity score, S(sub ij). This score is immediately comparable to an average ratio across all elements if one assumes standard weighted uncertainty. The algorithm facilitates the classification of APXS targets by chemistry alone - independent of target appearance and geological context which can be added later as a consistency check. For the N targets considered, a N by N hollow matrix, S, is generated where S = S(sup T). The average relation score, S(sub av), for target N(sub i) is simply the average of column i of S. A large S(sub av) is indicative of a unique sample. In such an instance any targets with a low comparison score can be classified alike. The threshold between classes requires careful consideration. Applying the algorithm to recent Marathon Valley targets indicates similarities with Burns formation and average-Mars-like rocks encountered earlier at Endeavour Crater as well as a new class of felsic rocks.

Human error analysis of commercial aviation accidents: application of the Human Factors Analysis and Classification system (HFACS).

PubMed

Wiegmann, D A; Shappell, S A

2001-11-01

The Human Factors Analysis and Classification System (HFACS) is a general human error framework originally developed and tested within the U.S. military as a tool for investigating and analyzing the human causes of aviation accidents. Based on Reason's (1990) model of latent and active failures, HFACS addresses human error at all levels of the system, including the condition of aircrew and organizational factors. The purpose of the present study was to assess the utility of the HFACS framework as an error analysis and classification tool outside the military. The HFACS framework was used to analyze human error data associated with aircrew-related commercial aviation accidents that occurred between January 1990 and December 1996 using database records maintained by the NTSB and the FAA. Investigators were able to reliably accommodate all the human causal factors associated with the commercial aviation accidents examined in this study using the HFACS system. In addition, the classification of data using HFACS highlighted several critical safety issues in need of intervention research. These results demonstrate that the HFACS framework can be a viable tool for use within the civil aviation arena. However, additional research is needed to examine its applicability to areas outside the flight deck, such as aircraft maintenance and air traffic control domains.
Automatic intelligibility classification of sentence-level pathological speech

PubMed Central

Kim, Jangwon; Kumar, Naveen; Tsiartas, Andreas; Li, Ming; Narayanan, Shrikanth S.

2014-01-01

Pathological speech usually refers to the condition of speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. Although automatic evaluation of speech intelligibility and quality could come in handy in these scenarios to assist experts in diagnosis and treatment design, the many sources and types of variability often make it a very challenging computational processing problem. In this work we propose novel sentence-level features to capture abnormal variation in the prosodic, voice quality and pronunciation aspects in pathological speech. In addition, we propose a post-classification posterior smoothing scheme which refines the posterior of a test sample based on the posteriors of other test samples. Finally, we perform feature-level fusions and subsystem decision fusion for arriving at a final intelligibility decision. The performances are tested on two pathological speech datasets, the NKI CCRT Speech Corpus (advanced head and neck cancer) and the TORGO database (cerebral palsy or amyotrophic lateral sclerosis), by evaluating classification accuracy without overlapping subjects’ data among training and test partitions. Results show that the feature sets of each of the voice quality subsystem, prosodic subsystem, and pronunciation subsystem, offer significant discriminating power for binary intelligibility classification. We observe that the proposed posterior smoothing in the acoustic space can further reduce classification errors. The smoothed posterior score fusion of subsystems shows the best classification performance (73.5% for unweighted, and 72.8% for weighted, average recalls of the binary classes). PMID:25414544
Determination of optimum threshold values for EMG time domain features; a multi-dataset investigation

NASA Astrophysics Data System (ADS)

Nlandu Kamavuako, Ernest; Scheme, Erik Justin; Englehart, Kevin Brian

2016-08-01

Objective. For over two decades, Hudgins’ set of time domain features have extensively been applied for classification of hand motions. The calculation of slope sign change and zero crossing features uses a threshold to attenuate the effect of background noise. However, there is no consensus on the optimum threshold value. In this study, we investigate for the first time the effect of threshold selection on the feature space and classification accuracy using multiple datasets. Approach. In the first part, four datasets were used, and classification error (CE), separability index, scatter matrix separability criterion, and cardinality of the features were used as performance measures. In the second part, data from eight classes were collected during two separate days with two days in between from eight able-bodied subjects. The threshold for each feature was computed as a factor (R = 0:0.01:4) times the average root mean square of data during rest. For each day, we quantified CE for R = 0 (CEr0) and minimum error (CEbest). Moreover, a cross day threshold validation was applied where, for example, CE of day two (CEodt) is computed based on optimum threshold from day one and vice versa. Finally, we quantified the effect of the threshold when using training data from one day and test data of the other. Main results. All performance metrics generally degraded with increasing threshold values. On average, CEbest (5.26 ± 2.42%) was significantly better than CEr0 (7.51 ± 2.41%, P = 0.018), and CEodt (7.50 ± 2.50%, P = 0.021). During the two-fold validation between days, CEbest performed similar to CEr0. Interestingly, when using the threshold values optimized per subject from day one and day two respectively, on the cross-days classification, the performance decreased. Significance. We have demonstrated that threshold value has a strong impact on the feature space and that an optimum threshold can be quantified. However, this optimum threshold is highly data and subject driven and thus do not generalize well. There is a strong evidence that R = 0 provides a good trade-off between system performance and generalization. These findings are important for practical use of pattern recognition based myoelectric control.
Determination of optimum threshold values for EMG time domain features; a multi-dataset investigation.

PubMed

Kamavuako, Ernest Nlandu; Scheme, Erik Justin; Englehart, Kevin Brian

2016-08-01

For over two decades, Hudgins' set of time domain features have extensively been applied for classification of hand motions. The calculation of slope sign change and zero crossing features uses a threshold to attenuate the effect of background noise. However, there is no consensus on the optimum threshold value. In this study, we investigate for the first time the effect of threshold selection on the feature space and classification accuracy using multiple datasets. In the first part, four datasets were used, and classification error (CE), separability index, scatter matrix separability criterion, and cardinality of the features were used as performance measures. In the second part, data from eight classes were collected during two separate days with two days in between from eight able-bodied subjects. The threshold for each feature was computed as a factor (R = 0:0.01:4) times the average root mean square of data during rest. For each day, we quantified CE for R = 0 (CEr0) and minimum error (CEbest). Moreover, a cross day threshold validation was applied where, for example, CE of day two (CEodt) is computed based on optimum threshold from day one and vice versa. Finally, we quantified the effect of the threshold when using training data from one day and test data of the other. All performance metrics generally degraded with increasing threshold values. On average, CEbest (5.26 ± 2.42%) was significantly better than CEr0 (7.51 ± 2.41%, P = 0.018), and CEodt (7.50 ± 2.50%, P = 0.021). During the two-fold validation between days, CEbest performed similar to CEr0. Interestingly, when using the threshold values optimized per subject from day one and day two respectively, on the cross-days classification, the performance decreased. We have demonstrated that threshold value has a strong impact on the feature space and that an optimum threshold can be quantified. However, this optimum threshold is highly data and subject driven and thus do not generalize well. There is a strong evidence that R = 0 provides a good trade-off between system performance and generalization. These findings are important for practical use of pattern recognition based myoelectric control.
Satellite Sampling and Retrieval Errors in Regional Monthly Rain Estimates from TMI AMSR-E, SSM/I, AMSU-B and the TRMM PR

NASA Technical Reports Server (NTRS)

Fisher, Brad; Wolff, David B.

2010-01-01

Passive and active microwave rain sensors onboard earth-orbiting satellites estimate monthly rainfall from the instantaneous rain statistics collected during satellite overpasses. It is well known that climate-scale rain estimates from meteorological satellites incur sampling errors resulting from the process of discrete temporal sampling and statistical averaging. Sampling and retrieval errors ultimately become entangled in the estimation of the mean monthly rain rate. The sampling component of the error budget effectively introduces statistical noise into climate-scale rain estimates that obscure the error component associated with the instantaneous rain retrieval. Estimating the accuracy of the retrievals on monthly scales therefore necessitates a decomposition of the total error budget into sampling and retrieval error quantities. This paper presents results from a statistical evaluation of the sampling and retrieval errors for five different space-borne rain sensors on board nine orbiting satellites. Using an error decomposition methodology developed by one of the authors, sampling and retrieval errors were estimated at 0.25 resolution within 150 km of ground-based weather radars located at Kwajalein, Marshall Islands and Melbourne, Florida. Error and bias statistics were calculated according to the land, ocean and coast classifications of the surface terrain mask developed for the Goddard Profiling (GPROF) rain algorithm. Variations in the comparative error statistics are attributed to various factors related to differences in the swath geometry of each rain sensor, the orbital and instrument characteristics of the satellite and the regional climatology. The most significant result from this study found that each of the satellites incurred negative longterm oceanic retrieval biases of 10 to 30%.
Development of an FBG Sensor Array for Multi-Impact Source Localization on CFRP Structures.

PubMed

Jiang, Mingshun; Sai, Yaozhang; Geng, Xiangyi; Sui, Qingmei; Liu, Xiaohui; Jia, Lei

2016-10-24

We proposed and studied an impact detection system based on a fiber Bragg grating (FBG) sensor array and multiple signal classification (MUSIC) algorithm to determine the location and the number of low velocity impacts on a carbon fiber-reinforced polymer (CFRP) plate. A FBG linear array, consisting of seven FBG sensors, was used for detecting the ultrasonic signals from impacts. The edge-filter method was employed for signal demodulation. Shannon wavelet transform was used to extract narrow band signals from the impacts. The Gerschgorin disc theorem was used for estimating the number of impacts. We used the MUSIC algorithm to obtain the coordinates of multi-impacts. The impact detection system was tested on a 500 mm × 500 mm × 1.5 mm CFRP plate. The results show that the maximum error and average error of the multi-impacts' localization are 9.2 mm and 7.4 mm, respectively.
A multisensor approach to sea ice classification for the validation of DMSP-SSM/I passive microwave derived sea ice products

NASA Technical Reports Server (NTRS)

Steffen, K.; Schweiger, A. J.

1990-01-01

The validation of sea ice products derived from the Special Sensor Microwave Imager (SSM/I) on board a DMSP platform is examined using data from the Landsat MSS and NOAA-AVHRR sensors. Image processing techniques for retrieving ice concentrations from each type of imagery are developed and results are intercompared to determine the ice parameter retrieval accuracy of the SSM/I NASA-Team algorithm. For case studies in the Beaufort Sea and East Greenland Sea, average retrieval errors of the SSM/I algorithm are between 1.7 percent for spring conditions and 4.3 percent during freeze up in comparison with Landsat derived ice concentrations. For a case study in the East Greenland Sea, SSM/I derived ice concentration in comparison with AVHRR imagery display a mean error of 9.6 percent.
Practical Procedures for Constructing Mastery Tests to Minimize Errors of Classification and to Maximize or Optimize Decision Reliability.

ERIC Educational Resources Information Center

Byars, Alvin Gregg

The objectives of this investigation are to develop, describe, assess, and demonstrate procedures for constructing mastery tests to minimize errors of classification and to maximize decision reliability. The guidelines are based on conditions where item exchangeability is a reasonable assumption and the test constructor can control the number of…
A Framework of Temporal-Spatial Descriptors-Based Feature Extraction for Improved Myoelectric Pattern Recognition.

PubMed

Khushaba, Rami N; Al-Timemy, Ali H; Al-Ani, Ahmed; Al-Jumaily, Adel

2017-10-01

The extraction of the accurate and efficient descriptors of muscular activity plays an important role in tackling the challenging problem of myoelectric control of powered prostheses. In this paper, we present a new feature extraction framework that aims to give an enhanced representation of muscular activities through increasing the amount of information that can be extracted from individual and combined electromyogram (EMG) channels. We propose to use time-domain descriptors (TDDs) in estimating the EMG signal power spectrum characteristics; a step that preserves the computational power required for the construction of spectral features. Subsequently, TDD is used in a process that involves: 1) representing the temporal evolution of the EMG signals by progressively tracking the correlation between the TDD extracted from each analysis time window and a nonlinearly mapped version of it across the same EMG channel and 2) representing the spatial coherence between the different EMG channels, which is achieved by calculating the correlation between the TDD extracted from the differences of all possible combinations of pairs of channels and their nonlinearly mapped versions. The proposed temporal-spatial descriptors (TSDs) are validated on multiple sparse and high-density (HD) EMG data sets collected from a number of intact-limbed and amputees performing a large number of hand and finger movements. Classification results showed significant reductions in the achieved error rates in comparison to other methods, with the improvement of at least 8% on average across all subjects. Additionally, the proposed TSDs achieved significantly well in problems with HD-EMG with average classification errors of <5% across all subjects using windows lengths of 50 ms only.
Three-column classification and Schatzker classification: a three- and two-dimensional computed tomography characterisation and analysis of tibial plateau fractures.

PubMed

Patange Subba Rao, Sheethal Prasad; Lewis, James; Haddad, Ziad; Paringe, Vishal; Mohanty, Khitish

2014-10-01

The aim of the study was to evaluate inter-observer reliability and intra-observer reproducibility between the three-column classification and Schatzker classification systems using 2D and 3D CT models. Fifty-two consecutive patients with tibial plateau fractures were evaluated by five orthopaedic surgeons. All patients were classified into Schatzker and three-column classification systems using x-rays and 2D and 3D CT images. The inter-observer reliability was evaluated in the first round and the intra-observer reliability was determined during the second round 2 weeks later. The average intra-observer reproducibility for the three-column classification was from substantial to excellent in all sub classifications, as compared with Schatzker classification. The inter-observer kappa values increased from substantial to excellent in three-column classification and to moderate in Schatzker classification The average values for three-column classification for all the categories are as follows: (I-III) k2D = 0.718, 95% CI 0.554-0.864, p < 0.0001 and average 3D = 0.874, 95% CI 0.754-0.890, p < 0.0001. For Schatzker classification system, the average values for all six categories are as follows: (I-VI) k2D = 0.536, 95% CI 0.365-0.685, p < 0.0001 and average k3D = 0.552 95% CI 0.405-0.700, p < 0.0001. The values are statistically significant. Statistically significant inter-observer values in both rounds were noted with the three-column classification, making it statistically an excellent agreement. The intra-observer reproducibility for the three-column classification improved as compared with the Schatzker classification. The three-column classification seems to be an effective way to characterise and classify fractures of tibial plateau.
Comparing K-mer based methods for improved classification of 16S sequences.

PubMed

Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars

2015-07-01

The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison on five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other both in data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length. The difference in classification error obtained by the methods seemed to be small, but they were stable and for both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all the five methods reached an error-plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau indicating improved training data is needed to improve classification from here. Classification errors occur most frequent for genera with few sequences present. For improving the taxonomy and testing new classification methods, the need for a better and more universal and robust training data set is crucial.
An experimental evaluation of the incidence of fitness-function/search-algorithm combinations on the classification performance of myoelectric control systems with iPCA tuning

PubMed Central

2013-01-01

Background The information of electromyographic signals can be used by Myoelectric Control Systems (MCSs) to actuate prostheses. These devices allow the performing of movements that cannot be carried out by persons with amputated limbs. The state of the art in the development of MCSs is based on the use of individual principal component analysis (iPCA) as a stage of pre-processing of the classifiers. The iPCA pre-processing implies an optimization stage which has not yet been deeply explored. Methods The present study considers two factors in the iPCA stage: namely A (the fitness function), and B (the search algorithm). The A factor comprises two levels, namely A1 (the classification error) and A2 (the correlation factor). Otherwise, the B factor has four levels, specifically B1 (the Sequential Forward Selection, SFS), B2 (the Sequential Floating Forward Selection, SFFS), B3 (Artificial Bee Colony, ABC), and B4 (Particle Swarm Optimization, PSO). This work evaluates the incidence of each one of the eight possible combinations between A and B factors over the classification error of the MCS. Results A two factor ANOVA was performed on the computed classification errors and determined that: (1) the interactive effects over the classification error are not significative (F0.01,3,72 = 4.0659 > f AB = 0.09), (2) the levels of factor A have significative effects on the classification error (F0.02,1,72 = 5.0162 < f A = 6.56), and (3) the levels of factor B over the classification error are not significative (F0.01,3,72 = 4.0659 > f B = 0.08). Conclusions Considering the classification performance we found a superiority of using the factor A2 in combination with any of the levels of factor B. With respect to the time performance the analysis suggests that the PSO algorithm is at least 14 percent better than its best competitor. The latter behavior has been observed for a particular configuration set of parameters in the search algorithms. Future works will investigate the effect of these parameters in the classification performance, such as length of the reduced size vector, number of particles and bees used during optimal search, the cognitive parameters in the PSO algorithm as well as the limit of cycles to improve a solution in the ABC algorithm. PMID:24369728
Exploring human error in military aviation flight safety events using post-incident classification systems.

PubMed

Hooper, Brionny J; O'Hare, David P A

2013-08-01

Human error classification systems theoretically allow researchers to analyze postaccident data in an objective and consistent manner. The Human Factors Analysis and Classification System (HFACS) framework is one such practical analysis tool that has been widely used to classify human error in aviation. The Cognitive Error Taxonomy (CET) is another. It has been postulated that the focus on interrelationships within HFACS can facilitate the identification of the underlying causes of pilot error. The CET provides increased granularity at the level of unsafe acts. The aim was to analyze the influence of factors at higher organizational levels on the unsafe acts of front-line operators and to compare the errors of fixed-wing and rotary-wing operations. This study analyzed 288 aircraft incidents involving human error from an Australasian military organization occurring between 2001 and 2008. Action errors accounted for almost twice (44%) the proportion of rotary wing compared to fixed wing (23%) incidents. Both classificatory systems showed significant relationships between precursor factors such as the physical environment, mental and physiological states, crew resource management, training and personal readiness, and skill-based, but not decision-based, acts. The CET analysis showed different predisposing factors for different aspects of skill-based behaviors. Skill-based errors in military operations are more prevalent in rotary wing incidents and are related to higher level supervisory processes in the organization. The Cognitive Error Taxonomy provides increased granularity to HFACS analyses of unsafe acts.
Measuring the relative extent of pulmonary infiltrates by hierarchical classification of patient-specific image features

NASA Astrophysics Data System (ADS)

Tsevas, S.; Iakovidis, D. K.

2011-11-01

Pulmonary infiltrates are common radiological findings indicating the filling of airspaces with fluid, inflammatory exudates, or cells. They are most common in cases of pneumonia, acute respiratory syndrome, atelectasis, pulmonary oedema and haemorrhage, whereas their extent is usually correlated with the extent or the severity of the underlying disease. In this paper we propose a novel pattern recognition framework for the measurement of the extent of pulmonary infiltrates in routine chest radiographs. The proposed framework follows a hierarchical approach to the assessment of image content. It includes the following: (a) sampling of the lung fields; (b) extraction of patient-specific grey-level histogram signatures from each sample; (c) classification of the extracted signatures into classes representing normal lung parenchyma and pulmonary infiltrates; (d) the samples for which the probability of belonging to one of the two classes does not reach an acceptable level are rejected and classified according to their textural content; (e) merging of the classification results of the two classification stages. The proposed framework has been evaluated on real radiographic images with pulmonary infiltrates caused by bacterial infections. The results show that accurate measurements of the infiltration areas can be obtained with respect to each lung field area. The average measurement error rate on the considered dataset reached 9.7% ± 1.0%.
How should children with speech sound disorders be classified? A review and critical evaluation of current classification systems.

PubMed

Waring, R; Knight, R

2013-01-01

Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of theoretically differing classification systems have been proposed based on either an aetiological (medical) approach, a descriptive-linguistic approach or a processing approach. To describe and review the supporting evidence, and to provide a critical evaluation of the current childhood SSD classification systems. Descriptions of the major specific approaches to classification are reviewed and research papers supporting the reliability and validity of the systems are evaluated. Three specific paediatric SSD classification systems; the aetiologic-based Speech Disorders Classification System, the descriptive-linguistic Differential Diagnosis system, and the processing-based Psycholinguistic Framework are identified as potentially useful in classifying children with SSD into homogeneous subgroups. The Differential Diagnosis system has a growing body of empirical support from clinical population studies, across language error pattern studies and treatment efficacy studies. The Speech Disorders Classification System is currently a research tool with eight proposed subgroups. The Psycholinguistic Framework is a potential bridge to linking cause and surface level speech errors. There is a need for a universally agreed-upon classification system that is useful to clinicians and researchers. The resulting classification system needs to be robust, reliable and valid. A universal classification system would allow for improved tailoring of treatments to subgroups of SSD which may, in turn, lead to improved treatment efficacy. © 2012 Royal College of Speech and Language Therapists.
The application of Aronson's taxonomy to medication errors in nursing.

PubMed

Johnson, Maree; Young, Helen

2011-01-01

Medication administration is a frequent nursing activity that is prone to error. In this study of 318 self-reported medication incidents (including near misses), very few resulted in patient harm-7% required intervention or prolonged hospitalization or caused temporary harm. Aronson's classification system provided an excellent framework for analysis of the incidents with a close connection between the type of error and the change strategy to minimize medication incidents. Taking a behavioral approach to medication error classification has provided helpful strategies for nurses such as nurse-call cards on patient lockers when patients are absent and checking of medication sign-off by outgoing and incoming staff at handover.
Undergraduate paramedic students cannot do drug calculations.

PubMed

Eastwood, Kathryn; Boyle, Malcolm J; Williams, Brett

2012-01-01

Previous investigation of drug calculation skills of qualified paramedics has highlighted poor mathematical ability with no published studies having been undertaken on undergraduate paramedics. There are three major error classifications. Conceptual errors involve an inability to formulate an equation from information given, arithmetical errors involve an inability to operate a given equation, and finally computation errors are simple errors of addition, subtraction, division and multiplication. The objective of this study was to determine if undergraduate paramedics at a large Australia university could accurately perform common drug calculations and basic mathematical equations normally required in the workplace. A cross-sectional study methodology using a paper-based questionnaire was administered to undergraduate paramedic students to collect demographical data, student attitudes regarding their drug calculation performance, and answers to a series of basic mathematical and drug calculation questions. Ethics approval was granted. The mean score of correct answers was 39.5% with one student scoring 100%, 3.3% of students (n=3) scoring greater than 90%, and 63% (n=58) scoring 50% or less, despite 62% (n=57) of the students stating they 'did not have any drug calculations issues'. On average those who completed a minimum of year 12 Specialist Maths achieved scores over 50%. Conceptual errors made up 48.5%, arithmetical 31.1% and computational 17.4%. This study suggests undergraduate paramedics have deficiencies in performing accurate calculations, with conceptual errors indicating a fundamental lack of mathematical understanding. The results suggest an unacceptable level of mathematical competence to practice safely in the unpredictable prehospital environment.
Verb inflection in monolingual Dutch and sequential bilingual Turkish-Dutch children with and without SLI.

PubMed

Blom, Elma; de Jong, Jan; Orgassa, Antje; Baker, Anne; Weerman, Fred

2013-01-01

Both children with specific language impairment (SLI) and children who acquire a second language (L2) make errors with verb inflection. This overlap between SLI and L2 raises the question if verb inflection can discriminate between L2 children with and without SLI. In this study we addressed this question for Dutch. The secondary goal of the study was to investigate variation in error types and error profiles across groups. Data were collected from 6-8-year-old children with SLI who acquire Dutch as their first language (L1), Dutch L1 children with a typical development (TD), Dutch L2 children with SLI, and Dutch L1 TD children who were on average 2 years younger. An experimental elicitation task was employed that tested use of verb inflection; context (3SG, 3PL) was manipulated and word order and verb type were controlled. Accuracy analyses revealed effects of impairment in both L1 and L2 children with SLI. However, individual variation indicated that there is no specific error profile for SLI. Verb inflection use as measured in our study discriminated fairly well in the L1 group but classification was less accurate in the L2 group. Between-group differences emerged furthermore for certain types of errors, but all groups also showed considerable variation in errors and there was not a specific error profile that distinguished SLI from TD. © 2013 Royal College of Speech and Language Therapists.
Center for Seismic Studies Final Technical Report, October 1992 through October 1993

DTIC Science & Technology

1994-02-07

SECURITY CLASSIFICATION 18. SECURITY CLASSIFICATION 19. SECURITY CLASSIFICATION 20. LIMITATION OF ABSTRACT OF REPORT OF THIS PAGE OF ABSTRACT...Upper limit of depth error as a function of mb for estimates based on P and S waves for three netowrks : GSETr-2, ALPHA, and ALPHA + a 50 station...U 4A 4 U 4S as 1 I I I Figure 42: Upper limit of depth error as a function of mb for estimatesbased on P and S waves for three netowrk : GSETT-2o ALPHA
Assimilation of a knowledge base and physical models to reduce errors in passive-microwave classifications of sea ice

NASA Technical Reports Server (NTRS)

Maslanik, J. A.; Key, J.

1992-01-01

An expert system framework has been developed to classify sea ice types using satellite passive microwave data, an operational classification algorithm, spatial and temporal information, ice types estimated from a dynamic-thermodynamic model, output from a neural network that detects the onset of melt, and knowledge about season and region. The rule base imposes boundary conditions upon the ice classification, modifies parameters in the ice algorithm, determines a `confidence' measure for the classified data, and under certain conditions, replaces the algorithm output with model output. Results demonstrate the potential power of such a system for minimizing overall error in the classification and for providing non-expert data users with a means of assessing the usefulness of the classification results for their applications.

Impact of Compounding Error on Strategies for Subtyping Pathogenic Bacteria

PubMed Central

Orfe, Lisa; Davis, Margaret A.; Lafrentz, Stacey; Kang, Min-Su

2008-01-01

Abstract Comparative-omics will identify a multitude of markers that can be used for intraspecific discrimination between strains of bacteria. It seems intuitive that with this plethora of markers we can construct higher resolution subtyping assays using discrete markers to define strain “barcodes.” Unfortunately, with each new marker added to an assay, overall assay robustness declines because errors are compounded exponentially. For example, the difference in accuracy of strain classification for an assay with 60 markers will change from 99.9% to 54.7% when average probe accuracy declines from 99.999% to 99.0%. To illustrate this effect empirically, we constructed a 19 probe bead-array for subtyping Listeria monocytogenes and showed that despite seemingly reliable individual probe accuracy (>97%), our best classification results at the strain level were <75%. A more robust strategy would use as few markers as possible to achieve strain discrimination. Consequently, we developed two variable number of tandem repeat (VNTR) assays (Vibrio parahaemolyticus and L. monocytogenes) and demonstrate that these assays along with a published assay (Salmonella enterica) produce robust results when products were machine scored. The discriminatory ability with four to seven VNTR loci was comparable to pulsed-field gel electrophoresis. Passage experiments showed some instability with ca. 5% of passaged lines showing evidence for new alleles within 30 days (V. parahaemolyticus and S. enterica). Changes were limited to a single locus and allele so conservative rules can be used to determine strain matching. Most importantly, VNTRs appear robust and portable and can clearly discriminate between strains with relatively few loci thereby limiting effects of compounding error. PMID:18713065
Cloud Liquid Water Path Comparisons from Passive Microwave and Solar Reflectance Satellite Measurements: Assessment of Sub-Field-of-View Cloud Effects in Microwave Retrievals

NASA Technical Reports Server (NTRS)

Greenwald, Thomas J.; Christopher, Sundar A.; Chou, Joyce

1997-01-01

Satellite observations of the cloud liquid water path (LWP) are compared from special sensor microwave imager (SSM/I) measurements and GOES 8 imager solar reflectance (SR) measurements to ascertain the impact of sub-field-of-view (FOV) cloud effects on SSM/I 37 GHz retrievals. The SR retrievals also incorporate estimates of the cloud droplet effective radius derived from the GOES 8 3.9-micron channel. The comparisons consist of simultaneous collocated and full-resolution measurements and are limited to nonprecipitating marine stratocumulus in the eastern Pacific for two days in October 1995. The retrievals from these independent methods are consistent for overcast SSM/I FOVS, with RMS differences as low as 0.030 kg/sq m, although biases exist for clouds with more open spatial structure, where the RMS differences increase to 0.039 kg/sq m. For broken cloudiness within the SSM/I FOV the average beam-filling error (BFE) in the microwave retrievals is found to be about 22% (average cloud amount of 73%). This systematic error is comparable with the average random errors in the microwave retrievals. However, even larger BFEs can be expected for individual FOVs and for regions with less cloudiness. By scaling the microwave retrievals by the cloud amount within the FOV, the systematic BFE can be significantly reduced but with increased RMS differences of O.046-0.058 kg/sq m when compared to the SR retrievals. The beam-filling effects reported here are significant and are expected to impact directly upon studies that use instantaneous SSM/I measurements of cloud LWP, such as cloud classification studies and validation studies involving surface-based or in situ data.
A Guide for Setting the Cut-Scores to Minimize Weighted Classification Errors in Test Batteries

ERIC Educational Resources Information Center

Grabovsky, Irina; Wainer, Howard

2017-01-01

In this article, we extend the methodology of the Cut-Score Operating Function that we introduced previously and apply it to a testing scenario with multiple independent components and different testing policies. We derive analytically the overall classification error rate for a test battery under the policy when several retakes are allowed for…
Design of a hybrid model for cardiac arrhythmia classification based on Daubechies wavelet transform.

PubMed

Rajagopal, Rekha; Ranganathan, Vidhyapriya

2018-06-05

Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.
[Classifications in forensic medicine and their logical basis].

PubMed

Kovalev, A V; Shmarov, L A; Ten'kov, A A

2014-01-01

The objective of the present study was to characterize the main requirements for the correct construction of classifications used in forensic medicine, with special reference to the errors that occur in the relevant text-books, guidelines, and manuals and the ways to avoid them. This publication continues the series of thematic articles of the authors devoted to the logical errors in the expert conclusions. The preparation of further publications is underway to report the results of the in-depth analysis of the logical errors encountered in expert conclusions, text-books, guidelines, and manuals.
Sensitivity analysis of the GEMS soil organic carbon model to land cover land use classification uncertainties under different climate scenarios in Senegal

USGS Publications Warehouse

Dieye, A.M.; Roy, David P.; Hanan, N.P.; Liu, S.; Hansen, M.; Toure, A.

2012-01-01

Spatially explicit land cover land use (LCLU) change information is needed to drive biogeochemical models that simulate soil organic carbon (SOC) dynamics. Such information is increasingly being mapped using remotely sensed satellite data with classification schemes and uncertainties constrained by the sensing system, classification algorithms and land cover schemes. In this study, automated LCLU classification of multi-temporal Landsat satellite data were used to assess the sensitivity of SOC modeled by the Global Ensemble Biogeochemical Modeling System (GEMS). The GEMS was run for an area of 1560 km2 in Senegal under three climate change scenarios with LCLU maps generated using different Landsat classification approaches. This research provides a method to estimate the variability of SOC, specifically the SOC uncertainty due to satellite classification errors, which we show is dependent not only on the LCLU classification errors but also on where the LCLU classes occur relative to the other GEMS model inputs.
Evaluating data mining algorithms using molecular dynamics trajectories.

PubMed

Tatsis, Vasileios A; Tjortjis, Christos; Tzirakis, Panagiotis

2013-01-01

Molecular dynamics simulations provide a sample of a molecule's conformational space. Experiments on the mus time scale, resulting in large amounts of data, are nowadays routine. Data mining techniques such as classification provide a way to analyse such data. In this work, we evaluate and compare several classification algorithms using three data sets which resulted from computer simulations, of a potential enzyme mimetic biomolecule. We evaluated 65 classifiers available in the well-known data mining toolkit Weka, using 'classification' errors to assess algorithmic performance. Results suggest that: (i) 'meta' classifiers perform better than the other groups, when applied to molecular dynamics data sets; (ii) Random Forest and Rotation Forest are the best classifiers for all three data sets; and (iii) classification via clustering yields the highest classification error. Our findings are consistent with bibliographic evidence, suggesting a 'roadmap' for dealing with such data.
Classification of DNA nucleotides with transverse tunneling currents

NASA Astrophysics Data System (ADS)

Nyvold Pedersen, Jonas; Boynton, Paul; Di Ventra, Massimiliano; Jauho, Antti-Pekka; Flyvbjerg, Henrik

2017-01-01

It has been theoretically suggested and experimentally demonstrated that fast and low-cost sequencing of DNA, RNA, and peptide molecules might be achieved by passing such molecules between electrodes embedded in a nanochannel. The experimental realization of this scheme faces major challenges, however. In realistic liquid environments, typical currents in tunneling devices are of the order of picoamps. This corresponds to only six electrons per microsecond, and this number affects the integration time required to do current measurements in real experiments. This limits the speed of sequencing, though current fluctuations due to Brownian motion of the molecule average out during the required integration time. Moreover, data acquisition equipment introduces noise, and electronic filters create correlations in time-series data. We discuss how these effects must be included in the analysis of, e.g., the assignment of specific nucleobases to current signals. As the signals from different molecules overlap, unambiguous classification is impossible with a single measurement. We argue that the assignment of molecules to a signal is a standard pattern classification problem and calculation of the error rates is straightforward. The ideas presented here can be extended to other sequencing approaches of current interest.
Shape classification of wear particles by image boundary analysis using machine learning algorithms

NASA Astrophysics Data System (ADS)

Yuan, Wei; Chin, K. S.; Hua, Meng; Dong, Guangneng; Wang, Chunhui

2016-05-01

The shape features of wear particles generated from wear track usually contain plenty of information about the wear states of a machinery operational condition. Techniques to quickly identify types of wear particles quickly to respond to the machine operation and prolong the machine's life appear to be lacking and are yet to be established. To bridge rapid off-line feature recognition with on-line wear mode identification, this paper presents a new radial concave deviation (RCD) method that mainly involves the use of the particle boundary signal to analyze wear particle features. Signal output from the RCDs subsequently facilitates the determination of several other feature parameters, typically relevant to the shape and size of the wear particle. Debris feature and type are identified through the use of various classification methods, such as linear discriminant analysis, quadratic discriminant analysis, naïve Bayesian method, and classification and regression tree method (CART). The average errors of the training and test via ten-fold cross validation suggest CART is a highly suitable approach for classifying and analyzing particle features. Furthermore, the results of the wear debris analysis enable the maintenance team to diagnose faults appropriately.
Evaluation criteria for software classification inventories, accuracies, and maps

NASA Technical Reports Server (NTRS)

Jayroe, R. R., Jr.

1976-01-01

Statistical criteria are presented for modifying the contingency table used to evaluate tabular classification results obtained from remote sensing and ground truth maps. This classification technique contains information on the spatial complexity of the test site, on the relative location of classification errors, on agreement of the classification maps with ground truth maps, and reduces back to the original information normally found in a contingency table.
The search for structure - Object classification in large data sets. [for astronomers

NASA Technical Reports Server (NTRS)

Kurtz, Michael J.

1988-01-01

Research concerning object classifications schemes are reviewed, focusing on large data sets. Classification techniques are discussed, including syntactic, decision theoretic methods, fuzzy techniques, and stochastic and fuzzy grammars. Consideration is given to the automation of MK classification (Morgan and Keenan, 1973) and other problems associated with the classification of spectra. In addition, the classification of galaxies is examined, including the problems of systematic errors, blended objects, galaxy types, and galaxy clusters.
Procedural Error and Task Interruption

DTIC Science & Technology

2016-09-30

red for research on errors and individual differences . Results indicate predictive validity for fluid intelligence and specifi c forms of work...TERMS procedural error, task interruption, individual differences , fluid intelligence, sleep deprivation 16. SECURITY CLASSIFICATION OF: 17...and individual differences . It generates rich data on several kinds of errors, including procedural errors in which steps are skipped or repeated
The Sources of Error in Spanish Writing.

ERIC Educational Resources Information Center

Justicia, Fernando; Defior, Sylvia; Pelegrina, Santiago; Martos, Francisco J.

1999-01-01

Determines the pattern of errors in Spanish spelling. Analyzes and proposes a classification system for the errors made by children in the initial stages of the acquisition of spelling skills. Finds the diverse forms of only 20 Spanish words produces 36% of the spelling errors in Spanish; and substitution is the most frequent type of error. (RS)
A research of selected textural features for detection of asbestos-cement roofing sheets using orthoimages

NASA Astrophysics Data System (ADS)

Książek, Judyta

2015-10-01

At present, there has been a great interest in the development of texture based image classification methods in many different areas. This study presents the results of research carried out to assess the usefulness of selected textural features for detection of asbestos-cement roofs in orthophotomap classification. Two different orthophotomaps of southern Poland (with ground resolution: 5 cm and 25 cm) were used. On both orthoimages representative samples for two classes: asbestos-cement roofing sheets and other roofing materials were selected. Estimation of texture analysis usefulness was conducted using machine learning methods based on decision trees (C5.0 algorithm). For this purpose, various sets of texture parameters were calculated in MaZda software. During the calculation of decision trees different numbers of texture parameters groups were considered. In order to obtain the best settings for decision trees models cross-validation was performed. Decision trees models with the lowest mean classification error were selected. The accuracy of the classification was held based on validation data sets, which were not used for the classification learning. For 5 cm ground resolution samples, the lowest mean classification error was 15.6%. The lowest mean classification error in the case of 25 cm ground resolution was 20.0%. The obtained results confirm potential usefulness of the texture parameter image processing for detection of asbestos-cement roofing sheets. In order to improve the accuracy another extended study should be considered in which additional textural features as well as spectral characteristics should be analyzed.
Application of principal component analysis to distinguish patients with schizophrenia from healthy controls based on fractional anisotropy measurements.

PubMed

Caprihan, A; Pearlson, G D; Calhoun, V D

2008-08-15

Principal component analysis (PCA) is often used to reduce the dimension of data before applying more sophisticated data analysis methods such as non-linear classification algorithms or independent component analysis. This practice is based on selecting components corresponding to the largest eigenvalues. If the ultimate goal is separation of data in two groups, then these set of components need not have the most discriminatory power. We measured the distance between two such populations using Mahalanobis distance and chose the eigenvectors to maximize it, a modified PCA method, which we call the discriminant PCA (DPCA). DPCA was applied to diffusion tensor-based fractional anisotropy images to distinguish age-matched schizophrenia subjects from healthy controls. The performance of the proposed method was evaluated by the one-leave-out method. We show that for this fractional anisotropy data set, the classification error with 60 components was close to the minimum error and that the Mahalanobis distance was twice as large with DPCA, than with PCA. Finally, by masking the discriminant function with the white matter tracts of the Johns Hopkins University atlas, we identified left superior longitudinal fasciculus as the tract which gave the least classification error. In addition, with six optimally chosen tracts the classification error was zero.
Pareto-optimal multi-objective dimensionality reduction deep auto-encoder for mammography classification.

PubMed

Taghanaki, Saeid Asgari; Kawahara, Jeremy; Miles, Brandon; Hamarneh, Ghassan

2017-07-01

Feature reduction is an essential stage in computer aided breast cancer diagnosis systems. Multilayer neural networks can be trained to extract relevant features by encoding high-dimensional data into low-dimensional codes. Optimizing traditional auto-encoders works well only if the initial weights are close to a proper solution. They are also trained to only reduce the mean squared reconstruction error (MRE) between the encoder inputs and the decoder outputs, but do not address the classification error. The goal of the current work is to test the hypothesis that extending traditional auto-encoders (which only minimize reconstruction error) to multi-objective optimization for finding Pareto-optimal solutions provides more discriminative features that will improve classification performance when compared to single-objective and other multi-objective approaches (i.e. scalarized and sequential). In this paper, we introduce a novel multi-objective optimization of deep auto-encoder networks, in which the auto-encoder optimizes two objectives: MRE and mean classification error (MCE) for Pareto-optimal solutions, rather than just MRE. These two objectives are optimized simultaneously by a non-dominated sorting genetic algorithm. We tested our method on 949 X-ray mammograms categorized into 12 classes. The results show that the features identified by the proposed algorithm allow a classification accuracy of up to 98.45%, demonstrating favourable accuracy over the results of state-of-the-art methods reported in the literature. We conclude that adding the classification objective to the traditional auto-encoder objective and optimizing for finding Pareto-optimal solutions, using evolutionary multi-objective optimization, results in producing more discriminative features. Copyright © 2017 Elsevier B.V. All rights reserved.
Undergraduate paramedic students cannot do drug calculations

PubMed Central

Eastwood, Kathryn; Boyle, Malcolm J; Williams, Brett

2012-01-01

BACKGROUND: Previous investigation of drug calculation skills of qualified paramedics has highlighted poor mathematical ability with no published studies having been undertaken on undergraduate paramedics. There are three major error classifications. Conceptual errors involve an inability to formulate an equation from information given, arithmetical errors involve an inability to operate a given equation, and finally computation errors are simple errors of addition, subtraction, division and multiplication. The objective of this study was to determine if undergraduate paramedics at a large Australia university could accurately perform common drug calculations and basic mathematical equations normally required in the workplace. METHODS: A cross-sectional study methodology using a paper-based questionnaire was administered to undergraduate paramedic students to collect demographical data, student attitudes regarding their drug calculation performance, and answers to a series of basic mathematical and drug calculation questions. Ethics approval was granted. RESULTS: The mean score of correct answers was 39.5% with one student scoring 100%, 3.3% of students (n=3) scoring greater than 90%, and 63% (n=58) scoring 50% or less, despite 62% (n=57) of the students stating they ‘did not have any drug calculations issues’. On average those who completed a minimum of year 12 Specialist Maths achieved scores over 50%. Conceptual errors made up 48.5%, arithmetical 31.1% and computational 17.4%. CONCLUSIONS: This study suggests undergraduate paramedics have deficiencies in performing accurate calculations, with conceptual errors indicating a fundamental lack of mathematical understanding. The results suggest an unacceptable level of mathematical competence to practice safely in the unpredictable prehospital environment. PMID:25215067
Wildlife management by habitat units: A preliminary plan of action

NASA Technical Reports Server (NTRS)

Frentress, C. D.; Frye, R. G.

1975-01-01

Procedures for yielding vegetation type maps were developed using LANDSAT data and a computer assisted classification analysis (LARSYS) to assist in managing populations of wildlife species by defined area units. Ground cover in Travis County, Texas was classified on two occasions using a modified version of the unsupervised approach to classification. The first classification produced a total of 17 classes. Examination revealed that further grouping was justified. A second analysis produced 10 classes which were displayed on printouts which were later color-coded. The final classification was 82 percent accurate. While the classification map appeared to satisfactorily depict the existing vegetation, two classes were determined to contain significant error. The major sources of error could have been eliminated by stratifying cluster sites more closely among previously mapped soil associations that are identified with particular plant associations and by precisely defining class nomenclature using established criteria early in the analysis.
EEG-based decoding of error-related brain activity in a real-world driving task

NASA Astrophysics Data System (ADS)

Zhang, H.; Chavarriaga, R.; Khaliliardali, Z.; Gheorghe, L.; Iturrate, I.; Millán, J. d. R.

2015-12-01

Objectives. Recent studies have started to explore the implementation of brain-computer interfaces (BCI) as part of driving assistant systems. The current study presents an EEG-based BCI that decodes error-related brain activity. Such information can be used, e.g., to predict driver’s intended turning direction before reaching road intersections. Approach. We executed experiments in a car simulator (N = 22) and a real car (N = 8). While subject was driving, a directional cue was shown before reaching an intersection, and we classified the presence or not of an error-related potentials from EEG to infer whether the cued direction coincided with the subject’s intention. In this protocol, the directional cue can correspond to an estimation of the driving direction provided by a driving assistance system. We analyzed ERPs elicited during normal driving and evaluated the classification performance in both offline and online tests. Results. An average classification accuracy of 0.698 ± 0.065 was obtained in offline experiments in the car simulator, while tests in the real car yielded a performance of 0.682 ± 0.059. The results were significantly higher than chance level for all cases. Online experiments led to equivalent performances in both simulated and real car driving experiments. These results support the feasibility of decoding these signals to help estimating whether the driver’s intention coincides with the advice provided by the driving assistant in a real car. Significance. The study demonstrates a BCI system in real-world driving, extending the work from previous simulated studies. As far as we know, this is the first online study in real car decoding driver’s error-related brain activity. Given the encouraging results, the paradigm could be further improved by using more sophisticated machine learning approaches and possibly be combined with applications in intelligent vehicles.
Development of the ICD-10 simplified version and field test.

PubMed

Paoin, Wansa; Yuenyongsuwan, Maliwan; Yokobori, Yukiko; Endo, Hiroyoshi; Kim, Sukil

2018-05-01

The International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) has been used in various Asia-Pacific countries for more than 20 years. Although ICD-10 is a powerful tool, clinical coding processes are complex; therefore, many developing countries have not been able to implement ICD-10-based health statistics (WHO-FIC APN, 2007). This study aimed to simplify ICD-10 clinical coding processes, to modify index terms to facilitate computer searching and to provide a simplified version of ICD-10 for use in developing countries. The World Health Organization Family of International Classifications Asia-Pacific Network (APN) developed a simplified version of the ICD-10 and conducted field testing in Cambodia during February and March 2016. Ten hospitals were selected to participate. Each hospital sent a team to join a training workshop before using the ICD-10 simplified version to code 100 cases. All hospitals subsequently sent their coded records to the researchers. Overall, there were 1038 coded records with a total of 1099 ICD clinical codes assigned. The average accuracy rate was calculated as 80.71% (66.67-93.41%). Three types of clinical coding errors were found. These related to errors relating to the coder (14.56%), those resulting from the physician documentation (1.27%) and those considered system errors (3.46%). The field trial results demonstrated that the APN ICD-10 simplified version is feasible for implementation as an effective tool to implement ICD-10 clinical coding for hospitals. Developing countries may consider adopting the APN ICD-10 simplified version for ICD-10 code assignment in hospitals and health care centres. The simplified version can be viewed as an introductory tool which leads to the implementation of the full ICD-10 and may support subsequent ICD-11 adoption.

The Relationship between Occurrence Timing of Dispensing Errors and Subsequent Danger to Patients under the Situation According to the Classification of Drugs by Efficacy.

PubMed

Tsuji, Toshikazu; Nagata, Kenichiro; Kawashiri, Takehiro; Yamada, Takaaki; Irisa, Toshihiro; Murakami, Yuko; Kanaya, Akiko; Egashira, Nobuaki; Masuda, Satohiro

2016-01-01

There are many reports regarding various medical institutions' attempts at the prevention of dispensing errors. However, the relationship between occurrence timing of dispensing errors and subsequent danger to patients has not been studied under the situation according to the classification of drugs by efficacy. Therefore, we analyzed the relationship between position and time regarding the occurrence of dispensing errors. Furthermore, we investigated the relationship between occurrence timing of them and danger to patients. In this study, dispensing errors and incidents in three categories (drug name errors, drug strength errors, drug count errors) were classified into two groups in terms of its drug efficacy (efficacy similarity (-) group, efficacy similarity (+) group), into three classes in terms of the occurrence timing of dispensing errors (initial phase errors, middle phase errors, final phase errors). Then, the rates of damage shifting from "dispensing errors" to "damage to patients" were compared as an index of danger between two groups and among three classes. Consequently, the rate of damage in "efficacy similarity (-) group" was significantly higher than that in "efficacy similarity (+) group". Furthermore, the rate of damage is the highest in "initial phase errors", the lowest in "final phase errors" among three classes. From the results of this study, it became clear that the earlier the timing of dispensing errors occurs, the more severe the damage to patients becomes.
Error modeling for surrogates of dynamical systems using machine learning: Machine-learning-based error model for surrogates of dynamical systems

DOE PAGES

Trehan, Sumeet; Carlberg, Kevin T.; Durlofsky, Louis J.

2017-07-14

A machine learning–based framework for modeling the error introduced by surrogate models of parameterized dynamical systems is proposed. The framework entails the use of high-dimensional regression techniques (eg, random forests, and LASSO) to map a large set of inexpensively computed “error indicators” (ie, features) produced by the surrogate model at a given time instance to a prediction of the surrogate-model error in a quantity of interest (QoI). This eliminates the need for the user to hand-select a small number of informative features. The methodology requires a training set of parameter instances at which the time-dependent surrogate-model error is computed bymore » simulating both the high-fidelity and surrogate models. Using these training data, the method first determines regression-model locality (via classification or clustering) and subsequently constructs a “local” regression model to predict the time-instantaneous error within each identified region of feature space. We consider 2 uses for the resulting error model: (1) as a correction to the surrogate-model QoI prediction at each time instance and (2) as a way to statistically model arbitrary functions of the time-dependent surrogate-model error (eg, time-integrated errors). We then apply the proposed framework to model errors in reduced-order models of nonlinear oil-water subsurface flow simulations, with time-varying well-control (bottom-hole pressure) parameters. The reduced-order models used in this work entail application of trajectory piecewise linearization in conjunction with proper orthogonal decomposition. Moreover, when the first use of the method is considered, numerical experiments demonstrate consistent improvement in accuracy in the time-instantaneous QoI prediction relative to the original surrogate model, across a large number of test cases. When the second use is considered, results show that the proposed method provides accurate statistical predictions of the time- and well-averaged errors.« less
Error modeling for surrogates of dynamical systems using machine learning: Machine-learning-based error model for surrogates of dynamical systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Trehan, Sumeet; Carlberg, Kevin T.; Durlofsky, Louis J.

A machine learning–based framework for modeling the error introduced by surrogate models of parameterized dynamical systems is proposed. The framework entails the use of high-dimensional regression techniques (eg, random forests, and LASSO) to map a large set of inexpensively computed “error indicators” (ie, features) produced by the surrogate model at a given time instance to a prediction of the surrogate-model error in a quantity of interest (QoI). This eliminates the need for the user to hand-select a small number of informative features. The methodology requires a training set of parameter instances at which the time-dependent surrogate-model error is computed bymore » simulating both the high-fidelity and surrogate models. Using these training data, the method first determines regression-model locality (via classification or clustering) and subsequently constructs a “local” regression model to predict the time-instantaneous error within each identified region of feature space. We consider 2 uses for the resulting error model: (1) as a correction to the surrogate-model QoI prediction at each time instance and (2) as a way to statistically model arbitrary functions of the time-dependent surrogate-model error (eg, time-integrated errors). We then apply the proposed framework to model errors in reduced-order models of nonlinear oil-water subsurface flow simulations, with time-varying well-control (bottom-hole pressure) parameters. The reduced-order models used in this work entail application of trajectory piecewise linearization in conjunction with proper orthogonal decomposition. Moreover, when the first use of the method is considered, numerical experiments demonstrate consistent improvement in accuracy in the time-instantaneous QoI prediction relative to the original surrogate model, across a large number of test cases. When the second use is considered, results show that the proposed method provides accurate statistical predictions of the time- and well-averaged errors.« less
Quantifying uncertainty in carbon and nutrient pools of coarse woody debris

NASA Astrophysics Data System (ADS)

See, C. R.; Campbell, J. L.; Fraver, S.; Domke, G. M.; Harmon, M. E.; Knoepp, J. D.; Woodall, C. W.

2016-12-01

Woody detritus constitutes a major pool of both carbon and nutrients in forested ecosystems. Estimating coarse wood stocks relies on many assumptions, even when full surveys are conducted. Researchers rarely report error in coarse wood pool estimates, despite the importance to ecosystem budgets and modelling efforts. To date, no study has attempted a comprehensive assessment of error rates and uncertainty inherent in the estimation of this pool. Here, we use Monte Carlo analysis to propagate the error associated with the major sources of uncertainty present in the calculation of coarse wood carbon and nutrient (i.e., N, P, K, Ca, Mg, Na) pools. We also evaluate individual sources of error to identify the importance of each source of uncertainty in our estimates. We quantify sampling error by comparing the three most common field methods used to survey coarse wood (two transect methods and a whole-plot survey). We quantify the measurement error associated with length and diameter measurement, and technician error in species identification and decay class using plots surveyed by multiple technicians. We use previously published values of model error for the four most common methods of volume estimation: Smalian's, conical frustum, conic paraboloid, and average-of-ends. We also use previously published values for error in the collapse ratio (cross-sectional height/width) of decayed logs that serves as a surrogate for the volume remaining. We consider sampling error in chemical concentration and density for all decay classes, using distributions from both published and unpublished studies. Analytical uncertainty is calculated using standard reference plant material from the National Institute of Standards. Our results suggest that technician error in decay classification can have a large effect on uncertainty, since many of the error distributions included in the calculation (e.g. density, chemical concentration, volume-model selection, collapse ratio) are decay-class specific.
I Hear You Eat and Speak: Automatic Recognition of Eating Condition and Food Type, Use-Cases, and Impact on ASR Performance

PubMed Central

Hantke, Simone; Weninger, Felix; Kurle, Richard; Ringeval, Fabien; Batliner, Anton; Mousa, Amr El-Desoky; Schuller, Björn

2016-01-01

We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i. e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i. e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient. PMID:27176486
Classification Model for Forest Fire Hotspot Occurrences Prediction Using ANFIS Algorithm

NASA Astrophysics Data System (ADS)

Wijayanto, A. K.; Sani, O.; Kartika, N. D.; Herdiyeni, Y.

2017-01-01

This study proposed the application of data mining technique namely Adaptive Neuro-Fuzzy inference system (ANFIS) on forest fires hotspot data to develop classification models for hotspots occurrence in Central Kalimantan. Hotspot is a point that is indicated as the location of fires. In this study, hotspot distribution is categorized as true alarm and false alarm. ANFIS is a soft computing method in which a given inputoutput data set is expressed in a fuzzy inference system (FIS). The FIS implements a nonlinear mapping from its input space to the output space. The method of this study classified hotspots as target objects by correlating spatial attributes data using three folds in ANFIS algorithm to obtain the best model. The best result obtained from the 3rd fold provided low error for training (error = 0.0093676) and also low error testing result (error = 0.0093676). Attribute of distance to road is the most determining factor that influences the probability of true and false alarm where the level of human activities in this attribute is higher. This classification model can be used to develop early warning system of forest fire.
Evaluation of procedures for prediction of unconventional gas in the presence of geologic trends

USGS Publications Warehouse

Attanasi, E.D.; Coburn, T.C.

2009-01-01

This study extends the application of local spatial nonparametric prediction models to the estimation of recoverable gas volumes in continuous-type gas plays to regimes where there is a single geologic trend. A transformation is presented, originally proposed by Tomczak, that offsets the distortions caused by the trend. This article reports on numerical experiments that compare predictive and classification performance of the local nonparametric prediction models based on the transformation with models based on Euclidean distance. The transformation offers improvement in average root mean square error when the trend is not severely misspecified. Because of the local nature of the models, even those based on Euclidean distance in the presence of trends are reasonably robust. The tests based on other model performance metrics such as prediction error associated with the high-grade tracts and the ability of the models to identify sites with the largest gas volumes also demonstrate the robustness of both local modeling approaches. ?? International Association for Mathematical Geology 2009.
Classification based upon gene expression data: bias and precision of error rates.

PubMed

Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L

2007-06-01

Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp
LACIE performance predictor FOC users manual

NASA Technical Reports Server (NTRS)

1976-01-01

The LACIE Performance Predictor (LPP) is a computer simulation of the LACIE process for predicting worldwide wheat production. The simulation provides for the introduction of various errors into the system and provides estimates based on these errors, thus allowing the user to determine the impact of selected error sources. The FOC LPP simulates the acquisition of the sample segment data by the LANDSAT Satellite (DAPTS), the classification of the agricultural area within the sample segment (CAMS), the estimation of the wheat yield (YES), and the production estimation and aggregation (CAS). These elements include data acquisition characteristics, environmental conditions, classification algorithms, the LACIE aggregation and data adjustment procedures. The operational structure for simulating these elements consists of the following key programs: (1) LACIE Utility Maintenance Process, (2) System Error Executive, (3) Ephemeris Generator, (4) Access Generator, (5) Acquisition Selector, (6) LACIE Error Model (LEM), and (7) Post Processor.
Sensitivity of geographic information system outputs to errors in remotely sensed data

NASA Technical Reports Server (NTRS)

Ramapriyan, H. K.; Boyd, R. K.; Gunther, F. J.; Lu, Y. C.

1981-01-01

The sensitivity of the outputs of a geographic information system (GIS) to errors in inputs derived from remotely sensed data (RSD) is investigated using a suitability model with per-cell decisions and a gridded geographic data base whose cells are larger than the RSD pixels. The process of preparing RSD as input to a GIS is analyzed, and the errors associated with classification and registration are examined. In the case of the model considered, it is found that the errors caused during classification and registration are partially compensated by the aggregation of pixels. The compensation is quantified by means of an analytical model, a Monte Carlo simulation, and experiments with Landsat data. The results show that error reductions of the order of 50% occur because of aggregation when 25 pixels of RSD are used per cell in the geographic data base.
Efficient Measurement of Quantum Gate Error by Interleaved Randomized Benchmarking

NASA Astrophysics Data System (ADS)

Magesan, Easwar; Gambetta, Jay M.; Johnson, B. R.; Ryan, Colm A.; Chow, Jerry M.; Merkel, Seth T.; da Silva, Marcus P.; Keefe, George A.; Rothwell, Mary B.; Ohki, Thomas A.; Ketchen, Mark B.; Steffen, M.

2012-08-01

We describe a scalable experimental protocol for estimating the average error of individual quantum computational gates. This protocol consists of interleaving random Clifford gates between the gate of interest and provides an estimate as well as theoretical bounds for the average error of the gate under test, so long as the average noise variation over all Clifford gates is small. This technique takes into account both state preparation and measurement errors and is scalable in the number of qubits. We apply this protocol to a superconducting qubit system and find a bounded average error of 0.003 [0,0.016] for the single-qubit gates Xπ/2 and Yπ/2. These bounded values provide better estimates of the average error than those extracted via quantum process tomography.
Error Analysis in Composition of Iranian Lower Intermediate Students

ERIC Educational Resources Information Center

Taghavi, Mehdi

2012-01-01

Learners make errors during the process of learning languages. This study examines errors in writing task of twenty Iranian lower intermediate male students aged between 13 and 15. A subject was given to the participants was a composition about the seasons of a year. All of the errors were identified and classified. Corder's classification (1967)…
Combining multiple decisions: applications to bioinformatics

NASA Astrophysics Data System (ADS)

Yukinawa, N.; Takenouchi, T.; Oba, S.; Ishii, S.

2008-01-01

Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods.
Credit Risk Evaluation Using a C-Variable Least Squares Support Vector Classification Model

NASA Astrophysics Data System (ADS)

Yu, Lean; Wang, Shouyang; Lai, K. K.

Credit risk evaluation is one of the most important issues in financial risk management. In this paper, a C-variable least squares support vector classification (C-VLSSVC) model is proposed for credit risk analysis. The main idea of this model is based on the prior knowledge that different classes may have different importance for modeling and more weights should be given to those classes with more importance. The C-VLSSVC model can be constructed by a simple modification of the regularization parameter in LSSVC, whereby more weights are given to the lease squares classification errors with important classes than the lease squares classification errors with unimportant classes while keeping the regularized terms in its original form. For illustration purpose, a real-world credit dataset is used to test the effectiveness of the C-VLSSVC model.
SDL: Saliency-Based Dictionary Learning Framework for Image Similarity.

PubMed

Sarkar, Rituparna; Acton, Scott T

2018-02-01

In image classification, obtaining adequate data to learn a robust classifier has often proven to be difficult in several scenarios. Classification of histological tissue images for health care analysis is a notable application in this context due to the necessity of surgery, biopsy or autopsy. To adequately exploit limited training data in classification, we propose a saliency guided dictionary learning method and subsequently an image similarity technique for histo-pathological image classification. Salient object detection from images aids in the identification of discriminative image features. We leverage the saliency values for the local image regions to learn a dictionary and respective sparse codes for an image, such that the more salient features are reconstructed with smaller error. The dictionary learned from an image gives a compact representation of the image itself and is capable of representing images with similar content, with comparable sparse codes. We employ this idea to design a similarity measure between a pair of images, where local image features of one image, are encoded with the dictionary learned from the other and vice versa. To effectively utilize the learned dictionary, we take into account the contribution of each dictionary atom in the sparse codes to generate a global image representation for image comparison. The efficacy of the proposed method was evaluated using three tissue data sets that consist of mammalian kidney, lung and spleen tissue, breast cancer, and colon cancer tissue images. From the experiments, we observe that our methods outperform the state of the art with an increase of 14.2% in the average classification accuracy over all data sets.
Exception handling for sensor fusion

NASA Astrophysics Data System (ADS)

Chavez, G. T.; Murphy, Robin R.

1993-08-01

This paper presents a control scheme for handling sensing failures (sensor malfunctions, significant degradations in performance due to changes in the environment, and errant expectations) in sensor fusion for autonomous mobile robots. The advantages of the exception handling mechanism are that it emphasizes a fast response to sensing failures, is able to use only a partial causal model of sensing failure, and leads to a graceful degradation of sensing if the sensing failure cannot be compensated for. The exception handling mechanism consists of two modules: error classification and error recovery. The error classification module in the exception handler attempts to classify the type and source(s) of the error using a modified generate-and-test procedure. If the source of the error is isolated, the error recovery module examines its cache of recovery schemes, which either repair or replace the current sensing configuration. If the failure is due to an error in expectation or cannot be identified, the planner is alerted. Experiments using actual sensor data collected by the CSM Mobile Robotics/Machine Perception Laboratory's Denning mobile robot demonstrate the operation of the exception handling mechanism.
Original and Mirror Face Images and Minimum Squared Error Classification for Visible Light Face Recognition.

PubMed

Wang, Rong

2015-01-01

In real-world applications, the image of faces varies with illumination, facial expression, and poses. It seems that more training samples are able to reveal possible images of the faces. Though minimum squared error classification (MSEC) is a widely used method, its applications on face recognition usually suffer from the problem of a limited number of training samples. In this paper, we improve MSEC by using the mirror faces as virtual training samples. We obtained the mirror faces generated from original training samples and put these two kinds of samples into a new set. The face recognition experiments show that our method does obtain high accuracy performance in classification.
Rank preserving sparse learning for Kinect based scene classification.

PubMed

Tao, Dapeng; Jin, Lianwen; Yang, Zhao; Li, Xuelong

2013-10-01

With the rapid development of the RGB-D sensors and the promptly growing population of the low-cost Microsoft Kinect sensor, scene classification, which is a hard, yet important, problem in computer vision, has gained a resurgence of interest recently. That is because the depth of information provided by the Kinect sensor opens an effective and innovative way for scene classification. In this paper, we propose a new scheme for scene classification, which applies locality-constrained linear coding (LLC) to local SIFT features for representing the RGB-D samples and classifies scenes through the cooperation between a new rank preserving sparse learning (RPSL) based dimension reduction and a simple classification method. RPSL considers four aspects: 1) it preserves the rank order information of the within-class samples in a local patch; 2) it maximizes the margin between the between-class samples on the local patch; 3) the L1-norm penalty is introduced to obtain the parsimony property; and 4) it models the classification error minimization by utilizing the least-squares error minimization. Experiments are conducted on the NYU Depth V1 dataset and demonstrate the robustness and effectiveness of RPSL for scene classification.
A novel onset detection technique for brain-computer interfaces using sound-production related cognitive tasks in simulated-online system

NASA Astrophysics Data System (ADS)

Song, YoungJae; Sepulveda, Francisco

2017-02-01

Objective. Self-paced EEG-based BCIs (SP-BCIs) have traditionally been avoided due to two sources of uncertainty: (1) precisely when an intentional command is sent by the brain, i.e., the command onset detection problem, and (2) how different the intentional command is when compared to non-specific (or idle) states. Performance evaluation is also a problem and there are no suitable standard metrics available. In this paper we attempted to tackle these issues. Approach. Self-paced covert sound-production cognitive tasks (i.e., high pitch and siren-like sounds) were used to distinguish between intentional commands (IC) and idle states. The IC states were chosen for their ease of execution and negligible overlap with common cognitive states. Band power and a digital wavelet transform were used for feature extraction, and the Davies-Bouldin index was used for feature selection. Classification was performed using linear discriminant analysis. Main results. Performance was evaluated under offline and simulated-online conditions. For the latter, a performance score called true-false-positive (TFP) rate, ranging from 0 (poor) to 100 (perfect), was created to take into account both classification performance and onset timing errors. Averaging the results from the best performing IC task for all seven participants, an 77.7% true-positive (TP) rate was achieved in offline testing. For simulated-online analysis the best IC average TFP score was 76.67% (87.61% TP rate, 4.05% false-positive rate). Significance. Results were promising when compared to previous IC onset detection studies using motor imagery, in which best TP rates were reported as 72.0% and 79.7%, and which, crucially, did not take timing errors into account. Moreover, based on our literature review, there is no previous covert sound-production onset detection system for spBCIs. Results showed that the proposed onset detection technique and TFP performance metric have good potential for use in SP-BCIs.
Comparison of two Classification methods (MLC and SVM) to extract land use and land cover in Johor Malaysia

NASA Astrophysics Data System (ADS)

Rokni Deilmai, B.; Ahmad, B. Bin; Zabihi, H.

2014-06-01

Mapping is essential for the analysis of the land use and land cover, which influence many environmental processes and properties. For the purpose of the creation of land cover maps, it is important to minimize error. These errors will propagate into later analyses based on these land cover maps. The reliability of land cover maps derived from remotely sensed data depends on an accurate classification. In this study, we have analyzed multispectral data using two different classifiers including Maximum Likelihood Classifier (MLC) and Support Vector Machine (SVM). To pursue this aim, Landsat Thematic Mapper data and identical field-based training sample datasets in Johor Malaysia used for each classification method, which results indicate in five land cover classes forest, oil palm, urban area, water, rubber. Classification results indicate that SVM was more accurate than MLC. With demonstrated capability to produce reliable cover results, the SVM methods should be especially useful for land cover classification.

A Robust Step Detection Algorithm and Walking Distance Estimation Based on Daily Wrist Activity Recognition Using a Smart Band.

PubMed

Trong Bui, Duong; Nguyen, Nhan Duc; Jeong, Gu-Min

2018-06-25

Human activity recognition and pedestrian dead reckoning are an interesting field because of their importance utilities in daily life healthcare. Currently, these fields are facing many challenges, one of which is the lack of a robust algorithm with high performance. This paper proposes a new method to implement a robust step detection and adaptive distance estimation algorithm based on the classification of five daily wrist activities during walking at various speeds using a smart band. The key idea is that the non-parametric adaptive distance estimator is performed after two activity classifiers and a robust step detector. In this study, two classifiers perform two phases of recognizing five wrist activities during walking. Then, a robust step detection algorithm, which is integrated with an adaptive threshold, peak and valley correction algorithm, is applied to the classified activities to detect the walking steps. In addition, the misclassification activities are fed back to the previous layer. Finally, three adaptive distance estimators, which are based on a non-parametric model of the average walking speed, calculate the length of each strike. The experimental results show that the average classification accuracy is about 99%, and the accuracy of the step detection is 98.7%. The error of the estimated distance is 2.2⁻4.2% depending on the type of wrist activities.
A Comparative Study of Land Cover Classification by Using Multispectral and Texture Data

PubMed Central

Qadri, Salman; Khan, Dost Muhammad; Ahmad, Farooq; Qadri, Syed Furqan; Babar, Masroor Ellahi; Shahid, Muhammad; Ul-Rehman, Muzammil; Razzaq, Abdul; Shah Muhammad, Syed; Fahad, Muhammad; Ahmad, Sarfraz; Pervez, Muhammad Tariq; Naveed, Nasir; Aslam, Naeem; Jamil, Mutiullah; Rehmani, Ejaz Ahmad; Ahmad, Nazir; Akhtar Khan, Naeem

2016-01-01

The main objective of this study is to find out the importance of machine vision approach for the classification of five types of land cover data such as bare land, desert rangeland, green pasture, fertile cultivated land, and Sutlej river land. A novel spectra-statistical framework is designed to classify the subjective land cover data types accurately. Multispectral data of these land covers were acquired by using a handheld device named multispectral radiometer in the form of five spectral bands (blue, green, red, near infrared, and shortwave infrared) while texture data were acquired with a digital camera by the transformation of acquired images into 229 texture features for each image. The most discriminant 30 features of each image were obtained by integrating the three statistical features selection techniques such as Fisher, Probability of Error plus Average Correlation, and Mutual Information (F + PA + MI). Selected texture data clustering was verified by nonlinear discriminant analysis while linear discriminant analysis approach was applied for multispectral data. For classification, the texture and multispectral data were deployed to artificial neural network (ANN: n-class). By implementing a cross validation method (80-20), we received an accuracy of 91.332% for texture data and 96.40% for multispectral data, respectively. PMID:27376088
On the Discriminant Analysis in the 2-Populations Case

NASA Astrophysics Data System (ADS)

Rublík, František

2008-01-01

The empirical Bayes Gaussian rule, which in the normal case yields good values of the probability of total error, may yield high values of the maximum probability error. From this point of view the presented modified version of the classification rule of Broffitt, Randles and Hogg appears to be superior. The modification included in this paper is termed as a WR method, and the choice of its weights is discussed. The mentioned methods are also compared with the K nearest neighbours classification rule.
Regularised extreme learning machine with misclassification cost and rejection cost for gene expression data classification.

PubMed

Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi

2015-01-01

The main purpose of traditional classification algorithms on bioinformatics application is to acquire better classification accuracy. However, these algorithms cannot meet the requirement that minimises the average misclassification cost. In this paper, a new algorithm of cost-sensitive regularised extreme learning machine (CS-RELM) was proposed by using probability estimation and misclassification cost to reconstruct the classification results. By improving the classification accuracy of a group of small sample which higher misclassification cost, the new CS-RELM can minimise the classification cost. The 'rejection cost' was integrated into CS-RELM algorithm to further reduce the average misclassification cost. By using Colon Tumour dataset and SRBCT (Small Round Blue Cells Tumour) dataset, CS-RELM was compared with other cost-sensitive algorithms such as extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, cost-sensitive support vector machine (SVM). The results of experiments show that CS-RELM with embedded rejection cost could reduce the average cost of misclassification and made more credible classification decision than others.
A classification of errors in lay comprehension of medical documents.

PubMed

Keselman, Alla; Smith, Catherine Arnott

2012-12-01

Emphasis on participatory medicine requires that patients and consumers participate in tasks traditionally reserved for healthcare providers. This includes reading and comprehending medical documents, often but not necessarily in the context of interacting with Personal Health Records (PHRs). Research suggests that while giving patients access to medical documents has many benefits (e.g., improved patient-provider communication), lay people often have difficulty understanding medical information. Informatics can address the problem by developing tools that support comprehension; this requires in-depth understanding of the nature and causes of errors that lay people make when comprehending clinical documents. The objective of this study was to develop a classification scheme of comprehension errors, based on lay individuals' retellings of two documents containing clinical text: a description of a clinical trial and a typical office visit note. While not comprehensive, the scheme can serve as a foundation of further development of a taxonomy of patients' comprehension errors. Eighty participants, all healthy volunteers, read and retold two medical documents. A data-driven content analysis procedure was used to extract and classify retelling errors. The resulting hierarchical classification scheme contains nine categories and 23 subcategories. The most common error made by the participants involved incorrectly recalling brand names of medications. Other common errors included misunderstanding clinical concepts, misreporting the objective of a clinical research study and physician's findings during a patient's visit, and confusing and misspelling clinical terms. A combination of informatics support and health education is likely to improve the accuracy of lay comprehension of medical documents. Published by Elsevier Inc.
An extension of the receiver operating characteristic curve and AUC-optimal classification.

PubMed

Takenouchi, Takashi; Komori, Osamu; Eguchi, Shinto

2012-10-01

While most proposed methods for solving classification problems focus on minimization of the classification error rate, we are interested in the receiver operating characteristic (ROC) curve, which provides more information about classification performance than the error rate does. The area under the ROC curve (AUC) is a natural measure for overall assessment of a classifier based on the ROC curve. We discuss a class of concave functions for AUC maximization in which a boosting-type algorithm including RankBoost is considered, and the Bayesian risk consistency and the lower bound of the optimum function are discussed. A procedure derived by maximizing a specific optimum function has high robustness, based on gross error sensitivity. Additionally, we focus on the partial AUC, which is the partial area under the ROC curve. For example, in medical screening, a high true-positive rate to the fixed lower false-positive rate is preferable and thus the partial AUC corresponding to lower false-positive rates is much more important than the remaining AUC. We extend the class of concave optimum functions for partial AUC optimality with the boosting algorithm. We investigated the validity of the proposed method through several experiments with data sets in the UCI repository.
Review of medication errors that are new or likely to occur more frequently with electronic medication management systems.

PubMed

Van de Vreede, Melita; McGrath, Anne; de Clifford, Jan

2018-05-14

Objective. The aim of the present study was to identify and quantify medication errors reportedly related to electronic medication management systems (eMMS) and those considered likely to occur more frequently with eMMS. This included developing a new classification system relevant to eMMS errors. Methods. Eight Victorian hospitals with eMMS participated in a retrospective audit of reported medication incidents from their incident reporting databases between May and July 2014. Site-appointed project officers submitted deidentified incidents they deemed new or likely to occur more frequently due to eMMS, together with the Incident Severity Rating (ISR). The authors reviewed and classified incidents. Results. There were 5826 medication-related incidents reported. In total, 93 (47 prescribing errors, 46 administration errors) were identified as new or potentially related to eMMS. Only one ISR2 (moderate) and no ISR1 (severe or death) errors were reported, so harm to patients in this 3-month period was minimal. The most commonly reported error types were 'human factors' and 'unfamiliarity or training' (70%) and 'cross-encounter or hybrid system errors' (22%). Conclusions. Although the results suggest that the errors reported were of low severity, organisations must remain vigilant to the risk of new errors and avoid the assumption that eMMS is the panacea to all medication error issues. What is known about the topic? eMMS have been shown to reduce some types of medication errors, but it has been reported that some new medication errors have been identified and some are likely to occur more frequently with eMMS. There are few published Australian studies that have reported on medication error types that are likely to occur more frequently with eMMS in more than one organisation and that include administration and prescribing errors. What does this paper add? This paper includes a new simple classification system for eMMS that is useful and outlines the most commonly reported incident types and can inform organisations and vendors on possible eMMS improvements. The paper suggests a new classification system for eMMS medication errors. What are the implications for practitioners? The results of the present study will highlight to organisations the need for ongoing review of system design, refinement of workflow issues, staff education and training and reporting and monitoring of errors.
EEG error potentials detection and classification using time-frequency features for robot reinforcement learning.

PubMed

Boubchir, Larbi; Touati, Youcef; Daachi, Boubaker; Chérif, Arab Ali

2015-08-01

In thought-based steering of robots, error potentials (ErrP) can appear when the action resulting from the brain-machine interface (BMI) classifier/controller does not correspond to the user's thought. Using the Steady State Visual Evoked Potentials (SSVEP) techniques, ErrP, which appear when a classification error occurs, are not easily recognizable by only examining the temporal or frequency characteristics of EEG signals. A supplementary classification process is therefore needed to identify them in order to stop the course of the action and back up to a recovery state. This paper presents a set of time-frequency (t-f) features for the detection and classification of EEG ErrP in extra-brain activities due to misclassification observed by a user exploiting non-invasive BMI and robot control in the task space. The proposed features are able to characterize and detect ErrP activities in the t-f domain. These features are derived from the information embedded in the t-f representation of EEG signals, and include the Instantaneous Frequency (IF), t-f information complexity, SVD information, energy concentration and sub-bands' energies. The experiment results on real EEG data show that the use of the proposed t-f features for detecting and classifying EEG ErrP achieved an overall classification accuracy up to 97% for 50 EEG segments using 2-class SVM classifier.
Passive quantum error correction of linear optics networks through error averaging

NASA Astrophysics Data System (ADS)

Marshman, Ryan J.; Lund, Austin P.; Rohde, Peter P.; Ralph, Timothy C.

2018-02-01

We propose and investigate a method of error detection and noise correction for bosonic linear networks using a method of unitary averaging. The proposed error averaging does not rely on ancillary photons or control and feedforward correction circuits, remaining entirely passive in its operation. We construct a general mathematical framework for this technique and then give a series of proof of principle examples including numerical analysis. Two methods for the construction of averaging are then compared to determine the most effective manner of implementation and probe the related error thresholds. Finally we discuss some of the potential uses of this scheme.
Evaluation of spatial filtering on the accuracy of wheat area estimate

NASA Technical Reports Server (NTRS)

Dejesusparada, N. (Principal Investigator); Moreira, M. A.; Chen, S. C.; Delima, A. M.

1982-01-01

A 3 x 3 pixel spatial filter for postclassification was used for wheat classification to evaluate the effects of this procedure on the accuracy of area estimation using LANDSAT digital data obtained from a single pass. Quantitative analyses were carried out in five test sites (approx 40 sq km each) and t tests showed that filtering with threshold values significantly decreased errors of commission and omission. In area estimation filtering improved the overestimate of 4.5% to 2.7% and the root-mean-square error decreased from 126.18 ha to 107.02 ha. Extrapolating the same procedure of automatic classification using spatial filtering for postclassification to the whole study area, the accuracy in area estimate was improved from the overestimate of 10.9% to 9.7%. It is concluded that when single pass LANDSAT data is used for crop identification and area estimation the postclassification procedure using a spatial filter provides a more accurate area estimate by reducing classification errors.
Maximum likelihood estimation of label imperfections and its use in the identification of mislabeled patterns

NASA Technical Reports Server (NTRS)

Chittineni, C. B.

1979-01-01

The problem of estimating label imperfections and the use of the estimation in identifying mislabeled patterns is presented. Expressions for the maximum likelihood estimates of classification errors and a priori probabilities are derived from the classification of a set of labeled patterns. Expressions also are given for the asymptotic variances of probability of correct classification and proportions. Simple models are developed for imperfections in the labels and for classification errors and are used in the formulation of a maximum likelihood estimation scheme. Schemes are presented for the identification of mislabeled patterns in terms of threshold on the discriminant functions for both two-class and multiclass cases. Expressions are derived for the probability that the imperfect label identification scheme will result in a wrong decision and are used in computing thresholds. The results of practical applications of these techniques in the processing of remotely sensed multispectral data are presented.
Automated algorithm for mapping regions of cold-air pooling in complex terrain

NASA Astrophysics Data System (ADS)

Lundquist, Jessica D.; Pepin, Nicholas; Rochford, Caitlin

2008-11-01

In complex terrain, air in contact with the ground becomes cooled from radiative energy loss on a calm clear night and, being denser than the free atmosphere at the same elevation, sinks to valley bottoms. Cold-air pooling (CAP) occurs where this cooled air collects on the landscape. This article focuses on identifying locations on a landscape subject to considerably lower minimum temperatures than the regional average during conditions of clear skies and weak synoptic-scale winds, providing a simple automated method to map locations where cold air is likely to pool. Digital elevation models of regions of complex terrain were used to derive surfaces of local slope, curvature, and percentile elevation relative to surrounding terrain. Each pixel was classified as prone to CAP, not prone to CAP, or exhibiting no signal, based on the criterion that CAP occurs in regions with flat slopes in local depressions or valleys (negative curvature and low percentile). Along-valley changes in the topographic amplification factor (TAF) were then calculated to determine whether the cold air in the valley was likely to drain or pool. Results were checked against distributed temperature measurements in Loch Vale, Rocky Mountain National Park, Colorado; in the Eastern Pyrenees, France; and in Yosemite National Park, Sierra Nevada, California. Using CAP classification to interpolate temperatures across complex terrain resulted in improvements in root-mean-square errors compared to more basic interpolation techniques at most sites within the three areas examined, with average error reductions of up to 3°C at individual sites and about 1°C averaged over all sites in the study areas.
Misclassification Errors in Unsupervised Classification Methods. Comparison Based on the Simulation of Targeted Proteomics Data

PubMed Central

Andreev, Victor P; Gillespie, Brenda W; Helfand, Brian T; Merion, Robert M

2016-01-01

Unsupervised classification methods are gaining acceptance in omics studies of complex common diseases, which are often vaguely defined and are likely the collections of disease subtypes. Unsupervised classification based on the molecular signatures identified in omics studies have the potential to reflect molecular mechanisms of the subtypes of the disease and to lead to more targeted and successful interventions for the identified subtypes. Multiple classification algorithms exist but none is ideal for all types of data. Importantly, there are no established methods to estimate sample size in unsupervised classification (unlike power analysis in hypothesis testing). Therefore, we developed a simulation approach allowing comparison of misclassification errors and estimating the required sample size for a given effect size, number, and correlation matrix of the differentially abundant proteins in targeted proteomics studies. All the experiments were performed in silico. The simulated data imitated the expected one from the study of the plasma of patients with lower urinary tract dysfunction with the aptamer proteomics assay Somascan (SomaLogic Inc, Boulder, CO), which targeted 1129 proteins, including 330 involved in inflammation, 180 in stress response, 80 in aging, etc. Three popular clustering methods (hierarchical, k-means, and k-medoids) were compared. K-means clustering performed much better for the simulated data than the other two methods and enabled classification with misclassification error below 5% in the simulated cohort of 100 patients based on the molecular signatures of 40 differentially abundant proteins (effect size 1.5) from among the 1129-protein panel. PMID:27524871
Computer discrimination procedures applicable to aerial and ERTS multispectral data

NASA Technical Reports Server (NTRS)

Richardson, A. J.; Torline, R. J.; Allen, W. A.

1970-01-01

Two statistical models are compared in the classification of crops recorded on color aerial photographs. A theory of error ellipses is applied to the pattern recognition problem. An elliptical boundary condition classification model (EBC), useful for recognition of candidate patterns, evolves out of error ellipse theory. The EBC model is compared with the minimum distance to the mean (MDM) classification model in terms of pattern recognition ability. The pattern recognition results of both models are interpreted graphically using scatter diagrams to represent measurement space. Measurement space, for this report, is determined by optical density measurements collected from Kodak Ektachrome Infrared Aero Film 8443 (EIR). The EBC model is shown to be a significant improvement over the MDM model.
Statistical learning from nonrecurrent experience with discrete input variables and recursive-error-minimization equations

NASA Astrophysics Data System (ADS)

Carter, Jeffrey R.; Simon, Wayne E.

1990-08-01

Neural networks are trained using Recursive Error Minimization (REM) equations to perform statistical classification. Using REM equations with continuous input variables reduces the required number of training experiences by factors of one to two orders of magnitude over standard back propagation. Replacing the continuous input variables with discrete binary representations reduces the number of connections by a factor proportional to the number of variables reducing the required number of experiences by another order of magnitude. Undesirable effects of using recurrent experience to train neural networks for statistical classification problems are demonstrated and nonrecurrent experience used to avoid these undesirable effects. 1. THE 1-41 PROBLEM The statistical classification problem which we address is is that of assigning points in ddimensional space to one of two classes. The first class has a covariance matrix of I (the identity matrix) the covariance matrix of the second class is 41. For this reason the problem is known as the 1-41 problem. Both classes have equal probability of occurrence and samples from both classes may appear anywhere throughout the ddimensional space. Most samples near the origin of the coordinate system will be from the first class while most samples away from the origin will be from the second class. Since the two classes completely overlap it is impossible to have a classifier with zero error. The minimum possible error is known as the Bayes error and
Statistical classification of drug incidents due to look-alike sound-alike mix-ups.

PubMed

Wong, Zoie Shui Yee

2016-06-01

It has been recognised that medication names that look or sound similar are a cause of medication errors. This study builds statistical classifiers for identifying medication incidents due to look-alike sound-alike mix-ups. A total of 227 patient safety incident advisories related to medication were obtained from the Canadian Patient Safety Institute's Global Patient Safety Alerts system. Eight feature selection strategies based on frequent terms, frequent drug terms and constituent terms were performed. Statistical text classifiers based on logistic regression, support vector machines with linear, polynomial, radial-basis and sigmoid kernels and decision tree were trained and tested. The models developed achieved an average accuracy of above 0.8 across all the model settings. The receiver operating characteristic curves indicated the classifiers performed reasonably well. The results obtained in this study suggest that statistical text classification can be a feasible method for identifying medication incidents due to look-alike sound-alike mix-ups based on a database of advisories from Global Patient Safety Alerts. © The Author(s) 2014.
Reliability and Validity in Hospital Case-Mix Measurement

PubMed Central

Pettengill, Julian; Vertrees, James

1982-01-01

There is widespread interest in the development of a measure of hospital output. This paper describes the problem of measuring the expected cost of the mix of inpatient cases treated in a hospital (hospital case-mix) and a general approach to its solution. The solution is based on a set of homogenous groups of patients, defined by a patient classification system, and a set of estimated relative cost weights corresponding to the patient categories. This approach is applied to develop a summary measure of the expected relative costliness of the mix of Medicare patients treated in 5,576 participating hospitals. The Medicare case-mix index is evaluated by estimating a hospital average cost function. This provides a direct test of the hypothesis that the relationship between Medicare case-mix and Medicare cost per case is proportional. The cost function analysis also provides a means of simulating the effects of classification error on our estimate of this relationship. Our results indicate that this general approach to measuring hospital case-mix provides a valid and robust measure of the expected cost of a hospital's case-mix. PMID:10309909
Learning-based landmarks detection for osteoporosis analysis

NASA Astrophysics Data System (ADS)

Cheng, Erkang; Zhu, Ling; Yang, Jie; Azhari, Azhari; Sitam, Suhardjo; Liang, Xin; Megalooikonomou, Vasileios; Ling, Haibin

2016-03-01

Osteoporosis is the common cause for a broken bone among senior citizens. Early diagnosis of osteoporosis requires routine examination which may be costly for patients. A potential low cost diagnosis is to identify a senior citizen at high risk of osteoporosis by pre-screening during routine dental examination. Therefore, osteoporosis analysis using dental radiographs severs as a key step in routine dental examination. The aim of this study is to localize landmarks in dental radiographs which are helpful to assess the evidence of osteoporosis. We collect eight landmarks which are critical in osteoporosis analysis. Our goal is to localize these landmarks automatically for a given dental radiographic image. To address the challenges such as large variations of appearances in subjects, in this paper, we formulate the task into a multi-class classification problem. A hybrid feature pool is used to represent these landmarks. For the discriminative classification problem, we use a random forest to fuse the hybrid feature representation. In the experiments, we also evaluate the performances of individual feature component and the hybrid fused feature. Our proposed method achieves average detection error of 2:9mm.
Calibration of remotely sensed proportion or area estimates for misclassification error

Treesearch

Raymond L. Czaplewski; Glenn P. Catts

1992-01-01

Classifications of remotely sensed data contain misclassification errors that bias areal estimates. Monte Carlo techniques were used to compare two statistical methods that correct or calibrate remotely sensed areal estimates for misclassification bias using reference data from an error matrix. The inverse calibration estimator was consistently superior to the...
Mapping gully-affected areas in the region of Taroudannt, Morocco based on Object-Based Image Analysis (OBIA)

NASA Astrophysics Data System (ADS)

d'Oleire-Oltmanns, Sebastian; Marzolff, Irene; Tiede, Dirk; Blaschke, Thomas

2015-04-01

The need for area-wide landform mapping approaches, especially in terms of land degradation, can be ascribed to the fact that within area-wide landform mapping approaches, the (spatial) context of erosional landforms is considered by providing additional information on the physiography neighboring the distinct landform. This study presents an approach for the detection of gully-affected areas by applying object-based image analysis in the region of Taroudannt, Morocco, which is highly affected by gully erosion while simultaneously representing a major region of agro-industry with a high demand of arable land. Various sensors provide readily available high-resolution optical satellite data with a much better temporal resolution than 3D terrain data which lead to the development of an area-wide mapping approach to extract gully-affected areas using only optical satellite imagery. The classification rule-set was developed with a clear focus on virtual spatial independence within the software environment of eCognition Developer. This allows the incorporation of knowledge about the target objects under investigation. Only optical QuickBird-2 satellite data and freely-available OpenStreetMap (OSM) vector data were used as input data. The OSM vector data were incorporated in order to mask out plantations and residential areas. Optical input data are more readily available for a broad range of users compared to terrain data, which is considered to be a major advantage. The methodology additionally incorporates expert knowledge and freely-available vector data in a cyclic object-based image analysis approach. This connects the two fields of geomorphology and remote sensing. The classification results allow conclusions on the current distribution of gullies. The results of the classification were checked against manually delineated reference data incorporating expert knowledge based on several field campaigns in the area, resulting in an overall classification accuracy of 62%. The error of omission accounts for 38% and the error of commission for 16%, respectively. Additionally, a manual assessment was carried out to assess the quality of the applied classification algorithm. The limited error of omission contributes with 23% to the overall error of omission and the limited error of commission contributes with 98% to the overall error of commission. This assessment improves the results and confirms the high quality of the developed approach for area-wide mapping of gully-affected areas in larger regions. In the field of landform mapping, the overall quality of the classification results is often assessed with more than one method to incorporate all aspects adequately.

Mapping global surface water inundation dynamics using synergistic information from SMAP, AMSR2 and Landsat

NASA Astrophysics Data System (ADS)

Du, J.; Kimball, J. S.; Galantowicz, J. F.; Kim, S.; Chan, S.; Reichle, R. H.; Jones, L. A.; Watts, J. D.

2017-12-01

A method to monitor global land surface water (fw) inundation dynamics was developed by exploiting the enhanced fw sensitivity of L-band (1.4 GHz) passive microwave observations from the Soil Moisture Active Passive (SMAP) mission. The L-band fw (fwLBand) retrievals were derived using SMAP H-polarization brightness temperature (Tb) observations and predefined L-band reference microwave emissivities for water and land endmembers. Potential soil moisture and vegetation contributions to the microwave signal were represented from overlapping higher frequency Tb observations from AMSR2. The resulting fwLBand global record has high temporal sampling (1-3 days) and 36-km spatial resolution. The fwLBand annual averages corresponded favourably (R=0.84, p<0.001) with a 250-m resolution static global water map (MOD44W) aggregated at the same spatial scale, while capturing significant inundation variations worldwide. The monthly fwLBand averages also showed seasonal inundation changes consistent with river discharge records within six major US river basins. An uncertainty analysis indicated generally reliable fwLBand performance for major land cover areas and under low to moderate vegetation cover, but with lower accuracy for detecting water bodies covered by dense vegetation. Finer resolution (30-m) fwLBand results were obtained for three sub-regions in North America using an empirical downscaling approach and ancillary global Water Occurrence Dataset (WOD) derived from the historical Landsat record. The resulting 30-m fwLBand retrievals showed favourable classification accuracy for water (commission error 31.84%; omission error 28.08%) and land (commission error 0.82%; omission error 0.99%) and seasonal wet and dry periods when compared to independent water maps derived from Landsat-8 imagery. The new fwLBand algorithms and continuing SMAP and AMSR2 operations provide for near real-time, multi-scale monitoring of global surface water inundation dynamics, potentially benefiting hydrological monitoring, flood assessments, and global climate and carbon modeling.
Maximum-likelihood techniques for joint segmentation-classification of multispectral chromosome images.

PubMed

Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L

2005-12-01

Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.
Classification accuracy on the family planning participation status using kernel discriminant analysis

NASA Astrophysics Data System (ADS)

Kurniawan, Dian; Suparti; Sugito

2018-05-01

Population growth in Indonesia has increased every year. According to the population census conducted by the Central Bureau of Statistics (BPS) in 2010, the population of Indonesia has reached 237.6 million people. Therefore, to control the population growth rate, the government hold Family Planning or Keluarga Berencana (KB) program for couples of childbearing age. The purpose of this program is to improve the health of mothers and children in order to manifest prosperous society by controlling births while ensuring control of population growth. The data used in this study is the updated family data of Semarang city in 2016 that conducted by National Family Planning Coordinating Board (BKKBN). From these data, classifiers with kernel discriminant analysis will be obtained, and also classification accuracy will be obtained from that method. The result of the analysis showed that normal kernel discriminant analysis gives 71.05 % classification accuracy with 28.95 % classification error. Whereas triweight kernel discriminant analysis gives 73.68 % classification accuracy with 26.32 % classification error. Using triweight kernel discriminant for data preprocessing of family planning participation of childbearing age couples in Semarang City of 2016 can be stated better than with normal kernel discriminant.
A parametric multiclass Bayes error estimator for the multispectral scanner spatial model performance evaluation

NASA Technical Reports Server (NTRS)

Mobasseri, B. G.; Mcgillem, C. D.; Anuta, P. E. (Principal Investigator)

1978-01-01

The author has identified the following significant results. The probability of correct classification of various populations in data was defined as the primary performance index. The multispectral data being of multiclass nature as well, required a Bayes error estimation procedure that was dependent on a set of class statistics alone. The classification error was expressed in terms of an N dimensional integral, where N was the dimensionality of the feature space. The multispectral scanner spatial model was represented by a linear shift, invariant multiple, port system where the N spectral bands comprised the input processes. The scanner characteristic function, the relationship governing the transformation of the input spatial, and hence, spectral correlation matrices through the systems, was developed.
Detecting paroxysmal coughing from pertussis cases using voice recognition technology.

PubMed

Parker, Danny; Picone, Joseph; Harati, Amir; Lu, Shuang; Jenkyns, Marion H; Polgreen, Philip M

2013-01-01

Pertussis is highly contagious; thus, prompt identification of cases is essential to control outbreaks. Clinicians experienced with the disease can easily identify classic cases, where patients have bursts of rapid coughing followed by gasps, and a characteristic whooping sound. However, many clinicians have never seen a case, and thus may miss initial cases during an outbreak. The purpose of this project was to use voice-recognition software to distinguish pertussis coughs from croup and other coughs. We collected a series of recordings representing pertussis, croup and miscellaneous coughing by children. We manually categorized coughs as either pertussis or non-pertussis, and extracted features for each category. We used Mel-frequency cepstral coefficients (MFCC), a sampling rate of 16 KHz, a frame Duration of 25 msec, and a frame rate of 10 msec. The coughs were filtered. Each cough was divided into 3 sections of proportion 3-4-3. The average of the 13 MFCCs for each section was computed and made into a 39-element feature vector used for the classification. We used the following machine learning algorithms: Neural Networks, K-Nearest Neighbor (KNN), and a 200 tree Random Forest (RF). Data were reserved for cross-validation of the KNN and RF. The Neural Network was trained 100 times, and the averaged results are presented. After categorization, we had 16 examples of non-pertussis coughs and 31 examples of pertussis coughs. Over 90% of all pertussis coughs were properly classified as pertussis. The error rates were: Type I errors of 7%, 12%, and 25% and Type II errors of 8%, 0%, and 0%, using the Neural Network, Random Forest, and KNN, respectively. Our results suggest that we can build a robust classifier to assist clinicians and the public to help identify pertussis cases in children presenting with typical symptoms.
Using Gaussian mixture models to detect and classify dolphin whistles and pulses.

PubMed

Peso Parada, Pablo; Cardenal-López, Antonio

2014-06-01

In recent years, a number of automatic detection systems for free-ranging cetaceans have been proposed that aim to detect not just surfaced, but also submerged, individuals. These systems are typically based on pattern-recognition techniques applied to underwater acoustic recordings. Using a Gaussian mixture model, a classification system was developed that detects sounds in recordings and classifies them as one of four types: background noise, whistles, pulses, and combined whistles and pulses. The classifier was tested using a database of underwater recordings made off the Spanish coast during 2011. Using cepstral-coefficient-based parameterization, a sound detection rate of 87.5% was achieved for a 23.6% classification error rate. To improve these results, two parameters computed using the multiple signal classification algorithm and an unpredictability measure were included in the classifier. These parameters, which helped to classify the segments containing whistles, increased the detection rate to 90.3% and reduced the classification error rate to 18.1%. Finally, the potential of the multiple signal classification algorithm and unpredictability measure for estimating whistle contours and classifying cetacean species was also explored, with promising results.
Unbiased Taxonomic Annotation of Metagenomic Samples

PubMed Central

Fosso, Bruno; Pesole, Graziano; Rosselló, Francesc

2018-01-01

Abstract The classification of reads from a metagenomic sample using a reference taxonomy is usually based on first mapping the reads to the reference sequences and then classifying each read at a node under the lowest common ancestor of the candidate sequences in the reference taxonomy with the least classification error. However, this taxonomic annotation can be biased by an imbalanced taxonomy and also by the presence of multiple nodes in the taxonomy with the least classification error for a given read. In this article, we show that the Rand index is a better indicator of classification error than the often used area under the receiver operating characteristic (ROC) curve and F-measure for both balanced and imbalanced reference taxonomies, and we also address the second source of bias by reducing the taxonomic annotation problem for a whole metagenomic sample to a set cover problem, for which a logarithmic approximation can be obtained in linear time and an exact solution can be obtained by integer linear programming. Experimental results with a proof-of-concept implementation of the set cover approach to taxonomic annotation in a next release of the TANGO software show that the set cover approach further reduces ambiguity in the taxonomic annotation obtained with TANGO without distorting the relative abundance profile of the metagenomic sample. PMID:29028181
Global land cover mapping: a review and uncertainty analysis

USGS Publications Warehouse

Congalton, Russell G.; Gu, Jianyu; Yadav, Kamini; Thenkabail, Prasad S.; Ozdogan, Mutlu

2014-01-01

Given the advances in remotely sensed imagery and associated technologies, several global land cover maps have been produced in recent times including IGBP DISCover, UMD Land Cover, Global Land Cover 2000 and GlobCover 2009. However, the utility of these maps for specific applications has often been hampered due to considerable amounts of uncertainties and inconsistencies. A thorough review of these global land cover projects including evaluating the sources of error and uncertainty is prudent and enlightening. Therefore, this paper describes our work in which we compared, summarized and conducted an uncertainty analysis of the four global land cover mapping projects using an error budget approach. The results showed that the classification scheme and the validation methodology had the highest error contribution and implementation priority. A comparison of the classification schemes showed that there are many inconsistencies between the definitions of the map classes. This is especially true for the mixed type classes for which thresholds vary for the attributes/discriminators used in the classification process. Examination of these four global mapping projects provided quite a few important lessons for the future global mapping projects including the need for clear and uniform definitions of the classification scheme and an efficient, practical, and valid design of the accuracy assessment.
Preschool speech error patterns predict articulation and phonological awareness outcomes in children with histories of speech sound disorders.

PubMed

Preston, Jonathan L; Hull, Margaret; Edwards, Mary Louise

2013-05-01

To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up at age 8;3. The frequency of occurrence of preschool distortion errors, typical substitution and syllable structure errors, and atypical substitution and syllable structure errors was used to predict later speech sound production, PA, and literacy outcomes. Group averages revealed below-average school-age articulation scores and low-average PA but age-appropriate reading and spelling. Preschool speech error patterns were related to school-age outcomes. Children for whom >10% of their speech sound errors were atypical had lower PA and literacy scores at school age than children who produced <10% atypical errors. Preschoolers who produced more distortion errors were likely to have lower school-age articulation scores than preschoolers who produced fewer distortion errors. Different preschool speech error patterns predict different school-age clinical outcomes. Many atypical speech sound errors in preschoolers may be indicative of weak phonological representations, leading to long-term PA weaknesses. Preschoolers' distortions may be resistant to change over time, leading to persisting speech sound production problems.
Defining and classifying medical error: lessons for patient safety reporting systems.

PubMed

Tamuz, M; Thomas, E J; Franchois, K E

2004-02-01

It is important for healthcare providers to report safety related events, but little attention has been paid to how the definition and classification of events affects a hospital's ability to learn from its experience. To examine how the definition and classification of safety related events influences key organizational routines for gathering information, allocating incentives, and analyzing event reporting data. In semi-structured interviews, professional staff and administrators in a tertiary care teaching hospital and its pharmacy were asked to describe the existing programs designed to monitor medication safety, including the reporting systems. With a focus primarily on the pharmacy staff, interviews were audio recorded, transcribed, and analyzed using qualitative research methods. Eighty six interviews were conducted, including 36 in the hospital pharmacy. Examples are presented which show that: (1) the definition of an event could lead to under-reporting; (2) the classification of a medication error into alternative categories can influence the perceived incentives and disincentives for incident reporting; (3) event classification can enhance or impede organizational routines for data analysis and learning; and (4) routines that promote organizational learning within the pharmacy can reduce the flow of medication error data to the hospital. These findings from one hospital raise important practical and research questions about how reporting systems are influenced by the definition and classification of safety related events. By understanding more clearly how hospitals define and classify their experience, we may improve our capacity to learn and ultimately improve patient safety.
Fisher classifier and its probability of error estimation

NASA Technical Reports Server (NTRS)

Chittineni, C. B.

1979-01-01

Computationally efficient expressions are derived for estimating the probability of error using the leave-one-out method. The optimal threshold for the classification of patterns projected onto Fisher's direction is derived. A simple generalization of the Fisher classifier to multiple classes is presented. Computational expressions are developed for estimating the probability of error of the multiclass Fisher classifier.
Human factors analysis and classification system-HFACS.

DOT National Transportation Integrated Search

2000-02-01

Human error has been implicated in 70 to 80% of all civil and military aviation accidents. Yet, most accident : reporting systems are not designed around any theoretical framework of human error. As a result, most : accident databases are not conduci...
The localization of focal heart activity via body surface potential measurements: tests in a heterogeneous torso phantom

NASA Astrophysics Data System (ADS)

Wetterling, F.; Liehr, M.; Schimpf, P.; Liu, H.; Haueisen, J.

2009-09-01

The non-invasive localization of focal heart activity via body surface potential measurements (BSPM) could greatly benefit the understanding and treatment of arrhythmic heart diseases. However, the in vivo validation of source localization algorithms is rather difficult with currently available measurement techniques. In this study, we used a physical torso phantom composed of different conductive compartments and seven dipoles, which were placed in the anatomical position of the human heart in order to assess the performance of the Recursively Applied and Projected Multiple Signal Classification (RAP-MUSIC) algorithm. Electric potentials were measured on the torso surface for single dipoles with and without further uncorrelated or correlated dipole activity. The localization error averaged 11 ± 5 mm over 22 dipoles, which shows the ability of RAP-MUSIC to distinguish an uncorrelated dipole from surrounding sources activity. For the first time, real computational modelling errors could be included within the validation procedure due to the physically modelled heterogeneities. In conclusion, the introduced heterogeneous torso phantom can be used to validate state-of-the-art algorithms under nearly realistic measurement conditions.
IMPACTS OF PATCH SIZE AND LANDSCAPE HETEROGENEITY ON THEMATIC IMAGE CLASSIFICATION ACCURACY

EPA Science Inventory

Impacts of Patch Size and Landscape Heterogeneity on Thematic Image Classification Accuracy.
Currently, most thematic accuracy assessments of classified remotely sensed images oily account for errors between the various classes employed, at particular pixels of interest, thu...
High-density force myography: A possible alternative for upper-limb prosthetic control.

PubMed

Radmand, Ashkan; Scheme, Erik; Englehart, Kevin

2016-01-01

Several multiple degree-of-freedom upper-limb prostheses that have the promise of highly dexterous control have recently been developed. Inadequate controllability, however, has limited adoption of these devices. Introducing more robust control methods will likely result in higher acceptance rates. This work investigates the suitability of using high-density force myography (HD-FMG) for prosthetic control. HD-FMG uses a high-density array of pressure sensors to detect changes in the pressure patterns between the residual limb and socket caused by the contraction of the forearm muscles. In this work, HD-FMG outperforms the standard electromyography (EMG)-based system in detecting different wrist and hand gestures. With the arm in a fixed, static position, eight hand and wrist motions were classified with 0.33% error using the HD-FMG technique. Comparatively, classification errors in the range of 2.2%-11.3% have been reported in the literature for multichannel EMG-based approaches. As with EMG, position variation in HD-FMG can introduce classification error, but incorporating position variation into the training protocol reduces this effect. Channel reduction was also applied to the HD-FMG technique to decrease the dimensionality of the problem as well as the size of the sensorized area. We found that with informed, symmetric channel reduction, classification error could be decreased to 0.02%.
Common component classification: what can we learn from machine learning?

PubMed

Anderson, Ariana; Labus, Jennifer S; Vianna, Eduardo P; Mayer, Emeran A; Cohen, Mark S

2011-05-15

Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classification machines' power is due to artifactual noise versus genuine neurological signal. To examine the true strength and power of machine learning classifiers we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal and show that removal of such artifacts can reduce predictive accuracy even when data has been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error seen in publication to be a biased estimate of the testing error seen in practice and measure this bias by purposefully making flawed models. We discuss other ways to introduce bias and the statistical assumptions lying behind the data and model themselves. Finally we discuss the complications in drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on the Type 1 and Type 2 error rates, and how publication bias can give a false confidence of the power of such methods. Collectively this work identifies challenges specific to fMRI classification and methods affecting the stability of models. Copyright © 2010 Elsevier Inc. All rights reserved.
J-Plus: Morphological Classification Of Compact And Extended Sources By Pdf Analysis

NASA Astrophysics Data System (ADS)

López-Sanjuan, C.; Vázquez-Ramió, H.; Varela, J.; Spinoso, D.; Cristóbal-Hornillos, D.; Viironen, K.; Muniesa, D.; J-PLUS Collaboration

2017-10-01

We present a morphological classification of J-PLUS EDR sources into compact (i.e. stars) and extended (i.e. galaxies). Such classification is based on the Bayesian modelling of the concentration distribution, including observational errors and magnitude + sky position priors. We provide the star / galaxy probability of each source computed from the gri images. The comparison with the SDSS number counts support our classification up to r 21. The 31.7 deg² analised comprises 150k stars and 101k galaxies.
Preschool speech error patterns predict articulation and phonological awareness outcomes in children with histories of speech sound disorders

PubMed Central

Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise

2012-01-01

Purpose To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost four years later. Method Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 and followed up at 8;3. The frequency of occurrence of preschool distortion errors, typical substitution and syllable structure errors, and atypical substitution and syllable structure errors were used to predict later speech sound production, PA, and literacy outcomes. Results Group averages revealed below-average school-age articulation scores and low-average PA, but age-appropriate reading and spelling. Preschool speech error patterns were related to school-age outcomes. Children for whom more than 10% of their speech sound errors were atypical had lower PA and literacy scores at school-age than children who produced fewer than 10% atypical errors. Preschoolers who produced more distortion errors were likely to have lower school-age articulation scores. Conclusions Different preschool speech error patterns predict different school-age clinical outcomes. Many atypical speech sound errors in preschool may be indicative of weak phonological representations, leading to long-term PA weaknesses. Preschool distortions may be resistant to change over time, leading to persisting speech sound production problems. PMID:23184137
Halftoning Algorithms and Systems.

DTIC Science & Technology

1996-08-01

TERMS 15. NUMBER IF PAGESi. Halftoning algorithms; error diffusions ; color printing; topographic maps 16. PRICE CODE 17. SECURITY CLASSIFICATION 18...graylevels for each screen level. In the case of error diffusion algorithms, the calibration procedure using the new centering concept manifests itself as a...Novel Centering Concept for Overlapping Correction Paper / Transparency (Patent Applied 5/94)I * Applications To Error Diffusion * To Dithering (IS&T
Simulation techniques for estimating error in the classification of normal patterns

NASA Technical Reports Server (NTRS)

Whitsitt, S. J.; Landgrebe, D. A.

1974-01-01

Methods of efficiently generating and classifying samples with specified multivariate normal distributions were discussed. Conservative confidence tables for sample sizes are given for selective sampling. Simulation results are compared with classified training data. Techniques for comparing error and separability measure for two normal patterns are investigated and used to display the relationship between the error and the Chernoff bound.

Hydrologic classification of rivers based on cluster analysis of dimensionless hydrologic signatures: Applications for environmental instream flows

NASA Astrophysics Data System (ADS)

Praskievicz, S. J.; Luo, C.

2017-12-01

Classification of rivers is useful for a variety of purposes, such as generating and testing hypotheses about watershed controls on hydrology, predicting hydrologic variables for ungaged rivers, and setting goals for river management. In this research, we present a bottom-up (based on machine learning) river classification designed to investigate the underlying physical processes governing rivers' hydrologic regimes. The classification was developed for the entire state of Alabama, based on 248 United States Geological Survey (USGS) stream gages that met criteria for length and completeness of records. Five dimensionless hydrologic signatures were derived for each gage: slope of the flow duration curve (indicator of flow variability), baseflow index (ratio of baseflow to average streamflow), rising limb density (number of rising limbs per unit time), runoff ratio (ratio of long-term average streamflow to long-term average precipitation), and streamflow elasticity (sensitivity of streamflow to precipitation). We used a Bayesian clustering algorithm to classify the gages, based on the five hydrologic signatures, into distinct hydrologic regimes. We then used classification and regression trees (CART) to predict each gaged river's membership in different hydrologic regimes based on climatic and watershed variables. Using existing geospatial data, we applied the CART analysis to classify ungaged streams in Alabama, with the National Hydrography Dataset Plus (NHDPlus) catchment (average area 3 km2) as the unit of classification. The results of the classification can be used for meeting management and conservation objectives in Alabama, such as developing statewide standards for environmental instream flows. Such hydrologic classification approaches are promising for contributing to process-based understanding of river systems.
Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms.

PubMed

Bromuri, Stefano; Zufferey, Damien; Hennebert, Jean; Schumacher, Michael

2014-10-01

This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates to study multi-label medical time series. We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers in two real world datasets. Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as hamming loss, one error, coverage, ranking loss, and average precision. Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches. The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density. Copyright © 2014 Elsevier Inc. All rights reserved.
Feature Selection, Flaring Size and Time-to-Flare Prediction Using Support Vector Regression, and Automated Prediction of Flaring Behavior Based on Spatio-Temporal Measures Using Hidden Markov Models

NASA Astrophysics Data System (ADS)

Al-Ghraibah, Amani

Solar flares release stored magnetic energy in the form of radiation and can have significant detrimental effects on earth including damage to technological infrastructure. Recent work has considered methods to predict future flare activity on the basis of quantitative measures of the solar magnetic field. Accurate advanced warning of solar flare occurrence is an area of increasing concern and much research is ongoing in this area. Our previous work 111] utilized standard pattern recognition and classification techniques to determine (classify) whether a region is expected to flare within a predictive time window, using a Relevance Vector Machine (RVM) classification method. We extracted 38 features which describing the complexity of the photospheric magnetic field, the result classification metrics will provide the baseline against which we compare our new work. We find a true positive rate (TPR) of 0.8, true negative rate (TNR) of 0.7, and true skill score (TSS) of 0.49. This dissertation proposes three basic topics; the first topic is an extension to our previous work [111, where we consider a feature selection method to determine an appropriate feature subset with cross validation classification based on a histogram analysis of selected features. Classification using the top five features resulting from this analysis yield better classification accuracies across a large unbalanced dataset. In particular, the feature subsets provide better discrimination of the many regions that flare where we find a TPR of 0.85, a TNR of 0.65 sightly lower than our previous work, and a TSS of 0.5 which has an improvement comparing with our previous work. In the second topic, we study the prediction of solar flare size and time-to-flare using support vector regression (SVR). When we consider flaring regions only, we find an average error in estimating flare size of approximately half a GOES class. When we additionally consider non-flaring regions, we find an increased average error of approximately 3/4 a GOES class. We also consider thresholding the regressed flare size for the experiment containing both flaring and non-flaring regions and find a TPR. of 0.69 and a TNR of 0.86 for flare prediction, consistent with our previous studies of flare prediction using the same magnetic complexity features. The results for both of these size regression experiments are consistent across a wide range of predictive time windows, indicating that the magnetic complexity features may be persistent in appearance long before flare activity. This conjecture is supported by our larger error rates of some 40 hours in the time-to-flare regression problem. The magnetic complexity features considered here appear to have discriminative potential for flare size, but their persistence in time makes them less discriminative for the time-to-flare problem. We also study the prediction of solar flare size and time-to-flare using two temporal features, namely the ▵- and ▵-▵-features, the same average size and time-to-flare regression error are found when these temporal features are used in size and time-to-flare prediction. In the third topic, we study the temporal evolution of active region magnetic fields using Hidden Markov Models (HMMs) which is one of the efficient temporal analyses found in literature. We extracted 38 features which describing the complexity of the photospheric magnetic field. These features are converted into a sequence of symbols using k-nearest neighbor search method. We study many parameters before prediction; like the length of the training window Wtrain which denotes to the number of history images use to train the flare and non-flare HMMs, and number of hidden states Q. In training phase, the model parameters of the HMM of each category are optimized so as to best describe the training symbol sequences. In testing phase, we use the best flare and non-flare models to predict/classify active regions as a flaring or non-flaring region using a sliding window method. The best prediction result is found where the length of the history training images are 15 images (i.e., Wtrain= 15) and the length of the sliding testing window is less than or equal to W train, the best result give a TPR of 0.79 consistent with previous flare prediction work, TNR of 0.87 arid TSS of 0.66, where both are higher than our previous flare prediction work. We find that the best number of hidden states which can describe the temporal evolution of the solar ARs is equal to five states, at the same time, a close resultant metrics are found using different number of states.
Optimal number of features as a function of sample size for various classification rules.

PubMed

Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R

2005-04-15

Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. For the companion website, please visit http://public.tgen.org/tamu/ofs/ e-dougherty@ee.tamu.edu.
The Human Factors Analysis and Classification System : HFACS : final report.

DOT National Transportation Integrated Search

2000-02-01

Human error has been implicated in 70 to 80% of all civil and military aviation accidents. Yet, most accident reporting systems are not designed around any theoretical framework of human error. As a result, most accident databases are not conducive t...
Accuracy of measurement in electrically evoked compound action potentials.

PubMed

Hey, Matthias; Müller-Deile, Joachim

2015-01-15

Electrically evoked compound action potentials (ECAP) in cochlear implant (CI) patients are characterized by the amplitude of the N1P1 complex. The measurement of evoked potentials yields a combination of the measured signal with various noise components but for ECAP procedures performed in the clinical routine, only the averaged curve is accessible. To date no detailed analysis of error dimension has been published. The aim of this study was to determine the error of the N1P1 amplitude and to determine the factors that impact the outcome. Measurements were performed on 32 CI patients with either CI24RE (CA) or CI512 implants using the Software Custom Sound EP (Cochlear). N1P1 error approximation of non-averaged raw data consisting of recorded single-sweeps was compared to methods of error approximation based on mean curves. The error approximation of the N1P1 amplitude using averaged data showed comparable results to single-point error estimation. The error of the N1P1 amplitude depends on the number of averaging steps and amplification; in contrast, the error of the N1P1 amplitude is not dependent on the stimulus intensity. Single-point error showed smaller N1P1 error and better coincidence with 1/√(N) function (N is the number of measured sweeps) compared to the known maximum-minimum criterion. Evaluation of N1P1 amplitude should be accompanied by indication of its error. The retrospective approximation of this measurement error from the averaged data available in clinically used software is possible and best done utilizing the D-trace in forward masking artefact reduction mode (no stimulation applied and recording contains only the switch-on-artefact). Copyright © 2014 Elsevier B.V. All rights reserved.
A Confidence Paradigm for Classification Systems

DTIC Science & Technology

2008-09-01

methodology to determine how much confi- dence one should have in a classifier output. This research proposes a framework to determine the level of...theoretical framework that attempts to unite the viewpoints of the classification system developer (or engineer) and the classification system user (or...operating point. An algorithm is developed that minimizes a “confidence” measure called Binned Error in the Posterior ( BEP ). Then, we prove that training a
A False Alarm Reduction Method for a Gas Sensor Based Electronic Nose

PubMed Central

Rahman, Mohammad Mizanur; Suksompong, Prapun; Toochinda, Pisanu; Taparugssanagorn, Attaphongse

2017-01-01

Electronic noses (E-Noses) are becoming popular for food and fruit quality assessment due to their robustness and repeated usability without fatigue, unlike human experts. An E-Nose equipped with classification algorithms and having open ended classification boundaries such as the k-nearest neighbor (k-NN), support vector machine (SVM), and multilayer perceptron neural network (MLPNN), are found to suffer from false classification errors of irrelevant odor data. To reduce false classification and misclassification errors, and to improve correct rejection performance; algorithms with a hyperspheric boundary, such as a radial basis function neural network (RBFNN) and generalized regression neural network (GRNN) with a Gaussian activation function in the hidden layer should be used. The simulation results presented in this paper show that GRNN has more correct classification efficiency and false alarm reduction capability compared to RBFNN. As the design of a GRNN and RBFNN is complex and expensive due to large numbers of neuron requirements, a simple hyperspheric classification method based on minimum, maximum, and mean (MMM) values of each class of the training dataset was presented. The MMM algorithm was simple and found to be fast and efficient in correctly classifying data of training classes, and correctly rejecting data of extraneous odors, and thereby reduced false alarms. PMID:28895910
A False Alarm Reduction Method for a Gas Sensor Based Electronic Nose.

PubMed

Rahman, Mohammad Mizanur; Charoenlarpnopparut, Chalie; Suksompong, Prapun; Toochinda, Pisanu; Taparugssanagorn, Attaphongse

2017-09-12

Electronic noses (E-Noses) are becoming popular for food and fruit quality assessment due to their robustness and repeated usability without fatigue, unlike human experts. An E-Nose equipped with classification algorithms and having open ended classification boundaries such as the k -nearest neighbor ( k -NN), support vector machine (SVM), and multilayer perceptron neural network (MLPNN), are found to suffer from false classification errors of irrelevant odor data. To reduce false classification and misclassification errors, and to improve correct rejection performance; algorithms with a hyperspheric boundary, such as a radial basis function neural network (RBFNN) and generalized regression neural network (GRNN) with a Gaussian activation function in the hidden layer should be used. The simulation results presented in this paper show that GRNN has more correct classification efficiency and false alarm reduction capability compared to RBFNN. As the design of a GRNN and RBFNN is complex and expensive due to large numbers of neuron requirements, a simple hyperspheric classification method based on minimum, maximum, and mean (MMM) values of each class of the training dataset was presented. The MMM algorithm was simple and found to be fast and efficient in correctly classifying data of training classes, and correctly rejecting data of extraneous odors, and thereby reduced false alarms.
Grouping patients for masseter muscle genotype-phenotype studies.

PubMed

Moawad, Hadwah Abdelmatloub; Sinanan, Andrea C M; Lewis, Mark P; Hunt, Nigel P

2012-03-01

To use various facial classifications, including either/both vertical and horizontal facial criteria, to assess their effects on the interpretation of masseter muscle (MM) gene expression. Fresh MM biopsies were obtained from 29 patients (age, 16-36 years) with various facial phenotypes. Based on clinical and cephalometric analysis, patients were grouped using three different classifications: (1) basic vertical, (2) basic horizontal, and (3) combined vertical and horizontal. Gene expression levels of the myosin heavy chain genes MYH1, MYH2, MYH3, MYH6, MYH7, and MYH8 were recorded using quantitative reverse transcriptase polymerase chain reaction (RT-PCR) and were related to the various classifications. The significance level for statistical analysis was set at P ≤ .05. Using classification 1, none of the MYH genes were found to be significantly different between long face (LF) patients and the average vertical group. Using classification 2, MYH3, MYH6, and MYH7 genes were found to be significantly upregulated in retrognathic patients compared with prognathic and average horizontal groups. Using classification 3, only the MYH7 gene was found to be significantly upregulated in retrognathic LF compared with prognathic LF, prognathic average vertical faces, and average vertical and horizontal groups. The use of basic vertical or basic horizontal facial classifications may not be sufficient for genetics-based studies of facial phenotypes. Prognathic and retrognathic facial phenotypes have different MM gene expressions; therefore, it is not recommended to combine them into one single group, even though they may have a similar vertical facial phenotype.
Differences in chewing sounds of dry-crisp snacks by multivariate data analysis

NASA Astrophysics Data System (ADS)

De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.

2003-09-01

Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.
Average symbol error rate for M-ary quadrature amplitude modulation in generalized atmospheric turbulence and misalignment errors

NASA Astrophysics Data System (ADS)

Sharma, Prabhat Kumar

2016-11-01

A framework is presented for the analysis of average symbol error rate (SER) for M-ary quadrature amplitude modulation in a free-space optical communication system. The standard probability density function (PDF)-based approach is extended to evaluate the average SER by representing the Q-function through its Meijer's G-function equivalent. Specifically, a converging power series expression for the average SER is derived considering the zero-boresight misalignment errors in the receiver side. The analysis presented here assumes a unified expression for the PDF of channel coefficient which incorporates the M-distributed atmospheric turbulence and Rayleigh-distributed radial displacement for the misalignment errors. The analytical results are compared with the results obtained using Q-function approximation. Further, the presented results are supported by the Monte Carlo simulations.
Consequences of land-cover misclassification in models of impervious surface

USGS Publications Warehouse

McMahon, G.

2007-01-01

Model estimates of impervious area as a function of landcover area may be biased and imprecise because of errors in the land-cover classification. This investigation of the effects of land-cover misclassification on impervious surface models that use National Land Cover Data (NLCD) evaluates the consequences of adjusting land-cover within a watershed to reflect uncertainty assessment information. Model validation results indicate that using error-matrix information to adjust land-cover values used in impervious surface models does not substantially improve impervious surface predictions. Validation results indicate that the resolution of the landcover data (Level I and Level II) is more important in predicting impervious surface accurately than whether the land-cover data have been adjusted using information in the error matrix. Level I NLCD, adjusted for land-cover misclassification, is preferable to the other land-cover options for use in models of impervious surface. This result is tied to the lower classification error rates for the Level I NLCD. ?? 2007 American Society for Photogrammetry and Remote Sensing.
Classification of spatially unresolved objects

NASA Technical Reports Server (NTRS)

Nalepka, R. F.; Horwitz, H. M.; Hyde, P. D.; Morgenstern, J. P.

1972-01-01

A proportion estimation technique for classification of multispectral scanner images is reported that uses data point averaging to extract and compute estimated proportions for a single average data point to classify spatial unresolved areas. Example extraction calculations of spectral signatures for bare soil, weeds, alfalfa, and barley prove quite accurate.
Improved classification accuracy by feature extraction using genetic algorithms

NASA Astrophysics Data System (ADS)

Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.

2003-05-01

A feature extraction algorithm has been developed for the purposes of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence, and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators: crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature extractor derived features yielded lower error rates than using standard pulse sequences, and with features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.
Locally Weighted Score Estimation for Quantile Classification in Binary Regression Models

PubMed Central

Rice, John D.; Taylor, Jeremy M. G.

2016-01-01

One common use of binary response regression methods is classification based on an arbitrary probability threshold dictated by the particular application. Since this is given to us a priori, it is sensible to incorporate the threshold into our estimation procedure. Specifically, for the linear logistic model, we solve a set of locally weighted score equations, using a kernel-like weight function centered at the threshold. The bandwidth for the weight function is selected by cross validation of a novel hybrid loss function that combines classification error and a continuous measure of divergence between observed and fitted values; other possible cross-validation functions based on more common binary classification metrics are also examined. This work has much in common with robust estimation, but diers from previous approaches in this area in its focus on prediction, specifically classification into high- and low-risk groups. Simulation results are given showing the reduction in error rates that can be obtained with this method when compared with maximum likelihood estimation, especially under certain forms of model misspecification. Analysis of a melanoma data set is presented to illustrate the use of the method in practice. PMID:28018492
Error detection and reduction in blood banking.

PubMed

Motschman, T L; Moore, S B

1996-12-01

Error management plays a major role in facility process improvement efforts. By detecting and reducing errors, quality and, therefore, patient care improve. It begins with a strong organizational foundation of management attitude with clear, consistent employee direction and appropriate physical facilities. Clearly defined critical processes, critical activities, and SOPs act as the framework for operations as well as active quality monitoring. To assure that personnel can detect an report errors they must be trained in both operational duties and error management practices. Use of simulated/intentional errors and incorporation of error detection into competency assessment keeps employees practiced, confident, and diminishes fear of the unknown. Personnel can clearly see that errors are indeed used as opportunities for process improvement and not for punishment. The facility must have a clearly defined and consistently used definition for reportable errors. Reportable errors should include those errors with potentially harmful outcomes as well as those errors that are "upstream," and thus further away from the outcome. A well-written error report consists of who, what, when, where, why/how, and follow-up to the error. Before correction can occur, an investigation to determine the underlying cause of the error should be undertaken. Obviously, the best corrective action is prevention. Correction can occur at five different levels; however, only three of these levels are directed at prevention. Prevention requires a method to collect and analyze data concerning errors. In the authors' facility a functional error classification method and a quality system-based classification have been useful. An active method to search for problems uncovers them further upstream, before they can have disastrous outcomes. In the continual quest for improving processes, an error management program is itself a process that needs improvement, and we must strive to always close the circle of quality assurance. Ultimately, the goal of better patient care will be the reward.
Slopeland utilizable limitation classification using landslide inventory

NASA Astrophysics Data System (ADS)

Tsai, Shu Fen; Lin, Chao Yuan

2016-04-01

In 1976, "Slopeland Conservation and Utilization Act" was promulgated as well as the criteria for slopeland utilization limitation classification (SULC) i.e., average slope, effective soil depth, degree of soil erosion, and parent rock became standardized. Due to the development areas on slope land steadily increased and the extreme rainfall events occurred frequently, the areas affected by landslides also increased year by year. According to the act, the land which damaged by disaster must be categorized to the conservation land and required rehabilitation. Nevertheless, the large-scale disaster on slope land and the limitation of SWCB officers are the constraint of field investigation. Therefore, how to establish the ongoing inspective procedure of post-disaster SULC using remote sensing was essential. A-Li-Shan, Ai-Liao, and Tai-Ma-Li Watershed were selected to be case studies in this project. The spatial data from big data i.e., Digital Elevation Model (DEM), soil map, and satellite images integrated with Geographic Information Systems (GIS) were applied to post-disaster SULC. The collapse and deposition area which delineated by vegetation recovery rate was established landslide inventory of cadastral unit combined with watershed unit. The results were verified with field survey and the accuracy was 97%. The landslide inventory could be an effective reference for sediment disaster investigation and a practical evidence for judgement to expropriation. Finally, the results showed that the ongoing inspective procedure of post-disaster SULC was practicable. From the four criteria, the average slope was the major factor. It was found that the non-uniform slopes, especially derived from cadastral units, often produce significant slope difference and lead to errors of average slope evaluation. Therefore, the Grid-based DEM slope derivation has been recommended as the standard method to calculate the average slope. Others criteria were previously required to classify the farm land tax. However, as a result of environmental change and advancements in farm machinery, it seems that those criteria were further inappropriate criteria for agricultural land. In conclusion, soil and water conservation works, which were enhanced to disaster prevention under climate change, should reconsider the SULC criteria. The average slope from DEM derivation and the sediment disaster from landslide inventory were suggested and adequate for SULC.
Effects of error covariance structure on estimation of model averaging weights and predictive performance

USGS Publications Warehouse

Lu, Dan; Ye, Ming; Meyer, Philip D.; Curtis, Gary P.; Shi, Xiaoqing; Niu, Xu-Feng; Yabusaki, Steve B.

2013-01-01

When conducting model averaging for assessing groundwater conceptual model uncertainty, the averaging weights are often evaluated using model selection criteria such as AIC, AICc, BIC, and KIC (Akaike Information Criterion, Corrected Akaike Information Criterion, Bayesian Information Criterion, and Kashyap Information Criterion, respectively). However, this method often leads to an unrealistic situation in which the best model receives overwhelmingly large averaging weight (close to 100%), which cannot be justified by available data and knowledge. It was found in this study that this problem was caused by using the covariance matrix, CE, of measurement errors for estimating the negative log likelihood function common to all the model selection criteria. This problem can be resolved by using the covariance matrix, Cek, of total errors (including model errors and measurement errors) to account for the correlation between the total errors. An iterative two-stage method was developed in the context of maximum likelihood inverse modeling to iteratively infer the unknown Cek from the residuals during model calibration. The inferred Cek was then used in the evaluation of model selection criteria and model averaging weights. While this method was limited to serial data using time series techniques in this study, it can be extended to spatial data using geostatistical techniques. The method was first evaluated in a synthetic study and then applied to an experimental study, in which alternative surface complexation models were developed to simulate column experiments of uranium reactive transport. It was found that the total errors of the alternative models were temporally correlated due to the model errors. The iterative two-stage method using Cekresolved the problem that the best model receives 100% model averaging weight, and the resulting model averaging weights were supported by the calibration results and physical understanding of the alternative models. Using Cek obtained from the iterative two-stage method also improved predictive performance of the individual models and model averaging in both synthetic and experimental studies.
Modified Bat Algorithm for Feature Selection with the Wisconsin Diagnosis Breast Cancer (WDBC) Dataset

PubMed

Jeyasingh, Suganthi; Veluchamy, Malathi

2017-05-01

Early diagnosis of breast cancer is essential to save lives of patients. Usually, medical datasets include a large variety of data that can lead to confusion during diagnosis. The Knowledge Discovery on Database (KDD) process helps to improve efficiency. It requires elimination of inappropriate and repeated data from the dataset before final diagnosis. This can be done using any of the feature selection algorithms available in data mining. Feature selection is considered as a vital step to increase the classification accuracy. This paper proposes a Modified Bat Algorithm (MBA) for feature selection to eliminate irrelevant features from an original dataset. The Bat algorithm was modified using simple random sampling to select the random instances from the dataset. Ranking was with the global best features to recognize the predominant features available in the dataset. The selected features are used to train a Random Forest (RF) classification algorithm. The MBA feature selection algorithm enhanced the classification accuracy of RF in identifying the occurrence of breast cancer. The Wisconsin Diagnosis Breast Cancer Dataset (WDBC) was used for estimating the performance analysis of the proposed MBA feature selection algorithm. The proposed algorithm achieved better performance in terms of Kappa statistic, Mathew’s Correlation Coefficient, Precision, F-measure, Recall, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE). Creative Commons Attribution License

Simulated rRNA/DNA Ratios Show Potential To Misclassify Active Populations as Dormant

DOE Office of Scientific and Technical Information (OSTI.GOV)

Steven, Blaire; Hesse, Cedar; Soghigian, John

The use of rRNA/DNA ratios derived from surveys of rRNA sequences in RNA and DNA extracts is an appealing but poorly validated approach to infer the activity status of environmental microbes. To improve the interpretation of rRNA/DNA ratios, we performed simulations to investigate the effects of community structure, rRNA amplification, and sampling depth on the accuracy of rRNA/DNA ratios in classifying bacterial populations as “active” or “dormant.” Community structure was an insignificant factor. In contrast, the extent of rRNA amplification that occurs as cells transition from dormant to growing had a significant effect (P < 0.0001) on classification accuracy, withmore » misclassification errors ranging from 16 to 28%, depending on the rRNA amplification model. The error rate increased to 47% when communities included a mixture of rRNA amplification models, but most of the inflated error was false negatives (i.e., active populations misclassified as dormant). Sampling depth also affected error rates (P < 0.001). Inadequate sampling depth produced various artifacts that are characteristic of rRNA/DNA ratios generated from real communities. These data show important constraints on the use of rRNA/DNA ratios to infer activity status. Whereas classification of populations as active based on rRNA/DNA ratios appears generally valid, classification of populations as dormant is potentially far less accurate.« less
Simulated rRNA/DNA Ratios Show Potential To Misclassify Active Populations as Dormant

DOE PAGES

Steven, Blaire; Hesse, Cedar; Soghigian, John; ...

2017-03-31

The use of rRNA/DNA ratios derived from surveys of rRNA sequences in RNA and DNA extracts is an appealing but poorly validated approach to infer the activity status of environmental microbes. To improve the interpretation of rRNA/DNA ratios, we performed simulations to investigate the effects of community structure, rRNA amplification, and sampling depth on the accuracy of rRNA/DNA ratios in classifying bacterial populations as “active” or “dormant.” Community structure was an insignificant factor. In contrast, the extent of rRNA amplification that occurs as cells transition from dormant to growing had a significant effect (P < 0.0001) on classification accuracy, withmore » misclassification errors ranging from 16 to 28%, depending on the rRNA amplification model. The error rate increased to 47% when communities included a mixture of rRNA amplification models, but most of the inflated error was false negatives (i.e., active populations misclassified as dormant). Sampling depth also affected error rates (P < 0.001). Inadequate sampling depth produced various artifacts that are characteristic of rRNA/DNA ratios generated from real communities. These data show important constraints on the use of rRNA/DNA ratios to infer activity status. Whereas classification of populations as active based on rRNA/DNA ratios appears generally valid, classification of populations as dormant is potentially far less accurate.« less
77 FR 59989 - Labor Surplus Area Classification Under Executive Orders

Federal Register 2010, 2011, 2012, 2013, 2014

2012-10-01

... DEPARTMENT OF LABOR Employment and Training Administration Labor Surplus Area Classification Under... Bureau of Labor Statistics are used in making these classifications. The average unemployment rate for all states includes data for the Commonwealth of Puerto Rico. The basic LSA classification criteria...
An Analysis of U.S. Army Fratricide Incidents during the Global War on Terror (11 September 2001 to 31 March 2008)

DTIC Science & Technology

2010-03-15

Swiss cheese model of human error causation. ................................................................... 3 2. Results for the classification of...based on Reason’s “ Swiss cheese ” model of human error (1990). Figure 1 describes how an accident is likely to occur when all of the errors, or “holes...align. A detailed description of HFACS can be found in Wiegmann and Shappell (2003). Figure 1. The Swiss cheese model of human error
Segmentation of anterior cruciate ligament in knee MR images using graph cuts with patient-specific shape constraints and label refinement.

PubMed

Lee, Hansang; Hong, Helen; Kim, Junmo

2014-12-01

We propose a graph-cut-based segmentation method for the anterior cruciate ligament (ACL) in knee MRI with a novel shape prior and label refinement. As the initial seeds for graph cuts, candidates for the ACL and the background are extracted from knee MRI roughly by means of adaptive thresholding with Gaussian mixture model fitting. The extracted ACL candidate is segmented iteratively by graph cuts with patient-specific shape constraints. Two shape constraints termed fence and neighbor costs are suggested such that the graph cuts prevent any leakage into adjacent regions with similar intensity. The segmented ACL label is refined by means of superpixel classification. Superpixel classification makes the segmented label propagate into missing inhomogeneous regions inside the ACL. In the experiments, the proposed method segmented the ACL with Dice similarity coefficient of 66.47±7.97%, average surface distance of 2.247±0.869, and root mean squared error of 3.538±1.633, which increased the accuracy by 14.8%, 40.3%, and 37.6% from the Boykov model, respectively. Copyright © 2014 Elsevier Ltd. All rights reserved.
Feature extraction using extrema sampling of discrete derivatives for spike sorting in implantable upper-limb neural prostheses.

PubMed

Zamani, Majid; Demosthenous, Andreas

2014-07-01

Next generation neural interfaces for upper-limb (and other) prostheses aim to develop implantable interfaces for one or more nerves, each interface having many neural signal channels that work reliably in the stump without harming the nerves. To achieve real-time multi-channel processing it is important to integrate spike sorting on-chip to overcome limitations in transmission bandwidth. This requires computationally efficient algorithms for feature extraction and clustering suitable for low-power hardware implementation. This paper describes a new feature extraction method for real-time spike sorting based on extrema analysis (namely positive peaks and negative peaks) of spike shapes and their discrete derivatives at different frequency bands. Employing simulation across different datasets, the accuracy and computational complexity of the proposed method are assessed and compared with other methods. The average classification accuracy of the proposed method in conjunction with online sorting (O-Sort) is 91.6%, outperforming all the other methods tested with the O-Sort clustering algorithm. The proposed method offers a better tradeoff between classification error and computational complexity, making it a particularly strong choice for on-chip spike sorting.
Intermittent Demand Forecasting in a Tertiary Pediatric Intensive Care Unit.

PubMed

Cheng, Chen-Yang; Chiang, Kuo-Liang; Chen, Meng-Yin

2016-10-01

Forecasts of the demand for medical supplies both directly and indirectly affect the operating costs and the quality of the care provided by health care institutions. Specifically, overestimating demand induces an inventory surplus, whereas underestimating demand possibly compromises patient safety. Uncertainty in forecasting the consumption of medical supplies generates intermittent demand events. The intermittent demand patterns for medical supplies are generally classified as lumpy, erratic, smooth, and slow-moving demand. This study was conducted with the purpose of advancing a tertiary pediatric intensive care unit's efforts to achieve a high level of accuracy in its forecasting of the demand for medical supplies. On this point, several demand forecasting methods were compared in terms of the forecast accuracy of each. The results confirm that applying Croston's method combined with a single exponential smoothing method yields the most accurate results for forecasting lumpy, erratic, and slow-moving demand, whereas the Simple Moving Average (SMA) method is the most suitable for forecasting smooth demand. In addition, when the classification of demand consumption patterns were combined with the demand forecasting models, the forecasting errors were minimized, indicating that this classification framework can play a role in improving patient safety and reducing inventory management costs in health care institutions.
Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis

NASA Astrophysics Data System (ADS)

Chang, C. H.; Chan, H. C.; Chen, B. A.

2016-12-01

Classification of effective soil depth is a task of determining the slopeland utilizable limitation in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes the slopeland into agriculture and husbandry land, land suitable for forestry and land for enhanced conservation according to the factors including average slope, effective soil depth, soil erosion and parental rock. However, sit investigation of the effective soil depth requires a cost-effective field work. This research aimed to classify the effective soil depth by using multinomial logistic regression with the environmental factors. The Wen-Shui Watershed located at the central Taiwan was selected as the study areas. The analysis of multinomial logistic regression is performed by the assistance of a Geographic Information Systems (GIS). The effective soil depth was categorized into four levels including deeper, deep, shallow and shallower. The environmental factors of slope, aspect, digital elevation model (DEM), curvature and normalized difference vegetation index (NDVI) were selected for classifying the soil depth. An Error Matrix was then used to assess the model accuracy. The results showed an overall accuracy of 75%. At the end, a map of effective soil depth was produced to help planners and decision makers in determining the slopeland utilizable limitation in the study areas.
Caracterisation des occupations du sol en milieu urbain par imagerie radar

NASA Astrophysics Data System (ADS)

Codjia, Claude

This study aims to test the relevance of medium and high-resolution SAR images on the characterization of the types of land use in urban areas. To this end, we have relied on textural approaches based on second-order statistics. Specifically, we look for texture parameters most relevant for discriminating urban objects. We have used in this regard Radarsat-1 in fine polarization mode and Radarsat-2 HH fine mode in dual and quad polarization and ultrafine mode HH polarization. The land uses sought were dense building, medium density building, low density building, industrial and institutional buildings, low density vegetation, dense vegetation and water. We have identified nine texture parameters for analysis, grouped into families according to their mathematical definitions in a first step. The parameters of similarity / dissimilarity include Homogeneity, Contrast, the Differential Inverse Moment and Dissimilarity. The parameters of disorder are Entropy and the Second Angular Momentum. The Standard Deviation and Correlation are the dispersion parameters and the Average is a separate family. It is clear from experience that certain combinations of texture parameters from different family used in classifications yield good results while others produce kappa of very little interest. Furthermore, we realize that if the use of several texture parameters improves classifications, its performance ceils from three parameters. The calculation of correlations between the textures and their principal axes confirm the results. Despite the good performance of this approach based on the complementarity of texture parameters, systematic errors due to the cardinal effects remain on classifications. To overcome this problem, a radiometric compensation model was developed based on the radar cross section (SER). A radar simulation from the digital surface model of the environment allowed us to extract the building backscatter zones and to analyze the related backscatter. Thus, we were able to devise a strategy of compensation of cardinal effects solely based on the responses of the objects according to their orientation from the plane of illumination through the radar's beam. It appeared that a compensation algorithm based on the radar cross section was appropriate. Some examples of the application of this algorithm on HH polarized RADARSAT-2 images are presented as well. Application of this algorithm will allow considerable gains with regard to certain forms of automation (classification and segmentation) at the level of radar imagery thus generating a higher level of quality in regard to visual interpretation. Application of this algorithm on RADARSAT-1 and RADARSAT-2 images with HH, HV, VH, and VV polarisations helped make considerable gains and eliminate most of the classification errors due to the cardinal effects.
78 FR 63248 - Labor Surplus Area Classification under Executive Orders 12073 and 10582

Federal Register 2010, 2011, 2012, 2013, 2014

2013-10-23

... DEPARTMENT OF LABOR Employment and Training Administration Labor Surplus Area Classification under... Statistics unemployment estimates to make these classifications. The average unemployment rate for all states includes data for the Commonwealth of Puerto Rico. The basic LSA classification criteria include a ``floor...
Average Likelihood Methods of Classification of Code Division Multiple Access (CDMA)

DTIC Science & Technology

2016-05-01

case of cognitive radio applications. Modulation classification is part of a broader problem known as blind or uncooperative demodulation the goal of...Introduction 2 2.1 Modulation Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 2.2 Research Objectives...6 3 Modulation Classification Methods 7 3.0.1 Ad Hoc
Object-Based Land Use Classification of Agricultural Land by Coupling Multi-Temporal Spectral Characteristics and Phenological Events in Germany

NASA Astrophysics Data System (ADS)

Knoefel, Patrick; Loew, Fabian; Conrad, Christopher

2015-04-01

Crop maps based on classification of remotely sensed data are of increased attendance in agricultural management. This induces a more detailed knowledge about the reliability of such spatial information. However, classification of agricultural land use is often limited by high spectral similarities of the studied crop types. More, spatially and temporally varying agro-ecological conditions can introduce confusion in crop mapping. Classification errors in crop maps in turn may have influence on model outputs, like agricultural production monitoring. One major goal of the PhenoS project ("Phenological structuring to determine optimal acquisition dates for Sentinel-2 data for field crop classification"), is the detection of optimal phenological time windows for land cover classification purposes. Since many crop species are spectrally highly similar, accurate classification requires the right selection of satellite images for a certain classification task. In the course of one growing season, phenological phases exist where crops are separable with higher accuracies. For this purpose, coupling of multi-temporal spectral characteristics and phenological events is promising. The focus of this study is set on the separation of spectrally similar cereal crops like winter wheat, barley, and rye of two test sites in Germany called "Harz/Central German Lowland" and "Demmin". However, this study uses object based random forest (RF) classification to investigate the impact of image acquisition frequency and timing on crop classification uncertainty by permuting all possible combinations of available RapidEye time series recorded on the test sites between 2010 and 2014. The permutations were applied to different segmentation parameters. Then, classification uncertainty was assessed and analysed, based on the probabilistic soft-output from the RF algorithm at the per-field basis. From this soft output, entropy was calculated as a spatial measure of classification uncertainty. The results indicate that uncertainty estimates provide a valuable addition to traditional accuracy assessments and helps the user to allocate error in crop maps.
A new classification of glaucomas

PubMed Central

Bordeianu, Constantin-Dan

2014-01-01

Purpose To suggest a new glaucoma classification that is pathogenic, etiologic, and clinical. Methods After discussing the logical pathway used in criteria selection, the paper presents the new classification and compares it with the classification currently in use, that is, the one issued by the European Glaucoma Society in 2008. Results The paper proves that the new classification is clear (being based on a coherent and consistently followed set of criteria), is comprehensive (framing all forms of glaucoma), and helps in understanding the sickness understanding (in that it uses a logical framing system). The great advantage is that it facilitates therapeutic decision making in that it offers direct therapeutic suggestions and avoids errors leading to disasters. Moreover, the scheme remains open to any new development. Conclusion The suggested classification is a pathogenic, etiologic, and clinical classification that fulfills the conditions of an ideal classification. The suggested classification is the first classification in which the main criterion is consistently used for the first 5 to 7 crossings until its differentiation capabilities are exhausted. Then, secondary criteria (etiologic and clinical) pick up the relay until each form finds its logical place in the scheme. In order to avoid unclear aspects, the genetic criterion is no longer used, being replaced by age, one of the clinical criteria. The suggested classification brings only benefits to all categories of ophthalmologists: the beginners will have a tool to better understand the sickness and to ease their decision making, whereas the experienced doctors will have their practice simplified. For all doctors, errors leading to therapeutic disasters will be less likely to happen. Finally, researchers will have the object of their work gathered in the group of glaucoma with unknown or uncertain pathogenesis, whereas the results of their work will easily find a logical place in the scheme, as the suggested classification remains open to any new development. PMID:25246759
Intervention program efficacy for spelling difficulties.

PubMed

Sampaio, Maria Nobre; Capellini, Simone Aparecida

2014-01-01

To develop an intervention procedure for spelling difficulties and to verify the effectiveness of the intervention program in students with lower spelling performance. We developed an intervention program for spelling difficulties, according to the semiology of the errors. The program consisted of three modules totaling 16 sessions. The study included 40 students of the third to fifth grade of public elementary education of the city of Marilia (SP), of both genders, in aged of eight to 12 years old, being distributed in the following groups: GI (20 students with lower spelling performance) and GII (20 students with higher spelling performance). In situation of pre and post-testing, all groups were submitted to the Pro-Orthography. The results statistically analyzed showed that, in general, all groups had average of right that has higher in post-testing, reducing the types of errors second semiologycal classification, mainly related to natural spelling errors. However, the results also showed that the groups submitted to the intervention program showed better performance on spelling tests in relation to not submitted. The intervention program developed was effective once the groups submitted showed better performance on spelling tests in relation to not submitted. Therefore, the intervention program can help professionals in the Health and Education to minimize the problems related to spelling, giving students an intervention that is effective for the development of the spelling knowledge.
Prediction of protein mutant stability using classification and regression tool.

PubMed

Huang, Liang-Tsung; Saraboji, K; Ho, Shinn-Ying; Hwang, Shiow-Fen; Ponnuswamy, M N; Gromiha, M Michael

2007-02-01

Prediction of protein stability upon amino acid substitutions is an important problem in molecular biology and the solving of which would help for designing stable mutants. In this work, we have analyzed the stability of protein mutants using two different datasets of 1396 and 2204 mutants obtained from ProTherm database, respectively for free energy change due to thermal (DeltaDeltaG) and denaturant denaturations (DeltaDeltaG(H(2)O)). We have used a set of 48 physical, chemical energetic and conformational properties of amino acid residues and computed the difference of amino acid properties for each mutant in both sets of data. These differences in amino acid properties have been related to protein stability (DeltaDeltaG and DeltaDeltaG(H(2)O)) and are used to train with classification and regression tool for predicting the stability of protein mutants. Further, we have tested the method with 4 fold, 5 fold and 10 fold cross validation procedures. We found that the physical properties, shape and flexibility are important determinants of protein stability. The classification of mutants based on secondary structure (helix, strand, turn and coil) and solvent accessibility (buried, partially buried, partially exposed and exposed) distinguished the stabilizing/destabilizing mutants at an average accuracy of 81% and 80%, respectively for DeltaDeltaG and DeltaDeltaG(H(2)O). The correlation between the experimental and predicted stability change is 0.61 for DeltaDeltaG and 0.44 for DeltaDeltaG(H(2)O). Further, the free energy change due to the replacement of amino acid residue has been predicted within an average error of 1.08 kcal/mol and 1.37 kcal/mol for thermal and chemical denaturation, respectively. The relative importance of secondary structure and solvent accessibility, and the influence of the dataset on prediction of protein mutant stability have been discussed.
Classification of burn wounds using support vector machines

NASA Astrophysics Data System (ADS)

Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose

2004-05-01

The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds into their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted in segmenting the burn wound from the rest of the image and classifying the burn into its depth. In this paper we focus on the classification problem only. We already proposed to use a Fuzzy-ARTMAP neural network (NN). However, we may take advantage of new powerful classification tools such as Support Vector Machines (SVM). We apply the five-folded cross validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which will give us the set of features that yields the smallest classification error for each classifier. Features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As data of the problem faced here are not linearly separable, the SVM was trained using some different kernels. The validating process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms classification results obtained with the rest of the classifiers, yielding an error classification rate of 0.7% whereas the Fuzzy-ARTMAP NN attained 1.6 %.
Comparative assessment of LANDSAT-D MSS and TM data quality for mapping applications in the Southeast

NASA Technical Reports Server (NTRS)

1984-01-01

Rectifications of multispectral scanner and thematic mapper data sets for full and subscene areas, analyses of planimetric errors, assessments of the number and distribution of ground control points required to minimize errors, and factors contributing to error residual are examined. Other investigations include the generation of three dimensional terrain models and the effects of spatial resolution on digital classification accuracies.
Influence of ECG measurement accuracy on ECG diagnostic statements.

PubMed

Zywietz, C; Celikag, D; Joseph, G

1996-01-01

Computer analysis of electrocardiograms (ECGs) provides a large amount of ECG measurement data, which may be used for diagnostic classification and storage in ECG databases. Until now, neither error limits for ECG measurements have been specified nor has their influence on diagnostic statements been systematically investigated. An analytical method is presented to estimate the influence of measurement errors on the accuracy of diagnostic ECG statements. Systematic (offset) errors will usually result in an increase of false positive or false negative statements since they cause a shift of the working point on the receiver operating characteristics curve. Measurement error dispersion broadens the distribution function of discriminative measurement parameters and, therefore, usually increases the overlap between discriminative parameters. This results in a flattening of the receiver operating characteristics curve and an increase of false positive and false negative classifications. The method developed has been applied to ECG conduction defect diagnoses by using the proposed International Electrotechnical Commission's interval measurement tolerance limits. These limits appear too large because more than 30% of false positive atrial conduction defect statements and 10-18% of false intraventricular conduction defect statements could be expected due to tolerated measurement errors. To assure long-term usability of ECG measurement databases, it is recommended that systems provide its error tolerance limits obtained on a defined test set.
Decision support system for determining the contact lens for refractive errors patients with classification ID3

NASA Astrophysics Data System (ADS)

Situmorang, B. H.; Setiawan, M. P.; Tosida, E. T.

2017-01-01

Refractive errors are abnormalities of the refraction of light so that the shadows do not focus precisely on the retina resulting in blurred vision [1]. Refractive errors causing the patient should wear glasses or contact lenses in order eyesight returned to normal. The use of glasses or contact lenses in a person will be different from others, it is influenced by patient age, the amount of tear production, vision prescription, and astigmatic. Because the eye is one organ of the human body is very important to see, then the accuracy in determining glasses or contact lenses which will be used is required. This research aims to develop a decision support system that can produce output on the right contact lenses for refractive errors patients with a value of 100% accuracy. Iterative Dichotomize Three (ID3) classification methods will generate gain and entropy values of attributes that include code sample data, age of the patient, astigmatic, the ratio of tear production, vision prescription, and classes that will affect the outcome of the decision tree. The eye specialist test result for the training data obtained the accuracy rate of 96.7% and an error rate of 3.3%, the result test using confusion matrix obtained the accuracy rate of 96.1% and an error rate of 3.1%; for the data testing obtained accuracy rate of 100% and an error rate of 0.
The generalization ability of online SVM classification based on Markov sampling.

PubMed

Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

2015-03-01

In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

Aspen, climate, and sudden decline in western USA

Treesearch

Gerald E. Rehfeldt; Dennis E. Ferguson; Nicholas L. Crookston

2009-01-01

A bioclimate model predicting the presence or absence of aspen, Populus tremuloides, in western USA from climate variables was developed by using the Random Forests classification tree on Forest Inventory data from about 118,000 permanent sample plots. A reasonably parsimonious model used eight predictors to describe aspen's climate profile. Classification errors...
A framework for software fault tolerance in real-time systems

NASA Technical Reports Server (NTRS)

Anderson, T.; Knight, J. C.

1983-01-01

A classification scheme for errors and a technique for the provision of software fault tolerance in cyclic real-time systems is presented. The technique requires that the process structure of a system be represented by a synchronization graph which is used by an executive as a specification of the relative times at which they will communicate during execution. Communication between concurrent processes is severely limited and may only take place between processes engaged in an exchange. A history of error occurrences is maintained by an error handler. When an error is detected, the error handler classifies it using the error history information and then initiates appropriate recovery action.
The random coding bound is tight for the average code.

NASA Technical Reports Server (NTRS)

Gallager, R. G.

1973-01-01

The random coding bound of information theory provides a well-known upper bound to the probability of decoding error for the best code of a given rate and block length. The bound is constructed by upperbounding the average error probability over an ensemble of codes. The bound is known to give the correct exponential dependence of error probability on block length for transmission rates above the critical rate, but it gives an incorrect exponential dependence at rates below a second lower critical rate. Here we derive an asymptotic expression for the average error probability over the ensemble of codes used in the random coding bound. The result shows that the weakness of the random coding bound at rates below the second critical rate is due not to upperbounding the ensemble average, but rather to the fact that the best codes are much better than the average at low rates.
Software platform for managing the classification of error- related potentials of observers

NASA Astrophysics Data System (ADS)

Asvestas, P.; Ventouras, E.-C.; Kostopoulos, S.; Sidiropoulos, K.; Korfiatis, V.; Korda, A.; Uzunolglu, A.; Karanasiou, I.; Kalatzis, I.; Matsopoulos, G.

2015-09-01

Human learning is partly based on observation. Electroencephalographic recordings of subjects who perform acts (actors) or observe actors (observers), contain a negative waveform in the Evoked Potentials (EPs) of the actors that commit errors and of observers who observe the error-committing actors. This waveform is called the Error-Related Negativity (ERN). Its detection has applications in the context of Brain-Computer Interfaces. The present work describes a software system developed for managing EPs of observers, with the aim of classifying them into observations of either correct or incorrect actions. It consists of an integrated platform for the storage, management, processing and classification of EPs recorded during error-observation experiments. The system was developed using C# and the following development tools and frameworks: MySQL, .NET Framework, Entity Framework and Emgu CV, for interfacing with the machine learning library of OpenCV. Up to six features can be computed per EP recording per electrode. The user can select among various feature selection algorithms and then proceed to train one of three types of classifiers: Artificial Neural Networks, Support Vector Machines, k-nearest neighbour. Next the classifier can be used for classifying any EP curve that has been inputted to the database.
Estimation of open water evaporation using land-based meteorological data

NASA Astrophysics Data System (ADS)

Li, Fawen; Zhao, Yong

2017-10-01

Water surface evaporation is an important process in the hydrologic and energy cycles. Accurate simulation of water evaporation is important for the evaluation of water resources. In this paper, using meteorological data from the Aixinzhuang reservoir, the main factors affecting water surface evaporation were determined by the principal component analysis method. To illustrate the influence of these factors on water surface evaporation, the paper first adopted the Dalton model to simulate water surface evaporation. The results showed that the simulation precision was poor for the peak value zone. To improve the model simulation's precision, a modified Dalton model considering relative humidity was proposed. The results show that the 10-day average relative error is 17.2%, assessed as qualified; the monthly average relative error is 12.5%, assessed as qualified; and the yearly average relative error is 3.4%, assessed as excellent. To validate its applicability, the meteorological data of Kuancheng station in the Luan River basin were selected to test the modified model. The results show that the 10-day average relative error is 15.4%, assessed as qualified; the monthly average relative error is 13.3%, assessed as qualified; and the yearly average relative error is 6.0%, assessed as good. These results showed that the modified model had good applicability and versatility. The research results can provide technical support for the calculation of water surface evaporation in northern China or similar regions.
Satellite inventory of Minnesota forest resources

NASA Technical Reports Server (NTRS)

Bauer, Marvin E.; Burk, Thomas E.; Ek, Alan R.; Coppin, Pol R.; Lime, Stephen D.; Walsh, Terese A.; Walters, David K.; Befort, William; Heinzen, David F.

1993-01-01

The methods and results of using Landsat Thematic Mapper (TM) data to classify and estimate the acreage of forest covertypes in northeastern Minnesota are described. Portions of six TM scenes covering five counties with a total area of 14,679 square miles were classified into six forest and five nonforest classes. The approach involved the integration of cluster sampling, image processing, and estimation. Using cluster sampling, 343 plots, each 88 acres in size, were photo interpreted and field mapped as a source of reference data for classifier training and calibration of the TM data classifications. Classification accuracies of up to 75 percent were achieved; most misclassification was between similar or related classes. An inverse method of calibration, based on the error rates obtained from the classifications of the cluster plots, was used to adjust the classification class proportions for classification errors. The resulting area estimates for total forest land in the five-county area were within 3 percent of the estimate made independently by the USDA Forest Service. Area estimates for conifer and hardwood forest types were within 0.8 and 6.0 percent respectively, of the Forest Service estimates. A trial of a second method of estimating the same classes as the Forest Service resulted in standard errors of 0.002 to 0.015. A study of the use of multidate TM data for change detection showed that forest canopy depletion, canopy increment, and no change could be identified with greater than 90 percent accuracy. The project results have been the basis for the Minnesota Department of Natural Resources and the Forest Service to define and begin to implement an annual system of forest inventory which utilizes Landsat TM data to detect changes in forest cover.
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines

PubMed Central

del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano

2015-01-01

Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

PubMed

del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

2015-06-17

Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.
Robust Averaging of Covariances for EEG Recordings Classification in Motor Imagery Brain-Computer Interfaces.

PubMed

Uehara, Takashi; Sartori, Matteo; Tanaka, Toshihisa; Fiori, Simone

2017-06-01

The estimation of covariance matrices is of prime importance to analyze the distribution of multivariate signals. In motor imagery-based brain-computer interfaces (MI-BCI), covariance matrices play a central role in the extraction of features from recorded electroencephalograms (EEGs); therefore, correctly estimating covariance is crucial for EEG classification. This letter discusses algorithms to average sample covariance matrices (SCMs) for the selection of the reference matrix in tangent space mapping (TSM)-based MI-BCI. Tangent space mapping is a powerful method of feature extraction and strongly depends on the selection of a reference covariance matrix. In general, the observed signals may include outliers; therefore, taking the geometric mean of SCMs as the reference matrix may not be the best choice. In order to deal with the effects of outliers, robust estimators have to be used. In particular, we discuss and test the use of geometric medians and trimmed averages (defined on the basis of several metrics) as robust estimators. The main idea behind trimmed averages is to eliminate data that exhibit the largest distance from the average covariance calculated on the basis of all available data. The results of the experiments show that while the geometric medians show little differences from conventional methods in terms of classification accuracy in the classification of electroencephalographic recordings, the trimmed averages show significant improvement for all subjects.
Combining forecast weights: Why and how?

NASA Astrophysics Data System (ADS)

Yin, Yip Chee; Kok-Haur, Ng; Hock-Eam, Lim

2012-09-01

This paper proposes a procedure called forecast weight averaging which is a specific combination of forecast weights obtained from different methods of constructing forecast weights for the purpose of improving the accuracy of pseudo out of sample forecasting. It is found that under certain specified conditions, forecast weight averaging can lower the mean squared forecast error obtained from model averaging. In addition, we show that in a linear and homoskedastic environment, this superior predictive ability of forecast weight averaging holds true irrespective whether the coefficients are tested by t statistic or z statistic provided the significant level is within the 10% range. By theoretical proofs and simulation study, we have shown that model averaging like, variance model averaging, simple model averaging and standard error model averaging, each produces mean squared forecast error larger than that of forecast weight averaging. Finally, this result also holds true marginally when applied to business and economic empirical data sets, Gross Domestic Product (GDP growth rate), Consumer Price Index (CPI) and Average Lending Rate (ALR) of Malaysia.
Application of texture analysis method for classification of benign and malignant thyroid nodules in ultrasound images.

PubMed

Abbasian Ardakani, Ali; Gharbali, Akbar; Mohammadi, Afshin

2015-01-01

The aim of this study was to evaluate computer aided diagnosis (CAD) system with texture analysis (TA) to improve radiologists' accuracy in identification of thyroid nodules as malignant or benign. A total of 70 cases (26 benign and 44 malignant) were analyzed in this study. We extracted up to 270 statistical texture features as a descriptor for each selected region of interests (ROIs) in three normalization schemes (default, 3s and 1%-99%). Then features by the lowest probability of classification error and average correlation coefficients (POE+ACC), and Fisher coefficient (Fisher) eliminated to 10 best and most effective features. These features were analyzed under standard and nonstandard states. For TA of the thyroid nodules, Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Non-Linear Discriminant Analysis (NDA) were applied. First Nearest-Neighbour (1-NN) classifier was performed for the features resulting from PCA and LDA. NDA features were classified by artificial neural network (A-NN). Receiver operating characteristic (ROC) curve analysis was used for examining the performance of TA methods. The best results were driven in 1-99% normalization with features extracted by POE+ACC algorithm and analyzed by NDA with the area under the ROC curve ( Az) of 0.9722 which correspond to sensitivity of 94.45%, specificity of 100%, and accuracy of 97.14%. Our results indicate that TA is a reliable method, can provide useful information help radiologist in detection and classification of benign and malignant thyroid nodules.
Bounding the error on bottom estimation for multi-angle swath bathymetry sonar

NASA Astrophysics Data System (ADS)

Mullins, Geoff K.; Bird, John S.

2005-04-01

With the recent introduction of multi-angle swath bathymetry (MASB) sonar to the commercial marketplace (e.g., Benthos Inc., C3D sonar, 2004), additions must be made to the current sonar lexicon. The correct interpretation of measurements made with MASB sonar, which uses filled transducer arrays to compute angle-of-arrival information (AOA) from backscattered signal, is essential not only for mapping, but for applications such as statistical bottom classification. In this paper it is shown that aside from uncorrelated channel to channel noise, there exists a tradeoff between effects that govern the error bounds on bottom estimation for surfaces having shallow grazing angle and surfaces distributed along a radial arc centered at the transducer. In the first case, as the bottom aligns with the radial direction to the receiver, footprint shift and shallow grazing angle effects dominate the uncertainty in physical bottom position (surface aligns along a single AOA). Alternatively, if signal from a radial arc arrives, a single AOA is usually estimated (not necessarily at the average location of the surface). Through theoretical treatment, simulation, and field measurements, the aforementioned factors affecting MASB bottom mapping are examined. [Work supported by NSERC.
Longitudinal Neuroimaging Hippocampal Markers for Diagnosing Alzheimer's Disease.

PubMed

Platero, Carlos; Lin, Lin; Tobar, M Carmen

2018-05-21

Hippocampal atrophy measures from magnetic resonance imaging (MRI) are powerful tools for monitoring Alzheimer's disease (AD) progression. In this paper, we introduce a longitudinal image analysis framework based on robust registration and simultaneous hippocampal segmentation and longitudinal marker classification of brain MRI of an arbitrary number of time points. The framework comprises two innovative parts: a longitudinal segmentation and a longitudinal classification step. The results show that both steps of the longitudinal pipeline improved the reliability and the accuracy of the discrimination between clinical groups. We introduce a novel approach to the joint segmentation of the hippocampus across multiple time points; this approach is based on graph cuts of longitudinal MRI scans with constraints on hippocampal atrophy and supported by atlases. Furthermore, we use linear mixed effect (LME) modeling for differential diagnosis between clinical groups. The classifiers are trained from the average residue between the longitudinal marker of the subjects and the LME model. In our experiments, we analyzed MRI-derived longitudinal hippocampal markers from two publicly available datasets (Alzheimer's Disease Neuroimaging Initiative, ADNI and Minimal Interval Resonance Imaging in Alzheimer's Disease, MIRIAD). In test/retest reliability experiments, the proposed method yielded lower volume errors and significantly higher dice overlaps than the cross-sectional approach (volume errors: 1.55% vs 0.8%; dice overlaps: 0.945 vs 0.975). To diagnose AD, the discrimination ability of our proposal gave an area under the receiver operating characteristic (ROC) curve (AUC) [Formula: see text] 0.947 for the control vs AD, AUC [Formula: see text] 0.720 for mild cognitive impairment (MCI) vs AD, and AUC [Formula: see text] 0.805 for the control vs MCI.
Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

NASA Astrophysics Data System (ADS)

Richards, Joseph W.; Starr, Dan L.; Brink, Henrik; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; James, J. Berian; Long, James P.; Rice, John

2012-01-01

Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). This sample selection bias can cause catastrophic errors in predictions on the testing data because (1) standard assumptions for machine-learned model selection procedures break down and (2) dense regions of testing space might be completely devoid of training data. We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL—where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up—is an effective approach and is appropriate for many astronomical applications. For a variable star classification problem on a well-studied set of stars from Hipparcos and Optical Gravitational Lensing Experiment, AL is the optimal method in terms of error rate on the testing data, beating the off-the-shelf classifier by 3.4% and the other proposed methods by at least 3.0%. To aid with manual labeling of variable stars, we developed a Web interface which allows for easy light curve visualization and querying of external databases. Finally, we apply AL to classify variable stars in the All Sky Automated Survey, finding dramatic improvement in our agreement with the ASAS Catalog of Variable Stars, from 65.5% to 79.5%, and a significant increase in the classifier's average confidence for the testing set, from 14.6% to 42.9%, after a few AL iterations.
Effects of stress typicality during speeded grammatical classification.

PubMed

Arciuli, Joanne; Cupples, Linda

2003-01-01

The experiments reported here were designed to investigate the influence of stress typicality during speeded grammatical classification of disyllabic English words by native and non-native speakers. Trochaic nouns and iambic gram verbs were considered to be typically stressed, whereas iambic nouns and trochaic verbs were considered to be atypically stressed. Experiments 1a and 2a showed that while native speakers classified typically stressed words individual more quickly and more accurately than atypically stressed words during differences reading, there were no overall effects during classification of spoken stimuli. However, a subgroup of native speakers with high error rates did show a significant effect during classification of spoken stimuli. Experiments 1b and 2b showed that non-native speakers classified typically stressed words more quickly and more accurately than atypically stressed words during reading. Typically stressed words were classified more accurately than atypically stressed words when the stimuli were spoken. Importantly, there was a significant relationship between error rates, vocabulary size and the size of the stress typicality effect in each experiment. We conclude that participants use information about lexical stress to help them distinguish between disyllabic nouns and verbs during speeded grammatical classification. This is especially so for individuals with a limited vocabulary who lack other knowledge (e.g., semantic knowledge) about the differences between these grammatical categories.
Automatic classification of diseases from free-text death certificates for real-time surveillance.

PubMed

Koopman, Bevan; Karimi, Sarvnaz; Nguyen, Anthony; McGuire, Rhydwyn; Muscatello, David; Kemp, Madonna; Truran, Donna; Zhang, Ming; Thackway, Sarah

2015-07-15

Death certificates provide an invaluable source for mortality statistics which can be used for surveillance and early warnings of increases in disease activity and to support the development and monitoring of prevention or response strategies. However, their value can be realised only if accurate, quantitative data can be extracted from death certificates, an aim hampered by both the volume and variable nature of certificates written in natural language. This study aims to develop a set of machine learning and rule-based methods to automatically classify death certificates according to four high impact diseases of interest: diabetes, influenza, pneumonia and HIV. Two classification methods are presented: i) a machine learning approach, where detailed features (terms, term n-grams and SNOMED CT concepts) are extracted from death certificates and used to train a set of supervised machine learning models (Support Vector Machines); and ii) a set of keyword-matching rules. These methods were used to identify the presence of diabetes, influenza, pneumonia and HIV in a death certificate. An empirical evaluation was conducted using 340,142 death certificates, divided between training and test sets, covering deaths from 2000-2007 in New South Wales, Australia. Precision and recall (positive predictive value and sensitivity) were used as evaluation measures, with F-measure providing a single, overall measure of effectiveness. A detailed error analysis was performed on classification errors. Classification of diabetes, influenza, pneumonia and HIV was highly accurate (F-measure 0.96). More fine-grained ICD-10 classification effectiveness was more variable but still high (F-measure 0.80). The error analysis revealed that word variations as well as certain word combinations adversely affected classification. In addition, anomalies in the ground truth likely led to an underestimation of the effectiveness. The high accuracy and low cost of the classification methods allow for an effective means for automatic and real-time surveillance of diabetes, influenza, pneumonia and HIV deaths. In addition, the methods are generally applicable to other diseases of interest and to other sources of medical free-text besides death certificates.
Measures of Linguistic Accuracy in Second Language Writing Research.

ERIC Educational Resources Information Center

Polio, Charlene G.

1997-01-01

Investigates the reliability of measures of linguistic accuracy in second language writing. The study uses a holistic scale, error-free T-units, and an error classification system on the essays of English-as-a-Second-Language students and discusses why disagreements arise within a rater and between raters. (24 references) (Author/CK)
[Study of inversion and classification of particle size distribution under dependent model algorithm].

PubMed

Sun, Xiao-Gang; Tang, Hong; Yuan, Gui-Bin

2008-05-01

For the total light scattering particle sizing technique, an inversion and classification method was proposed with the dependent model algorithm. The measured particle system was inversed simultaneously by different particle distribution functions whose mathematic model was known in advance, and then classified according to the inversion errors. The simulation experiments illustrated that it is feasible to use the inversion errors to determine the particle size distribution. The particle size distribution function was obtained accurately at only three wavelengths in the visible light range with the genetic algorithm, and the inversion results were steady and reliable, which decreased the number of multi wavelengths to the greatest extent and increased the selectivity of light source. The single peak distribution inversion error was less than 5% and the bimodal distribution inversion error was less than 10% when 5% stochastic noise was put in the transmission extinction measurement values at two wavelengths. The running time of this method was less than 2 s. The method has advantages of simplicity, rapidity, and suitability for on-line particle size measurement.
Classification accuracy for stratification with remotely sensed data

Treesearch

Raymond L. Czaplewski; Paul L. Patterson

2003-01-01

Tools are developed that help specify the classification accuracy required from remotely sensed data. These tools are applied during the planning stage of a sample survey that will use poststratification, prestratification with proportional allocation, or double sampling for stratification. Accuracy standards are developed in terms of an âerror matrix,â which is...
Comparison of wheat classification accuracy using different classifiers of the image-100 system

NASA Technical Reports Server (NTRS)

Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.

1981-01-01

Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.

Assessment of differences between repeated pulse wave velocity measurements in terms of 'bias' in the extrapolated cardiovascular risk and the classification of aortic stiffness: is a single PWV measurement enough?

PubMed

Papaioannou, T G; Protogerou, A D; Nasothimiou, E G; Tzamouranis, D; Skliros, N; Achimastos, A; Papadogiannis, D; Stefanadis, C I

2012-10-01

Currently, there is no recommendation regarding the minimum number of pulse wave velocity (PWV) measurements to optimize individual's cardiovascular risk (CVR) stratification. The aim of this study was to examine differences between three single consecutive and averaged PWV measurements in terms of the extrapolated CVR and the classification of aortic stiffness as normal. In 60 subjects who referred for CVR assessment, three repeated measurements of blood pressure (BP), heart rate and PWV were performed. The reproducibility was evaluated by the intraclass correlation coefficient (ICC) and mean±s.d. of differences. The absolute differences between single and averaged PWV measurements were classified as: ≤0.25, 0.26-0.49, 0.50-0.99 and ≥1 m s(-1). A difference ≥0.5 m s(-1) (corresponding to 7.5% change in CVR, meta-analysis data from >12 000 subjects) was considered as clinically meaningful; PWV values (single or averaged) were classified as normal according to respective age-corrected normal values (European Network data). Kappa statistic was used to evaluate the agreement between classifications. PWV for the first, second and third measurement was 7.0±1.9, 6.9±1.9, 6.9±2.0 m s(-1), respectively (P=0.319); BP and heart rate did not vary significantly. A good reproducibility between single measurements was observed (ICC>0.94, s.d. ranged between 0.43 and 0.64 m s(-1)). A high percent with difference ≥0.5 m s(-1) was observed between: any pair of the three single PWV measurements (26.6-38.3%); the first or second single measurement and the average of the first and second (18.3%); any single measurement and the average of three measurements (10-20%). In only up to 5% a difference ≥0.5 m s(-1) was observed between the average of three and the average of any two PWV measurements. There was no significant agreement regarding PWV classification as normal between: the first or second measurement and the averaged PWV values. There was significant agreement in classification made by the average of the first two and the average of three PWV measurements (κ=0.85, P<0.001). Even when high reproducibility in PWV measurement is succeeded single measurements provide quite variable results in terms of the extrapolated CVR and the classification of aortic stiffness as normal. The average of two PWV measurements provides similar results with the average of three.
Ensemble of classifiers for confidence-rated classification of NDE signal

NASA Astrophysics Data System (ADS)

Banerjee, Portia; Safdarnejad, Seyed; Udpa, Lalita; Udpa, Satish

2016-02-01

Ensemble of classifiers in general, aims to improve classification accuracy by combining results from multiple weak hypotheses into a single strong classifier through weighted majority voting. Improved versions of ensemble of classifiers generate self-rated confidence scores which estimate the reliability of each of its prediction and boost the classifier using these confidence-rated predictions. However, such a confidence metric is based only on the rate of correct classification. In existing works, although ensemble of classifiers has been widely used in computational intelligence, the effect of all factors of unreliability on the confidence of classification is highly overlooked. With relevance to NDE, classification results are affected by inherent ambiguity of classifica-tion, non-discriminative features, inadequate training samples and noise due to measurement. In this paper, we extend the existing ensemble classification by maximizing confidence of every classification decision in addition to minimizing the classification error. Initial results of the approach on data from eddy current inspection show improvement in classification performance of defect and non-defect indications.
On the statistical assessment of classifiers using DNA microarray data

PubMed Central

Ancona, N; Maglietta, R; Piepoli, A; D'Addabbo, A; Cotugno, R; Savino, M; Liuni, S; Carella, M; Pesole, G; Perri, F

2006-01-01

Background In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia – Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependant on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037) respectively. Moreover, the error rate decreases as the training set size increases, reaching its best performances with 35 training examples. In this case, RLS and SVM have error rates of e = 14% (p = 0.027) and e = 11% (p = 0.019). Concerning the number of genes, we found about 6000 genes (p < 0.05) correlated with the pathology, resulting from the signal-to-noise statistic. Moreover the performances of RLS and SVM classifiers do not change when 74% of genes is used. They progressively reduce up to e = 16% (p < 0.05) when only 2 genes are employed. The biological relevance of a set of genes determined by our statistical analysis and the major roles they play in colorectal tumorigenesis is discussed. Conclusions The method proposed provides statistically significant answers to precise questions relevant for the diagnosis and prognosis of cancer. We found that, with as few as 15 examples, it is possible to train statistically significant classifiers for colon cancer diagnosis. As for the definition of the number of genes sufficient for a reliable classification of colon cancer, our results suggest that it depends on the accuracy required. PMID:16919171
Speech variability effects on recognition accuracy associated with concurrent task performance by pilots

NASA Technical Reports Server (NTRS)

Simpson, C. A.

1985-01-01

In the present study of the responses of pairs of pilots to aircraft warning classification tasks using an isolated word, speaker-dependent speech recognition system, the induced stress was manipulated by means of different scoring procedures for the classification task and by the inclusion of a competitive manual control task. Both speech patterns and recognition accuracy were analyzed, and recognition errors were recorded by type for an isolated word speaker-dependent system and by an offline technique for a connected word speaker-dependent system. While errors increased with task loading for the isolated word system, there was no such effect for task loading in the case of the connected word system.
Robust Transmission of H.264/AVC Streams Using Adaptive Group Slicing and Unequal Error Protection

NASA Astrophysics Data System (ADS)

Thomos, Nikolaos; Argyropoulos, Savvas; Boulgouris, Nikolaos V.; Strintzis, Michael G.

2006-12-01

We present a novel scheme for the transmission of H.264/AVC video streams over lossy packet networks. The proposed scheme exploits the error-resilient features of H.264/AVC codec and employs Reed-Solomon codes to protect effectively the streams. A novel technique for adaptive classification of macroblocks into three slice groups is also proposed. The optimal classification of macroblocks and the optimal channel rate allocation are achieved by iterating two interdependent steps. Dynamic programming techniques are used for the channel rate allocation process in order to reduce complexity. Simulations clearly demonstrate the superiority of the proposed method over other recent algorithms for transmission of H.264/AVC streams.
The pot calling the kettle black: the extent and type of errors in a computerized immunization registry and by parent report.

PubMed

MacDonald, Shannon E; Schopflocher, Donald P; Golonka, Richard P

2014-01-04

Accurate classification of children's immunization status is essential for clinical care, administration and evaluation of immunization programs, and vaccine program research. Computerized immunization registries have been proposed as a valuable alternative to provider paper records or parent report, but there is a need to better understand the challenges associated with their use. This study assessed the accuracy of immunization status classification in an immunization registry as compared to parent report and determined the number and type of errors occurring in both sources. This study was a sub-analysis of a larger study which compared the characteristics of children whose immunizations were up to date (UTD) at two years as compared to those not UTD. Children's immunization status was initially determined from a population-based immunization registry, and then compared to parent report of immunization status, as reported in a postal survey. Discrepancies between the two sources were adjudicated by review of immunization providers' hard-copy clinic records. Descriptive analyses included calculating proportions and confidence intervals for errors in classification and reporting of the type and frequency of errors. Among the 461 survey respondents, there were 60 discrepancies in immunization status. The majority of errors were due to parent report (n = 44), but the registry was not without fault (n = 16). Parents tended to erroneously report their child as UTD, whereas the registry was more likely to wrongly classify children as not UTD. Reasons for registry errors included failure to account for varicella disease history, variable number of doses required due to age at series initiation, and doses administered out of the region. These results confirm that parent report is often flawed, but also identify that registries are prone to misclassification of immunization status. Immunization program administrators and researchers need to institute measures to identify and reduce misclassification, in order for registries to play an effective role in the control of vaccine-preventable disease.
The pot calling the kettle black: the extent and type of errors in a computerized immunization registry and by parent report

PubMed Central

2014-01-01

Background Accurate classification of children’s immunization status is essential for clinical care, administration and evaluation of immunization programs, and vaccine program research. Computerized immunization registries have been proposed as a valuable alternative to provider paper records or parent report, but there is a need to better understand the challenges associated with their use. This study assessed the accuracy of immunization status classification in an immunization registry as compared to parent report and determined the number and type of errors occurring in both sources. Methods This study was a sub-analysis of a larger study which compared the characteristics of children whose immunizations were up to date (UTD) at two years as compared to those not UTD. Children’s immunization status was initially determined from a population-based immunization registry, and then compared to parent report of immunization status, as reported in a postal survey. Discrepancies between the two sources were adjudicated by review of immunization providers’ hard-copy clinic records. Descriptive analyses included calculating proportions and confidence intervals for errors in classification and reporting of the type and frequency of errors. Results Among the 461 survey respondents, there were 60 discrepancies in immunization status. The majority of errors were due to parent report (n = 44), but the registry was not without fault (n = 16). Parents tended to erroneously report their child as UTD, whereas the registry was more likely to wrongly classify children as not UTD. Reasons for registry errors included failure to account for varicella disease history, variable number of doses required due to age at series initiation, and doses administered out of the region. Conclusions These results confirm that parent report is often flawed, but also identify that registries are prone to misclassification of immunization status. Immunization program administrators and researchers need to institute measures to identify and reduce misclassification, in order for registries to play an effective role in the control of vaccine-preventable disease. PMID:24387002
Field evaluation of the error arising from inadequate time averaging in the standard use of depth-integrating suspended-sediment samplers

USGS Publications Warehouse

Topping, David J.; Rubin, David M.; Wright, Scott A.; Melis, Theodore S.

2011-01-01

Several common methods for measuring suspended-sediment concentration in rivers in the United States use depth-integrating samplers to collect a velocity-weighted suspended-sediment sample in a subsample of a river cross section. Because depth-integrating samplers are always moving through the water column as they collect a sample, and can collect only a limited volume of water and suspended sediment, they collect only minimally time-averaged data. Four sources of error exist in the field use of these samplers: (1) bed contamination, (2) pressure-driven inrush, (3) inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration, and (4) inadequate time averaging. The first two of these errors arise from misuse of suspended-sediment samplers, and the third has been the subject of previous study using data collected in the sand-bedded Middle Loup River in Nebraska. Of these four sources of error, the least understood source of error arises from the fact that depth-integrating samplers collect only minimally time-averaged data. To evaluate this fourth source of error, we collected suspended-sediment data between 1995 and 2007 at four sites on the Colorado River in Utah and Arizona, using a P-61 suspended-sediment sampler deployed in both point- and one-way depth-integrating modes, and D-96-A1 and D-77 bag-type depth-integrating suspended-sediment samplers. These data indicate that the minimal duration of time averaging during standard field operation of depth-integrating samplers leads to an error that is comparable in magnitude to that arising from inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration. This random error arising from inadequate time averaging is positively correlated with grain size and does not largely depend on flow conditions or, for a given size class of suspended sediment, on elevation above the bed. Averaging over time scales >1 minute is the likely minimum duration required to result in substantial decreases in this error. During standard two-way depth integration, a depth-integrating suspended-sediment sampler collects a sample of the water-sediment mixture during two transits at each vertical in a cross section: one transit while moving from the water surface to the bed, and another transit while moving from the bed to the water surface. As the number of transits is doubled at an individual vertical, this error is reduced by ~30 percent in each size class of suspended sediment. For a given size class of suspended sediment, the error arising from inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration depends only on the number of verticals collected, whereas the error arising from inadequate time averaging depends on both the number of verticals collected and the number of transits collected at each vertical. Summing these two errors in quadrature yields a total uncertainty in an equal-discharge-increment (EDI) or equal-width-increment (EWI) measurement of the time-averaged velocity-weighted suspended-sediment concentration in a river cross section (exclusive of any laboratory-processing errors). By virtue of how the number of verticals and transits influences the two individual errors within this total uncertainty, the error arising from inadequate time averaging slightly dominates that arising from inadequate sampling of the cross-stream spatial structure in suspended-sediment concentration. Adding verticals to an EDI or EWI measurement is slightly more effective in reducing the total uncertainty than adding transits only at each vertical, because a new vertical contributes both temporal and spatial information. However, because collection of depth-integrated samples at more transits at each vertical is generally easier and faster than at more verticals, addition of a combination of verticals and transits is likely a more practical approach to reducing the total uncertainty in most field situatio
Bayesian Network Structure Learning for Urban Land Use Classification from Landsat ETM+ and Ancillary Data

NASA Astrophysics Data System (ADS)

Park, M.; Stenstrom, M. K.

2004-12-01

Recognizing urban information from the satellite imagery is problematic due to the diverse features and dynamic changes of urban landuse. The use of Landsat imagery for urban land use classification involves inherent uncertainty due to its spatial resolution and the low separability among land uses. To resolve the uncertainty problem, we investigated the performance of Bayesian networks to classify urban land use since Bayesian networks provide a quantitative way of handling uncertainty and have been successfully used in many areas. In this study, we developed the optimized networks for urban land use classification from Landsat ETM+ images of Marina del Rey area based on USGS land cover/use classification level III. The networks started from a tree structure based on mutual information between variables and added the links to improve accuracy. This methodology offers several advantages: (1) The network structure shows the dependency relationships between variables. The class node value can be predicted even with particular band information missing due to sensor system error. The missing information can be inferred from other dependent bands. (2) The network structure provides information of variables that are important for the classification, which is not available from conventional classification methods such as neural networks and maximum likelihood classification. In our case, for example, bands 1, 5 and 6 are the most important inputs in determining the land use of each pixel. (3) The networks can be reduced with those input variables important for classification. This minimizes the problem without considering all possible variables. We also examined the effect of incorporating ancillary data: geospatial information such as X and Y coordinate values of each pixel and DEM data, and vegetation indices such as NDVI and Tasseled Cap transformation. The results showed that the locational information improved overall accuracy (81%) and kappa coefficient (76%), and lowered the omission and commission errors compared with using only spectral data (accuracy 71%, kappa coefficient 62%). Incorporating DEM data did not significantly improve overall accuracy (74%) and kappa coefficient (66%) but lowered the omission and commission errors. Incorporating NDVI did not much improve the overall accuracy (72%) and k coefficient (65%). Including Tasseled Cap transformation reduced the accuracy (accuracy 70%, kappa 61%). Therefore, additional information from the DEM and vegetation indices was not useful as locational ancillary data.
Estimating Population Cause-Specific Mortality Fractions from in-Hospital Mortality: Validation of a New Method

PubMed Central

Murray, Christopher J. L; Lopez, Alan D; Barofsky, Jeremy T; Bryson-Cahn, Chloe; Lozano, Rafael

2007-01-01

Background Cause-of-death data for many developing countries are not available. Information on deaths in hospital by cause is available in many low- and middle-income countries but is not a representative sample of deaths in the population. We propose a method to estimate population cause-specific mortality fractions (CSMFs) using data already collected in many middle-income and some low-income developing nations, yet rarely used: in-hospital death records. Methods and Findings For a given cause of death, a community's hospital deaths are equal to total community deaths multiplied by the proportion of deaths occurring in hospital. If we can estimate the proportion dying in hospital, we can estimate the proportion dying in the population using deaths in hospital. We propose to estimate the proportion of deaths for an age, sex, and cause group that die in hospital from the subset of the population where vital registration systems function or from another population. We evaluated our method using nearly complete vital registration (VR) data from Mexico 1998–2005, which records whether a death occurred in a hospital. In this validation test, we used 45 disease categories. We validated our method in two ways: nationally and between communities. First, we investigated how the method's accuracy changes as we decrease the amount of Mexican VR used to estimate the proportion of each age, sex, and cause group dying in hospital. Decreasing VR data used for this first step from 100% to 9% produces only a 12% maximum relative error between estimated and true CSMFs. Even if Mexico collected full VR information only in its capital city with 9% of its population, our estimation method would produce an average relative error in CSMFs across the 45 causes of just over 10%. Second, we used VR data for the capital zone (Distrito Federal and Estado de Mexico) and estimated CSMFs for the three lowest-development states. Our estimation method gave an average relative error of 20%, 23%, and 31% for Guerrero, Chiapas, and Oaxaca, respectively. Conclusions Where accurate International Classification of Diseases (ICD)-coded cause-of-death data are available for deaths in hospital and for VR covering a subset of the population, we demonstrated that population CSMFs can be estimated with low average error. In addition, we showed in the case of Mexico that this method can substantially reduce error from biased hospital data, even when applied to areas with widely different levels of development. For countries with ICD-coded deaths in hospital, this method potentially allows the use of existing data to inform health policy. PMID:18031195
Vector quantizer designs for joint compression and terrain categorization of multispectral imagery

NASA Technical Reports Server (NTRS)

Gorman, John D.; Lyons, Daniel F.

1994-01-01

Two vector quantizer designs for compression of multispectral imagery and their impact on terrain categorization performance are evaluated. The mean-squared error (MSE) and classification performance of the two quantizers are compared, and it is shown that a simple two-stage design minimizing MSE subject to a constraint on classification performance has a significantly better classification performance than a standard MSE-based tree-structured vector quantizer followed by maximum likelihood classification. This improvement in classification performance is obtained with minimal loss in MSE performance. The results show that it is advantageous to tailor compression algorithm designs to the required data exploitation tasks. Applications of joint compression/classification include compression for the archival or transmission of Landsat imagery that is later used for land utility surveys and/or radiometric analysis.
Contextual convolutional neural networks for lung nodule classification using Gaussian-weighted average image patches

NASA Astrophysics Data System (ADS)

Lee, Haeil; Lee, Hansang; Park, Minseok; Kim, Junmo

2017-03-01

Lung cancer is the most common cause of cancer-related death. To diagnose lung cancers in early stages, numerous studies and approaches have been developed for cancer screening with computed tomography (CT) imaging. In recent years, convolutional neural networks (CNN) have become one of the most common and reliable techniques in computer aided detection (CADe) and diagnosis (CADx) by achieving state-of-the-art-level performances for various tasks. In this study, we propose a CNN classification system for false positive reduction of initially detected lung nodule candidates. First, image patches of lung nodule candidates are extracted from CT scans to train a CNN classifier. To reflect the volumetric contextual information of lung nodules to 2D image patch, we propose a weighted average image patch (WAIP) generation by averaging multiple slice images of lung nodule candidates. Moreover, to emphasize central slices of lung nodules, slice images are locally weighted according to Gaussian distribution and averaged to generate the 2D WAIP. With these extracted patches, 2D CNN is trained to achieve the classification of WAIPs of lung nodule candidates into positive and negative labels. We used LUNA 2016 public challenge database to validate the performance of our approach for false positive reduction in lung CT nodule classification. Experiments show our approach improves the classification accuracy of lung nodules compared to the baseline 2D CNN with patches from single slice image.
Detecting Paroxysmal Coughing from Pertussis Cases Using Voice Recognition Technology

PubMed Central

Parker, Danny; Picone, Joseph; Harati, Amir; Lu, Shuang; Jenkyns, Marion H.; Polgreen, Philip M.

2013-01-01

Background Pertussis is highly contagious; thus, prompt identification of cases is essential to control outbreaks. Clinicians experienced with the disease can easily identify classic cases, where patients have bursts of rapid coughing followed by gasps, and a characteristic whooping sound. However, many clinicians have never seen a case, and thus may miss initial cases during an outbreak. The purpose of this project was to use voice-recognition software to distinguish pertussis coughs from croup and other coughs. Methods We collected a series of recordings representing pertussis, croup and miscellaneous coughing by children. We manually categorized coughs as either pertussis or non-pertussis, and extracted features for each category. We used Mel-frequency cepstral coefficients (MFCC), a sampling rate of 16 KHz, a frame Duration of 25 msec, and a frame rate of 10 msec. The coughs were filtered. Each cough was divided into 3 sections of proportion 3-4-3. The average of the 13 MFCCs for each section was computed and made into a 39-element feature vector used for the classification. We used the following machine learning algorithms: Neural Networks, K-Nearest Neighbor (KNN), and a 200 tree Random Forest (RF). Data were reserved for cross-validation of the KNN and RF. The Neural Network was trained 100 times, and the averaged results are presented. Results After categorization, we had 16 examples of non-pertussis coughs and 31 examples of pertussis coughs. Over 90% of all pertussis coughs were properly classified as pertussis. The error rates were: Type I errors of 7%, 12%, and 25% and Type II errors of 8%, 0%, and 0%, using the Neural Network, Random Forest, and KNN, respectively. Conclusion Our results suggest that we can build a robust classifier to assist clinicians and the public to help identify pertussis cases in children presenting with typical symptoms. PMID:24391730
Medication errors: definitions and classification

PubMed Central

Aronson, Jeffrey K

2009-01-01

To understand medication errors and to identify preventive strategies, we need to classify them and define the terms that describe them. The four main approaches to defining technical terms consider etymology, usage, previous definitions, and the Ramsey–Lewis method (based on an understanding of theory and practice). A medication error is ‘a failure in the treatment process that leads to, or has the potential to lead to, harm to the patient’. Prescribing faults, a subset of medication errors, should be distinguished from prescription errors. A prescribing fault is ‘a failure in the prescribing [decision-making] process that leads to, or has the potential to lead to, harm to the patient’. The converse of this, ‘balanced prescribing’ is ‘the use of a medicine that is appropriate to the patient's condition and, within the limits created by the uncertainty that attends therapeutic decisions, in a dosage regimen that optimizes the balance of benefit to harm’. This excludes all forms of prescribing faults, such as irrational, inappropriate, and ineffective prescribing, underprescribing and overprescribing. A prescription error is ‘a failure in the prescription writing process that results in a wrong instruction about one or more of the normal features of a prescription’. The ‘normal features’ include the identity of the recipient, the identity of the drug, the formulation, dose, route, timing, frequency, and duration of administration. Medication errors can be classified, invoking psychological theory, as knowledge-based mistakes, rule-based mistakes, action-based slips, and memory-based lapses. This classification informs preventive strategies. PMID:19594526
Hierarchic Agglomerative Clustering Methods for Automatic Document Classification.

ERIC Educational Resources Information Center

Griffiths, Alan; And Others

1984-01-01

Considers classifications produced by application of single linkage, complete linkage, group average, and word clustering methods to Keen and Cranfield document test collections, and studies structure of hierarchies produced, extent to which methods distort input similarity matrices during classification generation, and retrieval effectiveness…
A volumetric pulmonary CT segmentation method with applications in emphysema assessment

NASA Astrophysics Data System (ADS)

Silva, José Silvestre; Silva, Augusto; Santos, Beatriz S.

2006-03-01

A segmentation method is a mandatory pre-processing step in many automated or semi-automated analysis tasks such as region identification and densitometric analysis, or even for 3D visualization purposes. In this work we present a fully automated volumetric pulmonary segmentation algorithm based on intensity discrimination and morphologic procedures. Our method first identifies the trachea as well as primary bronchi and then the pulmonary region is identified by applying a threshold and morphologic operations. When both lungs are in contact, additional procedures are performed to obtain two separated lung volumes. To evaluate the performance of the method, we compared contours extracted from 3D lung surfaces with reference contours, using several figures of merit. Results show that the worst case generally occurs at the middle sections of high resolution CT exams, due the presence of aerial and vascular structures. Nevertheless, the average error is inferior to the average error associated with radiologist inter-observer variability, which suggests that our method produces lung contours similar to those drawn by radiologists. The information created by our segmentation algorithm is used by an identification and representation method in pulmonary emphysema that also classifies emphysema according to its severity degree. Two clinically proved thresholds are applied which identify regions with severe emphysema, and with highly severe emphysema. Based on this thresholding strategy, an application for volumetric emphysema assessment was developed offering new display paradigms concerning the visualization of classification results. This framework is easily extendable to accommodate other classifiers namely those related with texture based segmentation as it is often the case with interstitial diseases.
Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification.

PubMed

Rueckauer, Bodo; Lungu, Iulia-Alexandra; Hu, Yuhuang; Pfeiffer, Michael; Liu, Shih-Chii

2017-01-01

Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch-normalization and Inception-modules. This paper presents spiking equivalents of these operations therefore allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs in particular when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications.
Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks for Image Classification

PubMed Central

Rueckauer, Bodo; Lungu, Iulia-Alexandra; Hu, Yuhuang; Pfeiffer, Michael; Liu, Shih-Chii

2017-01-01

Spiking neural networks (SNNs) can potentially offer an efficient way of doing inference because the neurons in the networks are sparsely activated and computations are event-driven. Previous work showed that simple continuous-valued deep Convolutional Neural Networks (CNNs) can be converted into accurate spiking equivalents. These networks did not include certain common operations such as max-pooling, softmax, batch-normalization and Inception-modules. This paper presents spiking equivalents of these operations therefore allowing conversion of nearly arbitrary CNN architectures. We show conversion of popular CNN architectures, including VGG-16 and Inception-v3, into SNNs that produce the best results reported to date on MNIST, CIFAR-10 and the challenging ImageNet dataset. SNNs can trade off classification error rate against the number of available operations whereas deep continuous-valued neural networks require a fixed number of operations to achieve their classification error rate. From the examples of LeNet for MNIST and BinaryNet for CIFAR-10, we show that with an increase in error rate of a few percentage points, the SNNs can achieve more than 2x reductions in operations compared to the original CNNs. This highlights the potential of SNNs in particular when deployed on power-efficient neuromorphic spiking neuron chips, for use in embedded applications. PMID:29375284
3D multi-view convolutional neural networks for lung nodule classification

PubMed Central

Kang, Guixia; Hou, Beibei; Zhang, Ningbo

2017-01-01

The 3D convolutional neural network (CNN) is able to make full use of the spatial 3D context information of lung nodules, and the multi-view strategy has been shown to be useful for improving the performance of 2D CNN in classifying lung nodules. In this paper, we explore the classification of lung nodules using the 3D multi-view convolutional neural networks (MV-CNN) with both chain architecture and directed acyclic graph architecture, including 3D Inception and 3D Inception-ResNet. All networks employ the multi-view-one-network strategy. We conduct a binary classification (benign and malignant) and a ternary classification (benign, primary malignant and metastatic malignant) on Computed Tomography (CT) images from Lung Image Database Consortium and Image Database Resource Initiative database (LIDC-IDRI). All results are obtained via 10-fold cross validation. As regards the MV-CNN with chain architecture, results show that the performance of 3D MV-CNN surpasses that of 2D MV-CNN by a significant margin. Finally, a 3D Inception network achieved an error rate of 4.59% for the binary classification and 7.70% for the ternary classification, both of which represent superior results for the corresponding task. We compare the multi-view-one-network strategy with the one-view-one-network strategy. The results reveal that the multi-view-one-network strategy can achieve a lower error rate than the one-view-one-network strategy. PMID:29145492
Automatic classification for mammogram backgrounds based on bi-rads complexity definition and on a multi content analysis framework

NASA Astrophysics Data System (ADS)

Wu, Jie; Besnehard, Quentin; Marchessoux, Cédric

2011-03-01

Clinical studies for the validation of new medical imaging devices require hundreds of images. An important step in creating and tuning the study protocol is the classification of images into "difficult" and "easy" cases. This consists of classifying the image based on features like the complexity of the background, the visibility of the disease (lesions). Therefore, an automatic medical background classification tool for mammograms would help for such clinical studies. This classification tool is based on a multi-content analysis framework (MCA) which was firstly developed to recognize image content of computer screen shots. With the implementation of new texture features and a defined breast density scale, the MCA framework is able to automatically classify digital mammograms with a satisfying accuracy. BI-RADS (Breast Imaging Reporting Data System) density scale is used for grouping the mammograms, which standardizes the mammography reporting terminology and assessment and recommendation categories. Selected features are input into a decision tree classification scheme in MCA framework, which is the so called "weak classifier" (any classifier with a global error rate below 50%). With the AdaBoost iteration algorithm, these "weak classifiers" are combined into a "strong classifier" (a classifier with a low global error rate) for classifying one category. The results of classification for one "strong classifier" show the good accuracy with the high true positive rates. For the four categories the results are: TP=90.38%, TN=67.88%, FP=32.12% and FN =9.62%.

Spatial averaging errors in creating hemispherical reflectance (albedo) maps from directional reflectance data

NASA Technical Reports Server (NTRS)

Kimes, D. S.; Kerber, A. G.; Sellers, P. J.

1993-01-01

Spatial averaging errors which may occur when creating hemispherical reflectance maps for different cover types from direct nadir technique to estimate the hemispherical reflectance are assessed by comparing the results with those obtained with a knowledge-based system called VEG (Kimes et al., 1991, 1992). It was found that hemispherical reflectance errors provided by using VEG are much less than those using the direct nadir techniques, depending on conditions. Suggestions are made concerning sampling and averaging strategies for creating hemispherical reflectance maps for photosynthetic, carbon cycle, and climate change studies.
Estimation of time averages from irregularly spaced observations - With application to coastal zone color scanner estimates of chlorophyll concentration

NASA Technical Reports Server (NTRS)

Chelton, Dudley B.; Schlax, Michael G.

1991-01-01

The sampling error of an arbitrary linear estimate of a time-averaged quantity constructed from a time series of irregularly spaced observations at a fixed located is quantified through a formalism. The method is applied to satellite observations of chlorophyll from the coastal zone color scanner. The two specific linear estimates under consideration are the composite average formed from the simple average of all observations within the averaging period and the optimal estimate formed by minimizing the mean squared error of the temporal average based on all the observations in the time series. The resulting suboptimal estimates are shown to be more accurate than composite averages. Suboptimal estimates are also found to be nearly as accurate as optimal estimates using the correct signal and measurement error variances and correlation functions for realistic ranges of these parameters, which makes it a viable practical alternative to the composite average method generally employed at present.
Classification of electroencephalograph signals using time-frequency decomposition and linear discriminant analysis

NASA Astrophysics Data System (ADS)

Szuflitowska, B.; Orlowski, P.

2017-08-01

Automated detection system consists of two key steps: extraction of features from EEG signals and classification for detection of pathology activity. The EEG sequences were analyzed using Short-Time Fourier Transform and the classification was performed using Linear Discriminant Analysis. The accuracy of the technique was tested on three sets of EEG signals: epilepsy, healthy and Alzheimer's Disease. The classification error below 10% has been considered a success. The higher accuracy are obtained for new data of unknown classes than testing data. The methodology can be helpful in differentiation epilepsy seizure and disturbances in the EEG signal in Alzheimer's Disease.
The Influence of Item Calibration Error on Variable-Length Computerized Adaptive Testing

ERIC Educational Resources Information Center

Patton, Jeffrey M.; Cheng, Ying; Yuan, Ke-Hai; Diao, Qi

2013-01-01

Variable-length computerized adaptive testing (VL-CAT) allows both items and test length to be "tailored" to examinees, thereby achieving the measurement goal (e.g., scoring precision or classification) with as few items as possible. Several popular test termination rules depend on the standard error of the ability estimate, which in turn depends…
Lexical Errors in Second Language Scientific Writing: Some Conceptual Implications

ERIC Educational Resources Information Center

Carrió Pastor, María Luisa; Mestre-Mestre, Eva María

2014-01-01

Nowadays, scientific writers are required not only a thorough knowledge of their subject field, but also a sound command of English as a lingua franca. In this paper, the lexical errors produced in scientific texts written in English by non-native researchers are identified to propose a classification of the categories they contain. This study…
Anatomical and/or pathological predictors for the “incorrect” classification of red dot markers on wrist radiographs taken following trauma

PubMed Central

Kranz, R

2015-01-01

Objective: To establish the prevalence of red dot markers in a sample of wrist radiographs and to identify any anatomical and/or pathological characteristics that predict “incorrect” red dot classification. Methods: Accident and emergency (A&E) wrist cases from a digital imaging and communications in medicine/digital teaching library were examined for red dot prevalence and for the presence of several anatomical and pathological features. Binary logistic regression analyses were run to establish if any of these features were predictors of incorrect red dot classification. Results: 398 cases were analysed. Red dot was “incorrectly” classified in 8.5% of cases; 6.3% were “false negatives” (“FNs”)and 2.3% false positives (FPs) (one decimal place). Old fractures [odds ratio (OR), 5.070 (1.256–20.471)] and reported degenerative change [OR, 9.870 (2.300–42.359)] were found to predict FPs. Frykman V [OR, 9.500 (1.954–46.179)], Frykman VI [OR, 6.333 (1.205–33.283)] and non-Frykman positive abnormalities [OR, 4.597 (1.264–16.711)] predict “FNs”. Old fractures and Frykman VI were predictive of error at 90% confidence interval (CI); the rest at 95% CI. Conclusion: The five predictors of incorrect red dot classification may inform the image interpretation training of radiographers and other professionals to reduce diagnostic error. Verification with larger samples would reinforce these findings. Advances in knowledge: All healthcare providers strive to eradicate diagnostic error. By examining specific anatomical and pathological predictors on radiographs for such error, as well as extrinsic factors that may affect reporting accuracy, image interpretation training can focus on these “problem” areas and influence which radiographic abnormality detection schemes are appropriate to implement in A&E departments. PMID:25496373
How Should Children with Speech Sound Disorders be Classified? A Review and Critical Evaluation of Current Classification Systems

ERIC Educational Resources Information Center

Waring, R.; Knight, R.

2013-01-01

Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…
Evaluation of the confusion matrix method in the validation of an automated system for measuring feeding behaviour of cattle.

PubMed

Ruuska, Salla; Hämäläinen, Wilhelmiina; Kajava, Sari; Mughal, Mikaela; Matilainen, Pekka; Mononen, Jaakko

2018-03-01

The aim of the present study was to evaluate empirically confusion matrices in device validation. We compared the confusion matrix method to linear regression and error indices in the validation of a device measuring feeding behaviour of dairy cattle. In addition, we studied how to extract additional information on classification errors with confusion probabilities. The data consisted of 12 h behaviour measurements from five dairy cows; feeding and other behaviour were detected simultaneously with a device and from video recordings. The resulting 216 000 pairs of classifications were used to construct confusion matrices and calculate performance measures. In addition, hourly durations of each behaviour were calculated and the accuracy of measurements was evaluated with linear regression and error indices. All three validation methods agreed when the behaviour was detected very accurately or inaccurately. Otherwise, in the intermediate cases, the confusion matrix method and error indices produced relatively concordant results, but the linear regression method often disagreed with them. Our study supports the use of confusion matrix analysis in validation since it is robust to any data distribution and type of relationship, it makes a stringent evaluation of validity, and it offers extra information on the type and sources of errors. Copyright © 2018 Elsevier B.V. All rights reserved.
Bayes-LQAS: classifying the prevalence of global acute malnutrition

PubMed Central

2010-01-01

Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classifications schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications. PMID:20534159
Bayes-LQAS: classifying the prevalence of global acute malnutrition.

PubMed

Olives, Casey; Pagano, Marcello

2010-06-09

Lot Quality Assurance Sampling (LQAS) applications in health have generally relied on frequentist interpretations for statistical validity. Yet health professionals often seek statements about the probability distribution of unknown parameters to answer questions of interest. The frequentist paradigm does not pretend to yield such information, although a Bayesian formulation might. This is the source of an error made in a recent paper published in this journal. Many applications lend themselves to a Bayesian treatment, and would benefit from such considerations in their design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of prior information into the LQAS classification procedure, and thus shows how to correct the aforementioned error. Further, we pay special attention to the formulation of Bayes Operating Characteristic Curves and the use of prior information to improve survey designs. As a motivating example, we discuss the classification of Global Acute Malnutrition prevalence and draw parallels between the Bayes and classical classifications schemes. We also illustrate the impact of informative and non-informative priors on the survey design. Results indicate that using a Bayesian approach allows the incorporation of expert information and/or historical data and is thus potentially a valuable tool for making accurate and precise classifications.
Optimization of the ANFIS using a genetic algorithm for physical work rate classification.

PubMed

Habibi, Ehsanollah; Salehi, Mina; Yadegarfar, Ghasem; Taheri, Ali

2018-03-13

Recently, a new method was proposed for physical work rate classification based on an adaptive neuro-fuzzy inference system (ANFIS). This study aims to present a genetic algorithm (GA)-optimized ANFIS model for a highly accurate classification of physical work rate. Thirty healthy men participated in this study. Directly measured heart rate and oxygen consumption of the participants in the laboratory were used for training the ANFIS classifier model in MATLAB version 8.0.0 using a hybrid algorithm. A similar process was done using the GA as an optimization technique. The accuracy, sensitivity and specificity of the ANFIS classifier model were increased successfully. The mean accuracy of the model was increased from 92.95 to 97.92%. Also, the calculated root mean square error of the model was reduced from 5.4186 to 3.1882. The maximum estimation error of the optimized ANFIS during the network testing process was ± 5%. The GA can be effectively used for ANFIS optimization and leads to an accurate classification of physical work rate. In addition to high accuracy, simple implementation and inter-individual variability consideration are two other advantages of the presented model.
Galaxy Zoo 1: data release of morphological classifications for nearly 900 000 galaxies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Linott, C.; Slosar, A.; Lintott, C.

Morphology is a powerful indicator of a galaxy's dynamical and merger history. It is strongly correlated with many physical parameters, including mass, star formation history and the distribution of mass. The Galaxy Zoo project collected simple morphological classifications of nearly 900,000 galaxies drawn from the Sloan Digital Sky Survey, contributed by hundreds of thousands of volunteers. This large number of classifications allows us to exclude classifier error, and measure the influence of subtle biases inherent in morphological classification. This paper presents the data collected by the project, alongside measures of classification accuracy and bias. The data are now publicly availablemore » and full catalogues can be downloaded in electronic format from http://data.galaxyzoo.org.« less
Comparison of Single and Multi-Scale Method for Leaf and Wood Points Classification from Terrestrial Laser Scanning Data

NASA Astrophysics Data System (ADS)

Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie

2018-04-01

The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from the terrestrial laser scanning (TLS) data. The geometry-based approach is one of the widely used classification method. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how different scale(s) used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted the single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and conducted the classification on leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10 % for both trees. The average speed-up ratio of single scale classifiers over multi-scale classifier for each tree is higher than 30.
Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

PubMed Central

Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

2016-01-01

Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832
Comparison of three methods for long-term monitoring of boreal lake area using Landsat TM and ETM+ imagery

USGS Publications Warehouse

Roach, Jennifer K.; Griffith, Brad; Verbyla, David

2012-01-01

Programs to monitor lake area change are becoming increasingly important in high latitude regions, and their development often requires evaluating tradeoffs among different approaches in terms of accuracy of measurement, consistency across multiple users over long time periods, and efficiency. We compared three supervised methods for lake classification from Landsat imagery (density slicing, classification trees, and feature extraction). The accuracy of lake area and number estimates was evaluated relative to high-resolution aerial photography acquired within two days of satellite overpasses. The shortwave infrared band 5 was better at separating surface water from nonwater when used alone than when combined with other spectral bands. The simplest of the three methods, density slicing, performed best overall. The classification tree method resulted in the most omission errors (approx. 2x), feature extraction resulted in the most commission errors (approx. 4x), and density slicing had the least directional bias (approx. half of the lakes with overestimated area and half of the lakes with underestimated area). Feature extraction was the least consistent across training sets (i.e., large standard error among different training sets). Density slicing was the best of the three at classifying small lakes as evidenced by its lower optimal minimum lake size criterion of 5850 m2 compared with the other methods (8550 m2). Contrary to conventional wisdom, the use of additional spectral bands and a more sophisticated method not only required additional processing effort but also had a cost in terms of the accuracy and consistency of lake classifications.
Hearing handicap in patients with chronic kidney disease: a study of the different classifications of the degree of hearing loss.

PubMed

Costa, Klinger Vagner Teixeira da; Ferreira, Sonia Maria Soares; Menezes, Pedro de Lemos

The association between hearing loss and chronic kidney disease and hemodialysis has been well documented. However, the classification used for the degree of loss may underestimate the actual diagnosis due to specific characteristics related to the most affected auditory frequencies. Furthermore, correlations of hearing loss and hemodialysis time with hearing handicap remain unknown in this population. To compare the results of Lloyd's and Kaplan's and The Bureau Internacional d'Audiophonologie classifications in chronic kidney disease patients, and to correlate the averages calculated by their formulas with hemodialysis time and the hearing handicap. This is an analytical, observational and cross-sectional study with 80 patients on hemodialysis. Tympanometry, speech audiometry, pure tone audiometry and interview of patients with hearing loss through Hearing Handicap Inventory for Adults. Cases were classified according to the degree of loss. The correlations of tone averages with hemodialysis time and the total scores of Hearing Handicap Inventory for Adults and its domains were verified. 86 ears (53.75%) had hearing loss in at least one of the tonal averages in 48 patients who responded to Hearing Handicap Inventory for Adults. The Bureau Internacional d'Audiophonologie classification identified a greater number of cases (n=52) with some degree of disability compared to Lloyd and Kaplan (n=16). In the group with hemodialysis time of at least 2 years, there was weak but statistically significant correlation of The Bureau Internacional d'Audiophonologie classification average with hemodialysis time (r=0.363). There were moderate correlations of average The Bureau Internacional d'Audiophonologie classification (r=0.510) and tritone 2 (r=0.470) with the total scores of Hearing Handicap Inventory for Adults and with its social domain. The Bureau Internacional d'Audiophonologie classification seems to be more appropriate than Lloyd's and Kaplan's for use in this population; its average showed correlations with hearing loss in patients with hemodialysis time≥2 years and it exhibited moderate levels of correlation with the total score of Hearing Handicap Inventory for Adults and its social domain (r=0.557 and r=0.512). Copyright © 2016 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Electroencephalography epilepsy classifications using hybrid cuckoo search and neural network

NASA Astrophysics Data System (ADS)

Pratiwi, A. B.; Damayanti, A.; Miswanto

2017-07-01

Epilepsy is a condition that affects the brain and causes repeated seizures. This seizure is episodes that can vary and nearly undetectable to long periods of vigorous shaking or brain contractions. Epilepsy often can be confirmed with an electrocephalography (EEG). Neural Networks has been used in biomedic signal analysis, it has successfully classified the biomedic signal, such as EEG signal. In this paper, a hybrid cuckoo search and neural network are used to recognize EEG signal for epilepsy classifications. The weight of the multilayer perceptron is optimized by the cuckoo search algorithm based on its error. The aim of this methods is making the network faster to obtained the local or global optimal then the process of classification become more accurate. Based on the comparison results with the traditional multilayer perceptron, the hybrid cuckoo search and multilayer perceptron provides better performance in term of error convergence and accuracy. The purpose methods give MSE 0.001 and accuracy 90.0 %.
Exposure assessment in investigations of waterborne illness: a quantitative estimate of measurement error

PubMed Central

Jones, Andria Q; Dewey, Catherine E; Doré, Kathryn; Majowicz, Shannon E; McEwen, Scott A; Waltner-Toews, David

2006-01-01

Background Exposure assessment is typically the greatest weakness of epidemiologic studies of disinfection by-products (DBPs) in drinking water, which largely stems from the difficulty in obtaining accurate data on individual-level water consumption patterns and activity. Thus, surrogate measures for such waterborne exposures are commonly used. Little attention however, has been directed towards formal validation of these measures. Methods We conducted a study in the City of Hamilton, Ontario (Canada) in 2001–2002, to assess the accuracy of two surrogate measures of home water source: (a) urban/rural status as assigned using residential postal codes, and (b) mapping of residential postal codes to municipal water systems within a Geographic Information System (GIS). We then assessed the accuracy of a commonly-used surrogate measure of an individual's actual drinking water source, namely, their home water source. Results The surrogates for home water source provided good classification of residents served by municipal water systems (approximately 98% predictive value), but did not perform well in classifying those served by private water systems (average: 63.5% predictive value). More importantly, we found that home water source was a poor surrogate measure of the individuals' actual drinking water source(s), being associated with high misclassification errors. Conclusion This study demonstrated substantial misclassification errors associated with a surrogate measure commonly used in studies of drinking water disinfection byproducts. Further, the limited accuracy of two surrogate measures of an individual's home water source heeds caution in their use in exposure classification methodology. While these surrogates are inexpensive and convenient, they should not be substituted for direct collection of accurate data pertaining to the subjects' waterborne disease exposure. In instances where such surrogates must be used, estimation of the misclassification and its subsequent effects are recommended for the interpretation and communication of results. Our results also lend support for further investigation into the quantification of the exposure misclassification associated with these surrogate measures, which would provide useful estimates for consideration in interpretation of waterborne disease studies. PMID:16729887
Characterization of Used Nuclear Fuel with Multivariate Analysis for Process Monitoring

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dayman, Kenneth J.; Coble, Jamie B.; Orton, Christopher R.

2014-01-01

The Multi-Isotope Process (MIP) Monitor combines gamma spectroscopy and multivariate analysis to detect anomalies in various process streams in a nuclear fuel reprocessing system. Measured spectra are compared to models of nominal behavior at each measurement location to detect unexpected changes in system behavior. In order to improve the accuracy and specificity of process monitoring, fuel characterization may be used to more accurately train subsequent models in a full analysis scheme. This paper presents initial development of a reactor-type classifier that is used to select a reactor-specific partial least squares model to predict fuel burnup. Nuclide activities for prototypic usedmore » fuel samples were generated in ORIGEN-ARP and used to investigate techniques to characterize used nuclear fuel in terms of reactor type (pressurized or boiling water reactor) and burnup. A variety of reactor type classification algorithms, including k-nearest neighbors, linear and quadratic discriminant analyses, and support vector machines, were evaluated to differentiate used fuel from pressurized and boiling water reactors. Then, reactor type-specific partial least squares models were developed to predict the burnup of the fuel. Using these reactor type-specific models instead of a model trained for all light water reactors improved the accuracy of burnup predictions. The developed classification and prediction models were combined and applied to a large dataset that included eight fuel assembly designs, two of which were not used in training the models, and spanned the range of the initial 235U enrichment, cooling time, and burnup values expected of future commercial used fuel for reprocessing. Error rates were consistent across the range of considered enrichment, cooling time, and burnup values. Average absolute relative errors in burnup predictions for validation data both within and outside the training space were 0.0574% and 0.0597%, respectively. The errors seen in this work are artificially low, because the models were trained, optimized, and tested on simulated, noise-free data. However, these results indicate that the developed models may generalize well to new data and that the proposed approach constitutes a viable first step in developing a fuel characterization algorithm based on gamma spectra.« less
Evaluation and comparison of benchmark QSAR models to predict a relevant REACH endpoint: The bioconcentration factor (BCF).

PubMed

Gissi, Andrea; Lombardo, Anna; Roncaglioni, Alessandra; Gadaleta, Domenico; Mangiatordi, Giuseppe Felice; Nicolotti, Orazio; Benfenati, Emilio

2015-02-01

The bioconcentration factor (BCF) is an important bioaccumulation hazard assessment metric in many regulatory contexts. Its assessment is required by the REACH regulation (Registration, Evaluation, Authorization and Restriction of Chemicals) and by CLP (Classification, Labeling and Packaging). We challenged nine well-known and widely used BCF QSAR models against 851 compounds stored in an ad-hoc created database. The goodness of the regression analysis was assessed by considering the determination coefficient (R(2)) and the Root Mean Square Error (RMSE); Cooper's statistics and Matthew's Correlation Coefficient (MCC) were calculated for all the thresholds relevant for regulatory purposes (i.e. 100L/kg for Chemical Safety Assessment; 500L/kg for Classification and Labeling; 2000 and 5000L/kg for Persistent, Bioaccumulative and Toxic (PBT) and very Persistent, very Bioaccumulative (vPvB) assessment) to assess the classification, with particular attention to the models' ability to control the occurrence of false negatives. As a first step, statistical analysis was performed for the predictions of the entire dataset; R(2)>0.70 was obtained using CORAL, T.E.S.T. and EPISuite Arnot-Gobas models. As classifiers, ACD and logP-based equations were the best in terms of sensitivity, ranging from 0.75 to 0.94. External compound predictions were carried out for the models that had their own training sets. CORAL model returned the best performance (R(2)ext=0.59), followed by the EPISuite Meylan model (R(2)ext=0.58). The latter gave also the highest sensitivity on external compounds with values from 0.55 to 0.85, depending on the thresholds. Statistics were also compiled for compounds falling into the models Applicability Domain (AD), giving better performances. In this respect, VEGA CAESAR was the best model in terms of regression (R(2)=0.94) and classification (average sensitivity>0.80). This model also showed the best regression (R(2)=0.85) and sensitivity (average>0.70) for new compounds in the AD but not present in the training set. However, no single optimal model exists and, thus, it would be wise a case-by-case assessment. Yet, integrating the wealth of information from multiple models remains the winner approach. Copyright © 2014 Elsevier Inc. All rights reserved.

Peculiarities of use of ECOC and AdaBoost based classifiers for thematic processing of hyperspectral data

NASA Astrophysics Data System (ADS)

Dementev, A. O.; Dmitriev, E. V.; Kozoderov, V. V.; Egorov, V. D.

2017-10-01

Hyperspectral imaging is up-to-date promising technology widely applied for the accurate thematic mapping. The presence of a large number of narrow survey channels allows us to use subtle differences in spectral characteristics of objects and to make a more detailed classification than in the case of using standard multispectral data. The difficulties encountered in the processing of hyperspectral images are usually associated with the redundancy of spectral information which leads to the problem of the curse of dimensionality. Methods currently used for recognizing objects on multispectral and hyperspectral images are usually based on standard base supervised classification algorithms of various complexity. Accuracy of these algorithms can be significantly different depending on considered classification tasks. In this paper we study the performance of ensemble classification methods for the problem of classification of the forest vegetation. Error correcting output codes and boosting are tested on artificial data and real hyperspectral images. It is demonstrates, that boosting gives more significant improvement when used with simple base classifiers. The accuracy in this case in comparable the error correcting output code (ECOC) classifier with Gaussian kernel SVM base algorithm. However the necessity of boosting ECOC with Gaussian kernel SVM is questionable. It is demonstrated, that selected ensemble classifiers allow us to recognize forest species with high enough accuracy which can be compared with ground-based forest inventory data.
Method of estimating natural recharge to the Edwards Aquifer in the San Antonio area, Texas

USGS Publications Warehouse

Puente, Celso

1978-01-01

The principal errors in the estimates of annual recharge are related to errors in estimating runoff in ungaged areas, which represent about 30 percent of the infiltration area. The estimated long-term average annual recharge in each basin, however, is probably representative of the actual recharge because the averaging procedure tends to cancel out the major errors.
Reconstruction of regional mean temperature for East Asia since 1900s and its uncertainties

NASA Astrophysics Data System (ADS)

Hua, W.

2017-12-01

Regional average surface air temperature (SAT) is one of the key variables often used to investigate climate change. Unfortunately, because of the limited observations over East Asia, there were also some gaps in the observation data sampling for regional mean SAT analysis, which was important to estimate past climate change. In this study, the regional average temperature of East Asia since 1900s is calculated by the Empirical Orthogonal Function (EOF)-based optimal interpolation (OA) method with considering the data errors. The results show that our estimate is more precise and robust than the results from simple average, which provides a better way for past climate reconstruction. In addition to the reconstructed regional average SAT anomaly time series, we also estimated uncertainties of reconstruction. The root mean square error (RMSE) results show that the the error decreases with respect to time, and are not sufficiently large to alter the conclusions on the persist warming in East Asia during twenty-first century. Moreover, the test of influence of data error on reconstruction clearly shows the sensitivity of reconstruction to the size of the data error.
Nova Mus 2008 = QY Mus

NASA Astrophysics Data System (ADS)

Templeton, Matthew R.

2008-10-01

Nova Mus 2008 = QY Mus was discovered by William Liller, Vina del Mar, Chile, on 2008 September 28.998 UT at magnitude 8.6 (Tech Pan film + orange filter). The position is RA = 13h 16m 36.44s , Dec = -67d 36m 47.8s (from P. Nelson). This object was announced as a nova in IAU Circular 8990 (Daniel W.E. Green, editor). The nova classification was determined using low-resolution spectra by W. Liller indicating the presence of broad H-alpha lines at least 2300 angstroms wide. Several observers confirmed the nova and provided photometry. The position above was provided by Peter Nelson (Ellinbank, Vic., Aus.), and is averaged from four separate exposures (rms error approx. 0.4 arcseconds). The GCVS team have formally designated Nova Mus 2008 as QY MUS. Observations should be reported to the AAVSO International Database as QY MUS.
Preliminary evaluation of spectral, normal and meteorological crop stage estimation approaches

NASA Technical Reports Server (NTRS)

Cate, R. B.; Artley, J. A.; Doraiswamy, P. C.; Hodges, T.; Kinsler, M. C.; Phinney, D. E.; Sestak, M. L. (Principal Investigator)

1980-01-01

Several of the projects in the AgRISTARS program require crop phenology information, including classification, acreage and yield estimation, and detection of episodal events. This study evaluates several crop calendar estimation techniques for their potential use in the program. The techniques, although generic in approach, were developed and tested on spring wheat data collected in 1978. There are three basic approaches to crop stage estimation: historical averages for an area (normal crop calendars), agrometeorological modeling of known crop-weather relationships agrometeorological (agromet) crop calendars, and interpretation of spectral signatures (spectral crop calendars). In all, 10 combinations of planting and biostage estimation models were evaluated. Dates of stage occurrence are estimated with biases between -4 and +4 days while root mean square errors range from 10 to 15 days. Results are inconclusive as to the superiority of any of the models and further evaluation of the models with the 1979 data set is recommended.
Error, Power, and Blind Sentinels: The Statistics of Seagrass Monitoring

PubMed Central

Schultz, Stewart T.; Kruschel, Claudia; Bakran-Petricioli, Tatjana; Petricioli, Donat

2015-01-01

We derive statistical properties of standard methods for monitoring of habitat cover worldwide, and criticize them in the context of mandated seagrass monitoring programs, as exemplified by Posidonia oceanica in the Mediterranean Sea. We report the novel result that cartographic methods with non-trivial classification errors are generally incapable of reliably detecting habitat cover losses less than about 30 to 50%, and the field labor required to increase their precision can be orders of magnitude higher than that required to estimate habitat loss directly in a field campaign. We derive a universal utility threshold of classification error in habitat maps that represents the minimum habitat map accuracy above which direct methods are superior. Widespread government reliance on blind-sentinel methods for monitoring seafloor can obscure the gradual and currently ongoing losses of benthic resources until the time has long passed for meaningful management intervention. We find two classes of methods with very high statistical power for detecting small habitat cover losses: 1) fixed-plot direct methods, which are over 100 times as efficient as direct random-plot methods in a variable habitat mosaic; and 2) remote methods with very low classification error such as geospatial underwater videography, which is an emerging, low-cost, non-destructive method for documenting small changes at millimeter visual resolution. General adoption of these methods and their further development will require a fundamental cultural change in conservation and management bodies towards the recognition and promotion of requirements of minimal statistical power and precision in the development of international goals for monitoring these valuable resources and the ecological services they provide. PMID:26367863
Noise-enhanced convolutional neural networks.

PubMed

Audhkhasi, Kartik; Osoba, Osonde; Kosko, Bart

2016-06-01

Injecting carefully chosen noise can speed convergence in the backpropagation training of a convolutional neural network (CNN). The Noisy CNN algorithm speeds training on average because the backpropagation algorithm is a special case of the generalized expectation-maximization (EM) algorithm and because such carefully chosen noise always speeds up the EM algorithm on average. The CNN framework gives a practical way to learn and recognize images because backpropagation scales with training data. It has only linear time complexity in the number of training samples. The Noisy CNN algorithm finds a special separating hyperplane in the network's noise space. The hyperplane arises from the likelihood-based positivity condition that noise-boosts the EM algorithm. The hyperplane cuts through a uniform-noise hypercube or Gaussian ball in the noise space depending on the type of noise used. Noise chosen from above the hyperplane speeds training on average. Noise chosen from below slows it on average. The algorithm can inject noise anywhere in the multilayered network. Adding noise to the output neurons reduced the average per-iteration training-set cross entropy by 39% on a standard MNIST image test set of handwritten digits. It also reduced the average per-iteration training-set classification error by 47%. Adding noise to the hidden layers can also reduce these performance measures. The noise benefit is most pronounced for smaller data sets because the largest EM hill-climbing gains tend to occur in the first few iterations. This noise effect can assist random sampling from large data sets because it allows a smaller random sample to give the same or better performance than a noiseless sample gives. Copyright © 2015 Elsevier Ltd. All rights reserved.
Multilayer perceptron, fuzzy sets, and classification

NASA Technical Reports Server (NTRS)

Pal, Sankar K.; Mitra, Sushmita

1992-01-01

A fuzzy neural network model based on the multilayer perceptron, using the back-propagation algorithm, and capable of fuzzy classification of patterns is described. The input vector consists of membership values to linguistic properties while the output vector is defined in terms of fuzzy class membership values. This allows efficient modeling of fuzzy or uncertain patterns with appropriate weights being assigned to the backpropagated errors depending upon the membership values at the corresponding outputs. During training, the learning rate is gradually decreased in discrete steps until the network converges to a minimum error solution. The effectiveness of the algorithm is demonstrated on a speech recognition problem. The results are compared with those of the conventional MLP, the Bayes classifier, and the other related models.
Automatic white blood cell classification using pre-trained deep learning models: ResNet and Inception

NASA Astrophysics Data System (ADS)

Habibzadeh, Mehdi; Jannesari, Mahboobeh; Rezaei, Zahra; Baharvand, Hossein; Totonchi, Mehdi

2018-04-01

This works gives an account of evaluation of white blood cell differential counts via computer aided diagnosis (CAD) system and hematology rules. Leukocytes, also called white blood cells (WBCs) play main role of the immune system. Leukocyte is responsible for phagocytosis and immunity and therefore in defense against infection involving the fatal diseases incidence and mortality related issues. Admittedly, microscopic examination of blood samples is a time consuming, expensive and error-prone task. A manual diagnosis would search for specific Leukocytes and number abnormalities in the blood slides while complete blood count (CBC) examination is performed. Complications may arise from the large number of varying samples including different types of Leukocytes, related sub-types and concentration in blood, which makes the analysis prone to human error. This process can be automated by computerized techniques which are more reliable and economical. In essence, we seek to determine a fast, accurate mechanism for classification and gather information about distribution of white blood evidences which may help to diagnose the degree of any abnormalities during CBC test. In this work, we consider the problem of pre-processing and supervised classification of white blood cells into their four primary types including Neutrophils, Eosinophils, Lymphocytes, and Monocytes using a consecutive proposed deep learning framework. For first step, this research proposes three consecutive pre-processing calculations namely are color distortion; bounding box distortion (crop) and image flipping mirroring. In second phase, white blood cell recognition performed with hierarchy topological feature extraction using Inception and ResNet architectures. Finally, the results obtained from the preliminary analysis of cell classification with (11200) training samples and 1244 white blood cells evaluation data set are presented in confusion matrices and interpreted using accuracy rate, and false positive with the classification framework being validated with experiments conducted on poor quality blood images sized 320 × 240 pixels. The deferential outcomes in the challenging cell detection task, as shown in result section, indicate that there is a significant achievement in using Inception and ResNet architecture with proposed settings. Our framework detects on average 100% of the four main white blood cell types using ResNet V1 50 while also alternative promising result with 99.84% and 99.46% accuracy rate obtained with ResNet V1 152 and ResNet 101, respectively with 3000 epochs and fine-tuning all layers. Further statistical confusion matrix tests revealed that this work achieved 1, 0.9979, 0.9989 sensitivity values when area under the curve (AUC) scores above 1, 0.9992, 0.9833 on three proposed techniques. In addition, current work shows negligible and small false negative 0, 2, 1 and substantial false positive with 0, 0, 5 values in Leukocytes detection.
One-Class Classification-Based Real-Time Activity Error Detection in Smart Homes.

PubMed

Das, Barnan; Cook, Diane J; Krishnan, Narayanan C; Schmitter-Edgecombe, Maureen

2016-08-01

Caring for individuals with dementia is frequently associated with extreme physical and emotional stress, which often leads to depression. Smart home technology and advances in machine learning techniques can provide innovative solutions to reduce caregiver burden. One key service that caregivers provide is prompting individuals with memory limitations to initiate and complete daily activities. We hypothesize that sensor technologies combined with machine learning techniques can automate the process of providing reminder-based interventions. The first step towards automated interventions is to detect when an individual faces difficulty with activities. We propose machine learning approaches based on one-class classification that learn normal activity patterns. When we apply these classifiers to activity patterns that were not seen before, the classifiers are able to detect activity errors, which represent potential prompt situations. We validate our approaches on smart home sensor data obtained from older adult participants, some of whom faced difficulties performing routine activities and thus committed errors.
Exploring diversity in ensemble classification: Applications in large area land cover mapping

NASA Astrophysics Data System (ADS)

Mellor, Andrew; Boukir, Samia

2017-07-01

Ensemble classifiers, such as random forests, are now commonly applied in the field of remote sensing, and have been shown to perform better than single classifier systems, resulting in reduced generalisation error. Diversity across the members of ensemble classifiers is known to have a strong influence on classification performance - whereby classifier errors are uncorrelated and more uniformly distributed across ensemble members. The relationship between ensemble diversity and classification performance has not yet been fully explored in the fields of information science and machine learning and has never been examined in the field of remote sensing. This study is a novel exploration of ensemble diversity and its link to classification performance, applied to a multi-class canopy cover classification problem using random forests and multisource remote sensing and ancillary GIS data, across seven million hectares of diverse dry-sclerophyll dominated public forests in Victoria Australia. A particular emphasis is placed on analysing the relationship between ensemble diversity and ensemble margin - two key concepts in ensemble learning. The main novelty of our work is on boosting diversity by emphasizing the contribution of lower margin instances used in the learning process. Exploring the influence of tree pruning on diversity is also a new empirical analysis that contributes to a better understanding of ensemble performance. Results reveal insights into the trade-off between ensemble classification accuracy and diversity, and through the ensemble margin, demonstrate how inducing diversity by targeting lower margin training samples is a means of achieving better classifier performance for more difficult or rarer classes and reducing information redundancy in classification problems. Our findings inform strategies for collecting training data and designing and parameterising ensemble classifiers, such as random forests. This is particularly important in large area remote sensing applications, for which training data is costly and resource intensive to collect.
Analysis of swallowing sounds using hidden Markov models.

PubMed

Aboofazeli, Mohammad; Moussavi, Zahra

2008-04-01

In recent years, acoustical analysis of the swallowing mechanism has received considerable attention due to its diagnostic potentials. This paper presents a hidden Markov model (HMM) based method for the swallowing sound segmentation and classification. Swallowing sound signals of 15 healthy and 11 dysphagic subjects were studied. The signals were divided into sequences of 25 ms segments each of which were represented by seven features. The sequences of features were modeled by HMMs. Trained HMMs were used for segmentation of the swallowing sounds into three distinct phases, i.e., initial quiet period, initial discrete sounds (IDS) and bolus transit sounds (BTS). Among the seven features, accuracy of segmentation by the HMM based on multi-scale product of wavelet coefficients was higher than that of the other HMMs and the linear prediction coefficient (LPC)-based HMM showed the weakest performance. In addition, HMMs were used for classification of the swallowing sounds of healthy subjects and dysphagic patients. Classification accuracy of different HMM configurations was investigated. When we increased the number of states of the HMMs from 4 to 8, the classification error gradually decreased. In most cases, classification error for N=9 was higher than that of N=8. Among the seven features used, root mean square (RMS) and waveform fractal dimension (WFD) showed the best performance in the HMM-based classification of swallowing sounds. When the sequences of the features of IDS segment were modeled separately, the accuracy reached up to 85.5%. As a second stage classification, a screening algorithm was used which correctly classified all the subjects but one healthy subject when RMS was used as characteristic feature of the swallowing sounds and the number of states was set to N=8.
Spelling in Adolescents with Dyslexia: Errors and Modes of Assessment

ERIC Educational Resources Information Center

Tops, Wim; Callens, Maaike; Bijn, Evi; Brysbaert, Marc

2014-01-01

In this study we focused on the spelling of high-functioning students with dyslexia. We made a detailed classification of the errors in a word and sentence dictation task made by 100 students with dyslexia and 100 matched control students. All participants were in the first year of their bachelor's studies and had Dutch as mother tongue. Three…
Estimation of a cover-type change matrix from error-prone data

Treesearch

Steen Magnussen

2009-01-01

Coregistration and classification errors seriously compromise per-pixel estimates of land cover change. A more robust estimation of change is proposed in which adjacent pixels are grouped into 3x3 clusters and treated as a unit of observation. A complete change matrix is recovered in a two-step process. The diagonal elements of a change matrix are recovered from...
Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

ERIC Educational Resources Information Center

Chen, Chau-Kuang

2010-01-01

Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…
North American vegetation model for land-use planning in a changing climate: A solution to large classification problems

Treesearch

Gerald E. Rehfeldt; Nicholas L. Crookston; Cuauhtemoc Saenz-Romero; Elizabeth M. Campbell

2012-01-01

Data points intensively sampling 46 North American biomes were used to predict the geographic distribution of biomes from climate variables using the Random Forests classification tree. Techniques were incorporated to accommodate a large number of classes and to predict the future occurrence of climates beyond the contemporary climatic range of the biomes. Errors of...
Feasibility of Equivalent Dipole Models for Electroencephalogram-Based Brain Computer Interfaces.

PubMed

Schimpf, Paul H

2017-09-15

This article examines the localization errors of equivalent dipolar sources inverted from the surface electroencephalogram in order to determine the feasibility of using their location as classification parameters for non-invasive brain computer interfaces. Inverse localization errors are examined for two head models: a model represented by four concentric spheres and a realistic model based on medical imagery. It is shown that the spherical model results in localization ambiguity such that a number of dipolar sources, with different azimuths and varying orientations, provide a near match to the electroencephalogram of the best equivalent source. No such ambiguity exists for the elevation of inverted sources, indicating that for spherical head models, only the elevation of inverted sources (and not the azimuth) can be expected to provide meaningful classification parameters for brain-computer interfaces. In a realistic head model, all three parameters of the inverted source location are found to be reliable, providing a more robust set of parameters. In both cases, the residual error hypersurfaces demonstrate local minima, indicating that a search for the best-matching sources should be global. Source localization error vs. signal-to-noise ratio is also demonstrated for both head models.
Reduction of the dimension of neural network models in problems of pattern recognition and forecasting

NASA Astrophysics Data System (ADS)

Nasertdinova, A. D.; Bochkarev, V. V.

2017-11-01

Deep neural networks with a large number of parameters are a powerful tool for solving problems of pattern recognition, prediction and classification. Nevertheless, overfitting remains a serious problem in the use of such networks. A method of solving the problem of overfitting is proposed in this article. This method is based on reducing the number of independent parameters of a neural network model using the principal component analysis, and can be implemented using existing libraries of neural computing. The algorithm was tested on the problem of recognition of handwritten symbols from the MNIST database, as well as on the task of predicting time series (rows of the average monthly number of sunspots and series of the Lorentz system were used). It is shown that the application of the principal component analysis enables reducing the number of parameters of the neural network model when the results are good. The average error rate for the recognition of handwritten figures from the MNIST database was 1.12% (which is comparable to the results obtained using the "Deep training" methods), while the number of parameters of the neural network can be reduced to 130 times.
Wind data mining by Kohonen Neural Networks.

PubMed

Fayos, José; Fayos, Carolina

2007-02-14

Time series of Circulation Weather Type (CWT), including daily averaged wind direction and vorticity, are self-classified by similarity using Kohonen Neural Networks (KNN). It is shown that KNN is able to map by similarity all 7300 five-day CWT sequences during the period of 1975-94, in London, United Kingdom. It gives, as a first result, the most probable wind sequences preceding each one of the 27 CWT Lamb classes in that period. Inversely, as a second result, the observed diffuse correlation between both five-day CWT sequences and the CWT of the 6(th) day, in the long 20-year period, can be generalized to predict the last from the previous CWT sequence in a different test period, like 1995, as both time series are similar. Although the average prediction error is comparable to that obtained by forecasting standard methods, the KNN approach gives complementary results, as they depend only on an objective classification of observed CWT data, without any model assumption. The 27 CWT of the Lamb Catalogue were coded with binary three-dimensional vectors, pointing to faces, edges and vertex of a "wind-cube," so that similar CWT vectors were close.
Variability of Organophosphorous Pesticide Metabolite Levels in Spot and 24-hr Urine Samples Collected from Young Children during 1 Week

PubMed Central

Kogut, Katherine; Eisen, Ellen A.; Jewell, Nicholas P.; Quirós-Alcalá, Lesliam; Castorina, Rosemary; Chevrier, Jonathan; Holland, Nina T.; Barr, Dana Boyd; Kavanagh-Baird, Geri; Eskenazi, Brenda

2012-01-01

Background: Dialkyl phosphate (DAP) metabolites in spot urine samples are frequently used to characterize children’s exposures to organophosphorous (OP) pesticides. However, variable exposure and short biological half-lives of OP pesticides could result in highly variable measurements, leading to exposure misclassification. Objective: We examined within- and between-child variability in DAP metabolites in urine samples collected during 1 week. Methods: We collected spot urine samples over 7 consecutive days from 25 children (3–6 years of age). On two of the days, we collected 24-hr voids. We assessed the reproducibility of urinary DAP metabolite concentrations and evaluated the sensitivity and specificity of spot urine samples as predictors of high (top 20%) or elevated (top 40%) weekly average DAP metabolite concentrations. Results: Within-child variance exceeded between-child variance by a factor of two to eight, depending on metabolite grouping. Although total DAP concentrations in single spot urine samples were moderately to strongly associated with concentrations in same-day 24-hr samples (r ≈ 0.6–0.8, p < 0.01), concentrations in spot samples collected > 1 day apart and in 24-hr samples collected 3 days apart were weakly correlated (r ≈ –0.21 to 0.38). Single spot samples predicted high (top 20%) and elevated (top 40%) full-week average total DAP excretion with only moderate sensitivity (≈ 0.52 and ≈ 0.67, respectively) but relatively high specificity (≈ 0.88 and ≈ 0.78, respectively). Conclusions: The high variability we observed in children’s DAP metabolite concentrations suggests that single-day urine samples provide only a brief snapshot of exposure. Sensitivity analyses suggest that classification of cumulative OP exposure based on spot samples is prone to type 2 classification errors. PMID:23052012

Deep Learning with Convolutional Neural Networks Applied to Electromyography Data: A Resource for the Classification of Movements for Prosthetic Hands

PubMed Central

Atzori, Manfredo; Cognolato, Matteo; Müller, Henning

2016-01-01

Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too. PMID:27656140
Deep Learning with Convolutional Neural Networks Applied to Electromyography Data: A Resource for the Classification of Movements for Prosthetic Hands.

PubMed

Atzori, Manfredo; Cognolato, Matteo; Müller, Henning

2016-01-01

Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed to make several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too.
Methods for estimating flood frequency in Montana based on data through water year 1998

USGS Publications Warehouse

Parrett, Charles; Johnson, Dave R.

2004-01-01

Annual peak discharges having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (T-year floods) were determined for 660 gaged sites in Montana and in adjacent areas of Idaho, Wyoming, and Canada, based on data through water year 1998. The updated flood-frequency information was subsequently used in regression analyses, either ordinary or generalized least squares, to develop equations relating T-year floods to various basin and climatic characteristics, equations relating T-year floods to active-channel width, and equations relating T-year floods to bankfull width. The equations can be used to estimate flood frequency at ungaged sites. Montana was divided into eight regions, within which flood characteristics were considered to be reasonably homogeneous, and the three sets of regression equations were developed for each region. A measure of the overall reliability of the regression equations is the average standard error of prediction. The average standard errors of prediction for the equations based on basin and climatic characteristics ranged from 37.4 percent to 134.1 percent. Average standard errors of prediction for the equations based on active-channel width ranged from 57.2 percent to 141.3 percent. Average standard errors of prediction for the equations based on bankfull width ranged from 63.1 percent to 155.5 percent. In most regions, the equations based on basin and climatic characteristics generally had smaller average standard errors of prediction than equations based on active-channel or bankfull width. An exception was the Southeast Plains Region, where all equations based on active-channel width had smaller average standard errors of prediction than equations based on basin and climatic characteristics or bankfull width. Methods for weighting estimates derived from the basin- and climatic-characteristic equations and the channel-width equations also were developed. The weights were based on the cross correlation of residuals from the different methods and the average standard errors of prediction. When all three methods were combined, the average standard errors of prediction ranged from 37.4 percent to 120.2 percent. Weighting of estimates reduced the standard errors of prediction for all T-year flood estimates in four regions, reduced the standard errors of prediction for some T-year flood estimates in two regions, and provided no reduction in average standard error of prediction in two regions. A computer program for solving the regression equations, weighting estimates, and determining reliability of individual estimates was developed and placed on the USGS Montana District World Wide Web page. A new regression method, termed Region of Influence regression, also was tested. Test results indicated that the Region of Influence method was not as reliable as the regional equations based on generalized least squares regression. Two additional methods for estimating flood frequency at ungaged sites located on the same streams as gaged sites also are described. The first method, based on a drainage-area-ratio adjustment, is intended for use on streams where the ungaged site of interest is located near a gaged site. The second method, based on interpolation between gaged sites, is intended for use on streams that have two or more streamflow-gaging stations.
Identification of terrain cover using the optimum polarimetric classifier

NASA Technical Reports Server (NTRS)

Kong, J. A.; Swartz, A. A.; Yueh, H. A.; Novak, L. M.; Shin, R. T.

1988-01-01

A systematic approach for the identification of terrain media such as vegetation canopy, forest, and snow-covered fields is developed using the optimum polarimetric classifier. The covariance matrices for various terrain cover are computed from theoretical models of random medium by evaluating the scattering matrix elements. The optimal classification scheme makes use of a quadratic distance measure and is applied to classify a vegetation canopy consisting of both trees and grass. Experimentally measured data are used to validate the classification scheme. Analytical and Monte Carlo simulated classification errors using the fully polarimetric feature vector are compared with classification based on single features which include the phase difference between the VV and HH polarization returns. It is shown that the full polarimetric results are optimal and provide better classification performance than single feature measurements.
Classifying nursing errors in clinical management within an Australian hospital.

PubMed

Tran, D T; Johnson, M

2010-12-01

Although many classification systems relating to patient safety exist, no taxonomy was identified that classified nursing errors in clinical management. To develop a classification system for nursing errors relating to clinical management (NECM taxonomy) and to describe contributing factors and patient consequences. We analysed 241 (11%) self-reported incidents relating to clinical management in nursing in a metropolitan hospital. Descriptive analysis of numeric data and content analysis of text data were undertaken to derive the NECM taxonomy, contributing factors and consequences for patients. Clinical management incidents represented 1.63 incidents per 1000 occupied bed days. The four themes of the NECM taxonomy were nursing care process (67%), communication (22%), administrative process (5%), and knowledge and skill (6%). Half of the incidents did not cause any patient harm. Contributing factors (n=111) included the following: patient clinical, social conditions and behaviours (27%); resources (22%); environment and workload (18%); other health professionals (15%); communication (13%); and nurse's knowledge and experience (5%). The NECM taxonomy provides direction to clinicians and managers on areas in clinical management that are most vulnerable to error, and therefore, priorities for system change management. Any nurses who wish to classify nursing errors relating to clinical management could use these types of errors. This study informs further research into risk management behaviour, and self-assessment tools for clinicians. Globally, nurses need to continue to monitor and act upon patient safety issues. © 2010 The Authors. International Nursing Review © 2010 International Council of Nurses.
In-vivo determination of chewing patterns using FBG and artificial neural networks

NASA Astrophysics Data System (ADS)

Pegorini, Vinicius; Zen Karam, Leandro; Rocha Pitta, Christiano S.; Ribeiro, Richardson; Simioni Assmann, Tangriani; Cardozo da Silva, Jean Carlos; Bertotti, Fábio L.; Kalinowski, Hypolito J.; Cardoso, Rafael

2015-09-01

This paper reports the process of pattern classification of the chewing process of ruminants. We propose a simplified signal processing scheme for optical fiber Bragg grating (FBG) sensors based on machine learning techniques. The FBG sensors measure the biomechanical forces during jaw movements and an artificial neural network is responsible for the classification of the associated chewing pattern. In this study, three patterns associated to dietary supplement, hay and ryegrass were considered. Additionally, two other important events for ingestive behavior studies were monitored, rumination and idle period. Experimental results show that the proposed approach for pattern classification has been capable of differentiating the materials involved in the chewing process with a small classification error.
Accuracy assessment, using stratified plurality sampling, of portions of a LANDSAT classification of the Arctic National Wildlife Refuge Coastal Plain

NASA Technical Reports Server (NTRS)

Card, Don H.; Strong, Laurence L.

1989-01-01

An application of a classification accuracy assessment procedure is described for a vegetation and land cover map prepared by digital image processing of LANDSAT multispectral scanner data. A statistical sampling procedure called Stratified Plurality Sampling was used to assess the accuracy of portions of a map of the Arctic National Wildlife Refuge coastal plain. Results are tabulated as percent correct classification overall as well as per category with associated confidence intervals. Although values of percent correct were disappointingly low for most categories, the study was useful in highlighting sources of classification error and demonstrating shortcomings of the plurality sampling method.
Hyperspectral image classification based on local binary patterns and PCANet

NASA Astrophysics Data System (ADS)

Yang, Huizhen; Gao, Feng; Dong, Junyu; Yang, Yang

2018-04-01

Hyperspectral image classification has been well acknowledged as one of the challenging tasks of hyperspectral data processing. In this paper, we propose a novel hyperspectral image classification framework based on local binary pattern (LBP) features and PCANet. In the proposed method, linear prediction error (LPE) is first employed to select a subset of informative bands, and LBP is utilized to extract texture features. Then, spectral and texture features are stacked into a high dimensional vectors. Next, the extracted features of a specified position are transformed to a 2-D image. The obtained images of all pixels are fed into PCANet for classification. Experimental results on real hyperspectral dataset demonstrate the effectiveness of the proposed method.
Sub-pixel image classification for forest types in East Texas

NASA Astrophysics Data System (ADS)

Westbrook, Joey

Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster datasets was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data had the highest (65% overall) out of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent interval each to five classes with 20 percent interval each. When compared to the supervised classification which has a satisfactory overall accuracy of 90%, none of the sub-pixel classification achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.
WE-A-17A-03: Catheter Digitization in High-Dose-Rate Brachytherapy with the Assistance of An Electromagnetic (EM) Tracking System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Damato, AL; Bhagwat, MS; Buzurovic, I

Purpose: To investigate the use of a system using EM tracking, postprocessing and error-detection algorithms for measuring brachytherapy catheter locations and for detecting errors and resolving uncertainties in treatment-planning catheter digitization. Methods: An EM tracker was used to localize 13 catheters in a clinical surface applicator (A) and 15 catheters inserted into a phantom (B). Two pairs of catheters in (B) crossed paths at a distance <2 mm, producing an undistinguishable catheter artifact in that location. EM data was post-processed for noise reduction and reformatted to provide the dwell location configuration. CT-based digitization was automatically extracted from the brachytherapy planmore » DICOM files (CT). EM dwell digitization error was characterized in terms of the average and maximum distance between corresponding EM and CT dwells per catheter. The error detection rate (detected errors / all errors) was calculated for 3 types of errors: swap of two catheter numbers; incorrect catheter number identification superior to the closest position between two catheters (mix); and catheter-tip shift. Results: The averages ± 1 standard deviation of the average and maximum registration error per catheter were 1.9±0.7 mm and 3.0±1.1 mm for (A) and 1.6±0.6 mm and 2.7±0.8 mm for (B). The error detection rate was 100% (A and B) for swap errors, mix errors, and shift >4.5 mm (A) and >5.5 mm (B); errors were detected for shifts on average >2.0 mm (A) and >2.4 mm (B). Both mix errors associated with undistinguishable catheter artifacts were detected and at least one of the involved catheters was identified. Conclusion: We demonstrated the use of an EM tracking system for localization of brachytherapy catheters, detection of digitization errors and resolution of undistinguishable catheter artifacts. Automatic digitization may be possible with a registration between the imaging and the EM frame of reference. Research funded by the Kaye Family Award 2012.« less
Superpixel-based spectral classification for the detection of head and neck cancer with hyperspectral imaging

NASA Astrophysics Data System (ADS)

Chung, Hyunkoo; Lu, Guolan; Tian, Zhiqiang; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

2016-03-01

Hyperspectral imaging (HSI) is an emerging imaging modality for medical applications. HSI acquires two dimensional images at various wavelengths. The combination of both spectral and spatial information provides quantitative information for cancer detection and diagnosis. This paper proposes using superpixels, principal component analysis (PCA), and support vector machine (SVM) to distinguish regions of tumor from healthy tissue. The classification method uses 2 principal components decomposed from hyperspectral images and obtains an average sensitivity of 93% and an average specificity of 85% for 11 mice. The hyperspectral imaging technology and classification method can have various applications in cancer research and management.
Accuracy Study of a Robotic System for MRI-guided Prostate Needle Placement

PubMed Central

Seifabadi, Reza; Cho, Nathan BJ.; Song, Sang-Eun; Tokuda, Junichi; Hata, Nobuhiko; Tempany, Clare M.; Fichtinger, Gabor; Iordachita, Iulian

2013-01-01

Background Accurate needle placement is the first concern in percutaneous MRI-guided prostate interventions. In this phantom study, different sources contributing to the overall needle placement error of a MRI-guided robot for prostate biopsy have been identified, quantified, and minimized to the possible extent. Methods and Materials The overall needle placement error of the system was evaluated in a prostate phantom. This error was broken into two parts: the error associated with the robotic system (called before-insertion error) and the error associated with needle-tissue interaction (called due-to-insertion error). The before-insertion error was measured directly in a soft phantom and different sources contributing into this part were identified and quantified. A calibration methodology was developed to minimize the 4-DOF manipulator’s error. The due-to-insertion error was indirectly approximated by comparing the overall error and the before-insertion error. The effect of sterilization on the manipulator’s accuracy and repeatability was also studied. Results The average overall system error in phantom study was 2.5 mm (STD=1.1mm). The average robotic system error in super soft phantom was 1.3 mm (STD=0.7 mm). Assuming orthogonal error components, the needle-tissue interaction error was approximated to be 2.13 mm thus having larger contribution to the overall error. The average susceptibility artifact shift was 0.2 mm. The manipulator’s targeting accuracy was 0.71 mm (STD=0.21mm) after robot calibration. The robot’s repeatability was 0.13 mm. Sterilization had no noticeable influence on the robot’s accuracy and repeatability. Conclusions The experimental methodology presented in this paper may help researchers to identify, quantify, and minimize different sources contributing into the overall needle placement error of an MRI-guided robotic system for prostate needle placement. In the robotic system analyzed here, the overall error of the studied system remained within the acceptable range. PMID:22678990
Accuracy study of a robotic system for MRI-guided prostate needle placement.

PubMed

Seifabadi, Reza; Cho, Nathan B J; Song, Sang-Eun; Tokuda, Junichi; Hata, Nobuhiko; Tempany, Clare M; Fichtinger, Gabor; Iordachita, Iulian

2013-09-01

Accurate needle placement is the first concern in percutaneous MRI-guided prostate interventions. In this phantom study, different sources contributing to the overall needle placement error of a MRI-guided robot for prostate biopsy have been identified, quantified and minimized to the possible extent. The overall needle placement error of the system was evaluated in a prostate phantom. This error was broken into two parts: the error associated with the robotic system (called 'before-insertion error') and the error associated with needle-tissue interaction (called 'due-to-insertion error'). Before-insertion error was measured directly in a soft phantom and different sources contributing into this part were identified and quantified. A calibration methodology was developed to minimize the 4-DOF manipulator's error. The due-to-insertion error was indirectly approximated by comparing the overall error and the before-insertion error. The effect of sterilization on the manipulator's accuracy and repeatability was also studied. The average overall system error in the phantom study was 2.5 mm (STD = 1.1 mm). The average robotic system error in the Super Soft plastic phantom was 1.3 mm (STD = 0.7 mm). Assuming orthogonal error components, the needle-tissue interaction error was found to be approximately 2.13 mm, thus making a larger contribution to the overall error. The average susceptibility artifact shift was 0.2 mm. The manipulator's targeting accuracy was 0.71 mm (STD = 0.21 mm) after robot calibration. The robot's repeatability was 0.13 mm. Sterilization had no noticeable influence on the robot's accuracy and repeatability. The experimental methodology presented in this paper may help researchers to identify, quantify and minimize different sources contributing into the overall needle placement error of an MRI-guided robotic system for prostate needle placement. In the robotic system analysed here, the overall error of the studied system remained within the acceptable range. Copyright © 2012 John Wiley & Sons, Ltd.
A real-time heat strain risk classifier using heart rate and skin temperature.

PubMed

Buller, Mark J; Latzka, William A; Yokota, Miyo; Tharion, William J; Moran, Daniel S

2008-12-01

Heat injury is a real concern to workers engaged in physically demanding tasks in high heat strain environments. Several real-time physiological monitoring systems exist that can provide indices of heat strain, e.g. physiological strain index (PSI), and provide alerts to medical personnel. However, these systems depend on core temperature measurement using expensive, ingestible thermometer pills. Seeking a better solution, we suggest the use of a model which can identify the probability that individuals are 'at risk' from heat injury using non-invasive measures. The intent is for the system to identify individuals who need monitoring more closely or who should apply heat strain mitigation strategies. We generated a model that can identify 'at risk' (PSI 7.5) workers from measures of heart rate and chest skin temperature. The model was built using data from six previously published exercise studies in which some subjects wore chemical protective equipment. The model has an overall classification error rate of 10% with one false negative error (2.7%), and outperforms an earlier model and a least squares regression model with classification errors of 21% and 14%, respectively. Additionally, the model allows the classification criteria to be adjusted based on the task and acceptable level of risk. We conclude that the model could be a valuable part of a multi-faceted heat strain management system.
Validation of a systems-actuarial computer process for multidimensional classification of child psychopathology.

PubMed

McDermott, P A; Hale, R L

1982-07-01

Tested diagnostic classifications of child psychopathology produced by a computerized technique known as multidimensional actuarial classification (MAC) against the criterion of expert psychological opinion. The MAC program applies series of statistical decision rules to assess the importance of and relationships among several dimensions of classification, i.e., intellectual functioning, academic achievement, adaptive behavior, and social and behavioral adjustment, to perform differential diagnosis of children's mental retardation, specific learning disabilities, behavioral and emotional disturbance, possible communication or perceptual-motor impairment, and academic under- and overachievement in reading and mathematics. Classifications rendered by MAC are compared to those offered by two expert child psychologists for cases of 73 children referred for psychological services. Experts' agreement with MAC was significant for all classification areas, as was MAC's agreement with the experts held as a conjoint reference standard. Whereas the experts' agreement with MAC averaged 86.0% above chance, their agreement with one another averaged 76.5% above chance. Implications of the findings are explored and potential advantages of the systems-actuarial approach are discussed.
Classification of right-hand grasp movement based on EMOTIV Epoc+

NASA Astrophysics Data System (ADS)

Tobing, T. A. M. L.; Prawito, Wijaya, S. K.

2017-07-01

Combinations of BCT elements for right-hand grasp movement have been obtained, providing the average value of their classification accuracy. The aim of this study is to find a suitable combination for best classification accuracy of right-hand grasp movement based on EEG headset, EMOTIV Epoc+. There are three movement classifications: grasping hand, relax, and opening hand. These classifications take advantage of Event-Related Desynchronization (ERD) phenomenon that makes it possible to differ relaxation, imagery, and movement state from each other. The combinations of elements are the usage of Independent Component Analysis (ICA), spectrum analysis by Fast Fourier Transform (FFT), maximum mu and beta power with their frequency as features, and also classifier Probabilistic Neural Network (PNN) and Radial Basis Function (RBF). The average values of classification accuracy are ± 83% for training and ± 57% for testing. To have a better understanding of the signal quality recorded by EMOTIV Epoc+, the result of classification accuracy of left or right-hand grasping movement EEG signal (provided by Physionet) also be given, i.e.± 85% for training and ± 70% for testing. The comparison of accuracy value from each combination, experiment condition, and external EEG data are provided for the purpose of value analysis of classification accuracy.
Sampling Errors of SSM/I and TRMM Rainfall Averages: Comparison with Error Estimates from Surface Data and a Sample Model

NASA Technical Reports Server (NTRS)

Bell, Thomas L.; Kundu, Prasun K.; Kummerow, Christian D.; Einaudi, Franco (Technical Monitor)

2000-01-01

Quantitative use of satellite-derived maps of monthly rainfall requires some measure of the accuracy of the satellite estimates. The rainfall estimate for a given map grid box is subject to both remote-sensing error and, in the case of low-orbiting satellites, sampling error due to the limited number of observations of the grid box provided by the satellite. A simple model of rain behavior predicts that Root-mean-square (RMS) random error in grid-box averages should depend in a simple way on the local average rain rate, and the predicted behavior has been seen in simulations using surface rain-gauge and radar data. This relationship was examined using satellite SSM/I data obtained over the western equatorial Pacific during TOGA COARE. RMS error inferred directly from SSM/I rainfall estimates was found to be larger than predicted from surface data, and to depend less on local rain rate than was predicted. Preliminary examination of TRMM microwave estimates shows better agreement with surface data. A simple method of estimating rms error in satellite rainfall estimates is suggested, based on quantities that can be directly computed from the satellite data.
EVALUATION OF THE REPRODUCIBILITY OF TWO TECHNIQUES USED TO DETERMINE AND RECORD CENTRIC RELATION IN ANGLE’S CLASS I PATIENTS

PubMed Central

Paixão, Fernanda; Silva, Wilkens Aurélio Buarque e; Silva, Frederico Andrade e; Ramos, Guilherme da Gama; Cruz, Mônica Vieira de Jesus

2007-01-01

The centric relation is a mandibular position that determines a balance relation among the temporomandibular joints, the chew muscles and the occlusion. This position makes possible to the dentist to plan and to execute oral rehabilitation respecting the physiological principles of the stomatognathic system. The aim of this study was to investigate the reproducibility of centric relation records obtained using two techniques: Dawson’s Bilateral Manipulation and Gysi’s Gothic Arch Tracing. Twenty volunteers (14 females and 6 males) with no dental loss, presenting occlusal contacts according to those described in Angle’s I classification and without signs and symptoms of temporomandibular disorders were selected. All volunteers were submitted five times with a 1-week interval, always in the same schedule, to the Dawson’s Bilateral Manipulation and to the Gysi’s Gothic Arch Tracing with aid of an intraoral apparatus. The average standard error of each technique was calculated (Bilateral Manipulation 0.94 and Gothic Arch Tracing 0.27). Shapiro-Wilk test was applied and the results allowed application of Student’s t-test (sampling error of 5%). The techniques showed different degrees of variability. The Gysi’s Gothic Arch Tracing was found to be more accurate than the Bilateral Manipulation in reproducing the centric relation records. PMID:19089144
Automated Segmentation Errors When Using Optical Coherence Tomography to Measure Retinal Nerve Fiber Layer Thickness in Glaucoma.

PubMed

Mansberger, Steven L; Menda, Shivali A; Fortune, Brad A; Gardiner, Stuart K; Demirel, Shaban

2017-02-01

To characterize the error of optical coherence tomography (OCT) measurements of retinal nerve fiber layer (RNFL) thickness when using automated retinal layer segmentation algorithms without manual refinement. Cross-sectional study. This study was set in a glaucoma clinical practice, and the dataset included 3490 scans from 412 eyes of 213 individuals with a diagnosis of glaucoma or glaucoma suspect. We used spectral domain OCT (Spectralis) to measure RNFL thickness in a 6-degree peripapillary circle, and exported the native "automated segmentation only" results. In addition, we exported the results after "manual refinement" to correct errors in the automated segmentation of the anterior (internal limiting membrane) and the posterior boundary of the RNFL. Our outcome measures included differences in RNFL thickness and glaucoma classification (i.e., normal, borderline, or outside normal limits) between scans with automated segmentation only and scans using manual refinement. Automated segmentation only resulted in a thinner global RNFL thickness (1.6 μm thinner, P < .001) when compared to manual refinement. When adjusted by operator, a multivariate model showed increased differences with decreasing RNFL thickness (P < .001), decreasing scan quality (P < .001), and increasing age (P < .03). Manual refinement changed 298 of 3486 (8.5%) of scans to a different global glaucoma classification, wherein 146 of 617 (23.7%) of borderline classifications became normal. Superior and inferior temporal clock hours had the largest differences. Automated segmentation without manual refinement resulted in reduced global RNFL thickness and overestimated the classification of glaucoma. Differences increased in eyes with a thinner RNFL thickness, older age, and decreased scan quality. Operators should inspect and manually refine OCT retinal layer segmentation when assessing RNFL thickness in the management of patients with glaucoma. Copyright © 2016 Elsevier Inc. All rights reserved.
Acquiring Research-grade ALSM Data in the Commercial Marketplace

NASA Astrophysics Data System (ADS)

Haugerud, R. A.; Harding, D. J.; Latypov, D.; Martinez, D.; Routh, S.; Ziegler, J.

2003-12-01

The Puget Sound Lidar Consortium, working with TerraPoint, LLC, has procured a large volume of ALSM (topographic lidar) data for scientific research. Research-grade ALSM data can be characterized by their completeness, density, and accuracy. Complete data include-at a minimum-X, Y, Z, time, and classification (ground, vegetation, structure, blunder) for each laser reflection. Off-nadir angle and return number for multiple returns are also useful. We began with a pulse density of 1/sq m, and after limited experiments still find this density satisfactory in the dense second-growth forests of western Washington. Lower pulse densities would have produced unacceptably limited sampling in forested areas and aliased some topographic features. Higher pulse densities do not produce markedly better topographic models, in part because of limitations of reproducibility between the overlapping survey swaths used to achieve higher density. Our experience in a variety of forest types demonstrates that the fraction of pulses that produce ground returns varies with vegetation cover, laser beam divergence, laser power, and detector sensitivity, but have not quantified this relationship. The most significant operational limits on vertical accuracy of ALSM appear to be instrument calibration and the accuracy with which returns are classified as ground or vegetation. TerraPoint has recently implemented in-situ calibration using overlapping swaths (Latypov and Zosse, 2002, see http://www.terrapoint.com/News_damirACSM_ASPRS2002.html). On the consumer side, we routinely perform a similar overlap analysis to produce maps of relative Z error between swaths; we find that in bare, low-slope regions the in-situ calibration has reduced this internal Z error to 6-10 cm RMSE. Comparison with independent ground control points commonly illuminates inconsistencies in how GPS heights have been reduced to orthometric heights. Once these inconsistencies are resolved, it appears that the internal errors are the bulk of the error of the survey. The error maps suggest that with in-situ calibration, minor time-varying errors with a period of circa 1 sec are the largest remaining source of survey error. For forested terrain, limited ground penetration and errors in return classification can severely limit the accuracy of resulting topographic models. Initial work by Haugerud and Harding demonstrated the feasibility of fully-automatic return classification; however, TerraPoint has found that better results can be obtained more effectively with 3rd-party classification software that allows a mix of automated routines and human intervention. Our relationship has been evolving since early 2000. Important aspects of this relationship include close communication between data producer and consumer, a willingness to learn from each other, significant technical expertise and resources on the consumer side, and continued refinement of achievable, quantitative performance and accuracy specifications. Most recently we have instituted a slope-dependent Z accuracy specification that TerraPoint first developed as a heuristic for surveying mountainous terrain in Switzerland. We are now working on quantifying the internal consistency of topographic models in forested areas, using a variant of overlap analysis, and standards for the spatial distribution of internal errors.

Brain fingerprinting classification concealed information test detects US Navy military medical information with P300

PubMed Central

Farwell, Lawrence A.; Richardson, Drew C.; Richardson, Graham M.; Furedy, John J.

2014-01-01

A classification concealed information test (CIT) used the “brain fingerprinting” method of applying P300 event-related potential (ERP) in detecting information that is (1) acquired in real life and (2) unique to US Navy experts in military medicine. Military medicine experts and non-experts were asked to push buttons in response to three types of text stimuli. Targets contain known information relevant to military medicine, are identified to subjects as relevant, and require pushing one button. Subjects are told to push another button to all other stimuli. Probes contain concealed information relevant to military medicine, and are not identified to subjects. Irrelevants contain equally plausible, but incorrect/irrelevant information. Error rate was 0%. Median and mean statistical confidences for individual determinations were 99.9% with no indeterminates (results lacking sufficiently high statistical confidence to be classified). We compared error rate and statistical confidence for determinations of both information present and information absent produced by classification CIT (Is a probe ERP more similar to a target or to an irrelevant ERP?) vs. comparison CIT (Does a probe produce a larger ERP than an irrelevant?) using P300 plus the late negative component (LNP; together, P300-MERMER). Comparison CIT produced a significantly higher error rate (20%) and lower statistical confidences: mean 67%; information-absent mean was 28.9%, less than chance (50%). We compared analysis using P300 alone with the P300 + LNP. P300 alone produced the same 0% error rate but significantly lower statistical confidences. These findings add to the evidence that the brain fingerprinting methods as described here provide sufficient conditions to produce less than 1% error rate and greater than 95% median statistical confidence in a CIT on information obtained in the course of real life that is characteristic of individuals with specific training, expertise, or organizational affiliation. PMID:25565941
Temporal variability in urinary levels of drinking water disinfection byproducts dichloroacetic acid and trichloroacetic acid among men

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Yi-Xin; Zeng, Qiang; Wang, Le

Urinary haloacetic acids (HAAs), such as dichloroacetic acid (DCAA) and trichloroacetic acid (TCAA), have been suggested as potential biomarkers of exposure to drinking water disinfection byproducts (DBPs). However, variable exposure to and the short elimination half-lives of these biomarkers can result in considerable variability in urinary measurements, leading to exposure misclassification. Here we examined the variability of DCAA and TCAA levels in the urine among eleven men who provided urine samples on 8 days over 3 months. The urinary concentrations of DCAA and TCAA were measured by gas chromatography coupled with electron capture detection. We calculated the intraclass correlation coefficientsmore » (ICCs) to characterize the within-person and between-person variances and computed the sensitivity and specificity to assess how well single or multiple urine collections accurately determined personal 3-month average DCAA and TCAA levels. The within-person variance was much higher than the between-person variance for all three sample types (spot, first morning, and 24-h urine samples) for DCAA (ICC=0.08–0.37) and TCAA (ICC=0.09–0.23), regardless of the sampling interval. A single-spot urinary sample predicted high (top 33%) 3-month average DCAA and TCAA levels with high specificity (0.79 and 0.78, respectively) but relatively low sensitivity (0.47 and 0.50, respectively). Collecting two or three urine samples from each participant improved the classification. The poor reproducibility of the measured urinary DCAA and TCAA concentrations indicate that a single measurement may not accurately reflect individual long-term exposure. Collection of multiple urine samples from one person is an option for reducing exposure classification errors in studies exploring the effects of DBP exposure on reproductive health. - Highlights: • We evaluated the variability of DCAA and TCAA levels in the urine among men. • Urinary DCAA and TCAA levels varied greatly over a 3-month period. • Single measurement may not accurately reflect personal long-term exposure levels. • Collecting multiple samples from one person improved the exposure classification.« less
Sampling errors for satellite-derived tropical rainfall - Monte Carlo study using a space-time stochastic model

NASA Technical Reports Server (NTRS)

Bell, Thomas L.; Abdullah, A.; Martin, Russell L.; North, Gerald R.

1990-01-01

Estimates of monthly average rainfall based on satellite observations from a low earth orbit will differ from the true monthly average because the satellite observes a given area only intermittently. This sampling error inherent in satellite monitoring of rainfall would occur even if the satellite instruments could measure rainfall perfectly. The size of this error is estimated for a satellite system being studied at NASA, the Tropical Rainfall Measuring Mission (TRMM). First, the statistical description of rainfall on scales from 1 to 1000 km is examined in detail, based on rainfall data from the Global Atmospheric Research Project Atlantic Tropical Experiment (GATE). A TRMM-like satellite is flown over a two-dimensional time-evolving simulation of rainfall using a stochastic model with statistics tuned to agree with GATE statistics. The distribution of sampling errors found from many months of simulated observations is found to be nearly normal, even though the distribution of area-averaged rainfall is far from normal. For a range of orbits likely to be employed in TRMM, sampling error is found to be less than 10 percent of the mean for rainfall averaged over a 500 x 500 sq km area.
Morbidity Assessment in Surgery: Refinement Proposal Based on a Concept of Perioperative Adverse Events

PubMed Central

Kazaryan, Airazat M.; Røsok, Bård I.; Edwin, Bjørn

2013-01-01

Background. Morbidity is a cornerstone assessing surgical treatment; nevertheless surgeons have not reached extensive consensus on this problem. Methods and Findings. Clavien, Dindo, and Strasberg with coauthors (1992, 2004, 2009, and 2010) made significant efforts to the standardization of surgical morbidity (Clavien-Dindo-Strasberg classification, last revision, the Accordion classification). However, this classification includes only postoperative complications and has two principal shortcomings: disregard of intraoperative events and confusing terminology. Postoperative events have a major impact on patient well-being. However, intraoperative events should also be recorded and reported even if they do not evidently affect the patient's postoperative well-being. The term surgical complication applied in the Clavien-Dindo-Strasberg classification may be regarded as an incident resulting in a complication caused by technical failure of surgery, in contrast to the so-called medical complications. Therefore, the term surgical complication contributes to misinterpretation of perioperative morbidity. The term perioperative adverse events comprising both intraoperative unfavourable incidents and postoperative complications could be regarded as better alternative. In 2005, Satava suggested a simple grading to evaluate intraoperative surgical errors. Based on that approach, we have elaborated a 3-grade classification of intraoperative incidents so that it can be used to grade intraoperative events of any type of surgery. Refinements have been made to the Accordion classification of postoperative complications. Interpretation. The proposed systematization of perioperative adverse events utilizing the combined application of two appraisal tools, that is, the elaborated classification of intraoperative incidents on the basis of the Satava approach to surgical error evaluation together with the modified Accordion classification of postoperative complication, appears to be an effective tool for comprehensive assessment of surgical outcomes. This concept was validated in regard to various surgical procedures. Broad implementation of this approach will promote the development of surgical science and practice. PMID:23762627
Entropy-based gene ranking without selection bias for the predictive classification of microarray data.

PubMed

Furlanello, Cesare; Serafini, Maria; Merler, Stefano; Jurman, Giuseppe

2003-11-06

We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
DMSP SSJ4 Data Restoration, Classification, and On-Line Data Access

NASA Technical Reports Server (NTRS)

Wing, Simon; Bredekamp, Joseph H. (Technical Monitor)

2000-01-01

Compress and clean raw data file for permanent storage We have identified various error conditions/types and developed algorithms to get rid of these errors/noises, including the more complicated noise in the newer data sets. (status = 100% complete). Internet access of compacted raw data. It is now possible to access the raw data via our web site, http://www.jhuapl.edu/Aurora/index.html. The software to read and plot the compacted raw data is also available from the same web site. The users can now download the raw data, read, plot, or manipulate the data as they wish on their own computer. The users are able to access the cleaned data sets. Internet access of the color spectrograms. This task has also been completed. It is now possible to access the spectrograms from the web site mentioned above. Improve the particle precipitation region classification. The algorithm for doing this task has been developed and implemented. As a result, the accuracies improved. Now the web site routinely distributes the results of applying the new algorithm to the cleaned data set. Mark the classification region on the spectrograms. The software to mark the classification region in the spectrograms has been completed. This is also available from our web site.
Malingering in Toxic Exposure. Classification Accuracy of Reliable Digit Span and WAIS-III Digit Span Scaled Scores

ERIC Educational Resources Information Center

Greve, Kevin W.; Springer, Steven; Bianchini, Kevin J.; Black, F. William; Heinly, Matthew T.; Love, Jeffrey M.; Swift, Douglas A.; Ciota, Megan A.

2007-01-01

This study examined the sensitivity and false-positive error rate of reliable digit span (RDS) and the WAIS-III Digit Span (DS) scaled score in persons alleging toxic exposure and determined whether error rates differed from published rates in traumatic brain injury (TBI) and chronic pain (CP). Data were obtained from the files of 123 persons…
Authorship Discovery in Blogs Using Bayesian Classification with Corrective Scaling

DTIC Science & Technology

2008-06-01

4 2.3 W. Fucks ’ Diagram of n-Syllable Word Frequencies . . . . . . . . . . . . . . 5 3.1 Confusion Matrix for All Test Documents of 500...of the books which scholars believed he had. • Wilhelm Fucks discriminated between authors using the average number of syllables per word and average...distance between equal-syllabled words [8]. Fucks , too, concluded that a study such as his reveals a “possibility of a quantitative classification
Position Between Trunk and Pelvis During Gait Depending on the Gross Motor Function Classification System.

PubMed

Sanz-Mengibar, Jose Manuel; Altschuck, Natalie; Sanchez-de-Muniain, Paloma; Bauer, Christian; Santonja-Medina, Fernando

2017-04-01

To understand whether there is a trunk postural control threshold in the sagittal plane for the transition between the Gross Motor Function Classification System (GMFCS) levels measured with 3-dimensional gait analysis. Kinematics from 97 children with spastic bilateral cerebral palsy from spine angles according to Plug-In Gait model (Vicon) were plotted relative to their GMFCS level. Only average and minimum values of the lumbar spine segment correlated with GMFCS levels. Maximal values at loading response correlated independently with age at all functional levels. Average and minimum values were significant when analyzing age in combination with GMFCS level. There are specific postural control patterns in the average and minimum values for the position between trunk and pelvis in the sagittal plane during gait, for the transition among GMFCS I-III levels. Higher classifications of gross motor skills correlate with more extended spine angles.
Software errors and complexity: An empirical investigation

NASA Technical Reports Server (NTRS)

Basili, Victor R.; Perricone, Berry T.

1983-01-01

The distributions and relationships derived from the change data collected during the development of a medium scale satellite software project show that meaningful results can be obtained which allow an insight into software traits and the environment in which it is developed. Modified and new modules were shown to behave similarly. An abstract classification scheme for errors which allows a better understanding of the overall traits of a software project is also shown. Finally, various size and complexity metrics are examined with respect to errors detected within the software yielding some interesting results.
Software errors and complexity: An empirical investigation

NASA Technical Reports Server (NTRS)

Basili, V. R.; Perricone, B. T.

1982-01-01

The distributions and relationships derived from the change data collected during the development of a medium scale satellite software project show that meaningful results can be obtained which allow an insight into software traits and the environment in which it is developed. Modified and new modules were shown to behave similarly. An abstract classification scheme for errors which allows a better understanding of the overall traits of a software project is also shown. Finally, various size and complexity metrics are examined with respect to errors detected within the software yielding some interesting results.
Video compression of coronary angiograms based on discrete wavelet transform with block classification.

PubMed

Ho, B T; Tsai, M J; Wei, J; Ma, M; Saipetch, P

1996-01-01

A new method of video compression for angiographic images has been developed to achieve high compression ratio (~20:1) while eliminating block artifacts which leads to loss of diagnostic accuracy. This method adopts motion picture experts group's (MPEGs) motion compensated prediction to takes advantage of frame to frame correlation. However, in contrast to MPEG, the error images arising from mismatches in the motion estimation are encoded by discrete wavelet transform (DWT) rather than block discrete cosine transform (DCT). Furthermore, the authors developed a classification scheme which label each block in an image as intra, error, or background type and encode it accordingly. This hybrid coding can significantly improve the compression efficiency in certain eases. This method can be generalized for any dynamic image sequences applications sensitive to block artifacts.
Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure

PubMed Central

Berisha, Visar; Wisler, Alan; Hero, Alfred O.; Spanias, Andreas

2015-01-01

Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between training and test distributions. We confirm the theoretical results by designing feature selection algorithms using the criteria from these bounds and by evaluating the algorithms on a series of pathological speech classification tasks. PMID:26807014
Comparisons of neural networks to standard techniques for image classification and correlation

NASA Technical Reports Server (NTRS)

Paola, Justin D.; Schowengerdt, Robert A.

1994-01-01

Neural network techniques for multispectral image classification and spatial pattern detection are compared to the standard techniques of maximum-likelihood classification and spatial correlation. The neural network produced a more accurate classification than maximum-likelihood of a Landsat scene of Tucson, Arizona. Some of the errors in the maximum-likelihood classification are illustrated using decision region and class probability density plots. As expected, the main drawback to the neural network method is the long time required for the training stage. The network was trained using several different hidden layer sizes to optimize both the classification accuracy and training speed, and it was found that one node per class was optimal. The performance improved when 3x3 local windows of image data were entered into the net. This modification introduces texture into the classification without explicit calculation of a texture measure. Larger windows were successfully used for the detection of spatial features in Landsat and Magellan synthetic aperture radar imagery.
Study on Classification Accuracy Inspection of Land Cover Data Aided by Automatic Image Change Detection Technology

NASA Astrophysics Data System (ADS)

Xie, W.-J.; Zhang, L.; Chen, H.-P.; Zhou, J.; Mao, W.-J.

2018-04-01

The purpose of carrying out national geographic conditions monitoring is to obtain information of surface changes caused by human social and economic activities, so that the geographic information can be used to offer better services for the government, enterprise and public. Land cover data contains detailed geographic conditions information, thus has been listed as one of the important achievements in the national geographic conditions monitoring project. At present, the main issue of the production of the land cover data is about how to improve the classification accuracy. For the land cover data quality inspection and acceptance, classification accuracy is also an important check point. So far, the classification accuracy inspection is mainly based on human-computer interaction or manual inspection in the project, which are time consuming and laborious. By harnessing the automatic high-resolution remote sensing image change detection technology based on the ERDAS IMAGINE platform, this paper carried out the classification accuracy inspection test of land cover data in the project, and presented a corresponding technical route, which includes data pre-processing, change detection, result output and information extraction. The result of the quality inspection test shows the effectiveness of the technical route, which can meet the inspection needs for the two typical errors, that is, missing and incorrect update error, and effectively reduces the work intensity of human-computer interaction inspection for quality inspectors, and also provides a technical reference for the data production and quality control of the land cover data.
On Time/Space Aggregation of Fine-Scale Error Estimates (Invited)

NASA Astrophysics Data System (ADS)

Huffman, G. J.

2013-12-01

Estimating errors inherent in fine time/space-scale satellite precipitation data sets is still an on-going problem and a key area of active research. Complicating features of these data sets include the intrinsic intermittency of the precipitation in space and time and the resulting highly skewed distribution of precipitation rates. Additional issues arise from the subsampling errors that satellites introduce, the errors due to retrieval algorithms, and the correlated error that retrieval and merger algorithms sometimes introduce. Several interesting approaches have been developed recently that appear to make progress on these long-standing issues. At the same time, the monthly averages over 2.5°x2.5° grid boxes in the Global Precipitation Climatology Project (GPCP) Satellite-Gauge (SG) precipitation data set follow a very simple sampling-based error model (Huffman 1997) with coefficients that are set using coincident surface and GPCP SG data. This presentation outlines the unsolved problem of how to aggregate the fine-scale errors (discussed above) to an arbitrary time/space averaging volume for practical use in applications, reducing in the limit to simple Gaussian expressions at the monthly 2.5°x2.5° scale. Scatter diagrams with different time/space averaging show that the relationship between the satellite and validation data improves due to the reduction in random error. One of the key, and highly non-linear, issues is that fine-scale estimates tend to have large numbers of cases with points near the axes on the scatter diagram (one of the values is exactly or nearly zero, while the other value is higher). Averaging 'pulls' the points away from the axes and towards the 1:1 line, which usually happens for higher precipitation rates before lower rates. Given this qualitative observation of how aggregation affects error, we observe that existing aggregation rules, such as the Steiner et al. (2003) power law, only depend on the aggregated precipitation rate. Is this sufficient, or is it necessary to aggregate the precipitation error estimates across the time/space data cube used for averaging? At least for small time/space data cubes it would seem that the detailed variables that affect each precipitation error estimate in the aggregation, such as sensor type, land/ocean surface type, convective/stratiform type, and so on, drive variations that must be accounted for explicitly.
Improved EEG Event Classification Using Differential Energy.

PubMed

Harati, A; Golmohammadi, M; Lopez, S; Obeid, I; Picone, J

2015-12-01

Feature extraction for automatic classification of EEG signals typically relies on time frequency representations of the signal. Techniques such as cepstral-based filter banks or wavelets are popular analysis techniques in many signal processing applications including EEG classification. In this paper, we present a comparison of a variety of approaches to estimating and postprocessing features. To further aid in discrimination of periodic signals from aperiodic signals, we add a differential energy term. We evaluate our approaches on the TUH EEG Corpus, which is the largest publicly available EEG corpus and an exceedingly challenging task due to the clinical nature of the data. We demonstrate that a variant of a standard filter bank-based approach, coupled with first and second derivatives, provides a substantial reduction in the overall error rate. The combination of differential energy and derivatives produces a 24 % absolute reduction in the error rate and improves our ability to discriminate between signal events and background noise. This relatively simple approach proves to be comparable to other popular feature extraction approaches such as wavelets, but is much more computationally efficient.
Comparison of remote sensing image processing techniques to identify tornado damage areas from Landsat TM data

USGS Publications Warehouse

Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.

2008-01-01

Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and objectoriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. ?? 2008 by MDPI.
Benchmark data on the separability among crops in the southern San Joaquin Valley of California

NASA Technical Reports Server (NTRS)

Morse, A.; Card, D. H.

1984-01-01

Landsat MSS data were input to a discriminant analysis of 21 crops on each of eight dates in 1979 using a total of 4,142 fields in southern Fresno County, California. The 21 crops, which together account for over 70 percent of the agricultural acreage in the southern San Joaquin Valley, were analyzed to quantify the spectral separability, defined as omission error, between all pairs of crops. On each date the fields were segregated into six groups based on the mean value of the MSS7/MSS5 ratio, which is correlated with green biomass. Discriminant analysis was run on each group on each date. The resulting contingency tables offer information that can be profitably used in conjunction with crop calendars to pick the best dates for a classification. The tables show expected percent correct classification and error rates for all the crops. The patterns in the contingency tables show that the percent correct classification for crops generally increases with the amount of greenness in the fields being classified. However, there are exceptions to this general rule, notably grain.
Comparison of Remote Sensing Image Processing Techniques to Identify Tornado Damage Areas from Landsat TM Data

PubMed Central

Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.

2008-01-01

Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. PMID:27879757

Multicategory nets of single-layer perceptrons: complexity and sample-size issues.

PubMed

Raudys, Sarunas; Kybartas, Rimantas; Zavadskas, Edmundas Kazimieras

2010-05-01

The standard cost function of multicategory single-layer perceptrons (SLPs) does not minimize the classification error rate. In order to reduce classification error, it is necessary to: 1) refuse the traditional cost function, 2) obtain near to optimal pairwise linear classifiers by specially organized SLP training and optimal stopping, and 3) fuse their decisions properly. To obtain better classification in unbalanced training set situations, we introduce the unbalance correcting term. It was found that fusion based on the Kulback-Leibler (K-L) distance and the Wu-Lin-Weng (WLW) method result in approximately the same performance in situations where sample sizes are relatively small. The explanation for this observation is by theoretically known verity that an excessive minimization of inexact criteria becomes harmful at times. Comprehensive comparative investigations of six real-world pattern recognition (PR) problems demonstrated that employment of SLP-based pairwise classifiers is comparable and as often as not outperforming the linear support vector (SV) classifiers in moderate dimensional situations. The colored noise injection used to design pseudovalidation sets proves to be a powerful tool for facilitating finite sample problems in moderate-dimensional PR tasks.
Remote sensing of submerged aquatic vegetation in lower Chesapeake Bay - A comparison of Landsat MSS to TM imagery

NASA Technical Reports Server (NTRS)

Ackleson, S. G.; Klemas, V.

1987-01-01

Landsat MSS and TM imagery, obtained simultaneously over Guinea Marsh, VA, as analyzed and compares for its ability to detect submerged aquatic vegetation (SAV). An unsupervised clustering algorithm was applied to each image, where the input classification parameters are defined as functions of apparent sensor noise. Class confidence and accuracy were computed for all water areas by comparing the classified images, pixel-by-pixel, to rasterized SAV distributions derived from color aerial photography. To illustrate the effect of water depth on classification error, areas of depth greater than 1.9 m were masked, and class confidence and accuracy recalculated. A single-scattering radiative-transfer model is used to illustrate how percent canopy cover and water depth affect the volume reflectance from a water column containing SAV. For a submerged canopy that is morphologically and optically similar to Zostera marina inhabiting Lower Chesapeake Bay, dense canopies may be isolated by masking optically deep water. For less dense canopies, the effect of increasing water depth is to increase the apparent percent crown cover, which may result in classification error.
Structural Validation of a French Food Frequency Questionnaire of 94 Items.

PubMed

Gazan, Rozenn; Vieux, Florent; Darmon, Nicole; Maillot, Matthieu

2017-01-01

Food frequency questionnaires (FFQs) are used to estimate the usual food and nutrient intakes over a period of time. Such estimates can suffer from measurement errors, either due to bias induced by respondent's answers or to errors induced by the structure of the questionnaire (e.g., using a limited number of food items and an aggregated food database with average portion sizes). The "structural validation" presented in this study aims to isolate and quantify the impact of the inherent structure of a FFQ on the estimation of food and nutrient intakes, independently of respondent's perception of the questionnaire. A semi-quantitative FFQ ( n = 94 items, including 50 items with questions on portion sizes) and an associated aggregated food composition database (named the item-composition database) were developed, based on the self-reported weekly dietary records of 1918 adults (18-79 years-old) in the French Individual and National Dietary Survey 2 (INCA2), and the French CIQUAL 2013 food-composition database of all the foods ( n = 1342 foods) declared as consumed in the population. Reference intakes of foods ("REF_FOOD") and nutrients ("REF_NUT") were calculated for each adult using the food-composition database and the amounts of foods self-reported in his/her dietary record. Then, answers to the FFQ were simulated for each adult based on his/her self-reported dietary record. "FFQ_FOOD" and "FFQ_NUT" intakes were estimated using the simulated answers and the item-composition database. Measurement errors (in %), spearman correlations and cross-classification were used to compare "REF_FOOD" with "FFQ_FOOD" and "REF_NUT" with "FFQ_NUT". Compared to "REF_NUT," "FFQ_NUT" total quantity and total energy intake were underestimated on average by 198 g/day and 666 kJ/day, respectively. "FFQ_FOOD" intakes were well estimated for starches, underestimated for most of the subgroups, and overestimated for some subgroups, in particular vegetables. Underestimation were mainly due to the use of portion sizes, leading to an underestimation of most of nutrients, except free sugars which were overestimated. The "structural validation" by simulating answers to a FFQ based on a reference dietary survey is innovative and pragmatic and allows quantifying the error induced by the simplification of the method of collection.
Development and validation of Aviation Causal Contributors for Error Reporting Systems (ACCERS).

PubMed

Baker, David P; Krokos, Kelley J

2007-04-01

This investigation sought to develop a reliable and valid classification system for identifying and classifying the underlying causes of pilot errors reported under the Aviation Safety Action Program (ASAP). ASAP is a voluntary safety program that air carriers may establish to study pilot and crew performance on the line. In ASAP programs, similar to the Aviation Safety Reporting System, pilots self-report incidents by filing a short text description of the event. The identification of contributors to errors is critical if organizations are to improve human performance, yet it is difficult for analysts to extract this information from text narratives. A taxonomy was needed that could be used by pilots to classify the causes of errors. After completing a thorough literature review, pilot interviews and a card-sorting task were conducted in Studies 1 and 2 to develop the initial structure of the Aviation Causal Contributors for Event Reporting Systems (ACCERS) taxonomy. The reliability and utility of ACCERS was then tested in studies 3a and 3b by having pilots independently classify the primary and secondary causes of ASAP reports. The results provided initial evidence for the internal and external validity of ACCERS. Pilots were found to demonstrate adequate levels of agreement with respect to their category classifications. ACCERS appears to be a useful system for studying human error captured under pilot ASAP reports. Future work should focus on how ACCERS is organized and whether it can be used or modified to classify human error in ASAP programs for other aviation-related job categories such as dispatchers. Potential applications of this research include systems in which individuals self-report errors and that attempt to extract and classify the causes of those events.
Automated spectral classification and the GAIA project

NASA Technical Reports Server (NTRS)

Lasala, Jerry; Kurtz, Michael J.

1995-01-01

Two dimensional spectral types for each of the stars observed in the global astrometric interferometer for astrophysics (GAIA) mission would provide additional information for the galactic structure and stellar evolution studies, as well as helping in the identification of unusual objects and populations. The classification of the large quantity generated spectra requires that automated techniques are implemented. Approaches for the automatic classification are reviewed, and a metric-distance method is discussed. In tests, the metric-distance method produced spectral types with mean errors comparable to those of human classifiers working at similar resolution. Data and equipment requirements for an automated classification survey, are discussed. A program of auxiliary observations is proposed to yield spectral types and radial velocities for the GAIA-observed stars.
Speaker normalization and adaptation using second-order connectionist networks.

PubMed

Watrous, R L

1993-01-01

A method for speaker normalization and adaption using connectionist networks is developed. A speaker-specific linear transformation of observations of the speech signal is computed using second-order network units. Classification is accomplished by a multilayer feedforward network that operates on the normalized speech data. The network is adapted for a new talker by modifying the transformation parameters while leaving the classifier fixed. This is accomplished by backpropagating classification error through the classifier to the second-order transformation units. This method was evaluated for the classification of ten vowels for 76 speakers using the first two formant values of the Peterson-Barney data. The results suggest that rapid speaker adaptation resulting in high classification accuracy can be accomplished by this method.
Errors in imaging patients in the emergency setting

PubMed Central

Reginelli, Alfonso; Lo Re, Giuseppe; Midiri, Federico; Muzj, Carlo; Romano, Luigia; Brunese, Luca

2016-01-01

Emergency and trauma care produces a “perfect storm” for radiological errors: uncooperative patients, inadequate histories, time-critical decisions, concurrent tasks and often junior personnel working after hours in busy emergency departments. The main cause of diagnostic errors in the emergency department is the failure to correctly interpret radiographs, and the majority of diagnoses missed on radiographs are fractures. Missed diagnoses potentially have important consequences for patients, clinicians and radiologists. Radiologists play a pivotal role in the diagnostic assessment of polytrauma patients and of patients with non-traumatic craniothoracoabdominal emergencies, and key elements to reduce errors in the emergency setting are knowledge, experience and the correct application of imaging protocols. This article aims to highlight the definition and classification of errors in radiology, the causes of errors in emergency radiology and the spectrum of diagnostic errors in radiography, ultrasonography and CT in the emergency setting. PMID:26838955
Errors in imaging patients in the emergency setting.

PubMed

Pinto, Antonio; Reginelli, Alfonso; Pinto, Fabio; Lo Re, Giuseppe; Midiri, Federico; Muzj, Carlo; Romano, Luigia; Brunese, Luca

2016-01-01

Emergency and trauma care produces a "perfect storm" for radiological errors: uncooperative patients, inadequate histories, time-critical decisions, concurrent tasks and often junior personnel working after hours in busy emergency departments. The main cause of diagnostic errors in the emergency department is the failure to correctly interpret radiographs, and the majority of diagnoses missed on radiographs are fractures. Missed diagnoses potentially have important consequences for patients, clinicians and radiologists. Radiologists play a pivotal role in the diagnostic assessment of polytrauma patients and of patients with non-traumatic craniothoracoabdominal emergencies, and key elements to reduce errors in the emergency setting are knowledge, experience and the correct application of imaging protocols. This article aims to highlight the definition and classification of errors in radiology, the causes of errors in emergency radiology and the spectrum of diagnostic errors in radiography, ultrasonography and CT in the emergency setting.
Objective automated quantification of fluorescence signal in histological sections of rat lens.

PubMed

Talebizadeh, Nooshin; Hagström, Nanna Zhou; Yu, Zhaohua; Kronschläger, Martin; Söderberg, Per; Wählby, Carolina

2017-08-01

Visual quantification and classification of fluorescent signals is the gold standard in microscopy. The purpose of this study was to develop an automated method to delineate cells and to quantify expression of fluorescent signal of biomarkers in each nucleus and cytoplasm of lens epithelial cells in a histological section. A region of interest representing the lens epithelium was manually demarcated in each input image. Thereafter, individual cell nuclei within the region of interest were automatically delineated based on watershed segmentation and thresholding with an algorithm developed in Matlab™. Fluorescence signal was quantified within nuclei, cytoplasms and juxtaposed backgrounds. The classification of cells as labelled or not labelled was based on comparison of the fluorescence signal within cells with local background. The classification rule was thereafter optimized as compared with visual classification of a limited dataset. The performance of the automated classification was evaluated by asking 11 independent blinded observers to classify all cells (n = 395) in one lens image. Time consumed by the automatic algorithm and visual classification of cells was recorded. On an average, 77% of the cells were correctly classified as compared with the majority vote of the visual observers. The average agreement among visual observers was 83%. However, variation among visual observers was high, and agreement between two visual observers was as low as 71% in the worst case. Automated classification was on average 10 times faster than visual scoring. The presented method enables objective and fast detection of lens epithelial cells and quantification of expression of fluorescent signal with an accuracy comparable with the variability among visual observers. © 2017 International Society for Advancement of Cytometry. © 2017 International Society for Advancement of Cytometry.
Average capacity of the ground to train communication link of a curved track in the turbulence of gamma-gamma distribution

NASA Astrophysics Data System (ADS)

Yang, Yanqiu; Yu, Lin; Zhang, Yixin

2017-04-01

A model of the average capacity of optical wireless communication link with pointing errors for the ground-to-train of the curved track is established based on the non-Kolmogorov. By adopting the gamma-gamma distribution model, we derive the average capacity expression for this channel. The numerical analysis reveals that heavier fog reduces the average capacity of link. The strength of atmospheric turbulence, the variance of pointing errors, and the covered track length need to be reduced for the larger average capacity of link. The normalized beamwidth and the average signal-to-noise ratio (SNR) of the turbulence-free link need to be increased. We can increase the transmit aperture to expand the beamwidth and enhance the signal intensity, thereby decreasing the impact of the beam wander accordingly. As the system adopting the automatic tracking of beam at the receiver positioned on the roof of the train, for eliminating the pointing errors caused by beam wander and train vibration, the equivalent average capacity of the channel will achieve a maximum value. The impact of the non-Kolmogorov spectral index's variation on the average capacity of link can be ignored.
Methods for Tier 1 Modeling within the Training Range Environmental Evaluation and Characterization System

DTIC Science & Technology

2009-08-01

properties, part b. USLE K-Factor by Organic Matter Content Soil -Texture Classification Dry Bulk Density, g/cm3 Field Capacity, % Available...Universal Soil Loss Equation ( USLE ) can be used to estimate annual average sheet and rill erosion, A (tons/acre-yr), from the equation A R K L S...erodibility factors, K, for various soil classifications and percent organic matter content ( USLE Fact Sheet 2008). Textural Class Average Less than 2
Automatic evidence quality prediction to support evidence-based decision making.

PubMed

Sarker, Abeed; Mollá, Diego; Paris, Cécile

2015-06-01

Evidence-based medicine practice requires practitioners to obtain the best available medical evidence, and appraise the quality of the evidence when making clinical decisions. Primarily due to the plethora of electronically available data from the medical literature, the manual appraisal of the quality of evidence is a time-consuming process. We present a fully automatic approach for predicting the quality of medical evidence in order to aid practitioners at point-of-care. Our approach extracts relevant information from medical article abstracts and utilises data from a specialised corpus to apply supervised machine learning for the prediction of the quality grades. Following an in-depth analysis of the usefulness of features (e.g., publication types of articles), they are extracted from the text via rule-based approaches and from the meta-data associated with the articles, and then applied in the supervised classification model. We propose the use of a highly scalable and portable approach using a sequence of high precision classifiers, and introduce a simple evaluation metric called average error distance (AED) that simplifies the comparison of systems. We also perform elaborate human evaluations to compare the performance of our system against human judgments. We test and evaluate our approaches on a publicly available, specialised, annotated corpus containing 1132 evidence-based recommendations. Our rule-based approach performs exceptionally well at the automatic extraction of publication types of articles, with F-scores of up to 0.99 for high-quality publication types. For evidence quality classification, our approach obtains an accuracy of 63.84% and an AED of 0.271. The human evaluations show that the performance of our system, in terms of AED and accuracy, is comparable to the performance of humans on the same data. The experiments suggest that our structured text classification framework achieves evaluation results comparable to those of human performance. Our overall classification approach and evaluation technique are also highly portable and can be used for various evidence grading scales. Copyright © 2015 Elsevier B.V. All rights reserved.
Reliability, Validity, and Classification Accuracy of the DSM-5 Diagnostic Criteria for Gambling Disorder and Comparison to DSM-IV.

PubMed

Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C

2016-09-01

The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was to twofold. First, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD. Second, to compare the DSM-5-DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
The presence of English and Spanish dyslexia in the Web

NASA Astrophysics Data System (ADS)

Rello, Luz; Baeza-Yates, Ricardo

2012-09-01

In this study we present a lower bound of the prevalence of dyslexia in the Web for English and Spanish. On the basis of analysis of corpora written by dyslexic people, we propose a classification of the different kinds of dyslexic errors. A representative data set of dyslexic words is used to calculate this lower bound in web pages containing English and Spanish dyslexic errors. We also present an analysis of dyslexic errors in major Internet domains, social media sites, and throughout English- and Spanish-speaking countries. To show the independence of our estimations from the presence of other kinds of errors, we compare them with the overall lexical quality of the Web and with the error rate of noncorrected corpora. The presence of dyslexic errors in the Web motivates work in web accessibility for dyslexic users.
Nineteen hundred seventy three significant accomplishments. [Landsat satellite data applications

NASA Technical Reports Server (NTRS)

1974-01-01

Data collected by the Skylab remote sensing satellites was used to develop applications techniques and to combine automatic data classification with statistical clustering methods. Continuing research was concentrated in the correlation and registration of data products and in the definition of the atmospheric effects on remote sensing. The causes of errors encountered in the automated classification of agricultural data are identified. Other applications in forestry, geography, environmental geology, and land use are discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Gissi, Andrea; Dipartimento di Farmacia – Scienze del Farmaco, Università degli Studi di Bari “Aldo Moro”, Via E. Orabona 4, 70125 Bari; Lombardo, Anna

The bioconcentration factor (BCF) is an important bioaccumulation hazard assessment metric in many regulatory contexts. Its assessment is required by the REACH regulation (Registration, Evaluation, Authorization and Restriction of Chemicals) and by CLP (Classification, Labeling and Packaging). We challenged nine well-known and widely used BCF QSAR models against 851 compounds stored in an ad-hoc created database. The goodness of the regression analysis was assessed by considering the determination coefficient (R{sup 2}) and the Root Mean Square Error (RMSE); Cooper's statistics and Matthew's Correlation Coefficient (MCC) were calculated for all the thresholds relevant for regulatory purposes (i.e. 100 L/kg for Chemicalmore » Safety Assessment; 500 L/kg for Classification and Labeling; 2000 and 5000 L/kg for Persistent, Bioaccumulative and Toxic (PBT) and very Persistent, very Bioaccumulative (vPvB) assessment) to assess the classification, with particular attention to the models' ability to control the occurrence of false negatives. As a first step, statistical analysis was performed for the predictions of the entire dataset; R{sup 2}>0.70 was obtained using CORAL, T.E.S.T. and EPISuite Arnot–Gobas models. As classifiers, ACD and log P-based equations were the best in terms of sensitivity, ranging from 0.75 to 0.94. External compound predictions were carried out for the models that had their own training sets. CORAL model returned the best performance (R{sup 2}{sub ext}=0.59), followed by the EPISuite Meylan model (R{sup 2}{sub ext}=0.58). The latter gave also the highest sensitivity on external compounds with values from 0.55 to 0.85, depending on the thresholds. Statistics were also compiled for compounds falling into the models Applicability Domain (AD), giving better performances. In this respect, VEGA CAESAR was the best model in terms of regression (R{sup 2}=0.94) and classification (average sensitivity>0.80). This model also showed the best regression (R{sup 2}=0.85) and sensitivity (average>0.70) for new compounds in the AD but not present in the training set. However, no single optimal model exists and, thus, it would be wise a case-by-case assessment. Yet, integrating the wealth of information from multiple models remains the winner approach. - Highlights: • REACH encourages the use of in silico methods in the assessment of chemicals safety. • The performances of nine BCF models were evaluated on a benchmark database of 851 chemicals. • We compared the models on the basis of both regression and classification performance. • Statistics on chemicals out of the training set and/or within the applicability domain were compiled. • The results show that QSAR models are useful as weight-of-evidence in support to other methods.« less
Total ozone trend significance from space time variability of daily Dobson data

NASA Technical Reports Server (NTRS)

Wilcox, R. W.

1981-01-01

Estimates of standard errors of total ozone time and area means, as derived from ozone's natural temporal and spatial variability and autocorrelation in middle latitudes determined from daily Dobson data are presented. Assessing the significance of apparent total ozone trends is equivalent to assessing the standard error of the means. Standard errors of time averages depend on the temporal variability and correlation of the averaged parameter. Trend detectability is discussed, both for the present network and for satellite measurements.
The calculation of average error probability in a digital fibre optical communication system

NASA Astrophysics Data System (ADS)

Rugemalira, R. A. M.

1980-03-01

This paper deals with the problem of determining the average error probability in a digital fibre optical communication system, in the presence of message dependent inhomogeneous non-stationary shot noise, additive Gaussian noise and intersymbol interference. A zero-forcing equalization receiver filter is considered. Three techniques for error rate evaluation are compared. The Chernoff bound and the Gram-Charlier series expansion methods are compared to the characteristic function technique. The latter predicts a higher receiver sensitivity
Evaluation of causes and frequency of medication errors during information technology downtime.

PubMed

Hanuscak, Tara L; Szeinbach, Sheryl L; Seoane-Vazquez, Enrique; Reichert, Brendan J; McCluskey, Charles F

2009-06-15

The causes and frequency of medication errors occurring during information technology downtime were evaluated. Individuals from a convenience sample of 78 hospitals who were directly responsible for supporting and maintaining clinical information systems (CISs) and automated dispensing systems (ADSs) were surveyed using an online tool between February 2007 and May 2007 to determine if medication errors were reported during periods of system downtime. The errors were classified using the National Coordinating Council for Medication Error Reporting and Prevention severity scoring index. The percentage of respondents reporting downtime was estimated. Of the 78 eligible hospitals, 32 respondents with CIS and ADS responsibilities completed the online survey for a response rate of 41%. For computerized prescriber order entry, patch installations and system upgrades caused an average downtime of 57% over a 12-month period. Lost interface and interface malfunction were reported for centralized and decentralized ADSs, with an average downtime response of 34% and 29%, respectively. The average downtime response was 31% for software malfunctions linked to clinical decision-support systems. Although patient harm did not result from 30 (54%) medication errors, the potential for harm was present for 9 (16%) of these errors. Medication errors occurred during CIS and ADS downtime despite the availability of backup systems and standard protocols to handle periods of system downtime. Efforts should be directed to reduce the frequency and length of down-time in order to minimize medication errors during such downtime.
A consensus view of fold space: Combining SCOP, CATH, and the Dali Domain Dictionary

PubMed Central

Day, Ryan; Beck, David A.C.; Armen, Roger S.; Daggett, Valerie

2003-01-01

We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the classification systems is due to differing domain definitions rather than assigning the same domain to different folds. However, there are significant differences in the fold assignments between the three systems. These remaining differences can be explained primarily in terms of the breadth of the fold classifications. Many structures may be defined as having one fold in one system, whereas far fewer are defined as having the analogous fold in another system. By comparing these folds for a nonredundant set of proteins, the consensus method breaks up broad fold classifications and combines restrictive fold classifications into metafolds, creating, in effect, an averaged view of fold space. This averaged view requires that the structural similarities between proteins having the same metafold be recognized by multiple classification systems. Thus, the consensus map is useful for researchers looking for fold similarities that are relatively independent of the method used to compare proteins. The 30 most populated metafolds, representing the folds of about half of a nonredundant subset of the PDB, are presented here. The full list of metafolds is presented on the Web. PMID:14500873

A consensus view of fold space: combining SCOP, CATH, and the Dali Domain Dictionary.

PubMed

Day, Ryan; Beck, David A C; Armen, Roger S; Daggett, Valerie

2003-10-01

We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the classification systems is due to differing domain definitions rather than assigning the same domain to different folds. However, there are significant differences in the fold assignments between the three systems. These remaining differences can be explained primarily in terms of the breadth of the fold classifications. Many structures may be defined as having one fold in one system, whereas far fewer are defined as having the analogous fold in another system. By comparing these folds for a nonredundant set of proteins, the consensus method breaks up broad fold classifications and combines restrictive fold classifications into metafolds, creating, in effect, an averaged view of fold space. This averaged view requires that the structural similarities between proteins having the same metafold be recognized by multiple classification systems. Thus, the consensus map is useful for researchers looking for fold similarities that are relatively independent of the method used to compare proteins. The 30 most populated metafolds, representing the folds of about half of a nonredundant subset of the PDB, are presented here. The full list of metafolds is presented on the Web.
Error Reduction Methods for Integrated-path Differential-absorption Lidar Measurements

NASA Technical Reports Server (NTRS)

Chen, Jeffrey R.; Numata, Kenji; Wu, Stewart T.

2012-01-01

We report new modeling and error reduction methods for differential-absorption optical-depth (DAOD) measurements of atmospheric constituents using direct-detection integrated-path differential-absorption lidars. Errors from laser frequency noise are quantified in terms of the line center fluctuation and spectral line shape of the laser pulses, revealing relationships verified experimentally. A significant DAOD bias is removed by introducing a correction factor. Errors from surface height and reflectance variations can be reduced to tolerable levels by incorporating altimetry knowledge and "log after averaging", or by pointing the laser and receiver to a fixed surface spot during each wavelength cycle to shorten the time of "averaging before log".
Low-back electromyography (EMG) data-driven load classification for dynamic lifting tasks.

PubMed

Totah, Deema; Ojeda, Lauro; Johnson, Daniel D; Gates, Deanna; Mower Provost, Emily; Barton, Kira

2018-01-01

Numerous devices have been designed to support the back during lifting tasks. To improve the utility of such devices, this research explores the use of preparatory muscle activity to classify muscle loading and initiate appropriate device activation. The goal of this study was to determine the earliest time window that enabled accurate load classification during a dynamic lifting task. Nine subjects performed thirty symmetrical lifts, split evenly across three weight conditions (no-weight, 10-lbs and 24-lbs), while low-back muscle activity data was collected. Seven descriptive statistics features were extracted from 100 ms windows of data. A multinomial logistic regression (MLR) classifier was trained and tested, employing leave-one subject out cross-validation, to classify lifted load values. Dimensionality reduction was achieved through feature cross-correlation analysis and greedy feedforward selection. The time of full load support by the subject was defined as load-onset. Regions of highest average classification accuracy started at 200 ms before until 200 ms after load-onset with average accuracies ranging from 80% (±10%) to 81% (±7%). The average recall for each class ranged from 69-92%. These inter-subject classification results indicate that preparatory muscle activity can be leveraged to identify the intent to lift a weight up to 100 ms prior to load-onset. The high accuracies shown indicate the potential to utilize intent classification for assistive device applications. Active assistive devices, e.g. exoskeletons, could prevent back injury by off-loading low-back muscles. Early intent classification allows more time for actuators to respond and integrate seamlessly with the user.
A constrained reconstruction technique of hyperelasticity parameters for breast cancer assessment

NASA Astrophysics Data System (ADS)

Mehrabian, Hatef; Campbell, Gordon; Samani, Abbas

2010-12-01

In breast elastography, breast tissue usually undergoes large compression resulting in significant geometric and structural changes. This implies that breast elastography is associated with tissue nonlinear behavior. In this study, an elastography technique is presented and an inverse problem formulation is proposed to reconstruct parameters characterizing tissue hyperelasticity. Such parameters can potentially be used for tumor classification. This technique can also have other important clinical applications such as measuring normal tissue hyperelastic parameters in vivo. Such parameters are essential in planning and conducting computer-aided interventional procedures. The proposed parameter reconstruction technique uses a constrained iterative inversion; it can be viewed as an inverse problem. To solve this problem, we used a nonlinear finite element model corresponding to its forward problem. In this research, we applied Veronda-Westmann, Yeoh and polynomial models to model tissue hyperelasticity. To validate the proposed technique, we conducted studies involving numerical and tissue-mimicking phantoms. The numerical phantom consisted of a hemisphere connected to a cylinder, while we constructed the tissue-mimicking phantom from polyvinyl alcohol with freeze-thaw cycles that exhibits nonlinear mechanical behavior. Both phantoms consisted of three types of soft tissues which mimic adipose, fibroglandular tissue and a tumor. The results of the simulations and experiments show feasibility of accurate reconstruction of tumor tissue hyperelastic parameters using the proposed method. In the numerical phantom, all hyperelastic parameters corresponding to the three models were reconstructed with less than 2% error. With the tissue-mimicking phantom, we were able to reconstruct the ratio of the hyperelastic parameters reasonably accurately. Compared to the uniaxial test results, the average error of the ratios of the parameters reconstructed for inclusion to the middle and external layers were 13% and 9.6%, respectively. Given that the parameter ratios of the abnormal tissues to the normal ones range from three times to more than ten times, this accuracy is sufficient for tumor classification.
Classification and correction of the radar bright band with polarimetric radar

NASA Astrophysics Data System (ADS)

Hall, Will; Rico-Ramirez, Miguel; Kramer, Stefan

2015-04-01

The annular region of enhanced radar reflectivity, known as the Bright Band (BB), occurs when the radar beam intersects a layer of melting hydrometeors. Radar reflectivity is related to rainfall through a power law equation and so this enhanced region can lead to overestimations of rainfall by a factor of up to 5, so it is important to correct for this. The BB region can be identified by using several techniques including hydrometeor classification and freezing level forecasts from mesoscale meteorological models. Advances in dual-polarisation radar measurements and continued research in the field has led to increased accuracy in the ability to identify the melting snow region. A method proposed by Kitchen et al (1994), a form of which is currently used operationally in the UK, utilises idealised Vertical Profiles of Reflectivity (VPR) to correct for the BB enhancement. A simpler and more computationally efficient method involves the formation of an average VPR from multiple elevations for correction that can still cause a significant decrease in error (Vignal 2000). The purpose of this research is to evaluate a method that relies only on analysis of measurements from an operational C-band polarimetric radar without the need for computationally expensive models. Initial results show that LDR is a strong classifier of melting snow with a high Critical Success Index of 97% when compared to the other variables. An algorithm based on idealised VPRs resulted in the largest decrease in error when BB corrected scans are compared to rain gauges and to lower level scans with a reduction in RMSE of 61% for rain-rate measurements. References Kitchen, M., R. Brown, and A. G. Davies, 1994: Real-time correction of weather radar data for the effects of bright band, range and orographic growth in widespread precipitation. Q.J.R. Meteorol. Soc., 120, 1231-1254. Vignal, B. et al, 2000: Three methods to determine profiles of reflectivity from volumetric radar data to correct precipitation estimates. J. Appl. Meteor., 39(10), 1715-1726.
A classification on human factor accident/incident of China civil aviation in recent twelve years.

PubMed

Luo, Xiao-li

2004-10-01

To study human factor accident/incident occurred during 1990-2001 using new classification standard. The human factor accident/incident classification standard is developed on the basis of Reason's Model, combining with CAAC's traditional classifying method, and applied to the classified statistical analysis for 361 flying incidents and 35 flight accidents of China civil aviation, which is induced by human factors and occurred from 1990 to 2001. 1) the incident percentage of taxi and cruise is higher than that of takeoff, climb and descent. 2) The dominating type of flight incidents is diverging of runway, overrunning, near-miss, tail/wingtip/engine strike and ground obstacle impacting. 3) The top three accidents are out of control caused by crew, mountain collision and over runway. 4) Crew's basic operating skill is lower than what we imagined, the mostly representation is poor correcting ability when flight error happened. 5) Crew errors can be represented by incorrect control, regulation and procedure violation, disorientation and diverging percentage of correct flight level. The poor CRM skill is the dominant factor impacting China civil aviation safety, this result has a coincidence with previous study, but there is much difference and distinct characteristic in top incident phase, the type of crew error and behavior performance compared with that of advanced countries. We should strengthen CRM training for all of pilots aiming at the Chinese pilot behavior characteristic in order to improve the safety level of China civil aviation.
The joint use of the tangential electric field and surface Laplacian in EEG classification.

PubMed

Carvalhaes, C G; de Barros, J Acacio; Perreau-Guimaraes, M; Suppes, P

2014-01-01

We investigate the joint use of the tangential electric field (EF) and the surface Laplacian (SL) derivation as a method to improve the classification of EEG signals. We considered five classification tasks to test the validity of such approach. In all five tasks, the joint use of the components of the EF and the SL outperformed the scalar potential. The smallest effect occurred in the classification of a mental task, wherein the average classification rate was improved by 0.5 standard deviations. The largest effect was obtained in the classification of visual stimuli and corresponded to an improvement of 2.1 standard deviations.
Content-based multiple bitstream image transmission over noisy channels.

PubMed

Cao, Lei; Chen, Chang Wen

2002-01-01

In this paper, we propose a novel combined source and channel coding scheme for image transmission over noisy channels. The main feature of the proposed scheme is a systematic decomposition of image sources so that unequal error protection can be applied according to not only bit error sensitivity but also visual content importance. The wavelet transform is adopted to hierarchically decompose the image. The association between the wavelet coefficients and what they represent spatially in the original image is fully exploited so that wavelet blocks are classified based on their corresponding image content. The classification produces wavelet blocks in each class with similar content and statistics, therefore enables high performance source compression using the set partitioning in hierarchical trees (SPIHT) algorithm. To combat the channel noise, an unequal error protection strategy with rate-compatible punctured convolutional/cyclic redundancy check (RCPC/CRC) codes is implemented based on the bit contribution to both peak signal-to-noise ratio (PSNR) and visual quality. At the receiving end, a postprocessing method making use of the SPIHT decoding structure and the classification map is developed to restore the degradation due to the residual error after channel decoding. Experimental results show that the proposed scheme is indeed able to provide protection both for the bits that are more sensitive to errors and for the more important visual content under a noisy transmission environment. In particular, the reconstructed images illustrate consistently better visual quality than using the single-bitstream-based schemes.
Robust Image Regression Based on the Extended Matrix Variate Power Exponential Distribution of Dependent Noise.

PubMed

Luo, Lei; Yang, Jian; Qian, Jianjun; Tai, Ying; Lu, Gui-Fu

2017-09-01

Dealing with partial occlusion or illumination is one of the most challenging problems in image representation and classification. In this problem, the characterization of the representation error plays a crucial role. In most current approaches, the error matrix needs to be stretched into a vector and each element is assumed to be independently corrupted. This ignores the dependence between the elements of error. In this paper, it is assumed that the error image caused by partial occlusion or illumination changes is a random matrix variate and follows the extended matrix variate power exponential distribution. This has the heavy tailed regions and can be used to describe a matrix pattern of l×m dimensional observations that are not independent. This paper reveals the essence of the proposed distribution: it actually alleviates the correlations between pixels in an error matrix E and makes E approximately Gaussian. On the basis of this distribution, we derive a Schatten p -norm-based matrix regression model with L q regularization. Alternating direction method of multipliers is applied to solve this model. To get a closed-form solution in each step of the algorithm, two singular value function thresholding operators are introduced. In addition, the extended Schatten p -norm is utilized to characterize the distance between the test samples and classes in the design of the classifier. Extensive experimental results for image reconstruction and classification with structural noise demonstrate that the proposed algorithm works much more robustly than some existing regression-based methods.
Assessing the accuracy of the International Classification of Diseases codes to identify abusive head trauma: a feasibility study.

PubMed

Berger, Rachel P; Parks, Sharyn; Fromkin, Janet; Rubin, Pamela; Pecora, Peter J

2015-04-01

To assess the accuracy of an International Classification of Diseases (ICD) code-based operational case definition for abusive head trauma (AHT). Subjects were children <5 years of age evaluated for AHT by a hospital-based Child Protection Team (CPT) at a tertiary care paediatric hospital with a completely electronic medical record (EMR) system. Subjects were designated as non-AHT traumatic brain injury (TBI) or AHT based on whether the CPT determined that the injuries were due to AHT. The sensitivity and specificity of the ICD-based definition were calculated. There were 223 children evaluated for AHT: 117 AHT and 106 non-AHT TBI. The sensitivity and specificity of the ICD-based operational case definition were 92% (95% CI 85.8 to 96.2) and 96% (95% CI 92.3 to 99.7), respectively. All errors in sensitivity and three of the four specificity errors were due to coder error; one specificity error was a physician error. In a paediatric tertiary care hospital with an EMR system, the accuracy of an ICD-based case definition for AHT was high. Additional studies are needed to assess the accuracy of this definition in all types of hospitals in which children with AHT are cared for. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Modeling habitat dynamics accounting for possible misclassification

USGS Publications Warehouse

Veran, Sophie; Kleiner, Kevin J.; Choquet, Remi; Collazo, Jaime; Nichols, James D.

2012-01-01

Land cover data are widely used in ecology as land cover change is a major component of changes affecting ecological systems. Landscape change estimates are characterized by classification errors. Researchers have used error matrices to adjust estimates of areal extent, but estimation of land cover change is more difficult and more challenging, with error in classification being confused with change. We modeled land cover dynamics for a discrete set of habitat states. The approach accounts for state uncertainty to produce unbiased estimates of habitat transition probabilities using ground information to inform error rates. We consider the case when true and observed habitat states are available for the same geographic unit (pixel) and when true and observed states are obtained at one level of resolution, but transition probabilities estimated at a different level of resolution (aggregations of pixels). Simulation results showed a strong bias when estimating transition probabilities if misclassification was not accounted for. Scaling-up does not necessarily decrease the bias and can even increase it. Analyses of land cover data in the Southeast region of the USA showed that land change patterns appeared distorted if misclassification was not accounted for: rate of habitat turnover was artificially increased and habitat composition appeared more homogeneous. Not properly accounting for land cover misclassification can produce misleading inferences about habitat state and dynamics and also misleading predictions about species distributions based on habitat. Our models that explicitly account for state uncertainty should be useful in obtaining more accurate inferences about change from data that include errors.
Estimating Gestational Age With Sonography: Regression-Derived Formula Versus the Fetal Biometric Average.

PubMed

Cawyer, Chase R; Anderson, Sarah B; Szychowski, Jeff M; Neely, Cherry; Owen, John

2018-03-01

To compare the accuracy of a new regression-derived formula developed from the National Fetal Growth Studies data to the common alternative method that uses the average of the gestational ages (GAs) calculated for each fetal biometric measurement (biparietal diameter, head circumference, abdominal circumference, and femur length). This retrospective cross-sectional study identified nonanomalous singleton pregnancies that had a crown-rump length plus at least 1 additional sonographic examination with complete fetal biometric measurements. With the use of the crown-rump length to establish the referent estimated date of delivery, each method's (National Institute of Child Health and Human Development regression versus Hadlock average [Radiology 1984; 152:497-501]), error at every examination was computed. Error, defined as the difference between the crown-rump length-derived GA and each method's predicted GA (weeks), was compared in 3 GA intervals: 1 (14 weeks-20 weeks 6 days), 2 (21 weeks-28 weeks 6 days), and 3 (≥29 weeks). In addition, the proportion of each method's examinations that had errors outside prespecified (±) day ranges was computed by using odds ratios. A total of 16,904 sonograms were identified. The overall and prespecified GA range subset mean errors were significantly smaller for the regression compared to the average (P < .01), and the regression had significantly lower odds of observing examinations outside the specified range of error in GA intervals 2 (odds ratio, 1.15; 95% confidence interval, 1.01-1.31) and 3 (odds ratio, 1.24; 95% confidence interval, 1.17-1.32) than the average method. In a contemporary unselected population of women dated by a crown-rump length-derived GA, the National Institute of Child Health and Human Development regression formula produced fewer estimates outside a prespecified margin of error than the commonly used Hadlock average; the differences were most pronounced for GA estimates at 29 weeks and later. © 2017 by the American Institute of Ultrasound in Medicine.
Identification of coffee bean varieties using hyperspectral imaging: influence of preprocessing methods and pixel-wise spectra analysis.

PubMed

Zhang, Chu; Liu, Fei; He, Yong

2018-02-01

Hyperspectral imaging was used to identify and to visualize the coffee bean varieties. Spectral preprocessing of pixel-wise spectra was conducted by different methods, including moving average smoothing (MA), wavelet transform (WT) and empirical mode decomposition (EMD). Meanwhile, spatial preprocessing of the gray-scale image at each wavelength was conducted by median filter (MF). Support vector machine (SVM) models using full sample average spectra and pixel-wise spectra, and the selected optimal wavelengths by second derivative spectra all achieved classification accuracy over 80%. Primarily, the SVM models using pixel-wise spectra were used to predict the sample average spectra, and these models obtained over 80% of the classification accuracy. Secondly, the SVM models using sample average spectra were used to predict pixel-wise spectra, but achieved with lower than 50% of classification accuracy. The results indicated that WT and EMD were suitable for pixel-wise spectra preprocessing. The use of pixel-wise spectra could extend the calibration set, and resulted in the good prediction results for pixel-wise spectra and sample average spectra. The overall results indicated the effectiveness of using spectral preprocessing and the adoption of pixel-wise spectra. The results provided an alternative way of data processing for applications of hyperspectral imaging in food industry.
Underwater target classification using wavelet packets and neural networks.

PubMed

Azimi-Sadjadi, M R; Yao, D; Huang, Q; Dobeck, G J

2000-01-01

In this paper, a new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals. The system consists of a feature extractor using wavelet packets in conjunction with linear predictive coding (LPC), a feature selection scheme, and a backpropagation neural-network classifier. The data set used for this study consists of the backscattered signals from six different objects: two mine-like targets and four nontargets for several aspect angles. Simulation results on ten different noisy realizations and for signal-to-noise ratio (SNR) of 12 dB are presented. The receiver operating characteristic (ROC) curve of the classifier generated based on these results demonstrated excellent classification performance of the system. The generalization ability of the trained network was demonstrated by computing the error and classification rate statistics on a large data set. A multiaspect fusion scheme was also adopted in order to further improve the classification performance.
The effect of the atmosphere on the classification of satellite observations to identify surface features

NASA Technical Reports Server (NTRS)

Fraser, R. S.; Bahethi, O. P.; Al-Abbas, A. H.

1977-01-01

The effect of differences in atmospheric turbidity on the classification of Landsat 1 observations of a rural scene is presented. The observations are classified by an unsupervised clustering technique. These clusters serve as a training set for use of a maximum-likelihood algorithm. The measured radiances in each of the four spectral bands are then changed by amounts measured by Landsat 1. These changes can be associated with a decrease in atmospheric turbidity by a factor of 1.3. The classification of 22% of the pixels changes as a result of the modification. The modified observations are then reclassified as an independent set. Only 3% of the pixels have a different classification than the unmodified set. Hence, if classification errors of rural areas are not to exceed 15%, a new training set has to be developed whenever the difference in turbidity between the training and test sets reaches unity.
Multinomial mixture model with heterogeneous classification probabilities

USGS Publications Warehouse

Holland, M.D.; Gray, B.R.

2011-01-01

Royle and Link (Ecology 86(9):2505-2512, 2005) proposed an analytical method that allowed estimation of multinomial distribution parameters and classification probabilities from categorical data measured with error. While useful, we demonstrate algebraically and by simulations that this method yields biased multinomial parameter estimates when the probabilities of correct category classifications vary among sampling units. We address this shortcoming by treating these probabilities as logit-normal random variables within a Bayesian framework. We use Markov chain Monte Carlo to compute Bayes estimates from a simulated sample from the posterior distribution. Based on simulations, this elaborated Royle-Link model yields nearly unbiased estimates of multinomial and correct classification probability estimates when classification probabilities are allowed to vary according to the normal distribution on the logit scale or according to the Beta distribution. The method is illustrated using categorical submersed aquatic vegetation data. ?? 2010 Springer Science+Business Media, LLC.
Diagnostic classification of macular ganglion cell and retinal nerve fiber layer analysis: differentiation of false-positives from glaucoma.

PubMed

Kim, Ko Eun; Jeoung, Jin Wook; Park, Ki Ho; Kim, Dong Myung; Kim, Seok Hwan

2015-03-01

To investigate the rate and associated factors of false-positive diagnostic classification of ganglion cell analysis (GCA) and retinal nerve fiber layer (RNFL) maps, and characteristic false-positive patterns on optical coherence tomography (OCT) deviation maps. Prospective, cross-sectional study. A total of 104 healthy eyes of 104 normal participants. All participants underwent peripapillary and macular spectral-domain (Cirrus-HD, Carl Zeiss Meditec Inc, Dublin, CA) OCT scans. False-positive diagnostic classification was defined as yellow or red color-coded areas for GCA and RNFL maps. Univariate and multivariate logistic regression analyses were used to determine associated factors. Eyes with abnormal OCT deviation maps were categorized on the basis of the shape and location of abnormal color-coded area. Differences in clinical characteristics among the subgroups were compared. (1) The rate and associated factors of false-positive OCT maps; (2) patterns of false-positive, color-coded areas on the GCA deviation map and associated clinical characteristics. Of the 104 healthy eyes, 42 (40.4%) and 32 (30.8%) showed abnormal diagnostic classifications on any of the GCA and RNFL maps, respectively. Multivariate analysis revealed that false-positive GCA diagnostic classification was associated with longer axial length and larger fovea-disc angle, whereas longer axial length and smaller disc area were associated with abnormal RNFL maps. Eyes with abnormal GCA deviation map were categorized as group A (donut-shaped round area around the inner annulus), group B (island-like isolated area), and group C (diffuse, circular area with an irregular inner margin in either). The axial length showed a significant increasing trend from group A to C (P=0.001), and likewise, the refractive error was more myopic in group C than in groups A (P=0.015) and B (P=0.014). Group C had thinner average ganglion cell-inner plexiform layer thickness compared with other groups (group A=B>C, P=0.004). Abnormal OCT diagnostic classification should be interpreted with caution, especially in eyes with long axial lengths, large fovea-disc angles, and small optic discs. Our findings suggest that the characteristic patterns of OCT deviation map can provide useful clues to distinguish glaucomatous changes from false-positive findings. Copyright © 2015 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
A geomorphic approach to 100-year floodplain mapping for the Conterminous United States

NASA Astrophysics Data System (ADS)

Jafarzadegan, Keighobad; Merwade, Venkatesh; Saksena, Siddharth

2018-06-01

Floodplain mapping using hydrodynamic models is difficult in data scarce regions. Additionally, using hydrodynamic models to map floodplain over large stream network can be computationally challenging. Some of these limitations of floodplain mapping using hydrodynamic modeling can be overcome by developing computationally efficient statistical methods to identify floodplains in large and ungauged watersheds using publicly available data. This paper proposes a geomorphic model to generate probabilistic 100-year floodplain maps for the Conterminous United States (CONUS). The proposed model first categorizes the watersheds in the CONUS into three classes based on the height of the water surface corresponding to the 100-year flood from the streambed. Next, the probability that any watershed in the CONUS belongs to one of these three classes is computed through supervised classification using watershed characteristics related to topography, hydrography, land use and climate. The result of this classification is then fed into a probabilistic threshold binary classifier (PTBC) to generate the probabilistic 100-year floodplain maps. The supervised classification algorithm is trained by using the 100-year Flood Insurance Rated Maps (FIRM) from the U.S. Federal Emergency Management Agency (FEMA). FEMA FIRMs are also used to validate the performance of the proposed model in areas not included in the training. Additionally, HEC-RAS model generated flood inundation extents are used to validate the model performance at fifteen sites that lack FEMA maps. Validation results show that the probabilistic 100-year floodplain maps, generated by proposed model, match well with both FEMA and HEC-RAS generated maps. On average, the error of predicted flood extents is around 14% across the CONUS. The high accuracy of the validation results shows the reliability of the geomorphic model as an alternative approach for fast and cost effective delineation of 100-year floodplains for the CONUS.
An embedded implementation based on adaptive filter bank for brain-computer interface systems.

PubMed

Belwafi, Kais; Romain, Olivier; Gannouni, Sofien; Ghaffari, Fakhreddine; Djemal, Ridha; Ouni, Bouraoui

2018-07-15

Brain-computer interface (BCI) is a new communication pathway for users with neurological deficiencies. The implementation of a BCI system requires complex electroencephalography (EEG) signal processing including filtering, feature extraction and classification algorithms. Most of current BCI systems are implemented on personal computers. Therefore, there is a great interest in implementing BCI on embedded platforms to meet system specifications in terms of time response, cost effectiveness, power consumption, and accuracy. This article presents an embedded-BCI (EBCI) system based on a Stratix-IV field programmable gate array. The proposed system relays on the weighted overlap-add (WOLA) algorithm to perform dynamic filtering of EEG-signals by analyzing the event-related desynchronization/synchronization (ERD/ERS). The EEG-signals are classified, using the linear discriminant analysis algorithm, based on their spatial features. The proposed system performs fast classification within a time delay of 0.430 s/trial, achieving an average accuracy of 76.80% according to an offline approach and 80.25% using our own recording. The estimated power consumption of the prototype is approximately 0.7 W. Results show that the proposed EBCI system reduces the overall classification error rate for the three datasets of the BCI-competition by 5% compared to other similar implementations. Moreover, experiment shows that the proposed system maintains a high accuracy rate with a short processing time, a low power consumption, and a low cost. Performing dynamic filtering of EEG-signals using WOLA increases the recognition rate of ERD/ERS patterns of motor imagery brain activity. This approach allows to develop a complete prototype of a EBCI system that achieves excellent accuracy rates. Copyright © 2018 Elsevier B.V. All rights reserved.
Improving laboratory data entry quality using Six Sigma.

PubMed

Elbireer, Ali; Le Chasseur, Julie; Jackson, Brooks

2013-01-01

The Uganda Makerere University provides clinical laboratory support to over 70 clients in Uganda. With increased volume, manual data entry errors have steadily increased, prompting laboratory managers to employ the Six Sigma method to evaluate and reduce their problems. The purpose of this paper is to describe how laboratory data entry quality was improved by using Six Sigma. The Six Sigma Quality Improvement (QI) project team followed a sequence of steps, starting with defining project goals, measuring data entry errors to assess current performance, analyzing data and determining data-entry error root causes. Finally the team implemented changes and control measures to address the root causes and to maintain improvements. Establishing the Six Sigma project required considerable resources and maintaining the gains requires additional personnel time and dedicated resources. After initiating the Six Sigma project, there was a 60.5 percent reduction in data entry errors from 423 errors a month (i.e. 4.34 Six Sigma) in the first month, down to an average 166 errors/month (i.e. 4.65 Six Sigma) over 12 months. The team estimated the average cost of identifying and fixing a data entry error to be $16.25 per error. Thus, reducing errors by an average of 257 errors per month over one year has saved the laboratory an estimated $50,115 a year. The Six Sigma QI project provides a replicable framework for Ugandan laboratory staff and other resource-limited organizations to promote quality environment. Laboratory staff can deliver excellent care at a lower cost, by applying QI principles. This innovative QI method of reducing data entry errors in medical laboratories may improve the clinical workflow processes and make cost savings across the health care continuum.

Stretchy binary classification.

PubMed

Toh, Kar-Ann; Lin, Zhiping; Sun, Lei; Li, Zhengguo

2018-01-01

In this article, we introduce an analytic formulation for compressive binary classification. The formulation seeks to solve the least ℓ p -norm of the parameter vector subject to a classification error constraint. An analytic and stretchable estimation is conjectured where the estimation can be viewed as an extension of the pseudoinverse with left and right constructions. Our variance analysis indicates that the estimation based on the left pseudoinverse is unbiased and the estimation based on the right pseudoinverse is biased. Sparseness can be obtained for the biased estimation under certain mild conditions. The proposed estimation is investigated numerically using both synthetic and real-world data. Copyright © 2017 Elsevier Ltd. All rights reserved.
[Effects of residents' care needs classification (and misclassification) in nursing homes: the example of SOSIA classification].

PubMed

Nebuloni, G; Di Giulio, P; Gregori, D; Sandonà, P; Berchialla, P; Foltran, F; Renga, G

2011-01-01

Since 2003, the Lombardy region has introduced a case-mix reimbursement system for nursing homes based on the SOSIA form which classifies residents into eight classes of frailty. In the present study the agreement between SOSIA classification and other well documented instruments, including Barthel Index, Mini Mental State Examination and Clinical Dementia Rating Scale is evaluated in 100 nursing home residents. Only 50% of residents with severe dementia have been recognized as seriously impaired when assessed with SOSIA form; since misclassification errors underestimate residents' care needs, they determine an insufficient reimbursement limiting nursing home possibility to offer care appropriate for the case-mix.
Effect of filtration of signals of brain activity on quality of recognition of brain activity patterns using artificial intelligence methods

NASA Astrophysics Data System (ADS)

Hramov, Alexander E.; Frolov, Nikita S.; Musatov, Vyachaslav Yu.

2018-02-01

In present work we studied features of the human brain states classification, corresponding to the real movements of hands and legs. For this purpose we used supervised learning algorithm based on feed-forward artificial neural networks (ANNs) with error back-propagation along with the support vector machine (SVM) method. We compared the quality of operator movements classification by means of EEG signals obtained experimentally in the absence of preliminary processing and after filtration in different ranges up to 25 Hz. It was shown that low-frequency filtering of multichannel EEG data significantly improved accuracy of operator movements classification.
MO-FG-202-05: Identifying Treatment Planning System Errors in IROC-H Phantom Irradiations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kerns, J; Followill, D; Howell, R

Purpose: Treatment Planning System (TPS) errors can affect large numbers of cancer patients receiving radiation therapy. Using an independent recalculation system, the Imaging and Radiation Oncology Core-Houston (IROC-H) can identify institutions that have not sufficiently modelled their linear accelerators in their TPS model. Methods: Linear accelerator point measurement data from IROC-H’s site visits was aggregated and analyzed from over 30 linear accelerator models. Dosimetrically similar models were combined to create “classes”. The class data was used to construct customized beam models in an independent treatment dose verification system (TVS). Approximately 200 head and neck phantom plans from 2012 to 2015more » were recalculated using this TVS. Comparison of plan accuracy was evaluated by comparing the measured dose to the institution’s TPS dose as well as the TVS dose. In cases where the TVS was more accurate than the institution by an average of >2%, the institution was identified as having a non-negligible TPS error. Results: Of the ∼200 recalculated plans, the average improvement using the TVS was ∼0.1%; i.e. the recalculation, on average, slightly outperformed the institution’s TPS. Of all the recalculated phantoms, 20% were identified as having a non-negligible TPS error. Fourteen plans failed current IROC-H criteria; the average TVS improvement of the failing plans was ∼3% and 57% were found to have non-negligible TPS errors. Conclusion: IROC-H has developed an independent recalculation system to identify institutions that have considerable TPS errors. A large number of institutions were found to have non-negligible TPS errors. Even institutions that passed IROC-H criteria could be identified as having a TPS error. Resolution of such errors would improve dose delivery for a large number of IROC-H phantoms and ultimately, patients.« less
GDF v2.0, an enhanced version of GDF

NASA Astrophysics Data System (ADS)

Tsoulos, Ioannis G.; Gavrilis, Dimitris; Dermatas, Evangelos

2007-12-01

An improved version of the function estimation program GDF is presented. The main enhancements of the new version include: multi-output function estimation, capability of defining custom functions in the grammar and selection of the error function. The new version has been evaluated on a series of classification and regression datasets, that are widely used for the evaluation of such methods. It is compared to two known neural networks and outperforms them in 5 (out of 10) datasets. Program summaryTitle of program: GDF v2.0 Catalogue identifier: ADXC_v2_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADXC_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 98 147 No. of bytes in distributed program, including test data, etc.: 2 040 684 Distribution format: tar.gz Programming language: GNU C++ Computer: The program is designed to be portable in all systems running the GNU C++ compiler Operating system: Linux, Solaris, FreeBSD RAM: 200000 bytes Classification: 4.9 Does the new version supersede the previous version?: Yes Nature of problem: The technique of function estimation tries to discover from a series of input data a functional form that best describes them. This can be performed with the use of parametric models, whose parameters can adapt according to the input data. Solution method: Functional forms are being created by genetic programming which are approximations for the symbolic regression problem. Reasons for new version: The GDF package was extended in order to be more flexible and user customizable than the old package. The user can extend the package by defining his own error functions and he can extend the grammar of the package by adding new functions to the function repertoire. Also, the new version can perform function estimation of multi-output functions and it can be used for classification problems. Summary of revisions: The following features have been added to the package GDF: Multi-output function approximation. The package can now approximate any function f:R→R. This feature gives also to the package the capability of performing classification and not only regression. User defined function can be added to the repertoire of the grammar, extending the regression capabilities of the package. This feature is limited to 3 functions, but easily this number can be increased. Capability of selecting the error function. The package offers now to the user apart from the mean square error other error functions such as: mean absolute square error, maximum square error. Also, user defined error functions can be added to the set of error functions. More verbose output. The main program displays more information to the user as well as the default values for the parameters. Also, the package gives to the user the capability to define an output file, where the output of the gdf program for the testing set will be stored after the termination of the process. Additional comments: A technical report describing the revisions, experiments and test runs is packaged with the source code. Running time: Depending on the train data.
Ground truth management system to support multispectral scanner /MSS/ digital analysis

NASA Technical Reports Server (NTRS)

Coiner, J. C.; Ungar, S. G.

1977-01-01

A computerized geographic information system for management of ground truth has been designed and implemented to relate MSS classification results to in situ observations. The ground truth system transforms, generalizes and rectifies ground observations to conform to the pixel size and shape of high resolution MSS aircraft data. These observations can then be aggregated for comparison to lower resolution sensor data. Construction of a digital ground truth array allows direct pixel by pixel comparison between classification results of MSS data and ground truth. By making comparisons, analysts can identify spatial distribution of error within the MSS data as well as usual figures of merit for the classifications. Use of the ground truth system permits investigators to compare a variety of environmental or anthropogenic data, such as soil color or tillage patterns, with classification results and allows direct inclusion of such data into classification operations. To illustrate the system, examples from classification of simulated Thematic Mapper data for agricultural test sites in North Dakota and Kansas are provided.
12 CFR 702.105 - Weighted-average life of investments.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 12 Banks and Banking 6 2011-01-01 2011-01-01 false Weighted-average life of investments. 702.105... PROMPT CORRECTIVE ACTION Net Worth Classification § 702.105 Weighted-average life of investments. Except as provided below (Table 3), the weighted-average life of an investment for purposes of §§ 702.106(c...
12 CFR 702.105 - Weighted-average life of investments.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 12 Banks and Banking 6 2010-01-01 2010-01-01 false Weighted-average life of investments. 702.105... PROMPT CORRECTIVE ACTION Net Worth Classification § 702.105 Weighted-average life of investments. Except as provided below (Table 3), the weighted-average life of an investment for purposes of §§ 702.106(c...
12 CFR 702.105 - Weighted-average life of investments.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 12 Banks and Banking 7 2014-01-01 2014-01-01 false Weighted-average life of investments. 702.105... PROMPT CORRECTIVE ACTION Net Worth Classification § 702.105 Weighted-average life of investments. Except as provided below (Table 3), the weighted-average life of an investment for purposes of §§ 702.106(c...
12 CFR 702.105 - Weighted-average life of investments.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 12 Banks and Banking 7 2013-01-01 2013-01-01 false Weighted-average life of investments. 702.105... PROMPT CORRECTIVE ACTION Net Worth Classification § 702.105 Weighted-average life of investments. Except as provided below (Table 3), the weighted-average life of an investment for purposes of §§ 702.106(c...
12 CFR 702.105 - Weighted-average life of investments.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 12 Banks and Banking 7 2012-01-01 2012-01-01 false Weighted-average life of investments. 702.105... PROMPT CORRECTIVE ACTION Net Worth Classification § 702.105 Weighted-average life of investments. Except as provided below (Table 3), the weighted-average life of an investment for purposes of §§ 702.106(c...
An automated Pearson's correlation change classification (APC3) approach for GC/MS metabonomic data using total ion chromatograms (TICs).

PubMed

Prakash, Bhaskaran David; Esuvaranathan, Kesavan; Ho, Paul C; Pasikanti, Kishore Kumar; Chan, Eric Chun Yong; Yap, Chun Wei

2013-05-21

A fully automated and computationally efficient Pearson's correlation change classification (APC3) approach is proposed and shown to have overall comparable performance with both an average accuracy and an average AUC of 0.89 ± 0.08 but is 3.9 to 7 times faster, easier to use and have low outlier susceptibility in contrast to other dimensional reduction and classification combinations using only the total ion chromatogram (TIC) intensities of GC/MS data. The use of only the TIC permits the possible application of APC3 to other metabonomic data such as LC/MS TICs or NMR spectra. A RapidMiner implementation is available for download at http://padel.nus.edu.sg/software/padelapc3.
Disregarding population specificity: its influence on the sex assessment methods from the tibia.

PubMed

Kotěrová, Anežka; Velemínská, Jana; Dupej, Ján; Brzobohatá, Hana; Pilný, Aleš; Brůžek, Jaroslav

2017-01-01

Forensic anthropology has developed classification techniques for sex estimation of unknown skeletal remains, for example population-specific discriminant function analyses. These methods were designed for populations that lived mostly in the late nineteenth and twentieth centuries. Their level of reliability or misclassification is important for practical use in today's forensic practice; it is, however, unknown. We addressed the question of what the likelihood of errors would be if population specificity of discriminant functions of the tibia were disregarded. Moreover, five classification functions in a Czech sample were proposed (accuracies 82.1-87.5 %, sex bias ranged from -1.3 to -5.4 %). We measured ten variables traditionally used for sex assessment of the tibia on a sample of 30 male and 26 female models from recent Czech population. To estimate the classification accuracy and error (misclassification) rates ignoring population specificity, we selected published classification functions of tibia for the Portuguese, south European, and the North American populations. These functions were applied on the dimensions of the Czech population. Comparing the classification success of the reference and the tested Czech sample showed that females from Czech population were significantly overestimated and mostly misclassified as males. Overall accuracy of sex assessment significantly decreased (53.6-69.7 %), sex bias -29.4-100 %, which is most probably caused by secular trend and the generally high variability of body size. Results indicate that the discriminant functions, developed for skeletal series representing geographically and chronologically diverse populations, are not applicable in current forensic investigations. Finally, implications and recommendations for future research are discussed.
Decision Making for Borderline Cases in Pass/Fail Clinical Anatomy Courses: The Practical Value of the Standard Error of Measurement and Likelihood Ratio in a Diagnostic Test

ERIC Educational Resources Information Center

Severo, Milton; Silva-Pereira, Fernanda; Ferreira, Maria Amelia

2013-01-01

Several studies have shown that the standard error of measurement (SEM) can be used as an additional “safety net” to reduce the frequency of false-positive or false-negative student grading classifications. Practical examinations in clinical anatomy are often used as diagnostic tests to admit students to course final examinations. The aim of this…
Reducing error and improving efficiency during vascular interventional radiology: implementation of a preprocedural team rehearsal.

PubMed

Morbi, Abigail H M; Hamady, Mohamad S; Riga, Celia V; Kashef, Elika; Pearch, Ben J; Vincent, Charles; Moorthy, Krishna; Vats, Amit; Cheshire, Nicholas J W; Bicknell, Colin D

2012-08-01

To determine the type and frequency of errors during vascular interventional radiology (VIR) and design and implement an intervention to reduce error and improve efficiency in this setting. Ethical guidance was sought from the Research Services Department at Imperial College London. Informed consent was not obtained. Field notes were recorded during 55 VIR procedures by a single observer. Two blinded assessors identified failures from field notes and categorized them into one or more errors by using a 22-part classification system. The potential to cause harm, disruption to procedural flow, and preventability of each failure was determined. A preprocedural team rehearsal (PPTR) was then designed and implemented to target frequent preventable potential failures. Thirty-three procedures were observed subsequently to determine the efficacy of the PPTR. Nonparametric statistical analysis was used to determine the effect of intervention on potential failure rates, potential to cause harm and procedural flow disruption scores (Mann-Whitney U test), and number of preventable failures (Fisher exact test). Before intervention, 1197 potential failures were recorded, of which 54.6% were preventable. A total of 2040 errors were deemed to have occurred to produce these failures. Planning error (19.7%), staff absence (16.2%), equipment unavailability (12.2%), communication error (11.2%), and lack of safety consciousness (6.1%) were the most frequent errors, accounting for 65.4% of the total. After intervention, 352 potential failures were recorded. Classification resulted in 477 errors. Preventable failures decreased from 54.6% to 27.3% (P < .001) with implementation of PPTR. Potential failure rates per hour decreased from 18.8 to 9.2 (P < .001), with no increase in potential to cause harm or procedural flow disruption per failure. Failures during VIR procedures are largely because of ineffective planning, communication error, and equipment difficulties, rather than a result of technical or patient-related issues. Many of these potential failures are preventable. A PPTR is an effective means of targeting frequent preventable failures, reducing procedural delays and improving patient safety.
Bug Distribution and Statistical Pattern Classification.

ERIC Educational Resources Information Center

Tatsuoka, Kikumi K.; Tatsuoka, Maurice M.

1987-01-01

The rule space model permits measurement of cognitive skill acquisition and error diagnosis. Further discussion introduces Bayesian hypothesis testing and bug distribution. An illustration involves an artificial intelligence approach to testing fractions and arithmetic. (Author/GDC)
Average capacity optimization in free-space optical communication system over atmospheric turbulence channels with pointing errors.

PubMed

Liu, Chao; Yao, Yong; Sun, Yun Xu; Xiao, Jun Jun; Zhao, Xin Hui

2010-10-01

A model is proposed to study the average capacity optimization in free-space optical (FSO) channels, accounting for effects of atmospheric turbulence and pointing errors. For a given transmitter laser power, it is shown that both transmitter beam divergence angle and beam waist can be tuned to maximize the average capacity. Meanwhile, their optimum values strongly depend on the jitter and operation wavelength. These results can be helpful for designing FSO communication systems.
Assessment of Metronidazole Susceptibility in Helicobacter pylori: Statistical Validation and Error Rate Analysis of Breakpoints Determined by the Disk Diffusion Test

PubMed Central

Chaves, Sandra; Gadanho, Mário; Tenreiro, Rogério; Cabrita, José

1999-01-01

Metronidazole susceptibility of 100 Helicobacter pylori strains was assessed by determining the inhibition zone diameters by disk diffusion test and the MICs by agar dilution and PDM Epsilometer test (E test). Linear regression analysis was performed, allowing the definition of significant linear relations, and revealed correlations of disk diffusion results with both E-test and agar dilution results (r2 = 0.88 and 0.81, respectively). No significant differences (P = 0.84) were found between MICs defined by E test and those defined by agar dilution, taken as a standard. Reproducibility comparison between E-test and disk diffusion tests showed that they are equivalent and with good precision. Two interpretative susceptibility schemes (with or without an intermediate class) were compared by an interpretative error rate analysis method. The susceptibility classification scheme that included the intermediate category was retained, and breakpoints were assessed for diffusion assay with 5-μg metronidazole disks. Strains with inhibition zone diameters less than 16 mm were defined as resistant (MIC > 8 μg/ml), those with zone diameters equal to or greater than 16 mm but less than 21 mm were considered intermediate (4 μg/ml < MIC ≤ 8 μg/ml), and those with zone diameters of 21 mm or greater were regarded as susceptible (MIC ≤ 4 μg/ml). Error rate analysis applied to this classification scheme showed occurrence frequencies of 1% for major errors and 7% for minor errors, when the results were compared to those obtained by agar dilution. No very major errors were detected, suggesting that disk diffusion might be a good alternative for determining the metronidazole sensitivity of H. pylori strains. PMID:10203543
The relationship between twelve-month home stimulation and school achievement.

PubMed

van Doorninck, W J; Caldwell, B M; Wright, C; Frankenburg, W K

1981-09-01

Home Observation for Measurement of the Environment (HOME) was designed to reflect parental support of early cognitive and socioemotional development. 12-month HOME scores were correlated with elementary school achievement, 5--9 years later. 50 low-income children were rank ordered by a weighted average of centile estimates of achievement test scores, letter grades, and curriculum levels in reading and math. 24 children were classified as having significant school achievement problems. The HOME total score correlated significantly, r = .37, with school centile scores among the low-income families. The statistically more appropriate contingency table analysis revealed a 68% correct classification rate and a significantly reduced error rate over random or blanket prediction. The results supported the predictive value of the 12-month HOME for school achievement among low-income families. In an additional sample of 21 middle-income families, there was insufficient variability among HOME scores to allow prediction. The HOME total scores were highly correlated, r = .86, among siblings tested at least 10 months apart.
[A new method for the classification of neonates based on maturity and somatic development].

PubMed

Berkö, P

1992-03-01

The author points out the sad fact that the methods of estimating classifying, comparative examining of the maturity, somatic development of newborns and the methods of marking off the retarded newborns used up to now are essentially centering round the birth-weight. He exposes the errors, deficiencies of these methods and confronts them with the possibilities of the NDN-system (newborn's somatic development and nutritional state) worked out earlier by him. He thinks the NDF-system to be able to express simultaneously and exactly the gestational age, weight- and length-development, state of being fed of the newborns and the relation to the populational average, the fact and type of retardation, as well. The informational means of NDF-system in NDF-index. The NDF-system makes it possible to break down the birth-weight centric view as it offers a more suitable and qualified method than used before to describe the maturity and somatic development and the classifying of the newborns on the basis of these.

Estimation of Rainfall Sampling Uncertainty: A Comparison of Two Diverse Approaches

NASA Technical Reports Server (NTRS)

Steiner, Matthias; Zhang, Yu; Baeck, Mary Lynn; Wood, Eric F.; Smith, James A.; Bell, Thomas L.; Lau, William K. M. (Technical Monitor)

2002-01-01

The spatial and temporal intermittence of rainfall causes the averages of satellite observations of rain rate to differ from the "true" average rain rate over any given area and time period, even if the satellite observations are perfectly accurate. The difference of satellite averages based on occasional observation by satellite systems and the continuous-time average of rain rate is referred to as sampling error. In this study, rms sampling error estimates are obtained for average rain rates over boxes 100 km, 200 km, and 500 km on a side, for averaging periods of 1 day, 5 days, and 30 days. The study uses a multi-year, merged radar data product provided by Weather Services International Corp. at a resolution of 2 km in space and 15 min in time, over an area of the central U.S. extending from 35N to 45N in latitude and 100W to 80W in longitude. The intervals between satellite observations are assumed to be equal, and similar In size to what present and future satellite systems are able to provide (from 1 h to 12 h). The sampling error estimates are obtained using a resampling method called "resampling by shifts," and are compared to sampling error estimates proposed by Bell based on earlier work by Laughlin. The resampling estimates are found to scale with areal size and time period as the theory predicts. The dependence on average rain rate and time interval between observations is also similar to what the simple theory suggests.
A SVM framework for fault detection of the braking system in a high speed train

NASA Astrophysics Data System (ADS)

Liu, Jie; Li, Yan-Fu; Zio, Enrico

2017-03-01

In April 2015, the number of operating High Speed Trains (HSTs) in the world has reached 3603. An efficient, effective and very reliable braking system is evidently very critical for trains running at a speed around 300 km/h. Failure of a highly reliable braking system is a rare event and, consequently, informative recorded data on fault conditions are scarce. This renders the fault detection problem a classification problem with highly unbalanced data. In this paper, a Support Vector Machine (SVM) framework, including feature selection, feature vector selection, model construction and decision boundary optimization, is proposed for tackling this problem. Feature vector selection can largely reduce the data size and, thus, the computational burden. The constructed model is a modified version of the least square SVM, in which a higher cost is assigned to the error of classification of faulty conditions than the error of classification of normal conditions. The proposed framework is successfully validated on a number of public unbalanced datasets. Then, it is applied for the fault detection of braking systems in HST: in comparison with several SVM approaches for unbalanced datasets, the proposed framework gives better results.
Semi-supervised anomaly detection - towards model-independent searches of new physics

NASA Astrophysics Data System (ADS)

Kuusela, Mikael; Vatanen, Tommi; Malmi, Eric; Raiko, Tapani; Aaltonen, Timo; Nagai, Yoshikazu

2012-06-01

Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors should this training data be systematically inaccurate for example due to the assumed MC model. To complement such model-dependent searches, we propose an algorithm based on semi-supervised anomaly detection techniques, which does not require a MC training sample for the signal data. We first model the background using a multivariate Gaussian mixture model. We then search for deviations from this model by fitting to the observations a mixture of the background model and a number of additional Gaussians. This allows us to perform pattern recognition of any anomalous excess over the background. We show by a comparison to neural network classifiers that such an approach is a lot more robust against misspecification of the signal MC than supervised classification. In cases where there is an unexpected signal, a neural network might fail to correctly identify it, while anomaly detection does not suffer from such a limitation. On the other hand, when there are no systematic errors in the training data, both methods perform comparably.
A Q-backpropagated time delay neural network for diagnosing severity of gait disturbances in Parkinson's disease.

PubMed

Nancy Jane, Y; Khanna Nehemiah, H; Arputharaj, Kannan

2016-04-01

Parkinson's disease (PD) is a movement disorder that affects the patient's nervous system and health-care applications mostly uses wearable sensors to collect these data. Since these sensors generate time stamped data, analyzing gait disturbances in PD becomes challenging task. The objective of this paper is to develop an effective clinical decision-making system (CDMS) that aids the physician in diagnosing the severity of gait disturbances in PD affected patients. This paper presents a Q-backpropagated time delay neural network (Q-BTDNN) classifier that builds a temporal classification model, which performs the task of classification and prediction in CDMS. The proposed Q-learning induced backpropagation (Q-BP) training algorithm trains the Q-BTDNN by generating a reinforced error signal. The network's weights are adjusted through backpropagating the generated error signal. For experimentation, the proposed work uses a PD gait database, which contains gait measures collected through wearable sensors from three different PD research studies. The experimental result proves the efficiency of Q-BP in terms of its improved classification accuracy of 91.49%, 92.19% and 90.91% with three datasets accordingly compared to other neural network training algorithms. Copyright © 2016 Elsevier Inc. All rights reserved.
Bayesian logistic regression approaches to predict incorrect DRG assignment.

PubMed

Suleiman, Mani; Demirhan, Haydar; Boyd, Leanne; Girosi, Federico; Aksakalli, Vural

2018-05-07

Episodes of care involving similar diagnoses and treatments and requiring similar levels of resource utilisation are grouped to the same Diagnosis-Related Group (DRG). In jurisdictions which implement DRG based payment systems, DRGs are a major determinant of funding for inpatient care. Hence, service providers often dedicate auditing staff to the task of checking that episodes have been coded to the correct DRG. The use of statistical models to estimate an episode's probability of DRG error can significantly improve the efficiency of clinical coding audits. This study implements Bayesian logistic regression models with weakly informative prior distributions to estimate the likelihood that episodes require a DRG revision, comparing these models with each other and to classical maximum likelihood estimates. All Bayesian approaches had more stable model parameters than maximum likelihood. The best performing Bayesian model improved overall classification per- formance by 6% compared to maximum likelihood, with a 34% gain compared to random classification, respectively. We found that the original DRG, coder and the day of coding all have a significant effect on the likelihood of DRG error. Use of Bayesian approaches has improved model parameter stability and classification accuracy. This method has already lead to improved audit efficiency in an operational capacity.
Development of classification models to detect Salmonella Enteritidis and Salmonella Typhimurium found in poultry carcass rinses by visible-near infrared hyperspectral imaging

NASA Astrophysics Data System (ADS)

Seo, Young Wook; Yoon, Seung Chul; Park, Bosoon; Hinton, Arthur; Windham, William R.; Lawrence, Kurt C.

2013-05-01

Salmonella is a major cause of foodborne disease outbreaks resulting from the consumption of contaminated food products in the United States. This paper reports the development of a hyperspectral imaging technique for detecting and differentiating two of the most common Salmonella serotypes, Salmonella Enteritidis (SE) and Salmonella Typhimurium (ST), from background microflora that are often found in poultry carcass rinse. Presumptive positive screening of colonies with a traditional direct plating method is a labor intensive and time consuming task. Thus, this paper is concerned with the detection of differences in spectral characteristics among the pure SE, ST, and background microflora grown on brilliant green sulfa (BGS) and xylose lysine tergitol 4 (XLT4) agar media with a spread plating technique. Visible near-infrared hyperspectral imaging, providing the spectral and spatial information unique to each microorganism, was utilized to differentiate SE and ST from the background microflora. A total of 10 classification models, including five machine learning algorithms, each without and with principal component analysis (PCA), were validated and compared to find the best model in classification accuracy. The five machine learning (classification) algorithms used in this study were Mahalanobis distance (MD), k-nearest neighbor (kNN), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM). The average classification accuracy of all 10 models on a calibration (or training) set of the pure cultures on BGS agar plates was 98% (Kappa coefficient = 0.95) in determining the presence of SE and/or ST although it was difficult to differentiate between SE and ST. The average classification accuracy of all 10 models on a training set for ST detection on XLT4 agar was over 99% (Kappa coefficient = 0.99) although SE colonies on XLT4 agar were difficult to differentiate from background microflora. The average classification accuracy of all 10 models on a validation set of chicken carcass rinses spiked with SE or ST and incubated on BGS agar plates was 94.45% and 83.73%, without and with PCA for classification, respectively. The best performing classification model on the validation set was QDA without PCA by achieving the classification accuracy of 98.65% (Kappa coefficient=0.98). The overall best performing classification model regardless of using PCA was MD with the classification accuracy of 94.84% (Kappa coefficient=0.88) on the validation set.
Low-back electromyography (EMG) data-driven load classification for dynamic lifting tasks

PubMed Central

Ojeda, Lauro; Johnson, Daniel D.; Gates, Deanna; Mower Provost, Emily; Barton, Kira

2018-01-01

Objective Numerous devices have been designed to support the back during lifting tasks. To improve the utility of such devices, this research explores the use of preparatory muscle activity to classify muscle loading and initiate appropriate device activation. The goal of this study was to determine the earliest time window that enabled accurate load classification during a dynamic lifting task. Methods Nine subjects performed thirty symmetrical lifts, split evenly across three weight conditions (no-weight, 10-lbs and 24-lbs), while low-back muscle activity data was collected. Seven descriptive statistics features were extracted from 100 ms windows of data. A multinomial logistic regression (MLR) classifier was trained and tested, employing leave-one subject out cross-validation, to classify lifted load values. Dimensionality reduction was achieved through feature cross-correlation analysis and greedy feedforward selection. The time of full load support by the subject was defined as load-onset. Results Regions of highest average classification accuracy started at 200 ms before until 200 ms after load-onset with average accuracies ranging from 80% (±10%) to 81% (±7%). The average recall for each class ranged from 69–92%. Conclusion These inter-subject classification results indicate that preparatory muscle activity can be leveraged to identify the intent to lift a weight up to 100 ms prior to load-onset. The high accuracies shown indicate the potential to utilize intent classification for assistive device applications. Significance Active assistive devices, e.g. exoskeletons, could prevent back injury by off-loading low-back muscles. Early intent classification allows more time for actuators to respond and integrate seamlessly with the user. PMID:29447252
A data-driven modeling approach to stochastic computation for low-energy biomedical devices.

PubMed

Lee, Kyong Ho; Jang, Kuk Jin; Shoeb, Ali; Verma, Naveen

2011-01-01

Low-power devices that can detect clinically relevant correlations in physiologically-complex patient signals can enable systems capable of closed-loop response (e.g., controlled actuation of therapeutic stimulators, continuous recording of disease states, etc.). In ultra-low-power platforms, however, hardware error sources are becoming increasingly limiting. In this paper, we present how data-driven methods, which allow us to accurately model physiological signals, also allow us to effectively model and overcome prominent hardware error sources with nearly no additional overhead. Two applications, EEG-based seizure detection and ECG-based arrhythmia-beat classification, are synthesized to a logic-gate implementation, and two prominent error sources are introduced: (1) SRAM bit-cell errors and (2) logic-gate switching errors ('stuck-at' faults). Using patient data from the CHB-MIT and MIT-BIH databases, performance similar to error-free hardware is achieved even for very high fault rates (up to 0.5 for SRAMs and 7 × 10(-2) for logic) that cause computational bit error rates as high as 50%.
Association of medication errors with drug classifications, clinical units, and consequence of errors: Are they related?

PubMed

Muroi, Maki; Shen, Jay J; Angosta, Alona

2017-02-01

Registered nurses (RNs) play an important role in safe medication administration and patient safety. This study examined a total of 1276 medication error (ME) incident reports made by RNs in hospital inpatient settings in the southwestern region of the United States. The most common drug class associated with MEs was cardiovascular drugs (24.7%). Among this class, anticoagulants had the most errors (11.3%). The antimicrobials was the second most common drug class associated with errors (19.1%) and vancomycin was the most common antimicrobial that caused errors in this category (6.1%). MEs occurred more frequently in the medical-surgical and intensive care units than any other hospital units. Ten percent of MEs reached the patients with harm and 11% reached the patients with increased monitoring. Understanding the contributing factors related to MEs, addressing and eliminating risk of errors across hospital units, and providing education and resources for nurses may help reduce MEs. Copyright © 2016 Elsevier Inc. All rights reserved.
Classification Management. Journal of the National Classification Management Society, Volume 18, 1982,

DTIC Science & Technology

1983-01-01

changes. Concurrently, CIA formed and AD HOC esting to step back and look at the U.S. security Intelligence Community Working Group to re...administrative error; to prevent embarrassment to expected damage will be. If you foresee the dam- a person, organization, or agency; to restrain com- age...the decision will be to classify the informa- petition; or to pTevent or delay the public release of tion. But note that in this thought process, you
Validation of tool mark analysis of cut costal cartilage.

PubMed

Love, Jennifer C; Derrick, Sharon M; Wiersema, Jason M; Peters, Charles

2012-03-01

This study was designed to establish the potential error rate associated with the generally accepted method of tool mark analysis of cut marks in costal cartilage. Three knives with different blade types were used to make experimental cut marks in costal cartilage of pigs. Each cut surface was cast, and each cast was examined by three analysts working independently. The presence of striations, regularity of striations, and presence of a primary and secondary striation pattern were recorded for each cast. The distance between each striation was measured. The results showed that striations were not consistently impressed on the cut surface by the blade's cutting edge. Also, blade type classification by the presence or absence of striations led to a 65% misclassification rate. Use of the classification tree and cross-validation methods and inclusion of the mean interstriation distance decreased the error rate to c. 50%. © 2011 American Academy of Forensic Sciences.
Automatic and semi-automatic approaches for arteriolar-to-venular computation in retinal photographs

NASA Astrophysics Data System (ADS)

Mendonça, Ana Maria; Remeseiro, Beatriz; Dashtbozorg, Behdad; Campilho, Aurélio

2017-03-01

The Arteriolar-to-Venular Ratio (AVR) is a popular dimensionless measure which allows the assessment of patients' condition for the early diagnosis of different diseases, including hypertension and diabetic retinopathy. This paper presents two new approaches for AVR computation in retinal photographs which include a sequence of automated processing steps: vessel segmentation, caliber measurement, optic disc segmentation, artery/vein classification, region of interest delineation, and AVR calculation. Both approaches have been tested on the INSPIRE-AVR dataset, and compared with a ground-truth provided by two medical specialists. The obtained results demonstrate the reliability of the fully automatic approach which provides AVR ratios very similar to at least one of the observers. Furthermore, the semi-automatic approach, which includes the manual modification of the artery/vein classification if needed, allows to significantly reduce the error to a level below the human error.
Bayes classification of terrain cover using normalized polarimetric data

NASA Technical Reports Server (NTRS)

Yueh, H. A.; Swartz, A. A.; Kong, J. A.; Shin, R. T.; Novak, L. M.

1988-01-01

The normalized polarimetric classifier (NPC) which uses only the relative magnitudes and phases of the polarimetric data is proposed for discrimination of terrain elements. The probability density functions (PDFs) of polarimetric data are assumed to have a complex Gaussian distribution, and the marginal PDF of the normalized polarimetric data is derived by adopting the Euclidean norm as the normalization function. The general form of the distance measure for the NPC is also obtained. It is demonstrated that for polarimetric data with an arbitrary PDF, the distance measure of NPC will be independent of the normalization function selected even when the classifier is mistrained. A complex Gaussian distribution is assumed for the polarimetric data consisting of grass and tree regions. The probability of error for the NPC is compared with those of several other single-feature classifiers. The classification error of NPCs is shown to be independent of the normalization function.
The intercrater plains of Mercury and the Moon: Their nature, origin and role in terrestrial planet evolution. Measurement and errors of crater statistics. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Leake, M. A.

1982-01-01

Planetary imagery techniques, errors in measurement or degradation assignment, and statistical formulas are presented with respect to cratering data. Base map photograph preparation, measurement of crater diameters and sampled area, and instruments used are discussed. Possible uncertainties, such as Sun angle, scale factors, degradation classification, and biases in crater recognition are discussed. The mathematical formulas used in crater statistics are presented.
Radiomic machine-learning classifiers for prognostic biomarkers of advanced nasopharyngeal carcinoma.

PubMed

Zhang, Bin; He, Xin; Ouyang, Fusheng; Gu, Dongsheng; Dong, Yuhao; Zhang, Lu; Mo, Xiaokai; Huang, Wenhui; Tian, Jie; Zhang, Shuixing

2017-09-10

We aimed to identify optimal machine-learning methods for radiomics-based prediction of local failure and distant failure in advanced nasopharyngeal carcinoma (NPC). We enrolled 110 patients with advanced NPC. A total of 970 radiomic features were extracted from MRI images for each patient. Six feature selection methods and nine classification methods were evaluated in terms of their performance. We applied the 10-fold cross-validation as the criterion for feature selection and classification. We repeated each combination for 50 times to obtain the mean area under the curve (AUC) and test error. We observed that the combination methods Random Forest (RF) + RF (AUC, 0.8464 ± 0.0069; test error, 0.3135 ± 0.0088) had the highest prognostic performance, followed by RF + Adaptive Boosting (AdaBoost) (AUC, 0.8204 ± 0.0095; test error, 0.3384 ± 0.0097), and Sure Independence Screening (SIS) + Linear Support Vector Machines (LSVM) (AUC, 0.7883 ± 0.0096; test error, 0.3985 ± 0.0100). Our radiomics study identified optimal machine-learning methods for the radiomics-based prediction of local failure and distant failure in advanced NPC, which could enhance the applications of radiomics in precision oncology and clinical practice. Copyright © 2017 Elsevier B.V. All rights reserved.
Average Likelihood Methods for Code Division Multiple Access (CDMA)

DTIC Science & Technology

2014-05-01

lengths in the range of 22 to 213 and possibly higher. Keywords: DS / CDMA signals, classification, balanced CDMA load, synchronous CDMA , decision...likelihood ratio test (ALRT). We begin this classification problem by finding the size of the spreading matrix that generated the DS - CDMA signal. As...Theoretical Background The classification of DS / CDMA signals should not be confused with the problem of multiuser detection. The multiuser detection deals
Learning classification trees

NASA Technical Reports Server (NTRS)

Buntine, Wray

1991-01-01

Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4 and Breiman et al. Cart show the full Bayesian algorithm is consistently as good, or more accurate than these other approaches though at a computational price.
Spectral-spatial classification using tensor modeling for cancer detection with hyperspectral imaging

NASA Astrophysics Data System (ADS)

Lu, Guolan; Halig, Luma; Wang, Dongsheng; Chen, Zhuo Georgia; Fei, Baowei

2014-03-01

As an emerging technology, hyperspectral imaging (HSI) combines both the chemical specificity of spectroscopy and the spatial resolution of imaging, which may provide a non-invasive tool for cancer detection and diagnosis. Early detection of malignant lesions could improve both survival and quality of life of cancer patients. In this paper, we introduce a tensor-based computation and modeling framework for the analysis of hyperspectral images to detect head and neck cancer. The proposed classification method can distinguish between malignant tissue and healthy tissue with an average sensitivity of 96.97% and an average specificity of 91.42% in tumor-bearing mice. The hyperspectral imaging and classification technology has been demonstrated in animal models and can have many potential applications in cancer research and management.
Classification of E-Nose Aroma Data of Four Fruit Types by ABC-Based Neural Network

PubMed Central

Adak, M. Fatih; Yumusak, Nejat

2016-01-01

Electronic nose technology is used in many areas, and frequently in the beverage industry for classification and quality-control purposes. In this study, four different aroma data (strawberry, lemon, cherry, and melon) were obtained using a MOSES II electronic nose for the purpose of fruit classification. To improve the performance of the classification, the training phase of the neural network with two hidden layers was optimized using artificial bee colony algorithm (ABC), which is known to be successful in exploration. Test data were given to two different neural networks, each of which were trained separately with backpropagation (BP) and ABC, and average test performances were measured as 60% for the artificial neural network trained with BP and 76.39% for the artificial neural network trained with ABC. Training and test phases were repeated 30 times to obtain these average performance measurements. This level of performance shows that the artificial neural network trained with ABC is successful in classifying aroma data. PMID:26927124
Classification of E-Nose Aroma Data of Four Fruit Types by ABC-Based Neural Network.

PubMed

Adak, M Fatih; Yumusak, Nejat

2016-02-27

Electronic nose technology is used in many areas, and frequently in the beverage industry for classification and quality-control purposes. In this study, four different aroma data (strawberry, lemon, cherry, and melon) were obtained using a MOSES II electronic nose for the purpose of fruit classification. To improve the performance of the classification, the training phase of the neural network with two hidden layers was optimized using artificial bee colony algorithm (ABC), which is known to be successful in exploration. Test data were given to two different neural networks, each of which were trained separately with backpropagation (BP) and ABC, and average test performances were measured as 60% for the artificial neural network trained with BP and 76.39% for the artificial neural network trained with ABC. Training and test phases were repeated 30 times to obtain these average performance measurements. This level of performance shows that the artificial neural network trained with ABC is successful in classifying aroma data.

High Dimensional Classification Using Features Annealed Independence Rules.

PubMed

Fan, Jianqing; Fan, Yingying

2008-01-01

Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classifications is largely poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as the random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as bad as the random guessing. Thus, it is paramountly important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics are proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
A label distance maximum-based classifier for multi-label learning.

PubMed

Liu, Xiaoli; Bao, Hang; Zhao, Dazhe; Cao, Peng

2015-01-01

Multi-label classification is useful in many bioinformatics tasks such as gene function prediction and protein site localization. This paper presents an improved neural network algorithm, Max Label Distance Back Propagation Algorithm for Multi-Label Classification. The method was formulated by modifying the total error function of the standard BP by adding a penalty term, which was realized by maximizing the distance between the positive and negative labels. Extensive experiments were conducted to compare this method against state-of-the-art multi-label methods on three popular bioinformatic benchmark datasets. The results illustrated that this proposed method is more effective for bioinformatic multi-label classification compared to commonly used techniques.
Multidate mapping of mosquito habitat. [Nebraska, South Dakota

NASA Technical Reports Server (NTRS)

Woodzick, T. L.; Maxwell, E. L.

1977-01-01

LANDSAT data from three overpasses formed the data base for a multidate classification of 15 ground cover categories in the margins of Lewis and Clark Lake, a fresh water impoundment between South Dakota and Nebraska. When scaled to match topographic maps of the area, the ground cover classification maps were used as a general indicator of potential mosquito-breeding habitat by distinguishing productive wetlands areas from nonproductive nonwetlands areas. The 12 channel multidate classification was found to have an accuracy 23% higher than the average of the three single date 4 channel classifications.
Lacie phase 1 Classification and Mensuration Subsystem (CAMS) rework experiment

NASA Technical Reports Server (NTRS)

Chhikara, R. S.; Hsu, E. M.; Liszcz, C. J.

1976-01-01

An experiment was designed to test the ability of the Classification and Mensuration Subsystem rework operations to improve wheat proportion estimates for segments that had been processed previously. Sites selected for the experiment included three in Kansas and three in Texas, with the remaining five distributed in Montana and North and South Dakota. The acquisition dates were selected to be representative of imagery available in actual operations. No more than one acquisition per biophase were used, and biophases were determined by actual crop calendars. All sites were worked by each of four Analyst-Interpreter/Data Processing Analyst Teams who reviewed the initial processing of each segment and accepted or reworked it for an estimate of the proportion of small grains in the segment. Classification results, acquisitions and classification errors and performance results between CAMS regular and ITS rework are tabulated.
Global Surface Temperature Change and Uncertainties Since 1861

NASA Technical Reports Server (NTRS)

Shen, Samuel S. P.; Lau, William K. M. (Technical Monitor)

2002-01-01

The objective of this talk is to analyze the warming trend and its uncertainties of the global and hemi-spheric surface temperatures. By the method of statistical optimal averaging scheme, the land surface air temperature and sea surface temperature observational data are used to compute the spatial average annual mean surface air temperature. The optimal averaging method is derived from the minimization of the mean square error between the true and estimated averages and uses the empirical orthogonal functions. The method can accurately estimate the errors of the spatial average due to observational gaps and random measurement errors. In addition, quantified are three independent uncertainty factors: urbanization, change of the in situ observational practices and sea surface temperature data corrections. Based on these uncertainties, the best linear fit to annual global surface temperature gives an increase of 0.61 +/- 0.16 C between 1861 and 2000. This lecture will also touch the topics on the impact of global change on nature and environment. as well as the latest assessment methods for the attributions of global change.
Sample Errors Call Into Question Conclusions Regarding Same-Sex Married Parents: A Comment on "Family Structure and Child Health: Does the Sex Composition of Parents Matter?"

PubMed

Paul Sullins, D

2017-12-01

Because of classification errors reported by the National Center for Health Statistics, an estimated 42 % of the same-sex married partners in the sample for this study are misclassified different-sex married partners, thus calling into question findings regarding same-sex married parents. Including biological parentage as a control variable suppresses same-sex/different-sex differences, thus obscuring the data error. Parentage is not appropriate as a control because it correlates nearly perfectly (+.97, gamma) with the same-sex/different-sex distinction and is invariant for the category of joint biological parents.
Image Augmentation for Object Image Classification Based On Combination of Pre-Trained CNN and SVM

NASA Astrophysics Data System (ADS)

Shima, Yoshihiro

2018-04-01

Neural networks are a powerful means of classifying object images. The proposed image category classification method for object images combines convolutional neural networks (CNNs) and support vector machines (SVMs). A pre-trained CNN, called Alex-Net, is used as a pattern-feature extractor. Alex-Net is pre-trained for the large-scale object-image dataset ImageNet. Instead of training, Alex-Net, pre-trained for ImageNet is used. An SVM is used as trainable classifier. The feature vectors are passed to the SVM from Alex-Net. The STL-10 dataset are used as object images. The number of classes is ten. Training and test samples are clearly split. STL-10 object images are trained by the SVM with data augmentation. We use the pattern transformation method with the cosine function. We also apply some augmentation method such as rotation, skewing and elastic distortion. By using the cosine function, the original patterns were left-justified, right-justified, top-justified, or bottom-justified. Patterns were also center-justified and enlarged. Test error rate is decreased by 0.435 percentage points from 16.055% by augmentation with cosine transformation. Error rates are increased by other augmentation method such as rotation, skewing and elastic distortion, compared without augmentation. Number of augmented data is 30 times that of the original STL-10 5K training samples. Experimental test error rate for the test 8k STL-10 object images was 15.620%, which shows that image augmentation is effective for image category classification.
Applying Cost-Sensitive Extreme Learning Machine and Dissimilarity Integration to Gene Expression Data Classification.

PubMed

Liu, Yanqiu; Lu, Huijuan; Yan, Ke; Xia, Haixia; An, Chunlin

2016-01-01

Embedding cost-sensitive factors into the classifiers increases the classification stability and reduces the classification costs for classifying high-scale, redundant, and imbalanced datasets, such as the gene expression data. In this study, we extend our previous work, that is, Dissimilar ELM (D-ELM), by introducing misclassification costs into the classifier. We name the proposed algorithm as the cost-sensitive D-ELM (CS-D-ELM). Furthermore, we embed rejection cost into the CS-D-ELM to increase the classification stability of the proposed algorithm. Experimental results show that the rejection cost embedded CS-D-ELM algorithm effectively reduces the average and overall cost of the classification process, while the classification accuracy still remains competitive. The proposed method can be extended to classification problems of other redundant and imbalanced data.
Automated detection of cloud and cloud-shadow in single-date Landsat imagery using neural networks and spatial post-processing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hughes, Michael J.; Hayes, Daniel J

2014-01-01

Use of Landsat data to answer ecological questions is contingent on the effective removal of cloud and cloud shadow from satellite images. We develop a novel algorithm to identify and classify clouds and cloud shadow, \\textsc{sparcs}: Spacial Procedures for Automated Removal of Cloud and Shadow. The method uses neural networks to determine cloud, cloud-shadow, water, snow/ice, and clear-sky membership of each pixel in a Landsat scene, and then applies a set of procedures to enforce spatial rules. In a comparison to FMask, a high-quality cloud and cloud-shadow classification algorithm currently available, \\textsc{sparcs} performs favorably, with similar omission errors for cloudsmore » (0.8% and 0.9%, respectively), substantially lower omission error for cloud-shadow (8.3% and 1.1%), and fewer errors of commission (7.8% and 5.0%). Additionally, textsc{sparcs} provides a measure of uncertainty in its classification that can be exploited by other processes that use the cloud and cloud-shadow detection. To illustrate this, we present an application that constructs obstruction-free composites of images acquired on different dates in support of algorithms detecting vegetation change.« less
Discrimination of Aspergillus isolates at the species and strain level by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry fingerprinting.

PubMed

Hettick, Justin M; Green, Brett J; Buskirk, Amanda D; Kashon, Michael L; Slaven, James E; Janotka, Erika; Blachere, Francoise M; Schmechel, Detlef; Beezhold, Donald H

2008-09-15

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) was used to generate highly reproducible mass spectral fingerprints for 12 species of fungi of the genus Aspergillus and 5 different strains of Aspergillus flavus. Prior to MALDI-TOF MS analysis, the fungi were subjected to three 1-min bead beating cycles in an acetonitrile/trifluoroacetic acid solvent. The mass spectra contain abundant peaks in the range of 5 to 20kDa and may be used to discriminate between species unambiguously. A discriminant analysis using all peaks from the MALDI-TOF MS data yielded error rates for classification of 0 and 18.75% for resubstitution and cross-validation methods, respectively. If a subset of 28 significant peaks is chosen, resubstitution and cross-validation error rates are 0%. Discriminant analysis of the MALDI-TOF MS data for 5 strains of A. flavus using all peaks yielded error rates for classification of 0 and 5% for resubstitution and cross-validation methods, respectively. These data indicate that MALDI-TOF MS data may be used for unambiguous identification of members of the genus Aspergillus at both the species and strain levels.
Identifying medication error chains from critical incident reports: a new analytic approach.

PubMed

Huckels-Baumgart, Saskia; Manser, Tanja

2014-10-01

Research into the distribution of medication errors usually focuses on isolated stages within the medication use process. Our study aimed to provide a novel process-oriented approach to medication incident analysis focusing on medication error chains. Our study was conducted across a 900-bed teaching hospital in Switzerland. All reported 1,591 medication errors 2009-2012 were categorized using the Medication Error Index NCC MERP and the WHO Classification for Patient Safety Methodology. In order to identify medication error chains, each reported medication incident was allocated to the relevant stage of the hospital medication use process. Only 25.8% of the reported medication errors were detected before they propagated through the medication use process. The majority of medication errors (74.2%) formed an error chain encompassing two or more stages. The most frequent error chain comprised preparation up to and including medication administration (45.2%). "Non-consideration of documentation/prescribing" during the drug preparation was the most frequent contributor for "wrong dose" during the administration of medication. Medication error chains provide important insights for detecting and stopping medication errors before they reach the patient. Existing and new safety barriers need to be extended to interrupt error chains and to improve patient safety. © 2014, The American College of Clinical Pharmacology.
Impact of sensor's point spread function on land cover characterization: Assessment and deconvolution

USGS Publications Warehouse

Huang, C.; Townshend, J.R.G.; Liang, S.; Kalluri, S.N.V.; DeFries, R.S.

2002-01-01

Measured and modeled point spread functions (PSF) of sensor systems indicate that a significant portion of the recorded signal of each pixel of a satellite image originates from outside the area represented by that pixel. This hinders the ability to derive surface information from satellite images on a per-pixel basis. In this study, the impact of the PSF of the Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m bands was assessed using four images representing different landscapes. Experimental results showed that though differences between pixels derived with and without PSF effects were small on the average, the PSF generally brightened dark objects and darkened bright objects. This impact of the PSF lowered the performance of a support vector machine (SVM) classifier by 5.4% in overall accuracy and increased the overall root mean square error (RMSE) by 2.4% in estimating subpixel percent land cover. An inversion method based on the known PSF model reduced the signals originating from surrounding areas by as much as 53%. This method differs from traditional PSF inversion deconvolution methods in that the PSF was adjusted with lower weighting factors for signals originating from neighboring pixels than those specified by the PSF model. By using this deconvolution method, the lost classification accuracy due to residual impact of PSF effects was reduced to only 1.66% in overall accuracy. The increase in the RMSE of estimated subpixel land cover proportions due to the residual impact of PSF effects was reduced to 0.64%. Spatial aggregation also effectively reduced the errors in estimated land cover proportion images. About 50% of the estimation errors were removed after applying the deconvolution method and aggregating derived proportion images to twice their dimensional pixel size. ?? 2002 Elsevier Science Inc. All rights reserved.
Analyzing the Potential for Unmanned Aerial Systems (UAS) Photogrammetry in Estimating Surface Deformations at a Geothermal Fiel

NASA Astrophysics Data System (ADS)

Pai, H.; Burnett, J.; Sladek, C.; Wing, M.; Feigl, K. L.; Selker, J. S.; Tyler, S.; Team, P.

2016-12-01

UAS systems equipped with a variety of spectral imaging devices are increasingly incorporated in spatial environmental assessments of continental surfaces (e.g., digital elevation maps, vegetative coverage classifications, surface temperatures). This presented work performed by the UAS team at the Center for Transformative Environmental Monitoring Programs (AirCTEMPS) examines the potential to measure small (sub-cm) deformation from a geothermal injection experiment at Brady's geothermal field in western Nevada (USA). Areal mapping of the 700 x 270 m area of interest was conducted with a nadir pointing Sony A5100 digital camera onboard an autopiloted quadcopter. A total of 16 ground control points were installed using a TopCon GR3 GPS receiver. Two such mapping campaigns were conducted with one before and one after an anticipated surface deformation event. A digital elevation map (DEM) for each time period was created from over 1500 images having 80% overlap/sidelap by using structure from motion (SfM) via Agisoft Photoscan software. The resulting DEM resolution was 8 mm/pixel with residual aerial triangulation errors was < 5 mm. We present preliminary results from an optimized workflow which achieved errors and average differential DEM heights between campaigns at the cm-scale which is broader than the maximum expected deformation. Despite the disconnect between error and deformation severity, this study presents a unique application of sub-cm UAS-based DEMs and further distinguishes itself by comparing results to concurrent Interferometric Synthetic Radar (InSAR). The intent of our study and presentation of results is to streamline, cross-validate, and share methods to encourage further adoption of UAS imagery into the standard toolkit for environmental surface sensing across spatial scales.
Methods for data classification

DOEpatents

Garrity, George [Okemos, MI; Lilburn, Timothy G [Front Royal, VA

2011-10-11

The present invention provides methods for classifying data and uncovering and correcting annotation errors. In particular, the present invention provides a self-organizing, self-correcting algorithm for use in classifying data. Additionally, the present invention provides a method for classifying biological taxa.
Vector velocity volume flow estimation: Sources of error and corrections applied for arteriovenous fistulas.

PubMed

Jensen, Jonas; Olesen, Jacob Bjerring; Stuart, Matthias Bo; Hansen, Peter Møller; Nielsen, Michael Bachmann; Jensen, Jørgen Arendt

2016-08-01

A method for vector velocity volume flow estimation is presented, along with an investigation of its sources of error and correction of actual volume flow measurements. Volume flow errors are quantified theoretically by numerical modeling, through flow phantom measurements, and studied in vivo. This paper investigates errors from estimating volumetric flow using a commercial ultrasound scanner and the common assumptions made in the literature. The theoretical model shows, e.g. that volume flow is underestimated by 15%, when the scan plane is off-axis with the vessel center by 28% of the vessel radius. The error sources were also studied in vivo under realistic clinical conditions, and the theoretical results were applied for correcting the volume flow errors. Twenty dialysis patients with arteriovenous fistulas were scanned to obtain vector flow maps of fistulas. When fitting an ellipsis to cross-sectional scans of the fistulas, the major axis was on average 10.2mm, which is 8.6% larger than the minor axis. The ultrasound beam was on average 1.5mm from the vessel center, corresponding to 28% of the semi-major axis in an average fistula. Estimating volume flow with an elliptical, rather than circular, vessel area and correcting the ultrasound beam for being off-axis, gave a significant (p=0.008) reduction in error from 31.2% to 24.3%. The error is relative to the Ultrasound Dilution Technique, which is considered the gold standard for volume flow estimation for dialysis patients. The study shows the importance of correcting for volume flow errors, which are often made in clinical practice. Copyright © 2016 Elsevier B.V. All rights reserved.
Evaluation and modification of five techniques for estimating stormwater runoff for watersheds in west-central Florida

USGS Publications Warehouse

Trommer, J.T.; Loper, J.E.; Hammett, K.M.

1996-01-01

Several traditional techniques have been used for estimating stormwater runoff from ungaged watersheds. Applying these techniques to water- sheds in west-central Florida requires that some of the empirical relationships be extrapolated beyond tested ranges. As a result, there is uncertainty as to the accuracy of these estimates. Sixty-six storms occurring in 15 west-central Florida watersheds were initially modeled using the Rational Method, the U.S. Geological Survey Regional Regression Equations, the Natural Resources Conservation Service TR-20 model, the U.S. Army Corps of Engineers Hydrologic Engineering Center-1 model, and the Environmental Protection Agency Storm Water Management Model. The techniques were applied according to the guidelines specified in the user manuals or standard engineering textbooks as though no field data were available and the selection of input parameters was not influenced by observed data. Computed estimates were compared with observed runoff to evaluate the accuracy of the techniques. One watershed was eliminated from further evaluation when it was determined that the area contributing runoff to the stream varies with the amount and intensity of rainfall. Therefore, further evaluation and modification of the input parameters were made for only 62 storms in 14 watersheds. Runoff ranged from 1.4 to 99.3 percent percent of rainfall. The average runoff for all watersheds included in this study was about 36 percent of rainfall. The average runoff for the urban, natural, and mixed land-use watersheds was about 41, 27, and 29 percent, respectively. Initial estimates of peak discharge using the rational method produced average watershed errors that ranged from an underestimation of 50.4 percent to an overestimation of 767 percent. The coefficient of runoff ranged from 0.20 to 0.60. Calibration of the technique produced average errors that ranged from an underestimation of 3.3 percent to an overestimation of 1.5 percent. The average calibrated coefficient of runoff for each watershed ranged from 0.02 to 0.72. The average values of the coefficient of runoff necessary to calibrate the urban, natural, and mixed land-use watersheds were 0.39, 0.16, and 0.08, respectively. The U.S. Geological Survey regional regression equations for determining peak discharge produced errors that ranged from an underestimation of 87.3 percent to an over- estimation of 1,140 percent. The regression equations for determining runoff volume produced errors that ranged from an underestimation of 95.6 percent to an overestimation of 324 percent. Regression equations developed from data used for this study produced errors that ranged between an underestimation of 82.8 percent and an over- estimation of 328 percent for peak discharge, and from an underestimation of 71.2 percent to an overestimation of 241 percent for runoff volume. Use of the equations developed for west-central Florida streams produced average errors for each type of watershed that were lower than errors associated with use of the U.S. Geological Survey equations. Initial estimates of peak discharges and runoff volumes using the Natural Resources Conservation Service TR-20 model, produced average errors of 44.6 and 42.7 percent respectively, for all the watersheds. Curve numbers and times of concentration were adjusted to match estimated and observed peak discharges and runoff volumes. The average change in the curve number for all the watersheds was a decrease of 2.8 percent. The average change in the time of concentration was an increase of 59.2 percent. The shape of the input dimensionless unit hydrograph also had to be adjusted to match the shape and peak time of the estimated and observed flood hydrographs. Peak rate factors for the modified input dimensionless unit hydrographs ranged from 162 to 454. The mean errors for peak discharges and runoff volumes were reduced to 18.9 and 19.5 percent, respectively, using the average calibrated input parameters for ea
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yan, H; Chen, Z; Nath, R

Purpose: kV fluoroscopic imaging combined with MV treatment beam imaging has been investigated for intrafractional motion monitoring and correction. It is, however, subject to additional kV imaging dose to normal tissue. To balance tracking accuracy and imaging dose, we previously proposed an adaptive imaging strategy to dynamically decide future imaging type and moments based on motion tracking uncertainty. kV imaging may be used continuously for maximal accuracy or only when the position uncertainty (probability of out of threshold) is high if a preset imaging dose limit is considered. In this work, we propose more accurate methods to estimate tracking uncertaintymore » through analyzing acquired data in real-time. Methods: We simulated motion tracking process based on a previously developed imaging framework (MV + initial seconds of kV imaging) using real-time breathing data from 42 patients. Motion tracking errors for each time point were collected together with the time point’s corresponding features, such as tumor motion speed and 2D tracking error of previous time points, etc. We tested three methods for error uncertainty estimation based on the features: conditional probability distribution, logistic regression modeling, and support vector machine (SVM) classification to detect errors exceeding a threshold. Results: For conditional probability distribution, polynomial regressions on three features (previous tracking error, prediction quality, and cosine of the angle between the trajectory and the treatment beam) showed strong correlation with the variation (uncertainty) of the mean 3D tracking error and its standard deviation: R-square = 0.94 and 0.90, respectively. The logistic regression and SVM classification successfully identified about 95% of tracking errors exceeding 2.5mm threshold. Conclusion: The proposed methods can reliably estimate the motion tracking uncertainty in real-time, which can be used to guide adaptive additional imaging to confirm the tumor is within the margin or initialize motion compensation if it is out of the margin.« less
Estimating the Classification Efficiency of a Test Battery.

ERIC Educational Resources Information Center

De Corte, Wilfried

2000-01-01

Shows how a theorem proven by H. Brogden (1951, 1959) can be used to estimate the allocation average (a predictor based classification of a test battery) assuming that the predictor intercorrelations and validities are known and that the predictor variables have a joint multivariate normal distribution. (SLD)
Sensitivity analysis of Jacobian determinant used in treatment planning for lung cancer

NASA Astrophysics Data System (ADS)

Shao, Wei; Gerard, Sarah E.; Pan, Yue; Patton, Taylor J.; Reinhardt, Joseph M.; Durumeric, Oguz C.; Bayouth, John E.; Christensen, Gary E.

2018-03-01

Four-dimensional computed tomography (4DCT) is regularly used to visualize tumor motion in radiation therapy for lung cancer. These 4DCT images can be analyzed to estimate local ventilation by finding a dense correspondence map between the end inhalation and the end exhalation CT image volumes using deformable image registration. Lung regions with ventilation values above a threshold are labeled as regions of high pulmonary function and are avoided when possible in the radiation plan. This paper investigates a sensitivity analysis of the relative Jacobian error to small registration errors. We present a linear approximation of the relative Jacobian error. Next, we give a formula for the sensitivity of the relative Jacobian error with respect to the Jacobian of perturbation displacement field. Preliminary sensitivity analysis results are presented using 4DCT scans from 10 individuals. For each subject, we generated 6400 random smooth biologically plausible perturbation vector fields using a cubic B-spline model. We showed that the correlation between the Jacobian determinant and the Frobenius norm of the sensitivity matrix is close to -1, which implies that the relative Jacobian error in high-functional regions is less sensitive to noise. We also showed that small displacement errors on the average of 0.53 mm may lead to a 10% relative change in Jacobian determinant. We finally showed that the average relative Jacobian error and the sensitivity of the system for all subjects are positively correlated (close to +1), i.e. regions with high sensitivity has more error in Jacobian determinant on average.
Real-time Neuroimaging and Cognitive Monitoring Using Wearable Dry EEG

PubMed Central

Mullen, Tim R.; Kothe, Christian A.E.; Chi, Mike; Ojeda, Alejandro; Kerth, Trevor; Makeig, Scott; Jung, Tzyy-Ping; Cauwenberghs, Gert

2015-01-01

Goal We present and evaluate a wearable high-density dry electrode EEG system and an open-source software framework for online neuroimaging and state classification. Methods The system integrates a 64-channel dry EEG form-factor with wireless data streaming for online analysis. A real-time software framework is applied, including adaptive artifact rejection, cortical source localization, multivariate effective connectivity inference, data visualization, and cognitive state classification from connectivity features using a constrained logistic regression approach (ProxConn). We evaluate the system identification methods on simulated 64-channel EEG data. Then we evaluate system performance, using ProxConn and a benchmark ERP method, in classifying response errors in 9 subjects using the dry EEG system. Results Simulations yielded high accuracy (AUC=0.97±0.021) for real-time cortical connectivity estimation. Response error classification using cortical effective connectivity (sdDTF) was significantly above chance with similar performance (AUC) for cLORETA (0.74±0.09) and LCMV (0.72±0.08) source localization. Cortical ERP-based classification was equivalent to ProxConn for cLORETA (0.74±0.16) but significantly better for LCMV (0.82±0.12). Conclusion We demonstrated the feasibility for real-time cortical connectivity analysis and cognitive state classification from high-density wearable dry EEG. Significance This paper is the first validated application of these methods to 64-channel dry EEG. The work addresses a need for robust real-time measurement and interpretation of complex brain activity in the dynamic environment of the wearable setting. Such advances can have broad impact in research, medicine, and brain-computer interfaces. The pipelines are made freely available in the open-source SIFT and BCILAB toolboxes. PMID:26415149

Simple, accurate formula for the average bit error probability of multiple-input multiple-output free-space optical links over negative exponential turbulence channels.

PubMed

Peppas, Kostas P; Lazarakis, Fotis; Alexandridis, Antonis; Dangakis, Kostas

2012-08-01

In this Letter we investigate the error performance of multiple-input multiple-output free-space optical communication systems employing intensity modulation/direct detection and operating over strong atmospheric turbulence channels. Atmospheric-induced strong turbulence fading is modeled using the negative exponential distribution. For the considered system, an approximate yet accurate analytical expression for the average bit error probability is derived and an efficient method for its numerical evaluation is proposed. Numerically evaluated and computer simulation results are further provided to demonstrate the validity of the proposed mathematical analysis.
Incorporating GIS and remote sensing for census population disaggregation

NASA Astrophysics Data System (ADS)

Wu, Shuo-Sheng'derek'

Census data are the primary source of demographic data for a variety of researches and applications. For confidentiality issues and administrative purposes, census data are usually released to the public by aggregated areal units. In the United States, the smallest census unit is census blocks. Due to data aggregation, users of census data may have problems in visualizing population distribution within census blocks and estimating population counts for areas not coinciding with census block boundaries. The main purpose of this study is to develop methodology for estimating sub-block areal populations and assessing the estimation errors. The City of Austin, Texas was used as a case study area. Based on tax parcel boundaries and parcel attributes derived from ancillary GIS and remote sensing data, detailed urban land use classes were first classified using a per-field approach. After that, statistical models by land use classes were built to infer population density from other predictor variables, including four census demographic statistics (the Hispanic percentage, the married percentage, the unemployment rate, and per capita income) and three physical variables derived from remote sensing images and building footprints vector data (a landscape heterogeneity statistics, a building pattern statistics, and a building volume statistics). In addition to statistical models, deterministic models were proposed to directly infer populations from building volumes and three housing statistics, including the average space per housing unit, the housing unit occupancy rate, and the average household size. After population models were derived or proposed, how well the models predict populations for another set of sample blocks was assessed. The results show that deterministic models were more accurate than statistical models. Further, by simulating the base unit for modeling from aggregating blocks, I assessed how well the deterministic models estimate sub-unit-level populations. I also assessed the aggregation effects and the resealing effects on sub-unit estimates. Lastly, from another set of mixed-land-use sample blocks, a mixed-land-use model was derived and compared with a residential-land-use model. The results of per-field land use classification are satisfactory with a Kappa accuracy statistics of 0.747. Model Assessments by land use show that population estimates for multi-family land use areas have higher errors than those for single-family land use areas, and population estimates for mixed land use areas have higher errors than those for residential land use areas. The assessments of sub-unit estimates using a simulation approach indicate that smaller areas show higher estimation errors, estimation errors do not relate to the base unit size, and resealing improves all levels of sub-unit estimates.
Image-classification-based global dimming algorithm for LED backlights in LCDs

NASA Astrophysics Data System (ADS)

Qibin, Feng; Huijie, He; Dong, Han; Lei, Zhang; Guoqiang, Lv

2015-07-01

Backlight dimming can help LCDs reduce power consumption and improve CR. With fixed parameters, dimming algorithm cannot achieve satisfied effects for all kinds of images. The paper introduces an image-classification-based global dimming algorithm. The proposed classification method especially for backlight dimming is based on luminance and CR of input images. The parameters for backlight dimming level and pixel compensation are adaptive with image classifications. The simulation results show that the classification based dimming algorithm presents 86.13% power reduction improvement compared with dimming without classification, with almost same display quality. The prototype is developed. There are no perceived distortions when playing videos. The practical average power reduction of the prototype TV is 18.72%, compared with common TV without dimming.
SU-F-T-465: Two Years of Radiotherapy Treatments Analyzed Through MLC Log Files

DOE Office of Scientific and Technical Information (OSTI.GOV)

Defoor, D; Kabat, C; Papanikolaou, N

Purpose: To present treatment statistics of a Varian Novalis Tx using more than 90,000 Varian Dynalog files collected over the past 2 years. Methods: Varian Dynalog files are recorded for every patient treated on our Varian Novalis Tx. The files are collected and analyzed daily to check interfraction agreement of treatment deliveries. This is accomplished by creating fluence maps from the data contained in the Dynalog files. From the Dynalog files we have also compiled statistics for treatment delivery times, MLC errors, gantry errors and collimator errors. Results: The mean treatment time for VMAT patients was 153 ± 86 secondsmore » while the mean treatment time for step & shoot was 256 ± 149 seconds. Patient’s treatment times showed a variation of 0.4% over there treatment course for VMAT and 0.5% for step & shoot. The average field sizes were 40 cm2 and 26 cm2 for VMAT and step & shoot respectively. VMAT beams contained and average overall leaf travel of 34.17 meters and step & shoot beams averaged less than half of that at 15.93 meters. When comparing planned and delivered fluence maps generated using the Dynalog files VMAT plans showed an average gamma passing percentage of 99.85 ± 0.47. Step & shoot plans showed an average gamma passing percentage of 97.04 ± 0.04. 5.3% of beams contained an MLC error greater than 1 mm and 2.4% had an error greater than 2mm. The mean gantry speed for VMAT plans was 1.01 degrees/s with a maximum of 6.5 degrees/s. Conclusion: Varian Dynalog files are useful for monitoring machine performance treatment parameters. The Dynalog files have shown that the performance of the Novalis Tx is consistent over the course of a patients treatment with only slight variations in patient treatment times and a low rate of MLC errors.« less
Quality assurance of chemical ingredient classification for the National Drug File - Reference Terminology.

PubMed

Zheng, Ling; Yumak, Hasan; Chen, Ling; Ochs, Christopher; Geller, James; Kapusnik-Uner, Joan; Perl, Yehoshua

2017-09-01

The National Drug File - Reference Terminology (NDF-RT) is a large and complex drug terminology consisting of several classification hierarchies on top of an extensive collection of drug concepts. These hierarchies provide important information about clinical drugs, e.g., their chemical ingredients, mechanisms of action, dosage form and physiological effects. Within NDF-RT such information is represented using tens of thousands of roles connecting drugs to classifications. In previous studies, we have introduced various kinds of Abstraction Networks to summarize the content and structure of terminologies in order to facilitate their visual comprehension, and support quality assurance of terminologies. However, these previous kinds of Abstraction Networks are not appropriate for summarizing the NDF-RT classification hierarchies, due to its unique structure. In this paper, we present the novel Ingredient Abstraction Network (IAbN) to summarize, visualize and support the audit of NDF-RT's Chemical Ingredients hierarchy and its associated drugs. A common theme in our quality assurance framework is to use characterizations of sets of concepts, revealed by the Abstraction Network structure, to capture concepts, the modeling of which is more complex than for other concepts. For the IAbN, we characterize drug ingredient concepts as more complex if they belong to IAbN groups with multiple parent groups. We show that such concepts have a statistically significantly higher rate of errors than a control sample and identify two especially common patterns of errors. Copyright © 2017 Elsevier Inc. All rights reserved.
Improvement in defect classification efficiency by grouping disposition for reticle inspection

NASA Astrophysics Data System (ADS)

Lai, Rick; Hsu, Luke T. H.; Chang, Peter; Ho, C. H.; Tsai, Frankie; Long, Garrett; Yu, Paul; Miller, John; Hsu, Vincent; Chen, Ellison

2005-11-01

As the lithography design rule of IC manufacturing continues to migrate toward more advanced technology nodes, the mask error enhancement factor (MEEF) increases and necessitates the use of aggressive OPC features. These aggressive OPC features pose challenges to reticle inspection due to high false detection, which is time-consuming for defect classification and impacts the throughput of mask manufacturing. Moreover, higher MEEF leads to stricter mask defect capture criteria so that new generation reticle inspection tool is equipped with better detection capability. Hence, mask process induced defects, which were once undetectable, are now detected and results in the increase of total defect count. Therefore, how to review and characterize reticle defects efficiently is becoming more significant. A new defect review system called ReviewSmart has been developed based on the concept of defect grouping disposition. The review system intelligently bins repeating or similar defects into defect groups and thus allows operators to review massive defects more efficiently. Compared to the conventional defect review method, ReviewSmart not only reduces defect classification time and human judgment error, but also eliminates desensitization that is formerly inevitable. In this study, we attempt to explore the most efficient use of ReviewSmart by evaluating various defect binning conditions. The optimal binning conditions are obtained and have been verified for fidelity qualification through inspection reports (IRs) of production masks. The experiment results help to achieve the best defect classification efficiency when using ReviewSmart in the mask manufacturing and development.
A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield

PubMed Central

Ringard, Justine; Seyler, Frederique; Linguet, Laurent

2017-01-01

Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two set approach, without taking into account the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale. PMID:28621723
A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield.

PubMed

Ringard, Justine; Seyler, Frederique; Linguet, Laurent

2017-06-16

Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two set approach, without taking into account the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale.
An investigation of the usability of sound recognition for source separation of packaging wastes in reverse vending machines.

PubMed

Korucu, M Kemal; Kaplan, Özgür; Büyük, Osman; Güllü, M Kemal

2016-10-01

In this study, we investigate the usability of sound recognition for source separation of packaging wastes in reverse vending machines (RVMs). For this purpose, an experimental setup equipped with a sound recording mechanism was prepared. Packaging waste sounds generated by three physical impacts such as free falling, pneumatic hitting and hydraulic crushing were separately recorded using two different microphones. To classify the waste types and sizes based on sound features of the wastes, a support vector machine (SVM) and a hidden Markov model (HMM) based sound classification systems were developed. In the basic experimental setup in which only free falling impact type was considered, SVM and HMM systems provided 100% classification accuracy for both microphones. In the expanded experimental setup which includes all three impact types, material type classification accuracies were 96.5% for dynamic microphone and 97.7% for condenser microphone. When both the material type and the size of the wastes were classified, the accuracy was 88.6% for the microphones. The modeling studies indicated that hydraulic crushing impact type recordings were very noisy for an effective sound recognition application. In the detailed analysis of the recognition errors, it was observed that most of the errors occurred in the hitting impact type. According to the experimental results, it can be said that the proposed novel approach for the separation of packaging wastes could provide a high classification performance for RVMs. Copyright © 2016 Elsevier Ltd. All rights reserved.
Her2Net: A Deep Framework for Semantic Segmentation and Classification of Cell Membranes and Nuclei in Breast Cancer Evaluation.

PubMed

Saha, Monjoy; Chakraborty, Chandan

2018-05-01

We present an efficient deep learning framework for identifying, segmenting, and classifying cell membranes and nuclei from human epidermal growth factor receptor-2 (HER2)-stained breast cancer images with minimal user intervention. This is a long-standing issue for pathologists because the manual quantification of HER2 is error-prone, costly, and time-consuming. Hence, we propose a deep learning-based HER2 deep neural network (Her2Net) to solve this issue. The convolutional and deconvolutional parts of the proposed Her2Net framework consisted mainly of multiple convolution layers, max-pooling layers, spatial pyramid pooling layers, deconvolution layers, up-sampling layers, and trapezoidal long short-term memory (TLSTM). A fully connected layer and a softmax layer were also used for classification and error estimation. Finally, HER2 scores were calculated based on the classification results. The main contribution of our proposed Her2Net framework includes the implementation of TLSTM and a deep learning framework for cell membrane and nucleus detection, segmentation, and classification and HER2 scoring. Our proposed Her2Net achieved 96.64% precision, 96.79% recall, 96.71% F-score, 93.08% negative predictive value, 98.33% accuracy, and a 6.84% false-positive rate. Our results demonstrate the high accuracy and wide applicability of the proposed Her2Net in the context of HER2 scoring for breast cancer evaluation.
Optimum data analysis procedures for Titan 4 and Space Shuttle payload acoustic measurements during lift-off

NASA Technical Reports Server (NTRS)

Piersol, Allan G.

1991-01-01

Analytical expressions have been derived to describe the mean square error in the estimation of the maximum rms value computed from a step-wise (or running) time average of a nonstationary random signal. These analytical expressions have been applied to the problem of selecting the optimum averaging times that will minimize the total mean square errors in estimates of the maximum sound pressure levels measured inside the Titan IV payload fairing (PLF) and the Space Shuttle payload bay (PLB) during lift-off. Based on evaluations of typical Titan IV and Space Shuttle launch data, it has been determined that the optimum averaging times for computing the maximum levels are (1) T (sub o) = 1.14 sec for the maximum overall level, and T(sub oi) = 4.88 f (sub i) (exp -0.2) sec for the maximum 1/3 octave band levels inside the Titan IV PLF, and (2) T (sub o) = 1.65 sec for the maximum overall level, and T (sub oi) = 7.10 f (sub i) (exp -0.2) sec for the maximum 1/3 octave band levels inside the Space Shuttle PLB, where f (sub i) is the 1/3 octave band center frequency. However, the results for both vehicles indicate that the total rms error in the maximum level estimates will be within 25 percent the minimum error for all averaging times within plus or minus 50 percent of the optimum averaging time, so a precise selection of the exact optimum averaging time is not critical. Based on these results, linear averaging times (T) are recommended for computing the maximum sound pressure level during lift-off.
Differentiation of several interstitial lung disease patterns in HRCT images using support vector machine: role of databases on performance

NASA Astrophysics Data System (ADS)

Kale, Mandar; Mukhopadhyay, Sudipta; Dash, Jatindra K.; Garg, Mandeep; Khandelwal, Niranjan

2016-03-01

Interstitial lung disease (ILD) is complicated group of pulmonary disorders. High Resolution Computed Tomography (HRCT) considered to be best imaging technique for analysis of different pulmonary disorders. HRCT findings can be categorised in several patterns viz. Consolidation, Emphysema, Ground Glass Opacity, Nodular, Normal etc. based on their texture like appearance. Clinician often find it difficult to diagnosis these pattern because of their complex nature. In such scenario computer-aided diagnosis system could help clinician to identify patterns. Several approaches had been proposed for classification of ILD patterns. This includes computation of textural feature and training /testing of classifier such as artificial neural network (ANN), support vector machine (SVM) etc. In this paper, wavelet features are calculated from two different ILD database, publically available MedGIFT ILD database and private ILD database, followed by performance evaluation of ANN and SVM classifiers in terms of average accuracy. It is found that average classification accuracy by SVM is greater than ANN where trained and tested on same database. Investigation continued further to test variation in accuracy of classifier when training and testing is performed with alternate database and training and testing of classifier with database formed by merging samples from same class from two individual databases. The average classification accuracy drops when two independent databases used for training and testing respectively. There is significant improvement in average accuracy when classifiers are trained and tested with merged database. It infers dependency of classification accuracy on training data. It is observed that SVM outperforms ANN when same database is used for training and testing.
The Performance of Noncoherent Orthogonal M-FSK in the Presence of Timing and Frequency Errors

NASA Technical Reports Server (NTRS)

Hinedi, Sami; Simon, Marvin K.; Raphaeli, Dan

1993-01-01

Practical M-FSK systems experience a combination of time and frequency offsets (errors). This paper assesses the deleterious effect of these offsets, first individually and then combined, on the average bit error probability performance of the system.
The Whole Warps the Sum of Its Parts.

PubMed

Corbett, Jennifer E

2017-01-01

The efficiency of averaging properties of sets without encoding redundant details is analogous to gestalt proposals that perception is parsimoniously organized as a function of recurrent order in the world. This similarity suggests that grouping and averaging are part of a broader set of strategies allowing the visual system to circumvent capacity limitations. To examine how gestalt grouping affects the manner in which information is averaged and remembered, I compared the error in observers' adjustments of remembered sizes of individual circles in two different mean-size sets defined by similarity, proximity, connectedness, or a common region. Overall, errors were more similar within the same gestalt-defined groups than between different gestalt-defined groups, such that the remembered sizes of individual circles were biased toward the mean size of their respective gestalt-defined groups. These results imply that gestalt grouping facilitates perceptual averaging to minimize the error with which individual items are encoded, thereby optimizing the efficiency of visual short-term memory.
Computation of Standard Errors

PubMed Central

Dowd, Bryan E; Greene, William H; Norton, Edward C

2014-01-01

Objectives We discuss the problem of computing the standard errors of functions involving estimated parameters and provide the relevant computer code for three different computational approaches using two popular computer packages. Study Design We show how to compute the standard errors of several functions of interest: the predicted value of the dependent variable for a particular subject, and the effect of a change in an explanatory variable on the predicted value of the dependent variable for an individual subject and average effect for a sample of subjects. Empirical Application Using a publicly available dataset, we explain three different methods of computing standard errors: the delta method, Krinsky–Robb, and bootstrapping. We provide computer code for Stata 12 and LIMDEP 10/NLOGIT 5. Conclusions In most applications, choice of the computational method for standard errors of functions of estimated parameters is a matter of convenience. However, when computing standard errors of the sample average of functions that involve both estimated parameters and nonstochastic explanatory variables, it is important to consider the sources of variation in the function's values. PMID:24800304
A Ternary Hybrid EEG-NIRS Brain-Computer Interface for the Classification of Brain Activation Patterns during Mental Arithmetic, Motor Imagery, and Idle State.

PubMed

Shin, Jaeyoung; Kwon, Jinuk; Im, Chang-Hwan

2018-01-01

The performance of a brain-computer interface (BCI) can be enhanced by simultaneously using two or more modalities to record brain activity, which is generally referred to as a hybrid BCI. To date, many BCI researchers have tried to implement a hybrid BCI system by combining electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS) to improve the overall accuracy of binary classification. However, since hybrid EEG-NIRS BCI, which will be denoted by hBCI in this paper, has not been applied to ternary classification problems, paradigms and classification strategies appropriate for ternary classification using hBCI are not well investigated. Here we propose the use of an hBCI for the classification of three brain activation patterns elicited by mental arithmetic, motor imagery, and idle state, with the aim to elevate the information transfer rate (ITR) of hBCI by increasing the number of classes while minimizing the loss of accuracy. EEG electrodes were placed over the prefrontal cortex and the central cortex, and NIRS optodes were placed only on the forehead. The ternary classification problem was decomposed into three binary classification problems using the "one-versus-one" (OVO) classification strategy to apply the filter-bank common spatial patterns filter to EEG data. A 10 × 10-fold cross validation was performed using shrinkage linear discriminant analysis (sLDA) to evaluate the average classification accuracies for EEG-BCI, NIRS-BCI, and hBCI when the meta-classification method was adopted to enhance classification accuracy. The ternary classification accuracies for EEG-BCI, NIRS-BCI, and hBCI were 76.1 ± 12.8, 64.1 ± 9.7, and 82.2 ± 10.2%, respectively. The classification accuracy of the proposed hBCI was thus significantly higher than those of the other BCIs ( p < 0.005). The average ITR for the proposed hBCI was calculated to be 4.70 ± 1.92 bits/minute, which was 34.3% higher than that reported for a previous binary hBCI study.
Effect of radiance-to-reflectance transformation and atmosphere removal on maximum likelihood classification accuracy of high-dimensional remote sensing data

NASA Technical Reports Server (NTRS)

Hoffbeck, Joseph P.; Landgrebe, David A.

1994-01-01

Many analysis algorithms for high-dimensional remote sensing data require that the remotely sensed radiance spectra be transformed to approximate reflectance to allow comparison with a library of laboratory reflectance spectra. In maximum likelihood classification, however, the remotely sensed spectra are compared to training samples, thus a transformation to reflectance may or may not be helpful. The effect of several radiance-to-reflectance transformations on maximum likelihood classification accuracy is investigated in this paper. We show that the empirical line approach, LOWTRAN7, flat-field correction, single spectrum method, and internal average reflectance are all non-singular affine transformations, and that non-singular affine transformations have no effect on discriminant analysis feature extraction and maximum likelihood classification accuracy. (An affine transformation is a linear transformation with an optional offset.) Since the Atmosphere Removal Program (ATREM) and the log residue method are not affine transformations, experiments with Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data were conducted to determine the effect of these transformations on maximum likelihood classification accuracy. The average classification accuracy of the data transformed by ATREM and the log residue method was slightly less than the accuracy of the original radiance data. Since the radiance-to-reflectance transformations allow direct comparison of remotely sensed spectra with laboratory reflectance spectra, they can be quite useful in labeling the training samples required by maximum likelihood classification, but these transformations have only a slight effect or no effect at all on discriminant analysis and maximum likelihood classification accuracy.
A Locality-Constrained and Label Embedding Dictionary Learning Algorithm for Image Classification.

PubMed

Zhengming Li; Zhihui Lai; Yong Xu; Jian Yang; Zhang, David

2017-02-01

Locality and label information of training samples play an important role in image classification. However, previous dictionary learning algorithms do not take the locality and label information of atoms into account together in the learning process, and thus their performance is limited. In this paper, a discriminative dictionary learning algorithm, called the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, was proposed for image classification. First, the locality information was preserved using the graph Laplacian matrix of the learned dictionary instead of the conventional one derived from the training samples. Then, the label embedding term was constructed using the label information of atoms instead of the classification error term, which contained discriminating information of the learned dictionary. The optimal coding coefficients derived by the locality-based and label-based reconstruction were effective for image classification. Experimental results demonstrated that the LCLE-DL algorithm can achieve better performance than some state-of-the-art algorithms.
Convolutional neural network with transfer learning for rice type classification

NASA Astrophysics Data System (ADS)

Patel, Vaibhav Amit; Joshi, Manjunath V.

2018-04-01

Presently, rice type is identified manually by humans, which is time consuming and error prone. Therefore, there is a need to do this by machine which makes it faster with greater accuracy. This paper proposes a deep learning based method for classification of rice types. We propose two methods to classify the rice types. In the first method, we train a deep convolutional neural network (CNN) using the given segmented rice images. In the second method, we train a combination of a pretrained VGG16 network and the proposed method, while using transfer learning in which the weights of a pretrained network are used to achieve better accuracy. Our approach can also be used for classification of rice grain as broken or fine. We train a 5-class model for classifying rice types using 4000 training images and another 2- class model for the classification of broken and normal rice using 1600 training images. We observe that despite having distinct rice images, our architecture, pretrained on ImageNet data boosts classification accuracy significantly.
Application of partial least squares near-infrared spectral classification in diabetic identification

NASA Astrophysics Data System (ADS)

Yan, Wen-juan; Yang, Ming; He, Guo-quan; Qin, Lin; Li, Gang

2014-11-01

In order to identify the diabetic patients by using tongue near-infrared (NIR) spectrum - a spectral classification model of the NIR reflectivity of the tongue tip is proposed, based on the partial least square (PLS) method. 39sample data of tongue tip's NIR spectra are harvested from healthy people and diabetic patients , respectively. After pretreatment of the reflectivity, the spectral data are set as the independent variable matrix, and information of classification as the dependent variables matrix, Samples were divided into two groups - i.e. 53 samples as calibration set and 25 as prediction set - then the PLS is used to build the classification model The constructed modelfrom the 53 samples has the correlation of 0.9614 and the root mean square error of cross-validation (RMSECV) of 0.1387.The predictions for the 25 samples have the correlation of 0.9146 and the RMSECV of 0.2122.The experimental result shows that the PLS method can achieve good classification on features of healthy people and diabetic patients.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.