Using Ensemble Decisions and Active Selection to Improve Low-Cost Labeling for Multi-View Data
NASA Technical Reports Server (NTRS)
Rebbapragada, Umaa; Wagstaff, Kiri L.
2011-01-01
This paper seeks to improve low-cost labeling in terms of training set reliability (the fraction of correctly labeled training items) and test set performance for multi-view learning methods. Co-training is a popular multi-view learning method that combines high-confidence example selection with low-cost (self) labeling. However, co-training with certain base learning algorithms significantly reduces training set reliability, causing an associated drop in prediction accuracy. We propose the use of ensemble labeling to improve reliability in such cases. We also discuss and show promising results on combining low-cost ensemble labeling with active (low-confidence) example selection. We unify these example selection and labeling strategies under collaborative learning, a family of techniques for multi-view learning that we are developing for distributed, sensor-network environments.
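A minimal sketch of the ensemble-labeling idea, assuming scikit-learn base learners; this illustrates the general strategy only, not the authors' collaborative-learning implementation, and the function name `ensemble_label` is hypothetical.

```python
# Sketch: ensemble labeling of high-confidence unlabeled examples.
# Several base learners vote; only high-agreement items are self-labeled,
# which is how ensemble decisions can protect training-set reliability.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def ensemble_label(X_lab, y_lab, X_unlab, threshold=0.9):
    learners = [LogisticRegression(max_iter=1000),
                GaussianNB(),
                DecisionTreeClassifier(max_depth=5)]
    probas = []
    for clf in learners:
        clf.fit(X_lab, y_lab)
        probas.append(clf.predict_proba(X_unlab))
    mean_proba = np.mean(probas, axis=0)              # soft ensemble vote
    confident = mean_proba.max(axis=1) >= threshold   # keep only confident votes
    labels = learners[0].classes_[mean_proba.argmax(axis=1)]
    return X_unlab[confident], labels[confident]      # items to add to training set
```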
Label Review Training: Module 1: Label Basics, Page 27
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. See examples of mandatory and advisory label statements.
Active learning in the presence of unlabelable examples
NASA Technical Reports Server (NTRS)
Mazzoni, Dominic; Wagstaff, Kiri
2004-01-01
We propose a new active learning framework where the expert labeler is allowed to decline to label any example. This may be necessary because the true label is unknown or because the example belongs to a class that is not part of the real training problem. We show that within this framework, popular active learning algorithms (such as Simple) may perform worse than random selection because they make so many queries to the unlabelable class. We present a method by which any active learning algorithm can be modified to avoid unlabelable examples by training a second classifier to distinguish between the labelable and unlabelable classes. We also demonstrate the effectiveness of the method on two benchmark data sets and a real-world problem.
Active Learning with Irrelevant Examples
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri; Mazzoni, Dominic
2009-01-01
An improved active learning method has been devised for training data classifiers. One example of a data classifier is the algorithm used by the United States Postal Service since the 1960s to recognize scans of handwritten digits for processing zip codes. Active learning algorithms enable rapid training with minimal investment of time on the part of human experts to provide training examples consisting of correctly classified (labeled) input data. They function by identifying which examples would be most profitable for a human expert to label. The goal is to maximize classifier accuracy while minimizing the number of examples the expert must label. Although there are several well-established methods for active learning, they may not operate well when irrelevant examples are present in the data set. That is, they may select an item for labeling that the expert simply cannot assign to any of the valid classes. In the context of classifying handwritten digits, the irrelevant items may include stray marks, smudges, and mis-scans. Querying the expert about these items results in wasted time or erroneous labels, if the expert is forced to assign the item to one of the valid classes. In contrast, the new algorithm provides a specific mechanism for avoiding querying the irrelevant items. This algorithm has two components: an active learner (which could be a conventional active learning algorithm) and a relevance classifier. The combination of these components yields a method, denoted Relevance Bias, that enables the active learner to avoid querying irrelevant data so as to increase its learning rate and efficiency when irrelevant items are present. The algorithm collects irrelevant data in a set of rejected examples, then trains the relevance classifier to distinguish between labeled (relevant) training examples and the rejected ones. The active learner combines its ranking of the items with the probability that they are relevant to yield a final decision about which item to present to the expert for labeling. Experiments on several data sets have demonstrated that the Relevance Bias approach significantly decreases the number of irrelevant items queried and also accelerates learning speed.
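A minimal sketch of the Relevance Bias combination step, assuming a margin-based binary active learner (any classifier exposing `decision_function`) and a logistic-regression relevance classifier; the names and the specific combination rule are illustrative rather than taken from the original software.

```python
# Sketch of Relevance Bias: weight an active learner's informativeness score
# by the probability that each pool item is relevant, so irrelevant items
# (stray marks, smudges, mis-scans) are rarely queried.
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_query(base_clf, relevance_examples, relevance_labels, X_pool):
    """relevance_labels: 1 for previously labeled (relevant) items,
    0 for items the expert rejected as irrelevant."""
    # Informativeness: distance-to-boundary uncertainty (binary case assumed).
    margin = np.abs(base_clf.decision_function(X_pool))
    informativeness = 1.0 / (1.0 + margin)

    # Relevance classifier trained on accepted vs. rejected examples.
    rel_clf = LogisticRegression(max_iter=1000)
    rel_clf.fit(relevance_examples, relevance_labels)
    p_relevant = rel_clf.predict_proba(X_pool)[:, 1]

    # Final ranking combines informativeness with relevance probability.
    return int(np.argmax(informativeness * p_relevant))
```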
Machine learning with naturally labeled data for identifying abbreviation definitions.
Yeganova, Lana; Comeau, Donald C; Wilbur, W John
2011-06-09
The rapid growth of biomedical literature requires accurate text analysis and text processing tools. Detecting abbreviations and identifying their definitions is an important component of such tools. Most existing approaches for the abbreviation definition identification task employ rule-based methods. While achieving high precision, rule-based methods are limited to the rules defined and fail to capture many uncommon definition patterns. Supervised learning techniques, which offer more flexibility in detecting abbreviation definitions, have also been applied to the problem. However, they require manually labeled training data. In this work, we develop a machine learning algorithm for abbreviation definition identification in text which makes use of what we term naturally labeled data. Positive training examples are naturally occurring potential abbreviation-definition pairs in text. Negative training examples are generated by randomly mixing potential abbreviations with unrelated potential definitions. The machine learner is trained to distinguish between these two sets of examples. Then, the learned feature weights are used to identify the abbreviation full form. This approach does not require manually labeled training data. We evaluate the performance of our algorithm on the Ab3P, BIOADI and Medstract corpora. Our system demonstrated results that compare favourably to the existing Ab3P and BIOADI systems. We achieve an F-measure of 91.36% on Ab3P corpus, and an F-measure of 87.13% on BIOADI corpus which are superior to the results reported by Ab3P and BIOADI systems. Moreover, we outperform these systems in terms of recall, which is one of our goals.
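The naturally-labeled-data construction can be sketched as follows; the pair features here are toy stand-ins for the paper's feature set, and the function names are hypothetical.

```python
# Sketch: positives are naturally occurring abbreviation-definition pairs;
# negatives are made by pairing abbreviations with unrelated definitions.
import random
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def pair_features(abbrev, definition):
    # Toy features only; the paper uses richer cues.
    words = definition.lower().split()
    return {
        "len_ratio": len(abbrev) / max(len(words), 1),
        "first_letter_match": float(words[0].startswith(abbrev[0].lower())),
        "abbrev_len": len(abbrev),
    }

def build_training_set(natural_pairs, seed=0):
    rng = random.Random(seed)
    X_dicts, y = [], []
    defs = [d for _, d in natural_pairs]
    for abbrev, definition in natural_pairs:           # positive examples
        X_dicts.append(pair_features(abbrev, definition)); y.append(1)
    for abbrev, definition in natural_pairs:           # shuffled negatives
        other = rng.choice([d for d in defs if d != definition])
        X_dicts.append(pair_features(abbrev, other)); y.append(0)
    return X_dicts, y

pairs = [("NLP", "natural language processing"), ("HMM", "hidden Markov model")]
X_dicts, y = build_training_set(pairs)
clf = LogisticRegression().fit(DictVectorizer().fit_transform(X_dicts), y)
```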
Rapid Training of Information Extraction with Local and Global Data Views
2012-05-01
The Natural Language Processing (NLP) community faces new tasks and new domains all the time. Without enough labeled data for a new task or a new domain to conduct supervised learning, semi-supervised learning is particularly attractive to NLP researchers since it only requires a handful of labeled examples
Active Learning Strategies for Phenotypic Profiling of High-Content Screens.
Smith, Kevin; Horvath, Peter
2014-06-01
High-content screening is a powerful method to discover new drugs and carry out basic biological research. Increasingly, high-content screens have come to rely on supervised machine learning (SML) to perform automatic phenotypic classification as an essential step of the analysis. However, this comes at a cost, namely, the labeled examples required to train the predictive model. Classification performance increases with the number of labeled examples, and because labeling examples demands time from an expert, the training process represents a significant time investment. Active learning strategies attempt to overcome this bottleneck by presenting the most relevant examples to the annotator, thereby achieving high accuracy while minimizing the cost of obtaining labeled data. In this article, we investigate the impact of active learning on single-cell-based phenotype recognition, using data from three large-scale RNA interference high-content screens representing diverse phenotypic profiling problems. We consider several combinations of active learning strategies and popular SML methods. Our results show that active learning significantly reduces the time cost and can be used to reveal the same phenotypic targets identified using SML. We also identify combinations of active learning strategies and SML methods which perform better than others on the phenotypic profiling problems we studied. © 2014 Society for Laboratory Automation and Screening.
ERIC Educational Resources Information Center
Hinton, Geoffrey
2014-01-01
It is possible to learn multiple layers of non-linear features by backpropagating error derivatives through a feedforward neural network. This is a very effective learning procedure when there is a huge amount of labeled training data, but for many learning tasks very few labeled examples are available. In an effort to overcome the need for…
Classification without labels: learning from mixed samples in high energy physics
NASA Astrophysics Data System (ADS)
Metodiev, Eric M.; Nachman, Benjamin; Thaler, Jesse
2017-10-01
Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Crucially, neither individual labels nor class proportions are required, yet we prove that the optimal classifier in the CWoLa paradigm is also the optimal classifier in the traditional fully-supervised case where all label information is available. After demonstrating the power of this method in an analytical toy example, we consider a realistic benchmark for collider physics: distinguishing quark- versus gluon-initiated jets using mixed quark/gluon training samples. More generally, CWoLa can be applied to any classification problem where labels or class proportions are unknown or simulations are unreliable, but statistical mixtures of the classes are available.
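A toy numerical illustration of the CWoLa idea (Gaussian data rather than jets, scikit-learn rather than the authors' setup): a classifier trained only on mixture labels still separates pure signal from pure background.

```python
# CWoLa toy: train on two mixed samples with different (unknown) signal
# fractions, then evaluate on pure, truth-labeled samples.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
def signal(n):     return rng.normal(+1.0, 1.0, size=(n, 2))
def background(n): return rng.normal(-1.0, 1.0, size=(n, 2))

f1, f2, n = 0.8, 0.3, 5000                      # signal fractions of the mixtures
mix1 = np.vstack([signal(int(f1 * n)), background(n - int(f1 * n))])
mix2 = np.vstack([signal(int(f2 * n)), background(n - int(f2 * n))])

X = np.vstack([mix1, mix2])
y = np.concatenate([np.ones(len(mix1)), np.zeros(len(mix2))])  # mixture labels only
clf = LogisticRegression(max_iter=1000).fit(X, y)

# High AUC on pure samples, even though no true labels were ever seen.
X_test = np.vstack([signal(2000), background(2000)])
y_test = np.concatenate([np.ones(2000), np.zeros(2000)])
print("AUC on pure samples:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```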
NASA Astrophysics Data System (ADS)
Gao, Yuan; Ma, Jiayi; Yuille, Alan L.
2017-05-01
This paper addresses the problem of face recognition when there are only a few labeled examples, or even just a single one, of the face that we wish to recognize. Moreover, these examples are typically corrupted by nuisance variables, both linear (i.e., additive nuisance variables such as bad lighting, wearing of glasses) and non-linear (i.e., non-additive pixel-wise nuisance variables such as expression changes). The small number of labeled examples means that it is hard to remove these nuisance variables between the training and testing faces to obtain good recognition performance. To address the problem we propose a method called Semi-Supervised Sparse Representation based Classification (S$^3$RC). This is based on recent work on sparsity where faces are represented in terms of two dictionaries: a gallery dictionary consisting of one or more examples of each person, and a variation dictionary representing linear nuisance variables (e.g., different lighting conditions, different glasses). The main idea is that (i) we use the variation dictionary to characterize the linear nuisance variables via the sparsity framework, and then (ii) prototype face images are estimated as a gallery dictionary via a Gaussian Mixture Model (GMM), with mixed labeled and unlabeled samples in a semi-supervised manner, to deal with the non-linear nuisance variations between labeled and unlabeled samples. We have done experiments with insufficient labeled samples, even when there is only a single labeled sample per person. Our results on the AR, Multi-PIE, CAS-PEAL, and LFW databases demonstrate that the proposed method is able to deliver significantly improved performance over existing methods.
Learning to Recognize Actions From Limited Training Examples Using a Recurrent Spiking Neural Model
Panda, Priyadarshini; Srinivasa, Narayan
2018-01-01
A fundamental challenge in machine learning today is to build a model that can learn from few examples. Here, we describe a reservoir based spiking neural model for learning to recognize actions with a limited number of labeled videos. First, we propose a novel encoding, inspired by how microsaccades influence visual perception, to extract spike information from raw video data while preserving the temporal correlation across different frames. Using this encoding, we show that the reservoir generalizes its rich dynamical activity toward signature action/movements enabling it to learn from few training examples. We evaluate our approach on the UCF-101 dataset. Our experiments demonstrate that our proposed reservoir achieves 81.3/87% Top-1/Top-5 accuracy, respectively, on the 101-class data while requiring just 8 video examples per class for training. Our results establish a new benchmark for action recognition from limited video examples for spiking neural models while yielding competitive accuracy with respect to state-of-the-art non-spiking neural models. PMID:29551962
NASA Astrophysics Data System (ADS)
Reichman, Daniël.; Collins, Leslie M.; Malof, Jordan M.
2018-04-01
This work focuses on the development of automatic buried threat detection (BTD) algorithms using ground penetrating radar (GPR) data. Buried threats tend to exhibit unique characteristics in GPR imagery, such as high energy hyperbolic shapes, which can be leveraged for detection. Many recent BTD algorithms are supervised, and therefore they require training with exemplars of GPR data collected over non-threat locations and threat locations. Frequently, data from non-threat GPR examples will exhibit high energy hyperbolic patterns, similar to those observed from a buried threat. Is it still useful, therefore, to include such examples during algorithm training and encourage an algorithm to label such data as a non-threat? Similarly, some true buried threat examples exhibit little in the way of distinctive threat-like patterns. We investigate whether it is beneficial to treat such GPR data examples as mislabeled, and either (i) relabel them, or (ii) remove them from training. We study this problem using two algorithms to automatically identify mislabeled examples, if they are present, and examine the impact of removing or relabeling them for training. We conduct these experiments on a large collection of GPR data with several state-of-the-art GPR-based BTD algorithms.
Fully Convolutional Neural Networks Improve Abdominal Organ Segmentation.
Bobo, Meg F; Bao, Shunxing; Huo, Yuankai; Yao, Yuang; Virostko, Jack; Plassard, Andrew J; Lyu, Ilwoo; Assad, Albert; Abramson, Richard G; Hilmes, Melissa A; Landman, Bennett A
2018-03-01
Abdominal image segmentation is a challenging, yet important clinical problem. Variations in body size, position, and relative organ positions greatly complicate the segmentation process. Historically, multi-atlas methods have achieved leading results across imaging modalities and anatomical targets. However, deep learning is rapidly overtaking classical approaches for image segmentation. Recently, Zhou et al. showed that fully convolutional networks produce excellent results in abdominal organ segmentation of computed tomography (CT) scans. Yet, deep learning approaches have not been applied to whole abdomen magnetic resonance imaging (MRI) segmentation. Herein, we evaluate the applicability of an existing fully convolutional neural network (FCNN) designed for CT imaging to segment abdominal organs on T2-weighted (T2w) MRIs with two examples. In the primary example, we compare a classical multi-atlas approach with FCNN on forty-five T2w MRIs acquired from splenomegaly patients with five organs labeled (liver, spleen, left kidney, right kidney, and stomach). Thirty-six images were used for training while nine were used for testing. The FCNN resulted in a Dice similarity coefficient (DSC) of 0.930 in spleens, 0.730 in left kidneys, 0.780 in right kidneys, 0.913 in livers, and 0.556 in stomachs. The performance measures for livers, spleens, right kidneys, and stomachs were significantly better than multi-atlas (p < 0.05, Wilcoxon rank-sum test). In a secondary example, we compare the multi-atlas approach with FCNN on 138 distinct T2w MRIs with manually labeled pancreases (one label). On the pancreas dataset, the FCNN resulted in a median DSC of 0.691 in pancreases versus 0.287 for multi-atlas. The results are highly promising given relatively limited training data and without specific training of the FCNN model and illustrate the potential of deep learning approaches to transcend imaging modalities.
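For reference, the Dice similarity coefficient used to score the segmentations can be computed as in this small sketch, assuming binary NumPy masks.

```python
# Dice similarity coefficient: DSC = 2|A ∩ B| / (|A| + |B|).
# 1.0 is perfect overlap, 0.0 is no overlap.
import numpy as np

def dice(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0

# Example: two 3x3 masks overlapping in two voxels.
a = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
b = np.array([[1, 1, 0], [0, 0, 0], [0, 0, 0]])
print(dice(a, b))  # 2*2 / (3+2) = 0.8
```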
Leveraging Long-term Seismic Catalogs for Automated Real-time Event Classification
NASA Astrophysics Data System (ADS)
Linville, L.; Draelos, T.; Pankow, K. L.; Young, C. J.; Alvarez, S.
2017-12-01
We investigate the use of labeled event types available through reviewed seismic catalogs to produce automated event labels on new incoming data from the crustal region spanned by the cataloged events. Using events cataloged by the University of Utah Seismograph Stations between October 2012 and June 2017, we calculate the spectrogram for a time window that spans the duration of each event as seen on individual stations, resulting in 110k event spectrograms (50% local earthquake examples, 50% quarry blast examples). Using 80% of the randomized example events (~90k), a classifier is trained to distinguish between local earthquakes and quarry blasts. We explore variations of deep learning classifiers, incorporating elements of convolutional and recurrent neural networks. Using a single-layer Long Short-Term Memory recurrent neural network, we achieve 92% accuracy on the classification task on the remaining 20k test examples. Leveraging the decisions from a group of stations that detected the same event by using the median of all classifications in the group increases the model accuracy to 96%. Additional data with equivalent processing from 500 more recently cataloged events (July 2017) achieves the same accuracy as our test data on both single-station examples and multi-station medians, suggesting that the model can maintain accurate and stable classification rates on real-time automated events local to the University of Utah Seismograph Stations, with potentially minimal levels of re-training through time.
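A hedged sketch of a single-layer LSTM spectrogram classifier with multi-station median fusion, in PyTorch; the input shapes, hyperparameters, and the use of median-of-probabilities are illustrative assumptions, not the authors' exact configuration.

```python
# Single-layer LSTM over spectrogram time steps, then median fusion of
# per-station class probabilities for one event.
import torch
import torch.nn as nn

class SpectrogramLSTM(nn.Module):
    def __init__(self, n_freq_bins=64, hidden=128, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_freq_bins, hidden_size=hidden,
                            batch_first=True)      # single layer
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                # x: (batch, time_steps, freq_bins)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])        # logits: earthquake vs. quarry blast

model = SpectrogramLSTM()
# One event seen at 5 stations: fuse per-station probabilities by the median.
station_specs = torch.randn(5, 100, 64)            # (stations, time, freq)
with torch.no_grad():
    probs = torch.softmax(model(station_specs), dim=1)
event_prob = probs.median(dim=0).values            # median over stations
print("P(quarry blast) for the event:", float(event_prob[1]))
```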
Multiclass Continuous Correspondence Learning
NASA Technical Reports Server (NTRS)
Bue, Brian D,; Thompson, David R.
2011-01-01
We extend the Structural Correspondence Learning (SCL) domain adaptation algorithm of Blitzer et al. to the realm of continuous signals. Given a set of labeled examples belonging to a 'source' domain, we select a set of unlabeled examples in a related 'target' domain that play similar roles in both domains. Using these 'pivot' samples, we map both domains into a common feature space, allowing us to adapt a classifier trained on source examples to classify target examples. We show that when between-class distances are relatively preserved across domains, we can automatically select target pivots to bring the domains into correspondence.
Automatic Earthquake Detection by Active Learning
NASA Astrophysics Data System (ADS)
Bergen, K.; Beroza, G. C.
2017-12-01
In recent years, advances in machine learning have transformed fields such as image recognition, natural language processing and recommender systems. Many of these performance gains have relied on the availability of large, labeled data sets to train high-accuracy models; labeled data sets are those for which each sample includes a target class label, such as waveforms tagged as either earthquakes or noise. Earthquake seismologists are increasingly leveraging machine learning and data mining techniques to detect and analyze weak earthquake signals in large seismic data sets. One of the challenges in applying machine learning to seismic data sets is the limited labeled data problem; learning algorithms need to be given examples of earthquake waveforms, but the number of known events, taken from earthquake catalogs, may be insufficient to build an accurate detector. Furthermore, earthquake catalogs are known to be incomplete, resulting in training data that may be biased towards larger events and contain inaccurate labels. This challenge is compounded by the class imbalance problem; the events of interest, earthquakes, are infrequent relative to noise in continuous data sets, and many learning algorithms perform poorly on rare classes. In this work, we investigate the use of active learning for automatic earthquake detection. Active learning is a type of semi-supervised machine learning that uses a human-in-the-loop approach to strategically supplement a small initial training set. The learning algorithm incorporates domain expertise through interaction between a human expert and the algorithm, with the algorithm actively posing queries to the user to improve detection performance. We demonstrate the potential of active machine learning to improve earthquake detection performance with limited available training data.
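A minimal pool-based active-learning loop with uncertainty sampling, in the spirit of the human-in-the-loop strategy described above; the `oracle` callable stands in for the human analyst, and the binary logistic-regression learner is an assumption.

```python
# Pool-based active learning: repeatedly query the item the current model is
# least certain about, add the expert's label, and retrain.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning(X_pool, oracle, X_init, y_init, n_queries=50):
    """X_pool: candidate waveform features; oracle(i) returns the label of
    pool item i (stand-in for the human analyst). y_init must contain both classes."""
    X_lab, y_lab = list(X_init), list(y_init)
    unqueried = list(range(len(X_pool)))
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_queries):
        clf.fit(np.array(X_lab), np.array(y_lab))
        proba = clf.predict_proba(X_pool[unqueried])
        # Least-certain item: predicted probability closest to 0.5.
        idx = unqueried[int(np.argmin(np.abs(proba[:, 1] - 0.5)))]
        X_lab.append(X_pool[idx]); y_lab.append(oracle(idx))
        unqueried.remove(idx)
    return clf.fit(np.array(X_lab), np.array(y_lab))
```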
Automatic measurement of voice onset time using discriminative structured prediction.
Sonderegger, Morgan; Keshet, Joseph
2012-12-01
A discriminative large-margin algorithm for automatic measurement of voice onset time (VOT) is described, considered as a case of predicting structured output from speech. Manually labeled data are used to train a function that takes as input a speech segment of an arbitrary length containing a voiceless stop, and outputs its VOT. The function is explicitly trained to minimize the difference between predicted and manually measured VOT; it operates on a set of acoustic feature functions designed based on spectral and temporal cues used by human VOT annotators. The algorithm is applied to initial voiceless stops from four corpora, representing different types of speech. Using several evaluation methods, the algorithm's performance is near human intertranscriber reliability, and compares favorably with previous work. Furthermore, the algorithm's performance is minimally affected by training and testing on different corpora, and remains essentially constant as the amount of training data is reduced to 50-250 manually labeled examples, demonstrating the method's practical applicability to new datasets.
Learning to merge: a new tool for interactive mapping
NASA Astrophysics Data System (ADS)
Porter, Reid B.; Lundquist, Sheng; Ruggiero, Christy
2013-05-01
The task of turning raw imagery into semantically meaningful maps and overlays is a key area of remote sensing activity. Image analysts, in applications ranging from environmental monitoring to intelligence, use imagery to generate and update maps of terrain, vegetation, road networks, buildings and other relevant features. Often these tasks can be cast as a pixel labeling problem, and several interactive pixel labeling tools have been developed. These tools exploit training data, which is generated by analysts using simple and intuitive paint-program annotation tools, in order to tailor the labeling algorithm for the particular dataset and task. In other cases, the task is best cast as a pixel segmentation problem. Interactive pixel segmentation tools have also been developed, but these tools typically do not learn from training data like the pixel labeling tools do. In this paper we investigate tools for interactive pixel segmentation that also learn from user input. The input has the form of segment merging (or grouping). Merging examples are 1) easily obtained from analysts using vector annotation tools, and 2) more challenging to exploit than traditional labels. We outline the key issues in developing these interactive merging tools, and describe their application to remote sensing.
SemiBoost: boosting for semi-supervised learning.
Mallapragada, Pavan Kumar; Jin, Rong; Jain, Anil K; Liu, Yi
2009-11-01
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this the semi-supervised improvement problem, to distinguish the proposed approach from the existing approaches. We design a meta-semi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. The key advantages of the proposed semi-supervised learning approach are: 1) performance improvement of any supervised learning algorithm with a multitude of unlabeled data, 2) efficient computation by the iterative boosting algorithm, and 3) exploiting both the manifold and cluster assumptions in training classification models. An empirical study on 16 different data sets and text categorization demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to the state-of-the-art semi-supervised learning algorithms.
Learning about individuals' health from aggregate data.
Colbaugh, Rich; Glass, Kristin
2017-07-01
There is growing awareness that user-generated social media content contains valuable health-related information and is more convenient to collect than typical health data. For example, Twitter has been employed to predict aggregate-level outcomes, such as regional rates of diabetes and child poverty, and to identify individual cases of depression and food poisoning. Models which make aggregate-level inferences can be induced from aggregate data, and consequently are straightforward to build. In contrast, learning models that produce individual-level (IL) predictions, which are more informative, usually requires a large number of difficult-to-acquire labeled IL examples. This paper presents a new machine learning method which achieves the best of both worlds, enabling IL models to be learned from aggregate labels. The algorithm makes predictions by combining unsupervised feature extraction, aggregate-based modeling, and optimal integration of aggregate-level and IL information. Two case studies illustrate how to learn health-relevant IL prediction models using only aggregate labels, and show that these models perform as well as state-of-the-art models trained on hundreds or thousands of labeled individuals.
ERIC Educational Resources Information Center
Brunila, Kristiina; Ryynänen, Sanna
2017-01-01
Young people labelled "disadvantaged" or "at risk of social exclusion" are increasingly directed into publicly funded or NGO-based, partly privately financed projects in order to secure their desired integration into society through work or further education. In this article, we carry out a comparative analysis of youth…
Cross-domain question classification in community question answering via kernel mapping
NASA Astrophysics Data System (ADS)
Su, Lei; Hu, Zuoliang; Yang, Bin; Li, Yiyang; Chen, Jun
2015-10-01
An increasingly popular way of retrieving information is via community question answering (CQA) systems such as Yahoo! Answers and Baidu Knows. In CQA, question classification plays an important role in finding the answers. However, labeled training examples for a statistical question classifier are fairly expensive to obtain, as they require experienced human effort. Meanwhile, unlabeled data are readily available. This paper employs domain adaptation via kernel mapping to solve this problem. In detail, the kernel approach is utilized to map the target-domain data and the source-domain data into a common space, where the question classifiers are trained under closer conditional probabilities. The kernel mapping function is constructed from domain knowledge. Therefore, domain knowledge can be transferred from the labeled examples in the source domain to the unlabeled ones in the target domain. The statistical training model can be improved by using a large amount of unlabeled data. Meanwhile, the Hadoop platform is used to construct the mapping mechanism and reduce the time complexity: MapReduce enables the kernel mapping for domain adaptation to run in parallel. Experimental results show that the accuracy of question classification can be improved by the kernel mapping method. Furthermore, the parallel method on the Hadoop platform can effectively schedule computing resources to reduce the running time.
Active learning of neuron morphology for accurate automated tracing of neurites
Gala, Rohan; Chapeton, Julio; Jitesh, Jayant; Bhavsar, Chintan; Stepanyants, Armen
2014-01-01
Automating the process of neurite tracing from light microscopy stacks of images is essential for large-scale or high-throughput quantitative studies of neural circuits. While the general layout of labeled neurites can be captured by many automated tracing algorithms, it is often not possible to differentiate reliably between the processes belonging to different cells. The reason is that some neurites in the stack may appear broken due to imperfect labeling, while others may appear fused due to the limited resolution of optical microscopy. Trained neuroanatomists routinely resolve such topological ambiguities during manual tracing tasks by combining information about distances between branches, branch orientations, intensities, calibers, tortuosities, colors, as well as the presence of spines or boutons. Likewise, to evaluate different topological scenarios automatically, we developed a machine learning approach that combines many of the above-mentioned features. A specifically designed confidence measure was used to actively train the algorithm during the user-assisted tracing procedure. Active learning significantly reduces the training time and makes it possible to obtain less than 1% generalization error rates with only a few training examples. To evaluate the overall performance of the algorithm, a number of image stacks were reconstructed automatically, as well as manually by several trained users, making it possible to compare the automated traces to the baseline inter-user variability. Several geometrical and topological features of the traces were selected for the comparisons. These features include the total trace length, the total numbers of branch and terminal points, the affinity of corresponding traces, and the distances between corresponding branch and terminal points. Our results show that when the density of labeled neurites is sufficiently low, automated traces are not significantly different from manual reconstructions obtained by trained users. PMID:24904306
Ship detection leveraging deep neural networks in WorldView-2 images
NASA Astrophysics Data System (ADS)
Yamamoto, T.; Kazama, Y.
2017-10-01
Interpretation of high-resolution satellite images has been difficult enough that skilled interpreters have had to check the images manually, for two reasons. One is the requirement for a high detection accuracy rate. The other is the variety of the targets: taking ships as an example, there are many kinds of ships, such as boats, cruise ships, cargo ships, aircraft carriers, and so on. Furthermore, there are objects of similar appearance throughout the image; therefore, it is often difficult even for skilled interpreters to distinguish which object the pixels really compose. In this paper, we explore the feasibility of object extraction leveraging deep learning with high-resolution satellite images, focusing especially on ship detection. We calculated the detection accuracy using WorldView-2 images. First, we collected training images labelled as "ship" and "not ship". After preparing the training data, we defined a deep neural network model to judge whether ships are present or not and trained it with about 50,000 training images for each label. Subsequently, we scanned the evaluation image with windows of different resolutions and extracted the "ship" images. Experimental results show the effectiveness of the deep learning based object detection.
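The sliding-window scanning step can be sketched as follows, assuming a trained binary classifier with a scikit-learn-style `predict_proba` on flattened patches; the window and stride values are placeholders, and the paper's actual model is a deep network applied at multiple window resolutions.

```python
# Scan an image with a fixed-size window and collect offsets the classifier
# labels as "ship". The model is assumed to have been trained on flattened
# patches of the same window size.
import numpy as np

def scan_image(image, model, window=64, stride=32, threshold=0.5):
    """Return (row, col) offsets of windows classified as 'ship'."""
    detections = []
    rows, cols = image.shape[:2]
    for r in range(0, rows - window + 1, stride):
        for c in range(0, cols - window + 1, stride):
            patch = image[r:r + window, c:c + window]
            p_ship = model.predict_proba(patch.reshape(1, -1))[0, 1]
            if p_ship >= threshold:
                detections.append((r, c))
    return detections
```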
Automatic threshold selection for multi-class open set recognition
NASA Astrophysics Data System (ADS)
Scherreik, Matthew; Rigling, Brian
2017-05-01
Multi-class open set recognition is the problem of supervised classification with additional unknown classes encountered after a model has been trained. An open set classifier often has two core components. The first component is a base classifier which estimates the most likely class of a given example. The second component consists of open set logic which estimates if the example is truly a member of the candidate class. Such a system is operated in a feed-forward fashion. That is, a candidate label is first estimated by the base classifier, and the true membership of the example to the candidate class is estimated afterward. Previous works have developed an iterative threshold selection algorithm for rejecting examples from classes which were not present at training time. In those studies, a Platt-calibrated SVM was used as the base classifier, and the thresholds were applied to class posterior probabilities for rejection. In this work, we investigate the effectiveness of other base classifiers when paired with the threshold selection algorithm and compare their performance with the original SVM solution.
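A sketch of the feed-forward open-set logic with a Platt-calibrated SVM; the per-class thresholds are set by hand here rather than by the iterative threshold-selection algorithm the paper studies.

```python
# Feed-forward open-set recognition: the base classifier proposes a candidate
# class, and a per-class posterior threshold accepts it or declares "unknown".
import numpy as np
from sklearn.svm import SVC
from sklearn.calibration import CalibratedClassifierCV

def fit_open_set(X_train, y_train):
    # Platt (sigmoid) calibration of an RBF SVM's decision scores.
    return CalibratedClassifierCV(SVC(kernel="rbf"), method="sigmoid", cv=3).fit(
        X_train, y_train)

def predict_open_set(clf, X, thresholds):
    """thresholds: dict mapping class label -> minimum accepted posterior."""
    proba = clf.predict_proba(X)
    labels = []
    for p in proba:
        k = int(np.argmax(p))
        cand = clf.classes_[k]
        labels.append(cand if p[k] >= thresholds[cand] else "unknown")
    return labels
```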
Transfer Learning for Class Imbalance Problems with Inadequate Data.
Al-Stouhi, Samir; Reddy, Chandan K
2016-07-01
A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data is not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that will potentially take advantage of auxiliary data using a transfer learning mechanism and simultaneously build a robust classifier to tackle this imbalance issue in the presence of few training samples in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient, and in this paper we develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting based instance-transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply it to healthcare and text classification applications.
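As a related illustration, the sketch below implements a TrAdaBoost-style instance-transfer loop (after Dai et al.), which shares the boosting-based instance-transfer idea but omits the authors' label-dependent update for class imbalance; labels are assumed to be 0/1, and the final weighted-vote prediction step is left out for brevity.

```python
# Instance-transfer boosting sketch: auxiliary (source) instances lose weight
# when misclassified, while hard target-domain instances gain weight.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def instance_transfer_boost(X_src, y_src, X_tgt, y_tgt, n_rounds=10):
    X = np.vstack([X_src, X_tgt]); y = np.concatenate([y_src, y_tgt])
    n_src = len(X_src)
    w = np.ones(len(X)) / len(X)
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_rounds))
    learners, betas = [], []
    for _ in range(n_rounds):
        clf = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        err = np.abs(clf.predict(X) - y)             # 0/1 error per instance
        eps = np.sum(w[n_src:] * err[n_src:]) / np.sum(w[n_src:])
        eps = np.clip(eps, 1e-6, 0.499)              # keep beta_t well defined
        beta_t = eps / (1.0 - eps)
        w[:n_src] *= beta_src ** err[:n_src]         # down-weight bad source items
        w[n_src:] *= beta_t ** (-err[n_src:])        # up-weight hard target items
        w /= w.sum()
        learners.append(clf); betas.append(beta_t)
    return learners, betas
```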
Label Review Training: Module 1: Label Basics, Page 21
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. Learn about types of labels.
Label Review Training: Module 1: Label Basics, Page 20
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. This section focuses on supplemental labeling.
Label Review Training: Module 1: Label Basics, Page 22
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. Learn about what labels require review.
Label Review Training: Module 1: Label Basics, Page 18
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. This section discusses the types of labels.
Label Review Training: Module 1: Label Basics, Page 26
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. Learn about mandatory and advisory label statements.
Label Review Training: Module 1: Label Basics, Page 19
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. This section covers supplemental distributor labeling.
Label Review Training: Module 1: Label Basics, Page 15
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. Learn about the consequences of improper labeling.
Label Review Training: Module 1: Label Basics, Page 14
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. Learn about positive effects from proper labeling.
Label Review Training: Module 1: Label Basics, Page 24
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. This page is about which labels require review.
Label Review Training: Module 1: Label Basics, Page 17
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. See an overview of the importance of labels.
Label Review Training: Module 1: Label Basics, Page 23
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. Lists types of labels that do not require review.
Label Review Training: Module 1: Label Basics, Page 16
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. Learn about the importance of labels and the role in enforcement.
Co-Labeling for Multi-View Weakly Labeled Learning.
Xu, Xinxing; Li, Wen; Xu, Dong; Tsang, Ivor W
2016-06-01
It is often expensive and time consuming to collect labeled training samples in many real-world applications. To reduce human effort on annotating training samples, many machine learning techniques (e.g., semi-supervised learning (SSL), multi-instance learning (MIL), etc.) have been studied to exploit weakly labeled training samples. Meanwhile, when the training data is represented with multiple types of features, many multi-view learning methods have shown that classifiers trained on different views can help each other to better utilize the unlabeled training samples for the SSL task. In this paper, we study a new learning problem called multi-view weakly labeled learning, in which we aim to develop a unified approach to learn robust classifiers by effectively utilizing different types of weakly labeled multi-view data from a broad range of tasks including SSL, MIL and relative outlier detection (ROD). We propose an effective approach called co-labeling to solve the multi-view weakly labeled learning problem. Specifically, we model the learning problem on each view as a weakly labeled learning problem, which aims to learn an optimal classifier from a set of pseudo-label vectors generated by using the classifiers trained from other views. Unlike traditional co-training approaches using a single pseudo-label vector for training each classifier, our co-labeling approach explores different strategies to utilize the predictions from different views, biases and iterations for generating the pseudo-label vectors, making our approach more robust for real-world applications. Moreover, to further improve the weakly labeled learning on each view, we also exploit the inherent group structure in the pseudo-label vectors generated from different strategies, which leads to a new multi-layer multiple kernel learning problem. Promising results for text-based image retrieval on the NUS-WIDE dataset as well as news classification and text categorization on several real-world multi-view datasets clearly demonstrate that our proposed co-labeling approach achieves state-of-the-art performance for various multi-view weakly labeled learning problems including multi-view SSL, multi-view MIL and multi-view ROD.
Self-assessed performance improves statistical fusion of image labels
Bryan, Frederick W.; Xu, Zhoubing; Asman, Andrew J.; Allen, Wade M.; Reich, Daniel S.; Landman, Bennett A.
2014-01-01
Purpose: Expert manual labeling is the gold standard for image segmentation, but this process is difficult, time-consuming, and prone to inter-individual differences. While fully automated methods have successfully targeted many anatomies, automated methods have not yet been developed for numerous essential structures (e.g., the internal structure of the spinal cord as seen on magnetic resonance imaging). Collaborative labeling is a new paradigm that offers a robust alternative that may realize both the throughput of automation and the guidance of experts. Yet, distributing manual labeling expertise across individuals and sites introduces potential human factors concerns (e.g., training, software usability) and statistical considerations (e.g., fusion of information, assessment of confidence, bias) that must be further explored. During the labeling process, it is simple to ask raters to self-assess the confidence of their labels, but this is rarely done and has not been previously quantitatively studied. Herein, the authors explore the utility of self-assessment in relation to automated assessment of rater performance in the context of statistical fusion. Methods: The authors conducted a study of 66 volumes manually labeled by 75 minimally trained human raters recruited from the university undergraduate population. Raters were given 15 min of training during which they were shown examples of correct segmentation, and the online segmentation tool was demonstrated. The volumes were labeled 2D slice-wise, and the slices were unordered. A self-assessed quality metric was produced by raters for each slice by marking a confidence bar superimposed on the slice. Volumes produced by both voting and statistical fusion algorithms were compared against a set of expert segmentations of the same volumes. Results: Labels for 8825 distinct slices were obtained. Simple majority voting resulted in statistically poorer performance than voting weighted by self-assessed performance. Statistical fusion resulted in statistically indistinguishable performance from self-assessed weighted voting. The authors developed a new theoretical basis for using self-assessed performance in the framework of statistical fusion and demonstrated that the combined sources of information (both statistical assessment and self-assessment) yielded statistically significant improvement over the methods considered separately. Conclusions: The authors present the first systematic characterization of self-assessed performance in manual labeling. The authors demonstrate that self-assessment and statistical fusion yield similar, but complementary, benefits for label fusion. Finally, the authors present a new theoretical basis for combining self-assessments with statistical label fusion. PMID:24593721
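The contrast between simple majority voting and voting weighted by self-assessed confidence can be shown in a few lines; the numbers below are made up for illustration.

```python
# Fusing five raters' labels for one voxel: simple majority vs. voting
# weighted by each rater's self-assessed confidence.
import numpy as np

def majority_vote(labels):
    return int(np.bincount(labels).argmax())

def confidence_weighted_vote(labels, confidences):
    votes = np.zeros(labels.max() + 1)
    for lab, conf in zip(labels, confidences):
        votes[lab] += conf
    return int(votes.argmax())

labels      = np.array([1, 1, 0, 0, 0])              # five raters' labels
confidences = np.array([0.9, 0.8, 0.3, 0.2, 0.2])    # self-assessed quality
print(majority_vote(labels))                          # 0 (simple majority)
print(confidence_weighted_vote(labels, confidences))  # 1 (confident raters win)
```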
Multi-instance multi-label distance metric learning for genome-wide protein function prediction.
Xu, Yonghui; Min, Huaqing; Song, Hengjie; Wu, Qingyao
2016-08-01
Multi-instance multi-label (MIML) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with not only multiple instances but also multiple class labels. To find an appropriate MIML learning method for genome-wide protein function prediction, many studies in the literature attempted to optimize objective functions in which dissimilarity between instances is measured using the Euclidean distance. But in many real applications, Euclidean distance may be unable to capture the intrinsic similarity/dissimilarity in feature space and label space. Unlike other previous approaches, in this paper, we propose to learn a multi-instance multi-label distance metric learning framework (MIMLDML) for genome-wide protein function prediction. Specifically, we learn a Mahalanobis distance to preserve and utilize the intrinsic geometric information of both feature space and label space for MIML learning. In addition, we try to deal with the sparsely labeled data by giving weight to the labeled data. Extensive experiments on seven real-world organisms covering the biological three-domain system (i.e., archaea, bacteria, and eukaryote; Woese et al., 1990) show that the MIMLDML algorithm is superior to most state-of-the-art MIML learning algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
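For reference, the Mahalanobis distance at the core of such a metric-learning framework has the form d_M(x, y) = sqrt((x - y)^T M (x - y)); a small sketch with an example (not learned) positive semi-definite matrix M follows.

```python
# Mahalanobis distance with a given metric matrix M. In MIMLDML the matrix is
# learned so that distances reflect label-space similarity; here M is fixed
# purely for illustration.
import numpy as np

def mahalanobis(x, y, M):
    d = x - y
    return float(np.sqrt(d @ M @ d))

M = np.array([[2.0, 0.0], [0.0, 0.5]])     # stand-in for a learned metric
x, y = np.array([1.0, 3.0]), np.array([2.0, 1.0])
print(mahalanobis(x, y, M))                # sqrt(2*1 + 0.5*4) = 2.0
```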
To call a cloud 'cirrus': sound symbolism in names for categories or items.
Ković, Vanja; Sučević, Jelena; Styles, Suzy J
2017-01-01
The aim of the present paper is to experimentally test whether sound symbolism has selective effects on labels with different ranges-of-reference within a simple noun-hierarchy. In two experiments, adult participants learned the make-up of two categories of unfamiliar objects ('alien life forms'), and were passively exposed to either category-labels or item-labels, in a learning-by-guessing categorization task. Following category training, participants were tested on their visual discrimination of object pairs. For different groups of participants, the labels were either congruent or incongruent with the objects. In Experiment 1, when trained on items with individual labels, participants were worse (made more errors) at detecting visual object mismatches when trained labels were incongruent. In Experiment 2, when participants were trained on items in labelled categories, participants were faster at detecting a match if the trained labels were congruent, and faster at detecting a mismatch if the trained labels were incongruent. This pattern of results suggests that sound symbolism in category labels facilitates later similarity judgements when congruent, and discrimination when incongruent, whereas for item labels incongruence generates error in judgements of visual object differences. These findings reveal that sound symbolic congruence has a different outcome at different levels of labelling within a noun hierarchy. These effects emerged in the absence of the label itself, indicating subtle but pervasive effects on visual object processing.
Learning classification models with soft-label information.
Nguyen, Quang; Valizadegan, Hamed; Hauskrecht, Milos
2014-01-01
Learning of classification models in medicine often relies on data labeled by a human expert. Since labeling of clinical data may be time-consuming, finding ways of alleviating the labeling costs is critical for our ability to automatically learn such models. In this paper we propose a new machine learning approach that is able to learn improved binary classification models more efficiently by refining the binary class information in the training phase with soft labels that reflect how strongly the human expert feels about the original class labels. Two types of methods that can learn improved binary classification models from soft labels are proposed. The first relies on probabilistic/numeric labels, the other on ordinal categorical labels. We study and demonstrate the benefits of these methods for learning an alerting model for heparin-induced thrombocytopenia. The experiments are conducted on the data of 377 patient instances labeled by three different human experts. The methods are compared using the area under the receiver operating characteristic curve (AUC) score. Our AUC results show that the new approach is capable of learning classification models more efficiently compared to traditional learning methods. The improvement in AUC is most remarkable when the number of examples we learn from is small. A new classification learning framework that lets us learn from auxiliary soft-label information provided by a human expert is a promising new direction for learning classification models from expert labels, reducing the time and cost needed to label data.
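One generic way to feed probabilistic soft labels to a standard binary learner, in the spirit of the approach above (not the paper's exact method), is to give each instance a positive copy weighted by the expert's confidence p and a negative copy weighted by 1 - p.

```python
# Soft-label reduction: duplicate each instance with complementary weights so
# that a weighted binary learner sees the expert's graded confidence.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_with_soft_labels(X, soft_labels):
    X2 = np.vstack([X, X])
    y2 = np.concatenate([np.ones(len(X)), np.zeros(len(X))])
    w2 = np.concatenate([soft_labels, 1.0 - soft_labels])
    return LogisticRegression(max_iter=1000).fit(X2, y2, sample_weight=w2)

X = np.array([[0.1], [0.4], [0.6], [0.9]])
p = np.array([0.05, 0.30, 0.70, 0.95])   # expert's soft confidence in class 1
clf = fit_with_soft_labels(X, p)
```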
Label Review Training: Module 1: Label Basics, Page 25
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review: clarity, accuracy, consistency with EPA policy, and enforceability.
Label Review Training: Module 1: Label Basics, Page 29
This module of the pesticide label review training provides basic information about pesticides, their labeling and regulation, and the core principles of pesticide label review. This page is a quiz on Module 1.
On evaluating clustering procedures for use in classification
NASA Technical Reports Server (NTRS)
Pore, M. D.; Moritz, T. E.; Register, D. T.; Yao, S. S.; Eppler, W. G. (Principal Investigator)
1979-01-01
The problem of evaluating clustering algorithms and their respective computer programs for use in a preprocessing step for classification is addressed. In clustering for classification, the probability of correct classification is suggested as the ultimate measure of accuracy on training data. A means of implementing this criterion and a measure of cluster purity are discussed. Examples are given. A procedure for cluster labeling that is based on cluster purity and sample size is presented.
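A small sketch of cluster labeling by majority class and the associated purity measure on training data; the arrays are toy examples, and class labels are assumed to be non-negative integers.

```python
# Label each cluster with its dominant training class; purity is the fraction
# of the cluster's training samples belonging to that class.
import numpy as np

def label_clusters(cluster_ids, class_labels):
    """Return {cluster: (assigned_class, purity, size)} from training data."""
    out = {}
    for c in np.unique(cluster_ids):
        members = class_labels[cluster_ids == c]
        counts = np.bincount(members)
        out[int(c)] = (int(counts.argmax()),
                       float(counts.max() / len(members)),
                       len(members))
    return out

clusters = np.array([0, 0, 0, 1, 1, 1, 1])
classes  = np.array([2, 2, 1, 0, 0, 0, 1])
print(label_clusters(clusters, classes))
# {0: (2, 0.666..., 3), 1: (0, 0.75, 4)}
```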
Pesticide Label Review Training
This training will help ensure that reviewers evaluate labels according to four core principles. It also will help pesticide registrants developing labels understand what EPA expects of pesticide labels, and what the Agency generally finds acceptable.
High Class-Imbalance in pre-miRNA Prediction: A Novel Approach Based on deepSOM.
Stegmayer, Georgina; Yones, Cristian; Kamenetzky, Laura; Milone, Diego H
2017-01-01
The computational prediction of novel microRNA within a full genome involves identifying sequences having the highest chance of being a miRNA precursor (pre-miRNA). These sequences are usually named candidates to miRNA. The well-known pre-miRNAs are usually few in comparison to the hundreds of thousands of potential candidates to miRNA that have to be analyzed, which makes this task a high class-imbalance classification problem. The classical way of approaching it has been to train a binary classifier in a supervised manner, using well-known pre-miRNAs as the positive class and artificially defining the negative class. However, although the selection of positive labeled examples is straightforward, it is very difficult to build a set of negative examples in order to obtain a good set of training samples for a supervised method. In this work, we propose a novel and effective way of approaching this problem using machine learning, without the definition of negative examples. The proposal is based on clustering unlabeled sequences of a genome together with well-known miRNA precursors for the organism under study, which allows for the quick identification of the best candidates to miRNA as those sequences clustered with known precursors. Furthermore, we propose a deep model to overcome the problem of having very few positive class labels. These are always maintained in the deep levels as the positive class while less likely pre-miRNA sequences are filtered out level after level. Our approach has been compared with other methods for pre-miRNA prediction in several species, showing effective prediction of novel miRNAs. Additionally, we show that our approach has a lower training time and allows for better graphical navigability and interpretation of the results. A web-demo interface to try deepSOM is available at http://fich.unl.edu.ar/sinc/web-demo/deepsom/.
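A hedged sketch of the label-free candidate selection idea, using k-means as a simple stand-in for the deep SOM: unlabeled sequences (represented as feature vectors) that share a cluster with known pre-miRNAs are flagged as the best candidates.

```python
# Cluster unlabeled sequences together with known pre-miRNAs; candidates are
# the unlabeled sequences falling in clusters that contain known precursors.
# n_clusters must not exceed the total number of sequences.
import numpy as np
from sklearn.cluster import KMeans

def candidates_by_clustering(X_unlabeled, X_known_premirna, n_clusters=50):
    X = np.vstack([X_unlabeled, X_known_premirna])
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    unlab_clusters = km.labels_[:len(X_unlabeled)]
    known_clusters = set(km.labels_[len(X_unlabeled):])
    # Indices (into X_unlabeled) of the best candidates to miRNA.
    return np.where(np.isin(unlab_clusters, list(known_clusters)))[0]
```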
Improved semi-supervised online boosting for object tracking
NASA Astrophysics Data System (ADS)
Li, Yicui; Qi, Lin; Tan, Shukun
2016-10-01
Online semi-supervised boosting treats object tracking as a classification problem and has the advantage of training a binary classifier from both labeled and unlabeled examples, with appropriate object features selected based on real-time changes in the object. However, the method faces one key problem: traditional self-training, which uses the classification results to update the classifier itself, often leads to drifting or tracking failure because errors accumulate with each update of the tracker. To overcome these disadvantages, the contribution of this paper is an improved online semi-supervised boosting method in which the learning process is guided by positive (P) and negative (N) constraints, termed P-N constraints, that restrict the labeling of the unlabeled samples. First, we train the classifier by online semi-supervised boosting. Then, this classifier is used to process the next frame. Finally, the classifier is analyzed by the P-N constraints, which verify whether the labels assigned to unlabeled data by the classifier are in line with the assumptions made about positive and negative samples. The proposed algorithm can effectively improve the discriminative ability of the classifier and significantly alleviate the drifting problem in tracking applications. In experiments, we demonstrate real-time tracking on several challenging test sequences, where our tracker outperforms other related online tracking methods and achieves promising performance.
Relation Extraction with Weak Supervision and Distributional Semantics
2013-05-01
Search-result snippet (fragmentary): the work addresses relation extraction between entity pairs such as a country and an organization it is (or is no longer) a member of, a player and an event, or a team and a sport, with example pairs including <Zimbabwe, the Commonwealth>, <American forces, Vietnam>, <Roman Legions, Britain>, and <Brandon Bass, the NBA draft>. The authors found that dealing with incorrectly labeled training examples is critical for success and develop a latent Bayesian framework for this purpose.
NASA Astrophysics Data System (ADS)
Gandikota, Dhanuj; Hadjiiski, Lubomir; Cha, Kenny H.; Chan, Heang-Ping; Caoili, Elaine M.; Cohan, Richard H.; Weizer, Alon; Alva, Ajjai; Paramagul, Chintana; Wei, Jun; Zhou, Chuan
2018-02-01
In bladder cancer, stage T2 is an important threshold in the decision of administering neoadjuvant chemotherapy. Our long-term goal is to develop a quantitative computerized decision support system (CDSS-S) to aid clinicians in accurate staging. In this study, we examined the effect of stage labels of the training samples on modeling such a system. We used a data set of 84 bladder cancers imaged with CT Urography (CTU). At clinical staging prior to treatment, 43 lesions were staged as below stage T2 and 41 were stage T2 or above. After cystectomy and pathological staging that is considered the gold standard, 10 of the lesions were upstaged to stage T2 or above. After correcting the stage labels, 33 lesions were below stage T2, and 51 were stage T2 or above. For the CDSS-S, the lesions were segmented using our AI-CALS method and radiomic features were extracted. We trained a linear discriminant analysis (LDA) classifier with leave-one-case-out cross validation to distinguish between bladder lesions of stage T2 or above and those below stage T2. The CDSS-S was trained and tested with the corrected post-cystectomy labels, and as a comparison, CDSS-S was also trained with understaged pre-treatment labels and tested on lesions with corrected labels. The test AUC for the CDSS-S trained with corrected labels was 0.89 +/- 0.04. For the CDSS-S trained with understaged pre-treatment labels and tested on the lesions with corrected labels, the test AUC was 0.86 +/- 0.04. The likelihood of stage T2 or above for 9 out of the 10 understaged lesions was correctly increased for the CDSS-S trained with corrected labels. The CDSS-S is sensitive to the accuracy of stage labeling. The CDSS-S trained with correct labels shows promise in prediction of the bladder cancer stage.
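As an illustration of the evaluation protocol described above (not the authors' code), a leave-one-case-out loop for a linear discriminant analysis classifier over precomputed radiomic features might look like the following; feature extraction and the corrected stage labels are assumed to be given.

```python
# Sketch of leave-one-case-out LDA evaluation on radiomic feature vectors.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut

def loo_lda_scores(features, labels):
    """features: (n_lesions, n_features) radiomics; labels: 0 (<T2) or 1 (>=T2)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    scores = np.empty(len(labels), dtype=float)
    for train_idx, test_idx in LeaveOneOut().split(features):
        lda = LinearDiscriminantAnalysis().fit(features[train_idx], labels[train_idx])
        scores[test_idx] = lda.decision_function(features[test_idx])
    return scores  # e.g. feed into sklearn.metrics.roc_auc_score(labels, scores)
```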
Label Review Training: Module 3: Special Issues, Page 12
This module further describes and provides strategies for reviewing some of the label parts introduced in Module 2 of the pesticide label training, such as precautionary statements, directions for use, worker protection labeling, and more.
Label Review Training: Module 3: Special Issues, Page 23
This module further describes and provides strategies for reviewing some of the label parts introduced in Module 2 of the pesticide label training, such as precautionary statements, directions for use, worker protection labeling, and more.
Label Review Training: Module 3: Special Issues, Page 3
This module further describes and provides strategies for reviewing some of the label parts introduced in Module 2 of the pesticide label training, such as precautionary statements, directions for use, worker protection labeling, and more.
Label Review Training: Module 3: Special Issues, Page 9
This module further describes and provides strategies for reviewing some of the label parts introduced in Module 2 of the pesticide label training, such as precautionary statements, directions for use, worker protection labeling, and more.
Label Review Training: Module 1: Label Basics, Page 7
Page 7, Label Training. Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health.
Temporal Cyber Attack Detection.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ingram, Joey Burton; Draelos, Timothy J.; Galiardi, Meghan
Rigorous characterization of the performance and generalization ability of cyber defense systems is extremely difficult, making it hard to gauge uncertainty, and thus, confidence. This difficulty largely stems from a lack of labeled attack data that fully explores the potential adversarial space. Currently, performance of cyber defense systems is typically evaluated in a qualitative manner by manually inspecting the results of the system on live data and adjusting as needed. Additionally, machine learning has shown promise in deriving models that automatically learn indicators of compromise that are more robust than analyst-derived detectors. However, to generate these models, most algorithms require large amounts of labeled data (i.e., examples of attacks). Algorithms that do not require annotated data to derive models are similarly at a disadvantage, because labeled data is still necessary when evaluating performance. In this work, we explore the use of temporal generative models to learn cyber attack graph representations and automatically generate data for experimentation and evaluation. Training and evaluating cyber systems and machine learning models requires significant, annotated data, which is typically collected and labeled by hand for one-off experiments. Automatically generating such data helps derive/evaluate detection models and ensures reproducibility of results. Experimentally, we demonstrate the efficacy of generative sequence analysis techniques on learning the structure of attack graphs, based on a realistic example. These derived models can then be used to generate more data. Additionally, we provide a roadmap for future research efforts in this area.
Learning with imperfectly labeled patterns
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
The problem of learning in pattern recognition using imperfectly labeled patterns is considered. The performance of the Bayes and nearest neighbor classifiers with imperfect labels is discussed using a probabilistic model for the mislabeling of the training patterns. Schemes for training the classifier using both parametric and nonparametric techniques are presented. Methods for the correction of imperfect labels are developed. To gain an understanding of the learning process, expressions are derived for the success probability as a function of training time for a one-dimensional increment error correction classifier with imperfect labels. Feature selection with imperfectly labeled patterns is described.
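One standard way to formalize such a probabilistic mislabeling model (a hedged illustration, not necessarily the paper's exact scheme) is to assume a known label-flip matrix T with T[i, j] = P(observed label j | true label i); posteriors estimated against the noisy labels can then be mapped back to true-class posteriors by solving a small linear system.

```python
# Illustrative correction of posteriors under a known label-flip matrix T.
import numpy as np

def correct_posteriors(p_observed, T):
    """p_observed: (n_samples, n_classes) posteriors w.r.t. noisy labels.
    T[i, j] = P(observed label j | true label i), assumed known and invertible."""
    p_true = np.linalg.solve(T.T, np.asarray(p_observed, dtype=float).T).T
    p_true = np.clip(p_true, 0.0, None)            # guard against small negatives
    return p_true / p_true.sum(axis=1, keepdims=True)
```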
A machine learning pipeline for automated registration and classification of 3D lidar data
NASA Astrophysics Data System (ADS)
Rajagopal, Abhejit; Chellappan, Karthik; Chandrasekaran, Shivkumar; Brown, Andrew P.
2017-05-01
Despite the large availability of geospatial data, registration and exploitation of these datasets remains a persistent challenge in geoinformatics. Popular signal processing and machine learning algorithms, such as non-linear SVMs and neural networks, rely on well-formatted input models as well as reliable output labels, which are not always immediately available. In this paper we outline a pipeline for gathering, registering, and classifying initially unlabeled wide-area geospatial data. As an illustrative example, we demonstrate the training and testing of a convolutional neural network to recognize 3D models in the OGRIP 2007 LiDAR dataset using fuzzy labels derived from OpenStreetMap as well as other datasets available on OpenTopography.org. When auxiliary label information is required, various text and natural language processing filters are used to extract and cluster keywords useful for identifying potential target classes. A subset of these keywords are subsequently used to form multi-class labels, with no assumption of independence. Finally, we employ class-dependent geometry extraction routines to identify candidates from both training and testing datasets. Our regression networks are able to identify the presence of 6 structural classes, including roads, walls, and buildings, in volumes as big as 8000 m3 in as little as 1.2 seconds on a commodity 4-core Intel CPU. The presented framework is neither dataset nor sensor-modality limited due to the registration process, and is capable of multi-sensor data-fusion.
Generalization error analysis: deep convolutional neural network in mammography
NASA Astrophysics Data System (ADS)
Richter, Caleb D.; Samala, Ravi K.; Chan, Heang-Ping; Hadjiiski, Lubomir; Cha, Kenny
2018-02-01
We conducted a study to gain understanding of the generalizability of deep convolutional neural networks (DCNNs) given their inherent capability to memorize data. We examined empirically a specific DCNN trained for classification of masses on mammograms. Using a data set of 2,454 lesions from 2,242 mammographic views, a DCNN was trained to classify masses into malignant and benign classes using transfer learning from ImageNet LSVRC-2010. We performed experiments with varying amounts of label corruption and types of pixel randomization to analyze the generalization error for the DCNN. Performance was evaluated using the area under the receiver operating characteristic curve (AUC) with an N-fold cross validation. Comparisons were made between the convergence times, the inference AUCs for both the training set and the test set of the original image patches without corruption, and the root-mean-squared difference (RMSD) in the layer weights of the DCNN trained with different amounts and methods of corruption. Our experiments observed trends which revealed that the DCNN overfitted by memorizing corrupted data. More importantly, this study improved our understanding of DCNN weight updates when learning new patterns or new labels. Although we used a specific classification task with the ImageNet as example, similar methods may be useful for analysis of the DCNN learning processes, especially those that employ transfer learning for medical image analysis where sample size is limited and overfitting risk is high.
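A simple sketch of the kind of label-corruption experiment described above: a chosen fraction of training labels is reassigned uniformly at random before retraining. Parameter names are illustrative.

```python
# Illustrative label-corruption helper for generalization-error experiments.
import numpy as np

def corrupt_labels(y, fraction, n_classes, seed=None):
    """Reassign a given fraction of labels uniformly at random."""
    rng = np.random.default_rng(seed)
    y = np.array(y, copy=True)
    idx = rng.choice(len(y), size=int(round(fraction * len(y))), replace=False)
    y[idx] = rng.integers(0, n_classes, size=len(idx))
    return y
```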
Nagelkerke, Nico; Fidler, Vaclav
2015-01-01
The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
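As a hedged illustration of the modeling idea (the parameterization below is an assumption, not taken from the paper), a "defective" logistic model in which only a fraction pi of true cases retain their case label can be fit by ordinary maximum likelihood.

```python
# Hedged sketch: defective logistic regression P(case label | x) = pi * sigmoid(x @ beta).
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def fit_defective_logistic(X, y_obs):
    X1 = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])  # add intercept

    def neg_log_lik(params):
        pi = expit(params[0])                      # keep pi in (0, 1)
        p = pi * expit(X1 @ params[1:])
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -np.sum(y_obs * np.log(p) + (1 - y_obs) * np.log(1 - p))

    x0 = np.zeros(X1.shape[1] + 1)
    res = minimize(neg_log_lik, x0, method="BFGS")
    return expit(res.x[0]), res.x[1:]              # (pi_hat, beta_hat incl. intercept)
```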
Replacing Maladaptive Speech with Verbal Labeling Responses: An Analysis of Generalized Responding.
ERIC Educational Resources Information Center
Foxx, R. M.; And Others
1988-01-01
Three mentally handicapped students (aged 13, 36, and 40) with maladaptive speech received training to answer questions with verbal labels. The results of their cues-pause-point training showed that the students replaced their maladaptive speech with correct labels (answers) to questions in the training setting and three generalization settings.…
Effects of Labeling and Teacher Certification Type on Recall and Conflict Resolution
ERIC Educational Resources Information Center
Ayers, Jane M.; Krueger, Lacy E.; Jones, Beth A.
2015-01-01
Understanding how labels and prior training affect teachers of students with a disability is a step toward creating effective educational environments. Two goals of the present study were to examine how teacher training (special education vs. general education training) and labeling of students (either as having attention deficit hyperactivity…
Human factors in labeling and training for home healthcare technology.
Patterson, Patricia A
2010-01-01
In this article, Patricia A. Patterson, a contributor to the recently-released standard ANSI/AAMI HE75:2009 Human factors engineering-Design of medical devices, highlights information from the standard important to developing labeling and training for homecare devices. She also describes one approach to developing labeling and training materials.
Toward accelerating landslide mapping with interactive machine learning techniques
NASA Astrophysics Data System (ADS)
Stumpf, André; Lachiche, Nicolas; Malet, Jean-Philippe; Kerle, Norman; Puissant, Anne
2013-04-01
Despite important advances in the development of more automated methods for landslide mapping from optical remote sensing images, the elaboration of inventory maps after major triggering events still remains a tedious task. Image classification with expert-defined rules typically still requires significant manual labour for the elaboration and adaptation of rule sets for each particular case. Machine learning algorithms, by contrast, have the ability to learn and identify complex image patterns from labelled examples but may require relatively large amounts of training data. In order to reduce the amount of required training data, active learning has evolved as a key concept to guide the sampling for applications such as document classification, genetics and remote sensing. The general underlying idea of most active learning approaches is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and/or the data structure to iteratively select the most valuable samples that should be labelled by the user and added to the training set. With relatively few queries and labelled samples, an active learning strategy should ideally yield at least the same accuracy as an equivalent classifier trained with many randomly selected samples. Our study was dedicated to the development of an active learning approach for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. The developed approach is a region-based query heuristic that guides the user's attention towards a few compact spatial batches rather than distributed points, resulting in time savings of 50% and more compared to standard active learning techniques. The approach was tested with multi-temporal and multi-sensor satellite images capturing recent large-scale triggering events in Brazil and China and demonstrated balanced user's and producer's accuracies between 74% and 80%. The assessment also included an experimental evaluation of the uncertainties of manual mappings from multiple experts and demonstrated strong relationships between the uncertainty of the experts and the machine learning model.
NASA Astrophysics Data System (ADS)
Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.
2016-03-01
Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from the abnormal behaviour that occurs in the case of a novelty. This is evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determining whether our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest points. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >=4 abnormal behaviour of the classifier, having a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings, predominantly vessels, the cerebral falx, brain mask errors, and choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.
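A minimal sketch of the described measure, with assumed details (the mean distance as the behaviour statistic, scikit-learn for the kNN search): per-class distributions of the mean distance to the k nearest training points are estimated on normal data, and test samples are flagged when their Z-score under the predicted class's distribution is 4 or more.

```python
# Illustrative kNN "classifier behaviour" novelty measure.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def fit_behaviour(X_train, y_train, k=10):
    """Per class: a kNN index plus mean/std of the mean-distance statistic."""
    model = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        nn = NearestNeighbors(n_neighbors=k).fit(Xc)
        d, _ = nn.kneighbors(Xc, n_neighbors=k + 1)   # +1 so the self-distance can be dropped
        stat = d[:, 1:].mean(axis=1)
        model[c] = (nn, stat.mean(), stat.std())
    return model

def novelty_zscores(model, X_test, y_pred):
    """Z-score of each test sample's mean kNN distance under its predicted class."""
    z = np.empty(len(X_test), dtype=float)
    for i, (x, c) in enumerate(zip(X_test, y_pred)):
        nn, mu, sigma = model[c]
        d, _ = nn.kneighbors(np.asarray(x).reshape(1, -1))
        z[i] = (d.mean() - mu) / sigma
    return z   # z >= 4 is treated as abnormal classifier behaviour (novelty)
```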
Robust Statistical Fusion of Image Labels
Landman, Bennett A.; Asman, Andrew J.; Scoggins, Andrew G.; Bogovic, John A.; Xing, Fangxu; Prince, Jerry L.
2011-01-01
Image labeling and parcellation (i.e. assigning structure to a collection of voxels) are critical tasks for the assessment of volumetric and morphometric features in medical imaging data. The process of image labeling is inherently error prone as images are corrupted by noise and artifacts. Even expert interpretations are subject to the subjectivity and limited precision of individual raters. Hence, all labels must be considered imperfect with some degree of inherent variability. One may seek multiple independent assessments to both reduce this variability and quantify the degree of uncertainty. Existing techniques have exploited maximum a posteriori statistics to combine data from multiple raters and simultaneously estimate rater reliabilities. Although quite successful, wide-scale application has been hampered by unstable estimation with practical datasets, for example, with label sets with small or thin objects to be labeled or with partial or limited datasets. Moreover, these approaches have required each rater to generate a complete dataset, which is often impossible given both human foibles and the typical turnover rate of raters in a research or clinical environment. Herein, we propose a robust approach to improve estimation performance with small anatomical structures, allow for missing data, account for repeated label sets, and utilize training/catch trial data. With this approach, numerous raters can label small, overlapping portions of a large dataset, and rater heterogeneity can be robustly controlled while simultaneously estimating a single, reliable label set and characterizing uncertainty. The proposed approach enables many individuals to collaborate in the construction of large datasets for labeling tasks (e.g., human parallel processing) and reduces the otherwise detrimental impact of rater unavailability. PMID:22010145
Joint deconvolution and classification with applications to passive acoustic underwater multipath.
Anderson, Hyrum S; Gupta, Maya R
2008-11-01
This paper addresses the problem of classifying signals that have been corrupted by noise and unknown linear time-invariant (LTI) filtering such as multipath, given labeled uncorrupted training signals. A maximum a posteriori approach to the deconvolution and classification is considered, which produces estimates of the desired signal, the unknown channel, and the class label. For cases in which only a class label is needed, the classification accuracy can be improved by not committing to an estimate of the channel or signal. A variant of the quadratic discriminant analysis (QDA) classifier is proposed that probabilistically accounts for the unknown LTI filtering, and which avoids deconvolution. The proposed QDA classifier can work either directly on the signal or on features whose transformation by LTI filtering can be analyzed; as an example a classifier for subband-power features is derived. Results on simulated data and real Bowhead whale vocalizations show that jointly considering deconvolution with classification can dramatically improve classification performance over traditional methods over a range of signal-to-noise ratios.
Miller-Spoto, Marcia; Gombatto, Sara P
2014-06-01
A variety of diagnostic classification systems are used by physical therapists, but little information about how therapists assign diagnostic labels and how the labels are used to direct intervention is available. The purposes of this study were: (1) to examine the diagnostic labels assigned to patient problems by physical therapists who are board-certified Orthopaedic Clinical Specialists (OCSs) and (2) to determine whether the label influences selection of interventions. A cross-sectional survey was conducted. Two written cases were developed for patients with low back and shoulder pain. A survey was used to evaluate the diagnostic label assigned and the interventions considered important for each case. The cases and survey were sent to therapists who are board-certified OCSs. Respondents assigned a diagnostic label and rated the importance of intervention categories for each case. Each diagnostic label was coded based on the construct it represented. Percentage responses for each diagnostic label code and intervention category were calculated. Relative importance of intervention category based on diagnostic label was examined. For the low back pain and shoulder pain cases, respectively, "Combination" (48.5%, 34.9%) and "Pathology/Pathophysiology" (32.7%, 57.3%) diagnostic labels were most common. Strengthening (85.9%, 98.1%), stretching (86.8%, 84.9%), neuromuscular re-education (87.6%, 93.4%), functional training (91.4%, 88.6%), and mobilization/manipulation (85.1%, 86.8%) were considered the most important interventions. Relative importance of interventions did not differ based on diagnostic label (χ2=0.050-1.263, P=.261-.824). The low response rate may limit the generalizability of the findings. Also, examples provided for labels may have influenced responses, and some of the label codes may have represented overlapping constructs. There is little consistency with which OCS therapists assign diagnostic labels, and the label does not seem to influence selection of interventions. © 2014 American Physical Therapy Association.
Automatic classification of retinal vessels into arteries and veins
NASA Astrophysics Data System (ADS)
Niemeijer, Meindert; van Ginneken, Bram; Abràmoff, Michael D.
2009-02-01
Separating the retinal vascular tree into arteries and veins is important for quantifying vessel changes that preferentially affect either the veins or the arteries. For example the ratio of arterial to venous diameter, the retinal a/v ratio, is well established to be predictive of stroke and other cardiovascular events in adults, as well as the staging of retinopathy of prematurity in premature infants. This work presents a supervised, automatic method that can determine whether a vessel is an artery or a vein based on intensity and derivative information. After thinning of the vessel segmentation, vessel crossing and bifurcation points are removed, leaving a set of vessel segments containing centerline pixels. A set of features is extracted from each centerline pixel and using these each is assigned a soft label indicating the likelihood that it is part of a vein. As all centerline pixels in a connected segment should be of the same type, we average the soft labels and assign this average label to each centerline pixel in the segment. We train and test the algorithm using the data (40 color fundus photographs) from the DRIVE database with an enhanced reference standard. In the enhanced reference standard a fellowship-trained retinal specialist (MDA) labeled all vessels for which it was possible to visually determine whether it was a vein or an artery. After applying the proposed method to the 20 images of the DRIVE test set we obtained an area under the receiver operator characteristic (ROC) curve of 0.88 for correctly assigning centerline pixels to either the vein or artery classes.
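The segment-level averaging step can be sketched as follows (array names are illustrative): each centerline pixel carries a soft vein probability from the per-pixel classifier, and every pixel in a connected segment is assigned the segment's mean probability.

```python
# Illustrative segment-level soft-label averaging.
import numpy as np

def average_segment_labels(soft_labels, segment_ids):
    """soft_labels: per-pixel vein probabilities; segment_ids: segment index per pixel."""
    soft_labels = np.asarray(soft_labels, dtype=float)
    segment_ids = np.asarray(segment_ids)
    out = np.empty_like(soft_labels)
    for seg in np.unique(segment_ids):
        mask = segment_ids == seg
        out[mask] = soft_labels[mask].mean()   # all pixels in a segment share one label
    return out
```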
NASA Astrophysics Data System (ADS)
Kerner, H. R.; Bell, J. F., III; Ben Amor, H.
2017-12-01
The Mastcam color imaging system on the Mars Science Laboratory Curiosity rover acquires images within Gale crater for a variety of geologic and atmospheric studies. Images are often JPEG compressed before being downlinked to Earth. While critical for transmitting images on a low-bandwidth connection, this compression can result in image artifacts most noticeable as anomalous brightness or color changes within or near JPEG compression block boundaries. In images with significant high-frequency detail (e.g., in regions showing fine layering or lamination in sedimentary rocks), the image might need to be re-transmitted losslessly to enable accurate scientific interpretation of the data. The process of identifying which images have been adversely affected by compression artifacts is performed manually by the Mastcam science team, costing significant expert human time. To streamline the tedious process of identifying which images might need to be re-transmitted, we present an input-efficient neural network solution for predicting the perceived quality of a compressed Mastcam image. Most neural network solutions require large amounts of hand-labeled training data for the model to learn the target mapping between input (e.g. distorted images) and output (e.g. quality assessment). We propose an automatic labeling method using joint entropy between a compressed and uncompressed image to avoid the need for domain experts to label thousands of training examples by hand. We use automatically labeled data to train a convolutional neural network to estimate the probability that a Mastcam user would find the quality of a given compressed image acceptable for science analysis. We tested our model on a variety of Mastcam images and found that the proposed method correlates well with image quality perception by science team members. When assisted by our proposed method, we estimate that a Mastcam investigator could reduce the time spent reviewing images by a minimum of 70%.
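A minimal sketch of the automatic-labeling signal described above: the joint entropy of a compressed/uncompressed image pair, computed from their joint grayscale histogram. The bin count is an assumption.

```python
# Illustrative joint-entropy computation for a compressed/uncompressed image pair.
import numpy as np

def joint_entropy(img_a, img_b, bins=256):
    """Joint entropy (in bits) of two equally sized grayscale images."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))
```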
Special object extraction from medieval books using superpixels and bag-of-features
NASA Astrophysics Data System (ADS)
Yang, Ying; Rushmeier, Holly
2017-01-01
We propose a method to extract special objects in images of medieval books, which generally represent, for example, figures and capital letters. Instead of working on the single-pixel level, we consider superpixels as the basic classification units for improved time efficiency. More specifically, we classify superpixels into different categories/objects by using a bag-of-features approach, where a superpixel category classifier is trained with the local features of the superpixels of the training images. With the trained classifier, we are able to assign the category labels to the superpixels of a historical document image under test. Finally, special objects can easily be identified and extracted after analyzing the categorization results. Experimental results demonstrate that, as compared to the state-of-the-art algorithms, our method provides comparable performance for some historical books but greatly outperforms them in terms of generality and computational time.
AveBoost2: Boosting for Noisy Data
NASA Technical Reports Server (NTRS)
Oza, Nikunj C.
2004-01-01
AdaBoost is a well-known ensemble learning algorithm that constructs its constituent or base models in sequence. A key step in AdaBoost is constructing a distribution over the training examples to create each base model. This distribution, represented as a vector, is constructed to be orthogonal to the vector of mistakes made by the previous base model in the sequence. The idea is to make the next base model's errors uncorrelated with those of the previous model. In previous work, we developed an algorithm, AveBoost, that constructed distributions orthogonal to the mistake vectors of all the previous models, and then averaged them to create the next base model's distribution. Our experiments demonstrated the superior accuracy of our approach. In this paper, we slightly revise our algorithm to allow us to obtain non-trivial theoretical results: bounds on the training error and generalization error (difference between training and test error). Our averaging process has a regularizing effect which, as expected, leads us to a worse training error bound for our algorithm than for AdaBoost but a superior generalization error bound. For this paper, we experimented with the data that we used in both works, as originally supplied and with added label noise, in which a small fraction of the data has its original label changed. Noisy data are notoriously difficult for AdaBoost to learn. Our algorithm's performance improvement over AdaBoost is even greater on the noisy data than the original data.
Gentry, Gregory
2009-01-01
Your company has spent months designing a compliance program and training your sales representatives. They know never to mention the off-label uses of your product. If they are asked about the off-label uses by the physician they are detailing, they know to forward those inquiries to the scientific liaisons at headquarters. But, could your company still be in legal jeopardy simply because it knows that the product is being used for an off-label purpose? This article attempts to track the Food and Drug Administration's (FDA's) shifting interpretation of its "intended use" regulations, from focusing entirely on the statements of the manufacturers to focusing on the knowledge of the industry, indeed, of the consumers of products, in determining the true intended use of a product. It will look at several recent attempts by FDA to use that new interpretation of the regulations to expand its power: to regulate tobacco and to require pediatric indications for any new drug. Finally, it will look at several recent examples of how this new interpretation has manifested in actions by FDA and the Department of Justice (DOJ).
Neural Network for Nanoscience Scanning Electron Microscope Image Recognition.
Modarres, Mohammad Hadi; Aversa, Rossella; Cozzini, Stefano; Ciancio, Regina; Leto, Angelo; Brandino, Giuseppe Piero
2017-10-16
In this paper we applied transfer learning techniques for image recognition, automatic categorization, and labeling of nanoscience images obtained by scanning electron microscope (SEM). Roughly 20,000 SEM images were manually classified into 10 categories to form a labeled training set, which can be used as a reference set for future applications of deep learning enhanced algorithms in the nanoscience domain. The categories chosen spanned the range of 0-Dimensional (0D) objects such as particles, 1D nanowires and fibres, 2D films and coated surfaces, and 3D patterned surfaces such as pillars. The training set was used to retrain several convolutional neural network models (Inception-v3, Inception-v4, ResNet) on the SEM dataset and to compare them. We obtained compatible results by performing a feature extraction of the different models on the same dataset. We performed additional analysis of the classifier on a second test set to further investigate the results both on particular cases and from a statistical point of view. Our algorithm was able to successfully classify around 90% of a test dataset consisting of SEM images, while reduced accuracy was found in the case of images at the boundary between two categories or containing elements of multiple categories. In these cases, the image classification did not identify a predominant category with a high score. We used the statistical outcomes from testing to deploy a semi-automatic workflow able to classify and label images generated by the SEM. Finally, a separate training was performed to determine the volume fraction of coherently aligned nanowires in SEM images. The results were compared with what was obtained using the Local Gradient Orientation method. This example demonstrates the versatility and the potential of transfer learning to address specific tasks of interest in nanoscience applications.
Stewart, Derek; Rouf, Abdul; Snaith, Ailsa; Elliott, Kathleen; Helms, Peter J; McLay, James S
2007-01-01
What is already known about this subject There are increasing concerns about the safety and efficacy of paediatric off-label medicines. In the UK, each year 26% of children receive an off-label prescription from their general practitioner. The community pharmacist is the final and key professional in the chain, with the responsibility to ensure that medicines are both prescribed and dispensed appropriately. What this study adds The majority of community pharmacists are aware of off-label prescribing, but through work experience rather than undergraduate or postgraduate training or professional development. Community pharmacists, like UK general practitioners, underestimate the levels of paediatric off-label prescribing, and appear unclear as to the most common reasons for a prescription being off label. Most community pharmacists stated that they should inform the prescriber that a medicine was off label; however, when given specific practical examples, less than half would actually appear to do so. The majority of community pharmacists have been asked by the public to sell over-the-counter medicines for paediatric off-label use. Aim To identify community pharmacist experiences of, and attitudes towards paediatric off-label prescribing. Methods A prospective questionnaire-based study, with a 21-item questionnaire issued to 1500 randomly selected community pharmacies throughout the UK during 2005 on three separate occasions. Results Four hundred and eighty-two (32.1%) completed questionnaires were returned. Over 70% of respondents were familiar with the concept of off-label prescribing, primarily through dispensing experience rather than education, although only 40% were aware of having dispensed a paediatric off-label prescription within the previous month. The reasons given for a prescription being off label were younger age than recommended (84.6%, 297/351), primarily for antihistamines, analgesics and β2-agonists, and higher (73.9%, 229/310) or lower than (41%, 103/258) recommended dose, primarily antibiotics and analgesics. Over 60% of respondents had been asked by the public to sell paediatric over-the-counter medicines, such as antihistamines, analgesics and steroid preparations for off-label use. The majority of respondents used the British National Formulary or the Pack Insert rather than specialist formularies or guidelines as a source of specialist paediatric information. Although 78% of respondents believed they had a responsibility to inform the prescriber that a medicine was off label, only 66% believed that they had a similar responsibility to inform parents. Conclusion The community pharmacists who responded to this questionnaire appear to be aware of and concerned by the issues which surround paediatric off-label prescribing. Despite this, most gained relevant knowledge through work experience rather than undergraduate or postgraduate training or professional development. PMID:17324238
76 FR 39041 - Infectious Diseases
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-05
Excerpt (fragmentary): covers controls and personal protective equipment; medical surveillance; worker training; and signage and labeling, including whether and to what extent an OSHA standard should contain signage, labeling, and worker training requirements.
Nearest neighbor density ratio estimation for large-scale applications in astronomy
NASA Astrophysics Data System (ADS)
Kremer, J.; Gieseke, F.; Steenstrup Pedersen, K.; Igel, C.
2015-09-01
In astronomical applications of machine learning, the distribution of objects used for building a model is often different from the distribution of the objects the model is later applied to. This is known as sample selection bias, which is a major challenge for statistical inference as one can no longer assume that the labeled training data are representative. To address this issue, one can re-weight the labeled training patterns to match the distribution of unlabeled data that are available already in the training phase. There are many examples in practice where this strategy yielded good results, but estimating the weights reliably from a finite sample is challenging. We consider an efficient nearest neighbor density ratio estimator that can exploit large samples to increase the accuracy of the weight estimates. To solve the problem of choosing the right neighborhood size, we propose to use cross-validation on a model selection criterion that is unbiased under covariate shift. The resulting algorithm is our method of choice for density ratio estimation when the feature space dimensionality is small and sample sizes are large. The approach is simple and, because of the model selection, robust. We empirically find that it is on a par with established kernel-based methods on relatively small regression benchmark datasets. However, when applied to large-scale photometric redshift estimation, our approach outperforms the state-of-the-art.
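As a hedged sketch of nearest-neighbour density ratio estimation for covariate shift (illustrative, not the paper's exact estimator), the importance weight of each labeled training point can be formed from the ratio of kNN density estimates under the unlabeled and labeled samples; the geometric volume constants cancel in the ratio.

```python
# Illustrative kNN density-ratio importance weights for covariate shift.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_importance_weights(X_train, X_unlabeled, k=50):
    n_tr, n_te = len(X_train), len(X_unlabeled)
    nn_tr = NearestNeighbors(n_neighbors=k + 1).fit(X_train)     # +1 skips the self-neighbor
    nn_te = NearestNeighbors(n_neighbors=k).fit(X_unlabeled)
    r_tr = nn_tr.kneighbors(X_train)[0][:, -1]   # k-th NN distance within the source sample
    r_te = nn_te.kneighbors(X_train)[0][:, -1]   # k-th NN distance within the target sample
    # kNN density estimate ~ k / (n * V * r^d); the volume constant V cancels in the ratio.
    d = X_train.shape[1]
    w = (n_tr / n_te) * (r_tr / r_te) ** d
    return w / w.mean()                          # normalize to mean 1 for reweighted training
```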
Game-powered machine learning.
Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert
2012-04-24
Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the "wisdom of the crowds." Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., "funky jazz with saxophone," "spooky electronica," etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.
Cocos, Anne; Fiks, Alexander G; Masino, Aaron J
2017-07-01
Social media is an important pharmacovigilance data source for adverse drug reaction (ADR) identification. Human review of social media data is infeasible due to data quantity, thus natural language processing techniques are necessary. Social media includes informal vocabulary and irregular grammar, which challenge natural language processing methods. Our objective is to develop a scalable, deep-learning approach that exceeds state-of-the-art ADR detection performance in social media. We developed a recurrent neural network (RNN) model that labels words in an input sequence with ADR membership tags. The only input features are word-embedding vectors, which can be formed through task-independent pretraining or during ADR detection training. Our best-performing RNN model used pretrained word embeddings created from a large, non-domain-specific Twitter dataset. It achieved an approximate match F-measure of 0.755 for ADR identification on the dataset, compared to 0.631 for a baseline lexicon system and 0.65 for the state-of-the-art conditional random field model. Feature analysis indicated that semantic information in pretrained word embeddings boosted sensitivity and, combined with contextual awareness captured in the RNN, precision. Our model required no task-specific feature engineering, suggesting generalizability to additional sequence-labeling tasks. Learning curve analysis showed that our model reached optimal performance with fewer training examples than the other models. ADR detection performance in social media is significantly improved by using a contextually aware model and word embeddings formed from large, unlabeled datasets. The approach reduces manual data-labeling requirements and is scalable to large social media datasets. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
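A minimal sketch of an RNN sequence tagger of the kind described, using PyTorch; the bidirectional LSTM, hidden size, and tag set are assumptions rather than the authors' exact configuration, and pretrained word-embedding vectors are taken as given.

```python
# Illustrative word-level ADR tagger over pretrained embeddings.
import torch
import torch.nn as nn

class ADRTagger(nn.Module):
    def __init__(self, pretrained_embeddings, hidden=128, n_tags=3):
        super().__init__()
        # pretrained_embeddings: FloatTensor (vocab_size, embedding_dim)
        self.embed = nn.Embedding.from_pretrained(pretrained_embeddings, freeze=False)
        self.rnn = nn.LSTM(pretrained_embeddings.size(1), hidden,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)   # e.g. O / B-ADR / I-ADR

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        h, _ = self.rnn(self.embed(token_ids))
        return self.out(h)                         # per-token tag logits
```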
NASA Astrophysics Data System (ADS)
Rajwa, Bartek; Dundar, M. Murat; Akova, Ferit; Patsekin, Valery; Bae, Euiwon; Tang, Yanjie; Dietz, J. Eric; Hirleman, E. Daniel; Robinson, J. Paul; Bhunia, Arun K.
2011-06-01
The majority of tools for pathogen sensing and recognition are based on physiological or genetic properties of microorganisms. However, there is enormous interest in devising label-free and reagentless biosensors that would operate utilizing the biophysical signatures of samples without the need for labeling and reporting biochemistry. Optical biosensors are closest to realizing this goal and vibrational spectroscopies are examples of well-established optical label-free biosensing techniques. A recently introduced forward-scatter phenotyping (FSP) also belongs to the broad class of optical sensors. However, in contrast to spectroscopies, the remarkable specificity of FSP derives from the morphological information that bacterial material encodes on a coherent optical wavefront passing through the colony. The system collects elastically scattered light patterns that, given a constant environment, are unique to each bacterial species and/or serovar. Both FSP technology and spectroscopies rely on statistical machine learning to perform recognition and classification. However, the commonly used methods utilize either simplistic unsupervised learning or traditional supervised techniques that assume completeness of training libraries. This restrictive assumption is known to be false for real-life conditions, resulting in unsatisfactory levels of accuracy, and consequently limited overall performance for biodetection and classification tasks. The presented work demonstrates preliminary studies on the use of FSP system to classify selected serotypes of non-O157 Shiga toxin-producing E. coli in a nonexhaustive framework, that is, without full knowledge about all the possible classes that can be encountered. Our study uses a Bayesian approach to learning with a nonexhaustive training dataset to allow for the automated and distributed detection of unknown bacterial classes.
Semi-supervised Learning for Phenotyping Tasks.
Dligach, Dmitriy; Miller, Timothy; Savova, Guergana K
2015-01-01
Supervised learning is the dominant approach to automatic electronic health records-based phenotyping, but it is expensive due to the cost of manual chart review. Semi-supervised learning takes advantage of both scarce labeled and plentiful unlabeled data. In this work, we study a family of semi-supervised learning algorithms based on Expectation Maximization (EM) in the context of several phenotyping tasks. We first experiment with the basic EM algorithm. When the modeling assumptions are violated, basic EM leads to inaccurate parameter estimation. Augmented EM attenuates this shortcoming by introducing a weighting factor that downweights the unlabeled data. Cross-validation does not always lead to the best setting of the weighting factor and other heuristic methods may be preferred. We show that accurate phenotyping models can be trained with only a few hundred labeled (and a large number of unlabeled) examples, potentially providing substantial savings in the amount of the required manual chart review.
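A hedged sketch of the weighting idea in augmented EM, using a multinomial Naive Bayes text model as the base learner (a common choice, assumed here rather than taken from the paper): unlabeled documents enter each M-step with their posterior class probabilities scaled by a factor lam in [0, 1].

```python
# Illustrative weighted EM for semi-supervised Naive Bayes on sparse term counts.
import numpy as np
from scipy.sparse import vstack
from sklearn.naive_bayes import MultinomialNB

def augmented_em_nb(X_lab, y_lab, X_unlab, lam=0.1, n_iter=10):
    """X_lab, X_unlab: sparse term-count matrices; lam downweights unlabeled data."""
    classes = np.unique(y_lab)                              # sorted, matches clf.classes_
    clf = MultinomialNB().fit(X_lab, y_lab)
    for _ in range(n_iter):
        post = clf.predict_proba(X_unlab)                   # E-step: soft labels
        X_aug = vstack([X_lab] + [X_unlab] * len(classes))
        y_aug = np.concatenate([y_lab] + [np.full(X_unlab.shape[0], c) for c in classes])
        w_aug = np.concatenate([np.ones(X_lab.shape[0])] +
                               [lam * post[:, i] for i in range(len(classes))])
        clf = MultinomialNB().fit(X_aug, y_aug, sample_weight=w_aug)  # weighted M-step
    return clf
```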
Democratization of Nanoscale Imaging and Sensing Tools Using Photonics.
McLeod, Euan; Wei, Qingshan; Ozcan, Aydogan
2015-07-07
Providing means for researchers and citizen scientists in the developing world to perform advanced measurements with nanoscale precision can help to accelerate the rate of discovery and invention as well as improve higher education and the training of the next generation of scientists and engineers worldwide. Here, we review some of the recent progress toward making optical nanoscale measurement tools more cost-effective, field-portable, and accessible to a significantly larger group of researchers and educators. We divide our review into two main sections: label-based nanoscale imaging and sensing tools, which primarily involve fluorescent approaches, and label-free nanoscale measurement tools, which include light scattering sensors, interferometric methods, photonic crystal sensors, and plasmonic sensors. For each of these areas, we have primarily focused on approaches that have either demonstrated operation outside of a traditional laboratory setting, including for example integration with mobile phones, or exhibited the potential for such operation in the near future.
Classification with asymmetric label noise: Consistency and maximal denoising
Blanchard, Gilles; Flaska, Marek; Handy, Gregory; ...
2016-09-20
In many real-world classification problems, the labels of training examples are randomly corrupted. Most previous theoretical work on classification with label noise assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. In this work, we give conditions that are necessary and sufficient for the true class-conditional distributions to be identifiable. These conditions are weaker than those analyzed previously, and allow for the classes to be nonseparable and the noise levels to be asymmetric and unknown. The conditions essentially state that a majority of the observed labels are correct and that the true class-conditional distributions are “mutually irreducible,” a concept we introduce that limits the similarity of the two distributions. For any label noise problem, there is a unique pair of true class-conditional distributions satisfying the proposed conditions, and we argue that this pair corresponds in a certain sense to maximal denoising of the observed distributions. Our results are facilitated by a connection to “mixture proportion estimation,” which is the problem of estimating the maximal proportion of one distribution that is present in another. We establish a novel rate of convergence result for mixture proportion estimation, and apply this to obtain consistency of a discrimination rule based on surrogate loss minimization. Experimental results on benchmark data and a nuclear particle classification problem demonstrate the efficacy of our approach. MSC 2010 subject classifications: Primary 62H30; secondary 68T10. Keywords and phrases: Classification, label noise, mixture proportion estimation, surrogate loss, consistency.
AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling.
Wang, Sheng; Sun, Siqi; Xu, Jinbo
2016-09-01
Deep Convolutional Neural Networks (DCNNs) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction, since their label distributions are highly imbalanced, and has similar performance to the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.
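The pairwise-ranking view of AUC can be sketched as follows (hedged: a sigmoid is used here as the smooth approximation for illustration, whereas the paper uses a polynomial): AUC counts correctly ordered (positive, negative) score pairs, so maximizing a differentiable surrogate of that count trains the model directly for the ranking objective.

```python
# Illustrative differentiable pairwise AUC surrogate (sigmoid approximation).
import torch

def pairwise_auc_surrogate(scores, labels, temperature=1.0):
    """scores: model outputs; labels: 0/1 tensor. Returns a value to maximize."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos.unsqueeze(1) - neg.unsqueeze(0)          # all (positive, negative) pairs
    return torch.sigmoid(diff / temperature).mean()     # smooth approximation of AUC
```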
An Example-Based Multi-Atlas Approach to Automatic Labeling of White Matter Tracts
Yoo, Sang Wook; Guevara, Pamela; Jeong, Yong; Yoo, Kwangsun; Shin, Joseph S.; Mangin, Jean-Francois; Seong, Joon-Kyung
2015-01-01
We present an example-based multi-atlas approach for classifying white matter (WM) tracts into anatomic bundles. Our approach exploits expert-provided example data to automatically classify the WM tracts of a subject. Multiple atlases are constructed to model the example data from multiple subjects in order to reflect the individual variability of bundle shapes and trajectories over subjects. For each example subject, an atlas is maintained to allow the example data of a subject to be added or deleted flexibly. A voting scheme is proposed to facilitate the multi-atlas exploitation of example data. For conceptual simplicity, we adopt the same metrics in both example data construction and WM tract labeling. Due to the huge number of WM tracts in a subject, it is time-consuming to label each WM tract individually. Thus, the WM tracts are grouped according to their shape similarity, and WM tracts within each group are labeled simultaneously. To further enhance the computational efficiency, we implemented our approach on the graphics processing unit (GPU). Through nested cross-validation we demonstrated that our approach yielded high classification performance. The average sensitivities for bundles in the left and right hemispheres were 89.5% and 91.0%, respectively, and their average false discovery rates were 14.9% and 14.2%, respectively. PMID:26225419
Sariyar, M; Borg, A; Pommerening, K
2012-10-01
Supervised record linkage methods often require a clerical review to gain informative training data. Active learning means to actively prompt the user to label data with special characteristics in order to minimise the review costs. We conducted an empirical evaluation to investigate whether a simple active learning strategy using binary comparison patterns is sufficient or if string metrics together with a more sophisticated algorithm are necessary to achieve high accuracies with a small training set. Based on medical registry data with different numbers of attributes, we used active learning to acquire training sets for classification trees, which were then used to classify the remaining data. Active learning for binary patterns means that every distinct comparison pattern represents a stratum from which one item is sampled. Active learning for patterns consisting of the Levenshtein string metric values uses an iterative process where the most informative and representative examples are added to the training set. In this context, we extended the active learning strategy by Sarawagi and Bhamidipaty (2002). On the original data set, active learning based on binary comparison patterns leads to the best results. When dropping four or six attributes, using string metrics leads to better results. In both cases, not more than 200 manually reviewed training examples are necessary. In record linkage applications where only forename, name and birthday are available as attributes, we suggest the sophisticated active learning strategy based on string metrics in order to achieve highly accurate results. We recommend the simple strategy if more attributes are available, as in our study. In both cases, active learning significantly reduces the amount of manual involvement in training data selection compared to usual record linkage settings. Copyright © 2012 Elsevier Inc. All rights reserved.
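The binary-pattern strategy described above amounts to stratified sampling over distinct comparison patterns. A minimal sketch, assuming record pairs have already been compared attribute-by-attribute into 0/1 agreement vectors (the data layout is hypothetical):

```python
import random
from collections import defaultdict

def sample_one_per_pattern(pair_ids, patterns, seed=0):
    """Stratified active selection for record linkage: every distinct binary
    comparison pattern forms a stratum, and one candidate pair is drawn from it
    for clerical review. pair_ids[i] identifies a record pair and patterns[i]
    is its tuple of 0/1 attribute agreements."""
    strata = defaultdict(list)
    for pid, pattern in zip(pair_ids, patterns):
        strata[tuple(pattern)].append(pid)
    rng = random.Random(seed)
    return {pattern: rng.choice(ids) for pattern, ids in strata.items()}

# agreement on (forename, name, birthday) for four candidate pairs
patterns = [(1, 1, 1), (1, 1, 0), (1, 1, 1), (0, 0, 1)]
print(sample_one_per_pattern(["a-b", "c-d", "e-f", "g-h"], patterns))
```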
Data Programming: Creating Large Training Sets, Quickly.
Ratner, Alexander; De Sa, Christopher; Wu, Sen; Selsam, Daniel; Ré, Christopher
2016-12-01
Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive part of applying machine learning. We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users express weak supervision strategies or domain heuristics as labeling functions, which are programs that label subsets of the data, but that are noisy and may conflict. We show that by explicitly representing this training set labeling process as a generative model, we can "denoise" the generated training set, and establish theoretically that we can recover the parameters of these generative models in a handful of settings. We then show how to modify a discriminative loss function to make it noise-aware, and demonstrate our method over a range of discriminative models including logistic regression and LSTMs. Experimentally, on the 2014 TAC-KBP Slot Filling challenge, we show that data programming would have led to a new winning score, and also show that applying data programming to an LSTM model leads to a TAC-KBP score almost 6 F1 points over a state-of-the-art LSTM baseline (and into second place in the competition). Additionally, in initial user studies we observed that data programming may be an easier way for non-experts to create machine learning models when training data is limited or unavailable.
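A minimal sketch of the labeling-function idea on a hypothetical relation-extraction task: the functions and the task are invented for illustration, and a plain majority vote stands in for the generative model that data programming actually fits to denoise and weight the votes.

```python
import re

ABSTAIN, NEG, POS = 0, -1, 1

# Hypothetical labeling functions for a spouse-relation task: each one labels a
# sentence POS, NEG, or abstains. They may be noisy and may disagree.
def lf_married(s):  return POS if re.search(r"\bmarried\b", s) else ABSTAIN
def lf_spouse(s):   return POS if "spouse" in s else ABSTAIN
def lf_sibling(s):  return NEG if re.search(r"\b(brother|sister)\b", s) else ABSTAIN

LFS = [lf_married, lf_spouse, lf_sibling]

def weak_label(sentence):
    """Combine labeling-function votes. The paper fits a generative model to
    denoise and weight the functions; an unweighted majority vote is used here
    only to keep the sketch short."""
    votes = [lf(sentence) for lf in LFS if lf(sentence) != ABSTAIN]
    if not votes:
        return ABSTAIN
    return POS if sum(votes) > 0 else NEG

print(weak_label("Alice married Bob in 2004"))   # 1
print(weak_label("Carol is Bob's sister"))       # -1
```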
Collaborative labeling of malignant glioma with WebMILL: a first look
NASA Astrophysics Data System (ADS)
Singh, Eesha; Asman, Andrew J.; Xu, Zhoubing; Chambless, Lola; Thompson, Reid; Landman, Bennett A.
2012-02-01
Malignant gliomas are the most common form of primary neoplasm in the central nervous system, and one of the most rapidly fatal of all human malignancies. They are treated by maximal surgical resection followed by radiation and chemotherapy. Herein, we seek to improve the methods available to quantify the extent of tumors using newly presented, collaborative labeling techniques on magnetic resonance imaging. Traditionally, labeling medical images has required expert raters operating on one image at a time, which is resource intensive and not practical for very large datasets. Using many, minimally trained raters to label images could minimize laboratory requirements, allow high degrees of parallelism, and reduce overall cost. This potentially transformative technology presents a new set of problems, because one must pose the labeling challenge in a manner accessible to people with little or no background in labeling medical images, and raters cannot be expected to read detailed instructions. Hence, a different training method has to be employed: the training must appeal to all types of learners and present the same concepts in multiple ways to ensure that all subjects understand the basics of labeling. Our overall objective is to demonstrate the feasibility of studying malignant glioma morphometry through statistical analysis of the collaborative efforts of many, minimally trained raters. This study presents preliminary results on optimization of the WebMILL framework for neoplasm labeling and investigates the initial contributions of 78 raters labeling 98 whole-brain datasets.
Wartman, Brianne C; Keeley, Robin J; Holahan, Matthew R
2012-10-24
Estrogen levels in rats are positively correlated with enhanced memory function and hippocampal dendritic spine density. There is much less work on the long-term effects of estradiol manipulation in preadolescent rats. The present work examined how injections of estradiol during postnatal days 19-22 (p19-22; preadolescence) affected water maze performance and hippocampal phosphorylated ERK labeling. To investigate this, half of the estradiol- and vehicle-treated female rats were trained on a water maze task 24h after the end of estradiol treatment (p23-27) while the other half was not trained. All female rats were tested on the water maze from p40 to p44 (adolescence) and hippocampal pERK1/2 labeling was assessed as a putative marker of neuronal plasticity. During adolescence, preadolescent-trained groups showed lower latencies than groups without preadolescent training. Retention data revealed lower latencies in both estradiol groups, whether preadolescent trained or not. Immunohistochemical detection of hippocampal pERK1/2 revealed elevations in granule cell labeling associated with the preadolescent trained groups and reductions in CA1 labeling associated with estradiol treatment. These results show a latent beneficial effect of preadolescent estradiol treatment on adolescent spatial performance and suggest an organizational effect of prepubescent exogenously applied estradiol. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data.
Sun, Wenqing; Tseng, Tzu-Liang Bill; Zhang, Jianying; Qian, Wei
2017-04-01
In this study we developed a graph-based semi-supervised learning (SSL) scheme using a deep convolutional neural network (CNN) for breast cancer diagnosis. A CNN usually needs a large amount of labeled data for training and fine-tuning its parameters, whereas our proposed scheme requires only a small portion of labeled data in the training set. Four modules were included in the diagnosis system: data weighing, feature selection, dividing co-training data labeling, and CNN. 3158 regions of interest (ROIs), each containing a mass, were extracted from 1874 pairs of mammogram images and used for this study. Among them, 100 ROIs were treated as labeled data while the rest were treated as unlabeled. The area under the curve (AUC) observed in our study was 0.8818, and the accuracy of the CNN was 0.8243 using the mixed labeled and unlabeled data. Copyright © 2016. Published by Elsevier Ltd.
NASA Astrophysics Data System (ADS)
Moody, D.; Brumby, S. P.; Chartrand, R.; Franco, E.; Keisler, R.; Kelton, T.; Kontgis, C.; Mathis, M.; Raleigh, D.; Rudelis, X.; Skillman, S.; Warren, M. S.; Longbotham, N.
2016-12-01
The recent computing performance revolution has driven improvements in sensor, communication, and storage technology. Historical, multi-decadal remote sensing datasets at the petabyte scale are now available in commercial clouds, with new satellite constellations generating petabytes per year of high-resolution imagery with daily global coverage. Cloud computing and storage, combined with recent advances in machine learning and open software, are enabling understanding of the world at an unprecedented scale and detail. We have assembled all available satellite imagery from the USGS Landsat, NASA MODIS, and ESA Sentinel programs, as well as commercial PlanetScope and RapidEye imagery, and have analyzed over 2.8 quadrillion multispectral pixels. We leveraged the commercial cloud to generate a tiled, spatio-temporal mosaic of the Earth for fast iteration and development of new algorithms combining analysis techniques from remote sensing, machine learning, and scalable compute infrastructure. Our data platform enables processing at petabytes per day rates using multi-source data to produce calibrated, georeferenced imagery stacks at desired points in time and space that can be used for pixel level or global scale analysis. We demonstrate our data platform capability by using the European Space Agency's (ESA) published 2006 and 2009 GlobCover 20+ category label maps to train and test a Land Cover Land Use (LCLU) classifier, and generate current self-consistent LCLU maps in Brazil. We train a standard classifier on 2006 GlobCover categories using temporal imagery stacks, and we validate our results on co-registered 2009 Globcover LCLU maps and 2009 imagery. We then extend the derived LCLU model to current imagery stacks to generate an updated, in-season label map. Changes in LCLU labels can now be seamlessly monitored for a given location across the years in order to track, for example, cropland expansion, forest growth, and urban developments. An example of change monitoring is illustrated in the included figure showing rainfed cropland change in the Mato Grosso region of Brazil between 2006 and 2009.
Classifying with confidence from incomplete information.
Parrish, Nathan; Anderson, Hyrum S.; Gupta, Maya R.; ...
2013-12-01
For this paper, we consider the problem of classifying a test sample given incomplete information. This problem arises naturally when data about a test sample are collected over time, or when costs must be incurred to compute the classification features. For example, in a distributed sensor network only a fraction of the sensors may have reported measurements at a certain time, and additional time, power, and bandwidth are needed to collect the complete data to classify. A practical goal is to assign a class label as soon as enough data is available to make a good decision. We formalize this goal through the notion of reliability, the probability that a label assigned given incomplete data would be the same as the label assigned given the complete data, and we propose a method to classify incomplete data only if some reliability threshold is met. Our approach models the complete data as a random variable whose distribution is dependent on the current incomplete data and the (complete) training data. The method differs from standard imputation strategies in that our focus is on determining the reliability of the classification decision, rather than just the class label. We show that the method provides useful reliability estimates of the correctness of the imputed class labels in a set of experiments on time-series data sets, where the goal is to classify the time-series as early as possible while still guaranteeing that the reliability threshold is met.
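The reliability notion above can be approximated by sampling plausible completions of the missing features, classifying each completion, and measuring how often the modal label wins. The sketch below models the complete feature vector with a single Gaussian fitted to the training data, which is a simplification of the paper's model; the classifier, threshold, and toy data are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def reliability(x_obs, obs_idx, clf, mu, cov, n_draws=500, rng=None):
    """Estimate the probability that the label assigned from incomplete data
    matches the label that would be assigned from complete data.

    Missing features are drawn from the Gaussian conditional given the observed
    ones, each completion is classified, and the reliability is the frequency
    of the modal label."""
    rng = np.random.default_rng(rng)
    d = len(mu)
    mis_idx = [i for i in range(d) if i not in obs_idx]
    S_oo = cov[np.ix_(obs_idx, obs_idx)]
    S_mo = cov[np.ix_(mis_idx, obs_idx)]
    S_mm = cov[np.ix_(mis_idx, mis_idx)]
    K = S_mo @ np.linalg.inv(S_oo)
    cond_mu = mu[mis_idx] + K @ (x_obs - mu[obs_idx])
    cond_cov = S_mm - K @ S_mo.T
    draws = rng.multivariate_normal(cond_mu, cond_cov, size=n_draws)
    X_full = np.empty((n_draws, d))
    X_full[:, obs_idx] = x_obs
    X_full[:, mis_idx] = draws
    labels = clf.predict(X_full)
    vals, counts = np.unique(labels, return_counts=True)
    best = int(np.argmax(counts))
    return vals[best], counts[best] / n_draws

# toy setup: train on complete data, then classify a half-observed sample
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3)); y = (X[:, 0] + X[:, 2] > 0).astype(int)
clf = LogisticRegression().fit(X, y)
mu, cov = X.mean(axis=0), np.cov(X, rowvar=False)
label, rel = reliability(np.array([1.2]), [0], clf, mu, cov)
print(label, rel)   # classify only if rel exceeds a chosen threshold, e.g. 0.9
```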
ERIC Educational Resources Information Center
McMorrow, Martin J.; And Others
1987-01-01
A cues-pause-point procedure was used to train two severely retarded females to remain quiet before, during, and briefly after the presentation of questions and then to verbalize on the basis of environmental cues whose labels represented the correct responses. Echolalia was rapidly replaced by correct responding on the trained stimuli. (Author/JW)
NASA Astrophysics Data System (ADS)
Liu, Jianjun; Kan, Jianquan
2018-04-01
In this paper, a new method for identifying genetically modified material from terahertz spectra is proposed, using a support vector machine (SVM) combined with affinity propagation clustering. The affinity propagation clustering algorithm performs cluster analysis and labeling on unlabeled training samples, and the SVM training data are continuously updated during the iterative process. Because the identification model does not require manually labeled training samples, the error caused by human labeling is reduced and the identification accuracy of the model is greatly improved.
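A rough sketch of the cluster-then-label idea using scikit-learn's AffinityPropagation and an SVM. The single-pass labeling rule used here (each cluster inherits the label of the seed sample nearest its exemplar) and the toy data are assumptions, since the paper's iterative update is not spelled out in the abstract.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.svm import SVC

def cluster_label_then_train(X_unlabeled, X_seed, y_seed):
    """Sketch of SVM training driven by affinity propagation clustering:
    unlabeled spectra are clustered, each cluster inherits the label of the
    seed spectrum closest to its exemplar, and the pseudo-labeled data train
    an SVM. The paper iterates this update; a single pass is shown here."""
    ap = AffinityPropagation(random_state=0).fit(X_unlabeled)
    exemplars = ap.cluster_centers_                     # one exemplar per cluster
    d = np.linalg.norm(exemplars[:, None, :] - X_seed[None, :, :], axis=2)
    cluster_labels = y_seed[d.argmin(axis=1)]           # nearest seed labels the cluster
    pseudo_y = cluster_labels[ap.labels_]               # propagate to all cluster members
    X_train = np.vstack([X_seed, X_unlabeled])
    y_train = np.concatenate([y_seed, pseudo_y])
    return SVC(kernel="rbf").fit(X_train, y_train)

rng = np.random.default_rng(0)
X_u = np.vstack([rng.normal(0, 0.3, (60, 4)), rng.normal(2, 0.3, (60, 4))])
X_s = np.array([[0, 0, 0, 0.], [2, 2, 2, 2.]]); y_s = np.array([0, 1])
model = cluster_label_then_train(X_u, X_s, y_s)
print(model.predict([[0.1, 0, 0, 0], [1.9, 2, 2, 2]]))  # expected: [0 1]
```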
Snorkel: Rapid Training Data Creation with Weak Supervision.
Ratner, Alexander; Bach, Stephen H; Ehrenberg, Henry; Fries, Jason; Wu, Sen; Ré, Christopher
2017-11-01
Labeling training data is increasingly the largest bottleneck in deploying machine learning systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data. Instead, users write labeling functions that express arbitrary heuristics, which can have unknown accuracies and correlations. Snorkel denoises their outputs without access to ground truth by incorporating the first end-to-end implementation of our recently proposed machine learning paradigm, data programming. We present a flexible interface layer for writing labeling functions based on our experience over the past year collaborating with companies, agencies, and research labs. In a user study, subject matter experts build models 2.8× faster and increase predictive performance an average 45.5% versus seven hours of hand labeling. We study the modeling tradeoffs in this new setting and propose an optimizer for automating tradeoff decisions that gives up to 1.8× speedup per pipeline execution. In two collaborations, with the U.S. Department of Veterans Affairs and the U.S. Food and Drug Administration, and on four open-source text and image data sets representative of other deployments, Snorkel provides 132% average improvements to predictive performance over prior heuristic approaches and comes within an average 3.60% of the predictive performance of large hand-curated training sets.
Manifold Regularized Experimental Design for Active Learning.
Zhang, Lining; Shum, Hubert P H; Shao, Ling
2016-12-02
Various machine learning and data mining tasks in classification require abundant data samples to be labeled for training. Conventional active learning methods aim at labeling the most informative samples to alleviate the labor of the user. Many previous studies in active learning select one sample after another in a greedy manner. However, this is not very effective because the classification model has to be retrained for each newly labeled sample. Moreover, many popular active learning approaches utilize the most uncertain samples by leveraging the classification hyperplane of the classifier, which is not appropriate since the classification hyperplane is inaccurate when the training data are small-sized. The problem of insufficient training data in real-world systems limits the potential applications of these approaches. This paper presents a novel method of active learning called manifold regularized experimental design (MRED), which can label multiple informative samples at one time for training. In addition, MRED gives an explicit geometric explanation for the selected samples to be labeled by the user. Different from existing active learning methods, our method avoids the intrinsic problems caused by insufficiently labeled samples in real-world applications. Various experiments on synthetic datasets, the Yale face database and the Corel image database have been carried out to show how MRED outperforms existing methods.
Automated MRI Cerebellar Size Measurements Using Active Appearance Modeling
Price, Mathew; Cardenas, Valerie A.; Fein, George
2014-01-01
Although the human cerebellum has been increasingly identified as an important hub that shows potential for helping in the diagnosis of a large spectrum of disorders, such as alcoholism, autism, and fetal alcohol spectrum disorder, the high costs associated with manual segmentation, and low availability of reliable automated cerebellar segmentation tools, have resulted in a limited focus on cerebellar measurement in human neuroimaging studies. We present here the CATK (Cerebellar Analysis Toolkit), which is based on the Bayesian framework implemented in FMRIB’s FIRST. This approach involves training Active Appearance Models (AAM) using hand-delineated examples. CATK can currently delineate the cerebellar hemispheres and three vermal groups (lobules I–V, VI–VII, and VIII–X). Linear registration with the low-resolution MNI152 template is used to provide initial alignment, and Point Distribution Models (PDM) are parameterized using stellar sampling. The Bayesian approach models the relationship between shape and texture through computation of conditionals in the training set. Our method varies from the FIRST framework in that initial fitting is driven by 1D intensity profile matching, and the conditional likelihood function is subsequently used to refine fitting. The method was developed using T1-weighted images from 63 subjects that were imaged and manually labeled: 43 subjects were scanned once and were used for training models, and 20 subjects were imaged twice (with manual labeling applied to both runs) and used to assess reliability and validity. Intraclass correlation analysis shows that CATK is highly reliable (average test-retest ICCs of 0.96) and offers excellent agreement with the gold standard (average validity ICC of 0.87 against manual labels). Comparisons against an alternative atlas-based approach, SUIT (Spatially Unbiased Infratentorial Template), that registers images with a high-resolution template of the cerebellum, show that our AAM approach offers superior reliability and validity. Extensions of CATK to cerebellar hemisphere parcels are envisioned. PMID:25192657
Detecting Protected Health Information in Heterogeneous Clinical Notes.
Henriksson, Aron; Kvist, Maria; Dalianis, Hercules
2017-01-01
To enable secondary use of healthcare data in a privacy-preserving manner, there is a need for methods capable of automatically identifying protected health information (PHI) in clinical text. To that end, learning predictive models from labeled examples has emerged as a promising alternative to rule-based systems. However, little is known about differences with respect to PHI prevalence in different types of clinical notes and how potential domain differences may affect the performance of predictive models trained on one particular type of note and applied to another. In this study, we analyze the performance of a predictive model trained on an existing PHI corpus of Swedish clinical notes and applied to a variety of clinical notes: written (i) in different clinical specialties, (ii) under different headings, and (iii) by persons in different professions. The results indicate that domain adaption is needed for effective detection of PHI in heterogeneous clinical notes.
Stanescu, Ana; Caragea, Doina
2015-01-01
Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The process of labeling data can be expensive, as it requires domain knowledge and expert involvement. Semi-supervised learning approaches that can make use of unlabeled data, in addition to small amounts of labeled data, can help reduce the costs associated with labeling. In this context, we focus on the problem of predicting splice sites in a genome using semi-supervised learning approaches. This is a challenging problem, due to the highly imbalanced distribution of the data, i.e., small number of splice sites as compared to the number of non-splice sites. To address this challenge, we propose to use ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Our experiments on five highly imbalanced splice site datasets, with positive to negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches represent a good choice, even when the amount of labeled data consists of less than 1% of all training data. In particular, we found that ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations show improvements over the corresponding supervised ensemble baselines. In the presence of limited amounts of labeled data, ensemble-based semi-supervised approaches can successfully leverage the unlabeled data to enhance supervised ensembles learned from highly imbalanced data distributions. Given that such distributions are common for many biological sequence classification problems, our work can be seen as a stepping stone towards more sophisticated ensemble-based approaches to biological sequence annotation in a semi-supervised framework.
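A minimal sketch of one balanced self-training learner of the kind ensembled in the paper: each round it pseudo-labels an equal number of confident positives and negatives so that the labeled set does not drift toward the majority class. The base learner, quotas, and toy data are illustrative, and both classes are assumed to be present in the seed labels.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def balanced_self_training(X_lab, y_lab, X_unlab, n_iter=5, per_class=10):
    """Self-training that adds the same number of confident pseudo-labeled
    positives and negatives each round, keeping the labeled set balanced.
    The paper ensembles several such learners; one is shown here."""
    X_lab, y_lab, X_unlab = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(n_iter):
        if len(X_unlab) == 0:
            break
        clf = GaussianNB().fit(X_lab, y_lab)
        proba = clf.predict_proba(X_unlab)
        taken, picked = set(), []
        for cls in (0, 1):                                   # same quota per class
            col = list(clf.classes_).index(cls)
            order = np.argsort(proba[:, col])[::-1]          # most confident first
            chosen = [i for i in order if i not in taken][:per_class]
            taken.update(chosen)
            picked.extend((i, cls) for i in chosen)
        idx = np.array([i for i, _ in picked])
        X_lab = np.vstack([X_lab, X_unlab[idx]])
        y_lab = np.concatenate([y_lab, [c for _, c in picked]])
        X_unlab = np.delete(X_unlab, idx, axis=0)
    return GaussianNB().fit(X_lab, y_lab)

# toy 1-to-10 imbalanced data with 5 labeled samples per class
rng = np.random.default_rng(0)
X_all = np.vstack([rng.normal(1, 1, (300, 6)), rng.normal(-1, 1, (3000, 6))])
y_all = np.r_[np.ones(300), np.zeros(3000)]
seed = np.r_[0:5, 300:305]
mask = np.ones(len(X_all), bool); mask[seed] = False
model = balanced_self_training(X_all[seed], y_all[seed], X_all[mask])
```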
Ikeda, Mitsuru
2017-01-01
Information extraction and knowledge discovery regarding adverse drug reactions (ADRs) from large-scale clinical texts are very useful and much-needed processes. Two major difficulties of this task are the lack of domain experts for labeling examples and the intractable processing of unstructured clinical texts. Although most previous work has addressed these issues by applying semisupervised learning to the former and a word-based approach to the latter, such methods face the complexity of acquiring initial labeled data and ignore the structured sequence of natural language. In this study, we propose automatic data labeling by distant supervision, where knowledge bases are exploited to assign an entity-level relation label to each drug-event pair in texts, and we then use patterns to characterize the ADR relation. Multiple-instance learning with an expectation-maximization method is employed to estimate model parameters. The method applies transductive learning to iteratively reassign the probability of each unknown drug-event pair at training time. In experiments with 50,998 discharge summaries, we evaluate our method by varying a large number of parameters, that is, pattern types, pattern-weighting models, and initial and iterative weightings of relations for unlabeled data. Based on these evaluations, our proposed method outperforms the word-based feature for NB-EM (iEM), MILR, and TSVM, with F1 score improvements of 11.3%, 9.3%, and 6.5%, respectively. PMID:29090077
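The distant-supervision step above reduces to a knowledge-base lookup at the entity level; a tiny sketch with an invented knowledge base:

```python
# hypothetical knowledge base of known adverse drug reactions
KNOWN_ADR = {("warfarin", "bleeding"), ("metformin", "lactic acidosis")}

def distant_label(drug, event):
    """Entity-level distant supervision as described above: a drug-event pair
    found in the knowledge base labels every text mention of that pair as an
    ADR relation (possibly noisily); unknown pairs remain unlabeled/negative."""
    return 1 if (drug.lower(), event.lower()) in KNOWN_ADR else 0

print(distant_label("Warfarin", "bleeding"))   # 1
print(distant_label("Warfarin", "nausea"))     # 0
```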
30 CFR 47.42 - Label contents.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Label contents. 47.42 Section 47.42 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING HAZARD COMMUNICATION (HazCom) Container Labels and Other Forms of Warning § 47.42 Label contents. When an operator must make a label, the label must— (a) Be...
30 CFR 47.42 - Label contents.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Label contents. 47.42 Section 47.42 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING HAZARD COMMUNICATION (HazCom) Container Labels and Other Forms of Warning § 47.42 Label contents. When an operator must make a label, the label must— (a) Be...
30 CFR 47.42 - Label contents.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Label contents. 47.42 Section 47.42 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING HAZARD COMMUNICATION (HazCom) Container Labels and Other Forms of Warning § 47.42 Label contents. When an operator must make a label, the label must— (a) Be...
30 CFR 47.42 - Label contents.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Label contents. 47.42 Section 47.42 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING HAZARD COMMUNICATION (HazCom) Container Labels and Other Forms of Warning § 47.42 Label contents. When an operator must make a label, the label must— (a) Be...
30 CFR 47.42 - Label contents.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Label contents. 47.42 Section 47.42 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING HAZARD COMMUNICATION (HazCom) Container Labels and Other Forms of Warning § 47.42 Label contents. When an operator must make a label, the label must— (a) Be...
21 CFR 314.81 - Other postmarketing reports.
Code of Federal Regulations, 2011 CFR
2011-04-01
... to safety (for example, epidemiologic studies or analyses of experience in a monitored series of... times two copies of the following reports: (1) NDA—Field alert report. The applicant shall submit... information, for example, submit a labeling supplement, add a warning to the labeling, or initiate a new study...
21 CFR 314.81 - Other postmarketing reports.
Code of Federal Regulations, 2012 CFR
2012-04-01
... to safety (for example, epidemiologic studies or analyses of experience in a monitored series of... times two copies of the following reports: (1) NDA—Field alert report. The applicant shall submit... information, for example, submit a labeling supplement, add a warning to the labeling, or initiate a new study...
21 CFR 314.81 - Other postmarketing reports.
Code of Federal Regulations, 2013 CFR
2013-04-01
... to safety (for example, epidemiologic studies or analyses of experience in a monitored series of... times two copies of the following reports: (1) NDA—Field alert report. The applicant shall submit... information, for example, submit a labeling supplement, add a warning to the labeling, or initiate a new study...
21 CFR 314.81 - Other postmarketing reports.
Code of Federal Regulations, 2014 CFR
2014-04-01
... to safety (for example, epidemiologic studies or analyses of experience in a monitored series of... times two copies of the following reports: (1) NDA—Field alert report. The applicant shall submit... information, for example, submit a labeling supplement, add a warning to the labeling, or initiate a new study...
Active learning based segmentation of Crohn's disease from abdominal MRI.
Mahapatra, Dwarikanath; Vos, Franciscus M; Buhmann, Joachim M
2016-05-01
This paper proposes a novel active learning (AL) framework, and combines it with semi-supervised learning (SSL), for segmenting Crohn's disease (CD) tissues from abdominal magnetic resonance (MR) images. Robust fully supervised learning (FSL) based classifiers require large amounts of labeled data covering different disease severities. Obtaining such data is time consuming and requires considerable expertise. SSL methods use a few labeled samples and leverage the information from many unlabeled samples to train an accurate classifier. AL queries labels of the most informative samples and maximizes the gain from the labeling effort. Our primary contribution is in designing a query strategy that combines novel context information with classification uncertainty and feature similarity. Combining SSL and AL gives a robust segmentation method that: (1) optimally uses few labeled samples and many unlabeled samples; and (2) requires lower training time. Experimental results show our method achieves higher segmentation accuracy than FSL methods with fewer samples and reduced training effort. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
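A simplified sketch of a query score in the spirit described above, mixing prediction uncertainty with distance to the already-labeled pool so that redundant samples are not queried. The context term used in the paper is omitted and the weights are arbitrary.

```python
import numpy as np

def query_scores(proba_unlab, X_unlab, X_labeled, w_unc=0.7, w_div=0.3):
    """Score unlabeled samples for active-learning queries.

    Combines (i) uncertainty, here the entropy of the predicted class
    probabilities, with (ii) diversity, the distance to the nearest already
    labeled sample, so that redundant queries are avoided."""
    eps = 1e-12
    entropy = -(proba_unlab * np.log(proba_unlab + eps)).sum(axis=1)
    d = np.linalg.norm(X_unlab[:, None, :] - X_labeled[None, :, :], axis=2)
    diversity = d.min(axis=1)
    # normalize both terms to [0, 1] before mixing
    norm = lambda v: (v - np.min(v)) / (np.max(v) - np.min(v) + eps)
    return w_unc * norm(entropy) + w_div * norm(diversity)

# query the top-k highest-scoring samples:
# top_k = np.argsort(query_scores(P, Xu, Xl))[::-1][:k]
```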
NASA Astrophysics Data System (ADS)
Mao, H.; Bhaduri, B. L.
2016-12-01
Understanding public opinion on climate change is important for policy making. Public opinion, however, is typically measured with national surveys, which are expensive and therefore updated at a low frequency. Twitter has become a major platform for people to express their opinions on social and political issues. Our work attempts to understand whether Twitter data can provide complementary insights about climate change perceptions. Because social media is real-time, this data source can especially help us understand how public opinion changes over time in response to climate events and hazards, which is very difficult to capture with manual surveys. We use the Twitter Streaming API to collect tweets that contain the keywords "climate change" or "#climatechange". Traditional machine-learning based opinion mining algorithms require a significant amount of labeled data, and data labeling is notoriously time consuming. To address this problem, we use hashtags (a significant feature used to mark topics of tweets) to annotate tweets automatically. For example, the hashtags #climatedenial and #climatescam are negative opinion labels, while #actonclimate and #climateaction are positive. Following this method, we can obtain a large amount of training data without human labor. This labeled dataset is used to train a deep convolutional neural network that classifies tweets into positive (i.e., believe in climate change) and negative (i.e., do not believe). Based on the positive/negative tweets obtained, we will further analyze risk perceptions and opinions towards policy support. In addition, we analyze Twitter user profiles to understand the demographics of proponents and opponents of climate change. Deep learning techniques, especially convolutional deep neural networks, have achieved much success in computer vision. In this work, we propose a convolutional neural network architecture for understanding opinions within text. This method is compared with lexicon-based opinion analysis approaches. Results and the advantages/limitations of this method are to be discussed.
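The hashtag-based annotation reduces to a small lookup over opinion hashtags; a sketch using only the example tags quoted above:

```python
POSITIVE_TAGS = {"#actonclimate", "#climateaction"}     # believe in climate change
NEGATIVE_TAGS = {"#climatedenial", "#climatescam"}      # do not believe

def hashtag_label(tweet_text):
    """Weak label for a tweet from its hashtags, as in the abstract above;
    returns 1 (positive), 0 (negative), or None when no opinion tag is found
    or the tags conflict. The tag lists are only the examples given."""
    tags = {tok.lower() for tok in tweet_text.split() if tok.startswith("#")}
    pos, neg = bool(tags & POSITIVE_TAGS), bool(tags & NEGATIVE_TAGS)
    if pos == neg:          # no tag, or conflicting tags
        return None
    return 1 if pos else 0

print(hashtag_label("Wind beats coal on price #ActOnClimate"))   # 1
print(hashtag_label("Another cold day #climatescam"))            # 0
```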
Using partially labeled data for normal mixture identification with application to class definition
NASA Technical Reports Server (NTRS)
Shahshahani, Behzad M.; Landgrebe, David A.
1992-01-01
The problem of estimating the parameters of a normal mixture density when, in addition to the unlabeled samples, sets of partially labeled samples are available is addressed. The density of the multidimensional feature space is modeled with a normal mixture. It is assumed that the set of components of the mixture can be partitioned into several classes and that training samples are available from each class. Since for any training sample the class of origin is known but the exact component of origin within the corresponding class is unknown, the training samples are considered to be partially labeled. The EM iterative equations are derived for estimating the parameters of the normal mixture in the presence of partially labeled samples. These equations can be used to combine the supervised and unsupervised learning processes.
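For reference, the expected form of such an EM iteration under partial labels is sketched below: an unlabeled sample spreads responsibility over all components, while a sample labeled with class c spreads it only over the components J_c assigned to that class. This is a generic reconstruction of mixture EM with class-restricted responsibilities, not a transcription of the paper's equations.

```latex
% requires amsmath. E-step responsibilities q_{ij} of component j for sample x_i:
\[
  q_{ij} =
  \begin{cases}
    \dfrac{\pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}
          {\sum_{k} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
      & x_i \text{ unlabeled},\\[2ex]
    \dfrac{\pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}
          {\sum_{k \in J_c} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
      & x_i \text{ labeled with class } c,\ j \in J_c \ (\text{and } q_{ij}=0 \text{ otherwise}).
  \end{cases}
\]
% M-step: the usual weighted updates over all N samples.
\[
  \pi_j = \frac{1}{N}\sum_i q_{ij}, \qquad
  \mu_j = \frac{\sum_i q_{ij}\, x_i}{\sum_i q_{ij}}, \qquad
  \Sigma_j = \frac{\sum_i q_{ij}\,(x_i-\mu_j)(x_i-\mu_j)^{\top}}{\sum_i q_{ij}}.
\]
```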
Replacing maladaptive speech with verbal labeling responses: an analysis of generalized responding.
Foxx, R M; Faw, G D; McMorrow, M J; Kyle, M S; Bittle, R G
1988-01-01
We taught three mentally handicapped students to answer questions with verbal labels and evaluated the generalized effects of this training on their maladaptive speech (e.g., echolalia) and correct responding to untrained questions. The students received cues-pause-point training on an initial question set followed by generalization assessments on a different set in another setting. Probes were conducted on novel questions in three other settings to determine the strength and spread of the generalization effect. A multiple baseline across subjects design revealed that maladaptive speech was replaced with correct labels (answers) to questions in the training and all generalization settings. These results replicate and extend previous research that suggested that cues-pause-point procedures may be useful in replacing maladaptive speech patterns by teaching students to use their verbal labeling repertoires. PMID:3225258
Hyperspectral Sulfur Detection Using an SVM with Extreme Minority Positive Examples Onboard EO-1
NASA Astrophysics Data System (ADS)
Mandrake, Lukas; Wagstaff, Kiri L.; Gleeson, Damhnait; Rebbapragada, Umaa; Tran, Daniel; Castano, Rebecca; Chien, Steven; Pappalardo, Robert T.
2009-09-01
Onboard classification of remote sensing data is of general interest given that it can be used as a trigger to initiate alarms, data download, additional higher-resolution scans, or more frequent scans of an area without ground interaction. In our case, we study the sulfur-rich Borup-Fiord glacial springs in Canada utilizing the Hyperion instrument aboard the EO-1 spacecraft. This system consists of naturally occurring sulfur-rich springs emerging from glacial ice, which are a known environment for microbial life. The biological activity of the spring is associated with sulfur compounds that can be detected remotely via spectral analysis. This system may offer an analog to far more exotic locales such as Europa, where remote sensing of biogenic indicators is of considerable interest. Unfortunately, spacecraft processing power and memory are severely limited, which places strong constraints on the algorithms available. Previous work has been performed in the generation and execution of an onboard SVM (support vector machine) classifier to autonomously identify the presence of sulfur compounds associated with the activity of microbial life. However, those results were limited in the number of positive examples available to be labeled. In this paper we extend the sample size from 1 to 7 example scenes between 2006 and 2008, corresponding to a change from 18 to 235 positive labels. Of key interest is our assessment of the classifier's behavior on non-sulfur-bearing imagery far from the training region. Selection of the most relevant spectral bands and parameters for the SVM is also explored.
30 CFR 47.41 - Requirement for container labels.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Requirement for container labels. 47.41 Section... TRAINING HAZARD COMMUNICATION (HazCom) Container Labels and Other Forms of Warning § 47.41 Requirement for container labels. (a) The operator must ensure that each container of a hazardous chemical has a label. If a...
Generating Ground Reference Data for a Global Impervious Surface Survey
NASA Technical Reports Server (NTRS)
Tilton, James C.; deColstoun, Eric Brown; Wolfe, Robert E.; Tan, Bin; Huang, Chengquan
2012-01-01
We are engaged in a project to produce a 30m impervious cover data set of the entire Earth for the years 2000 and 2010 based on the Landsat Global Land Survey (GLS) data set. The GLS data from Landsat provide an opportunity to map global urbanization at this resolution for the first time, with unprecedented detail and accuracy. Moreover, the spatial resolution of Landsat is essential to accurately resolve urban targets such as buildings, roads and parking lots. Finally, with GLS data available for the 1975, 1990, 2000, and 2005 time periods, and soon for the 2010 period, the land cover/use changes due to urbanization can now be quantified at this spatial scale as well. Our approach works across spatial scales, using very high spatial resolution commercial satellite data to both produce and evaluate continental-scale products at the 30m spatial resolution of Landsat data. We are developing continental-scale training data at roughly 1m resolution and aggregating these to 30m for training a regression tree algorithm. Because the quality of the input training data is critical, we have developed an interactive software tool, called HSegLearn, to facilitate the photo-interpretation of high resolution imagery, such as Quickbird or Ikonos data, into an impervious versus non-impervious map. Previous work has shown that photo-interpretation of high resolution data at 1 meter resolution will generate an accurate 30m resolution ground reference when coarsened to that resolution. Since this process can be very time consuming when using standard clustering classification algorithms, we are looking at image segmentation as a potential avenue to not only improve the training process but also provide a semi-automated approach for generating the ground reference data. HSegLearn takes as its input a hierarchical set of image segmentations produced by the HSeg image segmentation program [1, 2]. HSegLearn lets an analyst specify pixel locations as being either positive or negative examples, and displays a classification of the study area based on these examples. For our study, the positive examples are examples of impervious surfaces and the negative examples are examples of non-impervious surfaces. HSegLearn searches the hierarchical segmentation from HSeg for the coarsest level of segmentation at which selected positive example locations do not conflict with negative example locations and labels the image accordingly. The negative example regions are always defined at the finest level of segmentation detail. The resulting classification map can then be further edited at a region-object level using the previously developed HSegViewer tool [3]. After providing an overview of the HSeg image segmentation program, we provide a detailed description of the HSegLearn software tool. We then give examples of using HSegLearn to generate ground reference data and conclude with comments on the effectiveness of the HSegLearn tool.
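The search HSegLearn performs over the segmentation hierarchy can be sketched as follows: for each positive example pixel, walk from the coarsest to the finest level and keep the first region that contains no negative example pixel. The dictionary-based data layout below is hypothetical and stands in for HSeg's actual region hierarchy.

```python
def coarsest_consistent_labels(hierarchy, positives, negatives):
    """Sketch of the HSegLearn idea described above. `hierarchy` is assumed to
    be a list of label maps (dicts mapping pixel -> region id), ordered fine to
    coarse; positives/negatives are analyst-selected pixel locations."""
    impervious = set()
    for pix in positives:
        for level in reversed(range(len(hierarchy))):        # coarsest first
            seg = hierarchy[level]
            region = seg[pix]
            members = {p for p, r in seg.items() if r == region}
            if not (members & set(negatives)):               # no conflict at this level
                impervious |= members                        # label the whole region
                break
    return impervious

# toy 1-D "image": 6 pixels, fine level = singletons, coarse level = two halves
fine   = {p: p for p in range(6)}
coarse = {0: "L", 1: "L", 2: "L", 3: "R", 4: "R", 5: "R"}
print(sorted(coarsest_consistent_labels([fine, coarse], positives=[1], negatives=[4])))
# -> [0, 1, 2]  (the whole left half, since it holds no negative example)
```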
A Simple Label Switching Algorithm for Semisupervised Structural SVMs.
Balamurugan, P; Shevade, Shirish; Sundararajan, S
2015-10-01
In structured output learning, obtaining labeled data for real-world applications is usually costly, while unlabeled examples are available in abundance. Semisupervised structured classification deals with a small number of labeled examples and a large number of unlabeled structured data. In this work, we consider semisupervised structural support vector machines with domain constraints. The optimization problem, which in general is not convex, contains the loss terms associated with the labeled and unlabeled examples, along with the domain constraints. We propose a simple optimization approach that alternates between solving a supervised learning problem and a constraint matching problem. Solving the constraint matching problem is difficult for structured prediction, and we propose an efficient and effective label switching method to solve it. The alternating optimization is carried out within a deterministic annealing framework, which helps in effective constraint matching and avoiding poor local minima, which are not very useful. The algorithm is simple and easy to implement. Further, it is suitable for any structured output learning problem where exact inference is available. Experiments on benchmark sequence labeling data sets and a natural language parsing data set show that the proposed approach, though simple, achieves comparable generalization performance.
Improving condition severity classification with an efficient active learning based framework.
Nissim, Nir; Boland, Mary Regina; Tatonetti, Nicholas P; Elovici, Yuval; Hripcsak, George; Shahar, Yuval; Moskovitch, Robert
2016-06-01
Classification of condition severity can be useful for discriminating among sets of conditions or phenotypes, for example when prioritizing patient care or for other healthcare purposes. Electronic Health Records (EHRs) represent a rich source of labeled information that can be harnessed for severity classification. The labeling of EHRs is expensive and in many cases requires employing professionals with a high level of expertise. In this study, we demonstrate the use of Active Learning (AL) techniques to decrease expert labeling efforts. We employ three AL methods and demonstrate their ability to reduce labeling efforts while effectively discriminating condition severity. We incorporate three AL methods into a new framework based on the original CAESAR (Classification Approach for Extracting Severity Automatically from Electronic Health Records) framework to create the Active Learning Enhancement framework (CAESAR-ALE). We applied CAESAR-ALE to a dataset containing 516 conditions of varying severity levels that were manually labeled by seven experts. Our dataset, called the "CAESAR dataset," was created from the medical records of 1.9 million patients treated at Columbia University Medical Center (CUMC). All three AL methods decreased labelers' efforts compared to the learning methods applied by the original CAESAR framework, in which the classifier was trained on the entire set of conditions; depending on the AL strategy used in the current study, the reduction ranged from 48% to 64%, which can result in significant savings in both time and money. As for the PPV (precision) measure, CAESAR-ALE achieved more than 13% absolute improvement in the predictive capabilities of the framework when classifying conditions as severe. These results demonstrate the potential of AL methods to decrease the labeling efforts of medical experts, while increasing accuracy given the same (or even a smaller) number of acquired conditions. We also demonstrated that the methods included in the CAESAR-ALE framework (Exploitation and Combination_XA) are more robust to the use of human labelers with different levels of professional expertise. Copyright © 2016 Elsevier Inc. All rights reserved.
HCP: A Flexible CNN Framework for Multi-label Image Classification.
Wei, Yunchao; Xia, Wei; Lin, Min; Huang, Junshi; Ni, Bingbing; Dong, Jian; Zhao, Yao; Yan, Shuicheng
2015-10-26
Convolutional Neural Networks (CNNs) have demonstrated promising performance in single-label image classification tasks. However, how a CNN best copes with multi-label images remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. In this work, we propose a flexible deep CNN infrastructure, called Hypotheses-CNN-Pooling (HCP), where an arbitrary number of object segment hypotheses are taken as the inputs, then a shared CNN is connected with each hypothesis, and finally the CNN output results from different hypotheses are aggregated with max pooling to produce the ultimate multi-label predictions. Some unique characteristics of this flexible deep CNN infrastructure include: 1) no ground-truth bounding box information is required for training; 2) the whole HCP infrastructure is robust to possibly noisy and/or redundant hypotheses; 3) the shared CNN is flexible and can be well pre-trained with a large-scale single-label image dataset, e.g., ImageNet; and 4) it may naturally output multi-label prediction results. Experimental results on the Pascal VOC 2007 and VOC 2012 multi-label image datasets clearly demonstrate the superiority of the proposed HCP infrastructure over other state-of-the-art methods. In particular, the mAP reaches 90.5% by HCP only and 93.2% after the fusion with our complementary result in [44] based on hand-crafted features on the VOC 2012 dataset.
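The hypothesis-pooling step of HCP is simply a max over per-hypothesis class scores; a sketch with the shared-CNN forward pass omitted:

```python
import numpy as np

def hcp_aggregate(hypothesis_scores):
    """Aggregate per-hypothesis CNN outputs into one multi-label prediction by
    max pooling over hypotheses, as described for HCP above. Input shape:
    (n_hypotheses, n_classes)."""
    return np.max(np.asarray(hypothesis_scores), axis=0)

# three object-segment hypotheses scored over four labels
scores = [[0.9, 0.1, 0.2, 0.0],
          [0.1, 0.8, 0.1, 0.0],
          [0.2, 0.2, 0.1, 0.1]]
print(hcp_aggregate(scores) > 0.5)   # -> [ True  True False False]
```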
NASA Technical Reports Server (NTRS)
Coberly, W. A.; Tubbs, J. D.; Odell, P. L.
1979-01-01
The overall success of large-scale crop inventories of agricultural regions using Landsat multispectral scanner data is highly dependent upon the labeling of training data by analyst/photointerpreters. The principal analyst tool in labeling training data is a false color infrared composite of Landsat bands 4, 5, and 7. In this paper, this color display is investigated and its influence upon classification errors is partially determined.
Lung nodule detection using 3D convolutional neural networks trained on weakly labeled data
NASA Astrophysics Data System (ADS)
Anirudh, Rushil; Thiagarajan, Jayaraman J.; Bremer, Timo; Kim, Hyojin
2016-03-01
Early detection of lung nodules is currently one of the most effective ways to predict and treat lung cancer. As a result, the past decade has seen a lot of focus on computer aided diagnosis (CAD) of lung nodules, whose goal is to efficiently detect and segment lung nodules and classify them as being benign or malignant. Effective detection of such nodules remains a challenge due to their arbitrariness in shape, size and texture. In this paper, we propose to employ 3D convolutional neural networks (CNN) to learn highly discriminative features for nodule detection in lieu of hand-engineered ones such as geometric shape or texture. While 3D CNNs are promising tools to model the spatio-temporal statistics of data, they are limited by their need for detailed 3D labels, which can be prohibitively expensive compared to obtaining 2D labels. Existing CAD methods rely on obtaining detailed labels for lung nodules to train models, which is also unrealistic and time consuming. To alleviate this challenge, we propose a solution wherein the expert needs to provide only a point label, i.e., the central pixel of the nodule, and its largest expected size. We use unsupervised segmentation to grow out a 3D region, which is used to train the CNN. Using experiments on the SPIE-LUNGx dataset, we show that the network trained using these weak labels can produce reasonably low false positive rates with a high sensitivity, even in the absence of accurate 3D labels.
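A sketch of how a point label plus an expected size can be turned into a 3D training mask by region growing; the intensity-tolerance flood fill is an illustrative stand-in for the unsupervised segmentation used in the paper, and the tolerance and connectivity are arbitrary choices.

```python
import numpy as np
from collections import deque

def grow_from_point(volume, seed, max_radius, tol=0.15):
    """Grow a 3D pseudo-label mask from a single expert point label, stopping
    at the expected nodule radius; the mask can then serve as a weak training
    target for a 3D CNN."""
    mask = np.zeros(volume.shape, dtype=bool)
    seed = tuple(seed)
    ref = volume[seed]
    q = deque([seed]); mask[seed] = True
    while q:
        z, y, x = q.popleft()
        for dz, dy, dx in [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]:
            n = (z + dz, y + dy, x + dx)
            if any(c < 0 or c >= s for c, s in zip(n, volume.shape)):
                continue                                   # outside the volume
            if mask[n]:
                continue                                   # already included
            if np.linalg.norm(np.subtract(n, seed)) > max_radius:
                continue                                   # beyond expected size
            if abs(volume[n] - ref) <= tol:                # similar intensity
                mask[n] = True
                q.append(n)
    return mask

vol = np.zeros((32, 32, 32)); vol[14:18, 14:18, 14:18] = 1.0   # synthetic nodule
mask = grow_from_point(vol, seed=(16, 16, 16), max_radius=4)
print(mask.sum())   # number of voxels in the pseudo-label
```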
49 CFR 172.404 - Labels for mixed and consolidated packaging.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 49 Transportation 2 2012-10-01 2012-10-01 false Labels for mixed and consolidated packaging. 172..., TRAINING REQUIREMENTS, AND SECURITY PLANS Labeling § 172.404 Labels for mixed and consolidated packaging. (a) Mixed packaging. When compatible hazardous materials having different hazard classes are packed...
49 CFR 172.404 - Labels for mixed and consolidated packaging.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 49 Transportation 2 2014-10-01 2014-10-01 false Labels for mixed and consolidated packaging. 172..., TRAINING REQUIREMENTS, AND SECURITY PLANS Labeling § 172.404 Labels for mixed and consolidated packaging. (a) Mixed packaging. When compatible hazardous materials having different hazard classes are packed...
49 CFR 172.404 - Labels for mixed and consolidated packaging.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 2 2011-10-01 2011-10-01 false Labels for mixed and consolidated packaging. 172..., TRAINING REQUIREMENTS, AND SECURITY PLANS Labeling § 172.404 Labels for mixed and consolidated packaging. (a) Mixed packaging. When compatible hazardous materials having different hazard classes are packed...
49 CFR 172.404 - Labels for mixed and consolidated packaging.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 49 Transportation 2 2013-10-01 2013-10-01 false Labels for mixed and consolidated packaging. 172..., TRAINING REQUIREMENTS, AND SECURITY PLANS Labeling § 172.404 Labels for mixed and consolidated packaging. (a) Mixed packaging. When compatible hazardous materials having different hazard classes are packed...
Semi-Supervised Active Learning for Sound Classification in Hybrid Learning Environments.
Han, Wenjing; Coutinho, Eduardo; Ruan, Huabin; Li, Haifeng; Schuller, Björn; Yu, Xiaojie; Zhu, Xuan
2016-01-01
Coping with scarcity of labeled data is a common problem in sound classification tasks. Approaches for classifying sounds are commonly based on supervised learning algorithms, which require labeled data that is often scarce, leading to models that do not generalize well. In this paper, we make an efficient combination of confidence-based Active Learning and Self-Training with the aim of minimizing the need for human annotation for sound classification model training. The proposed method pre-processes the instances that are ready for labeling by calculating their classifier confidence scores, and then delivers the candidates with lower scores to human annotators, while those with high scores are automatically labeled by the machine. We demonstrate the feasibility and efficacy of this method in two practical scenarios: pool-based and stream-based processing. Extensive experimental results indicate that our approach requires significantly fewer labeled instances to reach the same performance in both scenarios compared to Passive Learning, Active Learning and Self-Training. A reduction of 52.2% in human labeled instances is achieved in both the pool-based and stream-based scenarios on a sound classification task considering 16,930 sound instances.
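The confidence-based split at the heart of the method reads, in sketch form, as follows; the threshold and routing rule are illustrative.

```python
import numpy as np

def split_by_confidence(proba, threshold=0.9):
    """Route unlabeled sound clips as in the combined AL/self-training scheme
    above: high-confidence predictions are auto-labeled by the machine, the
    rest are sent to human annotators. `proba` is (n_samples, n_classes)."""
    conf = proba.max(axis=1)
    machine = np.where(conf >= threshold)[0]     # self-training: accept model label
    human = np.where(conf < threshold)[0]        # active learning: query annotator
    machine_labels = proba[machine].argmax(axis=1)
    return machine, machine_labels, human

proba = np.array([[0.97, 0.03], [0.55, 0.45], [0.20, 0.80]])
print(split_by_confidence(proba))   # machine: [0] with label [0]; human: [1, 2]
```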
Joint learning of labels and distance metric.
Liu, Bo; Wang, Meng; Hong, Richang; Zha, Zhengjun; Hua, Xian-Sheng
2010-06-01
Machine learning algorithms frequently suffer from the insufficiency of training data and the usage of inappropriate distance metric. In this paper, we propose a joint learning of labels and distance metric (JLLDM) approach, which is able to simultaneously address the two difficulties. In comparison with the existing semi-supervised learning and distance metric learning methods that focus only on label prediction or distance metric construction, the JLLDM algorithm optimizes the labels of unlabeled samples and a Mahalanobis distance metric in a unified scheme. The advantage of JLLDM is multifold: 1) the problem of training data insufficiency can be tackled; 2) a good distance metric can be constructed with only very few training samples; and 3) no radius parameter is needed since the algorithm automatically determines the scale of the metric. Extensive experiments are conducted to compare the JLLDM approach with different semi-supervised learning and distance metric learning methods, and empirical results demonstrate its effectiveness.
Sample Pesticide Label for Label Review Training
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Label Review Training: Module 1: Label Basics, Page 8
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Student beats the teacher: deep neural networks for lateral ventricles segmentation in brain MR
NASA Astrophysics Data System (ADS)
Ghafoorian, Mohsen; Teuwen, Jonas; Manniesing, Rashindra; Leeuw, Frank-Erik d.; van Ginneken, Bram; Karssemeijer, Nico; Platel, Bram
2018-03-01
Ventricular volume and its progression are known to be linked to several brain diseases such as dementia and schizophrenia. Therefore, accurate measurement of ventricle volume is vital for longitudinal studies on these disorders, making automated ventricle segmentation algorithms desirable. In the past few years, deep neural networks have been shown to outperform classical models in many imaging domains. However, the success of deep networks depends on manually labeled data sets, which are expensive to acquire, especially for higher-dimensional data in the medical domain. In this work, we show that deep neural networks can be trained on much cheaper-to-acquire pseudo-labels (e.g., generated by other, less accurate automated methods) and still produce segmentations more accurate than the labels themselves. To show this, we use noisy segmentation labels generated by a conventional region growing algorithm to train a deep network for lateral ventricle segmentation. Then, on a large manually annotated test set, we show that the network significantly outperforms the conventional region growing algorithm that was used to produce its training labels. Our experiments report a Dice Similarity Coefficient (DSC) of 0.874 for the trained network compared to 0.754 for the conventional region growing algorithm (p < 0.001).
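The pseudo-label idea above (train on labels produced by a cheaper automated method, then evaluate against manual labels) can be sketched in a few lines. The snippet below is a toy illustration of the mechanics only: a plain intensity threshold stands in for the region-growing labeler, and the synthetic data and Dice evaluation are assumptions, not the authors' pipeline.

```python
# Toy pseudo-label training: fit a classifier on labels from a cheap, noisy
# labeler, then score it against a small "manual" reference with Dice.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
features = rng.normal(size=(5000, 1))                            # fake per-voxel features
manual = (features[:, 0] + 0.3 * rng.normal(size=5000)) > 0.5    # stand-in for expert labels

pseudo = features[:, 0] > 0.7                                    # cheap automated labeler (assumption)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(features, pseudo)                                        # trained on pseudo-labels only

pred = clf.predict(features)
dice = 2 * np.logical_and(pred, manual).sum() / (pred.sum() + manual.sum())
print(f"Dice against manual labels: {dice:.3f}")
```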
49 CFR 172.404 - Labels for mixed and consolidated packaging.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 49 Transportation 2 2010-10-01 2010-10-01 false Labels for mixed and consolidated packaging. 172..., TRAINING REQUIREMENTS, AND SECURITY PLANS Labeling § 172.404 Labels for mixed and consolidated packaging. (a) Mixed packaging. When hazardous materials having different hazard classes are packed within the...
PRN 2000-5: Guidance for Mandatory and Advisory Labeling Statements
This notice provides guidance for improving the clarity of labeling statements in order to avoid confusing directions and precautions and to prevent the misuse of pesticides. It includes definitions and examples for mandatory and advisory label statements.
Label Review Training: Module 1: Label Basics, Page 2
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Label Review Training: Module 1: Label Basics, Page 9
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Label Review Training: Module 1: Label Basics, Page 5
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Label Review Training: Module 1: Label Basics, Page 4
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Label Review Training: Module 1: Label Basics, Page 3
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Breen, Cathal; Zhu, Tingting; Bond, Raymond; Finlay, Dewar; Clifford, Gari
2016-01-01
The aim of this study is to present and evaluate the integration of a low-resource JavaScript-based ECG training interface (CrowdLabel) and a standardised curriculum for self-guided tuition in ECG interpretation. Participants practiced interpreting ECGs weekly using the CrowdLabel interface to assist with learning the traditional didactic course material during a 6-week training period. To determine competency, students were tested during week 7. A total of 245 unique ECG cases were submitted by each student. Accuracy scores during the training period ranged from 0-59.5% (median = 33.3%). Conversely, accuracy scores during the test ranged from 30-70% (median = 37.5%) (p < 0.05). There was no correlation between the number of ECGs a student interpreted during the training period and the marks obtained. CrowdLabel is shown to be a readily accessible dedicated learning platform to support ECG interpretation competency. Copyright © 2016 Elsevier Inc. All rights reserved.
7 CFR 201.31a - Labeling treated seed.
Code of Federal Regulations, 2011 CFR
2011-01-01
... been treated shall be labeled in type no smaller than 8 point to indicate that the seed has been... labeling all types of mercurials. Examples of commonly accepted abbreviated chemical names are: BHC (1, 2... the size of the type used for information required to be on the label under paragraph (a) and shall...
A Locality-Constrained and Label Embedding Dictionary Learning Algorithm for Image Classification.
Zhengming Li; Zhihui Lai; Yong Xu; Jian Yang; Zhang, David
2017-02-01
Locality and label information of training samples play an important role in image classification. However, previous dictionary learning algorithms do not take the locality and label information of atoms into account together in the learning process, and thus their performance is limited. In this paper, a discriminative dictionary learning algorithm, called the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, was proposed for image classification. First, the locality information was preserved using the graph Laplacian matrix of the learned dictionary instead of the conventional one derived from the training samples. Then, the label embedding term was constructed using the label information of atoms instead of the classification error term, which contained discriminating information of the learned dictionary. The optimal coding coefficients derived by the locality-based and label-based reconstruction were effective for image classification. Experimental results demonstrated that the LCLE-DL algorithm can achieve better performance than some state-of-the-art algorithms.
ERIC Educational Resources Information Center
Loehr, Abbey M.; Rittle-Johnson, Bethany
2017-01-01
Research has demonstrated that providing labels helps children notice key features of examples. Much less is known about how different labels impact children's ability to make inferences about the structure underlying mathematical notation. We tested the impact of labeling decimals such as 0.34 using formal place-value labels ("3 tenths and 4…
Showing Parents How to Talk to Their Kids about the Nutrition Facts Label
... How to Talk to Their Kids about the Nutrition Facts Label Training for Health Educators and Community ... leaders unite with the goal of using the Nutrition Facts Label as their everyday tool for making ...
Label Review Training: Module 1: Label Basics, Page 6
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Improving condition severity classification with an efficient active learning based framework
Nissim, Nir; Boland, Mary Regina; Tatonetti, Nicholas P.; Elovici, Yuval; Hripcsak, George; Shahar, Yuval; Moskovitch, Robert
2017-01-01
Classification of condition severity can be useful for discriminating among sets of conditions or phenotypes, for example when prioritizing patient care or for other healthcare purposes. Electronic Health Records (EHRs) represent a rich source of labeled information that can be harnessed for severity classification. The labeling of EHRs is expensive and in many cases requires employing professionals with a high level of expertise. In this study, we demonstrate the use of Active Learning (AL) techniques to decrease expert labeling efforts. We employ three AL methods and demonstrate their ability to reduce labeling efforts while effectively discriminating condition severity. We incorporate three AL methods into a new framework based on the original CAESAR (Classification Approach for Extracting Severity Automatically from Electronic Health Records) framework to create the Active Learning Enhancement framework (CAESAR-ALE). We applied CAESAR-ALE to a dataset containing 516 conditions of varying severity levels that were manually labeled by seven experts. Our dataset, called the “CAESAR dataset,” was created from the medical records of 1.9 million patients treated at Columbia University Medical Center (CUMC). All three AL methods decreased labelers’ efforts compared to the learning methods applied by the original CAESAR framework, in which the classifier was trained on the entire set of conditions; depending on the AL strategy used in the current study, the reduction ranged from 48% to 64%, which can result in significant savings in both time and money. As for the PPV (precision) measure, CAESAR-ALE achieved more than a 13% absolute improvement in the predictive capabilities of the framework when classifying conditions as severe. These results demonstrate the potential of AL methods to decrease the labeling efforts of medical experts, while increasing accuracy given the same (or even a smaller) number of acquired conditions. We also demonstrated that the methods included in the CAESAR-ALE framework (Exploitation and Combination_XA) are more robust to the use of human labelers with different levels of professional expertise. PMID:27016383
Vrooman, Henri A; Cocosco, Chris A; van der Lijn, Fedde; Stokking, Rik; Ikram, M Arfan; Vernooij, Meike W; Breteler, Monique M B; Niessen, Wiro J
2007-08-01
Conventional k-Nearest-Neighbor (kNN) classification, which has been successfully applied to classify brain tissue in MR data, requires training on manually labeled subjects. This manual labeling is a laborious and time-consuming procedure. In this work, a new fully automated brain tissue classification procedure is presented, in which kNN training is automated. This is achieved by non-rigidly registering the MR data with a tissue probability atlas to automatically select training samples, followed by a post-processing step to keep the most reliable samples. The accuracy of the new method was compared to rigid registration-based training and to conventional kNN-based segmentation using training on manually labeled subjects for segmenting gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in 12 data sets. Furthermore, for all classification methods, the performance was assessed when varying the free parameters. Finally, the robustness of the fully automated procedure was evaluated on 59 subjects. The automated training method using non-rigid registration with a tissue probability atlas was significantly more accurate than rigid registration. For both automated training using non-rigid registration and for the manually trained kNN classifier, the difference with the manual labeling by observers was not significantly larger than inter-observer variability for all tissue types. From the robustness study, it was clear that, given an appropriate brain atlas and optimal parameters, our new fully automated, non-rigid registration-based method gives accurate and robust segmentation results. A similarity index was used for comparison with manually trained kNN. The similarity indices were 0.93, 0.92 and 0.92, for CSF, GM and WM, respectively. It can be concluded that our fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.
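A minimal sketch of the atlas-driven training step described above, assuming per-voxel features and registered tissue probabilities are already available; the reliability cutoff and the feature representation are illustrative assumptions, and the post-processing that keeps only the most reliable samples is reduced here to a simple threshold.

```python
# Select kNN training voxels automatically from a registered tissue probability
# atlas: keep only voxels where the atlas is confident, then fit the classifier.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def train_knn_from_atlas(features, prob_atlas, k=15, reliability=0.9):
    """features: (n_voxels, n_feats); prob_atlas: (n_voxels, n_tissues) after registration."""
    best_prob = prob_atlas.max(axis=1)
    tissue = prob_atlas.argmax(axis=1)            # provisional tissue label per voxel
    reliable = best_prob >= reliability           # keep only confident atlas voxels
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(features[reliable], tissue[reliable])
    return knn
```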
An Oracle-based co-training framework for writer identification in offline handwriting
NASA Astrophysics Data System (ADS)
Porwal, Utkarsh; Rajan, Sreeranga; Govindaraju, Venu
2012-01-01
State-of-the-art techniques for writer identification have centered primarily on enhancing the performance of the identification system. Machine learning algorithms have been used extensively to improve the accuracy of such systems, assuming a sufficient amount of data is available for training. Little attention has been paid to the prospect of harnessing the information contained in large amounts of un-annotated data. This paper focuses on a co-training based framework that can be used for iterative labeling of the unlabeled data set, exploiting the independence between the multiple views (features) of the data. This paradigm relaxes the assumption that sufficient data is available and tries to generate labeled data from the unlabeled data set while improving the accuracy of the system. However, the performance of a co-training based framework depends on the effectiveness of the algorithm used to select the data points to be added to the labeled set. We propose an Oracle-based approach for data selection that learns the patterns in the score distributions of classes for labeled data points and then predicts the labels (writers) of the unlabeled data points. This selection method statistically learns the class distribution and predicts the most probable class, unlike traditional selection algorithms based on heuristics. We conducted experiments on the publicly available IAM dataset and illustrate the efficacy of the proposed approach.
Weakly supervised image semantic segmentation based on clustering superpixels
NASA Astrophysics Data System (ADS)
Yan, Xiong; Liu, Xiaohua
2018-04-01
In this paper, we propose an image semantic segmentation model that is trained from image-level labeled images. The proposed model starts with superpixel segmentation, and features of the superpixels are extracted by a trained CNN. We introduce a superpixel-based graph and apply a graph partition method to group correlated superpixels into clusters. To acquire inter-label correlations between the image-level labels in the dataset, we utilize not only label co-occurrence statistics but also visual contextual cues. Finally, we formulate the task of mapping appropriate image-level labels to the detected clusters as a convex minimization problem. Experimental results on the MSRC-21 dataset and the LabelMe dataset show that the proposed method performs better than most weakly supervised methods and is even comparable to fully supervised methods.
Scaling up spike-and-slab models for unsupervised feature learning.
Goodfellow, Ian J; Courville, Aaron; Bengio, Yoshua
2013-08-01
We describe the use of two spike-and-slab models for modeling real-valued data, with an emphasis on their applications to object recognition. The first model, which we call spike-and-slab sparse coding (S3C), is a preexisting model for which we introduce a faster approximate inference algorithm. We introduce a deep variant of S3C, which we call the partially directed deep Boltzmann machine (PD-DBM) and extend our S3C inference algorithm for use on this model. We describe learning procedures for each. We demonstrate that our inference procedure for S3C enables scaling the model to unprecedented large problem sizes, and demonstrate that using S3C as a feature extractor results in very good object recognition performance, particularly when the number of labeled examples is low. We show that the PD-DBM generates better samples than its shallow counterpart, and that unlike DBMs or DBNs, the PD-DBM may be trained successfully without greedy layerwise training.
ERIC Educational Resources Information Center
Boxwell, Stephen A.
2011-01-01
Treebanks are a necessary prerequisite for many NLP tasks, including, but not limited to, semantic role labeling. For many languages, however, treebanks are either nonexistent or too small to be useful. Time-critical applications may require rapid deployment of natural language software for a new critical language--much faster than the development…
49 CFR 172.411 - EXPLOSIVE 1.1, 1.2, 1.3, 1.4, 1.5 and 1.6 labels, and EXPLOSIVE Subsidiary label.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 2 2011-10-01 2011-10-01 false EXPLOSIVE 1.1, 1.2, 1.3, 1.4, 1.5 and 1.6 labels..., EMERGENCY RESPONSE INFORMATION, TRAINING REQUIREMENTS, AND SECURITY PLANS Labeling § 172.411 EXPLOSIVE 1.1, 1.2, 1.3, 1.4, 1.5 and 1.6 labels, and EXPLOSIVE Subsidiary label. (a) Except for size and color...
ERIC Educational Resources Information Center
De Smet, M.; Van Keer, H.; Valcke, M.
2008-01-01
Cross-age tutors were randomly assigned to one of the three tutor training conditions distinguished for the current study: (1) the labelling experimental condition, characterized by requirements to label their tutor interventions, based on the e-moderating model of Salmon; (2) the non-labelling experimental condition, focusing on tutor's acting…
Ensemble LUT classification for degraded document enhancement
NASA Astrophysics Data System (ADS)
Obafemi-Ajayi, Tayo; Agam, Gady; Frieder, Ophir
2008-01-01
The fast evolution of scanning and computing technologies has led to the creation of large collections of scanned paper documents. Examples of such collections include historical collections, legal depositories, medical archives, and business archives. Moreover, in many situations, such as legal litigation and security investigations, scanned collections are being used to facilitate systematic exploration of the data. It is almost always the case that scanned documents suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large, global degradation models do not perform well. In contrast, we propose to estimate local degradation models and use them in enhancing degraded document images. Using a semi-automated enhancement system, we have labeled a subset of the Frieder diaries collection. This labeled subset was then used to train an ensemble classifier. The component classifiers are based on lookup tables (LUT) in conjunction with an approximate nearest neighbor algorithm. The resulting algorithm is highly efficient. Experimental evaluation results are provided using the Frieder diaries collection.
Reducing Spatial Data Complexity for Classification Models
NASA Astrophysics Data System (ADS)
Ruta, Dymitr; Gabrys, Bogdan
2007-11-01
Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be frequently retrained, which further hinders their use. Various data reduction techniques, ranging from data sampling to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is a proposition of a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. On the example of the Parzen Labelled Data Compressor (PLDC), we demonstrate a simulatory data condensation process directly inspired by electrostatic field interaction, where the data are moved and merged following the attracting and repelling interactions with the other labelled data. The process is controlled by the class density function built on the original data, which acts as a class-sensitive potential field ensuring preservation of the original class density distributions, yet allowing data to rearrange and merge, joining together their soft class partitions. As a result, we achieve a model that reduces labelled datasets much further than competing approaches, yet with maximum retention of the original class densities and hence the classification performance. PLDC leaves the reduced dataset with soft accumulative class weights, allowing for efficient online updates, and, as shown in a series of experiments, when coupled with a Parzen Density Classifier (PDC) it significantly outperforms competing data condensation methods in terms of classification performance at comparable compression levels.
Shabanpoor, Fazel; Gait, Michael J
2013-11-11
We describe a general methodology for fluorescent labelling of peptide conjugates of phosphorodiamidate morpholino oligonucleotides (PMOs) by alkyne functionalization of peptides, subsequent conjugation to PMOs and labelling with a fluorescent compound (Cy5-azide). Two peptide-PMO (PPMO) examples are shown. No detrimental effect of such labelled PMOs was seen in a biological assay.
Instance annotation for multi-instance multi-label learning
F. Briggs; X.Z. Fern; R. Raich; Q. Lou
2013-01-01
Multi-instance multi-label learning (MIML) is a framework for supervised classification where the objects to be classified are bags of instances associated with multiple labels. For example, an image can be represented as a bag of segments and associated with a list of objects it contains. Prior work on MIML has focused on predicting label sets for previously unseen...
NASA Astrophysics Data System (ADS)
Wu, J.; Yao, W.; Zhang, J.; Li, Y.
2018-04-01
Labeling 3D point cloud data with traditional supervised learning methods requires a considerable number of labelled samples, the collection of which is costly and time-consuming. This work focuses on adopting the domain adaptation concept to transfer existing trained random forest classifiers (based on a source domain) to new data scenes (the target domain), which aims at reducing the dependence of accurate 3D semantic labeling in point clouds on training samples from the new data scene. First, two random forest classifiers were trained with existing samples previously collected for other data. They differed from each other in the decision tree construction algorithm used: C4.5 with information gain ratio and CART with the Gini index. Second, four random forest classifiers adapted to the target domain were derived by transferring each tree in the source random forest models with two types of operations: structure expansion and reduction (SER) and structure transfer (STRUT). Finally, points in the target domain were labelled by fusing the four newly derived random forest classifiers using a weights-of-evidence based fusion model. To validate our method, experimental analysis was conducted using three datasets: one is used as the source domain data (Vaihingen data for 3D Semantic Labelling); the other two are used as target domain data from two cities in China (Jinmen city and Dunhuang city). Overall accuracies of 85.5% and 83.3% for 3D labelling were achieved for the Jinmen city and Dunhuang city data respectively, with only 1/3 of the newly labelled samples required compared to the cases without domain adaptation.
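The SER and STRUT tree-transfer operations are too involved for a short sketch, but the final fusion step can be illustrated. In the fragment below, simple normalized weights stand in for the weights-of-evidence fusion model, and the already adapted forests are assumed to share the same label set; this is an assumption-laden sketch, not the authors' method.

```python
# Fuse the predictions of several adapted random forests with per-model weights.
import numpy as np

def fuse_forests(forests, weights, X):
    """forests: fitted classifiers trained on the same label set; weights: one per forest."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                   # normalize the per-classifier weights
    proba = sum(wi * f.predict_proba(X) for f, wi in zip(forests, w))
    return forests[0].classes_[proba.argmax(axis=1)]  # weighted soft vote
```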
An active learning approach for rapid characterization of endothelial cells in human tumors.
Padmanabhan, Raghav K; Somasundar, Vinay H; Griffith, Sandra D; Zhu, Jianliang; Samoyedny, Drew; Tan, Kay See; Hu, Jiahao; Liao, Xuejun; Carin, Lawrence; Yoon, Sam S; Flaherty, Keith T; Dipaola, Robert S; Heitjan, Daniel F; Lal, Priti; Feldman, Michael D; Roysam, Badrinath; Lee, William M F
2014-01-01
Currently, no available pathological or molecular measures of tumor angiogenesis predict response to antiangiogenic therapies used in clinical practice. Recognizing that tumor endothelial cells (EC) and EC activation and survival signaling are the direct targets of these therapies, we sought to develop an automated platform for quantifying activity of critical signaling pathways and other biological events in EC of patient tumors by histopathology. Computer image analysis of EC in highly heterogeneous human tumors by a statistical classifier trained using examples selected by human experts performed poorly due to subjectivity and selection bias. We hypothesized that the analysis can be optimized by a more active process to aid experts in identifying informative training examples. To test this hypothesis, we incorporated a novel active learning (AL) algorithm into FARSIGHT image analysis software that aids the expert by seeking out informative examples for the operator to label. The resulting FARSIGHT-AL system identified EC with specificity and sensitivity consistently greater than 0.9 and outperformed traditional supervised classification algorithms. The system modeled individual operator preferences and generated reproducible results. Using the results of EC classification, we also quantified proliferation (Ki67) and activity in important signal transduction pathways (MAP kinase, STAT3) in immunostained human clear cell renal cell carcinoma and other tumors. FARSIGHT-AL enables characterization of EC in conventionally preserved human tumors in a more automated process suitable for testing and validating in clinical trials. The results of our study support a unique opportunity for quantifying angiogenesis in a manner that can now be tested for its ability to identify novel predictive and response biomarkers.
Training Requirements in OSHA Standards and Training Guidelines.
ERIC Educational Resources Information Center
Occupational Safety and Health Administration, Washington, DC.
This booklet contains Occupational Safety and Health Administration (OSHA) training requirements, excerpted from OSHA standards. The booklet is designed to help employers, safety and health professionals, training directors, and others who need to know training requirements. (Requirements for posting information, warning signs, labels, and the…
Basheti, Iman A; Obeidat, Nathir M; Reddel, Helen K
2017-02-09
Inhaler technique can be corrected with training, but skills drop off quickly without repeated training. The aim of our study was to explore the effect of novel inhaler technique labels on the retention of correct inhaler technique. In this single-blind randomized parallel-group active-controlled study, clinical pharmacists enrolled asthma patients using controller medication by Accuhaler [Diskus] or Turbuhaler. Inhaler technique was assessed using published checklists (score 0-9). Symptom control was assessed by asthma control test. Patients were randomized into active (ACCa; THa) and control (ACCc; THc) groups. All patients received a "Show-and-Tell" inhaler technique counseling service. Active patients also received inhaler labels highlighting their initial errors. Baseline data were available for 95 patients, 68% females, mean age 44.9 (SD 15.2) years. Mean inhaler scores were ACCa:5.3 ± 1.0; THa:4.7 ± 0.9, ACCc:5.5 ± 1.1; THc:4.2 ± 1.0. Asthma was poorly controlled (mean ACT scores ACCa:13.9 ± 4.3; THa:12.1 ± 3.9; ACCc:12.7 ± 3.3; THc:14.3 ± 3.7). After training, all patients had correct technique (score 9/9). After 3 months, there was significantly less decline in inhaler technique scores for active than control groups (mean difference: Accuhaler -1.04 (95% confidence interval -1.92, -0.16, P = 0.022); Turbuhaler -1.61 (-2.63, -0.59, P = 0.003). Symptom control improved significantly, with no significant difference between active and control patients, but active patients used less reliever medication (active 2.19 (SD 1.78) vs. control 3.42 (1.83) puffs/day, P = 0.002). After inhaler training, novel inhaler technique labels improve retention of correct inhaler technique skills with dry powder inhalers. Inhaler technique labels represent a simple, scalable intervention that has the potential to extend the benefit of inhaler training on asthma outcomes. REMINDER LABELS IMPROVE INHALER TECHNIQUE: Personalized labels on asthma inhalers remind patients of correct technique and help improve symptoms over time. Iman Basheti at the Applied Science Private University in Jordan and co-workers trialed the approach of placing patient-specific reminder labels on dry-powder asthma inhalers to improve long-term technique. Poor asthma control is often exacerbated by patients making mistakes when using their inhalers. During the trial, 95 patients received inhaler training before being split into two groups: the control group received no further help, while the other group received individualized labels on their inhalers reminding them of their initial errors. After three months, 67% of patients with reminder labels retained correct technique compared to only 12% of controls. They also required less reliever medication and reported improved symptoms. This represents a simple, cheap way of tackling inhaler technique errors.
NASA Astrophysics Data System (ADS)
An, Le; Adeli, Ehsan; Liu, Mingxia; Zhang, Jun; Lee, Seong-Whan; Shen, Dinggang
2017-03-01
Classification is one of the most important tasks in machine learning. Due to feature redundancy or outliers in samples, using all available data for training a classifier may be suboptimal. For example, the Alzheimer’s disease (AD) is correlated with certain brain regions or single nucleotide polymorphisms (SNPs), and identification of relevant features is critical for computer-aided diagnosis. Many existing methods first select features from structural magnetic resonance imaging (MRI) or SNPs and then use those features to build the classifier. However, with the presence of many redundant features, the most discriminative features are difficult to be identified in a single step. Thus, we formulate a hierarchical feature and sample selection framework to gradually select informative features and discard ambiguous samples in multiple steps for improved classifier learning. To positively guide the data manifold preservation process, we utilize both labeled and unlabeled data during training, making our method semi-supervised. For validation, we conduct experiments on AD diagnosis by selecting mutually informative features from both MRI and SNP, and using the most discriminative samples for training. The superior classification results demonstrate the effectiveness of our approach, as compared with the rivals.
Improving labeling efficiency in automatic quality control of MRSI data.
Pedrosa de Barros, Nuno; McKinley, Richard; Wiest, Roland; Slotboom, Johannes
2017-12-01
To improve the efficiency of the labeling task in automatic quality control of MR spectroscopy imaging data. 28,432 short and long echo time (TE) spectra (1.5 tesla; point resolved spectroscopy (PRESS); repetition time (TR) = 1,500 ms) from 18 different brain tumor patients were labeled by two experts as either accept or reject, depending on their quality. For each spectrum, 47 signal features were extracted. The data were then used to run several simulations and test an active learning approach using uncertainty sampling. The performance of the classifiers was evaluated as a function of the number of patients in the training set, the number of spectra in the training set, and a parameter α used to control the level of classification uncertainty required for a new spectrum to be selected for labeling. The results showed that the proposed strategy allows reductions of up to 72.97% for short TE and 62.09% for long TE in the amount of data that needs to be labeled, without significant impact on classification accuracy. Further reductions are possible with a significant but minimal impact on performance. Active learning using uncertainty sampling is an effective way to increase the labeling efficiency for training automatic quality control classifiers. Magn Reson Med 78:2399-2405, 2017. © 2017 International Society for Magnetic Resonance in Medicine.
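The pool-based selection rule described above can be written compactly. The sketch below mirrors the α parameter that gates how uncertain a spectrum must be before it is sent for labeling; the batch size and the classifier interface (any fitted model exposing `predict_proba`) are assumptions for the example.

```python
# Pool-based uncertainty sampling: return indices of the spectra whose top-class
# probability falls below alpha, least confident first.
import numpy as np

def select_for_labeling(clf, X_pool, alpha=0.9, batch_size=50):
    confidence = clf.predict_proba(X_pool).max(axis=1)
    uncertain = np.where(confidence < alpha)[0]     # candidates that need expert review
    order = np.argsort(confidence[uncertain])       # least confident first
    return uncertain[order][:batch_size]
```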
Training labels for hippocampal segmentation based on the EADC-ADNI harmonized hippocampal protocol.
Boccardi, Marina; Bocchetta, Martina; Morency, Félix C; Collins, D Louis; Nishikawa, Masami; Ganzola, Rossana; Grothe, Michel J; Wolf, Dominik; Redolfi, Alberto; Pievani, Michela; Antelmi, Luigi; Fellgiebel, Andreas; Matsuda, Hiroshi; Teipel, Stefan; Duchesne, Simon; Jack, Clifford R; Frisoni, Giovanni B
2015-02-01
The European Alzheimer's Disease Consortium and Alzheimer's Disease Neuroimaging Initiative (ADNI) Harmonized Protocol (HarP) is a Delphi definition of manual hippocampal segmentation from magnetic resonance imaging (MRI) that can be used as the standard of truth to train new tracers, and to validate automated segmentation algorithms. Training requires large and representative data sets of segmented hippocampi. This work aims to produce a set of HarP labels for the proper training and certification of tracers and algorithms. Sixty-eight 1.5 T and 67 3 T volumetric structural ADNI scans from different subjects, balanced by age, medial temporal atrophy, and scanner manufacturer, were segmented by five qualified HarP tracers whose absolute interrater intraclass correlation coefficients were 0.953 and 0.975 (left and right). Labels were validated as HarP compliant through centralized quality check and correction. Hippocampal volumes (mm³) were as follows: controls: left = 3060 (standard deviation [SD], 502), right = 3120 (SD, 897); mild cognitive impairment (MCI): left = 2596 (SD, 447), right = 2686 (SD, 473); and Alzheimer's disease (AD): left = 2301 (SD, 492), right = 2445 (SD, 525). Volumes significantly correlated with atrophy severity at Scheltens' scale (Spearman's ρ < -0.468, P < .0005). Cerebrospinal fluid spaces (mm³) were as follows: controls: left = 23 (32), right = 25 (25); MCI: left = 15 (13), right = 22 (16); and AD: left = 11 (13), right = 20 (25). Five subjects (3.7%) presented with unusual anatomy. This work provides reference hippocampal labels for the training and certification of automated segmentation algorithms. The publicly released labels will allow the widespread implementation of the standard segmentation protocol. Copyright © 2015 The Alzheimer's Association. Published by Elsevier Inc. All rights reserved.
Fact Sheet on Training and Exam Options for Pesticide Applicators
New pesticide label requirements for training protect applicators, other fumigant handlers and bystanders from soil fumigant exposures. Find criteria details and content check list for approval of training programs.
Joint Labeling Of Multiple Regions of Interest (Rois) By Enhanced Auto Context Models.
Kim, Minjeong; Wu, Guorong; Guo, Yanrong; Shen, Dinggang
2015-04-01
Accurate segmentation of a set of regions of interest (ROIs) in brain images is a key step in many neuroscience studies. Due to the complexity of image patterns, many learning-based segmentation methods have been proposed, including the auto context model (ACM), which can capture high-level contextual information for guiding segmentation. However, since the current ACM can only handle one ROI at a time, neighboring ROIs have to be labeled separately with different ACMs that are trained independently without communicating with each other. To address this, we enhance the current single-ROI learning ACM to a multi-ROI learning ACM for joint labeling of multiple neighboring ROIs (called eACM). First, we extend the current independently-trained single-ROI ACMs to a set of jointly-trained cross-ROI ACMs, by simultaneously training ACMs for all spatially-connected ROIs and letting them share their respective intermediate outputs for coordinated labeling of each image point. Then, the context features in each ACM can capture cross-ROI dependence information from the outputs of the other ACMs designed for neighboring ROIs. Second, we upgrade the output labeling map of each ACM with a multi-scale representation, so that both local and global context information can be effectively used to increase robustness in characterizing geometric relationships among neighboring ROIs. Third, we integrate the ACM into a multi-atlas segmentation paradigm to accommodate high variations among subjects. Experiments on the LONI LPBA40 dataset show much better performance by our eACM, compared to the conventional ACM.
Vajda, Szilárd; Rangoni, Yves; Cecotti, Hubert
2015-01-01
For training supervised classifiers to recognize different patterns, large data collections with accurate labels are necessary. In this paper, we propose a generic, semi-automatic labeling technique for large handwritten character collections. In order to speed up the creation of a large-scale ground truth, the method combines unsupervised clustering and minimal expert knowledge. To exploit the potential discriminant complementarities across features, each character is projected into five different feature spaces. After clustering the images in each feature space, the human expert labels the cluster centers. Each data point inherits the label of its cluster's center. A majority (or unanimity) vote decides the label of each character image. The amount of human involvement (labeling) is strictly controlled by the number of clusters produced by the chosen clustering approach. To test the efficiency of the proposed approach, we have compared and evaluated three state-of-the-art clustering methods (k-means, self-organizing maps, and growing neural gas) on the MNIST digit data set and a Lampung Indonesian character data set, respectively. Considering a k-nn classifier, we show that manually labeling only 1.3% (MNIST) and 3.2% (Lampung) of the training data provides the same range of performance as a completely labeled data set would. PMID:25870463
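A hedged sketch of the cluster-then-label-and-vote scheme described above: k-means is used for all feature spaces, the number of clusters is an assumption, and `label_center` stands for the expert labeling of cluster centers. Integer class labels are assumed so the majority vote can be taken directly.

```python
# Cluster each feature space, have the expert label only the cluster centers,
# propagate each center's label to its members, then majority-vote across spaces.
import numpy as np
from scipy.stats import mode
from sklearn.cluster import KMeans

def cluster_then_label(feature_spaces, label_center, k=100, seed=0):
    """feature_spaces: list of (n_samples, d_i) arrays; label_center(center) -> int label."""
    votes = []
    for X in feature_spaces:
        km = KMeans(n_clusters=k, random_state=seed, n_init=10).fit(X)
        center_labels = np.array([label_center(c) for c in km.cluster_centers_])
        votes.append(center_labels[km.labels_])          # each point inherits its center's label
    return mode(np.vstack(votes), axis=0).mode.ravel()   # majority vote across feature spaces
```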
Target discrimination method for SAR images based on semisupervised co-training
NASA Astrophysics Data System (ADS)
Wang, Yan; Du, Lan; Dai, Hui
2018-01-01
Synthetic aperture radar (SAR) target discrimination is usually performed in a supervised manner. However, supervised methods for SAR target discrimination may need many labeled training samples, whose acquisition is costly, time consuming, and sometimes impossible. This paper proposes an SAR target discrimination method based on semisupervised co-training, which utilizes a limited number of labeled samples and an abundant number of unlabeled samples. First, Lincoln features, widely used in SAR target discrimination, are extracted from the training samples and partitioned into two sets according to their physical meanings. Second, two support vector machine classifiers are iteratively co-trained with the extracted two feature sets based on the co-training algorithm. Finally, the trained classifiers are exploited to classify the test data. The experimental results on real SAR image data not only validate the effectiveness of the proposed method compared with the traditional supervised methods, but also demonstrate the superiority of co-training over self-training, which only uses one feature set.
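A compact sketch of the two-view co-training loop described above, assuming the two feature subsets are already split into `X1`/`X2` (labeled) and `X1_pool`/`X2_pool` (unlabeled); the number of rounds and the per-round quota are illustrative assumptions, not the authors' settings.

```python
# Two-view co-training with SVMs: each view's classifier self-labels the pool
# items it is most confident about, and those items augment the shared labeled set.
import numpy as np
from sklearn.svm import SVC

def co_train(X1, X2, y, X1_pool, X2_pool, rounds=5, per_round=20):
    X1_lab, X2_lab, y_lab = X1.copy(), X2.copy(), y.copy()
    pool_idx = np.arange(len(X1_pool))
    for _ in range(rounds):
        if len(pool_idx) == 0:
            break
        c1 = SVC(probability=True).fit(X1_lab, y_lab)
        c2 = SVC(probability=True).fit(X2_lab, y_lab)
        chosen = {}                                    # pool index -> pseudo-label
        for clf, Xp in ((c1, X1_pool[pool_idx]), (c2, X2_pool[pool_idx])):
            proba = clf.predict_proba(Xp)
            top = np.argsort(proba.max(axis=1))[::-1][:per_round]   # most confident items
            for j in top:
                chosen.setdefault(pool_idx[j], clf.classes_[proba[j].argmax()])
        idx = np.fromiter(chosen.keys(), dtype=int)
        X1_lab = np.vstack([X1_lab, X1_pool[idx]])
        X2_lab = np.vstack([X2_lab, X2_pool[idx]])
        y_lab = np.concatenate([y_lab, np.array(list(chosen.values()))])
        pool_idx = np.setdiff1d(pool_idx, idx)
    # final classifiers trained on the augmented labeled set
    return SVC(probability=True).fit(X1_lab, y_lab), SVC(probability=True).fit(X2_lab, y_lab)
```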
Adaptive maritime video surveillance
NASA Astrophysics Data System (ADS)
Gupta, Kalyan Moy; Aha, David W.; Hartley, Ralph; Moore, Philip G.
2009-05-01
Maritime assets such as ports, harbors, and vessels are vulnerable to a variety of near-shore threats such as small-boat attacks. Currently, such vulnerabilities are addressed predominantly by watchstanders and manual video surveillance, which is manpower intensive. Automatic maritime video surveillance techniques are being introduced to reduce manpower costs, but they have limited functionality and performance. For example, they only detect simple events such as perimeter breaches and cannot predict emerging threats. They also generate too many false alerts and cannot explain their reasoning. To overcome these limitations, we are developing the Maritime Activity Analysis Workbench (MAAW), which will be a mixed-initiative real-time maritime video surveillance tool that uses an integrated supervised machine learning approach to label independent and coordinated maritime activities. It uses the same information to predict anomalous behavior and explain its reasoning; this is an important capability for watchstander training and for collecting performance feedback. In this paper, we describe MAAW's functional architecture, which includes the following pipeline of components: (1) a video acquisition and preprocessing component that detects and tracks vessels in video images, (2) a vessel categorization and activity labeling component that uses standard and relational supervised machine learning methods to label maritime activities, and (3) an ontology-guided vessel and maritime activity annotator to enable subject matter experts (e.g., watchstanders) to provide feedback and supervision to the system. We report our findings from a preliminary system evaluation on river traffic video.
Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information
NASA Astrophysics Data System (ADS)
Jamshidpour, N.; Homayouni, S.; Safari, A.
2017-09-01
Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attention. For example, the lack of enough labeled samples and the high dimensionality problem are the two most important issues, which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues through the contribution of unlabeled samples, which are available in enormous amounts. In this paper, we propose a graph-based semi-supervised classification method that uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationships among pixels in the spectral and spatial spaces, respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce. When there were only five labeled samples for each class, the performance improved by 5.92% and 10.76% compared to spatial graph-based SSL, for the AVIRIS Indian Pine and Pavia University data sets respectively.
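A hedged sketch of the graph-merging idea: two affinity graphs (one from spectral features, one from spatial coordinates) are combined into a single Laplacian and labels are propagated with the standard harmonic-function solution. The RBF affinities, the mixing weight `mu`, and the small ridge term are assumptions introduced for the example, not the authors' exact construction.

```python
# Merge a spectral graph and a spatial graph, then propagate labels on the joint graph.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def merged_laplacian(X_spectral, X_spatial, mu=0.5, gamma=1.0):
    """Both inputs describe the same n pixels; returns the combined unnormalized Laplacian."""
    W = mu * rbf_kernel(X_spectral, gamma=gamma) + (1 - mu) * rbf_kernel(X_spatial, gamma=gamma)
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def propagate(L, labeled_idx, labels, n_classes):
    """Harmonic-function label propagation; labels are integer class ids for labeled_idx."""
    n = L.shape[0]
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    Y = np.zeros((len(labeled_idx), n_classes))
    Y[np.arange(len(labeled_idx)), labels] = 1.0          # one-hot seed labels
    Luu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    Lul = L[np.ix_(unlabeled_idx, labeled_idx)]
    F = np.linalg.solve(Luu + 1e-6 * np.eye(len(unlabeled_idx)), -Lul @ Y)
    out = np.empty(n, dtype=int)
    out[labeled_idx] = labels
    out[unlabeled_idx] = F.argmax(axis=1)
    return out
```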
GEO Label: User and Producer Perspectives on a Label for Geospatial Data
NASA Astrophysics Data System (ADS)
Lush, V.; Lumsden, J.; Masó, J.; Díaz, P.; McCallum, I.
2012-04-01
One of the aims of the Science and Technology Committee (STC) of the Group on Earth Observations (GEO) was to establish a GEO Label - a label to certify geospatial datasets and their quality. As proposed, the GEO Label will be used as a value indicator for geospatial data and datasets accessible through the Global Earth Observation System of Systems (GEOSS). It is suggested that the development of such a label will significantly improve user recognition of the quality of geospatial datasets and that its use will help promote trust in datasets that carry the established GEO Label. Furthermore, the GEO Label is seen as an incentive to data providers. At the moment GEOSS contains a large amount of data and is constantly growing. Taking this into account, a GEO Label could assist in searching by providing users with visual cues of dataset quality and possibly relevance; a GEO Label could effectively stand as a decision support mechanism for dataset selection. Currently our project, GeoViQua, together with EGIDA and ID-03, is undertaking research to define and evaluate the concept of a GEO Label. The development and evaluation process will be carried out in three phases. In phase I we have conducted an online survey (GEO Label Questionnaire) to identify the initial user and producer views on a GEO Label or its potential role. In phase II we will conduct a further study presenting some GEO Label examples that will be based on Phase I. We will elicit feedback on these examples under controlled conditions. In phase III we will create physical prototypes which will be used in a human subject study. The most successful prototypes will then be put forward as potential GEO Label options. At the moment we are in phase I, where we have developed an online questionnaire to collect the initial GEO Label requirements and to identify the role that a GEO Label should serve from the user and producer standpoint. The GEO Label Questionnaire consists of generic questions to identify whether users and producers believe a GEO Label is relevant to geospatial data; whether they want a single "one-for-all" label or separate labels that will serve a particular role; the function that would be most relevant for a GEO Label to carry; and the functionality that users and producers would like to see from common rating and review systems they use. To distribute the questionnaire, relevant user and expert groups were contacted at meetings or by email. At this stage we have successfully collected over 80 valid responses from geospatial data users and producers. This communication will provide a comprehensive analysis of the survey results, indicating to what extent the users surveyed in Phase I value a GEO Label, and suggesting in what directions a GEO Label may develop. Potential GEO Label examples based on the results of the survey will be presented for use in Phase II.
Kavuluru, Ramakanth; Rios, Anthony; Lu, Yuan
2015-01-01
Background: Diagnosis codes are assigned to medical records in healthcare facilities by trained coders by reviewing all physician authored documents associated with a patient's visit. This is a necessary and complex task involving coders adhering to coding guidelines and coding all assignable codes. With the popularity of electronic medical records (EMRs), computational approaches to code assignment have been proposed in recent years. However, most efforts have focused on single and often short clinical narratives, while realistic scenarios warrant full EMR level analysis for code assignment. Objective: We evaluate supervised learning approaches to automatically assign international classification of diseases (ninth revision) - clinical modification (ICD-9-CM) codes to EMRs by experimenting with a large realistic EMR dataset. The overall goal is to identify methods that offer superior performance in this task when considering such datasets. Methods: We use a dataset of 71,463 EMRs corresponding to in-patient visits with discharge date falling in a two year period (2011–2012) from the University of Kentucky (UKY) Medical Center. We curate a smaller subset of this dataset and also use a third gold standard dataset of radiology reports. We conduct experiments using different problem transformation approaches with feature and data selection components and employing suitable label calibration and ranking methods with novel features involving code co-occurrence frequencies and latent code associations. Results: Over all codes with at least 50 training examples we obtain a micro F-score of 0.48. On the set of codes that occur in at least 1% of the two year dataset, we achieve a micro F-score of 0.54. For the smaller radiology report dataset, the classifier chaining approach yields best results. For the smaller subset of the UKY dataset, feature selection, data selection, and label calibration offer best performance. Conclusions: We show that datasets at different scale (size of the EMRs, number of distinct codes) and with different characteristics warrant different learning approaches. For shorter narratives pertaining to a particular medical subdomain (e.g., radiology, pathology), classifier chaining is ideal given that the codes are highly related to each other. For realistic in-patient full EMRs, feature and data selection methods offer high performance for smaller datasets. However, for large EMR datasets, we observe that the binary relevance approach with learning-to-rank based code reranking offers the best performance. Regardless of the training dataset size, for general EMRs, label calibration to select the optimal number of labels is an indispensable final step. PMID:26054428
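The binary relevance plus label calibration pipeline discussed above can be illustrated with a short sketch. One-vs-rest logistic regression and a caller-supplied per-document label count are assumptions for the example; the paper's learning-to-rank reranking and learned label calibration are not reproduced here.

```python
# Binary-relevance multi-label coding with a simple per-document label-count cut-off.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def fit_coder(X, Y):
    """X: document features; Y: (n_docs, n_codes) binary indicator matrix."""
    return OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)

def assign_codes(model, X, n_labels_per_doc):
    """Keep the k highest-scoring codes for each document (k supplied per document)."""
    scores = model.predict_proba(X)                  # per-code probabilities
    assigned = np.zeros_like(scores, dtype=int)
    for i, k in enumerate(n_labels_per_doc):
        top = np.argsort(scores[i])[::-1][:k]
        assigned[i, top] = 1
    return assigned
```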
Safety Issues in Controlling Bed Bugs
Prevent pesticide misuse by always following label directions carefully. For example, don't use a pesticide indoors that is labeled for outdoor use, prepare the treatment area as instructed, and look for an EPA registration number.
Less label, more free: approaches in label-free quantitative mass spectrometry.
Neilson, Karlie A; Ali, Naveid A; Muralidharan, Sridevi; Mirzaei, Mehdi; Mariani, Michael; Assadourian, Gariné; Lee, Albert; van Sluyter, Steven C; Haynes, Paul A
2011-02-01
In this review we examine techniques, software, and statistical analyses used in label-free quantitative proteomics studies for area under the curve and spectral counting approaches. Recent advances in the field are discussed in an order that reflects a logical workflow design. Examples of studies that follow this design are presented to highlight the requirement for statistical assessment and further experiments to validate results from label-free quantitation. Limitations of label-free approaches are considered, label-free approaches are compared with labelling techniques, and forward-looking applications for label-free quantitative data are presented. We conclude that label-free quantitative proteomics is a reliable, versatile, and cost-effective alternative to labelled quantitation. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Amis, Gregory P; Carpenter, Gail A
2010-03-01
Computational models of learning typically train on labeled input patterns (supervised learning), unlabeled input patterns (unsupervised learning), or a combination of the two (semi-supervised learning). In each case input patterns have a fixed number of features throughout training and testing. Human and machine learning contexts present additional opportunities for expanding incomplete knowledge from formal training, via self-directed learning that incorporates features not previously experienced. This article defines a new self-supervised learning paradigm to address these richer learning contexts, introducing a neural network called self-supervised ARTMAP. Self-supervised learning integrates knowledge from a teacher (labeled patterns with some features), knowledge from the environment (unlabeled patterns with more features), and knowledge from internal model activation (self-labeled patterns). Self-supervised ARTMAP learns about novel features from unlabeled patterns without destroying partial knowledge previously acquired from labeled patterns. A category selection function bases system predictions on known features, and distributed network activation scales unlabeled learning to prediction confidence. Slow distributed learning on unlabeled patterns focuses on novel features and confident predictions, defining classification boundaries that were ambiguous in the labeled patterns. Self-supervised ARTMAP improves test accuracy on illustrative low-dimensional problems and on high-dimensional benchmarks. Model code and benchmark data are available from: http://techlab.eu.edu/SSART/. Copyright 2009 Elsevier Ltd. All rights reserved.
Cross-Domain Semi-Supervised Learning Using Feature Formulation.
Xingquan Zhu
2011-12-01
Semi-Supervised Learning (SSL) traditionally makes use of unlabeled samples by including them in the training set through an automated labeling process. Such a primitive Semi-Supervised Learning (pSSL) approach suffers from a number of disadvantages, including false labeling and an inability to utilize out-of-domain samples. In this paper, we propose a formative Semi-Supervised Learning (fSSL) framework which explores hidden features between labeled and unlabeled samples to achieve semi-supervised learning. fSSL assumes that both labeled and unlabeled samples are generated from some hidden concepts, with labeling information partially observable for some samples. The key of fSSL is to recover the hidden concepts and take them as new features to link labeled and unlabeled samples for semi-supervised learning. Because unlabeled samples are only used to generate new features, rather than being explicitly included in the training set as in pSSL, fSSL overcomes the inherent disadvantages of traditional pSSL methods, especially for samples not within the same domain as the labeled instances. Experimental results and comparisons demonstrate that fSSL significantly outperforms pSSL-based methods for both within-domain and cross-domain semi-supervised learning.
What's in a Label? Careers in Integrated Early Childhood Programs.
ERIC Educational Resources Information Center
Gorelick, Molly C.
The paper, given by the director of a project to train teachers for early childhood education programs which integrate handicapped and normal children, focuses on the effects of labeling on teacher-child interaction. The author recounts her own experience with teaching handicapped children and the historical tendency to label and segregate various…
McMorrow, M J; Foxx, R M; Faw, G D; Bittle, R G
1987-01-01
We evaluated the direct and generalized effects of cues-pause-point language training procedures on immediate echolalia and correct responding in two severely retarded females. Two experiments were conducted with each subject in which the overall goal was to encourage them to remain quiet before, during, and briefly after the presentation of questions and then to verbalize on the basis of environmental cues whose labels represented the correct responses. Multiple baseline designs across question/response pairs (Experiment I) or question/response pairs and settings (Experiment II) demonstrated that echolalia was rapidly replaced by correct responding on the trained stimuli. More importantly, there were clear improvements in subjects' responding to untrained stimuli. Results demonstrated that the cues-pause-point procedures can be effective in teaching severely retarded or echolalic individuals functional use of their verbal labeling repertoires. PMID:3583962
Sample Complexity Bounds for Differentially Private Learning
Chaudhuri, Kamalika; Hsu, Daniel
2013-01-01
This work studies the problem of privacy-preserving classification – namely, learning a classifier from sensitive data while preserving the privacy of individuals in the training set. In particular, the learning algorithm is required in this problem to guarantee differential privacy, a very strong notion of privacy that has gained significant attention in recent years. A natural question to ask is: what is the sample requirement of a learning algorithm that guarantees a certain level of privacy and accuracy? We address this question in the context of learning with infinite hypothesis classes when the data is drawn from a continuous distribution. We first show that even for very simple hypothesis classes, any algorithm that uses a finite number of examples and guarantees differential privacy must fail to return an accurate classifier for at least some unlabeled data distributions. This result is unlike the case with either finite hypothesis classes or discrete data domains, in which distribution-free private learning is possible, as previously shown by Kasiviswanathan et al. (2008). We then consider two approaches to differentially private learning that get around this lower bound. The first approach is to use prior knowledge about the unlabeled data distribution in the form of a reference distribution chosen independently of the sensitive data. Given such a reference distribution, we provide an upper bound on the sample requirement that depends (among other things) on a measure of closeness between the reference distribution and the unlabeled data distribution. Our upper bound applies to the non-realizable as well as the realizable case. The second approach is to relax the privacy requirement, by requiring only label-privacy – namely, that only the labels (and not the unlabeled parts of the examples) be considered sensitive information. An upper bound on the sample requirement of learning with label privacy was shown by Chaudhuri et al. (2006); in this work, we show a lower bound. PMID:25285183
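For context, the privacy guarantee referred to throughout this abstract is the standard one: a randomized learning algorithm A is ε-differentially private if, for every pair of training sets D and D′ that differ in a single individual's record and every set S of possible outputs,

```latex
% standard epsilon-differential privacy guarantee (not specific to this paper)
\Pr\bigl[ A(D) \in S \bigr] \;\le\; e^{\epsilon} \, \Pr\bigl[ A(D') \in S \bigr].
```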
Simplified one-pot synthesis of [18F]SFB for radiolabeling
Olma, Sebastian; Shen, Clifton Kwang-Fu
2015-08-04
A non-aqueous single pot synthesis of [18F]SFB is set forth. The [18F]SFB produced with this method is then used, for example, to label a peptide or an engineered antibody fragment (diabody) targeting human epidermal growth factor receptor 2 (HER2) as representative examples of labeled compounds for use as an injectable composition to locate abnormal tissue, specifically tumors within an animal or human using a PET scan.
Simplified one-pot synthesis of [18F]SFB for radiolabeling
Olma, Sebastian; Shen, Clifton Kwang-Fu
2013-07-16
A non-aqueous single pot synthesis of [18F]SFB is set forth. The [18F]SFB produced with this method is then used, for example, to label a peptide or an engineered antibody fragment (diabody) targeting human epidermal growth factor receptor 2 (HER2) as representative examples of labeled compounds for use as an injectable composition to locate abnormal tissue, specifically tumors within an animal or human using a PET scan.
Label Review Training - Resources
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Information Measures of Degree Distributions with an Application to Labeled Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joslyn, Cliff A.; Purvine, Emilie AH
2016-01-11
The problem of describing the distribution of labels over a set of objects is relevant to many domains. For example: cyber security, social media, and protein interactions all care about the manner in which labels are distributed among different objects. In this paper we present three interacting statistical measures on label distributions, inspired by entropy and information theory. Labeled graphs are discussed as a specific case of labels distributed over a set of edges. We describe a use case in cyber security using a labeled directed multi-graph of IPFLOW. Finally we show how these measures respond when labels are updated in certain ways.
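One of the ingredients behind such measures can be sketched directly; the snippet below computes the Shannon entropy of an empirical label distribution over a hypothetical labeled edge set (the paper's three interacting measures go further than this):

```python
# Shannon entropy of the empirical label distribution over a hypothetical
# labeled edge set (e.g. flow records labeled by protocol).
from collections import Counter
from math import log2

edge_labels = ["http", "dns", "http", "ssh", "dns", "http", "http"]
counts = Counter(edge_labels)
n = sum(counts.values())
entropy = -sum((c / n) * log2(c / n) for c in counts.values())
print(f"label entropy: {entropy:.3f} bits (maximum {log2(len(counts)):.3f})")
```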
Code of Federal Regulations, 2010 CFR
2010-10-01
... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. 414.930... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. (a... specialty compendium, for example a compendium of anti-cancer treatment. A compendium— (i) Includes a...
Code of Federal Regulations, 2011 CFR
2011-10-01
... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. 414.930... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. (a... specialty compendium, for example a compendium of anti-cancer treatment. A compendium— (i) Includes a...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-27
... medical literature, national organizations, or technology assessment bodies that the off-label use is safe... medical literature, national organizations, or technology assessment bodies that the off-label use is safe.... Due to the rapid and extensive changes in medical technology it is not feasible to maintain this list...
Code of Federal Regulations, 2013 CFR
2013-10-01
... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. 414.930... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. (a... specialty compendium, for example a compendium of anti-cancer treatment. A compendium— (i) Includes a...
Code of Federal Regulations, 2012 CFR
2012-10-01
... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. 414.930... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. (a... specialty compendium, for example a compendium of anti-cancer treatment. A compendium— (i) Includes a...
Code of Federal Regulations, 2014 CFR
2014-10-01
... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. 414.930... indications for off-label uses of drugs and biologicals in an anti-cancer chemotherapeutic regimen. (a... specialty compendium, for example a compendium of anti-cancer treatment. A compendium— (i) Includes a...
Optimizing ChIP-seq peak detectors using visual labels and supervised machine learning
Goerner-Potvin, Patricia; Morin, Andreanne; Shao, Xiaojian; Pastinen, Tomi
2017-01-01
Motivation: Many peak detection algorithms have been proposed for ChIP-seq data analysis, but it is not obvious which algorithm and what parameters are optimal for any given dataset. In contrast, regions with and without obvious peaks can be easily labeled by visual inspection of aligned read counts in a genome browser. We propose a supervised machine learning approach for ChIP-seq data analysis, using labels that encode qualitative judgments about which genomic regions contain or do not contain peaks. The main idea is to manually label a small subset of the genome, and then learn a model that makes consistent peak predictions on the rest of the genome. Results: We created 7 new histone mark datasets with 12 826 visually determined labels, and analyzed 3 existing transcription factor datasets. We observed that default peak detection parameters yield high false positive rates, which can be reduced by learning parameters using a relatively small training set of labeled data from the same experiment type. We also observed that labels from different people are highly consistent. Overall, these data indicate that our supervised labeling method is useful for quantitatively training and testing peak detection algorithms. Availability and Implementation: Labeled histone mark data http://cbio.ensmp.fr/~thocking/chip-seq-chunk-db/, R package to compute the label error of predicted peaks https://github.com/tdhock/PeakError Contacts: toby.hocking@mail.mcgill.ca or guil.bourque@mcgill.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27797775
Optimizing ChIP-seq peak detectors using visual labels and supervised machine learning.
Hocking, Toby Dylan; Goerner-Potvin, Patricia; Morin, Andreanne; Shao, Xiaojian; Pastinen, Tomi; Bourque, Guillaume
2017-02-15
Many peak detection algorithms have been proposed for ChIP-seq data analysis, but it is not obvious which algorithm and what parameters are optimal for any given dataset. In contrast, regions with and without obvious peaks can be easily labeled by visual inspection of aligned read counts in a genome browser. We propose a supervised machine learning approach for ChIP-seq data analysis, using labels that encode qualitative judgments about which genomic regions contain or do not contain peaks. The main idea is to manually label a small subset of the genome, and then learn a model that makes consistent peak predictions on the rest of the genome. We created 7 new histone mark datasets with 12 826 visually determined labels, and analyzed 3 existing transcription factor datasets. We observed that default peak detection parameters yield high false positive rates, which can be reduced by learning parameters using a relatively small training set of labeled data from the same experiment type. We also observed that labels from different people are highly consistent. Overall, these data indicate that our supervised labeling method is useful for quantitatively training and testing peak detection algorithms. Labeled histone mark data http://cbio.ensmp.fr/~thocking/chip-seq-chunk-db/ , R package to compute the label error of predicted peaks https://github.com/tdhock/PeakError. toby.hocking@mail.mcgill.ca or guil.bourque@mcgill.ca. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
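The label-error idea is simple enough to sketch; the following is a simplified Python analog (the authors' actual implementation is the PeakError R package linked above), where labeled regions are counted as false positives or false negatives against a set of predicted peak intervals:

```python
# Simplified analog of the visual-label error: a "noPeaks" region overlapped by
# any predicted peak is a false positive; a "peaks" region overlapped by none
# is a false negative. Regions and predictions are hypothetical intervals.
def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def label_error(labels, predicted_peaks):
    fp = fn = 0
    for start, end, kind in labels:          # kind: "peaks" or "noPeaks"
        hit = any(overlaps((start, end), p) for p in predicted_peaks)
        if kind == "noPeaks" and hit:
            fp += 1
        if kind == "peaks" and not hit:
            fn += 1
    return fp, fn

labels = [(100, 200, "peaks"), (300, 400, "noPeaks"), (500, 600, "peaks")]
print(label_error(labels, predicted_peaks=[(150, 180), (350, 360)]))  # (1, 1)
```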
Label Review Training - Table of Contents
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Symons, Christopher T; Arel, Itamar
2011-01-01
Budgeted learning under constraints on both the amount of labeled information and the availability of features at test time pertains to a large number of real world problems. Ideas from multi-view learning, semi-supervised learning, and even active learning have applicability, but a common framework whose assumptions fit these problem spaces is non-trivial to construct. We leverage ideas from these fields based on graph regularizers to construct a robust framework for learning from labeled and unlabeled samples in multiple views that are non-independent and include features that are inaccessible at the time the model would need to be applied. We describe examples of applications that fit this scenario, and we provide experimental results to demonstrate the effectiveness of knowledge carryover from training-only views. As learning algorithms are applied to more complex applications, relevant information can be found in a wider variety of forms, and the relationships between these information sources are often quite complex. The assumptions that underlie most learning algorithms do not readily or realistically permit the incorporation of many of the data sources that are available, despite an implicit understanding that useful information exists in these sources. When multiple information sources are available, they are often partially redundant, highly interdependent, and contain noise as well as other information that is irrelevant to the problem under study. In this paper, we are focused on a framework whose assumptions match this reality, as well as the reality that labeled information is usually sparse. Most significantly, we are interested in a framework that can also leverage information in scenarios where many features that would be useful for learning a model are not available when the resulting model will be applied. As with constraints on labels, there are many practical limitations on the acquisition of potentially useful features. A key difference in the case of feature acquisition is that the same constraints often don't pertain to the training samples. This difference provides an opportunity to allow features that are impractical in an applied setting to nevertheless add value during the model-building process. Unfortunately, there are few machine learning frameworks built on assumptions that allow effective utilization of features that are only available at training time. In this paper we formulate a knowledge carryover framework for the budgeted learning scenario with constraints on features and labels. The approach is based on multi-view and semi-supervised learning methods that use graph-encoded regularization. Our main contributions are the following: (1) we propose and provide justification for a methodology for ensuring that changes in the graph regularizer using alternate views are performed in a manner that is target-concept specific, allowing value to be obtained from noisy views; and (2) we demonstrate how this general set-up can be used to effectively improve models by leveraging features unavailable at test time. The rest of the paper is structured as follows. In Section 2, we outline real-world problems to motivate the approach and describe relevant prior work. Section 3 describes the graph construction process and the learning methodologies that are employed. Section 4 provides preliminary discussion regarding theoretical motivation for the method. In Section 5, the effectiveness of the approach is demonstrated in a series of experiments employing modified versions of two well-known semi-supervised learning algorithms. Section 6 concludes the paper.
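The graph-regularizer ingredient of such frameworks can be sketched compactly; the snippet below builds a k-nearest-neighbour similarity graph over all samples and solves a Laplacian-regularized least-squares problem so that the few labels propagate to unlabeled nodes. It illustrates only that ingredient, under assumed data and parameters, not the full knowledge-carryover framework:

```python
# Laplacian-regularized label propagation over a similarity graph built from
# all samples; only the graph-regularizer ingredient is shown. Data, the kNN
# graph, and the regularization weight are illustrative assumptions.
import numpy as np
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))           # all samples, training-time features
y = np.full(200, np.nan)
y[:20] = rng.integers(0, 2, size=20)     # only 20 samples carry labels

W = kneighbors_graph(X, n_neighbors=10).toarray()
W = np.maximum(W, W.T)                   # symmetrize the kNN graph
L = np.diag(W.sum(axis=1)) - W           # combinatorial graph Laplacian

labeled = ~np.isnan(y)
C = np.diag(labeled.astype(float))       # fit labeled nodes, smooth the rest
lam = 1.0
A = C + lam * L + 1e-6 * np.eye(len(y))  # small ridge keeps the system solvable
f = np.linalg.solve(A, C @ np.nan_to_num(y))
pred = (f > 0.5).astype(int)             # scores for every node, incl. unlabeled
```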
Number Prompts Left-to-Right Spatial Mapping in Toddlerhood
ERIC Educational Resources Information Center
McCrink, Koleen; Perez, Jasmin; Baruch, Erica
2017-01-01
Toddlers performed a spatial mapping task in which they were required to learn the location of a hidden object in a vertical array and then transpose this location information 90° to a horizontal array. During the vertical training, they were given (a) no labels, (b) alphabetical labels, or (c) numerical labels for each potential spatial location.…
Zhao, Yu; Ge, Fangfei; Liu, Tianming
2018-07-01
fMRI data decomposition techniques have advanced significantly from shallow models such as Independent Component Analysis (ICA) and Sparse Coding and Dictionary Learning (SCDL) to deep learning models such Deep Belief Networks (DBN) and Convolutional Autoencoder (DCAE). However, interpretations of those decomposed networks are still open questions due to the lack of functional brain atlases, no correspondence across decomposed or reconstructed networks across different subjects, and significant individual variabilities. Recent studies showed that deep learning, especially deep convolutional neural networks (CNN), has extraordinary ability of accommodating spatial object patterns, e.g., our recent works using 3D CNN for fMRI-derived network classifications achieved high accuracy with a remarkable tolerance for mistakenly labelled training brain networks. However, the training data preparation is one of the biggest obstacles in these supervised deep learning models for functional brain network map recognitions, since manual labelling requires tedious and time-consuming labours which will sometimes even introduce label mistakes. Especially for mapping functional networks in large scale datasets such as hundreds of thousands of brain networks used in this paper, the manual labelling method will become almost infeasible. In response, in this work, we tackled both the network recognition and training data labelling tasks by proposing a new iteratively optimized deep learning CNN (IO-CNN) framework with an automatic weak label initialization, which enables the functional brain networks recognition task to a fully automatic large-scale classification procedure. Our extensive experiments based on ABIDE-II 1099 brains' fMRI data showed the great promise of our IO-CNN framework. Copyright © 2018 Elsevier B.V. All rights reserved.
Track-before-detect labeled multi-Bernoulli particle filter with label switching
NASA Astrophysics Data System (ADS)
Garcia-Fernandez, Angel F.
2016-10-01
This paper presents a multitarget tracking particle filter (PF) for general track-before-detect measurement models. The PF is presented in the random finite set framework and uses a labelled multi-Bernoulli approximation. We also present a label switching improvement algorithm based on Markov chain Monte Carlo that is expected to increase filter performance if targets get in close proximity for a sufficiently long time. The PF is tested in two challenging numerical examples.
Guo, Junqi; Zhou, Xi; Sun, Yunchuan; Ping, Gong; Zhao, Guoxing; Li, Zhuorong
2016-06-01
Smartphone-based activity recognition has recently received remarkable attention in various mobile health applications such as safety monitoring, fitness tracking, and disease prediction. To achieve more accurate and simplified medical monitoring, this paper proposes a self-learning scheme for patients' activity recognition, in which a patient only needs to carry an ordinary smartphone that contains common motion sensors. After real-time data collection through this smartphone, we preprocess the data using a coordinate system transformation to eliminate the influence of phone orientation. A set of robust and effective features are then extracted from the preprocessed data. Because a patient may inevitably perform various unpredictable activities for which no a priori knowledge exists in the training dataset, we propose a self-learning activity recognition scheme. The scheme determines whether there are a priori training samples and labeled categories in the training pools that match the unpredictable activity data well. If not, it automatically assembles these unpredictable samples into different clusters and gives them new category labels. These clustered samples, combined with the acquired new category labels, are then merged into the training dataset to reinforce the recognition ability of the self-learning model. In experiments, we evaluate our scheme using data collected from two postoperative patient volunteers, including six labeled daily activities as the initial a priori categories in the training pool. Experimental results demonstrate that the proposed self-learning scheme for activity recognition works very well in most cases. When several types of unseen activities occur without any a priori information, the accuracy reaches above 80% after the self-learning process converges.
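A minimal sketch of the self-learning loop described above might look as follows; the classifier, confidence threshold, and cluster count are placeholders rather than the authors' choices, and integer activity labels are assumed:

```python
# One iteration of the self-learning loop: confidently recognized samples keep
# their predicted labels; unrecognized samples are clustered and receive new
# integer category labels before being merged back into the training pool.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans

def self_learning_step(X_train, y_train, X_new, conf_threshold=0.6, k_new=2):
    # y_train is assumed to hold integer activity labels.
    clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    conf = clf.predict_proba(X_new).max(axis=1)
    known = conf >= conf_threshold
    y_new = clf.predict(X_new)                      # a priori categories
    if (~known).any():                              # unseen activities
        km = KMeans(n_clusters=k_new, n_init=10, random_state=0)
        y_new[~known] = km.fit_predict(X_new[~known]) + y_train.max() + 1
    return np.vstack([X_train, X_new]), np.concatenate([y_train, y_new])
```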
Lupus myocarditis: case report
DOE Office of Scientific and Technical Information (OSTI.GOV)
LaManna, M.M.; Lumia, F.J.; Gordon, C.I.
1988-03-01
Although gallium-67 (67Ga) accumulates in both neoplastic and inflammatory tissues, indium-111 (111In) labeled leukocytes are seen only in inflammatory cells. Indium-111-labeled leukocytes therefore are a useful agent in the noninvasive differentiation of inflammatory tissue from neoplastic tissue. This case is an interesting example of the use of 111In-labeled leukocytes to make that differentiation.
... the product will be of best quality. For example, sausage formulated with certain ingredients used to preserve ... phrases used on labels to describe quality dates. Examples of commonly used phrases: A "Best if Used ...
40 CFR 1039.130 - What installation instructions must I give to equipment manufacturers?
Code of Federal Regulations, 2010 CFR
2010-07-01
... information label hard to read during normal engine maintenance, you must place a duplicate label on the... own equipment. (d) Provide instructions in writing or in an equivalent format. For example, you may...
NASA Astrophysics Data System (ADS)
Orenstein, E. C.; Morgado, P. M.; Peacock, E.; Sosik, H. M.; Jaffe, J. S.
2016-02-01
Technological advances in instrumentation and computing have allowed oceanographers to develop imaging systems capable of collecting extremely large data sets. With the advent of in situ plankton imaging systems, scientists must now commonly deal with "big data" sets containing tens of millions of samples spanning hundreds of classes, making manual classification untenable. Automated annotation methods are now considered to be the bottleneck between collection and interpretation. Typically, such classifiers learn to approximate a function that predicts a predefined set of classes for which a considerable amount of labeled training data is available. The requirement that the training data span all the classes of concern is problematic for plankton imaging systems since they sample such diverse, rapidly changing populations. These data sets may contain relatively rare, sparsely distributed, taxa that will not have associated training data; a classifier trained on a limited set of classes will miss these samples. The computer vision community, leveraging advances in Convolutional Neural Networks (CNNs), has recently attempted to tackle such problems using "zero-shot" object categorization methods. Under a zero-shot framework, a classifier is trained to map samples onto a set of attributes rather than a class label. These attributes can include visual and non-visual information such as what an organism is made out of, where it is distributed globally, or how it reproduces. A second stage classifier is then used to extrapolate a class. In this work, we demonstrate a zero-shot classifier, implemented with a CNN, to retrieve out-of-training-set labels from images. This method is applied to data from two continuously imaging, moored instruments: the Scripps Plankton Camera System (SPCS) and the Imaging FlowCytobot (IFCB). Results from simulated deployment scenarios indicate zero-shot classifiers could be successful at recovering samples of rare taxa in image sets. This capability will allow ecologists to identify trends in the distribution of difficult to sample organisms in their data.
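The two-stage zero-shot idea can be sketched briefly; the attribute table and per-attribute scores below are hypothetical placeholders, not the SPCS/IFCB pipeline itself:

```python
# Minimal sketch of zero-shot labeling via attributes: a first-stage model maps
# an image to attribute scores, and the class is the one whose known attribute
# signature lies closest, which allows classes absent from training to be named.
import numpy as np

class_attributes = {
    "copepod":    np.array([1, 0, 1, 0]),   # e.g. [has_antennae, chain_forming, ...]
    "diatom":     np.array([0, 1, 0, 1]),
    "rare_taxon": np.array([1, 1, 0, 0]),   # never seen during training
}

def zero_shot_label(attribute_scores):
    """attribute_scores: per-attribute probabilities from the first-stage model."""
    names = list(class_attributes)
    dists = [np.linalg.norm(attribute_scores - class_attributes[n]) for n in names]
    return names[int(np.argmin(dists))]

print(zero_shot_label(np.array([0.9, 0.8, 0.1, 0.2])))  # -> "rare_taxon"
```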
English, Arthur W.; Cucoranu, Delia; Mulligan, Amanda; Sabatier, Manning
2009-01-01
We investigated the extent of misdirection of regenerating axons when that regeneration was enhanced using treadmill training. Retrograde fluorescent tracers were applied to the cut proximal stumps of the tibial and common fibular nerves two or four weeks after transection and surgical repair of the mouse sciatic nerve. The spatial locations of retrogradely labeled motoneurons were studied in untreated control mice and in mice receiving two weeks of treadmill training, either according to a continuous protocol (10 m/min, one hour/day, five days/week) or an interval protocol (20 m/min for two minutes, followed by a five minute rest, repeated 4 times, five days/week). More retrogradely labeled motoneurons were found in both treadmill trained groups. The magnitude of this increase was as great as or greater than that found after using other enhancement strategies. In both treadmill trained groups, the proportions of motoneurons labeled from tracer applied to the common fibular nerve that were found in spinal cord locations reserved for tibial motoneurons in intact mice were no greater than in untreated control mice and significantly less than found after electrical stimulation or chondroitinase treatment. Treadmill training in the first two weeks following peripheral nerve injury produces a marked enhancement of motor axon regeneration without increasing the propensity of those axons to choose pathways leading to functionally inappropriate targets. PMID:19731339
40 CFR 763.95 - Warning labels.
Code of Federal Regulations, 2012 CFR
2012-07-01
... Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL ACT ASBESTOS Asbestos-Containing Materials in Schools § 763.95 Warning labels. (a) The local education agency shall...: ASBESTOS. HAZARDOUS. DO NOT DISTURB WITHOUT PROPER TRAINING AND EQUIPMENT. ...
40 CFR 763.95 - Warning labels.
Code of Federal Regulations, 2014 CFR
2014-07-01
... Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL ACT ASBESTOS Asbestos-Containing Materials in Schools § 763.95 Warning labels. (a) The local education agency shall...: ASBESTOS. HAZARDOUS. DO NOT DISTURB WITHOUT PROPER TRAINING AND EQUIPMENT. ...
40 CFR 763.95 - Warning labels.
Code of Federal Regulations, 2013 CFR
2013-07-01
... Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL ACT ASBESTOS Asbestos-Containing Materials in Schools § 763.95 Warning labels. (a) The local education agency shall...: ASBESTOS. HAZARDOUS. DO NOT DISTURB WITHOUT PROPER TRAINING AND EQUIPMENT. ...
40 CFR 763.95 - Warning labels.
Code of Federal Regulations, 2011 CFR
2011-07-01
... Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL ACT ASBESTOS Asbestos-Containing Materials in Schools § 763.95 Warning labels. (a) The local education agency shall...: ASBESTOS. HAZARDOUS. DO NOT DISTURB WITHOUT PROPER TRAINING AND EQUIPMENT. ...
40 CFR 763.95 - Warning labels.
Code of Federal Regulations, 2010 CFR
2010-07-01
... Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) TOXIC SUBSTANCES CONTROL ACT ASBESTOS Asbestos-Containing Materials in Schools § 763.95 Warning labels. (a) The local education agency shall...: ASBESTOS. HAZARDOUS. DO NOT DISTURB WITHOUT PROPER TRAINING AND EQUIPMENT. ...
Label Review Training: Module 5: Course Final Quiz
Pesticide labels translate results of our extensive evaluations of pesticide products into conditions, directions and precautions that define parameters for use of a pesticide with the goal of ensuring protection of human health and the environment.
Do we need annotation experts? A case study in celiac disease classification.
Kwitt, Roland; Hegenbart, Sebastian; Rasiwasia, Nikhil; Vécsei, Andreas; Uhl, Andreas
2014-01-01
Inference of clinically relevant findings from the visual appearance of images has become an essential part of processing pipelines for many problems in medical imaging. Typically, a sufficient amount of labeled training data, provided by domain experts, is assumed to be available. However, acquiring this data is usually a time-consuming and expensive endeavor. In this work, we ask whether, for certain problems, expert knowledge is actually required. Specifically, we investigate the impact of letting non-expert volunteers annotate a database of endoscopy images which are then used to assess the absence or presence of celiac disease. Contrary to previous approaches, we are not interested in algorithms that can handle label noise. Instead, we present compelling empirical evidence that label noise can be compensated for by a sufficiently large corpus of training data labeled by the non-experts.
An algorithm for optimal fusion of atlases with different labeling protocols
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Aganj, Iman; Bhatt, Priyanka; Casillas, Christen; Salat, David; Boxer, Adam; Fischl, Bruce; Van Leemput, Koen
2014-01-01
In this paper we present a novel label fusion algorithm suited for scenarios in which different manual delineation protocols with potentially disparate structures have been used to annotate the training scans (hereafter referred to as “atlases”). Such scenarios arise when atlases have missing structures, when they have been labeled with different levels of detail, or when they have been taken from different heterogeneous databases. The proposed algorithm can be used to automatically label a novel scan with any of the protocols from the training data. Further, it enables us to generate new labels that are not present in any delineation protocol by defining intersections on the underlying labels. We first use probabilistic models of label fusion to generalize three popular label fusion techniques to the multi-protocol setting: majority voting, semi-locally weighted voting and STAPLE. Then, we identify some shortcomings of the generalized methods, namely the inability to produce meaningful posterior probabilities for the different labels (majority voting, semi-locally weighted voting) and to exploit the similarities between the atlases (all three methods). Finally, we propose a novel generative label fusion model that can overcome these drawbacks. We use the proposed method to combine four brain MRI datasets labeled with different protocols (with a total of 102 unique labeled structures) to produce segmentations of 148 brain regions. Using cross-validation, we show that the proposed algorithm outperforms the generalizations of majority voting, semi-locally weighted voting and STAPLE (mean Dice score 83%, vs. 77%, 80% and 79%, respectively). We also evaluated the proposed algorithm in an aging study, successfully reproducing some well-known results in cortical and subcortical structures. PMID:25463466
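The simplest of the generalized baselines, majority voting over atlases that may lack a structure entirely, can be sketched in a few lines; encoding a missing protocol label as None is an assumption made for illustration:

```python
# Per-voxel majority vote that ignores atlases whose protocol does not include
# the structure in question (encoded here as None). The paper's generative
# fusion model goes well beyond this baseline.
from collections import Counter

def fuse_voxel(votes):
    """votes: one registered label per atlas, or None if the protocol lacks it."""
    valid = [v for v in votes if v is not None]
    return Counter(valid).most_common(1)[0][0] if valid else "unknown"

# Three atlases vote on one voxel; the third protocol has no such structure.
print(fuse_voxel(["hippocampus", "hippocampus", None]))  # -> "hippocampus"
```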
Conditional Anomaly Detection with Soft Harmonic Functions
Valko, Michal; Kveton, Branislav; Valizadegan, Hamed; Cooper, Gregory F.; Hauskrecht, Milos
2012-01-01
In this paper, we consider the problem of conditional anomaly detection that aims to identify data instances with an unusual response or a class label. We develop a new non-parametric approach for conditional anomaly detection based on the soft harmonic solution, with which we estimate the confidence of the label to detect anomalous mislabeling. We further regularize the solution to avoid the detection of isolated examples and examples on the boundary of the distribution support. We demonstrate the efficacy of the proposed method on several synthetic and UCI ML datasets in detecting unusual labels when compared to several baseline approaches. We also evaluate the performance of our method on a real-world electronic health record dataset where we seek to identify unusual patient-management decisions. PMID:25309142
Conditional Anomaly Detection with Soft Harmonic Functions.
Valko, Michal; Kveton, Branislav; Valizadegan, Hamed; Cooper, Gregory F; Hauskrecht, Milos
2011-01-01
In this paper, we consider the problem of conditional anomaly detection that aims to identify data instances with an unusual response or a class label. We develop a new non-parametric approach for conditional anomaly detection based on the soft harmonic solution, with which we estimate the confidence of the label to detect anomalous mislabeling. We further regularize the solution to avoid the detection of isolated examples and examples on the boundary of the distribution support. We demonstrate the efficacy of the proposed method on several synthetic and UCI ML datasets in detecting unusual labels when compared to several baseline approaches. We also evaluate the performance of our method on a real-world electronic health record dataset where we seek to identify unusual patient-management decisions.
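The soft harmonic ingredient can be sketched as follows, assuming numeric 0/1 labels, an illustrative k-nearest-neighbour graph, and a single regularization weight; the paper's formulation and its handling of isolated and boundary examples go beyond this:

```python
# Soft harmonic scores on a similarity graph: minimize ||f - y||^2 + lam f'Lf,
# then flag instances whose observed label deviates most from the smoothed
# score. Assumes numeric 0/1 labels y and an illustrative kNN graph.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def soft_harmonic_anomaly_scores(X, y, lam=1.0, k=10):
    W = kneighbors_graph(X, n_neighbors=k).toarray()
    W = np.maximum(W, W.T)
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian
    f = np.linalg.solve(np.eye(len(y)) + lam * L, y.astype(float))
    return np.abs(y - f)                            # large value = suspicious label
```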
Prediction of reacting atoms for the major biotransformation reactions of organic xenobiotics.
Rudik, Anastasia V; Dmitriev, Alexander V; Lagunin, Alexey A; Filimonov, Dmitry A; Poroikov, Vladimir V
2016-01-01
The knowledge of drug metabolite structures is essential at the early stage of drug discovery to understand the potential liabilities and risks connected with biotransformation. The determination of the site of a molecule at which a particular metabolic reaction occurs could be used as a starting point for metabolite identification. The prediction of the site of metabolism does not always correspond to the particular atom that is modified by the enzyme but rather is often associated with a group of atoms. To overcome this problem, we propose to operate with the term "reacting atom", corresponding to a single atom in the substrate that is modified during the biotransformation reaction. The prediction of the reacting atom(s) in a molecule for the major classes of biotransformation reactions is necessary to generate drug metabolites. Substrates of the major human cytochromes P450 and UDP-glucuronosyltransferases from the Biovia Metabolite database were divided into nine groups according to their reaction classes, which are aliphatic and aromatic hydroxylation, N- and O-glucuronidation, N-, S- and C-oxidation, and N- and O-dealkylation. Each training set consists of positive and negative examples of structures with one labelled atom. In the positive examples, the labelled atom is the reacting atom of a particular reaction that changed adjacency. Negative examples represent non-reacting atoms of a particular reaction. We used Labelled Multilevel Neighbourhoods of Atoms descriptors for the designation of reacting atoms. A Bayesian-like algorithm was applied to estimate the structure-activity relationships. The average invariant accuracy of prediction obtained in leave-one-out and 20-fold cross-validation procedures for five human isoforms of cytochrome P450 and all isoforms of UDP-glucuronosyltransferase varies from 0.86 to 0.99 (0.96 on average). We report that reacting atoms may be predicted with reasonable accuracy for the major classes of metabolic reactions: aliphatic and aromatic hydroxylation, N- and O-glucuronidation, N-, S- and C-oxidation, and N- and O-dealkylation. The proposed method is implemented as a freely available web service at http://www.way2drug.com/RA and may be used for the prediction of the most probable biotransformation reaction(s) and the appropriate reacting atoms in drug-like compounds.
First Amendment Limits on Compulsory Speech.
Barrella, Nigel
Government-mandated labeling requirements have a long history, and are used extensively by FDA in regulating the industries under its jurisdiction. All such requirements can be characterized as a form of “compelled speech,” opening the door to First Amendment challenges. And some of these challenges, depending on the nature of the labeling requirement, have even been successful. Under Zauderer v. Office of Disciplinary Counsel of Supreme Court of Ohio, regulations that compel disclosure of information will, in many cases, merit only very limited First Amendment scrutiny—less, even, than most other regulations of commercial speech, which receive a type of “intermediate scrutiny.” The labeling requirement that can best avoid or overcome a First Amendment challenge, therefore, will follow the example of the regulation described in Zauderer. For example, Zauderer applied its lower scrutiny by noting that the compelled speech at issue was a disclosure of “purely factual and uncontroversial information.” Conversely, a successful First Amendment challenge to a labeling requirement will often involve an argument that the labeling requirement is outside the scope of what the Zauderer Court contemplated: so, for example, one may argue that a compelled disclosure is either “not factual” or else “controversial,” putting it beyond Zauderer’s reach. After briefly reviewing the major Supreme Court cases that establish the levels of scrutiny for commercial speech and compelled disclosures, the paper will discuss how the various elements of Zauderer have been analyzed by several lower courts, and how some courts have distinguished Zauderer in the context of labeling and other mandatory disclosure laws. In particular, the paper will focus on cases involving First Amendment challenges to food, tobacco, and drug labeling requirements—some successful, some not, and some ongoing—including cases challenging FDA, USDA, and state-level labeling requirements. The decided cases do not all agree on how to understand the elements of Zauderer—for example, must a disclosure be factually controversial to fall outside of Zauderer’s limited review, or may it be factually unquestionable but relating to a controversial topic? What role, if any, should public acceptance, knowledge, and history play? What sorts of interests may the government invoke to justify a labeling requirement? Although some courts have taken (or at least hinted at) strict limits on the meaning of Zauderer, most courts have read Zauderer as applying somewhat more expansively to circumstances beyond its facts. The paper concludes that generally, courts have read Zauderer more expansively in part because such a reading is consistent with existing, familiar labeling requirements, and a narrow reading of Zauderer limited to its facts would rest on a slippery slope to abolishing many accepted and historically unquestioned labeling requirements. Any future attempts to expand judicial review of labeling requirements would do well to highlight limiting principles that address such concerns.
NASA Astrophysics Data System (ADS)
Paul, A.; Vogt, K.; Rottensteiner, F.; Ostermann, J.; Heipke, C.
2018-05-01
In this paper we deal with the problem of measuring the similarity between training and tests datasets in the context of transfer learning (TL) for image classification. TL tries to transfer knowledge from a source domain, where labelled training samples are abundant but the data may follow a different distribution, to a target domain, where labelled training samples are scarce or even unavailable, assuming that the domains are related. Thus, the requirements w.r.t. the availability of labelled training samples in the target domain are reduced. In particular, if no labelled target data are available, it is inherently difficult to find a robust measure of relatedness between the source and target domains. This is of crucial importance for the performance of TL, because the knowledge transfer between unrelated data may lead to negative transfer, i.e. to a decrease of classification performance after transfer. We address the problem of measuring the relatedness between source and target datasets and investigate three different strategies to predict and, consequently, to avoid negative transfer in this paper. The first strategy is based on circular validation. The second strategy relies on the Maximum Mean Discrepancy (MMD) similarity metric, whereas the third one is an extension of MMD which incorporates the knowledge about the class labels in the source domain. Our method is evaluated using two different benchmark datasets. The experiments highlight the strengths and weaknesses of the investigated methods. We also show that it is possible to reduce the amount of negative transfer using these strategies for a TL method and to generate a consistent performance improvement over the whole dataset.
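The second strategy can be illustrated concretely; below is a minimal sketch of the (biased) squared-MMD estimate between source and target feature sets using an RBF kernel, with an illustrative bandwidth:

```python
# Squared Maximum Mean Discrepancy (biased estimate) with an RBF kernel: small
# values suggest the source and target feature distributions are similar, large
# values warn of possible negative transfer. Bandwidth is illustrative.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def mmd_squared(X_source, X_target, gamma=1.0):
    k_ss = rbf_kernel(X_source, X_source, gamma=gamma).mean()
    k_tt = rbf_kernel(X_target, X_target, gamma=gamma).mean()
    k_st = rbf_kernel(X_source, X_target, gamma=gamma).mean()
    return k_ss + k_tt - 2.0 * k_st

rng = np.random.default_rng(0)
print(mmd_squared(rng.normal(0, 1, (100, 5)), rng.normal(0.5, 1, (100, 5))))
```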
Recurrent neural network based virtual detection line
NASA Astrophysics Data System (ADS)
Kadikis, Roberts
2018-04-01
The paper proposes an efficient method for detection of moving objects in the video. The objects are detected when they cross a virtual detection line. Only the pixels of the detection line are processed, which makes the method computationally efficient. A Recurrent Neural Network processes these pixels. The machine learning approach allows one to train a model that works in different and changing outdoor conditions. Also, the same network can be trained for various detection tasks, which is demonstrated by the tests on vehicle and people counting. In addition, the paper proposes a method for semi-automatic acquisition of labeled training data. The labeling method is used to create training and testing datasets, which in turn are used to train and evaluate the accuracy and efficiency of the detection method. The method shows similar accuracy as the alternative efficient methods but provides greater adaptability and usability for different tasks.
Pacharawongsakda, Eakasit; Theeramunkong, Thanaruk
2013-12-01
Predicting protein subcellular location is one of the major challenges in bioinformatics, since such knowledge helps us understand protein functions and enables us to select targeted proteins during the drug discovery process. While many computational techniques have been proposed to improve predictive performance for protein subcellular location, they have several shortcomings. In this work, we propose a method to address three main issues in such techniques: i) handling multiplex proteins, which may exist in or move between multiple cellular compartments, ii) handling high dimensionality in the input and output spaces, and iii) the requirement of sufficient labeled data for model training. Toward these issues, this work presents a new computational method for predicting proteins that have either single or multiple locations. The proposed technique, namely iFLAST-CORE, incorporates dimensionality reduction in the feature and label spaces with a co-training paradigm for semi-supervised multi-label classification. For this purpose, Singular Value Decomposition (SVD) is applied to transform the high-dimensional feature space and label space into lower-dimensional spaces. Then, because labeled data are limited, co-training regression makes use of unlabeled data by predicting their target values in the lower-dimensional spaces. In the last step, the SVD components are used to project labels in the lower-dimensional space back to the original space, and an adaptive threshold is used to map each numeric value to a binary value for label determination. A set of experiments on viral proteins and gram-negative bacterial proteins shows that our proposed method improves classification performance in terms of various evaluation metrics such as Aiming (or Precision), Coverage (or Recall) and macro F-measure, compared to the traditional method that uses only labeled data.
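The SVD-based label-space reduction step can be sketched as follows; the co-training on unlabeled data and the adaptive threshold of iFLAST-CORE are omitted, and the data, component count, and fixed threshold are placeholders:

```python
# Compress a sparse multi-label matrix with SVD, regress the compressed targets
# from features, project predictions back, and threshold. Data are synthetic.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 400))                     # protein features (hypothetical)
Y = (rng.random((300, 12)) < 0.15).astype(float)    # 12 subcellular locations

svd = TruncatedSVD(n_components=5, random_state=0)
Z = svd.fit_transform(Y)                            # labels in a low-dim space
reg = Ridge().fit(X, Z)                             # regress compressed labels

Y_hat = reg.predict(X) @ svd.components_            # back to the original label space
Y_pred = (Y_hat > 0.5).astype(int)                  # fixed threshold stands in for
                                                    # the paper's adaptive one
```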
2010-01-01
Background: Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets. Results: Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. Neither method for training set construction requires any prior experimental characterization. Conclusions: SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites. PMID:20102603
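A rough sketch of the PPP-style binomial scoring applied to one sliding-window subsequence is shown below; the sequence search itself is outside the sketch, and `ranked_hits_are_true` is a hypothetical input marking whether each hit, in rank order, comes from the TRUE partition:

```python
# For one window's ranked search hits, find the depth at which TRUE-partition
# hits are most improbably enriched under the background rate, and report the
# best -log10 binomial tail probability as the window's score.
import math
from scipy.stats import binom

def window_score(ranked_hits_are_true, background_true_fraction):
    best, true_seen = 0.0, 0
    for depth, is_true in enumerate(ranked_hits_are_true, start=1):
        true_seen += int(is_true)
        # P(at least `true_seen` TRUE hits in `depth` draws) at the background rate
        p = binom.sf(true_seen - 1, depth, background_true_fraction)
        best = max(best, -math.log10(p))
    return best  # higher = window more strongly associated with the TRUE trait

print(window_score([True, True, True, False, True, False], 0.2))
```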
NASA Astrophysics Data System (ADS)
Feng, Steve; Woo, Min-jae; Kim, Hannah; Kim, Eunso; Ki, Sojung; Shao, Lei; Ozcan, Aydogan
2016-03-01
We developed an easy-to-use and widely accessible crowd-sourcing tool for rapidly training humans to perform biomedical image diagnostic tasks, and demonstrated the platform with middle and high school students in South Korea, who learned to diagnose malaria-infected red blood cells (RBCs) in Giemsa-stained thin blood smears imaged under light microscopes. We previously used the same platform (i.e., BioGames) to crowd-source diagnostics of individual RBC images, marking them as malaria positive (infected), negative (uninfected), or questionable (insufficient information for a reliable diagnosis). Using a custom-developed statistical framework, we combined the diagnoses from both expert diagnosticians and the minimally trained human crowd to generate a gold standard library of malaria-infection labels for RBCs. Using this library of labels, we developed a web-based training and educational toolset that provides a quantified score for diagnosticians/users to compare their performance against their peers and view misdiagnosed cells. We have since demonstrated the ability of this platform to quickly train humans without prior training to reach high diagnostic accuracy as compared to expert diagnosticians. Our initial trial group of 55 middle and high school students collectively played for more than 170 hours, each demonstrating significant improvements after only 3 hours of training games, with diagnostic scores that match expert diagnosticians'. Next, through a national-scale educational outreach program in South Korea we recruited >1660 students who demonstrated a similar performance level after 5 hours of training. We plan to further demonstrate this tool's effectiveness for other diagnostic tasks involving image labeling and aim to provide an easily accessible and quickly adaptable framework for online training of new diagnosticians.
Yang, Yang; Saleemi, Imran; Shah, Mubarak
2013-07-01
This paper proposes a novel representation of articulated human actions, gestures, and facial expressions. The main goals of the proposed approach are: 1) to enable recognition using very few examples, i.e., one-shot or k-shot learning, and 2) to enable meaningful organization of unlabeled datasets by unsupervised clustering. Our proposed representation is obtained by automatically discovering high-level subactions or motion primitives through hierarchical clustering of observed optical flow in a four-dimensional spatial and motion flow space. The proposed method, which is completely unsupervised, provides, in contrast to state-of-the-art representations like bag of video words, a meaningful representation conducive to visual interpretation and textual labeling. Each primitive action depicts an atomic subaction, like directional motion of a limb or the torso, and is represented by a mixture of four-dimensional Gaussian distributions. For one-shot and k-shot learning, the sequence of primitives discovered in a test video is labeled using KL divergence and can then be represented as a string and matched against similar strings of training videos. The same sequence can also be collapsed into a histogram of primitives or used to learn a Hidden Markov model to represent classes. We have performed extensive experiments on recognition by one-shot and k-shot learning as well as unsupervised action clustering on six human action and gesture datasets, a composite dataset, and a database of facial expressions. These experiments confirm the validity and discriminative nature of the proposed representation.
Using virtual data for training deep model for hand gesture recognition
NASA Astrophysics Data System (ADS)
Nikolaev, E. I.; Dvoryaninov, P. V.; Lensky, Y. Y.; Drozdovsky, N. S.
2018-05-01
Deep learning has shown real promise for efficient classification in hand gesture recognition problems. In this paper, the authors present experimental results for a deeply trained model for hand gesture recognition from hand images. The authors have trained two deep convolutional neural networks. The first architecture produces the hand position as a 2D vector from an input hand image. The second one predicts the hand gesture class for the input image. The first proposed architecture produces state-of-the-art results with an accuracy rate of 89%, and the second architecture, with split input, produces an accuracy rate of 85.2%. In this paper, the authors also propose using virtual data for training a supervised deep model. This technique is aimed at avoiding the use of original labelled images in the training process. The interest of this method for data preparation is motivated by the need to overcome one of the main challenges of deep supervised learning: the requirement for a copious amount of labelled data during training.
Devices, systems, and methods for conducting assays with improved sensitivity using sedimentation
Schaff, Ulrich Y.; Koh, Chung-Yan; Sommer, Gregory J.
2016-04-05
Embodiments of the present invention are directed toward devices, systems, and methods for conducting assays using sedimentation. In one example, a method includes layering a mixture on a density medium, subjecting sedimentation particles in the mixture to sedimentation forces to cause the sedimentation particles to move to a detection area through a density medium, and detecting a target analyte in a detection region of the sedimentation channel. In some examples, the sedimentation particles and labeling agent may have like charges to reduce non-specific binding of labeling agent and sedimentation particles. In some examples, the density medium is provided with a separation layer for stabilizing the assay during storage and operation. In some examples, the sedimentation channel may be provided with a generally flat sedimentation chamber for dispersing the particle pellet over a larger surface area.
Devices, systems, and methods for conducting assays with improved sensitivity using sedimentation
Schaff, Ulrich Y; Koh, Chung-Yan; Sommer, Gregory J
2015-02-24
Embodiments of the present invention are directed toward devices, systems, and methods for conducting assays using sedimentation. In one example, a method includes layering a mixture on a density medium, subjecting sedimentation particles in the mixture to sedimentation forces to cause the sedimentation particles to move to a detection area through a density medium, and detecting a target analyte in a detection region of the sedimentation channel. In some examples, the sedimentation particles and labeling agent may have like charges to reduce non-specific binding of labeling agent and sedimentation particles. In some examples, the density medium is provided with a separation layer for stabilizing the assay during storage and operation. In some examples, the sedimentation channel may be provided with a generally flat sedimentation chamber for dispersing the particle pellet over a larger surface area.
Opiates and cerebral functional activity in rats
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trusk, T.C.
1986-01-01
Cerebral activity was measured using the free fatty acid (1-14C)octanoate as a fast functional tracer in conscious, unrestrained rats 5 minutes after intravenous injection of heroin, cocaine or saline vehicle. Regional changes of octanoate labeling density in the autoradiograms relative to saline-injected animals were used to determine the functional activity effects of each drug. Heroin and cocaine each produced a distinctive pattern of activity increases and suppression throughout the rat brain. Similar regional changes induced by both drugs were found in limbic brain regions implicated in drug reinforcement. Labeled octanoate autoradiography was used to measure the cerebral functional response to a tone that had previously been paired to heroin injections. Rats were trained in groups of three consisting of one heroin self-administration animal, and two animals receiving yoked infusion of heroin or saline. A tone was paired with each infusion during training. Behavioral experiments in similarly trained rats demonstrated that these training conditions impart secondary reinforcing properties to the tone in animals previously self-administering heroin, while the tone remains behaviorally neutral in yoked-infusion rats. Cerebral functional activity was measured during presentation of the tone without drug infusion. Octanoate labeling density changed in fifteen brain areas in response to the tone previously paired to heroin without response contingency. Labeling density was significantly modified in sixteen regions as a result of previously pairing the tone to response-contingent heroin infusions.
Smart Annotation of Cyclic Data Using Hierarchical Hidden Markov Models.
Martindale, Christine F; Hoenig, Florian; Strohrmann, Christina; Eskofier, Bjoern M
2017-10-13
Cyclic signals, such as human motion and heart activity, are an intrinsic part of daily life. Their detailed analysis is important for clinical applications such as pathological gait analysis and for sports applications such as performance analysis. Labeled training data for algorithms that analyze these cyclic data come at a high annotation cost, because annotations are available only in limited quantities under laboratory conditions or require manual segmentation of data collected under less restricted conditions. This paper presents a smart annotation method that reduces this labeling cost for sensor-based data and is applicable to data collected outside of strict laboratory conditions. The method uses semi-supervised learning on sections of cyclic data with a known cycle number. A hierarchical hidden Markov model (hHMM) is used, achieving a mean absolute error of 0.041 ± 0.020 s relative to a manually annotated reference. The resulting model was also used to simultaneously segment and classify continuous, 'in the wild' data, demonstrating the applicability of an hHMM trained on limited data sections to labeling a complete dataset. This technique achieved results comparable to its fully supervised equivalent. Our semi-supervised method has the significant advantage of reduced annotation cost. Furthermore, it reduces the opportunity for human error in the labeling process normally required for training segmentation algorithms. It also lowers the annotation cost of training a model capable of continuous monitoring of cycle characteristics, such as those used to track the progression of movement disorders or to analyze running technique.
Shah, Prithvi K; Garcia-Alias, Guillermo; Choe, Jaehoon; Gad, Parag; Gerasimenko, Yury; Tillakaratne, Niranjala; Zhong, Hui; Roy, Roland R; Edgerton, V Reggie
2013-11-01
Can lower limb motor function be improved after a spinal cord lesion by re-engaging functional activity of the upper limbs? We addressed this issue by training the forelimbs in conjunction with the hindlimbs after a thoracic spinal cord hemisection in adult rats. The spinal circuitries were more excitable, and behavioural and electrophysiological analyses showed improved hindlimb function when the forelimbs were engaged simultaneously with the hindlimbs during treadmill step-training as opposed to training only the hindlimbs. Neuronal retrograde labelling demonstrated a greater number of propriospinal labelled neurons above and below the thoracic lesion site in quadrupedally versus bipedally trained rats. The results provide strong evidence that actively engaging the forelimbs improves hindlimb function and that one likely mechanism underlying these effects is the reorganization and re-engagement of rostrocaudal spinal interneuronal networks. For the first time, we provide evidence that the spinal interneuronal networks linking the forelimbs and hindlimbs are amenable to a rehabilitation training paradigm. Identification of this phenomenon provides a strong rationale for proceeding toward preclinical studies for determining whether training paradigms involving upper arm training in concert with lower extremity training can enhance locomotor recovery after neurological damage.
Code of Federal Regulations, 2010 CFR
2010-04-01
... which the term is qualified in the labeling to reflect the product's intended use. (c) An article so... unless the use of the article under the conditions set forth in its labeling is generally recognized as safe and effective among experts qualified by scientific training and experience to evaluate the safety...
Isolating the Effects of Training Using Simple Regression Analysis: An Example of the Procedure.
ERIC Educational Resources Information Center
Waugh, C. Keith
This paper provides a case example of simple regression analysis, a forecasting procedure used to isolate the effects of training from an identified extraneous variable. This case example focuses on results of a three-day sales training program to improve bank loan officers' knowledge, skill-level, and attitude regarding solicitation and sale of…
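The record above only summarizes the procedure; a minimal sketch of the underlying idea is to fit a simple regression to pre-training data, project the trend over the post-training period, and attribute the residual to the training. All figures and variable names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
months_pre = np.arange(1, 13)                              # 12 months before the sales training
sales_pre = 100 + 2.0 * months_pre + rng.normal(0, 3, 12)  # baseline trend (extraneous variable)

slope, intercept = np.polyfit(months_pre, sales_pre, 1)    # simple regression on pre-training data

months_post = np.arange(13, 19)                            # 6 months after the training program
sales_post = 100 + 2.0 * months_post + 15                  # observed sales, with a hypothetical lift

forecast = intercept + slope * months_post                 # what the pre-training trend alone predicts
training_effect = sales_post - forecast                    # residual attributed to the training
print("estimated monthly training effect:", np.round(training_effect, 1))
```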
The effect of verbal context on olfactory neural responses.
Bensafi, Moustafa; Croy, Ilona; Phillips, Nicola; Rouby, Catherine; Sezille, Caroline; Gerber, Johannes; Small, Dana M; Hummel, Thomas
2014-03-01
Odor names usually refer to "source" object categories. For example, the smell of rose is often described with its source category (flower). However, linguistic studies suggest that odors can also be named with labels referring to categories of "practices". This is the case when rose odor is described with a verbal label referring to its use in fragrance practices ("body lotion," a cosmetic, for example). It remains unknown whether naming an odor by its practice category influences olfactory neural responses differently than naming it by its source category. The aim of this study was to investigate this question. To this end, functional MRI was used in a within-subjects design comparing brain responses to four different odors (peach, chocolate, linden blossom, and rose) under two conditions whereby smells were described either (1) with their source category label (food and flower) or (2) with a practice category label (body lotion). Both types of labels induced activations in secondary olfactory areas (orbitofrontal cortex), whereas only the source label condition induced activation in the cingulate cortex and the insula. In summary, our findings offer a new look at olfactory perception by indicating differential brain responses depending on whether odors are named according to their source or practice category. Copyright © 2012 Wiley Periodicals, Inc.
Filtering big data from social media--Building an early warning system for adverse drug reactions.
Yang, Ming; Kiang, Melody; Shang, Wei
2015-04-01
Adverse drug reactions (ADRs) are believed to be a leading cause of death in the world. Pharmacovigilance systems are aimed at early detection of ADRs. With the popularity of social media, Web forums and discussion boards have become important sources of data for consumers to share their drug use experience and, as a result, may provide useful information on drugs and their adverse reactions. In this study, we propose an automated ADR-related post filtering mechanism using text classification methods. In real-life settings, ADR-related messages are highly distributed in social media, while non-ADR-related messages are unspecific and topically diverse. It is expensive to manually label a large amount of ADR-related messages (positive examples) and non-ADR-related messages (negative examples) to train classification systems. To mitigate this challenge, we examine the use of a partially supervised learning classification method to automate the process. We propose a novel pharmacovigilance system leveraging a Latent Dirichlet Allocation modeling module and a partially supervised classification approach. We select drugs with more than 500 threads of discussion, and collect all the original posts and comments of these drugs using an automatic Web spidering program as the text corpus. Various classifiers were trained by varying the number of positive examples and the number of topics. The trained classifiers were applied to 3000 posts published over 60 days. Top-ranked posts from each classifier were pooled and the resulting set of 300 posts was reviewed by a domain expert to evaluate the classifiers. Compared with alternative approaches using supervised learning methods and three general-purpose partially supervised learning methods, our approach performs significantly better in terms of precision, recall, and the F measure (the harmonic mean of precision and recall), based on a computational experiment using online discussion threads from Medhelp. Our design provides satisfactory performance in identifying ADR-related posts for post-marketing drug surveillance. The overall design of our system also points out a potentially fruitful direction for building other early warning systems that need to filter big data from social media networks. Copyright © 2015 Elsevier Inc. All rights reserved.
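A rough stand-in for this pipeline can be assembled with scikit-learn: LDA topic proportions as features, plus a classifier trained from a few labeled ADR posts against unlabeled posts treated as provisional negatives. This is not the paper's partially supervised algorithm; the post texts, topic count, and variable names are invented for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

positive_posts = ["this drug gave me a terrible rash and constant nausea",
                  "stopped the medication after severe dizziness and headaches"]
unlabeled_posts = ["switched pharmacies because the copay went up",
                   "my doctor increased the dose last week",
                   "felt chest pain a few hours after the second pill"]

corpus = positive_posts + unlabeled_posts
counts = CountVectorizer(stop_words="english").fit_transform(corpus)
topics = LatentDirichletAllocation(n_components=3, random_state=0).fit_transform(counts)

# Unlabeled posts are treated as provisional negatives (a crude PU-learning heuristic).
y = np.array([1] * len(positive_posts) + [0] * len(unlabeled_posts))
clf = LogisticRegression().fit(topics, y)

scores = clf.predict_proba(topics[len(positive_posts):])[:, 1]   # rank the unlabeled posts
for post, s in sorted(zip(unlabeled_posts, scores), key=lambda t: -t[1]):
    print(f"{s:.2f}  {post}")
```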
A Regions of Confidence Based Approach to Enhance Segmentation with Shape Priors.
Appia, Vikram V; Ganapathy, Balaji; Abufadel, Amer; Yezzi, Anthony; Faber, Tracy
2010-01-18
We propose an improved region-based segmentation model with shape priors that uses labels of confidence/interest to exclude the influence of certain regions in the image that may not provide useful information for segmentation. These could be regions in the image which are expected to have weak, missing or corrupt edges, or regions which the user is not interested in segmenting but which are part of the object being segmented. In the training datasets, along with the manual segmentations we also generate an auxiliary map indicating these regions of low confidence/interest. Since all the training images are acquired under similar conditions, we can train our algorithm to estimate these regions as well. Based on this training we generate a map which indicates the regions in the image that are likely to contain no useful information for segmentation. We then use a parametric model to represent the segmenting curve as a combination of shape priors obtained by representing the training data as a collection of signed distance functions. We define an objective energy functional and evolve the global parameters that represent the curve. We vary the influence each pixel has on the evolution of these parameters based on the confidence/interest label. When we use these labels to indicate the regions with low confidence, the regions containing accurate edges have a dominant role in the evolution of the curve, and the segmentation in the low-confidence regions is approximated based on the training data. Since our model evolves global parameters, it improves the segmentation even in the regions with accurate edges, because we eliminate the influence of the low-confidence regions which may mislead the final segmentation. Similarly, when we use the labels to indicate regions which are not of importance, we get a better segmentation of the object in the regions we are interested in.
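One ingredient of such a model, the shape prior built from signed distance functions of the training segmentations, can be sketched as below: each training mask is converted to a signed distance function and PCA captures the main modes of shape variation, so a new curve is represented by a few global parameters. The confidence/interest maps and the energy minimization are omitted, and the circular masks are hypothetical stand-ins for training data.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from sklearn.decomposition import PCA

def signed_distance(mask):
    """Negative inside the object, positive outside (one common SDF convention)."""
    return distance_transform_edt(~mask) - distance_transform_edt(mask)

size = 64
yy, xx = np.mgrid[:size, :size]
masks = [(xx - 32) ** 2 + (yy - 32) ** 2 < r ** 2 for r in (10, 12, 14, 16)]  # toy training shapes

sdfs = np.stack([signed_distance(m).ravel() for m in masks])
pca = PCA(n_components=2).fit(sdfs)          # principal modes of shape variation

# A new shape is the mean SDF plus a small number of global parameters (the PCA coefficients);
# its zero level set is the segmenting curve.
coeffs = np.array([5.0, 0.0])
new_sdf = (pca.mean_ + coeffs @ pca.components_).reshape(size, size)
print("area of generated shape:", int((new_sdf < 0).sum()))
```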
Event Recognition Based on Deep Learning in Chinese Texts
Zhang, Yajun; Liu, Zongtian; Zhou, Wen
2016-01-01
Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%. PMID:27501231
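A loose scikit-learn analogue of the DBN-plus-classifier idea is a BernoulliRBM feature extractor followed by a logistic regression; note that, unlike the paper's dynamic-supervised DBN, this pipeline does not fine-tune the RBM weights with supervision. The six-dimensional feature vectors and the trigger/non-trigger labels below are random stand-ins.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X = rng.random((200, 6))            # one 6-dimensional feature vector per word (random stand-in)
y = (X[:, 5] > 0.7).astype(int)     # hypothetical trigger / non-trigger labels

model = Pipeline([
    ("scale", MinMaxScaler()),      # the RBM expects inputs in [0, 1]
    ("rbm", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=500)),
])
model.fit(X, y)
print("training accuracy:", round(model.score(X, y), 3))
```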
16 CFR 301.12 - Country of origin of imported furs.
Code of Federal Regulations, 2011 CFR
2011-01-01
... labeling shall be preceded by the term fur origin; as for example: Dyed Muskrat Fur Origin: Russia or Dyed... example: Tip-dyed Canadian American Sable Fur Origin: Canada or Russian Sable Fur Origin: Russia (f...
16 CFR 301.12 - Country of origin of imported furs.
Code of Federal Regulations, 2014 CFR
2014-01-01
... labeling shall be preceded by the term fur origin; as for example: Dyed Muskrat Fur Origin: Russia or Dyed... example: Tip-dyed Canadian American Sable Fur Origin: Canada or Russian Sable Fur Origin: Russia (f...
16 CFR 301.12 - Country of origin of imported furs.
Code of Federal Regulations, 2012 CFR
2012-01-01
... labeling shall be preceded by the term fur origin; as for example: Dyed Muskrat Fur Origin: Russia or Dyed... example: Tip-dyed Canadian American Sable Fur Origin: Canada or Russian Sable Fur Origin: Russia (f...
16 CFR 301.12 - Country of origin of imported furs.
Code of Federal Regulations, 2013 CFR
2013-01-01
... labeling shall be preceded by the term fur origin; as for example: Dyed Muskrat Fur Origin: Russia or Dyed... example: Tip-dyed Canadian American Sable Fur Origin: Canada or Russian Sable Fur Origin: Russia (f...
16 CFR 301.12 - Country of origin of imported furs.
Code of Federal Regulations, 2010 CFR
2010-01-01
... labeling shall be preceded by the term fur origin; as for example: Dyed Muskrat Fur Origin: Russia or Dyed... example: Tip-dyed Canadian American Sable Fur Origin: Canada or Russian Sable Fur Origin: Russia (f...
Peckys, Diana B; Dukes, Madeline J; de Jonge, Niels
2014-01-01
Correlative fluorescence microscopy and scanning transmission electron microscopy (STEM) of cells fully immersed in liquid is a new methodology with many application areas. Proteins, in live cells immobilized on microchips, are labeled with fluorescent quantum dot (QD) nanoparticles. In this protocol, the epidermal growth factor receptor (EGFR) is labeled. The cells are fixed after a selected labeling time, for example, 5 min as needed to form EGFR dimers. The microchip with cells is then imaged with fluorescence microscopy. Thereafter, the microchip with the labeled cells and one with a spacer are assembled in a special microfluidic device and imaged with STEM.
Effects of generic language on category content and structure.
Gelman, Susan A; Ware, Elizabeth A; Kleinberg, Felicia
2010-11-01
We hypothesized that generic noun phrases ("Bears climb trees") would provide important input to children's developing concepts. In three experiments, four-year-olds and adults learned a series of facts about a novel animal category, in one of three wording conditions: generic (e.g., "Zarpies hate ice cream"), specific-label (e.g., "This zarpie hates ice cream"), or no-label (e.g., "This hates ice cream"). Participants completed a battery of tasks assessing the extent to which they linked the category to the properties expressed, and the extent to which they treated the category as constituting an essentialized kind. As predicted, for adults, generics training resulted in tighter category-property links and more category essentialism than both the specific-label and no-label training. Children also showed effects of generic wording, though the effects were weaker and required more extensive input. We discuss the implications for language-thought relations, and for the acquisition of essentialized categories. Copyright 2010 Elsevier Inc. All rights reserved.
Drug-related webpages classification based on multi-modal local decision fusion
NASA Astrophysics Data System (ADS)
Hu, Ruiguang; Su, Xiaojing; Liu, Yanxin
2018-03-01
In this paper, multi-modal local decision fusion is used for drug-related webpage classification. First, meaningful text is extracted through HTML parsing, and effective images are chosen by the FOCARSS algorithm. Second, six SVM classifiers are trained for six kinds of drug-taking instruments, represented by PHOG features. One SVM classifier is trained for cannabis, represented by the mid-level feature of a BOW model. For each instance in a webpage, the seven SVMs give seven labels for its image, and another seven labels are given by searching for the names of drug-taking instruments and cannabis in its related text. Concatenating the seven image labels and the seven text labels, the representation of those instances in webpages is generated. Last, multi-instance learning is used to classify those drug-related webpages. Experimental results demonstrate that the classification accuracy of multi-instance learning with multi-modal local decision fusion is much higher than that of single-modal classification.
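The local decision fusion step can be sketched schematically: per-modality SVM decisions are concatenated into a single instance representation that a final classifier consumes. PHOG/BOW feature extraction and the multi-instance learning stage are not reproduced here; all features and labels below are random placeholders.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_inst = 300
image_feats = rng.random((n_inst, 40))          # stand-in for PHOG / BOW image features
text_flags = rng.integers(0, 2, (n_inst, 7))    # 7 keyword-search labels from the related text
y_page = rng.integers(0, 2, n_inst)             # hypothetical webpage labels

# Seven image-side SVMs, one per drug-taking instrument class (plus cannabis).
image_labels = np.zeros((n_inst, 7), dtype=int)
for k in range(7):
    y_k = rng.integers(0, 2, n_inst)            # hypothetical per-class image annotations
    image_labels[:, k] = SVC(kernel="rbf").fit(image_feats, y_k).predict(image_feats)

instance_repr = np.hstack([image_labels, text_flags])         # 7 image labels + 7 text labels
fusion_clf = SVC(kernel="linear").fit(instance_repr, y_page)  # plain SVM in place of MIL
print("fused training accuracy:", round(fusion_clf.score(instance_repr, y_page), 3))
```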
21 CFR 201.128 - Meaning of “intended uses”.
Code of Federal Regulations, 2010 CFR
2010-04-01
... the distribution of the article. This objective intent may, for example, be shown by labeling claims....128 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED..., and 201.122 refer to the objective intent of the persons legally responsible for the labeling of drugs...
Transfer learning for biomedical named entity recognition with neural networks.
Giorgi, John M; Bader, Gary D
2018-06-01
The explosive increase of biomedical literature has made information extraction an increasingly important tool for biomedical research. A fundamental task is the recognition of biomedical named entities in text (BNER) such as genes/proteins, diseases, and species. Recently, a domain-independent method based on deep learning and statistical word embeddings, called long short-term memory network-conditional random field (LSTM-CRF), has been shown to outperform state-of-the-art entity-specific BNER tools. However, this method is dependent on gold-standard corpora (GSCs) consisting of hand-labeled entities, which tend to be small but highly reliable. An alternative to GSCs are silver-standard corpora (SSCs), which are generated by harmonizing the annotations made by several automatic annotation systems. SSCs typically contain more noise than GSCs but have the advantage of containing many more training examples. Ideally, these corpora could be combined to achieve the benefits of both, which is an opportunity for transfer learning. In this work, we analyze to what extent transfer learning improves upon state-of-the-art results for BNER. We demonstrate that transferring a deep neural network (DNN) trained on a large, noisy SSC to a smaller, but more reliable GSC significantly improves upon state-of-the-art results for BNER. Compared to a state-of-the-art baseline evaluated on 23 GSCs covering four different entity classes, transfer learning results in an average reduction in error of approximately 11%. We found transfer learning to be especially beneficial for target data sets with a small number of labels (approximately 6000 or less). Source code for the LSTM-CRF is available at https://github.com/Franck-Dernoncourt/NeuroNER/ and links to the corpora are available at https://github.com/BaderLab/Transfer-Learning-BNER-Bioinformatics-2018/. john.giorgi@utoronto.ca. Supplementary data are available at Bioinformatics online.
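The transfer scheme itself (pre-train on the large noisy corpus, then continue training on the small reliable one) can be sketched with a plain BiLSTM tagger in PyTorch; the CRF layer and the pretrained word embeddings used in the paper are omitted, and the SSC/GSC batches below are random token and tag tensors.

```python
import torch
import torch.nn as nn

class LSTMTagger(nn.Module):
    def __init__(self, vocab_size, tagset_size, emb_dim=50, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, tagset_size)

    def forward(self, tokens):                     # tokens: (batch, seq_len)
        h, _ = self.lstm(self.emb(tokens))
        return self.out(h)                         # (batch, seq_len, tagset_size)

def train(model, batches, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for tokens, tags in batches:
            opt.zero_grad()
            logits = model(tokens)
            loss_fn(logits.reshape(-1, logits.size(-1)), tags.reshape(-1)).backward()
            opt.step()

vocab_size, tagset_size = 5000, 5
# Random stand-ins: a large noisy silver-standard corpus and a small reliable gold-standard corpus.
ssc = [(torch.randint(0, vocab_size, (8, 20)), torch.randint(0, tagset_size, (8, 20)))] * 10
gsc = [(torch.randint(0, vocab_size, (8, 20)), torch.randint(0, tagset_size, (8, 20)))]

model = LSTMTagger(vocab_size, tagset_size)
train(model, ssc, epochs=3, lr=1e-3)    # pre-train on the noisy SSC
train(model, gsc, epochs=10, lr=1e-4)   # fine-tune on the reliable GSC (the transfer step)
```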
A Label Propagation Approach for Detecting Buried Objects in Handheld GPR Data
2016-04-17
regions of interest that correspond to locations with anomalous signatures. Second, a classifier (or an ensemble of classifiers) is used to assign a...investigated for almost two decades and several classifiers have been developed. Most of these methods are based on the supervised learning paradigm where...labeled target and clutter signatures are needed to train a classifier to discriminate between the two classes. Typically, large and diverse labeled
Active machine learning for rapid landslide inventory mapping with VHR satellite images (Invited)
NASA Astrophysics Data System (ADS)
Stumpf, A.; Lachiche, N.; Malet, J.; Kerle, N.; Puissant, A.
2013-12-01
VHR satellite images have become a primary source for landslide inventory mapping after major triggering events such as earthquakes and heavy rainfalls. Visual image interpretation is still the prevailing standard method for operational purposes but is time-consuming and not well suited to fully exploit the increasingly better supply of remote sensing data. Recent studies have addressed the development of more automated image analysis workflows for landslide inventory mapping. In particular, object-oriented approaches that account for spatial and textural image information have been demonstrated to be more adequate than pixel-based classification, but manually elaborated rule-based classifiers are difficult to adapt under changing scene characteristics. Machine learning algorithms allow learning classification rules for complex image patterns from labelled examples and can be adapted straightforwardly with available training data. In order to reduce the amount of costly training data, active learning (AL) has evolved as a key concept to guide the sampling for many applications. The underlying idea of AL is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and data structure to iteratively select the most valuable samples that should be labelled by the user. With relatively few queries and labelled samples, an AL strategy yields higher accuracies than an equivalent classifier trained with many randomly selected samples. This study addressed the development of an AL method for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. Our approach [1] is based on the Random Forest algorithm and considers the classifier uncertainty as well as the variance of potential sampling regions to guide the user towards the most valuable sampling areas. The algorithm explicitly searches for compact regions and thereby avoids a spatially disperse sampling pattern inherent to most other AL methods. The accuracy, the sampling time and the computational runtime of the algorithm were evaluated on multiple satellite images capturing recent large-scale landslide events. Sampling between 1 and 4% of the study areas, accuracies between 74% and 80% were achieved, whereas standard sampling schemes yielded only accuracies between 28% and 50% with equal sampling costs. Compared to commonly used point-wise AL algorithms, the proposed approach significantly reduces the number of iterations and hence the computational runtime. Since the user can focus on relatively few compact areas (rather than on hundreds of distributed points), the overall labeling time is reduced by more than 50% compared to point-wise queries. An experimental evaluation of multiple expert mappings demonstrated strong relationships between the uncertainties of the experts and the machine learning model. It revealed that the achieved accuracies are within the range of the inter-expert disagreement and that it will be indispensable to consider ground truth uncertainties to truly achieve further enhancements in the future. The proposed method is generally applicable to a wide range of optical satellite images and landslide types. [1] A. Stumpf, N. Lachiche, J.-P. Malet, N. Kerle, and A. Puissant, Active learning in the spatial domain for remote sensing image classification, IEEE Transactions on Geoscience and Remote Sensing, 2013, DOI 10.1109/TGRS.2013.2262052.
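A point-wise sketch of the uncertainty-driven sampling loop is given below using a Random Forest; the paper's central contribution, selecting compact spatial regions rather than isolated samples, is not reproduced, and the data are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)  # synthetic object features
rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=20, replace=False))   # small initial training set
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(10):                                          # ten active-learning iterations
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[labeled], y[labeled])
    proba = rf.predict_proba(X[pool])
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)  # classifier uncertainty per sample
    query = pool[int(np.argmax(entropy))]                     # most uncertain unlabeled sample
    labeled.append(query)                                     # the 'expert' supplies y[query]
    pool.remove(query)

print("final training-set size:", len(labeled))
```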
Adapter reagents for protein site specific dye labeling.
Thompson, Darren A; Evans, Eric G B; Kasza, Tomas; Millhauser, Glenn L; Dawson, Philip E
2014-05-01
Chemoselective protein labeling remains a significant challenge in chemical biology. Although many selective labeling chemistries have been reported, the practicalities of matching the reaction with appropriately functionalized proteins and labeling reagents is often a challenge. For example, we encountered the challenge of site specifically labeling the cellular form of the murine Prion protein with a fluorescent dye. To facilitate this labeling, a protein was expressed with site specific p-acetylphenylalanine. However, the utility of this acetophenone reactive group is hampered by the severe lack of commercially available aminooxy fluorophores. Here we outline a general strategy for the efficient solid phase synthesis of adapter reagents capable of converting maleimido-labels into aminooxy or azide functional groups that can be further tuned for desired length or solubility properties. The utility of the adapter strategy is demonstrated in the context of fluorescent labeling of the murine Prion protein through an adapted aminooxy-Alexa dye. © 2014 Wiley Periodicals, Inc.
Adapter Reagents for Protein Site Specific Dye Labeling
Thompson, Darren A.; Evans, Eric G. B.; Kasza, Tomas; Millhauser, Glenn L.; Dawson, Philip E.
2016-01-01
Chemoselective protein labeling remains a significant challenge in chemical biology. Although many selective labeling chemistries have been reported, the practicalities of matching the reaction with appropriately functionalized proteins and labeling reagents is often a challenge. For example, we encountered the challenge of site specifically labeling the cellular form of the murine Prion protein with a fluorescent dye. To facilitate this labeling, a protein was expressed with site specific p-acetylphenylalanine. However, the utility of this acetophenone reactive group is hampered by the severe lack of commercially available aminooxy fluorophores. Here we outline a general strategy for the efficient solid phase synthesis of adapter reagents capable of converting maleimido-labels into aminooxy or azide functional groups that can be further tuned for desired length or solubility properties. The utility of the adapter strategy is demonstrated in the context of fluorescent labeling of the murine Prion protein through an adapted aminooxy-Alexa dye. PMID:24599728
NASA Astrophysics Data System (ADS)
Liu, Jiamin; Chang, Kevin; Kim, Lauren; Turkbey, Evrim; Lu, Le; Yao, Jianhua; Summers, Ronald
2015-03-01
The thyroid gland plays an important role in clinical practice, especially for radiation therapy treatment planning. For patients with head and neck cancer, radiation therapy requires a precise delineation of the thyroid gland to be spared on the pre-treatment planning CT images to avoid thyroid dysfunction. In the current clinical workflow, the thyroid gland is normally manually delineated by radiologists or radiation oncologists, which is time consuming and error prone. Therefore, a system for automated segmentation of the thyroid is desirable. However, automated segmentation of the thyroid is challenging because the thyroid is inhomogeneous and surrounded by structures that have similar intensities. In this work, the thyroid gland segmentation is initially estimated by multi-atlas label fusion algorithm. The segmentation is refined by supervised statistical learning based voxel labeling with a random forest algorithm. Multiatlas label fusion (MALF) transfers expert-labeled thyroids from atlases to a target image using deformable registration. Errors produced by label transfer are reduced by label fusion that combines the results produced by all atlases into a consensus solution. Then, random forest (RF) employs an ensemble of decision trees that are trained on labeled thyroids to recognize features. The trained forest classifier is then applied to the thyroid estimated from the MALF by voxel scanning to assign the class-conditional probability. Voxels from the expert-labeled thyroids in CT volumes are treated as positive classes; background non-thyroid voxels as negatives. We applied this automated thyroid segmentation system to CT scans of 20 patients. The results showed that the MALF achieved an overall 0.75 Dice Similarity Coefficient (DSC) and the RF classification further improved the DSC to 0.81.
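A toy version of the two-stage idea is sketched below: already-registered atlas label maps are fused by majority vote, and a random forest trained on simple voxel features produces a class-conditional probability map. Deformable registration is omitted, the forest is trained here on the fused estimate rather than on expert-labeled atlases, and the volumes are small synthetic arrays.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
shape = (16, 16, 16)
target_img = rng.normal(size=shape)                          # toy CT intensities

# Three hypothetical atlas segmentations, assumed already warped to the target space.
atlas_labels = [rng.integers(0, 2, shape) for _ in range(3)]
fused = (np.mean(atlas_labels, axis=0) >= 0.5).astype(int)   # majority-vote label fusion (MALF step)

# Refinement: a random forest on simple per-voxel features (intensity + local mean of fused labels).
neigh_mean = uniform_filter(fused.astype(float), size=3)
features = np.stack([target_img.ravel(), neigh_mean.ravel()], axis=1)
labels = fused.ravel()

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(features, labels)
prob_map = rf.predict_proba(features)[:, 1].reshape(shape)   # class-conditional probability per voxel
print("voxels labeled positive after refinement:", int((prob_map > 0.5).sum()))
```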
Using Distance Education to Teach the New Food Label to Extension Educators.
ERIC Educational Resources Information Center
Struempler, Barbara; And Others
1997-01-01
Satellite training about the new national food labeling system was provided to 97 Alabama extension agents and 67 program assistants. The program, which consisted of a 30-minute video and 25-minute question/answer call-in, proved an effective means of distance education. (SK)
30 CFR 47.44 - Temporary, portable containers.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Temporary, portable containers. 47.44 Section... TRAINING HAZARD COMMUNICATION (HazCom) Container Labels and Other Forms of Warning § 47.44 Temporary, portable containers. (a) The operator does not have to label a temporary, portable container if he or she...
NASA Astrophysics Data System (ADS)
Eppenhof, Koen A. J.; Pluim, Josien P. W.
2017-02-01
Error estimation in medical image registration is valuable when validating, comparing, or combining registration methods. To validate a nonlinear image registration method, ideally the registration error should be known for the entire image domain. We propose a supervised method for the estimation of a registration error map for nonlinear image registration. The method is based on a convolutional neural network that estimates the norm of the residual deformation from patches around each pixel in two registered images. This norm is interpreted as the registration error, and is defined for every pixel in the image domain. The network is trained using a set of artificially deformed images. Each training example is a pair of images: the original image, and a random deformation of that image. No manually labeled ground truth error is required. At test time, only the two registered images are required as input. We train and validate the network on registrations in a set of 2D digital subtraction angiography sequences, such that errors up to eight pixels can be estimated. We show that for this range of errors the convolutional network is able to learn the registration error in pairs of 2D registered images at subpixel precision. Finally, we present a proof of principle for the extension to 3D registration problems in chest CTs, showing that the method has the potential to estimate errors in 3D registration problems.
18F-Labeling of Sensitive Biomolecules for Positron Emission Tomography
Krishnan, Hema S.; Ma, Longle; Vasdev, Neil; Liang, Steven H.
2017-01-01
Positron emission tomography (PET) imaging study of fluorine-18 labeled biomolecules is an emerging and rapidly growing area for preclinical and clinical research. The present review focuses on recent advances in radiochemical methods for incorporating fluorine-18 into biomolecules via ‘direct’ or ‘indirect’ bioconjugation. Recently developed prosthetic groups and pre-targeting strategies, as well as representative examples in 18F-labeling of biomolecules in PET imaging research studies are highlighted. PMID:28704575
United States Food and Drug Administration Product Label Changes
Sung, Julie C.; Stein-Gold, Linda; Goldenberg, Gary
2017-01-01
Once a drug has been approved by the United States Food and Drug Administration and is on the market, the Food and Drug Administration communicates new safety information through product label changes. Most of these label changes occur after a spontaneous report to either the drug manufacturing companies or the Food and Drug Administration MedWatch program. As a result, 400 to 500 label changes occur every year. Actinic keratosis treatments exemplify the commonality of label changes throughout the postmarket course of a drug. Diclofenac gel, 5-fluorouracil cream, imiquimod, and ingenol mebutate are examples of actinic keratosis treatments that have all undergone at least one label revision. With the current system of spontaneous reports leading to numerous label changes, each occurrence does not necessarily signify a radical change in the safety of a drug. PMID:28367259
United States Food and Drug Administration Product Label Changes
Sung, Julie C.; Stein-Gold, Linda; Goldenberg, Gary
2016-01-01
Once a drug has been approved by the United States Food and Drug Administration and is on the market, the Food and Drug Administration communicates new safety information through product label changes. Most of these label changes occur after a spontaneous report to either the drug manufacturing companies or the Food and Drug Administration MedWatch program. As a result, 400 to 500 label changes occur every year. Actinic keratosis treatments exemplify the commonality of label changes throughout the postmarket course of a drug. Diclofenac gel, 5-fluorouracil cream, imiquimod, and ingenol mebutate are examples of actinic keratosis treatments that have all undergone at least one label revision. With the current system of spontaneous reports leading to numerous label changes, each occurrence does not necessarily signify a radical change in the safety of a drug. PMID:26962391
United States Food and Drug Administration Product Label Changes.
Kircik, Leon; Sung, Julie C; Stein-Gold, Linda; Goldenberg, Gary
2017-02-01
Once a drug has been approved by the United States Food and Drug Administration and is on the market, the Food and Drug Administration communicates new safety information through product label changes. Most of these label changes occur after a spontaneous report to either the drug manufacturing companies or the Food and Drug Administration MedWatch program. As a result, 400 to 500 label changes occur every year. Actinic keratosis treatments exemplify the commonality of label changes throughout the postmarket course of a drug. Diclofenac gel, 5-fluorouracil cream, imiquimod, and ingenol mebutate are examples of actinic keratosis treatments that have all undergone at least one label revision. With the current system of spontaneous reports leading to numerous label changes, each occurrence does not necessarily signify a radical change in the safety of a drug.
Food for thought: obstacles to menu labelling in restaurants and cafeterias.
Thomas, Erica
2016-08-01
Menu labelling is recommended as a policy intervention to reduce obesity and diet-related disease. The present commentary considers the many challenges the restaurant industry faces in providing nutrition information on its menus. Barriers include lack of nutrition expertise, time, cost, availability of nutrition information for exotic ingredients, ability to provide accurate nutrition information, libel risk, customer dissatisfaction, limited space on the menu, menu variations, loss of flexibility in changing the menu, staff training and resistance of employees to change current practice. Health promotion specialists and academics involved in fieldwork must help restaurateurs find solutions to these barriers for menu labelling interventions to be widely implemented and successful. Practical support for small independent restaurants such as free or subsidised nutrition analysis, nutrition training for staff and menu design may also be necessary to encourage voluntary participation.
Bayesian network classifiers for categorizing cortical GABAergic interneurons.
Mihaljević, Bojan; Benavides-Piccione, Ruth; Bielza, Concha; DeFelipe, Javier; Larrañaga, Pedro
2015-04-01
An accepted classification of GABAergic interneurons of the cerebral cortex is a major goal in neuroscience. A recently proposed taxonomy based on patterns of axonal arborization promises to be a pragmatic method for achieving this goal. It involves characterizing interneurons according to five axonal arborization features, called F1-F5, and classifying them into a set of predefined types, most of which are established in the literature. Unfortunately, there is little consensus among expert neuroscientists regarding the morphological definitions of some of the proposed types. While supervised classifiers were able to categorize the interneurons in accordance with experts' assignments, their accuracy was limited because they were trained with disputed labels. Thus, here we automatically classify interneuron subsets with different label reliability thresholds (i.e., such that every cell's label is backed by at least a certain (threshold) number of experts). We quantify the cells with parameters of axonal and dendritic morphologies and, in order to predict the type, also with axonal features F1-F4 provided by the experts. Using Bayesian network classifiers, we accurately characterize and classify the interneurons and identify useful predictor variables. In particular, we discriminate among reliable examples of common basket, horse-tail, large basket, and Martinotti cells with up to 89.52% accuracy, and single out the number of branches at 180 μm from the soma, the convex hull 2D area, and the axonal features F1-F4 as especially useful predictors for distinguishing among these types. These results open up new possibilities for an objective and pragmatic classification of interneurons.
Semantic Image Segmentation with Contextual Hierarchical Models.
Seyedhosseini, Mojtaba; Tasdizen, Tolga
2016-05-01
Semantic segmentation is the problem of assigning an object label to each pixel. It unifies the image segmentation and object recognition problems. The importance of using contextual information in semantic segmentation frameworks has been widely realized in the field. We propose a contextual framework, called contextual hierarchical model (CHM), which learns contextual information in a hierarchical framework for semantic segmentation. At each level of the hierarchy, a classifier is trained based on downsampled input images and outputs of previous levels. Our model then incorporates the resulting multi-resolution contextual information into a classifier to segment the input image at original resolution. This training strategy allows for optimization of a joint posterior probability at multiple resolutions through the hierarchy. Contextual hierarchical model is purely based on the input image patches and does not make use of any fragments or shape examples. Hence, it is applicable to a variety of problems such as object segmentation and edge detection. We demonstrate that CHM performs at par with state-of-the-art on Stanford background and Weizmann horse datasets. It also outperforms state-of-the-art edge detection methods on NYU depth dataset and achieves state-of-the-art on Berkeley segmentation dataset (BSDS 500).
Kogan, J A; Margoliash, D
1998-04-01
The performance of two techniques is compared for automated recognition of bird song units from continuous recordings. The advantages and limitations of dynamic time warping (DTW) and hidden Markov models (HMMs) are evaluated on a large database of male songs of zebra finches (Taeniopygia guttata) and indigo buntings (Passerina cyanea), which have different types of vocalizations and have been recorded under different laboratory conditions. Depending on the quality of recordings and complexity of song, the DTW-based technique gives excellent to satisfactory performance. Under challenging conditions such as noisy recordings or presence of confusing short-duration calls, good performance of the DTW-based technique requires careful selection of templates that may demand expert knowledge. Because HMMs are trained, equivalent or even better performance of HMMs can be achieved based only on segmentation and labeling of constituent vocalizations, albeit with many more training examples than DTW templates. One weakness in HMM performance is the misclassification of short-duration vocalizations or song units with more variable structure (e.g., some calls, and syllables of plastic songs). To address these and other limitations, new approaches for analyzing bird vocalizations are discussed.
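The DTW side of the comparison is easy to sketch: compute a dynamic-time-warping distance between a candidate song unit and each hand-picked template, and assign the label of the nearest template. Real systems operate on spectral feature sequences; the templates and candidate below are toy one-dimensional signals.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

templates = {                                        # hypothetical labeled song-unit templates
    "syllable_A": np.sin(np.linspace(0, 3 * np.pi, 40)),
    "syllable_B": np.linspace(0.0, 1.0, 40),
}
candidate = np.sin(np.linspace(0, 3 * np.pi, 55))    # a time-warped rendition of syllable A

best = min(templates, key=lambda k: dtw_distance(candidate, templates[k]))
print("closest template:", best)
```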
Automatic ground control point recognition with parallel associative memory
NASA Technical Reports Server (NTRS)
Al-Tahir, Raid; Toth, Charles K.; Schenck, Anton F.
1990-01-01
The basic principle of the associative memory is to match the unknown input pattern against a stored training set and to respond with the 'closest match' and the corresponding label. Generally, an associative memory system requires two preparatory steps: selecting attributes of the pattern class, and training the system by associating patterns with labels. Experimental results gained from using Parallel Associative Memory are presented. The primary concern is an automatic search for ground control points in aerial photographs. Synthetic patterns are tested, followed by real data. The results are encouraging as a relatively high level of correct matches is reached.
Highly enriched multiply-labeled stable isotopic compounds as atmospheric tracers
Goldblatt, M.; McInteer, B.B.
1974-01-29
Compounds multiply-labeled with stable isotopes and highly enriched in these isotopes are readily capable of detection in tracer experiments involving high dilutions. Thus, for example, ¹³C¹⁸O₂ provides a useful tracer for following atmospheric pollution produced as a result of fossil fuel burning. (Official Gazette)
The effect of sample size and disease prevalence on supervised machine learning of narrative data.
McKnight, Lawrence K.; Wilcox, Adam; Hripcsak, George
2002-01-01
This paper examines the independent effects of outcome prevalence and training sample sizes on inductive learning performance. We trained 3 inductive learning algorithms (MC4, IB, and Naïve-Bayes) on 60 simulated datasets of parsed radiology text reports labeled with 6 disease states. Data sets were constructed to define positive outcome states at five prevalence rates (1, 5, 10, 25, and 50%) in training set sizes of 200 and 2,000 cases. We found that the effect of outcome prevalence is significant when outcome classes drop below 10% of cases. The effect appeared independent of sample size, induction algorithm used, or class label. Work is needed to identify methods of improving classifier performance when output classes are rare. PMID:12463878
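The kind of experiment described can be reproduced in outline on synthetic data: vary prevalence and training-set size, train a classifier, and track sensitivity on the rare class. The sketch below uses Gaussian Naive Bayes and scikit-learn's make_classification as stand-ins for the paper's learners and radiology-report features.

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import recall_score

for n_train in (200, 2000):
    for prevalence in (0.01, 0.05, 0.10, 0.25, 0.50):
        # Synthetic data with the positive class at the chosen prevalence.
        X, y = make_classification(n_samples=n_train + 1000, n_features=20,
                                   weights=[1 - prevalence, prevalence],
                                   flip_y=0.02, random_state=0)
        X_tr, y_tr, X_te, y_te = X[:n_train], y[:n_train], X[n_train:], y[n_train:]
        clf = GaussianNB().fit(X_tr, y_tr)
        sensitivity = recall_score(y_te, clf.predict(X_te), zero_division=0)
        print(f"n_train={n_train:5d}  prevalence={prevalence:.2f}  sensitivity={sensitivity:.2f}")
```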
Besalú, Emili
2016-01-01
The Superposing Significant Interaction Rules (SSIR) method is described. It is a general combinatorial and symbolic procedure able to rank compounds belonging to combinatorial analogue series. The procedure generates structure-activity relationship (SAR) models and also serves as an inverse SAR tool. The method is fast and can deal with large databases. SSIR operates from statistical significances calculated from the available library of compounds and according to the previously attached molecular labels of interest or non-interest. The required symbolic codification allows dealing with almost any combinatorial data set, even in a confidential manner, if desired. The application example categorizes molecules as binding or non-binding, and consensus ranking SAR models are generated from training and two distinct cross-validation methods: leave-one-out and balanced leave-two-out (BL2O), the latter being suited for the treatment of binary properties. PMID:27240346
Active Learning with Irrelevant Examples
NASA Technical Reports Server (NTRS)
Mazzoni, Dominic; Wagstaff, Kiri L.; Burl, Michael
2006-01-01
Active learning algorithms attempt to accelerate the learning process by requesting labels for the most informative items first. In real-world problems, however, there may exist unlabeled items that are irrelevant to the user's classification goals. Queries about these points slow down learning because they provide no information about the problem of interest. We have observed that when irrelevant items are present, active learning can perform worse than random selection, requiring more time (queries) to achieve the same level of accuracy. Therefore, we propose a novel approach, Relevance Bias, in which the active learner combines its default selection heuristic with the output of a simultaneously trained relevance classifier to favor items that are likely to be both informative and relevant. In our experiments on a real-world problem and two benchmark datasets, the Relevance Bias approach significantly improved the learning rate of three different active learning approaches.
Generating Ground Reference Data for a Global Impervious Surface Survey
NASA Technical Reports Server (NTRS)
Tilton, James C.; De Colstoun, Eric Brown; Wolfe, Robert E.; Tan, Bin; Huang, Chengquan
2012-01-01
We are developing an approach for generating ground reference data in support of a project to produce a 30m impervious cover data set of the entire Earth for the years 2000 and 2010 based on the Landsat Global Land Survey (GLS) data set. Since sufficient ground reference data for training and validation is not available from ground surveys, we are developing an interactive tool, called HSegLearn, to facilitate the photo-interpretation of 1 to 2 m spatial resolution imagery data, which we will use to generate the needed ground reference data at 30m. Through the submission of selected region objects and positive or negative examples of impervious surfaces, HSegLearn enables an analyst to automatically select groups of spectrally similar objects from a hierarchical set of image segmentations produced by the HSeg image segmentation program at an appropriate level of segmentation detail, and label these region objects as either impervious or nonimpervious.
101 Labeled Brain Images and a Consistent Human Cortical Labeling Protocol
Klein, Arno; Tourville, Jason
2012-01-01
We introduce the Mindboggle-101 dataset, the largest and most complete set of free, publicly accessible, manually labeled human brain images. To manually label the macroscopic anatomy in magnetic resonance images of 101 healthy participants, we created a new cortical labeling protocol that relies on robust anatomical landmarks and minimal manual edits after initialization with automated labels. The “Desikan–Killiany–Tourville” (DKT) protocol is intended to improve the ease, consistency, and accuracy of labeling human cortical areas. Given how difficult it is to label brains, the Mindboggle-101 dataset is intended to serve as brain atlases for use in labeling other brains, as a normative dataset to establish morphometric variation in a healthy population for comparison against clinical populations, and contribute to the development, training, testing, and evaluation of automated registration and labeling algorithms. To this end, we also introduce benchmarks for the evaluation of such algorithms by comparing our manual labels with labels automatically generated by probabilistic and multi-atlas registration-based approaches. All data and related software and updated information are available on the http://mindboggle.info/data website. PMID:23227001
Detailed Phonetic Labeling of Multi-language Database for Spoken Language Processing Applications
2015-03-01
…which contains about 60 interfering speakers as well as background music in a bar. The top panel again shows clean training / noisy testing settings. … A recognition system for Mandarin was developed and tested; character recognition rates as high as 88% were obtained using approximately 40 training …
An Analysis of John Dewey's Notion of Occupations--Still Pedagogically Valuable?
ERIC Educational Resources Information Center
DeFalco, Anthony
2010-01-01
John Dewey lived and worked in an environment where the manual training movement was ever-present. For Dewey his own unique version of manual training is labeled occupations. Nevertheless, over the years what Dewey meant by occupations has been either misinterpreted or ignored for a plethora of reasons. This manual training climate that Dewey was…
This example training certification format and any attached training documentation may be used to demonstrate, document and certify successful completion of required training topics under 40 CFR 63.11515(d)(6) for personnel who spray apply surface coatings.
Active Learning by Querying Informative and Representative Examples.
Huang, Sheng-Jun; Jin, Rong; Zhou, Zhi-Hua
2014-10-01
Active learning reduces the labeling cost by iteratively selecting the most valuable data to query their labels. It has attracted a lot of interest given the abundance of unlabeled data and the high cost of labeling. Most active learning approaches select either informative or representative unlabeled instances to query their labels, which could significantly limit their performance. Although several active learning algorithms have been proposed to combine the two query selection criteria, they are usually ad hoc in finding unlabeled instances that are both informative and representative. We address this limitation by developing a principled approach, termed QUIRE, based on the min-max view of active learning. The proposed approach provides a systematic way for measuring and combining the informativeness and representativeness of an unlabeled instance. Further, by incorporating the correlation among labels, we extend the QUIRE approach to multi-label learning by actively querying instance-label pairs. Extensive experimental results show that the proposed QUIRE approach outperforms several state-of-the-art active learning approaches in both single-label and multi-label learning.
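To make the two criteria concrete, the sketch below scores unlabeled instances with a naive additive combination of an informativeness term (classifier uncertainty) and a representativeness term (average similarity to the rest of the pool). This is not the QUIRE min-max formulation, only an illustration of its two ingredients on synthetic data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
labeled = np.arange(20)                     # small labeled seed set
unlabeled = np.arange(20, 500)

clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
proba = clf.predict_proba(X[unlabeled])[:, 1]
informativeness = 1.0 - 2.0 * np.abs(proba - 0.5)          # highest where the model is least sure

similarity = rbf_kernel(X[unlabeled], X[unlabeled], gamma=0.5)
representativeness = similarity.mean(axis=1)               # close to many pool points => representative

score = informativeness + representativeness               # naive combination (not QUIRE's min-max)
query = unlabeled[int(np.argmax(score))]
print("next instance to query:", query)
```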
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rother, Hanna-Andrea
Pesticide companies and regulators in developing countries use the United Nations Food and Agricultural Organization (FAO) recommended pictograms on pesticide labels to communicate risk information based on toxicological and environmental risk assessment data. The pesticide label is often not only the only access people have to pesticide risk information but also, in many countries, a legally binding document. Because pesticide labels play a crucial role in protecting health and the environment and serve as a legal instrument, pictograms are used to overcome literacy challenges in transmitting pesticide risk information. Yet this risk communication tool is often prone to misinterpretations of the risk information, which result in hazardous exposures to pesticides for farm workers and end-users generally. In this paper, results are presented from a study with 115 farm workers on commercial vineyards in the Western Cape, South Africa, assessing their interpretations of 10 commonly used pictograms. A standardized questionnaire based on four commonly used pesticide labels was administered. Overall, 50% or more of the study farm workers had misleading, incorrect and critically confused interpretations of the label pictograms. Interpretations often reflected farm workers' social and cultural frames of reference rather than the technically intended risk information. For example, the pictogram indicating that a pesticide's toxicity requires boots to be worn evoked interpretations of 'dangerous to pedestrians' and 'don't walk through pesticides'. Furthermore, there was a gender variation in pictogram comprehension whereby males generally had more correct interpretations than females. This results both from a lack of training for women, who are assumed not to work with pesticides, and from a lack of pictograms relevant to female exposures. These findings challenge the viability of the United Nations' current initiative to globally harmonize the pictograms used on all chemical labels under the new Globally Harmonized System for the Classification and Labelling of Chemicals (GHS), particularly as the GHS pictograms were not piloted prior to adoption of the system and represent complex risk assessment data such as chronic hazards. Public health and pesticide policy, backed by relevant research, need to address developing applicable and effective pesticide risk communication tools, particularly for developing country populations. Merely providing risk assessment derived information in a pictogram does not ensure that an end-user will interpret the message as intended and be able to make risk decisions which mitigate risks from exposures to pesticides or chemicals in general.
Messori, A; Fadda, V; Trippoli, S
2011-04-01
National healthcare systems as well as local institutions generally reimburse numerous off-label uses of anticancer drugs, but an explicit framework for managing these payments is still lacking. As in the case of on-label uses, an optimal management of off-label uses should be aimed at a direct proportionality between cost and clinical benefit. Within this framework, assessing the incremental cost/effectiveness ratio becomes mandatory, and measuring the magnitude of the clinical benefit (e.g. gain in overall survival or progression-free survival) is essential. This paper discusses how the standard principles of cost-effectiveness and value-for-money can be applied to manage the reimbursement of off-label treatments in oncology. It also describes a detailed operational scheme to appropriately implement this aim. Two separate approaches are considered: a) a trial-based approach, which is designed for situations where enough information is available from clinical studies about the expected effectiveness of the off-label treatment; b) an individualized payment-by-results approach, which is designed for situations in which adequate information on effectiveness is lacking; this latter approach requires that each patient receiving off-label treatment is followed-up to determine individual outcomes and tailor the extent of payment to individual results. Some examples of application of both approaches are presented in detail, which have been extracted from a list of 184 off-label indications approved in 2010 by the Region of Tuscany in Italy. These examples support the feasibility of the two methods proposed. In conclusion, the scheme described in this paper represents an operational solution to an unsettled problem in the area of oncology drugs. © E.S.I.F.T. srl - Firenze
18F-Labeling of Sensitive Biomolecules for Positron Emission Tomography.
Krishnan, Hema S; Ma, Longle; Vasdev, Neil; Liang, Steven H
2017-11-07
Positron emission tomography (PET) imaging study of fluorine-18 labeled biomolecules is an emerging and rapidly growing area for preclinical and clinical research. The present review focuses on recent advances in radiochemical methods for incorporating fluorine-18 into biomolecules via "direct" or "indirect" bioconjugation. Recently developed prosthetic groups and pre-targeting strategies, as well as representative examples in 18F-labeling of biomolecules in PET imaging research studies are highlighted. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Volume-labeled nanoparticles and methods of preparation
Wang, Wei; Gu, Baohua; Retterer, Scott T; Doktycz, Mitchel J
2015-04-21
Compositions comprising nanosized objects (i.e., nanoparticles) in which at least one observable marker, such as a radioisotope or fluorophore, is incorporated within the nanosized object. The nanosized objects include, for example, metal or semi-metal oxide (e.g., silica), quantum dot, noble metal, magnetic metal oxide, organic polymer, metal salt, and core-shell nanoparticles, wherein the label is incorporated within the nanoparticle or selectively in a metal oxide shell of a core-shell nanoparticle. Methods of preparing the volume-labeled nanoparticles are also described.
Labelling chronic illness in primary care: a good or a bad thing?
Bedson, John; McCarney, Rob; Croft, Peter
2004-12-01
Traditionally the management of any chronic condition starts with its diagnosis. The labelling of disease can be beneficial in terms of defining appropriate treatment such as in coronary artery disease. However, sometimes it may be detrimental such as when x-rays are used to diagnose lumbar spondylosis leading to patients inappropriately limiting their activity. Chronic knee pain in the elderly is another example where applying labels is problematical. A common diagnosis in this situation is osteoarthritis, but this label can be applied in two ways: as a radiological diagnosis, or as a clinical one. The x-ray diagnosis, however, does not equate with the clinical syndrome, and vice versa. In addition, diagnosing knee pain as osteoarthritis does not necessarily help in management, since a patient's debility is more dependent upon their clinical signs and symptoms than the presence of radiographic osteoarthritis, and by the same token its clinical counterpart. GPs are consistent in their management of knee pain, but in attempting to diagnose the pain as osteoarthritis, these plans can alter and become more dependent on the actual diagnosis than the clinical picture. As a result management may well diverge from what the current best evidence supports. Diagnosis for diagnosis sake, should therefore be discouraged, and chronic knee pain gives us one example of why this is the case. GPs would be better placed to manage this condition if it was considered more as a regional pain syndrome, perhaps defining it simply as 'chronic knee pain in older people'. This example suggests that there is a pressing need in primary care to carefully consider in chronic disease when it is appropriate to be definitive in diagnosis such that when using disease specific labels, there is definite benefit for the patient and doctor.
Towards harmonized seismic analysis across Europe using supervised machine learning approaches
NASA Astrophysics Data System (ADS)
Zaccarelli, Riccardo; Bindi, Dino; Cotton, Fabrice; Strollo, Angelo
2017-04-01
In the framework of the Thematic Core Services for Seismology of EPOS-IP (European Plate Observing System-Implementation Phase), a service for disseminating a regionalized logic-tree of ground motion models for Europe is under development. While for the Mediterranean area the large availability of strong motion data, qualified and disseminated through the Engineering Strong Motion database (ESM-EPOS), supports the development of both selection criteria and ground motion models, for the low-to-moderate seismicity regions of continental Europe the development of ad hoc models using weak-motion recordings of moderate earthquakes is unavoidable. The aim of this work is to present a platform for creating application-oriented earthquake databases by retrieving information from EIDA (European Integrated Data Archive) and applying supervised learning models for earthquake record selection and processing suitable for any specific application of interest. Supervised learning, i.e., the task of inferring a function from labelled training data, has been extensively used in several fields such as spam detection, speech and image recognition, and pattern recognition in general. Its suitability for detecting anomalies and performing semi- to fully-automated filtering of large waveform data sets, easing the effort of (or replacing) human expertise, is therefore straightforward. Because supervised learning algorithms can learn from a relatively small training set to predict and categorize unseen data, their advantage when processing large amounts of data is crucial. Moreover, their intrinsic ability to make data-driven predictions makes them suitable (and preferable) in those cases where explicit algorithms for detection might be unfeasible or too heuristic. In this study, we consider relatively simple statistical classifiers (e.g., Naive Bayes, Logistic Regression, Random Forest, SVMs) where labels are assigned to waveform data based on the "recognized classes" needed for our use case. These classes might form a simple binary case (e.g., "good for analysis" vs "bad") or a more complex one (e.g., "good for analysis" vs "low SNR", "multi-event", "bad coda envelope"). It is important to stress that our approach can be generalized to any use case, provided, as in any supervised approach, an adequate training set of labelled data, a feature set, a statistical classifier, and finally model validation and evaluation. Examples of use cases considered to develop the system prototype are the characterization of ground motion in low-seismicity areas; harmonized spectral analysis across Europe for source and attenuation studies; magnitude calibration; and coda analysis for attenuation studies.
Use of doubly labeled water technique in soldiers training for jungle warfare
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forbes-Ewan, C.H.; Morrissey, B.L.; Gregg, G.C.
1989-07-01
The doubly labeled water method was used to estimate the energy expended by four members of an Australian Army platoon (34 soldiers) engaged in training for jungle warfare. Each subject received an oral isotope dose sufficient to raise isotope levels by 200-250 ppm (¹⁸O) and 100-120 ppm (²H). The experimental period was 7 days. Concurrently, a factorial estimate of the energy expenditure of the platoon was conducted. Also, a food intake-energy balance study was conducted for the platoon. Mean daily energy expenditure by the doubly labeled water method was 4,750 kcal (range 4,152-5,394 kcal). The factorial estimate of mean daily energy expenditure was 4,535 kcal. Because of inherent inaccuracies in the food intake-energy balance technique, we were able to conclude only that energy expenditure, as measured by this method, was greater than the estimated mean daily intake of 4,040 kcal. The doubly labeled water technique was well tolerated, is noninvasive, and appears to be suitable for a wide range of field applications.
Absolute Memory for Tempo in Musicians and Non-Musicians.
Gratton, Irene; Brandimonte, Maria A; Bruno, Nicola
2016-01-01
The ability to remember tempo (the perceived frequency of musical pulse) without external references may be defined, by analogy with the notion of absolute pitch, as absolute tempo (AT). Anecdotal reports and sparse empirical evidence suggest that at least some individuals possess AT. However, to our knowledge, no systematic assessments of AT have been performed using laboratory tasks comparable to those assessing absolute pitch. In the present study, we operationalize AT as the ability to identify and reproduce tempo in the absence of rhythmic or melodic frames of reference and assess these abilities in musically trained and untrained participants. We asked 15 musicians and 15 non-musicians to listen to a seven-step `tempo scale' of metronome beats, each associated to a numerical label, and then to perform two memory tasks. In the first task, participants heard one of the tempi and attempted to report the correct label (identification task), in the second, they saw one label and attempted to tap the correct tempo (production task). A musical and visual excerpt was presented between successive trials as a distractor to prevent participants from using previous tempi as anchors. Thus, participants needed to encode tempo information with the corresponding label, store the information, and recall it to give the response. We found that more than half were able to perform above chance in at least one of the tasks, and that musical training differentiated between participants in identification, but not in production. These results suggest that AT is relatively wide-spread, relatively independent of musical training in tempo production, but further refined by training in tempo identification. We propose that at least in production, the underlying motor representations are related to tactus, a basic internal rhythmic period that may provide a body-based reference for encoding tempo.
Temkin, Bharti; Acosta, Eric; Malvankar, Ameya; Vaidyanath, Sreeram
2006-04-01
The Visible Human digital datasets make it possible to develop computer-based anatomical training systems that use virtual anatomical models (virtual body structures-VBS). Medical schools are combining these virtual training systems with classical anatomy teaching methods that use labeled images and cadaver dissection. In this paper we present a customizable web-based three-dimensional anatomy training system, W3D-VBS. W3D-VBS uses the National Library of Medicine's (NLM) Visible Human Male datasets to interactively locate, explore, select, extract, highlight, label, and visualize realistic 2D (using axial, coronal, and sagittal views) and 3D virtual structures. A real-time self-guided virtual tour of the entire body is designed to provide detailed anatomical information about structures, substructures, and proximal structures. The system thus facilitates learning of visuospatial relationships at a level of detail that may not be possible by any other means. The use of volumetric structures allows for repeated real-time virtual dissections, from any angle, at the convenience of the user. Volumetric (3D) virtual dissections are performed by adding, removing, highlighting, and labeling individual structures (and/or entire anatomical systems). The resultant virtual explorations (consisting of anatomical 2D/3D illustrations and animations), with user-selected highlighting colors and label positions, can be saved and used for generating lesson plans and evaluation systems. Tracking users' progress using the evaluation system helps customize the curriculum, making W3D-VBS a powerful learning tool. Our plan is to incorporate other Visible Human segmented datasets, especially datasets with higher resolutions, that make it possible to include finer anatomical structures such as nerves and small vessels. (c) 2006 Wiley-Liss, Inc.
A resource-saving collective approach to biomedical semantic role labeling
2014-01-01
Background Biomedical semantic role labeling (BioSRL) is a natural language processing technique that identifies the semantic roles of the words or phrases in sentences describing biological processes and expresses them as predicate-argument structures (PAS’s). Currently, a major problem of BioSRL is that most systems label every node in a full parse tree independently; however, some nodes always exhibit dependency. In general SRL, collective approaches based on the Markov logic network (MLN) model have been successful in dealing with this problem. However, in BioSRL such an approach has not been attempted because it would require more training data to recognize the more specialized and diverse terms found in biomedical literature, increasing training time and computational complexity. Results We first constructed a collective BioSRL system based on MLN. This system, called collective BIOSMILE (CBIOSMILE), is trained on the BioProp corpus. To reduce the resources used in BioSRL training, we employ a tree-pruning filter to remove unlikely nodes from the parse tree and four argument candidate identifiers to retain candidate nodes in the tree. Nodes not recognized by any candidate identifier are discarded. The pruned annotated parse trees are used to train a resource-saving MLN-based system, which is referred to as resource-saving collective BIOSMILE (RCBIOSMILE). Our experimental results show that our proposed CBIOSMILE system outperforms BIOSMILE, which is the top BioSRL system. Furthermore, our proposed RCBIOSMILE maintains the same level of accuracy as CBIOSMILE using 92% less memory and 57% less training time. Conclusions This greatly improved efficiency makes RCBIOSMILE potentially suitable for training on much larger BioSRL corpora over more biomedical domains. Compared to real-world biomedical corpora, BioProp is relatively small, containing only 445 MEDLINE abstracts and 30 event triggers. It is not large enough for practical applications, such as pathway construction. We consider it of primary importance to pursue SRL training on large corpora in the future. PMID:24884358
Computerized breast cancer analysis system using three stage semi-supervised learning method.
Sun, Wenqing; Tseng, Tzu-Liang Bill; Zhang, Jianying; Qian, Wei
2016-10-01
A large amount of labeled medical image data is usually required to train a well-performing computer-aided detection (CAD) system. But the process of data labeling is time consuming, and potential ethical and logistical problems may also present complications. As a result, incorporating unlabeled data into a CAD system can be a feasible way to combat these obstacles. In this study we developed a three-stage semi-supervised learning (SSL) scheme that combines a small amount of labeled data and a larger amount of unlabeled data. The scheme was implemented by modifying our existing CAD system using the following three stages: data weighing, feature selection, and a newly proposed dividing co-training data labeling algorithm. Global density asymmetry features were incorporated into the feature pool to reduce the false positive rate. Area under the curve (AUC) and accuracy were computed using a 10-fold cross validation method to evaluate the performance of our CAD system. The image dataset includes mammograms from 400 women who underwent routine screening examinations, and each pair contains either two cranio-caudal (CC) or two mediolateral-oblique (MLO) view mammograms from the right and the left breasts. From these mammograms 512 regions were extracted and used in this study, and among them 90 regions were treated as labeled while the rest were treated as unlabeled. Using our proposed scheme, the highest AUC observed in our research was 0.841, which included the 90 labeled data and all the unlabeled data. It was 7.4% higher than using labeled data only. With the increasing amount of labeled data, the AUC difference between using mixed data and using labeled data only reached its peak when the amount of labeled data was around 60. This study demonstrated that our proposed three-stage semi-supervised learning can improve CAD performance by incorporating unlabeled data. Using unlabeled data is promising in computerized cancer research and may have a significant impact for future CAD system applications. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
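As a rough illustration of the labeled-plus-unlabeled setup described above (90 labeled regions out of 512), the sketch below uses scikit-learn's generic self-training wrapper; it is not the authors' dividing co-training algorithm, and all features and labels are synthetic placeholders.

```python
# Minimal sketch of semi-supervised training with a small labeled set and a
# larger unlabeled set (unlabeled targets encoded as -1). This uses
# scikit-learn's generic self-training wrapper, not the paper's
# "dividing co-training" algorithm.
import numpy as np
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 10))            # e.g. features of mammogram regions
y = np.full(512, -1)                      # -1 marks unlabeled regions
labeled_idx = rng.choice(512, size=90, replace=False)
y[labeled_idx] = rng.integers(0, 2, size=90)   # 90 expert-labeled regions

model = SelfTrainingClassifier(SVC(probability=True), threshold=0.8)
model.fit(X, y)                           # pseudo-labels confident unlabeled points
print(model.predict(X[:5]))
```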
Le, Long N; Jones, Douglas L
2018-03-01
Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interferences), data samples. This learning can be done efficiently via tensorial dynamic time warping on the articulation index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that the proposed approach outperforms both (1) the recurrent neural network approach and (2) the state-of-the-art in the template-based approach on a wildlife detection application with few training samples.
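The sketch below illustrates the general template-matching idea with a plain dynamic time warping distance on 1-D feature sequences; the tensorial DTW and articulation-index representation of the paper are not reproduced, and the template names and signals are invented for illustration.

```python
# Minimal dynamic time warping (DTW) sketch for template-based audio
# classification: a candidate clip is assigned the label of the template with
# the smallest DTW distance. Inputs here are plain 1-D feature sequences.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

templates = {"bird_call": np.sin(np.linspace(0, 6, 80)),
             "noise": np.random.default_rng(0).normal(size=80)}
clip = np.sin(np.linspace(0, 6, 100)) + 0.1   # unseen clip, slightly shifted
print(min(templates, key=lambda k: dtw_distance(clip, templates[k])))
```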
Scene recognition based on integrating active learning with dictionary learning
NASA Astrophysics Data System (ADS)
Wang, Chengxi; Yin, Xueyan; Yang, Lin; Gong, Chengrong; Zheng, Caixia; Yi, Yugen
2018-04-01
Scene recognition is a significant topic in the field of computer vision. Most of the existing scene recognition models require a large amount of labeled training samples to achieve good performance. However, labeling images manually is a time-consuming task and often unrealistic in practice. In order to obtain satisfactory recognition results when labeled samples are insufficient, this paper proposes a scene recognition algorithm named Integrating Active Learning and Dictionary Learning (IALDL). IALDL adopts projective dictionary pair learning (DPL) as its classifier and introduces an active learning mechanism into DPL to improve its performance. When constructing its sampling criterion for active learning, IALDL considers both uncertainty and representativeness so as to effectively select useful unlabeled samples from a given sample set for expanding the training dataset. Experimental results on three standard databases demonstrate the feasibility and validity of the proposed IALDL.
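A minimal sketch of the kind of uncertainty-plus-representativeness sampling criterion described above is given below; it uses a logistic-regression classifier and k-means cluster centres as stand-ins, not the DPL classifier of IALDL, and all data are synthetic.

```python
# Minimal sketch: score each unlabeled sample by prediction uncertainty
# (entropy) plus representativeness (negative distance to its cluster centre),
# then query the top-scoring samples for labeling.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab, y_lab = rng.normal(size=(40, 16)), rng.integers(0, 4, size=40)
X_unlab = rng.normal(size=(400, 16))

clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
proba = clf.predict_proba(X_unlab)
uncertainty = -np.sum(proba * np.log(proba + 1e-12), axis=1)

centres = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X_unlab).cluster_centers_
dists = np.min(np.linalg.norm(X_unlab[:, None] - centres[None], axis=2), axis=1)
representativeness = -dists

score = uncertainty + representativeness
query_idx = np.argsort(score)[-10:]       # 10 most useful samples to label next
print(query_idx)
```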
Wang, Zhengxia; Zhu, Xiaofeng; Adeli, Ehsan; Zhu, Yingying; Nie, Feiping; Munsell, Brent
2018-01-01
Graph-based transductive learning (GTL) is a powerful machine learning technique that is used when sufficient training data is not available. In particular, conventional GTL approaches first construct a fixed inter-subject relation graph that is based on similarities in voxel intensity values in the feature domain, which can then be used to propagate the known phenotype data (i.e., clinical scores and labels) from the training data to the testing data in the label domain. However, this type of graph is exclusively learned in the feature domain, and primarily due to outliers in the observed features, may not be optimal for label propagation in the label domain. To address this limitation, a progressive GTL (pGTL) method is proposed that gradually finds an intrinsic data representation that more accurately aligns imaging features with the phenotype data. In general, optimal feature-to-phenotype alignment is achieved using an iterative approach that: (1) refines inter-subject relationships observed in the feature domain by using the learned intrinsic data representation in the label domain, (2) updates the intrinsic data representation from the refined inter-subject relationships, and (3) verifies the intrinsic data representation on the training data to guarantee an optimal classification when applied to testing data. Additionally, the iterative approach is extended to multi-modal imaging data to further improve pGTL classification accuracy. Using Alzheimer’s disease and Parkinson’s disease study data, the classification accuracy of the proposed pGTL method is compared to several state-of-the-art classification methods, and the results show pGTL can more accurately identify subjects, even at different progression stages, in these two study data sets. PMID:28551556
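For orientation, the sketch below shows the conventional graph-based transductive baseline that pGTL builds on: known labels are propagated over a similarity graph built in the feature domain. It uses scikit-learn's LabelSpreading with synthetic features and labels, and none of the progressive refinement steps of pGTL.

```python
# Minimal sketch of graph-based transductive learning: build a similarity graph
# over all subjects in the feature domain and propagate known labels to the
# unlabeled test subjects.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))        # e.g. imaging features per subject
y = np.full(200, -1)                  # -1 = unlabeled (test) subjects
y[:60] = rng.integers(0, 2, size=60)  # known clinical labels (training subjects)

model = LabelSpreading(kernel="rbf", gamma=0.5, alpha=0.2)
model.fit(X, y)
print(model.transduction_[60:70])     # propagated labels for some test subjects
```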
Feature-to-Feature Inference Under Conditions of Cue Restriction and Dimensional Correlation.
Lancaster, Matthew E; Homa, Donald
2017-01-01
The present study explored feature-to-feature and label-to-feature inference in a category task for different category structures. In the correlated condition, each of the 4 dimensions comprising the category was positively correlated to each other and to the category label. In the uncorrelated condition, no correlation existed between the 4 dimensions comprising the category, although the dimension to category label correlation matched that of the correlated condition. After learning, participants made inference judgments of a missing feature, given 1, 2, or 3 feature cues; on half the trials, the category label was also included as a cue. The results showed superior inference of features following training on the correlated structure, with accurate inference when only a single feature was presented. In contrast, a single-feature cue resulted in chance levels of inference for the uncorrelated structure. Feature inference systematically improved with number of cues after training on the correlated structure. Surprisingly, a similar outcome was obtained for the uncorrelated structure, an outcome that must have reflected mediation via the category label. A descriptive model is briefly introduced to explain the results, with a suggestion that this paradigm might be profitably extended to hierarchical structures where the levels of feature-to-feature inference might vary with the depth of the hierarchy.
Sedai, Suman; Garnavi, Rahil; Roy, Pallab; Xi Liang
2015-08-01
Multi-atlas segmentation first registers each atlas image to the target image and transfers the labels of the atlas image to the coordinate system of the target image. The transferred labels are then combined using a label fusion algorithm. In this paper, we propose a novel label fusion method which aggregates discriminative learning and generative modeling for segmentation of cardiac MR images. First, a probabilistic Random Forest classifier is trained as a discriminative model to obtain the prior probability of a label at the given voxel of the target image. Then, a probability distribution of image patches is modeled using a Gaussian Mixture Model for each label, providing the likelihood of the voxel belonging to the label. The final label posterior is obtained by combining the classification score and the likelihood score under Bayes' rule. A comparative study performed on the MICCAI 2013 SATA Segmentation Challenge demonstrates that our proposed hybrid label fusion algorithm is more accurate than five other state-of-the-art label fusion methods. The proposed method obtains Dice similarity coefficients of 0.94 and 0.92 in segmenting the epicardium and endocardium, respectively. Moreover, our label fusion method achieves more accurate segmentation results compared to four other label fusion methods.
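The Bayesian combination at the heart of the method can be sketched as follows: a Random Forest supplies the per-voxel label prior, a per-label Gaussian mixture supplies the patch likelihood, and the two are combined and normalized to give the fused posterior. Shapes, features, and label names below are placeholders, not the cardiac MR pipeline.

```python
# Minimal sketch of the hybrid fusion idea: prior P(label | voxel features)
# from a discriminative classifier, likelihood P(patch | label) from per-label
# Gaussian mixtures, combined per voxel under Bayes' rule (in log space).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
labels = [0, 1, 2]                                   # e.g. background, epi-, endocardium
X_train = rng.normal(size=(600, 12))                 # voxel features from atlases
y_train = rng.integers(0, 3, size=600)
patches_train = rng.normal(size=(600, 25))           # image patches around voxels

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
gmms = {c: GaussianMixture(n_components=3, random_state=0)
           .fit(patches_train[y_train == c]) for c in labels}

X_test, patches_test = rng.normal(size=(5, 12)), rng.normal(size=(5, 25))
prior = rf.predict_proba(X_test)                     # (n_voxels, n_labels)
loglik = np.column_stack([gmms[c].score_samples(patches_test) for c in labels])

log_post = np.log(prior + 1e-12) + loglik            # Bayes' rule in log space
log_post -= log_post.max(axis=1, keepdims=True)
posterior = np.exp(log_post)
posterior /= posterior.sum(axis=1, keepdims=True)
print(posterior.argmax(axis=1))                      # fused label per voxel
```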
Treatment of category generation and retrieval in aphasia: Effect of typicality of category items.
Kiran, Swathi; Sandberg, Chaleece; Sebastian, Rajani
2011-01-01
Purpose: Kiran and colleagues (Kiran, 2007, 2008; Kiran & Johnson, 2008; Kiran & Thompson, 2003) have previously suggested that training atypical examples within a semantic category is a more efficient treatment approach to facilitating generalization within the category than training typical examples. The present study extended our previous work examining the notion of semantic complexity within goal-derived (ad-hoc) categories in individuals with aphasia. Methods: Six individuals with fluent aphasia (range = 39-84 years) and varying degrees of naming deficits and semantic impairments were involved. Thirty typical and atypical items each from two categories were selected after an extensive stimulus norming task. Generative naming for the two categories was tested during baseline and treatment. Results: As predicted, training atypical examples in the category resulted in generalization to untrained typical examples in five out of the five patient-treatment conditions. In contrast, training typical examples (which was examined in three conditions) produced mixed results. One patient showed generalization to untrained atypical examples, whereas two patients did not show generalization to untrained atypical examples. Conclusions: Results of the present study supplement our existing data on the effect of a semantically based treatment for lexical retrieval by manipulating the typicality of category exemplars. PMID:21173393
Translation-aware semantic segmentation via conditional least-square generative adversarial networks
NASA Astrophysics Data System (ADS)
Zhang, Mi; Hu, Xiangyun; Zhao, Like; Pang, Shiyan; Gong, Jinqi; Luo, Min
2017-10-01
Semantic segmentation has recently made rapid progress in the fields of remote sensing and computer vision. However, many leading approaches cannot simultaneously translate label maps to possible source images with a limited number of training images. The core issues are insufficient adversarial information to interpret the inverse process and the lack of a proper objective loss function to overcome the vanishing gradient problem. We propose the use of conditional least-squares generative adversarial networks (CLS-GAN) to delineate visual objects and solve these problems. We trained the CLS-GAN network for semantic segmentation to discriminate dense prediction information either from training images or from generative networks. We show that the optimal objective function of CLS-GAN is a special class of f-divergence and yields a generator that lies on the decision boundary of the discriminator, which mitigates the vanishing gradient problem. We also demonstrate the effectiveness of the proposed architecture at translating images from label maps in the learning process. Experiments on a limited number of high-resolution images, including close-range and remote sensing datasets, indicate that the proposed method leads to improved semantic segmentation accuracy and can simultaneously generate high-quality images from label maps.
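The least-squares adversarial objective referred to above can be sketched as follows; the networks are tiny placeholders and the data are random tensors, so this only illustrates the loss formulation, not the CLS-GAN segmentation architecture.

```python
# Minimal sketch of a least-squares GAN objective: the discriminator is pushed
# toward real/fake targets with a squared loss instead of cross-entropy, which
# keeps gradients alive for samples far from the decision boundary.
import torch
import torch.nn as nn

D = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))  # discriminator
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 16))  # generator
mse = nn.MSELoss()

real = torch.randn(64, 16)                  # stand-in for real (image, label-map) pairs
fake = G(torch.randn(64, 8))

# Discriminator step: real -> 1, fake -> 0 (least-squares targets)
d_loss = mse(D(real), torch.ones(64, 1)) + mse(D(fake.detach()), torch.zeros(64, 1))

# Generator step: fool D, i.e. push D(fake) toward 1
g_loss = mse(D(fake), torch.ones(64, 1))
print(float(d_loss), float(g_loss))
```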
Wang, Jian-Gang; Sung, Eric; Yau, Wei-Yun
2011-07-01
Facial age classification is an approach to classify face images into one of several predefined age groups. One of the difficulties in applying learning techniques to the age classification problem is the large amount of labeled training data required. Acquiring such training data is very costly in terms of age progress, privacy, human time, and effort. Although unlabeled face images can be obtained easily, it would be expensive to manually label them on a large scale and to obtain the ground truth. The frugal selection of unlabeled data for labeling, so as to quickly reach high classification performance with minimal labeling effort, is a challenging problem. In this paper, we present an active learning approach based on an online incremental bilateral two-dimension linear discriminant analysis (IB2DLDA) which initially learns from a small pool of labeled data and then iteratively selects the most informative samples from the unlabeled set to increasingly improve the classifier. Specifically, we propose a novel data selection criterion called the furthest nearest-neighbor (FNN) that generalizes margin-based uncertainty to the multiclass case and is easy to compute, so that the proposed active learning algorithm can handle a large number of classes and large data sizes efficiently. Empirical experiments on the FG-NET and Morph databases, together with a large unlabeled data set, for age categorization problems show that the proposed approach can achieve results comparable to, or even better than, a conventionally trained active classifier that requires much more labeling effort. Our IB2DLDA-FNN algorithm can achieve similar results much faster than random selection and with fewer samples for age categorization. It can also achieve results comparable to active SVM but is much faster in terms of training because kernel methods are not needed. The results on the face recognition database and the palmprint/palm vein database showed that our approach can handle problems with a large number of classes. Our contributions in this paper are twofold. First, we proposed the IB2DLDA-FNN, the FNN being our novel idea, as a generic online or active learning paradigm. Second, we showed that it can be another viable tool for active learning of facial age range classification.
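One reading of the FNN criterion, sketched below with synthetic data, is to query the unlabeled samples whose nearest labeled neighbour is furthest away; the IB2DLDA classifier and the exact multiclass formulation of the paper are not reproduced.

```python
# Minimal sketch of the furthest nearest-neighbour (FNN) idea: for every
# unlabeled sample, find its distance to the closest labeled sample, then query
# the samples whose nearest labeled neighbour is furthest away.
import numpy as np

def fnn_query(X_labeled, X_unlabeled, n_queries=5):
    # pairwise distances: (n_unlabeled, n_labeled)
    d = np.linalg.norm(X_unlabeled[:, None, :] - X_labeled[None, :, :], axis=2)
    nearest = d.min(axis=1)                    # distance to closest labeled point
    return np.argsort(nearest)[-n_queries:]    # furthest nearest-neighbours

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(20, 64))              # small labeled pool (e.g. face features)
X_unlab = rng.normal(size=(1000, 64))
print(fnn_query(X_lab, X_unlab))               # indices to send to the human labeler
```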
Infants' Acceptance of Phonotactically Illegal Word Forms as Object Labels
ERIC Educational Resources Information Center
Vukatana, Ena; Curtin, Suzanne; Graham, Susan A.
2016-01-01
We investigated 16- and 20-month-olds' flexibility in mapping phonotactically illegal words to objects. Using an associative word-learning task, infants were presented with a training phase that either highlighted or did not highlight the referential status of a novel label. Infants were then habituated to two novel objects, each paired with a…
Nissim, Nir; Shahar, Yuval; Boland, Mary Regina; Tatonetti, Nicholas P; Elovici, Yuval; Hripcsak, George; Moskovitch, Robert
2018-01-01
Background and Objectives Labeling instances by domain experts for classification is often time consuming and expensive. To reduce such labeling efforts, we had proposed the application of active learning (AL) methods, introduced our CAESAR-ALE framework for classifying the severity of clinical conditions, and shown its significant reduction of labeling efforts. The use of any of three AL methods (one well known [SVM-Margin], and two that we introduced [Exploitation and Combination_XA]) significantly reduced (by 48% to 64%) condition labeling efforts, compared to standard passive (random instance-selection) SVM learning. Furthermore, our new AL methods achieved maximal accuracy using 12% fewer labeled cases than the SVM-Margin AL method. However, because labelers have varying levels of expertise, a major issue associated with learning methods, and AL methods in particular, is how best to use the labeling provided by a committee of labelers. First, we wanted to know, based on the labelers' learning curves, whether using AL methods (versus standard passive learning methods) has an effect on the intra-labeler variability (within the learning curve of each labeler) and inter-labeler variability (among the learning curves of different labelers). Then, we wanted to examine the effect of learning (either passively or actively) from the labels created by the majority consensus of a group of labelers. Methods We used our CAESAR-ALE framework for classifying the severity of clinical conditions, the three AL methods and the passive learning method, as mentioned above, to induce the classification models. We used a dataset of 516 clinical conditions and their severity labeling, represented by features aggregated from the medical records of 1.9 million patients treated at Columbia University Medical Center. We analyzed the variance of the classification performance within (intra-labeler), and especially among (inter-labeler), the classification models that were induced by using the labels provided by seven labelers. We also compared the performance of the passive and active learning models when using the consensus label. Results The AL methods produced, for the models induced from each labeler, smoother intra-labeler learning curves during the training phase, compared to the models produced when using the passive learning method. The mean standard deviation of the learning curves of the three AL methods over all labelers (mean: 0.0379; range: [0.0182 to 0.0496]) was significantly lower (p = 0.049) than the intra-labeler standard deviation when using the passive learning method (mean: 0.0484; range: [0.0275 to 0.0724]). Using the AL methods resulted in a lower mean inter-labeler AUC standard deviation among the AUC values of the labelers' different models during the training phase, compared to the variance of the induced models' AUC values when using passive learning. The inter-labeler AUC standard deviation, using the passive learning method (0.039), was almost twice as high as the inter-labeler standard deviation using our two new AL methods (0.02 and 0.019, respectively). The SVM-Margin AL method resulted in an inter-labeler standard deviation (0.029) that was higher by almost 50% than that of our two AL methods. The difference in the inter-labeler standard deviation between the passive learning method and the SVM-Margin learning method was significant (p = 0.042). 
The difference between the SVM-Margin and Exploitation methods was insignificant (p = 0.29), as was the difference between the Combination_XA and Exploitation methods (p = 0.67). Finally, using the consensus label led to a learning curve that had a higher mean intra-labeler variance, but resulted eventually in an AUC that was at least as high as the AUC achieved using the gold standard label and that was always higher than the expected mean AUC of a randomly selected labeler, regardless of the choice of learning method (including a passive learning method). Using a paired t-test, the difference between the intra-labeler AUC standard deviation when using the consensus label, versus that value when using the other two labeling strategies, was significant only when using the passive learning method (p = 0.014), but not when using any of the three AL methods. Conclusions The use of AL methods (a) reduces intra-labeler variability in the performance of the induced models during the training phase, and thus reduces the risk of halting the process at a local minimum that is significantly different in performance from the rest of the learned models; and (b) reduces inter-labeler performance variance, and thus reduces the dependence on the use of a particular labeler. In addition, the use of a consensus label, agreed upon by a rather uneven group of labelers, might be at least as good as using the gold standard labeler, who might not be available, and certainly better than randomly selecting one of the group's individual labelers. Finally, using the AL methods when provided with the consensus label reduced the intra-labeler AUC variance during the learning phase, compared to using passive learning. PMID:28456512
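Two ingredients discussed above can be sketched briefly: margin-based (SVM-Margin style) selection of the unlabeled conditions closest to the decision boundary, and a majority-consensus label computed from several labelers' votes. The Exploitation and Combination_XA methods are not reproduced, and all data below are synthetic.

```python
# Minimal sketch: (1) majority-consensus labels from a committee of labelers,
# (2) SVM-Margin style selection of the unlabeled instances closest to the
# decision boundary for the next labeling round.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_lab, X_unlab = rng.normal(size=(100, 20)), rng.normal(size=(416, 20))
votes = rng.integers(0, 2, size=(100, 7))      # 7 labelers' severity labels
consensus = (votes.mean(axis=1) >= 0.5).astype(int)   # majority consensus label

svm = LinearSVC(C=1.0).fit(X_lab, consensus)
margin = np.abs(svm.decision_function(X_unlab))
query_idx = np.argsort(margin)[:10]            # least-certain conditions to label next
print(query_idx)
```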
Integration and baseline management training and transition plan
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jech, J.B.
The purpose of the Integration and Baseline Management Training and Transition Plan is to provide a training outline for the Integration and Baseline Management (I and BM) organization and a transition strategy for the Master Equipment List (MEL) Phase 1 application. The training outline includes the following courses: MEL Phase 1 Application: Course 1, Master Equipment List General Overview; Course 2, Master Equipment List Editing; Tank Waste Remediation System (TWRS) Labeling Related: Course 3, TWRS Equipment Labeling Program (Course Number 350545). As part of courses 1, 2, and 3, it is recommended that a lesson plan be developed and integrated into each of the three courses on the subject of Configuration Management (CM), to include CM concepts, terminology, definitions, fundamentals, and its application with respect to the course. The strategy for the MEL Phase 1 application is to train internal organizations (I and BM) on the MEL-General Overview for read-only users and train MEL-Editing for edit users (only on an as-needed basis). For external organizations, the strategy is to train selected personnel on the MEL-General Overview and transition them from read-only privileges to editing privileges when the appropriate administrative procedures that outline the external organization's responsibilities (to support MEL) are established. The purpose of this training is to ensure support of the I and BM organization objectives within the TWRS Division. These training courses will be added to the existing required training for I and BM personnel only. Other organizations implementing the training will be directed by their management on which training is required.
Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning?
Tajbakhsh, Nima; Shin, Jae Y; Gurudu, Suryakanth R; Hurst, R Todd; Kendall, Christopher B; Gotway, Michael B; Jianming Liang
2016-05-01
Training a deep convolutional neural network (CNN) from scratch is difficult because it requires a large amount of labeled training data and a great deal of expertise to ensure proper convergence. A promising alternative is to fine-tune a CNN that has been pre-trained using, for instance, a large set of labeled natural images. However, the substantial differences between natural and medical images may advise against such knowledge transfer. In this paper, we seek to answer the following central question in the context of medical image analysis: Can the use of pre-trained deep CNNs with sufficient fine-tuning eliminate the need for training a deep CNN from scratch? To address this question, we considered four distinct medical imaging applications in three specialties (radiology, cardiology, and gastroenterology) involving classification, detection, and segmentation from three different imaging modalities, and investigated how the performance of deep CNNs trained from scratch compared with the pre-trained CNNs fine-tuned in a layer-wise manner. Our experiments consistently demonstrated that 1) the use of a pre-trained CNN with adequate fine-tuning outperformed or, in the worst case, performed as well as a CNN trained from scratch; 2) fine-tuned CNNs were more robust to the size of training sets than CNNs trained from scratch; 3) neither shallow tuning nor deep tuning was the optimal choice for a particular application; and 4) our layer-wise fine-tuning scheme could offer a practical way to reach the best performance for the application at hand based on the amount of available data.
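A minimal sketch of layer-wise fine-tuning in the spirit described above: load an ImageNet-pre-trained network, freeze the early layers, and train only the last block plus a new task head. The choice of backbone, the two-class head, and the frozen/unfrozen split are illustrative assumptions, and the medical data loaders are omitted.

```python
# Minimal sketch of layer-wise fine-tuning: start from an ImageNet-pre-trained
# CNN, freeze the early layers, and train only the deeper layers plus a new
# task-specific head.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # pre-trained on natural images

# Freeze everything first, then unfreeze the last block and the classifier head
for p in model.parameters():
    p.requires_grad = False
for p in model.layer4.parameters():
    p.requires_grad = True
model.fc = nn.Linear(model.fc.in_features, 2)      # e.g. lesion vs. no lesion

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(len(trainable), "parameter tensors will be fine-tuned")
```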
Group-Based Active Learning of Classification Models.
Luo, Zhipeng; Hauskrecht, Milos
2017-05-01
Learning of classification models from real-world data often requires additional human expert effort to annotate the data. However, this process can be rather costly, and finding ways of reducing the human annotation effort is critical for this task. The objective of this paper is to develop and study new ways of providing human feedback for efficient learning of classification models by labeling groups of examples. Briefly, unlike traditional active learning methods that seek feedback on individual examples, we develop a new group-based active learning framework that solicits label information on groups of multiple examples. In order to describe groups in a user-friendly way, conjunctive patterns are used to compactly represent groups. Our empirical study on 12 UCI data sets demonstrates the advantages and superiority of our approach over both classic instance-based active learning and existing group-based active learning methods.
Schmitz, Felix Michael; Schnabel, Kai Philipp; Stricker, Daniel; Fischer, Martin Rudolf; Guttormsen, Sissel
2017-06-01
Appropriate training strategies are required to equip undergraduate healthcare students to benefit from communication training with simulated patients. This study examines the learning effects of different formats of video-based worked examples on initial communication skills. First-year nursing students (N=36) were randomly assigned to one of two experimental groups (correct v. erroneous examples) or to the control group (no examples). All the groups were provided an identical introduction to learning materials on breaking bad news; the experimental groups also received a set of video-based worked examples. Each example was accompanied by a self-explanation prompt (considering the example's correctness) and elaborated feedback (the true explanation). Participants presented with erroneous examples broke bad news to a simulated patient significantly more appropriately than students in the control group. Additionally, they tended to outperform participants who had correct examples, while participants presented with correct examples tended to outperform the control group. The worked example effect was successfully adapted for learning in the provider-patient communication domain. Implementing video-based worked examples with self-explanation prompts and feedback can be an effective strategy to prepare students for their training with simulated patients, especially when examples are erroneous. Copyright © 2017 Elsevier B.V. All rights reserved.
Encoding probabilistic brain atlases using Bayesian inference.
Van Leemput, Koen
2009-06-01
This paper addresses the problem of creating probabilistic brain atlases from manually labeled training data. Probabilistic atlases are typically constructed by counting the relative frequency of occurrence of labels in corresponding locations across the training images. However, such an "averaging" approach generalizes poorly to unseen cases when the number of training images is limited, and provides no principled way of aligning the training datasets using deformable registration. In this paper, we generalize the generative image model implicitly underlying standard "average" atlases, using mesh-based representations endowed with an explicit deformation model. Bayesian inference is used to infer the optimal model parameters from the training data, leading to a simultaneous group-wise registration and atlas estimation scheme that encompasses standard averaging as a special case. We also use Bayesian inference to compare alternative atlas models in light of the training data, and show how this leads to a data compression problem that is intuitive to interpret and computationally feasible. Using this technique, we automatically determine the optimal amount of spatial blurring, the best deformation field flexibility, and the most compact mesh representation. We demonstrate, using 2-D training datasets, that the resulting models are better at capturing the structure in the training data than conventional probabilistic atlases. We also present experiments of the proposed atlas construction technique in 3-D, and show the resulting atlases' potential in fully-automated, pulse sequence-adaptive segmentation of 36 neuroanatomical structures in brain MRI scans.
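The "averaging" baseline that the paper generalizes can be sketched in a few lines: given co-registered training label maps, the atlas stores, at each voxel, the relative frequency of each label. The array sizes and label count below are arbitrary, and the mesh-based Bayesian model of the paper is not reproduced.

```python
# Minimal sketch of a probabilistic atlas built by counting, at each voxel, the
# relative frequency of each label across aligned training label maps.
import numpy as np

rng = np.random.default_rng(0)
n_subjects, shape, n_labels = 10, (32, 32), 4
label_maps = rng.integers(0, n_labels, size=(n_subjects, *shape))  # aligned segmentations

atlas = np.zeros((n_labels, *shape))
for k in range(n_labels):
    atlas[k] = (label_maps == k).mean(axis=0)    # P(label = k) at each voxel

assert np.allclose(atlas.sum(axis=0), 1.0)
print(atlas[:, 16, 16])                          # label probabilities at one voxel
```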
ERIC Educational Resources Information Center
Barratt-Pugh, Llandis
This paper outlines the background and preliminary findings of a study currently in progress in Perth, Western Australia, to investigate the relationship between competence based training and the development of lifelong learning skills. The paper explores both the underlying aims of competency-based training (CBT) and the educational antecedents…
Code of Federal Regulations, 2012 CFR
2012-01-01
... Section 317.17 Animals and Animal Products FOOD SAFETY AND INSPECTION SERVICE, DEPARTMENT OF AGRICULTURE... of such product. For example, curing mixtures composed of such ingredients as water, salt, sugar... thermally processed to Fo 3 or more; they have been fermented or pickled to pH of 4.6 or less; or they have...
Code of Federal Regulations, 2014 CFR
2014-01-01
... Section 317.17 Animals and Animal Products FOOD SAFETY AND INSPECTION SERVICE, DEPARTMENT OF AGRICULTURE... of such product. For example, curing mixtures composed of such ingredients as water, salt, sugar... thermally processed to Fo 3 or more; they have been fermented or pickled to pH of 4.6 or less; or they have...
Code of Federal Regulations, 2013 CFR
2013-01-01
... Section 317.17 Animals and Animal Products FOOD SAFETY AND INSPECTION SERVICE, DEPARTMENT OF AGRICULTURE... of such product. For example, curing mixtures composed of such ingredients as water, salt, sugar... thermally processed to Fo 3 or more; they have been fermented or pickled to pH of 4.6 or less; or they have...
Protein specific fluorescent microspheres for labelling a protein
NASA Technical Reports Server (NTRS)
Rembaum, Alan (Inventor)
1982-01-01
Highly fluorescent, stable and biocompatible microspheres are obtained by copolymerizing an acrylic monomer containing a covalent bonding group such as hydroxyl, amine or carboxyl, for example, hydroxyethylmethacrylate, with an addition polymerizable fluorescent comonomer such as dansyl allyl amine. A lectin or antibody is bound to the covalent site to provide cell specificity. When the microspheres are added to a cell suspension the marked microspheres will specifically label a cell membrane by binding to a specific receptor site thereon. The labeled membrane can then be detected by fluorescence of the fluorescent monomer.
Homogeneous Biosensing Based on Magnetic Particle Labels
Schrittwieser, Stefan; Pelaz, Beatriz; Parak, Wolfgang J.; Lentijo-Mozo, Sergio; Soulantica, Katerina; Dieckhoff, Jan; Ludwig, Frank; Guenther, Annegret; Tschöpe, Andreas; Schotter, Joerg
2016-01-01
The growing availability of biomarker panels for molecular diagnostics is leading to an increasing need for fast and sensitive biosensing technologies that are applicable to point-of-care testing. In that regard, homogeneous measurement principles are especially relevant as they usually do not require extensive sample preparation procedures, thus reducing the total analysis time and maximizing ease-of-use. In this review, we focus on homogeneous biosensors for the in vitro detection of biomarkers. Within this broad range of biosensors, we concentrate on methods that apply magnetic particle labels. The advantage of such methods lies in the added possibility to manipulate the particle labels by applied magnetic fields, which can be exploited, for example, to decrease incubation times or to enhance the signal-to-noise-ratio of the measurement signal by applying frequency-selective detection. In our review, we discriminate the corresponding methods based on the nature of the acquired measurement signal, which can either be based on magnetic or optical detection. The underlying measurement principles of the different techniques are discussed, and biosensing examples for all techniques are reported, thereby demonstrating the broad applicability of homogeneous in vitro biosensing based on magnetic particle label actuation. PMID:27275824
Neural-Network Computer Transforms Coordinates
NASA Technical Reports Server (NTRS)
Josin, Gary M.
1990-01-01
Numerical simulation demonstrated ability of conceptual neural-network computer to generalize what it has "learned" from few examples. Ability to generalize achieved with even simple neural network (relatively few neurons) and after exposure of network to only few "training" examples. Ability to obtain fairly accurate mappings after only few training examples used to provide solutions to otherwise intractable mapping problems.
Exploiting the potential of unlabeled endoscopic video data with self-supervised learning.
Ross, Tobias; Zimmerer, David; Vemuri, Anant; Isensee, Fabian; Wiesenfarth, Manuel; Bodenstedt, Sebastian; Both, Fabian; Kessler, Philip; Wagner, Martin; Müller, Beat; Kenngott, Hannes; Speidel, Stefanie; Kopp-Schneider, Annette; Maier-Hein, Klaus; Maier-Hein, Lena
2018-06-01
Surgical data science is a new research field that aims to observe all aspects of the patient treatment process in order to provide the right assistance at the right time. Due to the breakthrough successes of deep learning-based solutions for automatic image annotation, the availability of reference annotations for algorithm training is becoming a major bottleneck in the field. The purpose of this paper was to investigate the concept of self-supervised learning to address this issue. Our approach is guided by the hypothesis that unlabeled video data can be used to learn a representation of the target domain that boosts the performance of state-of-the-art machine learning algorithms when used for pre-training. The core of the method is an auxiliary task based on raw endoscopic video data of the target domain that is used to initialize the convolutional neural network (CNN) for the target task. In this paper, we propose the re-colorization of medical images with a conditional generative adversarial network (cGAN)-based architecture as the auxiliary task. A variant of the method involves a second pre-training step based on labeled data for the target task from a related domain. We validate both variants using medical instrument segmentation as the target task. The proposed approach can be used to radically reduce the manual annotation effort involved in training CNNs. Compared to the baseline approach of generating annotated data from scratch, our method reduced the number of labeled images required by up to 75% in our experiments without sacrificing performance. Our method also outperforms alternative methods for CNN pre-training, such as pre-training on publicly available non-medical (COCO) or medical data (MICCAI EndoVis2017 challenge) using the target task (in this instance: segmentation). As it makes efficient use of available (non-)public and (un-)labeled data, the approach has the potential to become a valuable tool for CNN (pre-)training.
Surface Plasmon Resonance: A Versatile Technique for Biosensor Applications
Nguyen, Hoang Hiep; Park, Jeho; Kang, Sebyung; Kim, Moonil
2015-01-01
Surface plasmon resonance (SPR) is a label-free detection method which has emerged during the last two decades as a suitable and reliable platform in clinical analysis for biomolecular interactions. The technique makes it possible to measure interactions in real-time with high sensitivity and without the need of labels. This review article discusses a wide range of applications in optical-based sensors using either surface plasmon resonance (SPR) or surface plasmon resonance imaging (SPRI). Here we summarize the principles, provide examples, and illustrate the utility of SPR and SPRI through example applications from the biomedical, proteomics, genomics and bioengineering fields. In addition, SPR signal amplification strategies and surface functionalization are covered in the review. PMID:25951336
NASA Astrophysics Data System (ADS)
Stumpf, A.; Lachiche, N.; Malet, J.; Kerle, N.; Puissant, A.
2011-12-01
VHR satellite images have become a primary source for landslide inventory mapping after major triggering events such as earthquakes and heavy rainfalls. Visual image interpretation is still the prevailing standard method for operational purposes, but it is time-consuming and not well suited to fully exploit the increasingly better supply of remote sensing data. Recent studies have addressed the development of more automated image analysis workflows for landslide inventory mapping. In particular, object-oriented approaches that account for spatial and textural image information have been demonstrated to be more adequate than pixel-based classification, but manually elaborated rule-based classifiers are difficult to adapt under changing scene characteristics. Machine learning algorithms allow learning classification rules for complex image patterns from labelled examples and can be adapted straightforwardly with available training data. In order to reduce the amount of costly training data, active learning (AL) has evolved as a key concept to guide the sampling for many applications. The underlying idea of AL is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and data structure to iteratively select the most valuable samples that should be labelled by the user. With relatively few queries and labelled samples, an AL strategy yields higher accuracies than an equivalent classifier trained with many randomly selected samples. This study addressed the development of an AL method for landslide mapping from VHR remote sensing images with special consideration of the spatial distribution of the samples. Our approach [1] is based on the Random Forest algorithm and considers the classifier uncertainty as well as the variance of potential sampling regions to guide the user towards the most valuable sampling areas. The algorithm explicitly searches for compact regions and thereby avoids the spatially disperse sampling pattern inherent to most other AL methods. The accuracy, the sampling time, and the computational runtime of the algorithm were evaluated on multiple satellite images capturing recent large-scale landslide events. Sampling between 1% and 4% of the study areas, accuracies between 74% and 80% were achieved, whereas standard sampling schemes yielded only accuracies between 28% and 50% with equal sampling costs. Compared to commonly used point-wise AL algorithms, the proposed approach significantly reduces the number of iterations and hence the computational runtime. Since the user can focus on relatively few compact areas (rather than on hundreds of distributed points), the overall labeling time is reduced by more than 50% compared to point-wise queries. An experimental evaluation of multiple expert mappings demonstrated strong relationships between the uncertainties of the experts and the machine learning model. It revealed that the achieved accuracies are within the range of the inter-expert disagreement and that it will be indispensable to consider ground truth uncertainties to truly achieve further enhancements in the future. The proposed method is generally applicable to a wide range of optical satellite images and landslide types. [1] A. Stumpf, N. Lachiche, J.-P. Malet, N. Kerle, and A. Puissant, Active learning in the spatial domain for remote sensing image classification, IEEE Transactions on Geoscience and Remote Sensing, 2013, DOI 10.1109/TGRS.2013.2262052.
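A stripped-down version of region-based active sampling is sketched below: per-pixel Random Forest uncertainty is aggregated over compact grid tiles, and the most uncertain tile is proposed to the expert. The tile grid, feature count, and omission of the variance term are simplifying assumptions relative to the published algorithm.

```python
# Minimal sketch of region-based active sampling: per-pixel class probabilities
# from a Random Forest are turned into an uncertainty map, which is aggregated
# over compact grid tiles to pick the next area for expert labeling.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
H, W, F = 64, 64, 6                              # image size and feature count
features = rng.normal(size=(H, W, F))
labeled = rng.integers(0, H * W, size=200)       # a few labeled pixels
X_lab = features.reshape(-1, F)[labeled]
y_lab = rng.integers(0, 2, size=200)             # landslide / no landslide

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_lab, y_lab)
proba = rf.predict_proba(features.reshape(-1, F)).reshape(H, W, 2)
uncertainty = 1.0 - proba.max(axis=2)            # per-pixel uncertainty

tile = 16                                        # evaluate compact 16x16 regions
scores = uncertainty.reshape(H // tile, tile, W // tile, tile).mean(axis=(1, 3))
print("next region to label (row, col):", np.unravel_index(scores.argmax(), scores.shape))
```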
Yip, Kevin Y.; Gerstein, Mark
2009-01-01
Motivation: An important problem in systems biology is reconstructing complete networks of interactions between biological objects by extrapolating from a few known interactions as examples. While there are many computational techniques proposed for this network reconstruction task, their accuracy is consistently limited by the small number of high-confidence examples, and the uneven distribution of these examples across the potential interaction space, with some objects having many known interactions and others few. Results: To address this issue, we propose two computational methods based on the concept of training set expansion. They work particularly effectively in conjunction with kernel approaches, which are a popular class of approaches for fusing together many disparate types of features. Both our methods are based on semi-supervised learning and involve augmenting the limited number of gold-standard training instances with carefully chosen and highly confident auxiliary examples. The first method, prediction propagation, propagates highly confident predictions of one local model to another as the auxiliary examples, thus learning from information-rich regions of the training network to help predict the information-poor regions. The second method, kernel initialization, takes the most similar and most dissimilar objects of each object in a global kernel as the auxiliary examples. Using several sets of experimentally verified protein–protein interactions from yeast, we show that training set expansion gives a measurable performance gain over a number of representative, state-of-the-art network reconstruction methods, and it can correctly identify some interactions that are ranked low by other methods due to the lack of training examples of the involved proteins. Contact: mark.gerstein@yale.edu Availability: The datasets and additional materials can be found at http://networks.gersteinlab.org/tse. PMID:19015141
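The kernel-initialization idea can be sketched as follows: for each object, its most similar neighbour in a global kernel is added as a positive auxiliary example and its most dissimilar one as a negative auxiliary example. The kernel below is synthetic, and the choice of one auxiliary example per polarity is an illustrative assumption.

```python
# Minimal sketch of "kernel initialization": use the most similar object as an
# assumed positive auxiliary example and the most dissimilar as an assumed
# negative one, to augment a sparse gold-standard training set.
import numpy as np

rng = np.random.default_rng(0)
n = 50
K = rng.random((n, n)); K = (K + K.T) / 2        # symmetric global similarity kernel
np.fill_diagonal(K, 1.0)

aux_pos, aux_neg = {}, {}
for i in range(n):
    others = np.delete(np.arange(n), i)
    aux_pos[i] = others[np.argmax(K[i, others])]  # most similar -> assumed interactor
    aux_neg[i] = others[np.argmin(K[i, others])]  # most dissimilar -> assumed non-interactor

print(aux_pos[0], aux_neg[0])
```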
Label-free functional nucleic acid sensors for detecting target agents
Lu, Yi; Xiang, Yu
2015-01-13
A general methodology to design label-free fluorescent functional nucleic acid sensors using a vacant site approach and an abasic site approach is described. In one example, a method is provided for designing label-free fluorescent functional nucleic acid sensors (e.g., those that include a DNAzyme, aptamer, or aptazyme) that have a tunable dynamic range through the introduction of an abasic site (e.g., dSpacer) or a vacant site into the functional nucleic acids. Also provided is a general method for designing label-free fluorescent aptamer sensors based on the regulation of malachite green (MG) fluorescence. A general method for designing label-free fluorescent catalytic and molecular beacons (CAMBs) is also provided. The methods demonstrated here can be used to design many other label-free fluorescent sensors to detect a wide range of analytes. Sensors and methods of using the disclosed sensors are also provided.
Garand, Linda; Lingler, Jennifer H.; Conner, Kyaien O.; Dew, Mary Amanda
2010-01-01
Health care professionals use diagnostic labels to classify individuals for both treatment and research purposes. Despite their clear benefits, diagnostic labels also serve as cues that activate stigma and stereotypes. Stigma associated with the diagnostic labels of dementia and mild cognitive impairment (MCI) can have a significant and negative impact on interpersonal relationships, interactions with the health care community, attitudes about service utilization, and participation in clinical research. The impact of stigma also extends to the family caregivers of individuals bearing such labels. In this article, we use examples from our investigations of individuals with dementia or MCI and their family caregivers to examine the impact of labeling and stigma on clinical research participation. We also discuss how stigma can affect numerous aspects of the nursing research process. Strategies are presented for addressing stigma-related barriers to participation in clinical research on dementia and MCI. PMID:20077972
Mitchell, Caroline; Dwyer, Rachel; Hagan, Teresa; Mathers, Nigel
2011-01-01
Background The National Institute for Health and Clinical Excellence (NICE) depression guideline (2004) and the updated Quality and Outcomes Framework (QOF) (2006) in general practice have introduced the concepts of screening, severity assessment (for example, using the Patient Health Questionnaire 9 (PHQ-9)), and ‘stepped care’ for depression. Aim To explore primary care practitioner perspectives on the clinical utility of the NICE guideline and the impact of the QOF on diagnosis and management of depression in routine practice. Design and setting Qualitative study using focus groups from four multidisciplinary practice teams with diverse populations in south Yorkshire. Method Four focus groups were conducted, using a topic guide and audiotaping. There were 38 participants: GPs, nurses, doctors in training, mental health workers, and a manager. Data analysis was iterative and thematic. Results The NICE guideline, with its embedded principles of holism and evidence-based practice, was viewed positively, but its impact was compromised by resource and practitioner barriers to implementation. The perceived imposition of the screening questions and severity assessments (PHQ-9), with no responsive training, had required practitioners to work hard to minimise negative impacts on their work, for example: constantly adapting consultations to tick boxes; avoiding triggering open displays of distress without the time to offer appropriate care; positively managing how their patients were labelled. Further confusion was experienced around the evolving content of psychological interventions for depression. Conclusion Organisational barriers to the implementation of the NICE guideline and the limited scope of the QOF highlight the need for policy makers to work more effectively with the complex realities of general practice in order to systematically improve the quality and delivery of ‘managed’ care for depression. PMID:21619752
Feature Acquisition with Imbalanced Training Data
NASA Technical Reports Server (NTRS)
Thompson, David R.; Wagstaff, Kiri L.; Majid, Walid A.; Jones, Dayton L.
2011-01-01
This work considers cost-sensitive feature acquisition that attempts to classify a candidate datapoint from incomplete information. In this task, an agent acquires features of the datapoint using one or more costly diagnostic tests, and eventually ascribes a classification label. A cost function describes both the penalties for feature acquisition and the misclassification errors. A common solution is a Cost-Sensitive Decision Tree (CSDT), a branching sequence of tests with features acquired at interior decision points and class assignment at the leaves. CSDTs can incorporate a wide range of diagnostic tests and can reflect arbitrary cost structures. They are particularly useful for online applications due to their low computational overhead. In this innovation, CSDTs are applied to cost-sensitive feature acquisition where the goal is to recognize very rare or unique phenomena in real time. Example applications from this domain include four areas. In stream processing, one seeks unique events in a real-time data stream that is too large to store. In fault protection, a system must adapt quickly to react to anticipated errors by triggering repair activities or follow-up diagnostics. With real-time sensor networks, one seeks to classify unique, new events as they occur. With observational sciences, a new generation of instrumentation seeks unique events through online analysis of large observational datasets. This work presents a solution based on transfer learning principles that permits principled CSDT learning while exploiting any prior knowledge of the designer to correct both between-class and within-class imbalance. Training examples are adaptively reweighted based on a decomposition of the data attributes. The result is a new, nonparametric representation that matches the anticipated attribute distribution for the target events.
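A minimal sketch of the between-class part of the reweighting idea under simplifying assumptions: each training example is weighted so that the effective class distribution moves toward an anticipated prior before an ordinary decision tree is fit. The prior, the depth limit, and the absence of per-test acquisition costs are assumptions, not the paper's CSDT formulation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_reweighted_tree(X, y, anticipated_prior=None, max_depth=5):
    """Fit a decision tree after reweighting examples toward an anticipated
    class prior.

    anticipated_prior -- assumed dict {class_label: probability}; defaults to
    a uniform prior over the observed classes.
    """
    classes, counts = np.unique(y, return_counts=True)
    empirical = dict(zip(classes, counts / counts.sum()))
    if anticipated_prior is None:
        anticipated_prior = {c: 1.0 / len(classes) for c in classes}
    weights = np.array([anticipated_prior[c] / empirical[c] for c in y])
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(X, y, sample_weight=weights)
    return tree
```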
Modeling-Mainstreaming: A Teacher Training Proposal.
ERIC Educational Resources Information Center
Bireley, Marlene; Mahan, Virginia
This document presents a learning model for training teachers to effectively deal with physically handicapped and mildly retarded children in their regular classroom. The modules are organized in the following fashion: Phase One: development of an awareness of the concept of mainstreaming, of labels and their consequences, and of the psychological…
ERIC Educational Resources Information Center
Barrera, Richardo D.; Sulzer-Azaroff, Beth
1983-01-01
Comparison of the relative effectiveness of oral and total communication training models for teaching expressive labeling skills to three echolalic autistic children (six to nine years old) demonstrated that total communication was the more successful approach with each of the Ss. (Author/CL)
Couple Graph Based Label Propagation Method for Hyperspectral Remote Sensing Data Classification
NASA Astrophysics Data System (ADS)
Wang, X. P.; Hu, Y.; Chen, J.
2018-04-01
Graph-based semi-supervised classification methods are widely used for hyperspectral image classification. We present a couple-graph-based label propagation method, which contains both an adjacency graph and a similarity graph. We propose to construct the similarity graph using a similarity probability, which exploits the likely label similarity among examples. The adjacency graph is built with a common manifold learning method, which effectively improves the classification accuracy on hyperspectral data. The experiments indicate that the couple graph Laplacian, which unites the adjacency graph and the similarity graph, produces superior classification results compared with other manifold-learning-based and sparse-representation-based graph Laplacians in the label propagation framework.
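A compact sketch of label propagation over a weighted combination of two graphs; the mixing weights, the normalisation, and the iterative label-spreading update are generic choices for illustration and are not claimed to match the paper's couple graph Laplacian.

```python
import numpy as np

def couple_graph_propagate(W_adj, W_sim, Y, alpha=0.5, beta=0.5, iters=50):
    """Propagate labels over a weighted combination of two graphs.

    W_adj, W_sim -- (n, n) symmetric affinity matrices (adjacency / similarity)
    Y            -- (n, c) one-hot label matrix, all-zero rows for unlabeled samples
    Returns predicted class indices for all n samples.
    """
    W = alpha * W_adj + beta * W_sim
    d = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    S = W * np.outer(d, d)                  # symmetric normalisation D^-1/2 W D^-1/2
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = 0.9 * (S @ F) + 0.1 * Y         # standard label-spreading update
    return F.argmax(axis=1)
```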
(Machine) learning to do more with less
NASA Astrophysics Data System (ADS)
Cohen, Timothy; Freytsis, Marat; Ostdiek, Bryan
2018-02-01
Determining the best method for training a machine learning algorithm is critical to maximizing its ability to classify data. In this paper, we compare the standard "fully supervised" approach (which relies on knowledge of event-by-event truth-level labels) with a recent proposal that instead utilizes class ratios as the only discriminating information provided during training. This so-called "weakly supervised" technique has access to less information than the fully supervised method and yet is still able to yield impressive discriminating power. In addition, weak supervision seems particularly well suited to particle physics since quantum mechanics is incompatible with the notion of mapping an individual event onto any single Feynman diagram. We examine the technique in detail — both analytically and numerically — with a focus on the robustness to issues of mischaracterizing the training samples. Weakly supervised networks turn out to be remarkably insensitive to a class of systematic mismodeling. Furthermore, we demonstrate that the event level outputs for weakly versus fully supervised networks are probing different kinematics, even though the numerical quality metrics are essentially identical. This implies that it should be possible to improve the overall classification ability by combining the output from the two types of networks. For concreteness, we apply this technology to a signature of beyond the Standard Model physics to demonstrate that all these impressive features continue to hold in a scenario of relevance to the LHC. Example code is provided on GitHub.
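A minimal sketch of the weak-supervision ingredient, training on class ratios only: the batch-level loss below compares the mean predicted signal probability to the known signal fraction of that training sample. This is the generic learning-from-label-proportions form; the exact objective used in the paper may differ.

```python
import torch

def proportion_loss(logits, signal_fraction):
    """Weak-supervision loss: only the signal fraction of the batch is known.

    logits          -- (batch,) raw network outputs for one mixed sample
    signal_fraction -- known proportion of signal events in that sample
    Penalises the gap between the mean predicted signal probability and the
    known class ratio.
    """
    mean_signal_prob = torch.sigmoid(logits).mean()
    return (mean_signal_prob - signal_fraction) ** 2
```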
NASA Astrophysics Data System (ADS)
Xue, Di-Xiu; Zhang, Rong; Zhao, Yuan-Yuan; Xu, Jian-Ming; Wang, Ya-Lei
2017-07-01
Cancer recognition is the prerequisite for determining appropriate treatment. This paper focuses on the semantic segmentation of microvascular morphological types in narrow-band images to aid the clinical examination of esophageal cancer. The main challenge for semantic segmentation here is incomplete labeling. Our key insight is to build fully convolutional networks (FCNs) with a double label to make pixel-wise predictions. The roi-label, indicating ROIs (regions of interest), is introduced as an extra constraint to guide feature learning. Trained end-to-end, the FCN model with the two targets jointly optimizes both the segmentation of the sem-label (semantic label) and the segmentation of the roi-label within a self-transfer learning framework based on multi-task learning theory. The representation learning ability of the shared convolutional layers for the sem-label is improved with the support of the roi-label, which provides a better understanding of information outside the ROIs. Our best FCN model gives satisfactory segmentation results with a mean IU of up to 77.8% (pixel accuracy > 90%). The results show that the proposed approach is able to assist clinical diagnosis to a certain extent.
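A sketch of what a double-label (multi-task) loss for such a two-headed FCN might look like; the head names, the loss weighting, and the use of an ignore index for incompletely labeled pixels are assumptions for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def double_label_loss(sem_logits, roi_logits, sem_target, roi_target, w_roi=0.5):
    """Joint loss for a two-headed FCN sharing one encoder.

    sem_logits -- (N, C, H, W) logits from the semantic (sem-label) head
    roi_logits -- (N, 1, H, W) logits from the ROI (roi-label) head
    sem_target -- (N, H, W) long tensor; unlabeled pixels set to 255
    roi_target -- (N, 1, H, W) float tensor of ROI masks
    """
    sem_loss = F.cross_entropy(sem_logits, sem_target, ignore_index=255)
    roi_loss = F.binary_cross_entropy_with_logits(roi_logits, roi_target)
    return sem_loss + w_roi * roi_loss   # w_roi is an assumed weighting
```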
... skills. For example, calculating cholesterol and blood sugar levels, measuring medications, and understanding nutrition labels all require math skills. Choosing between health plans or comparing prescription ...
1988-07-01
MSDS or Material Safety Data Sheets from our suppliers and we are required to provide the same for our customers. We are required to train our personnel...non-sparking tools. Labels: We protect our customers by labeling our materials in accordance with the NPCA Labeling Guide, which is at least as...stringent as any federal or local regulations, by providing Material Safety Data Sheets and by providing customer assistance when requested regarding safe
Young Children's Ability to Use Ordinal Labels in a Spatial Search Task
ERIC Educational Resources Information Center
Miller, Stephanie E.; Marcovitch, Stuart; Boseovski, Janet J.; Lewkowicz, David J.
2015-01-01
The use and understanding of ordinal terms (e.g., "first" and "second") is a developmental milestone that has been relatively unexplored in the preschool age range. In the present study, 4- and 5-year-olds watched as a reward was placed in one of three train cars labeled by the experimenter with an ordinal (e.g.,…
ERIC Educational Resources Information Center
Dogoe, Maud S.; Banda, Devender R.; Lock, Robin H.; Feinstein, Rita
2011-01-01
This study examined the effectiveness of the constant timed delay procedure for teaching two young adults with autism to read, define, and state the contextual meaning of keywords on product warning labels of common household products. Training sessions were conducted in the dyad format using flash cards. Results indicated that both participants…
Engineering the biological conversion of methanol to specialty chemicals in Escherichia coli
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitaker, W. Brian; Jones, J. Andrew; Bennett, R. Kyle
Methanol is an attractive substrate for biological production of chemicals and fuels. Engineering methylotrophic Escherichia coli as a platform organism for converting methanol to metabolites is desirable. Prior efforts to engineer methylotrophic E. coli were limited by methanol dehydrogenases (Mdhs) with unfavorable enzyme kinetics. We engineered E. coli to utilize methanol using a superior NAD-dependent Mdh from Bacillus stearothermophilus and ribulose monophosphate (RuMP) pathway enzymes from B. methanolicus. Using 13C-labeling, we demonstrate this E. coli strain converts methanol into biomass components. For example, the key TCA cycle intermediates, succinate and malate, exhibit labeling up to 39%, while the lower glycolytic intermediate, 3-phosphoglycerate, up to 53%. Multiple carbons are labeled for each compound, demonstrating a cycling RuMP pathway for methanol assimilation to support growth. In conclusion, by incorporating the pathway to synthesize the flavanone naringenin, we demonstrate the first example of in vivo conversion of methanol into a specialty chemical in E. coli.
Engineering the biological conversion of methanol to specialty chemicals in Escherichia coli.
Whitaker, W Brian; Jones, J Andrew; Bennett, R Kyle; Gonzalez, Jacqueline E; Vernacchio, Victoria R; Collins, Shannon M; Palmer, Michael A; Schmidt, Samuel; Antoniewicz, Maciek R; Koffas, Mattheos A; Papoutsakis, Eleftherios T
2017-01-01
Methanol is an attractive substrate for biological production of chemicals and fuels. Engineering methylotrophic Escherichia coli as a platform organism for converting methanol to metabolites is desirable. Prior efforts to engineer methylotrophic E. coli were limited by methanol dehydrogenases (Mdhs) with unfavorable enzyme kinetics. We engineered E. coli to utilize methanol using a superior NAD-dependent Mdh from Bacillus stearothermophilus and ribulose monophosphate (RuMP) pathway enzymes from B. methanolicus. Using 13 C-labeling, we demonstrate this E. coli strain converts methanol into biomass components. For example, the key TCA cycle intermediates, succinate and malate, exhibit labeling up to 39%, while the lower glycolytic intermediate, 3-phosphoglycerate, up to 53%. Multiple carbons are labeled for each compound, demonstrating a cycling RuMP pathway for methanol assimilation to support growth. By incorporating the pathway to synthesize the flavanone naringenin, we demonstrate the first example of in vivo conversion of methanol into a specialty chemical in E. coli. Copyright © 2016 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Deep learning based beat event detection in action movie franchises
NASA Astrophysics Data System (ADS)
Ejaz, N.; Khan, U. A.; Martínez-del-Amor, M. A.; Sparenberg, H.
2018-04-01
Automatic understanding and interpretation of movies can be used in a variety of ways to semantically manage massive volumes of movie data. The "Action Movie Franchises" dataset is a collection of twenty Hollywood action movies from five famous franchises, with ground-truth annotations at the shot and beat level of each movie. In this dataset, annotations are provided for eleven semantic beat categories. In this work, we propose a deep learning based method to classify shots and beat events on this dataset. A training dataset for each of the eleven beat categories is developed, and a Convolutional Neural Network is then trained. After finding the shot boundaries, key frames are extracted for each shot and three classification labels are assigned to each key frame. The classification labels of the key frames in a particular shot are then used to assign a unique label to that shot. A simple sliding-window-based method is then used to group adjacent shots having the same label in order to find a particular beat event. The results of beat event classification are presented in terms of precision, recall, and F-measure. The results are compared with an existing technique, and significant improvements are recorded.
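A minimal sketch of the grouping step: adjacent shots that carry the same label are merged into a single beat event. The minimum run length is an assumed parameter, not taken from the paper.

```python
from itertools import groupby

def group_shots_into_beats(shot_labels, min_run=2):
    """Merge adjacent shots with the same label into beat events.

    Returns (label, first_shot_index, last_shot_index) tuples; runs shorter
    than min_run (an assumed parameter) are discarded.
    """
    beats, idx = [], 0
    for label, run in groupby(shot_labels):
        length = len(list(run))
        if length >= min_run:
            beats.append((label, idx, idx + length - 1))
        idx += length
    return beats

# e.g. group_shots_into_beats(["fight", "fight", "chase", "chase", "chase"])
# -> [("fight", 0, 1), ("chase", 2, 4)]
```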
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Nan; Zheng, Nina; Fridley, David
2012-02-28
Appliance energy efficiency standards and labeling (S&L) programs have been important policy tools for regulating the efficiency of energy-using products for over 40 years and continue to expand in terms of geographic and product coverage. The most common S&L programs include mandatory minimum energy performance standards (MEPS) that seek to push the market for efficient products, and energy information and endorsement labels that seek to pull the market. This study seeks to review and compare some of the earliest and most well-developed S&L programs in three countries and one region: the U.S. MEPS and ENERGY STAR, Australia MEPS and Energy Label, European Union MEPS and Ecodesign requirements and Energy Label, and Japanese Top Runner programs. For each program, key elements of S&L programs are evaluated and comparative analyses across the programs undertaken to identify best practice examples of individual elements as well as cross-cutting factors for success and lessons learned in international S&L program development and implementation. The international review and comparative analysis identified several overarching themes and highlighted some common factors behind successful program elements. First, standard-setting and programmatic implementation can benefit significantly from a legal framework that stipulates a specific timeline or schedule for standard-setting and revision, product coverage and legal sanctions for non-compliance. Second, the different MEPS programs revealed similarities in targeting efficiency gains that are technically feasible and economically justified as the principle for choosing a standard level, in many cases at a level that no product on the current market could reach. Third, detailed survey data such as the U.S. Residential Energy Consumption Survey (RECS) and rigorous analyses provide a strong foundation for standard-setting, while incorporating the participation of different groups of stakeholders further strengthens the process. Fourth, sufficient program resources for program implementation and evaluation are critical to the effectiveness of standards and labeling programs, and cost-sharing between national and local governments can help ensure adequate resources and uniform implementation. Lastly, check-testing and punitive measures are important forms of enforcement, while the cancellation of registration or product sales-based fines have also proven effective in reducing non-compliance. The international comparative analysis also revealed the differing degree to which the level of government decentralization has influenced S&L programs, and while no single country has best practices in all elements of standards and labeling development and implementation, national examples of best practices for specific elements do exist. For example, the U.S. has exemplified the use of rigorous analyses for standard-setting and a robust data source with the RECS database, while Japan's Top Runner standard-setting principle has motivated manufacturers to exceed targets. In terms of standards implementation and enforcement, Australia has demonstrated success with enforcement given its long history of check-testing and enforcement initiatives, while mandatory information-sharing between EU jurisdictions on compliance results is another important enforcement mechanism.
These examples show that it is important to evaluate not only the drivers of different paths of standards and labeling development, but also the country-specific context for best practice examples in order to understand how and why certain elements of specific S&L programs have been effective.
Impact of Health Labels on Flavor Perception and Emotional Profiling: A Consumer Study on Cheese
Schouteten, Joachim J.; De Steur, Hans; De Pelsmaeker, Sara; Lagast, Sofie; De Bourdeaudhuij, Ilse; Gellynck, Xavier
2015-01-01
The global increase of cardiovascular diseases is linked to the shift towards unbalanced diets with increasing salt and fat intake. This has led to growing consumer interest in more balanced food products, which explains the growing number of health-related claims on food products (e.g., “low in salt” or “light”). Based on a within-subjects design, consumers (n = 129) evaluated the same cheese product with different labels. Participants rated liking, saltiness and fat flavor intensity before and after consuming four labeled cheeses. Even though the cheese products were identical, inclusion of health labels influenced consumer perceptions. Cheese with a “light” label had a lower overall expected and perceived liking compared to regular cheese. Although cheese with a “salt reduced” label had a lower expected liking compared to regular cheese, no lower liking was found when consumers actually consumed the labeled cheese. All labels also influenced the perceived intensities of the attributes related to those labels, for example, salt intensity for the salt-reduced label. While the emotional profiles of the labeled cheeses differed before tasting, few differences were found when actually tasting these cheeses. In conclusion, this study shows that health-related labels might influence the perceived flavor and emotional profiles of cheese products. PMID:26690211
NASA Astrophysics Data System (ADS)
Wang, Han; Yan, Jie; Liu, Yongqian; Han, Shuang; Li, Li; Zhao, Jing
2017-11-01
Increasing the accuracy of wind speed prediction lays a solid foundation for reliable wind power forecasting. Most traditional correction methods for wind speed prediction establish a mapping between the wind speed of the numerical weather prediction (NWP) and the historical measurement data (HMD) at the corresponding time slot, which neglects the time-dependent structure of the wind speed time series. In this paper, a multi-step-ahead wind speed prediction correction method is proposed that takes into account the carry-over effects of the wind speed at the previous time slot. To this end, the proposed method employs both NWP and HMD as model inputs and training labels. First, a probabilistic analysis of the NWP deviation for different wind speed bins is carried out to illustrate the inadequacy of the traditional time-independent mapping strategy. Then, a support vector machine (SVM) is used as an example to implement the proposed mapping strategy and to establish a correction model for each wind speed bin. A wind farm in northern China is taken as an example to validate the proposed method, and three benchmark wind speed prediction methods are used to compare performance. The results show that the proposed model has the best performance over different time horizons.
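A rough sketch of this per-bin correction idea using support vector regression; the feature layout (NWP at time t plus the measurement at t-1), the bin edges, and the SVR hyperparameters are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVR

def fit_bin_correctors(nwp, hmd, bin_edges):
    """Train one SVR corrector per wind-speed bin.

    nwp, hmd  -- aligned 1-D arrays of forecast and measured wind speed
    bin_edges -- assumed bin boundaries for the NWP speed
    Each model maps (NWP_t, HMD_{t-1}) to the measured speed HMD_t.
    """
    X = np.column_stack([nwp[1:], hmd[:-1]])   # forecast now + measurement before
    y = hmd[1:]
    bins = np.digitize(nwp[1:], bin_edges)
    models = {}
    for b in np.unique(bins):
        mask = bins == b
        models[b] = SVR(kernel="rbf", C=10.0).fit(X[mask], y[mask])
    return models
```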
Fuzzy logic and neural networks in artificial intelligence and pattern recognition
NASA Astrophysics Data System (ADS)
Sanchez, Elie
1991-10-01
With the use of fuzzy logic techniques, neural computing can be integrated into symbolic reasoning to solve complex real-world problems. In fact, artificial neural networks, expert systems, and fuzzy logic systems, in the context of approximate reasoning, share common features and techniques. A model of a Fuzzy Connectionist Expert System is introduced, in which an artificial neural network is designed to construct the knowledge base of an expert system from training examples (this model can also be used for the specification of rules in fuzzy logic control). Two types of weights are associated with the synaptic connections in an AND-OR structure: primary linguistic weights, interpreted as labels of fuzzy sets, and secondary numerical weights. Cell activation is computed through min-max fuzzy equations of the weights. Learning consists of finding the (numerical) weights and the network topology. This feedforward network is described and first illustrated in a biomedical application (medical diagnosis assistance from inflammatory-syndrome/protein profiles). It is then shown how this methodology can be used for handwritten pattern recognition (characters play the role of diagnoses): in a fuzzy neuron describing a digit, for example, the linguistic weights represent fuzzy sets on cross-detecting lines and the numerical weights reflect the importance (or weakness) of the connections between cross-detecting lines and characters.
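A tiny sketch of the min-max activation described above: each input membership is combined with its numerical weight by a fuzzy AND (minimum), and the results are aggregated by a fuzzy OR (maximum). The vector shapes and values are purely illustrative.

```python
import numpy as np

def fuzzy_and_or_activation(memberships, weights):
    """Min-max activation of one AND-OR fuzzy unit.

    memberships -- degrees of membership of the inputs in the linguistic
                   labels attached to the connections, e.g. [0.2, 0.9, 0.6]
    weights     -- secondary numerical weights of the same length
    AND (minimum) combines each membership with its weight; OR (maximum)
    aggregates across connections.
    """
    memberships = np.asarray(memberships, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.max(np.minimum(memberships, weights)))
```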
NASA Astrophysics Data System (ADS)
Oza, Nikunj
2012-03-01
A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. A set of training examples—examples with known output values—is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate’s measurements. The generalization performance of a learned model (how closely the target outputs and the model’s predicted outputs agree for patterns that have not been presented to the learning algorithm) would provide an indication of how well the model has learned the desired mapping. More formally, a classification learning algorithm L takes a training set T as its input. The training set consists of |T| examples or instances. It is assumed that there is a probability distribution D from which all training examples are drawn independently—that is, all the training examples are independently and identically distributed (i.i.d.). The ith training example is of the form (x_i, y_i), where x_i is a vector of values of several features and y_i represents the class to be predicted.* In the sunspot classification example given above, each training example would represent one sunspot’s classification (y_i) and the corresponding set of measurements (x_i). The output of a supervised learning algorithm is a model h that approximates the unknown mapping from the inputs to the outputs. In our example, h would map from the sunspot measurements to the type of sunspot. We may have a test set S—a set of examples not used in training that we use to test how well the model h predicts the outputs on new examples. Just as with the examples in T, the examples in S are assumed to be i.i.d. draws from the distribution D. We measure the error of h on the test set as the proportion of test cases that h misclassifies: (1/|S|) Σ_{(x, y) ∈ S} I(h(x) ≠ y), where I(v) is the indicator function—it returns 1 if v is true and 0 otherwise. In our sunspot classification example, we would identify additional examples of sunspots that were not used in generating the model, and use these to determine how accurate the model is—the fraction of the test samples that the model classifies correctly. An example of a classification model is the decision tree shown in Figure 23.1. We will discuss the decision tree learning algorithm in more detail later—for now, we assume that, given a training set with examples of sunspots, this decision tree is derived. This tree can be used to classify previously unseen examples of sunspots.
For example, if a new sunspot’s inputs indicate that its "Group Length" is in the range 10-15, then the decision tree would classify the sunspot as being of type “E,” whereas if the "Group Length" is "NULL," the "Magnetic Type" is "bipolar," and the "Penumbra" is "rudimentary," then it would be classified as type "C." In this chapter, we will add to the above description of classification problems. We will discuss decision trees and several other classification models. In particular, we will discuss the learning algorithms that generate these classification models, how to use them to classify new examples, and the strengths and weaknesses of these models. We will end with pointers to further reading on classification methods applied to astronomy data.
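A minimal sketch of the test-set error measure defined above, the fraction of examples in S that the model h misclassifies; the `predict` interface is an assumption (any classifier exposing such a method would work).

```python
import numpy as np

def test_error(model, X_test, y_test):
    """(1/|S|) * sum over (x, y) in S of I(h(x) != y): the fraction of test
    examples the learned model h misclassifies."""
    predictions = model.predict(X_test)
    return float(np.mean(predictions != np.asarray(y_test)))
```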
Fast max-margin clustering for unsupervised word sense disambiguation in biomedical texts
Duan, Weisi; Song, Min; Yates, Alexander
2009-01-01
Background We aim to solve the problem of determining word senses for ambiguous biomedical terms with minimal human effort. Methods We build a fully automated system for Word Sense Disambiguation by designing a system that does not require manually-constructed external resources or manually-labeled training examples except for a single ambiguous word. The system uses a novel and efficient graph-based algorithm to cluster words into groups that have the same meaning. Our algorithm follows the principle of finding a maximum margin between clusters, determining a split of the data that maximizes the minimum distance between pairs of data points belonging to two different clusters. Results On a test set of 21 ambiguous keywords from PubMed abstracts, our system has an average accuracy of 78%, outperforming a state-of-the-art unsupervised system by 2% and a baseline technique by 23%. On a standard data set from the National Library of Medicine, our system outperforms the baseline by 6% and comes within 5% of the accuracy of a supervised system. Conclusion Our system is a novel, state-of-the-art technique for efficiently finding word sense clusters, and does not require training data or human effort for each new word to be disambiguated. PMID:19344480
Self-Supervised Chinese Ontology Learning from Online Encyclopedias
Shao, Zhiqing; Ruan, Tong
2014-01-01
Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised-learning-based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. In order to avoid the errors in encyclopedias and to enrich the learnt ontology, we also apply some machine-learning-based methods. First, we show, both statistically and experimentally, that the self-supervised machine learning method is practicable for Chinese relation extraction (at least for synonymy and hyponymy) and train several self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantage of our methods is that all training examples are generated automatically from the structural information of the encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and high coverage is concluded by comparing SSCO with other famous ontologies and knowledge bases; the experimental results also indicate that the self-supervised models substantially enrich SSCO. PMID:24715819
Basaruddin, T.
2016-01-01
One essential task in information extraction from the medical corpus is drug name recognition. Compared with text sources from other domains, medical text mining poses more challenges, for example, more unstructured text, the rapid addition of new terms, a wide range of name variations for the same drug, the lack of labeled datasets and external knowledge, and multiple-token representations for a single drug name. Although many approaches have been proposed to address the task, some problems remain and F-score performance is poor (less than 0.75). This paper presents a new treatment of data representation techniques to overcome some of these challenges. We propose three data representation techniques based on the characteristics of word distribution and word similarities obtained from word embedding training. The first technique is evaluated with a standard NN model, an MLP. The second technique involves two deep network classifiers, a DBN and an SAE. The third technique represents the sentence as a sequence and is evaluated with a recurrent NN model, an LSTM. In extracting drug name entities, the third technique gives the best F-score performance compared to the state of the art, with an average F-score of 0.8645. PMID:27843447
Pervasive Sound Sensing: A Weakly Supervised Training Approach.
Kelly, Daniel; Caulfield, Brian
2016-01-01
Modern smartphones present an ideal device for pervasive sensing of human behavior. Microphones have the potential to reveal key information about a person's behavior. However, they have been utilized to a significantly lesser extent than other smartphone sensors in the context of human behavior sensing. We postulate that, in order for microphones to be useful in behavior sensing applications, the analysis techniques must be flexible and allow easy modification of the types of sounds to be sensed. A simplification of the training data collection process could allow a more flexible sound classification framework. We hypothesize that detailed training, a prerequisite for the majority of sound sensing techniques, is not necessary and that a significantly less detailed and less time-consuming data collection process can be carried out, allowing even a nonexpert to conduct the collection, labeling, and training process. To test this hypothesis, we implement a diverse-density-based multiple instance learning framework to identify a target sound, and a bag trimming algorithm, which, using the target sound, automatically segments weakly labeled sound clips to construct an accurate training set. Experiments reveal that our hypothesis is valid, and results show that classifiers trained using the automatically segmented training sets were able to accurately classify unseen sound samples, with accuracies comparable to supervised classifiers and an average F-measure of 0.969 and 0.87 for two weakly supervised datasets.
Krzentowski, G; Pirnay, F; Luyckx, A S; Lacroix, M; Mosora, F; Lefebvre, P J
1983-01-01
This study aimed at investigating, in six healthy, non-obese, young (25 +/- 1 years) male volunteers with strictly normal oral glucose tolerance, the influence of a six-week physical training period (60 min bicycling 5 days/week at 30-40% of their individual VO2 max) on the hormonal and metabolic response to a 100 g oral 13C-naturally-labeled glucose load given at rest before and 36 h after the last training session. Exogenous glucose oxidation was derived from 13CO2 measurements on expired air. Training resulted in a 29% increase in VO2 max (2 p less than 0.002) and a 27% decrease in plasma triglycerides (2 p less than 0.02). No changes were observed in weight, total body K, or skinfolds. Glucose tolerance, which was strictly normal before training, remained unchanged, but the insulin response to the oral glucose load decreased by 24% (2 p less than 0.025). Exogenous glucose oxidation was similar before and after training, averaging 35.9 +/- 2.1 and 37.4 +/- 2.0 g/7 h respectively. In conclusion, a 6-week training period, performed by strictly healthy young males studied at rest, induced an increase in VO2 max, a decrease in plasma triglycerides and a lower insulin response to oral glucose, while glucose tolerance and exogenous glucose oxidation remained unchanged.
Semantic and visual memory codes in learning disabled readers.
Swanson, H L
1984-02-01
Two experiments investigated whether learning disabled readers' impaired recall is due to multiple coding deficiencies. In Experiment 1, learning disabled and skilled readers viewed nonsense pictures without names or with either relevant or irrelevant names with respect to the distinctive characteristics of the picture. Both types of names improved recall of nondisabled readers, while learning disabled readers exhibited better recall for unnamed pictures. No significant difference in recall was found between name training (relevant, irrelevant) conditions within reading groups. In Experiment 2, both reading groups participated in recall training for complex visual forms labeled with unrelated words, hierarchically related words, or without labels. A subsequent reproduction transfer task showed a facilitation in performance in skilled readers due to labeling, with learning disabled readers exhibiting better reproduction for unnamed pictures. Measures of output organization (clustering) indicated that recall is related to the development of superordinate categories. The results suggest that learning disabled children's reading difficulties are due to an inability to activate a semantic representation that interconnects visual and verbal codes.
A comparison of methods for teaching receptive labeling to children with autism spectrum disorders.
Grow, Laura L; Carr, James E; Kodak, Tiffany M; Jostad, Candice M; Kisamore, April N
2011-01-01
Many early intervention curricular manuals recommend teaching auditory-visual conditional discriminations (i.e., receptive labeling) using the simple-conditional method in which component simple discriminations are taught in isolation and in the presence of a distracter stimulus before the learner is required to respond conditionally. Some have argued that this procedure might be susceptible to faulty stimulus control such as stimulus overselectivity (Green, 2001). Consequently, there has been a call for the use of alternative teaching procedures such as the conditional-only method, which involves conditional discrimination training from the onset of intervention. The purpose of the present study was to compare the simple-conditional and conditional-only methods for teaching receptive labeling to 3 young children diagnosed with autism spectrum disorders. The data indicated that the conditional-only method was a more reliable and efficient teaching procedure. In addition, several error patterns emerged during training using the simple-conditional method. The implications of the results with respect to current teaching practices in early intervention programs are discussed.
Henderson, H; German, V F; Panter, A T; Huba, G J; Rohweder, C; Zalumas, J; Wolfe, L; Uldall, K K; Lalonde, B; Henderson, R; Driscoll, M; Martin, S; Duggan, S; Rahimian, A; Melchior, L A
1999-12-01
An evaluation of nine diverse HIV/AIDS training programs assessed the degree to which the programs produced changes in the ways that health care systems deliver HIV/AIDS care. Participants were interviewed an average of 8 months following completion of training and asked for specific examples of a resulting change in their health care system. More than half of the trainees gave at least one example of a systems change. The examples included the way patient referrals are made, the manner in which agency collaborations are organized, and the way care is delivered.
Spine labeling in MRI via regularized distribution matching.
Hojjat, Seyed-Parsa; Ayed, Ismail; Garvin, Gregory J; Punithakumar, Kumaradevan
2017-11-01
This study investigates an efficient (nearly real-time) two-stage spine labeling algorithm that removes the need for external training while being applicable to different types of MRI data and acquisition protocols. Based solely on the image being labeled (i.e., we do not use training data), the first stage aims at detecting potential vertebra candidates following the optimization of a functional containing two terms: (i) a distribution-matching term that encodes contextual information about the vertebrae via a density model learned from a very simple user input, which amounts to a point (mouse click) on a predefined vertebra; and (ii) a regularization constraint, which penalizes isolated candidates in the solution. The second stage removes false positives and identifies all vertebrae and discs by optimizing a geometric constraint, which embeds generic anatomical information on the interconnections between neighboring structures. Being based on generic knowledge, our geometric constraint does not require external training. We performed quantitative evaluations of the algorithm on a data set of 90 mid-sagittal MRI images of the lumbar spine acquired from 45 different subjects. To assess the flexibility of the algorithm, we used both T1- and T2-weighted images for each subject. A total of 990 structures were automatically detected/labeled and compared to ground-truth annotations by an expert. On the T2-weighted data, we obtained an accuracy of 91.6% for the vertebrae and 89.2% for the discs. On the T1-weighted data, we obtained an accuracy of 90.7% for the vertebrae and 88.1% for the discs. Our algorithm removes the need for external training while being applicable to different types of MRI data and acquisition protocols. Based on the current testing data, a subject-specific density model and generic anatomical information, our method can achieve competitive performance when applied to T1- and T2-weighted MRI images.
ERIC Educational Resources Information Center
Pepperberg, Irene M.; Carey, Susan
2012-01-01
A Grey parrot ("Psittacus erithacus") had previously been taught to use English count words ("one" through "sih" [six]) to label sets of one to six individual items (Pepperberg, 1994). He had also been taught to use the same count words to label the Arabic numerals 1 through 6. Without training, he inferred the relationship between the Arabic…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Raza, A.; Preisler, H.D.
A new technique using immunofluorescence and autoradiography is described, in which the DNA of cells in S phase is labeled with two different probes. This method makes it possible to study the relationship between DNA synthesis and the uptake and/or incorporation of chemotherapeutic agents into normal or neoplastic cells. An example is provided in which the incorporation of 3H-cytarabine into DNA is demonstrated to occur only in cells which were synthesizing DNA during exposure to 3H-cytarabine. Other radioactively labeled probes can be used as well.
Zhou, Xiangrong; Takayama, Ryosuke; Wang, Song; Hara, Takeshi; Fujita, Hiroshi
2017-10-01
We propose a single network trained by pixel-to-label deep learning to address the general issue of automatic multiple-organ segmentation in three-dimensional (3D) computed tomography (CT) images. Our method can be described as a voxel-wise multiple-class classification scheme for automatically assigning labels to each pixel/voxel in a 2D/3D CT image. We simplify the segmentation of anatomical structures (including multiple organs) in a CT image (generally in 3D) to a majority voting scheme over the semantic segmentation of multiple 2D slices drawn from different viewpoints with redundancy. The proposed method inherits the spirit of fully convolutional networks (FCNs), which consist of "convolution" and "deconvolution" layers for 2D semantic image segmentation, and expands the core structure with 3D-2D-3D transformations to adapt to 3D CT image segmentation. All parameters in the proposed network are trained pixel-to-label from a small number of CT cases with human annotations as the ground truth. The proposed network naturally fulfills the requirements of multiple-organ segmentation in CT cases of different sizes that cover arbitrary scan regions, without any adjustment. The proposed network was trained and validated using the simultaneous segmentation of 19 anatomical structures in the human torso, including 17 major organs and two special regions (the lumen and the content inside the stomach). Some of these structures have never been reported in previous research on CT segmentation. A database consisting of 240 3D CT scans (95% for training and 5% for testing), together with their manually annotated ground-truth segmentations, was used in our experiments. The results show that the 19 structures of interest were segmented with acceptable accuracy (88.1% and 87.9% of voxels in the training and testing datasets, respectively, were labeled correctly) against the ground truth. We propose a single network based on pixel-to-label deep learning to address the challenging issue of anatomical structure segmentation in 3D CT cases. The novelty of this work lies in learning the different 2D sectional appearances of 3D anatomical structures in CT cases and in the majority voting of the 3D segmentation results from multiple crossed 2D sections, which achieves availability and reliability with better efficiency, generality, and flexibility than conventional segmentation methods, which must be guided by human expertise. © 2017 The Authors. Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
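A minimal sketch of the voxel-wise majority voting step, assuming each viewpoint's 2D segmentations have already been stacked back into a 3D integer label volume of the same shape; array names and shapes are illustrative, not the authors' implementation.

```python
import numpy as np

def fuse_by_majority_vote(label_volumes):
    """Voxel-wise majority vote over per-viewpoint 3D label volumes.

    label_volumes -- list of integer arrays of identical shape (z, y, x),
                     one per 2D sectioning direction after re-stacking
    Returns a single fused label volume.
    """
    stack = np.stack(label_volumes, axis=0)                   # (n_views, z, y, x)
    n_labels = int(stack.max()) + 1
    votes = np.stack([(stack == k).sum(axis=0) for k in range(n_labels)])
    return votes.argmax(axis=0)                               # (z, y, x)
```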
Independent valine and leucine isotope labeling in Escherichia coli protein overexpression systems.
Lichtenecker, Roman J; Weinhäupl, Katharina; Reuther, Lukas; Schörghuber, Julia; Schmid, Walther; Konrat, Robert
2013-11-01
The addition of labeled α-ketoisovalerate to the growth medium of a protein-expressing host organism has evolved into a versatile tool to achieve concomitant incorporation of specific isotopes into valine- and leucine- residues. The resulting target proteins represent excellent probes for protein NMR analysis. However, as the sidechain resonances of these residues emerge in a narrow spectral range, signal overlap represents a severe limitation in the case of high-molecular-weight NMR probes. We present a protocol to eliminate leucine labeling by supplying the medium with unlabeled α-ketoisocaproate. The resulting spectra of a model protein exclusively feature valine signals of increased intensity, confirming the method to be a first example of independent valine and leucine labeling employing α-ketoacid precursor compounds.
NEED FOR HARMONIZATION OF LABELING OF MEDICAL DEVICES: A REVIEW
Songara, Raiendra K.; Sharma, Ganesh N.; Gupta, Vipul K.; Gupta, Promila
2010-01-01
Medical device labeling is any information associated with a device targeted to the patient or lay caregiver. It is intended to help assure that the device is used safely and effectively. Medical device labeling is supplied in many formats, for example, as patient brochures, patient leaflets, user manuals, and videotapes. The European Commission has discussed a series of agreements with third countries, Australia, New Zealand, the USA, Canada, Japan and Eastern European countries wishing to join the EU, concerning the mutual acceptance of inspection bodies and proof of conformity in connection with medical devices. Device labeling is exceedingly difficult for manufacturers for many reasons, such as government regulations to ensure compliance, increased competent authority surveillance, increased audits, and language requirements. PMID:22247840
Benefits of HIV testing during military exercises.
Gross, M L; Rendin, R W; Childress, C W; Kerstein, M D
1989-12-01
During U.S. Marine Corps Reserve summer 2-week active duty for training periods, 6,482 people were tested for human immunodeficiency virus (HIV). Testing at an initial exercise, Solar Flare, trained a cadre of contact teams to, in turn, train other personnel in phlebotomy and the HIV protocol at three other exercises (141 Navy Reserve and Inspector-Instructor hospital corpsmen were trained). Corpsmen could be trained with an indoctrination of 120 minutes and a mean of 15 phlebotomies. After 50 phlebotomies, the administration, identification, and labeling process plus phlebotomy could be completed in 90 seconds. HIV testing during military exercises is both good for training and cost-effective.
Smith, Tristram; Aman, Michael G; Arnold, L Eugene; Silverman, Laura B; Lecavalier, Luc; Hollway, Jill; Tumuluru, Rameshwari; Hyman, Susan L; Buchan-Page, Kristin A; Hellings, Jessica; Rice, Robert R; Brown, Nicole V; Pan, Xueliang; Handen, Benjamin L
2016-10-01
The authors previously reported on a 2-by-2 randomized clinical trial of individual and combined treatment with atomoxetine (ATX) and parent training (PT) for attention-deficit/hyperactivity disorder (ADHD) symptoms and behavioral noncompliance in 128 5- to 14-year-old children with autism spectrum disorder. In the present report, they describe a 24-week extension of treatment responders and nonresponders. One-hundred seventeen participants from the acute trial (91%) entered the extension; 84 of these were in 2 subgroups: "treatment responders" (n = 43) from all 4 groups in the acute trial, seen monthly for 24 weeks, and "placebo nonresponders" (n = 41), treated with open-label ATX for 10 weeks. Participants originally assigned to PT continued PT during the extension; the remainder served as controls. Primary outcome measurements were the parent-rated Swanson, Nolan and Pelham ADHD scale and the Home Situations Questionnaire. Sixty percent (26 of 43) of treatment responders in the acute trial, including 68% of responders originally assigned to ATX, still met the response criteria at the end of the extension. The response rate of placebo nonresponders treated with 10-week open-label ATX was 37% (15 of 41), similar to the acute trial. Children receiving open-label ATX + PT were significantly more likely to be ADHD responders (53% versus 23%) and noncompliance responders (58% versus 14%) than those receiving open-label ATX alone. Most ATX responders maintained their responses during the extension. PT combined with ATX in the open-label trial appeared to improve ADHD and noncompliance outcomes more than ATX alone. Clinical trial registration information-Atomoxetine, Placebo and Parent Management Training in Autism (Strattera); http://clinicaltrials.gov; NCT00844753. Copyright © 2016 American Academy of Child and Adolescent Psychiatry. Published by Elsevier Inc. All rights reserved.
2013-06-01
benefitting from rapid, automated discrimination of specific predefined signals, and is free-standing (requiring no other plugins or packages). The...previously labeled dataset, and comparing two labeled datasets. Subject terms: artifact, signal detection, EEG, MATLAB, toolbox.
NASA Astrophysics Data System (ADS)
Acosta, Oscar; Dowling, Jason; Cazoulat, Guillaume; Simon, Antoine; Salvado, Olivier; de Crevoisier, Renaud; Haigron, Pascal
The prediction of toxicity is crucial to managing prostate cancer radiotherapy (RT). This prediction is classically organ-wise and based on the dose-volume histograms (DVHs) computed during the planning step, using, for example, the mathematical Lyman Normal Tissue Complication Probability (NTCP) model. However, these models lack spatial accuracy, do not take deformations into account, and may be inappropriate for explaining toxicity events related to the distribution of the delivered dose. Producing voxel-wise statistical models of toxicity might help to explain the risks linked to the spatial dose distribution, but this is challenging because of the difficulty of mapping organs and dose onto a common template. In this paper we investigate the use of atlas-based methods to perform the non-rigid mapping and segmentation of the individuals' organs at risk (OAR) from CT scans. To build a labeled atlas, 19 CT scans were selected from a population of patients treated for prostate cancer by radiotherapy. The prostate and the OAR (rectum, bladder, bones) were then manually delineated by an expert and constituted the training data. After a number of affine and non-rigid registration iterations, an average image (template) representing the whole population was obtained. The amount of consensus between labels was used to generate probabilistic maps for each organ. We validated the accuracy of the approach by segmenting the organs using the training data in a leave-one-out scheme. The agreement between the volumes after deformable registration and the manually segmented organs was on average above 60% for the organs at risk. The proposed methodology provides a way to map the organs from a whole population onto a single template and sets the stage for further voxel-wise analysis. With this method, new and more accurate predictive models of toxicity can be built.
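A minimal sketch of how the label consensus step could produce a probabilistic map for one organ, assuming the manual masks have already been warped into the common template space; inputs and naming are illustrative, not the authors' pipeline.

```python
import numpy as np

def organ_probability_map(registered_masks):
    """Voxel-wise consensus for one organ across the training population.

    registered_masks -- list of binary masks already warped into the common
                        template space, one per training subject
    Returns values in [0, 1]: the fraction of subjects labeling each voxel.
    """
    masks = np.stack(registered_masks, axis=0).astype(float)  # (n_subjects, ...)
    return masks.mean(axis=0)
```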
Chemical biology-based approaches on fluorescent labeling of proteins in live cells.
Jung, Deokho; Min, Kyoungmi; Jung, Juyeon; Jang, Wonhee; Kwon, Youngeun
2013-05-01
Recently, significant advances have been made in live cell imaging owing to the rapid development of selective labeling of proteins in vivo. Green fluorescent protein (GFP) was the first example of fluorescent reporters genetically introduced to protein of interest (POI). While GFP and various types of engineered fluorescent proteins (FPs) have been actively used for live cell imaging for many years, the size and the limited windows of fluorescent spectra of GFP and its variants set limits on possible applications. In order to complement FP-based labeling methods, alternative approaches that allow incorporation of synthetic fluorescent probes to target POIs were developed. Synthetic fluorescent probes are smaller than fluorescent proteins, often have improved photochemical properties, and offer a larger variety of colors. These synthetic probes can be introduced to POIs selectively by numerous approaches that can be largely categorized into chemical recognition-based labeling, which utilizes metal-chelating peptide tags and fluorophore-carrying metal complexes, and biological recognition-based labeling, such as (1) specific non-covalent binding between an enzyme tag and its fluorophore-carrying substrate, (2) self-modification of protein tags using substrate variants conjugated to fluorophores, (3) enzymatic reaction to generate a covalent binding between a small molecule substrate and a peptide tag, and (4) split-intein-based C-terminal labeling of target proteins. The chemical recognition-based labeling reaction often suffers from compromised selectivity of metal-ligand interaction in the cytosolic environment, consequently producing high background signals. Use of protein-substrate interactions or enzyme-mediated reactions generally shows improved specificity but each method has its limitations. Some examples are the presence of large linker protein, restriction on the choice of introducible probes due to the substrate specificity of enzymes, and competitive reaction mediated by an endogenous analogue of the introduced protein tag. These limitations have been addressed, in part, by the split-intein-based labeling approach, which introduces fluorescent probes with a minimal size (~4 amino acids) peptide tag. In this review, the advantages and the limitations of each labeling method are discussed.
Interactive Multimedia Instruction for Training Self-Directed Learning Techniques
2016-06-01
feedback and input on the content, format, and pedagogical approach of the lesson. This survey could be e-mailed to the principal ARI researcher for... peers in self-directed learning. Some examples of the metaphorical relationships and common examples woven into this IMI are identified in Table 1, "Metaphorical Relationships and Illustrations Used in Self-Directed Learning Training," which pairs each military or common example with its self-directed learning counterpart.
Adults' acquisition of novel dimension words: creating a semantic congruity effect.
Ryalls, B O; Smith, L B
2000-07-01
The semantic congruity effect is exhibited when adults are asked to compare pairs of items from a series, and their response is faster when the direction of the comparison coincides with the location of the stimuli in the series. For example, people are faster at picking the bigger of 2 big items than the littler of 2 big items. In the 4 experiments presented, adults were taught new dimensional adjectives (mal/ler and borg/er). Characteristics of the learning situation, such as the nature of the stimulus series and the relative frequency of labeling, were varied. Results revealed that the participants who learned the relative meaning of the artificial dimensional adjectives also formed categories and developed a semantic congruity effect regardless of the characteristics of training. These findings have important implications for our understanding of adult acquisition of novel relational words, the relationship between learning such words and categorization, and the explanations of the semantic congruity effect.
Automated Detection of Microaneurysms Using Scale-Adapted Blob Analysis and Semi-Supervised Learning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adal, Kedir M.; Sidebe, Desire; Ali, Sharib
2014-01-07
Despite several attempts, automated detection of microaneurysms (MAs) from digital fundus images remains an open issue. This is due to the subtle appearance of MAs against the surrounding tissue. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs in an image, and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are then introduced to characterize these blob regions. A semi-supervised learning approach, which requires few manually annotated learning examples, is also proposed to train a classifier to detect true MAs. The developed system is built using only a few manually labeled images and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the-art techniques as well as the applicability of the proposed features for analyzing fundus images.
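For readers unfamiliar with scale-adapted blob analysis, a Laplacian-of-Gaussian detector searched over a range of scales is one standard way to obtain blob candidates together with a per-blob scale estimate. The sketch below uses scikit-image for illustration only; it is not the authors' implementation, and the parameter values are assumptions.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import blob_log
from skimage.io import imread

def detect_candidate_blobs(fundus_path: str) -> np.ndarray:
    """Detect small dark blob-like candidates (possible MAs) with a LoG detector.

    blob_log searches over a range of Gaussian scales, which is one common way
    to obtain a locally adapted scale for each detected blob.
    """
    image = rgb2gray(imread(fundus_path))
    inverted = 1.0 - image  # MAs appear as small dark blobs, so invert intensity
    blobs = blob_log(inverted, min_sigma=1, max_sigma=5, num_sigma=10, threshold=0.05)
    # Each row is (row, col, sigma); sigma * sqrt(2) approximates the blob radius.
    blobs[:, 2] = blobs[:, 2] * np.sqrt(2)
    return blobs
```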
Paley, J
2000-06-01
The rejection of Cartesian dualism can be taken to imply that the mind is implicated in health and illness to a greater degree than conventional medicine would suggest. Surprisingly, however, there appears to be a train of thought in antidualist nursing theory which takes the opposite view. This paper looks closely at an interesting example of antidualist thinking - an article in which Benner and her colleagues comment on the ways in which people with asthma make sense of their condition - and concludes that it places unduly stringent and arbitrary limits on the mind's role. It then asks how antidualism can lead to such a dogmatic rejection of the idea that states of the body are clinically influenced by states of mind. The answer to this question is that Benner assimilates very different philosophical theories into the same 'tradition'. On this occasion, she has combined Descartes, Kant and the Platonist ascetics into a single package, misleadingly labelled 'Cartesianism', and this move accounts for her unexpected views on the relation between mind and body in asthma.
Structure of the knowledge base for an expert labeling system
NASA Technical Reports Server (NTRS)
Rajaram, N. S.
1981-01-01
One of the principal objectives of the NASA AgRISTARS program is the inventory of global crop resources using remotely sensed data gathered by Land Satellites (LANDSAT). A central problem in any such crop inventory procedure is the interpretation of LANDSAT images and the identification of the parts of each image that are covered by a particular crop of interest. This labeling task is largely a manual one done by trained human analysts and consequently presents obstacles to the development of fully automated crop inventory systems. However, developments in knowledge engineering, as well as the widespread availability of inexpensive hardware and software for artificial intelligence work, offer possibilities for developing expert systems for crop labeling. Such a knowledge-based approach to labeling is presented.
Venkataramani, Varun; Kardorff, Markus; Herrmannsdörfer, Frank; Wieneke, Ralph; Klein, Alina; Tampé, Robert; Heilemann, Mike; Kuner, Thomas
2018-04-03
With continuing advances in the resolving power of super-resolution microscopy, the inefficient labeling of proteins with suitable fluorophores becomes a limiting factor. For example, the low labeling density achieved with antibodies or small molecule tags limits attempts to reveal local protein nano-architecture of cellular compartments. On the other hand, high laser intensities cause photobleaching within and nearby an imaged region, thereby further reducing labeling density and impairing multi-plane whole-cell 3D super-resolution imaging. Here, we show that both labeling density and photobleaching can be addressed by repetitive application of trisNTA-fluorophore conjugates reversibly binding to a histidine-tagged protein by a novel approach called single-epitope repetitive imaging (SERI). For single-plane super-resolution microscopy, we demonstrate that, after multiple rounds of labeling and imaging, the signal density is increased. Using the same approach of repetitive imaging, washing and re-labeling, we demonstrate whole-cell 3D super-resolution imaging compensated for photobleaching above or below the imaging plane. This proof-of-principle study demonstrates that repetitive labeling of histidine-tagged proteins provides a versatile solution to break the 'labeling barrier' and to bypass photobleaching in multi-plane, whole-cell 3D experiments.
Developing a computer security training program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1990-01-01
We all know that training can empower the computer protection program. However, pushing computer security information outside the computer security organization into the rest of the company is often labeled as an easy project or a dungeon full of dragons. Used in part or whole, the strategy offered in this paper may help the developer of a computer security training program ward off dragons and create products and services. The strategy includes GOALS (what the result of training will be), POINTERS (tips to ensure survival), and STEPS (products and services as a means to accomplish the goals).
Agyemang, C.; Bhopal, R.; Bruijnzeels, M.
2005-01-01
Broad terms such as Black, African, or Black African are entrenched in scientific writing, although there is considerable diversity within African-descent populations and such terms may be both offensive and inaccurate. This paper outlines the heterogeneity within African populations and discusses the strengths and limitations of the term Black and related labels from epidemiological and public health perspectives in Europe and the USA. The paper calls for debate on appropriate terminologies for African-descent populations and concludes with the proposals that (1) describing the population under consideration is of paramount importance; (2) the term African origin, or simply African, is an appropriate and necessary prefix for an ethnic label, for example, African Caribbean, African Kenyan, or African Surinamese; (3) documents should define the ethnic labels used; and (4) the label Black should be phased out except when used in political contexts. PMID:16286485
Peckys, Diana B; Bandmann, Vera; de Jonge, Niels
2014-01-01
Correlative fluorescence microscopy combined with scanning transmission electron microscopy (STEM) of cells fully immersed in liquid is a new methodology with many application areas. Proteins, in live cells immobilized on microchips, are labeled with fluorescent quantum dot nanoparticles. In this protocol, the epidermal growth factor receptor (EGFR) is labeled. The cells are fixed after a selected labeling time, for example, 5 min as needed to form EGFR dimers. The microchip with cells is then imaged with fluorescence microscopy. Thereafter, STEM can be accomplished in two ways. The microchip with the labeled cells and one microchip with a spacer are assembled into a special microfluidic device and imaged with dedicated high-voltage STEM. Alternatively, thin edges of cells can be studied with environmental scanning electron microscopy with a STEM detector, by placing a microchip with cells in a cooled wet environment. © 2014 Elsevier Inc. All rights reserved.
Learning classification with auxiliary probabilistic information
Nguyen, Quang; Valizadegan, Hamed; Hauskrecht, Milos
2012-01-01
Finding ways of incorporating auxiliary information or auxiliary data into the learning process has been the topic of active data mining and machine learning research in recent years. In this work we study and develop a new framework for the classification learning problem in which, in addition to class labels, the learner is provided with auxiliary (probabilistic) information that reflects how strongly the expert feels about the class label. This approach can be extremely useful for many practical classification tasks that rely on subjective label assessment and where the cost of acquiring the additional auxiliary information is negligible compared to the cost of example analysis and labelling. We develop classification algorithms capable of using the auxiliary information to make the learning process more efficient in terms of sample complexity. We demonstrate the benefit of the approach on a number of synthetic and real-world data sets by comparing it to learning with class labels only. PMID:25309141
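One simple way to exploit that kind of probabilistic side information, not necessarily the authors' algorithm, is to train a weighted classifier in which each example appears as both a positive and a negative instance, weighted by the expert's confidence. A hedged sketch with scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_with_soft_labels(X: np.ndarray, p_positive: np.ndarray) -> LogisticRegression:
    """Fit a classifier from probabilistic label assessments.

    Each example is duplicated as a positive and a negative instance, with
    sample weights equal to the expert's confidence p and 1 - p respectively,
    so the weighted log-likelihood matches the soft labels.
    """
    X_dup = np.vstack([X, X])
    y_dup = np.concatenate([np.ones(len(X)), np.zeros(len(X))])
    w_dup = np.concatenate([p_positive, 1.0 - p_positive])
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_dup, y_dup, sample_weight=w_dup)
    return clf

# Example with synthetic data: confident assessments near 0 or 1, uncertain ones near 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
p = 1.0 / (1.0 + np.exp(-X[:, 0] + 0.3 * rng.normal(size=200)))
model = fit_with_soft_labels(X, p)
```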
A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.
Razvi, F; Gargiulo, G; Worcel, A
1983-08-01
Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32P labels at the reformed restriction site. Double digestion of the circular DNA with the original enzyme and a second restriction enzyme that cleaves near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. A similar double digestion, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.
Benchmark data for identifying multi-functional types of membrane proteins.
Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan
2016-09-01
Identifying membrane proteins and their multi-functional types is an indispensable yet challenging topic in proteomics and bioinformatics. In this article, we provide data that are used for training and testing Mem-ADSVM (Wan et al., 2016. "Mem-ADSVM: a two-layer multi-label predictor for identifying multi-functional types of membrane proteins" [1]), a two-layer multi-label predictor for predicting multi-functional types of membrane proteins.
ERIC Educational Resources Information Center
Hantson, Julie; Wang, Pan Pan; Grizenko-Vida, Michael; Ter-Stepanian, Marina; Harvey, William; Joober, Ridha; Grizenko, Natalie
2012-01-01
Objective: The objective of this study was to evaluate the effectiveness of a 2-week therapeutic summer day camp for children with ADHD, which included a social skills training program and parent psychoeducation and training program. This was an open-label, nonrandomized Phase I Clinical Intervention Trial. Method: Parents completed the Weiss…
Basheti, Iman A; Armour, Carol L; Bosnic-Anticevich, Sinthia Z; Reddel, Helen K
2008-07-01
To evaluate the feasibility, acceptability and effectiveness of a brief intervention about inhaler technique, delivered by community pharmacists to asthma patients. Thirty-one pharmacists received brief workshop education (Active: n=16, CONTROL: n=15). Active Group pharmacists were trained to assess and teach dry powder inhaler technique, using patient-centered educational tools including novel Inhaler Technique Labels. Interventions were delivered to patients at four visits over 6 months. At baseline, patients (Active: 53, CONTROL: 44) demonstrated poor inhaler technique (mean+/-S.D. score out of 9, 5.7+/-1.6). At 6 months, improvement in inhaler technique score was significantly greater in Active cf. CONTROL patients (2.8+/-1.6 cf. 0.9+/-1.4, p<0.001), and asthma severity was significantly improved (p=0.015). Qualitative responses from patients and pharmacists indicated a high level of satisfaction with the intervention and educational tools, both for their effectiveness and for their impact on the patient-pharmacist relationship. A simple feasible intervention in community pharmacies, incorporating daily reminders via Inhaler Technique Labels on inhalers, can lead to improvement in inhaler technique and asthma outcomes. Brief training modules and simple educational tools, such as Inhaler Technique Labels, can provide a low-cost and sustainable way of changing patient behavior in asthma, using community pharmacists as educators.
Optimizing area under the ROC curve using semi-supervised learning
Wang, Shijun; Li, Diana; Petrick, Nicholas; Sahiner, Berkman; Linguraru, Marius George; Summers, Ronald M.
2014-01-01
Receiver operating characteristic (ROC) analysis is a standard methodology to evaluate the performance of a binary classification system. The area under the ROC curve (AUC) is a performance metric that summarizes how well a classifier separates two classes. Traditional AUC optimization techniques are supervised learning methods that utilize only labeled data (i.e., the true class is known for all data) to train the classifiers. In this work, inspired by semi-supervised and transductive learning, we propose two new AUC optimization algorithms hereby referred to as semi-supervised learning receiver operating characteristic (SSLROC) algorithms, which utilize unlabeled test samples in classifier training to maximize AUC. Unlabeled samples are incorporated into the AUC optimization process, and their ranking relationships to labeled positive and negative training samples are considered as optimization constraints. The introduced test samples will cause the learned decision boundary in a multidimensional feature space to adapt not only to the distribution of labeled training data, but also to the distribution of unlabeled test data. We formulate the semi-supervised AUC optimization problem as a semi-definite programming problem based on the margin maximization theory. The proposed methods SSLROC1 (1-norm) and SSLROC2 (2-norm) were evaluated using 34 (determined by power analysis) randomly selected datasets from the University of California, Irvine machine learning repository. Wilcoxon signed rank tests showed that the proposed methods achieved significant improvement compared with state-of-the-art methods. The proposed methods were also applied to a CT colonography dataset for colonic polyp classification and showed promising results. PMID:25395692
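For reference, the AUC objective being optimized here has the familiar rank-statistic (Wilcoxon-Mann-Whitney) interpretation: the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The snippet below computes only that supervised metric; the semi-supervised ranking constraints of SSLROC are noted in a comment, not implemented.

```python
import numpy as np

def auc_mann_whitney(scores_pos: np.ndarray, scores_neg: np.ndarray) -> float:
    """AUC as the probability that a random positive scores higher than a random negative."""
    diff = scores_pos[:, None] - scores_neg[None, :]  # all positive/negative pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(scores_pos) * len(scores_neg))

# In the semi-supervised setting described above, additional ranking constraints
# between unlabeled test samples and the labeled positives/negatives would enter
# the semi-definite program; here we only illustrate the metric itself.
rng = np.random.default_rng(1)
pos = rng.normal(1.0, 1.0, size=100)
neg = rng.normal(0.0, 1.0, size=100)
print(round(auc_mann_whitney(pos, neg), 3))
```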
Certified Professionals in Action
Irrigation professionals certified by WaterSense labeled certifying organizations are trained in water-efficient practices. Learn more about how professionals are helping customers save water with better irrigation practices.
Label consistent K-SVD: learning a discriminative dictionary for recognition.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2013-11-01
A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
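The unified objective described above can be written as a sum of a reconstruction term, a discriminative sparse-code (label-consistency) term, and a classification term. The NumPy sketch below only evaluates such an objective for given matrices; the variable names, weights alpha and beta, and the toy shapes are assumptions, and the K-SVD optimization itself is not shown.

```python
import numpy as np

def lc_ksvd_objective(Y, D, X, Q, A, H, W, alpha=1.0, beta=1.0) -> float:
    """Evaluate an LC-KSVD-style unified objective for given variables.

    Y: signals (d x n), D: dictionary (d x K), X: sparse codes (K x n),
    Q: discriminative target codes (K x n), A: linear transform (K x K),
    H: label matrix (c x n), W: linear classifier (c x K).
    """
    reconstruction_error = np.linalg.norm(Y - D @ X, "fro") ** 2
    sparse_code_error = np.linalg.norm(Q - A @ X, "fro") ** 2      # label-consistency term
    classification_error = np.linalg.norm(H - W @ X, "fro") ** 2   # linear-classifier term
    return reconstruction_error + alpha * sparse_code_error + beta * classification_error

# Toy check with random matrices (d=20 signals, K=50 atoms, n=100 samples, c=5 classes).
rng = np.random.default_rng(0)
Y, D, X = rng.normal(size=(20, 100)), rng.normal(size=(20, 50)), rng.normal(size=(50, 100))
Q, A = rng.normal(size=(50, 100)), rng.normal(size=(50, 50))
H, W = rng.normal(size=(5, 100)), rng.normal(size=(5, 50))
print(lc_ksvd_objective(Y, D, X, Q, A, H, W))
```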
NASA Astrophysics Data System (ADS)
Dutta, Sandeep; Gros, Eric
2018-03-01
Deep Learning (DL) has been successfully applied in numerous fields, fueled by increasing computational power and access to data. However, for medical imaging tasks, limited training set size is a common challenge when applying DL. This paper explores the applicability of DL to the task of classifying a single axial slice from a CT exam into one of six anatomy regions. A total of 29,000 images selected from 223 CT exams were manually labeled for ground truth. An additional 54 exams were labeled and used as an independent test set. The network architecture developed for this application is composed of 6 convolutional layers and 2 fully connected layers with ReLU non-linear activations between each layer. Max-pooling was used after every second convolutional layer, and a softmax layer was used at the end. Given this base architecture, the effect of including network components such as Dropout and Batch Normalization on network performance and training is explored. The network performance as a function of training and validation set size is characterized by training each network architecture variation using 5, 10, 20, 40, 50, and 100% of the available training data. The performance comparison of the various network architectures was done for anatomy classification as well as for two computer vision datasets. The anatomy classifier accuracy varied from 74.1% to 92.3% in this study depending on the training size and network layout used. Dropout layers improved the model accuracy for all training sizes.
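A sketch of the base architecture as described (6 convolutional layers with ReLU, max-pooling after every second convolution, 2 fully connected layers, and a softmax over the 6 anatomy classes), written in PyTorch. The channel widths, kernel sizes, and the 128x128 input size are assumptions rather than values from the paper, and the Dropout/BatchNorm variations discussed above are omitted.

```python
import torch
import torch.nn as nn

class AnatomyNet(nn.Module):
    """6 conv layers (ReLU), max-pool after every 2nd conv, 2 FC layers, softmax over 6 classes."""
    def __init__(self, num_classes: int = 6):
        super().__init__()
        chans = [1, 16, 16, 32, 32, 64, 64]      # assumed channel widths
        layers = []
        for i in range(6):
            layers.append(nn.Conv2d(chans[i], chans[i + 1], kernel_size=3, padding=1))
            layers.append(nn.ReLU(inplace=True))
            if (i + 1) % 2 == 0:                 # max-pool after every second conv layer
                layers.append(nn.MaxPool2d(2))
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 256),        # assumes 128x128 single-channel input slices
            nn.ReLU(inplace=True),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        logits = self.classifier(self.features(x))
        return torch.softmax(logits, dim=1)

# Example: a batch of 8 single-channel 128x128 CT slices.
probs = AnatomyNet()(torch.randn(8, 1, 128, 128))
```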
2015-06-01
Definitions are provided for this section to distinguish between adaptive training and education elements and also to highlight their relationships ...illustrate this point Franke (2011) asserts that through the use of case study examples, instruction can provide the pedagogical foundation for decision...a prime example of an adaptive training and education system: a learner or trainee model, an instructional or pedagogical model, a domain model
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-19
..., as shown in the example set forth in Figure 1 in this standard. At the manufacturer's option, the... including the border surrounding the entire label, as shown in the example set forth in Figure 2 in this...), and, as appropriate, (h) and (i)) may be shown in the format and color scheme set forth in Figures 1...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-07
... surrounding the entire placard, as shown in the example set forth in Figure 1 in this standard. At the... example set forth in Figure 2 in this standard. The label shall be permanently affixed and proximate to... format and color scheme set forth in Figures 1 and 2. * * * (c) Vehicle manufacturer's recommended cold...
Forsyth, Alexander W; Barzilay, Regina; Hughes, Kevin S; Lui, Dickson; Lorenz, Karl A; Enzinger, Andrea; Tulsky, James A; Lindvall, Charlotta
2018-06-01
Clinicians document cancer patients' symptoms in free-text format within electronic health record visit notes. Although symptoms are critically important to quality of life and often herald clinical status changes, computational methods to assess the trajectory of symptoms over time are woefully underdeveloped. Our objective was to create machine learning algorithms capable of extracting patient-reported symptoms from free-text electronic health record notes. The data set included 103,564 sentences obtained from the electronic clinical notes of 2695 breast cancer patients receiving paclitaxel-containing chemotherapy at two academic cancer centers between May 1996 and May 2015. We manually annotated 10,000 sentences and trained a conditional random field model to predict words indicating an active symptom (positive label), absence of a symptom (negative label), or no symptom at all (neutral label). Sentences labeled by human coders were divided into training, validation, and test data sets. Final model performance was determined on the 20% of the data that was unused in model development or tuning. The final model achieved precision of 0.82, 0.86, and 0.99 and recall of 0.56, 0.69, and 1.00 for positive, negative, and neutral symptom labels, respectively. The most common positive symptoms were pain, fatigue, and nausea. Machine-based labeling of 103,564 sentences took two minutes. We demonstrate the potential of machine learning to gather, track, and analyze symptoms experienced by cancer patients during chemotherapy. Although our initial model requires further optimization to improve its performance, further model building may yield machine learning methods suitable for deployment in routine clinical care, quality improvement, and research applications. Copyright © 2018 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
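A conditional random field for this style of token-level labeling (active, negated, or neutral symptom mentions) could be set up, for example, with the sklearn-crfsuite package; the feature functions, tag names, and toy sentence below are illustrative and are not the features or data used in the study.

```python
import sklearn_crfsuite  # pip install sklearn-crfsuite

def token_features(sentence, i):
    """Simple per-token features; the study's actual feature set is not shown here."""
    word = sentence[i]
    return {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "prev": sentence[i - 1].lower() if i > 0 else "<BOS>",
        "next": sentence[i + 1].lower() if i < len(sentence) - 1 else "<EOS>",
    }

# Toy training data: POS = active symptom, NEG = denied symptom, O = neutral.
sentences = [["Patient", "reports", "nausea", "but", "denies", "pain", "."]]
tags = [["O", "O", "POS", "O", "O", "NEG", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, tags)
print(crf.predict(X))
```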
Andeer, Peter; Stahl, David A; Lillis, Lorraine; Strand, Stuart E
2013-09-17
The leaching of RDX (hexahydro-1,3,5-trinitro-1,3,5-triazine) from particulates deposited in live-fire military training range soils contributes to significant pollution of groundwater. In situ microbial degradation has been proposed as a viable method for onsite containment of RDX. However, there is only a single report of RDX degradation in training range soils and the soil microbial communities involved in RDX degradation were not identified. Here we demonstrate aerobic RDX degradation in soils taken from a target area of an Eglin Air Force Base bombing range, C52N Cat's Eye, (Eglin, Florida U.S.A.). RDX-degradation activity was spatially heterogeneous (found in less than 30% of initial target area field samples) and dependent upon the addition of exogenous carbon sources to the soils. Therefore, biostimulation (with exogenous carbon sources) and bioaugmentation may be necessary to sustain timely and effective in situ microbial biodegradation of RDX. High sensitivity stable isotope probing analysis of extracted soils incubated with fully labeled (15)N-RDX revealed several organisms with (15)N-labeled DNA during RDX-degradation, including xplA-bearing organisms. Rhodococcus was the most prominent genus in the RDX-degrading soil slurries and was completely labeled with (15)N-nitrogen from the RDX. Rhodococcus and Williamsia species isolated from these soils were capable of using RDX as a sole nitrogen source and possessed the genes xplB and xplA associated with RDX-degradation, indicating these genes may be suitable genetic biomarkers for assessing RDX degradation potential in soils. Other highly labeled species were primarily Proteobacteria, including: Mesorhizobium sp., Variovorax sp., and Rhizobium sp.
Obstacles to nutrition labeling in restaurants.
Almanza, B A; Nelson, D; Chai, S
1997-02-01
This study determined the major obstacles that foodservices face regarding nutrition labeling. A survey questionnaire was administered in May 1994 to 68 research and development directors of the largest foodservice corporations, as identified in Restaurants & Institutions magazine's list of the top 400 largest foodservices (July 1993). In addition to demographic questions, the directors were asked questions addressing willingness, current practices, and perceived obstacles related to nutrition labeling. P tests were used to determine significance within a group for the number of foodservices that were currently using nutrition labeling, the perceived impact of nutrition labeling on sales, and the perceived responsibility to add nutrition labels. Regression analysis was used to determine the importance of factors affecting willingness to label. The response rate was 45.3%. Most companies were neutral about their willingness to use nutrition labeling. Two thirds of the respondents were not currently using nutrition labels. Only one third thought that it was the foodservice's responsibility to provide such information. Several companies perceived that nutrition labeling would have a potentially negative effect on annual sales volume. Major obstacles were identified as menu or personnel related, rather than cost related. Menu-related obstacles included too many menu variations, limited space on the menu for labeling, and loss of flexibility in changing the menu. Personnel-related obstacles included difficulty in training employees to implement nutrition labeling and not enough time for foodservice personnel to implement nutrition labeling. Numerous opportunities will be created for dietetics professionals in helping foodservices overcome these menu- or personnel-related obstacles.
Unsupervised Ensemble Anomaly Detection Using Time-Periodic Packet Sampling
NASA Astrophysics Data System (ADS)
Uchida, Masato; Nawata, Shuichi; Gu, Yu; Tsuru, Masato; Oie, Yuji
We propose an anomaly detection method for finding patterns in network traffic that do not conform to legitimate (i.e., normal) behavior. The proposed method trains a baseline model describing the normal behavior of network traffic without using manually labeled traffic data. The trained baseline model is used as the basis for comparison with the audit network traffic. This anomaly detection works in an unsupervised manner through the use of time-periodic packet sampling, which is used in a manner that differs from its intended purpose — the lossy nature of packet sampling is used to extract normal packets from the unlabeled original traffic data. Evaluation using actual traffic traces showed that the proposed method has false positive and false negative rates in the detection of anomalies regarding TCP SYN packets comparable to those of a conventional method that uses manually labeled traffic data to train the baseline model. Performance variation due to the probabilistic nature of sampled traffic data is mitigated by using ensemble anomaly detection that collectively exploits multiple baseline models in parallel. Alarm sensitivity is adjusted for the intended use by using maximum- and minimum-based anomaly detection that effectively take advantage of the performance variations among the multiple baseline models. Testing using actual traffic traces showed that the proposed anomaly detection method performs as well as one using manually labeled traffic data and better than one using randomly sampled (unlabeled) traffic data.
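The max/min ensemble step can be illustrated compactly: score the audit traffic with several independently trained baseline models and combine the scores with a maximum (higher sensitivity) or minimum (fewer false alarms). The baseline model below is a deliberately simple z-score placeholder, not the paper's model, and all thresholds are arbitrary.

```python
import numpy as np

def ensemble_anomaly_scores(baseline_models, audit_features, mode="max"):
    """Combine per-model anomaly scores; 'max' raises sensitivity, 'min' suppresses false alarms."""
    scores = np.stack([m(audit_features) for m in baseline_models])  # (n_models, n_windows)
    return scores.max(axis=0) if mode == "max" else scores.min(axis=0)

def make_baseline(sampled_features):
    """Hypothetical baseline: total per-feature z-score deviation from the model's
    own time-periodic packet sample of 'normal' traffic statistics."""
    mean = sampled_features.mean(axis=0)
    std = sampled_features.std(axis=0) + 1e-9
    return lambda x: np.abs((x - mean) / std).sum(axis=1)

rng = np.random.default_rng(0)
models = [make_baseline(rng.normal(size=(500, 4))) for _ in range(5)]
alarms = ensemble_anomaly_scores(models, rng.normal(size=(50, 4)), mode="max") > 8.0
```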
Autofocusing in digital holography using deep learning
NASA Astrophysics Data System (ADS)
Ren, Zhenbo; Xu, Zhimin; Lam, Edmund Y.
2018-02-01
In digital holography, it is critical to know the distance in order to reconstruct the multi-sectional object. This autofocusing is traditionally solved by reconstructing a stack of in-focus and out-of-focus images and using some focus metric, such as entropy or variance, to calculate the sharpness of each reconstructed image. The distance corresponding to the sharpest image is then taken as the focal position. This method is effective but computationally demanding and time-consuming: to get an accurate estimate, one has to reconstruct many images, and sometimes a coarse search must be followed by a refinement. To overcome this problem, we propose to use deep learning, i.e., a convolutional neural network (CNN), for autofocusing. Autofocusing is viewed as a classification problem in which the true distance is treated as a label, so estimating the distance amounts to labeling a hologram correctly. To train such an algorithm, a total of 1000 holograms were captured under the same conditions (exposure time, incident angle, and object), varying only the distance. There are 5 labels corresponding to 5 distances. These data are randomly split into three datasets to train, validate, and test a CNN. Experimental results show that the trained network is capable of predicting the distance without reconstructing the hologram or knowing any physical parameters of the setup. The prediction time of this method is far less than that of traditional autofocusing methods.
49 CFR 172.400a - Exceptions from labeling.
Code of Federal Regulations, 2011 CFR
2011-10-01
... TABLE, SPECIAL PROVISIONS, HAZARDOUS MATERIALS COMMUNICATIONS, EMERGENCY RESPONSE INFORMATION, TRAINING....1 (poisonous) if the toxicity of the material is based solely on the corrosive destruction of tissue...
49 CFR 172.400a - Exceptions from labeling.
Code of Federal Regulations, 2010 CFR
2010-10-01
... TABLE, SPECIAL PROVISIONS, HAZARDOUS MATERIALS COMMUNICATIONS, EMERGENCY RESPONSE INFORMATION, TRAINING....1 (poisonous) if the toxicity of the material is based solely on the corrosive destruction of tissue...
A COMPARISON OF METHODS FOR TEACHING RECEPTIVE LABELING TO CHILDREN WITH AUTISM SPECTRUM DISORDERS
Grow, Laura L; Carr, James E; Kodak, Tiffany M; Jostad, Candice M; Kisamore, April N
2011-01-01
Many early intervention curricular manuals recommend teaching auditory-visual conditional discriminations (i.e., receptive labeling) using the simple-conditional method in which component simple discriminations are taught in isolation and in the presence of a distracter stimulus before the learner is required to respond conditionally. Some have argued that this procedure might be susceptible to faulty stimulus control such as stimulus overselectivity (Green, 2001). Consequently, there has been a call for the use of alternative teaching procedures such as the conditional-only method, which involves conditional discrimination training from the onset of intervention. The purpose of the present study was to compare the simple-conditional and conditional-only methods for teaching receptive labeling to 3 young children diagnosed with autism spectrum disorders. The data indicated that the conditional-only method was a more reliable and efficient teaching procedure. In addition, several error patterns emerged during training using the simple-conditional method. The implications of the results with respect to current teaching practices in early intervention programs are discussed. PMID:21941380
Analysis of precision and accuracy in a simple model of machine learning
NASA Astrophysics Data System (ADS)
Lee, Julian
2017-12-01
Machine learning is a procedure where a model for the world is constructed from a training set of examples. It is important that the model capture relevant features of the training set and, at the same time, make correct predictions for examples not included in the training set. I consider polynomial regression, the simplest method of learning, and analyze its accuracy and precision for different levels of model complexity.
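The accuracy/precision trade-off the abstract analyzes is easy to reproduce numerically: higher-degree polynomials drive the training error down while the error on points outside the training set eventually grows. The noise level and degrees in the sketch below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)  # noise-free target for evaluating generalization

for degree in (1, 3, 9, 14):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```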
2013-10-01
labeling and tracking of mesenchymal stem cells (MSCs). MSCs are a heterogeneous group of pluripotent stromal cells that can be isolated from... (the remainder of this excerpt consists of reference-list fragments on MSC labelling with polyhedral superparamagnetic iron oxide nanoparticles and with carboxylated carbon nanotubes)
Transfer Learning for Adaptive Relation Extraction
2011-09-13
other NLP tasks; however, the supervised learning approach fails when there is not a sufficient amount of labeled data for training, which is often the case... (a table of syntactic patterns, relation instances, and relation types/subtypes follows in the source, pairing examples such as "Arab leaders" with OTHER-AFF (Ethnic) and "his father" with PER-SOC (Family)) ...for x. For sequence labeling tasks in NLP, the linear-chain conditional random field has been rather successful. It is an undirected graphical model in
2009-01-01
selection and uncertainty sampling significantly. Index Terms: transcription, labeling, submodularity, submodular selection, active learning, sequence... name of batch active learning, where a subset of data that is most informative and representative of the whole is selected for labeling. Often... representative subset. Note that our Fisher kernel is over an unsupervised generative model, which enables us to bootstrap our active learning approach
NASA Astrophysics Data System (ADS)
Hetherington, Jorden; Pesteie, Mehran; Lessoway, Victoria A.; Abolmaesumi, Purang; Rohling, Robert N.
2017-03-01
Percutaneous needle insertion procedures on the spine often require proper identification of the vertebral level in order to effectively deliver anesthetics and analgesic agents and achieve adequate block. For example, in obstetric epidurals, the target is the L3-L4 intervertebral space. The current clinical method involves "blind" identification of the vertebral level through manual palpation of the spine, which has only 30% accuracy. This implies the need for better anatomical identification prior to needle insertion. A system is proposed to identify the vertebrae, assign them to their respective levels, and track them in a standard sequence of ultrasound images acquired in the paramedian plane. Machine learning techniques are developed to identify discriminative features of the laminae. In particular, a deep network is trained to automatically learn the anatomical features of the lamina peaks and classify image patches for pixel-level classification. The chosen network utilizes multiple connected auto-encoders to learn the anatomy. Pre-processing with ultrasound bone enhancement techniques is done to aid the pixel-level classification performance. Once the laminae are identified, vertebrae are assigned levels and tracked in sequential frames. Experimental results were evaluated against an expert sonographer. Based on data acquired from 15 subjects, vertebra identification with a sensitivity of 95% and a precision of 95% was achieved within each frame. Between pairs of subsequently analyzed frames, the predicted vertebral level labels matched the manually selected labels in 94% of cases.
Correcting Evaluation Bias of Relational Classifiers with Network Cross Validation
2010-01-01
classification algorithms: simple random resampling (RRS), equal-instance random resampling (ERS), and network cross-validation (NCV). The first two... NCV procedure that eliminates overlap between test sets altogether. The procedure samples k disjoint test sets that will be used for evaluation... (a pseudocode fragment follows in the source: sample (propLabeled * S) nodes from trainPool; inferenceSet = network − trainSet; F = F ∪ <trainSet, testSet, inferenceSet>; end for; output: F) NCV addresses
ERIC Educational Resources Information Center
Borji, Rihab; Sahli, Sonia; Baccouch, Rym; Laatar, Rabeb; Kachouri, Hiba; Rebai, Haithem
2018-01-01
Background: This study aimed to compare the effectiveness of a hopping and jumping training programme (HJP) versus a sensorimotor rehabilitation programme (SRP) on postural performances in children with intellectual disability. Methods: Three groups of children with intellectual disability participated in the study: the HJP group, the SRP group…
Advanced Pediatric Brain Imaging Research and Training Program
2013-10-01
diffusion tensor imaging and perfusion (arterial spin labeling) MRI data and to relate measures of global and regional brain microstructural organization... (report front matter: Award Number W81XWH-11-2-0198; Title: Advanced Pediatric Brain Imaging Research and Training Program; September 2013)
Improving Acoustic Models by Watching Television
NASA Technical Reports Server (NTRS)
Witbrock, Michael J.; Hauptmann, Alexander G.
1998-01-01
Obtaining sufficient labelled training data is a persistent difficulty for speech recognition research. Although well-transcribed data is expensive to produce, there is a constant stream of challenging speech data, with poor transcriptions, broadcast as closed-captioned television. We describe a reliable unsupervised method for identifying accurately transcribed sections of these broadcasts, and show how these segments can be used to train a recognition system. Starting from acoustic models trained on the Wall Street Journal database, a single iteration of our training method reduced the word error rate on an independent broadcast television news test set from 62.2% to 59.5%.
Stimulus Predifferentiation and Modification of Children's Racial Attitudes
ERIC Educational Resources Information Center
Katz, Phyllis A.
1973-01-01
The most significant finding is that stimulus-predifferentiation training elicited lower prejudice scores for children on two indices of ethnic attitudes than did a no-label control condition. (Author)
76 FR 52441 - Summary of Benefits and Coverage and the Uniform Glossary
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-22
... facts label that includes examples to illustrate common benefits scenarios (including pregnancy and... plan or coverage for common benefits scenarios, including pregnancy and serious or chronic medical...
Cross-Training: The Tactical View.
ERIC Educational Resources Information Center
Kaeter, Margaret
1993-01-01
Discusses the advantages of and problems associated with cross-training. Looks at the issue of remuneration and offers examples of how two companies that cross-train presently pay their employees. (JOW)
NASA Astrophysics Data System (ADS)
Hou, Rui; Shi, Bibo; Grimm, Lars J.; Mazurowski, Maciej A.; Marks, Jeffrey R.; King, Lorraine M.; Maley, Carlo C.; Hwang, E. Shelley; Lo, Joseph Y.
2018-02-01
Predicting whether ductal carcinoma in situ (DCIS) identified at core biopsy contains occult invasive disease is an important task, since these "upstaged" cases will affect further treatment planning. Therefore, a prediction model that better separates pure DCIS from upstaged DCIS can help avoid overtreatment and overdiagnosis. In this work, we propose to improve this classification performance with the aid of two other related classes: Atypical Ductal Hyperplasia (ADH) and Invasive Ductal Carcinoma (IDC). Our data set contains mammograms for 230 cases. Specifically, 66 of them are ADH cases; 99 of them are biopsy-proven DCIS cases, of which 25 were found to contain invasive disease at the time of definitive surgery. The remaining 65 cases were diagnosed with IDC at core biopsy. Our hypothesis is that knowledge can be transferred from training with the easier and more readily available cases of benign but suspicious ADH versus IDC that is already apparent at initial biopsy. Thus, adding both ADH and IDC cases to the classifier's training set should improve its ability to distinguish upstaged DCIS from pure DCIS. We extracted 113 mammographic features based on a radiologist's annotation of clusters. Our method then added both ADH and IDC cases during training, where ADH cases were "force labeled", i.e., treated by the classifier as pure DCIS (negative) cases, and IDC cases were labeled as upstaged DCIS (positive) cases. A logistic regression classifier was built on this training set to predict whether biopsy-proven DCIS cases contain invasive cancer. The performance was assessed by repeated 5-fold cross-validation and receiver operating characteristic (ROC) curve analysis. Prediction performance with training on the DCIS data set only had an average AUC of 0.607 (95% CI, 0.479-0.721); by adding both ADH and IDC cases for training, we improved the performance to 0.691 (95% CI, 0.581-0.801).
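A minimal sketch of the force-labeling idea under cross-validation, assuming the 113-dimensional feature matrices for the four groups have already been extracted; the variable names and toy data are hypothetical, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def upstaging_auc(X_dcis, y_dcis, X_adh, X_idc, n_splits=5):
    """Cross-validate DCIS upstaging prediction, force-labeling ADH as 0 and IDC as 1."""
    aucs = []
    folds = StratifiedKFold(n_splits, shuffle=True, random_state=0)
    for train_idx, test_idx in folds.split(X_dcis, y_dcis):
        X_train = np.vstack([X_dcis[train_idx], X_adh, X_idc])
        y_train = np.concatenate([y_dcis[train_idx],
                                  np.zeros(len(X_adh)),   # ADH treated as pure DCIS
                                  np.ones(len(X_idc))])   # IDC treated as upstaged DCIS
        clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
        probs = clf.predict_proba(X_dcis[test_idx])[:, 1]
        aucs.append(roc_auc_score(y_dcis[test_idx], probs))
    return float(np.mean(aucs))

# Toy example with random 113-dimensional features standing in for the real ones.
rng = np.random.default_rng(0)
print(upstaging_auc(rng.normal(size=(99, 113)), rng.integers(0, 2, 99),
                    rng.normal(size=(66, 113)), rng.normal(size=(65, 113))))
```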
Zhang, Yue; Xu, Chi; Zheng, Hui; Loh, Horace H; Law, Ping-Yee
2016-01-01
The regulation of adult neurogenesis by opiates has been implicated in modulating different addiction cycles. At which neurogenesis stage opiates exert their action remains unresolved. We attempt to define the temporal window of morphine's inhibitory effect on adult neurogenesis by using the POMC-EGFP mouse model, in which newborn granular cells (GCs) can be visualized between days 3-28 post-mitosis. The POMC-EGFP mice were trained under the three-chamber conditioned place preference (CPP) paradigm with either saline or morphine. We observed that after 4 days of CPP training with saline, the number of EGFP-labeled newborn GCs in the sub-granular zone (SGZ) of the hippocampus significantly increased compared with mice injected with saline in their homecage. CPP training with morphine significantly decreased the number of EGFP-labeled GCs, whereas no significant difference in the number of EGFP-labeled GCs was observed in homecage mice injected with the same dose of morphine. Using cell-type-selective markers, we observed that morphine reduced the number of late-stage progenitors and immature neurons, such as Doublecortin (DCX)- and βIII Tubulin (TuJ1)-positive cells, in the SGZ but did not reduce the number of early progenitors, such as Nestin-, SOX2-, or neurogenic differentiation-1 (NeuroD1)-positive cells. Analysis of co-localization between different cell markers shows that morphine reduced the number of adult-born GCs by interfering with the differentiation of early progenitors, but not by inducing apoptosis. In addition, when NeuroD1 was over-expressed in the DG by stereotaxic injection of lentivirus, it rescued the loss of immature neurons and prolonged the extinction of morphine-trained CPP. These results suggest that under the CPP training paradigm, morphine affects the transition of neural progenitor/stem cells to immature neurons via a mechanism involving NeuroD1.
Label Information Guided Graph Construction for Semi-Supervised Learning.
Zhuang, Liansheng; Zhou, Zihan; Gao, Shenghua; Yin, Jingwen; Lin, Zhouchen; Ma, Yi
2017-09-01
In the literature, most existing graph-based semi-supervised learning methods only use the label information of observed samples in the label propagation stage, while ignoring such valuable information when learning the graph. In this paper, we argue that it is beneficial to consider the label information in the graph learning stage. Specifically, by enforcing the weight of edges between labeled samples of different classes to be zero, we explicitly incorporate the label information into the state-of-the-art graph learning methods, such as the low-rank representation (LRR), and propose a novel semi-supervised graph learning method called semi-supervised low-rank representation. This results in a convex optimization problem with linear constraints, which can be solved by the linearized alternating direction method. Though we take LRR as an example, our proposed method is in fact very general and can be applied to any self-representation graph learning methods. Experiment results on both synthetic and real data sets demonstrate that the proposed graph learning method can better capture the global geometric structure of the data, and therefore is more effective for semi-supervised learning tasks.
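The label constraint itself, zero edge weight between labeled samples of different classes, can be imposed on any affinity graph. The sketch below applies it to a plain k-nearest-neighbor similarity graph for illustration; it is not the LRR-based formulation proposed in the paper, and the bandwidth heuristic is an assumption.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def label_constrained_graph(X, labels, k=10):
    """Build a symmetric k-NN affinity graph and zero out edges between
    labeled samples carrying different class labels (label = -1 means unlabeled)."""
    W = kneighbors_graph(X, n_neighbors=k, mode="distance").toarray()
    sigma = np.median(W[W > 0])                              # simple bandwidth heuristic
    W = np.exp(-W ** 2 / (2 * sigma ** 2)) * (W > 0)         # Gaussian edge weights
    W = np.maximum(W, W.T)                                   # symmetrize
    labeled = labels >= 0
    diff = labels[:, None] != labels[None, :]
    W[np.outer(labeled, labeled) & diff] = 0.0               # enforce the label constraint
    return W

# Example: 100 points, only the first 10 labeled (labels in {0, 1}; -1 = unlabeled).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.full(100, -1)
y[:10] = rng.integers(0, 2, 10)
W = label_constrained_graph(X, y)
```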
ERIC Educational Resources Information Center
Harty, Sheila
1981-01-01
Criticizes the increasing infiltration of private corporations into the public schools through "educational" aids offers of "free" equipment in exchange for product labels; and the provision of technical assistance, management training, and scholarships. (GC)
Surgery, Hospitals, and Medications
... products that are not commonly stocked in hospital pharmacies. Examples include: Salagen®, Evoxac®, and Restasis® Eye drops, ... prescription and OTC medications/products in their labeled pharmacy container or packaging. This is important in case ...
The Cannon: A data-driven approach to Stellar Label Determination
NASA Astrophysics Data System (ADS)
Ness, M.; Hogg, David W.; Rix, H.-W.; Ho, Anna. Y. Q.; Zasowski, G.
2015-07-01
New spectroscopic surveys offer the promise of stellar parameters and abundances (“stellar labels”) for hundreds of thousands of stars; this poses a formidable spectral modeling challenge. In many cases, there is a subset of reference objects for which the stellar labels are known with high(er) fidelity. We take advantage of this with The Cannon, a new data-driven approach for determining stellar labels from spectroscopic data. The Cannon learns from the “known” labels of reference stars how the continuum-normalized spectra depend on these labels by fitting a flexible model at each wavelength; then, The Cannon uses this model to derive labels for the remaining survey stars. We illustrate The Cannon by training the model on only 542 stars in 19 clusters as reference objects, with T_eff, log g, and [Fe/H] as the labels, and then applying it to the spectra of 55,000 stars from APOGEE DR10. The Cannon is very accurate. Its stellar labels compare well to the stars for which APOGEE pipeline (ASPCAP) labels are provided in DR10, with rms differences that are basically identical to the stated ASPCAP uncertainties. Beyond the reference labels, The Cannon makes no use of stellar models nor any line-list, but needs a set of reference objects that span label-space. The Cannon performs well at lower signal-to-noise, as it delivers comparably good labels even at one-ninth the APOGEE observing time. We discuss the limitations of The Cannon and its future potential, particularly, to bring different spectroscopic surveys onto a consistent scale of stellar labels.
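The per-wavelength "flexible model" can be sketched, for illustration, as a quadratic function of the labels fit by least squares at each pixel; this is not the released Cannon code, and in practice the labels are offset and scaled before fitting. At test time the fit runs the other way: a survey star's labels are found by optimizing them so that the predicted spectrum matches the observed one.

```python
import numpy as np

def quadratic_design(labels):
    """Design matrix [1, l1, l2, l3, squares and cross terms] for labels (T_eff, log g, [Fe/H]).

    In practice the labels would first be offset/scaled to comparable ranges.
    """
    l = np.atleast_2d(labels)
    cross = np.stack([l[:, i] * l[:, j] for i in range(3) for j in range(i, 3)], axis=1)
    return np.hstack([np.ones((len(l), 1)), l, cross])

def train_cannon(reference_labels, reference_spectra):
    """Fit the coefficients independently at every wavelength pixel by least squares."""
    D = quadratic_design(reference_labels)                    # (n_stars, n_terms)
    coeffs, *_ = np.linalg.lstsq(D, reference_spectra, rcond=None)
    return coeffs                                             # (n_terms, n_pixels)

def predict_spectrum(coeffs, labels):
    """Predicted continuum-normalized spectrum for given labels."""
    return quadratic_design(labels) @ coeffs
```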
ERIC Educational Resources Information Center
Raaijmakers, Steven F.; Baars, Martine; Schaap, Lydia; Paas, Fred; van Merriënboer, Jeroen; van Gog, Tamara
2018-01-01
Self-assessment and task-selection skills are crucial in self-regulated learning situations in which students can choose their own tasks. Prior research suggested that training with video modeling examples, in which another person (the model) demonstrates and explains the cyclical process of problem-solving task performance, self-assessment, and…
Training for Environmental Impact Assessment (E.I.A.).
ERIC Educational Resources Information Center
Vougias, S.
1988-01-01
Deals with the methodology and practices for Environmental Impact Assessment (EIA). Describes the EIA process, prediction process, alternative assessment methods, training needs, major activities, training provision and material, main deficiencies and the precautions, and real world training examples. (Author/YP)
The Near-Term Viability and Benefits of eLabels for Patients, Clinical Sites, and Sponsors.
Smith-Gick, Jodi; Barnes, Nicola; Barone, Rocco; Bedford, Jeff; James, Jason R; Reisner, Stacy Frankovitz; Stephenson, Michael
2018-01-01
Current clinical trial labels are designed primarily to meet regulatory requirements. These labels have low patient and site utility, few are opened, and they have limited space and small fonts. As our world transitions from paper to electronic, an opportunity exists to provide patients with information about their investigational clinical trial product in a way that is more easily accessible, meets Health Authority requirements, and provides valuable additional information for the patient and caregiver. A TransCelerate initiative was launched to understand the current regulatory and technology landscape for the potential use of an electronic label (eLabel) for investigational medicinal products (IMPs). Concepts and an example proof of concept were developed, intended to show the "art of the possible" for a foundational eLabel and a "universal printed label." In addition, possible patient-centric enhancements were captured in the eLabel proof of concept. These concepts were shared with Health Authorities as well as patient and site advisory groups to gather feedback and subsequently enhance the concepts. Feedback indicated that the concept of an eLabel provides value and that the concepts should continue to be pursued. While the Health Authorities that were engaged did not express issues with the use of an eLabel per se, the reduction in content on the paper label is not possible in some geographic locations due to existing regulations. Nothing, however, prevents transmitting the label electronically in conjunction with current conventional labeling. While there are still some regulatory barriers that need to be addressed before the paper label content can be reduced, advancement toward a more patient-centric approach benefits stakeholders and will enable a fully connected patient-centric experience. The industry must start now to build the foundation.
Isotope labeling for studying RNA by solid-state NMR spectroscopy.
Marchanka, Alexander; Kreutz, Christoph; Carlomagno, Teresa
2018-04-12
Nucleic acids play key roles in most biological processes, either in isolation or in complex with proteins. Often they are difficult targets for structural studies, due to their dynamic behavior and high molecular weight. Solid-state nuclear magnetic resonance spectroscopy (ssNMR) provides a unique opportunity to study large biomolecules in a non-crystalline state at atomic resolution. Application of ssNMR to RNA, however, is still at an early stage of development and presents considerable challenges due to broad resonances and poor dispersion. Isotope labeling, either as nucleotide-specific, atom-specific or segmental labeling, can resolve resonance overlaps and reduce the line width, thus allowing ssNMR studies of RNA domains as part of large biomolecules or complexes. In this review we discuss the methods for RNA production and purification as well as numerous approaches for isotope labeling of RNA. Furthermore, we give a few examples that emphasize the instrumental role of isotope labeling and ssNMR for studying RNA as part of large ribonucleoprotein complexes.
Detection of CdSe quantum dot photoluminescence for security label on paper
DOE Office of Scientific and Technical Information (OSTI.GOV)
Isnaeni,, E-mail: isnaeni@lipi.go.id; Sugiarto, Iyon Titok; Bilqis, Ratu
CdSe quantum dots have great potential in various applications, especially for light-emitting devices. One example of a potential application of CdSe quantum dots is a security label for anti-counterfeiting. In this work, we present a practical approach to security labels on paper using one and two colors of colloidal CdSe quantum dots as a stamping ink on various types of paper. Under ambient conditions, the quantum dots are almost invisible. The quantum dot security label can be revealed by detecting the emission of the quantum dots using photoluminescence and a CNC machine. The recorded quantum dot emission intensity is then analyzed using a home-made program to reveal the quantum dot stamp pattern containing the word 'RAHASIA'. We found that the quantum dot security label works well on several types of paper. The quantum dot patterns can survive several days, and further treatment is required to protect the quantum dots. Oxidation of the quantum dots that occurred during this experiment reduced the emission intensity of the quantum dot patterns.
Pieretti, Mariah M; Chung, Danna; Pacenza, Robert; Slotkin, Todd; Sicherer, Scott H
2009-08-01
The Food Allergy Labeling and Consumer Protection Act became effective January 1, 2006, and mandates disclosure of the 8 major allergens in plain English and as a source of ingredients in the ingredient statement. It does not regulate advisory labels. We sought to determine the frequency and language used in voluntary advisory labels among commercially available products and to identify labeling ambiguities affecting consumers with allergy. Trained surveyors performed a supermarket survey of 20,241 unique manufactured food products (from an original assessment of 49,604 products) for use of advisory labels. A second detailed survey of 744 unique products evaluated additional labeling practices. Overall, 17% of 20,241 products surveyed contain advisory labels. Chocolate candy, cookies, and baking mixes were the 3 categories of 24 with the greatest frequency (> or = 40%). Categorically, advisory warnings included "may contain" (38%), "shared equipment" (33%), and "within plant" (29%). The subsurvey disclosed 25 different types of advisory terminology. Nonspecific terms, such as "natural flavors" and "spices," were found on 65% of products and were not linked to a specific ingredient for 83% of them. Additional ambiguities included unclear sources of soy (lecithin vs protein), nondisclosure of sources of gelatin and lecithin, and simultaneous disclosure of "contains" and "may contain" for the same allergen, among others. Numerous products have advisory labeling and ambiguities that present challenges to consumers with food allergy. Additional allergen labeling regulation could improve safety and quality of life for individuals with food allergy.
29 CFR 1960.59 - Training of employees and employee representatives.
Code of Federal Regulations, 2010 CFR
2010-07-01
... specialized job safety and health training appropriate to the work performed by the employee, for example: Clerical; printing; welding; crane operation; chemical analysis, and computer operations. Such training...
A Guide to the Identification of Training Needs.
ERIC Educational Resources Information Center
Boydell, T. H.
This comprehensive analysis of training needs, which is illustrated with case studies and factual examples, is directed towards training management, but its concepts are expressed in terms valuable to all management. The first chapter answers the question, "What are training needs?" The following chapters discuss present and future training needs,…
Continuing Training in Enterprises for Technological Change.
ERIC Educational Resources Information Center
Behrens, A.; And Others
This document contains a series of papers on the topic of continuing training for technological change in business and industry. The papers focus on examples of training for technological change in several countries of Western Europe. The five papers included in the report are "Training for Continuing Training and Education" (A.…
Generative adversarial networks for brain lesion detection
NASA Astrophysics Data System (ADS)
Alex, Varghese; Safwan, K. P. Mohammed; Chennamsetty, Sai Saketh; Krishnamurthi, Ganapathy
2017-02-01
Manual segmentation of brain lesions from Magnetic Resonance Images (MRI) is cumbersome and introduces errors due to inter-rater variability. This paper introduces a semi-supervised technique for detection of brain lesions from MRI using Generative Adversarial Networks (GANs). A GAN comprises a Generator network and a Discriminator network that are trained simultaneously, each with the objective of bettering the other. The networks were trained using non-lesion patches (n=13,000) from 4 different MR sequences. The network was trained on the BraTS dataset and patches were extracted from regions excluding the tumor region. The Generator network generates data by modeling the underlying probability distribution of the training data, P_Data. The Discriminator learns the posterior probability P(Label | Data) by classifying training data and generated data as "Real" or "Fake", respectively. The Generator, upon learning the joint distribution, produces images/patches such that the performance of the Discriminator on them is random, i.e. P(Label | Data = Generated Data) = 0.5. During testing, the Discriminator assigns posterior probability values close to 0.5 for patches from non-lesion regions, while patches centered on a lesion arise from a different distribution, P_Lesion, and hence are assigned lower posterior probability values by the Discriminator. On the test set (n=14), the proposed technique achieves a whole tumor Dice score of 0.69, sensitivity of 91% and specificity of 59%. Additionally, the Generator network was capable of generating non-lesion patches from various MR sequences.
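As an illustration of the scoring rule this abstract describes (not the authors' code), the sketch below assumes you already have per-patch posteriors from a discriminator trained only on non-lesion patches and simply thresholds how far each posterior falls below 0.5; the margin value and the synthetic scores are assumptions.

```python
# Minimal sketch: using a trained GAN discriminator's posterior P(real | patch)
# as an anomaly score for lesion detection. Patches from the training (non-lesion)
# distribution score near 0.5; patches from another distribution score lower.
import numpy as np

def lesion_mask_from_posteriors(discriminator_posteriors, margin=0.15):
    """Flag patches whose posterior is well below the 0.5 'random' level."""
    return discriminator_posteriors < (0.5 - margin)

# Synthetic example: most patches near 0.5, one block of clear outliers.
rng = np.random.default_rng(0)
scores = np.clip(rng.normal(0.5, 0.05, size=(64, 64)), 0.0, 1.0)
scores[20:30, 20:30] = 0.1                      # simulated lesion region
mask = lesion_mask_from_posteriors(scores)
print("flagged fraction of patches:", mask.mean())
```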
NASA Astrophysics Data System (ADS)
Gong, Maoguo; Yang, Hailun; Zhang, Puzhao
2017-07-01
Ternary change detection aims to detect changes and group the changes into positive change and negative change. It is of great significance in the joint interpretation of spatial-temporal synthetic aperture radar images. In this study, a sparse autoencoder, convolutional neural networks (CNN) and unsupervised clustering are combined to solve the ternary change detection problem without any supervision. First, a sparse autoencoder is used to transform the log-ratio difference image into a suitable feature space for extracting key changes and suppressing outliers and noise. The learned features are then clustered into three classes, which are taken as the pseudo labels for training a CNN model as a change feature classifier. The reliable training samples for the CNN are selected from the feature maps learned by the sparse autoencoder with certain selection rules. Given the training samples and the corresponding pseudo labels, the CNN model can be trained by back propagation with stochastic gradient descent. During its training procedure, the CNN is driven to learn the concept of change, and a more powerful model is established to distinguish different types of changes. Unlike traditional methods, the proposed framework integrates the merits of sparse autoencoders and CNNs to learn more robust difference representations and the concept of change for ternary change detection. Experimental results on real datasets validate the effectiveness and superiority of the proposed framework.
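To make the pseudo-labelling step concrete, here is a minimal sketch of the general idea (not the authors' pipeline): cluster learned difference features into three classes, keep the samples closest to their cluster centres as "reliable" training data, and train a supervised classifier on the cluster labels. The random features and an sklearn MLP stand in for the autoencoder features and the CNN.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
features = rng.normal(size=(3000, 16))        # stand-in for autoencoder features

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)
pseudo_labels = km.labels_

# Select "reliable" samples: those closest to their assigned cluster centre.
dist_to_centre = km.transform(features).min(axis=1)
reliable = dist_to_centre < np.percentile(dist_to_centre, 50)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(features[reliable], pseudo_labels[reliable])
print("agreement with pseudo labels:", clf.score(features, pseudo_labels))
```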
Dinov, Martin; Leech, Robert
2017-01-01
Part of the process of EEG microstate estimation involves clustering EEG channel data at the global field power (GFP) maxima, very commonly using a modified K-means approach. Clustering has also been done deterministically, despite there being uncertainties in multiple stages of the microstate analysis, including the GFP peak definition, the clustering itself and in the post-clustering assignment of microstates back onto the EEG timecourse of interest. We perform a fully probabilistic microstate clustering and labeling, to account for these sources of uncertainty using the closest probabilistic analog to KM called Fuzzy C-means (FCM). We train softmax multi-layer perceptrons (MLPs) using the KM and FCM-inferred cluster assignments as target labels, to then allow for probabilistic labeling of the full EEG data instead of the usual correlation-based deterministic microstate label assignment typically used. We assess the merits of the probabilistic analysis vs. the deterministic approaches in EEG data recorded while participants perform real or imagined motor movements from a publicly available data set of 109 subjects. Though FCM group template maps that are almost topographically identical to KM were found, there is considerable uncertainty in the subsequent assignment of microstate labels. In general, imagined motor movements are less predictable on a time point-by-time point basis, possibly reflecting the more exploratory nature of the brain state during imagined, compared to during real motor movements. We find that some relationships may be more evident using FCM than using KM and propose that future microstate analysis should preferably be performed probabilistically rather than deterministically, especially in situations such as with brain computer interfaces, where both training and applying models of microstates need to account for uncertainty. Probabilistic neural network-driven microstate assignment has a number of advantages that we have discussed, which are likely to be further developed and exploited in future studies. In conclusion, probabilistic clustering and a probabilistic neural network-driven approach to microstate analysis is likely to better model and reveal details and the variability hidden in current deterministic and binarized microstate assignment and analyses. PMID:29163110
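For readers unfamiliar with the probabilistic clustering step, the sketch below implements a basic Fuzzy C-means loop and trains a softmax MLP on the (hardened) fuzzy assignments so that new samples receive probabilistic labels. It is an illustrative stand-in, not the authors' analysis: the synthetic "topographies", the number of clusters, and the fuzzifier m=2 are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def fuzzy_c_means(X, c=4, m=2.0, n_iter=100, seed=0):
    """Basic FCM: alternate membership and centre updates."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))            # soft memberships
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))                # inverse-distance weights
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

rng = np.random.default_rng(1)
gfp_peak_maps = rng.normal(size=(500, 64))                 # assumed EEG topographies
U, centers = fuzzy_c_means(gfp_peak_maps, c=4)

# Softmax MLP trained on the FCM assignments gives probabilistic microstate labels.
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
mlp.fit(gfp_peak_maps, U.argmax(axis=1))
print(mlp.predict_proba(gfp_peak_maps)[:3].round(2))
```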
Prospects and challenges of quantitative phase imaging in tumor cell biology
NASA Astrophysics Data System (ADS)
Kemper, Björn; Götte, Martin; Greve, Burkhard; Ketelhut, Steffi
2016-03-01
Quantitative phase imaging (QPI) techniques provide high resolution label-free quantitative live cell imaging. Here, prospects and challenges of QPI in tumor cell biology are presented, using the example of digital holographic microscopy (DHM). It is shown that the evaluation of quantitative DHM phase images allows the retrieval of different parameter sets for quantification of cellular motion changes in migration and motility assays that are caused by genetic modifications. Furthermore, we demonstrate simultaneously label-free imaging of cell growth and morphology properties.
High temperature flow-through device for rapid solubilization and analysis
West, Jason A. A. [Castro Valley, CA; Hukari, Kyle W [San Ramon, CA; Patel, Kamlesh D [Dublin, CA; Peterson, Kenneth A [Albuquerque, NM; Renzi, Ronald F [Tracy, CA
2009-09-22
Devices and methods for thermally lysing biological material, for example vegetative bacterial cells and bacterial spores, are provided. Hot solution methods for solubilizing bacterial spores are described. Systems for direct analysis are disclosed, including thermal lysers coupled to sample preparation stations. Integrated systems capable of performing sample lysis, labeling and protein fingerprint analysis of biological material, for example vegetative bacterial cells, bacterial spores and viruses, are provided.
High temperature flow-through device for rapid solubilization and analysis
West, Jason A. A.; Hukari, Kyle W.; Patel, Kamlesh D.; Peterson, Kenneth A.; Renzi, Ronald F.
2013-04-23
Devices and methods for thermally lysing biological material, for example vegetative bacterial cells and bacterial spores, are provided. Hot solution methods for solubilizing bacterial spores are described. Systems for direct analysis are disclosed, including thermal lysers coupled to sample preparation stations. Integrated systems capable of performing sample lysis, labeling and protein fingerprint analysis of biological material, for example vegetative bacterial cells, bacterial spores and viruses, are provided.
A Study on Contingency Learning in Introductory Physics Concepts
NASA Astrophysics Data System (ADS)
Scaife, Thomas M.
Instructors of physics often use examples to illustrate new or complex physical concepts to students. For any particular concept, there are an infinite number of examples, thus presenting instructors with a difficult question whenever they wish to use one in their teaching: which example will most effectively illustrate the concept so that student learning is maximized? The choice is typically made by an intuitive assumption about which exact example will result in the most lucid illustration and the greatest student improvement. By questioning 583 students in four experiments, I examined a more principled approach to example selection. By controlling the manner in which physical dimensions vary, the parameter space of each concept can be divided into a discrete number of example categories. The effects of training with members of each category were explored in two different physical contexts: projectile motion and torque. In the first context, students were shown two trajectories and asked to determine which represented the longer time of flight. Height, range, and time of flight were the physical dimensions that were used to categorize the examples. In the second context, students were shown a balance-scale with loads of differing masses placed at differing positions along either side of the balance-arm. Mass, lever-arm length, and torque were the physical dimensions used to categorize these examples. For both contexts, examples were chosen so that one or two independent dimensions were varied. After receiving training with examples from specific categories, students were tested with questions from all question categories. Successful training or instruction can be measured either as producing correct, expert-like behavior (as observed through answers to the questions) or as explicitly instilling an understanding of the underlying rule that governs a physical phenomenon. A student's behavior might not be consistent with their explicit rule, so following the investigation of their behavior, students were asked what rule they used when answering questions. Although the self-reported rules might not be congruent with their behavior, training with specific examples might affect how students explicitly think about physics problems. In addition to exploring the effectiveness of various training examples, the results were also compared to a cognitive theory of causality: the contingency model. Physical concepts can often be expressed in terms of causal relations (e.g., a net force causes an object to accelerate), and a large body of work has found that people make many decisions that are consistent with causal reasoning. The contingency model, in particular, explains how certain statistical regularities in the co-occurrence of two events can be interpreted by individuals as causal relations, and was chosen primarily because of its robust results and simple, parsimonious form. The empirical results demonstrate that different categories of training examples did affect student answers differently. Furthermore, these effects were mostly consistent with the predictions made by the contingency model. When rule use was explored, the self-reported rules were consistent with contingency model predictions, but indicated that examples alone were insufficient to teach complex functional relationships between physical dimensions, such as torque.
NASA Technical Reports Server (NTRS)
Lee, A. T.
1984-01-01
The differences between flight training technology and flight simulation technology are highlighted. Examples of training technologies are provided, including the Navy's training system and the interactive cockpit training device. Training problems that might arise in the near future are discussed. These challenges follow from the increased amount and variety of information that a pilot must have access to in the cockpit.
Methyl Bromide and Chloropicrin Safety Information for Handlers
Labels for these kinds of pesticides require that soil fumigant handlers receive safe handling training before participating in field fumigation, according to the Worker Protection Standard (WPS) and Good Agricultural Practices (GAPs).
Conditional random fields for pattern recognition applied to structured data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burr, Tom; Skurikhin, Alexei
In order to predict labels from an output domain, Y, pattern recognition is used to gather measurements from an input domain, X. Image analysis is one setting where one might want to infer whether a pixel patch contains an object that is “manmade” (such as a building) or “natural” (such as a tree). Suppose the label for a pixel patch is “manmade”; if the label for a nearby pixel patch is then more likely to be “manmade” there is structure in the output domain that can be exploited to improve pattern recognition performance. Modeling P(X) is difficult because features between parts of the model are often correlated. Thus, conditional random fields (CRFs) model structured data using the conditional distribution P(Y|X = x), without specifying a model for P(X), and are well suited for applications with dependent features. Our paper has two parts. First, we overview CRFs and their application to pattern recognition in structured problems. Our primary examples are image analysis applications in which there is dependence among samples (pixel patches) in the output domain. Second, we identify research topics and present numerical examples.
Conditional random fields for pattern recognition applied to structured data
Burr, Tom; Skurikhin, Alexei
2015-07-14
In order to predict labels from an output domain, Y, pattern recognition is used to gather measurements from an input domain, X. Image analysis is one setting where one might want to infer whether a pixel patch contains an object that is “manmade” (such as a building) or “natural” (such as a tree). Suppose the label for a pixel patch is “manmade”; if the label for a nearby pixel patch is then more likely to be “manmade” there is structure in the output domain that can be exploited to improve pattern recognition performance. Modeling P(X) is difficult because features between parts of the model are often correlated. Thus, conditional random fields (CRFs) model structured data using the conditional distribution P(Y|X = x), without specifying a model for P(X), and are well suited for applications with dependent features. Our paper has two parts. First, we overview CRFs and their application to pattern recognition in structured problems. Our primary examples are image analysis applications in which there is dependence among samples (pixel patches) in the output domain. Second, we identify research topics and present numerical examples.
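As a reminder of the general form being referenced (a linear-chain special case, shown for orientation rather than as the exact potentials used in this paper), a CRF defines the conditional distribution over label sequences given observations as:

```latex
P(Y \mid X = x) = \frac{1}{Z(x)} \exp\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \right),
\qquad
Z(x) = \sum_{y'} \exp\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \right)
```

Here the feature functions f_k couple neighboring labels and the observations, which is exactly what lets the "nearby patch is also manmade" structure be exploited without modeling P(X).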
NASA Astrophysics Data System (ADS)
Eneva, Elena; Petrushin, Valery A.
2002-03-01
Taxonomies are valuable tools for structuring and representing our knowledge about the world. They are widely used in many domains, where information about species, products, customers, publications, etc. needs to be organized. In the absence of standards, many taxonomies of the same entities can co-exist. A problem arises when data categorized in a particular taxonomy needs to be used by a procedure (methodology or algorithm) that uses a different taxonomy. Usually, a labor-intensive manual approach is used to solve this problem. This paper describes a machine learning approach which aids domain experts in changing taxonomies. It allows learning relationships between two taxonomies and mapping the data from one taxonomy into another. The proposed approach uses decision trees and bootstrapping for learning mappings of instances from the source to the target taxonomies. A C4.5 decision tree classifier is trained on a small manually labeled training set and applied to a randomly selected sample from the unlabeled data. The classification results are analyzed and the misclassified items are corrected and all items are added to the training set. This procedure is iterated until no unlabeled data remains or an acceptable error rate is reached. In the latter case the last classifier is used to label all the remaining data. We test our approach on a database of products obtained from a grocery store chain and find that it performs well, reaching 92.6% accuracy while requiring the human expert to explicitly label only 18% of the entire data.
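The iterative bootstrapping loop described above can be sketched as follows. This is an illustrative stand-in, not the authors' C4.5 implementation: a small labelled seed set, a pool of unlabelled items with numeric features, and the batch size per round are all assumptions, and the expert-correction step is replaced by simply accepting the predictions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X_seed = rng.normal(size=(50, 8))
y_seed = rng.integers(0, 3, size=50)          # assumed target-taxonomy categories
X_pool = rng.normal(size=(1000, 8))

X_train, y_train = X_seed.copy(), y_seed.copy()
pool_mask = np.ones(len(X_pool), dtype=bool)

for _ in range(5):                            # a few bootstrapping rounds
    clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    idx = rng.choice(np.flatnonzero(pool_mask), size=100, replace=False)
    pred = clf.predict(X_pool[idx])
    # In the real workflow a domain expert would correct misclassified items here.
    X_train = np.vstack([X_train, X_pool[idx]])
    y_train = np.concatenate([y_train, pred])
    pool_mask[idx] = False

final_clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("items still unlabeled:", int(pool_mask.sum()))
```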
Sanders, Martha J; Reynolds, Jesse; Bagatell, Nancy; Treu, Judith A; OʼConnor, Edward; Katz, David L
2015-01-01
The purpose of the study was to examine the efficacy of a multidisciplinary train-the-trainer model for improving fitness and food label literacy in third-grade students. University student trainers taught ABC for Fitness and Nutrition Detectives, established programs to promote physical activity and nutrition knowledge, to 239 third-grade students in 2 communities over a 6-month period. A total of 110 children were in the intervention group and 129 children in the control group (2 schools each). Outcomes included the Food Label Literacy and Nutrition Knowledge test and the fitness measures of curl-ups, push-ups, 0.5-mile run, and sit and reach. Focus groups were conducted as process feedback. Setting: Four public schools in 2 different communities. Participants: A total of 200 third-grade students. Intervention: ABC for Fitness and Nutrition Detectives. Main outcome measures: Food Label Literacy and Nutrition Knowledge test and the fitness measures of curl-ups, push-ups, 0.5-mile run, and sit and reach. Nutrition knowledge increased in the intervention group by 25.2% (P < .01). Fitness measures in the intervention schools showed greater improvement than those in the controls for curl-ups (P < .01), push-ups (P < .01), sit and reach left (P = .07), and 0.5-mile run (P = .06). Process feedback from 3 teachers and 60 students indicated satisfaction with the program. Adaptation of the train-the-trainer approach for Nutrition Detectives and ABC for Fitness was effective for delivering these health-related programs.
Rezaei-Darzi, Ehsan; Farzadfar, Farshad; Hashemi-Meshkini, Amir; Navidi, Iman; Mahmoudi, Mahmoud; Varmaghani, Mehdi; Mehdipour, Parinaz; Soudi Alamdari, Mahsa; Tayefi, Batool; Naderimagham, Shohreh; Soleymani, Fatemeh; Mesdaghinia, Alireza; Delavari, Alireza; Mohammad, Kazem
2014-12-01
This study aimed to evaluate and compare the prediction accuracy of two data mining techniques, decision tree and neural network models, in assigning diagnosis labels to gastrointestinal prescriptions in Iran. This study was conducted in three phases: data preparation, a training phase, and a testing phase. A sample from a database consisting of 23 million pharmacy insurance claim records, from 2004 to 2011, was used, in which a total of 330 prescriptions were assessed and used to train and test the models simultaneously. In the training phase, the selected prescriptions were assessed by both a physician and a pharmacist separately and assigned a diagnosis. To test the performance of each model, k-fold stratified cross validation was conducted in addition to measuring their sensitivity and specificity. Generally, the two methods had very similar accuracies. Considering the weighted average of the true positive rate (sensitivity) and true negative rate (specificity), the decision tree had slightly higher accuracy in its ability for correct classification (83.3% and 96% versus 80.3% and 95.1%, respectively). However, when the weighted average of the ROC area (AUC between each class and all other classes) was measured, the ANN displayed higher accuracy in predicting the diagnosis (93.8% compared with 90.6%). According to the results of this study, the artificial neural network and decision tree models show similar accuracy in assigning diagnosis labels to GI prescriptions.
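For readers who want to see the evaluation pattern in code, the sketch below runs a stratified k-fold comparison of a decision tree and a small neural network. The data, feature count, and label set are synthetic assumptions; only the cross-validation structure mirrors the description above.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(330, 10))          # 330 prescriptions, assumed numeric features
y = rng.integers(0, 4, size=330)        # assumed diagnosis labels

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
models = [
    ("decision tree", DecisionTreeClassifier(random_state=0)),
    ("neural network", MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)),
]
for name, model in models:
    scores = cross_val_score(model, X, y, cv=cv)
    print(f"{name}: mean accuracy {scores.mean():.2f} (+/- {scores.std():.2f})")
```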
Self-supervised online metric learning with low rank constraint for scene categorization.
Cong, Yang; Liu, Ji; Yuan, Junsong; Luo, Jiebo
2013-08-01
Conventional visual recognition systems usually train an image classifier in a batch mode with all training data provided in advance. However, in many practical applications, only a small number of training samples is available at the beginning, and many more arrive sequentially during online recognition. Because the image data characteristics can change over time, it is important for the classifier to adapt to the new data incrementally. In this paper, we present an online metric learning method to address the online scene recognition problem via adaptive similarity measurement. Given a number of labeled data followed by a sequential input of unseen testing samples, the similarity metric is learned to maximize the margin of the distance among different classes of samples. By considering a low rank constraint, our online metric learning model not only provides competitive performance compared with the state-of-the-art methods, but also guarantees convergence. A bi-linear graph is also defined to model the pair-wise similarity, and an unseen sample is labeled via graph-based label propagation, while the model can also self-update using the more confident new samples. With its online learning ability, our methodology can handle large-scale streaming video data with incremental self-updating. We apply our model to online scene categorization, and experiments on various benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness and efficiency of our algorithm.
Contour-Driven Atlas-Based Segmentation
Wachinger, Christian; Fritscher, Karl; Sharp, Greg; Golland, Polina
2016-01-01
We propose new methods for automatic segmentation of images based on an atlas of manually labeled scans and contours in the image. First, we introduce a Bayesian framework for creating initial label maps from manually annotated training images. Within this framework, we model various registration- and patch-based segmentation techniques by changing the deformation field prior. Second, we perform contour-driven regression on the created label maps to refine the segmentation. Image contours and image parcellations give rise to non-stationary kernel functions that model the relationship between image locations. Setting the kernel to the covariance function in a Gaussian process establishes a distribution over label maps supported by image structures. Maximum a posteriori estimation of the distribution over label maps conditioned on the outcome of the atlas-based segmentation yields the refined segmentation. We evaluate the segmentation in two clinical applications: the segmentation of parotid glands in head and neck CT scans and the segmentation of the left atrium in cardiac MR angiography images. PMID:26068202
40 CFR 1042.130 - Installation instructions for vessel manufacturers.
Code of Federal Regulations, 2010 CFR
2010-07-01
... engine in a way that makes the engine's emission control information label hard to read during normal... equivalent format. For example, you may post instructions on a publicly available Web site for downloading or...
Heller, Rebecca; Martin-Biggers, Jennifer; Berhaupt-Glickstein, Amanda; Quick, Virginia; Byrd-Bredbenner, Carol
2015-10-01
To determine whether food label information and advertisements for foods containing no fruit cause children to have a false impression of the foods' fruit content. In the food label condition, a trained researcher showed each child sixteen different food label photographs depicting the fronts of food packages, which varied with regard to fruit content (i.e. real fruit v. sham fruit) and label elements. In the food advertisement condition, children viewed sixteen 30-second television food advertisements with similar fruit content and label elements as in the food label condition. After viewing each food label and advertisement, children responded to the question 'Did they use fruit to make this?' with responses of yes, no or don't know. Setting: Schools, day-care centres, after-school programmes and other community groups. Subjects: Children aged 4-7 years. In the food label condition, χ2 analysis of within fruit content variation differences indicated children (n 58; mean age 4·2 years) were significantly more accurate in identifying real fruit foods as the label's informational load increased and were least accurate when neither a fruit name nor an image was on the label. Children (n 49; mean age 5·4 years) in the food advertisement condition were more likely to identify real fruit foods when advertisements had fruit images compared with when no image was included, while fruit images in advertisements for sham fruit foods significantly reduced accuracy of responses. Findings suggest that labels and advertisements for sham fruit foods mislead children with regard to the food's real fruit content.
Andersen, Erica; Asuri, Namrata; Clay, Matthew; Halloran, Mary
2010-01-01
The zebrafish is an ideal model for imaging cell behaviors during development in vivo. Zebrafish embryos are externally fertilized and thus easily accessible at all stages of development. Moreover, their optical clarity allows high resolution imaging of cell and molecular dynamics in the natural environment of the intact embryo. We are using a live imaging approach to analyze cell behaviors during neural crest cell migration and the outgrowth and guidance of neuronal axons. Live imaging is particularly useful for understanding mechanisms that regulate cell motility processes. To visualize details of cell motility, such as protrusive activity and molecular dynamics, it is advantageous to label individual cells. In zebrafish, plasmid DNA injection yields a transient mosaic expression pattern and offers distinct benefits over other cell labeling methods. For example, transgenic lines often label entire cell populations and thus may obscure visualization of the fine protrusions (or changes in molecular distribution) in a single cell. In addition, injection of DNA at the one-cell stage is less invasive and more precise than dye injections at later stages. Here we describe a method for labeling individual developing neurons or neural crest cells and imaging their behavior in vivo. We inject plasmid DNA into 1-cell stage embryos, which results in mosaic transgene expression. The vectors contain cell-specific promoters that drive expression of a gene of interest in a subset of sensory neurons or neural crest cells. We provide examples of cells labeled with membrane targeted GFP or with a biosensor probe that allows visualization of F-actin in living cells1. Erica Andersen, Namrata Asuri, and Matthew Clay contributed equally to this work. PMID:20130524
NASA Astrophysics Data System (ADS)
Li, Shuanghong; Cao, Hongliang; Yang, Yupu
2018-02-01
Fault diagnosis is a key process for the reliability and safety of solid oxide fuel cell (SOFC) systems. However, it is difficult to rapidly and accurately identify faults in complicated SOFC systems, especially when simultaneous faults appear. In this research, a data-driven Multi-Label (ML) pattern identification approach is proposed to address the simultaneous fault diagnosis of SOFC systems. The framework of the simultaneous-fault diagnosis primarily includes two components: feature extraction and an ML-SVM classifier. The approach can be trained to diagnose simultaneous SOFC faults, such as fuel leakage and air leakage at different positions in the SOFC system, using simple training data sets consisting only of single faults, without requiring simultaneous-fault data. The experimental results show that the proposed framework can diagnose simultaneous SOFC system faults with high accuracy while requiring only a small amount of training data and a low computational burden. In addition, Fault Inference Tree Analysis (FITA) is employed to identify the correlations among possible faults and their corresponding symptoms at the system component level.
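The multi-label idea can be illustrated with a minimal sketch (not the paper's exact ML-SVM or features): one binary SVM per fault is trained on single-fault examples only, and at test time the per-fault probabilities are thresholded jointly so that several faults can be reported for one sample. The fault names, feature dimension, and 0.5 threshold are assumptions.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
faults = ["fuel_leak", "air_leak_pos1", "air_leak_pos2"]

# Assumed training set: each sample exhibits exactly one fault.
X_train = rng.normal(size=(300, 6))
single_fault = rng.integers(0, len(faults), size=300)
Y_train = np.eye(len(faults), dtype=int)[single_fault]      # one-hot indicator matrix

clf = OneVsRestClassifier(SVC(probability=True, random_state=0)).fit(X_train, Y_train)

X_test = rng.normal(size=(5, 6))
for p in clf.predict_proba(X_test):
    active = [f for f, pi in zip(faults, p) if pi > 0.5]     # several labels may fire
    print(active if active else "no fault above threshold")
```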
Image aesthetic quality evaluation using convolution neural network embedded learning
NASA Astrophysics Data System (ADS)
Li, Yu-xin; Pu, Yuan-yuan; Xu, Dan; Qian, Wen-hua; Wang, Li-peng
2017-11-01
A method of embedded learning with a convolution neural network (ELCNN) based on image content is proposed in this paper to evaluate image aesthetic quality. Our approach can not only address the problem of small-scale data but also score image aesthetic quality. First, we compare AlexNet and VGG_S to confirm which is more suitable for this image aesthetic quality evaluation task. Second, to further boost the image aesthetic quality classification performance, we employ the image content to train aesthetic quality classification models. However, the training samples become smaller, and fine-tuning only once cannot make full use of the small-scale data set. Third, to solve this problem, we propose fine-tuning twice in succession, based on the aesthetic quality label and the content label respectively; the classification probability of the trained CNN models is then used to evaluate image aesthetic quality. The experiments are carried out on the small-scale Photo Quality data set. The experimental results show that the classification accuracy rates of our approach are higher than those of existing image aesthetic quality evaluation approaches.
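A minimal sketch of the "fine-tune twice" idea follows, under stated assumptions: a tiny CNN stands in for AlexNet/VGG_S, random tensors stand in for images, and the content and aesthetic label counts are invented. Only the two-stage training pattern (content labels first, then swap the head and fine-tune on aesthetic labels) reflects the description above.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.head = nn.Linear(8 * 4 * 4, num_classes)
    def forward(self, x):
        return self.head(self.features(x))

def finetune(model, images, labels, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(images), labels).backward()
        opt.step()
    return model

torch.manual_seed(0)
images = torch.randn(64, 3, 32, 32)                 # assumed image batch
content_labels = torch.randint(0, 7, (64,))         # assumed 7 content categories
aesthetic_labels = torch.randint(0, 2, (64,))       # high vs. low aesthetic quality

model = SmallCNN(num_classes=7)
model = finetune(model, images, content_labels)     # first pass: content labels
model.head = nn.Linear(8 * 4 * 4, 2)                # swap the classifier head
model = finetune(model, images, aesthetic_labels)   # second pass: aesthetic labels
print(model(images).softmax(dim=1)[:3])
```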
Rios, Anthony; Kavuluru, Ramakanth
2013-09-01
Extracting diagnosis codes from medical records is a complex task carried out by trained coders who read all the documents associated with a patient's visit. With the popularity of electronic medical records (EMRs), computational approaches to code extraction have been proposed in recent years. Machine learning approaches to multi-label text classification provide an important methodology for this task, given that each EMR can be associated with multiple codes. In this paper, we study the role of feature selection, training data selection, and probabilistic threshold optimization in improving different multi-label classification approaches. We conduct experiments based on two different datasets: a recent gold standard dataset used for this task and a second, larger and more complex EMR dataset we curated from the University of Kentucky Medical Center. While conventional approaches achieve results comparable to the state-of-the-art on the gold standard dataset, on our complex in-house dataset we show that feature selection, training data selection, and probabilistic thresholding provide significant gains in performance.
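Probabilistic thresholding, one of the three ingredients mentioned above, can be sketched as tuning a separate decision threshold per label to maximize per-label F1 on held-out data. The sketch below is illustrative, not the paper's pipeline; the probabilities are synthetic and the candidate threshold grid is an assumption.

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_docs, n_labels = 500, 4
Y_true = rng.integers(0, 2, size=(n_docs, n_labels))
# Assumed per-label probabilities from some trained multi-label model:
probs = np.clip(Y_true * 0.6 + rng.uniform(0, 0.5, size=Y_true.shape), 0, 1)

thresholds = np.zeros(n_labels)
candidates = np.linspace(0.1, 0.9, 17)
for j in range(n_labels):
    f1s = [f1_score(Y_true[:, j], probs[:, j] >= t) for t in candidates]
    thresholds[j] = candidates[int(np.argmax(f1s))]

Y_pred = probs >= thresholds
print("per-label thresholds:", thresholds.round(2))
print("micro-F1:", round(f1_score(Y_true, Y_pred, average="micro"), 3))
```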
Interdisciplinarity, Climate, and Change
NASA Astrophysics Data System (ADS)
Pulwarty, R. S.
2016-12-01
Interdisciplinarity has become synonymous with all things progressive about research and education. This is so not simply because of a philosophical belief in the heterogeneity of knowledge but because of the scientific and social complexities of problems of major concern. The increased demand for improved climate knowledge and information to support planning under changing rates of extreme-event occurrence is well documented. The application of useful climate data, information and knowledge requires multiple networks and an information services infrastructure that support planning and implementation. As widely quoted, Pasteur's quadrant is a label given to a class of scientific research methodologies that seeks fundamental understanding of scientific problems and, simultaneously, benefit to society - what Stokes called "use-inspired research". Innovation, in this context, has been defined as "the process by which individuals and organizations generate new ideas and put them into practice". A growing number of research institutes and programs have begun developing a cadre of professionals focused on integrating basic and applied research in areas such as climate risk assessment and adaptation. There are now several cases in which researchers and teams have crafted approaches that include affected communities. In this presentation we will outline the lessons from several efforts, including the PACE program, the RISAs, NIDIS, the Climate Services Information System and other interdisciplinary service-oriented efforts in which the author has been involved. Some early lessons include the need to: recognize that key concerns of social innovation go beyond the projections of climate and other global changes to embrace multiple methods; continue to train scientists of all stripes in disciplinary norms, while higher education should also prepare students who plan to seek careers outside of academia by increasing flexibility in graduate training programs; develop and support boundary institutions that span research, monitoring, prototype development and practice, recognizing both the benefits and the limits of co-production; and design more comprehensive metrics for evaluation to combat perceptions that interdisciplinary work is only a sideline to a traditional academic career.
Quantum Support Vector Machine for Big Data Classification
NASA Astrophysics Data System (ADS)
Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth
2014-09-01
Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.
ERIC Educational Resources Information Center
Bishop, John H.; Lynch, Lisa M.
1993-01-01
Using the example of France, Bishop recommends a U.S. training mandate involving a training tax and incentives. Lynch argues that a broader array of options is needed to meet the training needs of new workers, displaced workers, and the unemployed. (SK)
Ievers-Landis, Carolyn E.; Hazen, Rebecca A.; Fehr, Karla K.
2015-01-01
The recently developed competencies in pediatric psychology from the Society of Pediatric Psychology (SPP) Task Force on Competencies and Best Training Practices in Pediatric Psychology provide a benchmark to evaluate training program practices and student progress toward training in level-specific competency goals. Graduate-level training presents a unique challenge for addressing the breadth of competencies required in pediatric psychology while maintaining development of broader clinical psychology training goals. We describe a recurring graduate-level pediatric psychology seminar course that addresses training in a number of the competency cluster areas. The structure of the seminar, examples of classroom topics that correspond with competency cluster areas as well as benchmarks used to evaluate each student’s development in the competency area are provided. Specific challenges in developing and maintaining the seminar in this format are identified, and possible solutions are offered. This training format could serve as a model for established pediatric psychology programs to expand their didactic training goals or for programs without formal pediatric psychology training to address competencies outside of clinical placements. PMID:26900536
High-Throughput Particle Uptake Analysis by Imaging Flow Cytometry
Smirnov, Asya; Solga, Michael D.; Lannigan, Joanne; Criss, Alison K.
2017-01-01
Quantifying the efficiency of particle uptake by host cells is important in fields including infectious diseases, autoimmunity, cancer, developmental biology, and drug delivery. Here we present a protocol for high-throughput analysis of particle uptake using imaging flow cytometry, using the bacterium Neisseria gonorrhoeae attached and internalized to neutrophils as an example. Cells are exposed to fluorescently labeled bacteria, fixed, and stained with a bacteria-specific antibody of a different fluorophore. Thus in the absence of a permeabilizing agent, extracellular bacteria are double-labeled with two fluorophores while intracellular bacteria remain single-labeled. A spot count algorithm is used to determine the number of single- and double-labeled bacteria in individual cells, to calculate the percent of cells associated with bacteria, percent of cells with internalized bacteria, and percent of cell-associated bacteria that are internalized. These analyses quantify bacterial association and internalization across thousands of cells and can be applied to diverse experimental systems. PMID:28369762
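The per-cell bookkeeping behind the three reported percentages can be written out explicitly. The sketch below uses invented spot counts (Poisson draws) purely to show the arithmetic: single-labelled spots are treated as intracellular bacteria and double-labelled spots as extracellular, as in the protocol above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 1000
single = rng.poisson(0.8, n_cells)   # intracellular bacteria per cell (assumed counts)
double = rng.poisson(0.5, n_cells)   # extracellular bacteria per cell (assumed counts)

associated = (single + double) > 0
internalized = single > 0
pct_associated = 100 * associated.mean()
pct_with_internalized = 100 * internalized.mean()
pct_bacteria_internal = 100 * single.sum() / (single.sum() + double.sum())

print(f"{pct_associated:.1f}% of cells are associated with bacteria")
print(f"{pct_with_internalized:.1f}% of cells contain internalized bacteria")
print(f"{pct_bacteria_internal:.1f}% of cell-associated bacteria are internalized")
```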
Using complex auditory-visual samples to produce emergent relations in children with autism.
Groskreutz, Nicole C; Karsina, Allen; Miguel, Caio F; Groskreutz, Mark P
2010-03-01
Six participants with autism learned conditional relations between complex auditory-visual sample stimuli (dictated words and pictures) and simple visual comparisons (printed words) using matching-to-sample training procedures. Pre- and posttests examined potential stimulus control by each element of the complex sample when presented individually and emergence of additional conditional relations and oral labeling. Tests revealed class-consistent performance for all participants following training.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Various
This report covers the following titles: (1) Fertility and litter size of normally ovulated and artificially ovulated mice; (2) Further studies on sterility produced in male mice by deuterium oxide; (3) Planarian disaggregation; (4) Uptake of organic compounds by planarians. II; (5) Effects of environmental complexity and training on acetylcholinesterase and cholinesterase activity in rat brain; (6) Effects of environmental complexity and training on brain chemistry and anatomy among mature rats; (7) Improvements in paper chromatographic techniques for labeled cell extracts; (8) Measurement and adjustment of pH in small volumes of solutions; (9) Carbon-14 and Nitrogen-15 tracer studies of amino acid synthesis during photosynthesis by Chlorella Pyrenoidosa; (10) Photosynthesis of {sup 14}C-labeled protein from {sup 14}CO{sub 2} by Chlorella; (11) Further studies on carboxydismutase; (12) Electron microscopy of chlorophyll a crystals; (13) The possible role of chromanyl phosphates in oxidative and photosynthetic phosphorylation; (14) Oxidation-reductions of some coenzymes; (15) Preparation of some [{sup 14}C] labeled substances: glucose-6-phosphate, fructose-6-phosphate, 6-phosphogluconic acid, pyruvic acid, and succinic acid; (16) Attempt to synthesize high molecular weight polynucleotides using Schramm's purely chemical method; and (17) Optical properties of some dye-polyanion complexes.
NASA Astrophysics Data System (ADS)
Maas, A.; Alrajhi, M.; Alobeid, A.; Heipke, C.
2017-05-01
Updating topographic geospatial databases is often performed based on current remotely sensed images. To automatically extract object information (labels) from the images, supervised classifiers are employed. Decisions to be taken in this process concern the definition of the classes which should be recognised, the features to describe each class and the training data necessary in the learning part of classification. With a view to large-scale topographic databases for fast-developing urban areas in the Kingdom of Saudi Arabia, we conducted a case study which investigated the following two questions: (a) which set of features is best suited for the classification?; (b) what is the added value of height information, e.g. derived from stereo imagery? Using stereoscopic GeoEye and Ikonos satellite data we investigate these two questions based on our research on label-tolerant classification using logistic regression and partly incorrect training data. We show that between five and ten features can be recommended to obtain a stable solution, that height information consistently yields an improved overall classification accuracy of about 5%, and that label noise can be successfully modelled and thus only marginally influences the classification results.
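The "added value of height" comparison can be illustrated with a minimal sketch: train a logistic-regression land-cover classifier with and without a height feature and compare overall accuracy. The spectral features, class set, and the correlation between height and class are synthetic assumptions; only the comparison design mirrors the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
spectral = rng.normal(size=(n, 6))                      # assumed spectral features
labels = rng.integers(0, 4, size=n)                     # e.g. building/road/tree/soil
height = labels * 2.0 + rng.normal(scale=1.5, size=n)   # height correlates with class
X_full = np.column_stack([spectral, height])

for name, X in [("spectral only", spectral), ("spectral + height", X_full)]:
    Xtr, Xte, ytr, yte = train_test_split(X, labels, test_size=0.3, random_state=0)
    acc = LogisticRegression(max_iter=1000).fit(Xtr, ytr).score(Xte, yte)
    print(f"{name}: overall accuracy = {acc:.2f}")
```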
Actively learning to distinguish suspicious from innocuous anomalies in a batch of vehicle tracks
NASA Astrophysics Data System (ADS)
Qiu, Zhicong; Miller, David J.; Stieber, Brian; Fair, Tim
2014-06-01
We investigate the problem of actively learning to distinguish between two sets of anomalous vehicle tracks, "innocuous" and "suspicious", starting from scratch, without any initial examples of "suspicious" and with no prior knowledge of what an operator would deem suspicious. This two-class problem is challenging because it is a priori unknown which track features may characterize the suspicious class. Furthermore, there is inherent imbalance in the sizes of the labeled "innocuous" and "suspicious" sets, even after some suspicious examples are identified. We present a comprehensive solution wherein a classifier learns to discriminate suspicious from innocuous based on derived p-value track features. Through active learning, our classifier thus learns the types of anomalies on which to base its discrimination. Our solution encompasses: i) judicious choice of kinematic p-value based features conditioned on the road of origin, along with more explicit features that capture unique vehicle behavior (e.g. U-turns); ii) novel semi-supervised learning that exploits information in the unlabeled (test batch) tracks, and iii) evaluation of several classifier models (logistic regression, SVMs). We find that two active labeling streams are necessary in practice in order to have efficient classifier learning while also forwarding (for labeling) the most actionable tracks. Experiments on wide-area motion imagery (WAMI) tracks, extracted via a system developed by Toyon Research Corporation, demonstrate the strong ROC AUC performance of our system, with sparing use of operator-based active labeling.
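A bare-bones version of the active-learning loop (uncertainty sampling with a logistic-regression classifier, not the authors' full two-stream system) looks like the sketch below. The track features, the simulated oracle, the seed-set sizes, and the query budget are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))                        # assumed p-value track features
oracle = (X[:, 0] + 0.5 * X[:, 1] > 1.5).astype(int)  # 1 = "suspicious" (simulated operator)

labeled = list(rng.choice(np.flatnonzero(oracle == 0), size=20, replace=False))
labeled += list(rng.choice(np.flatnonzero(oracle == 1), size=2, replace=False))

for _ in range(30):                                   # 30 operator queries
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], oracle[labeled])
    probs = clf.predict_proba(X)[:, 1]
    uncertainty = -np.abs(probs - 0.5)                # closest to 0.5 = most uncertain
    uncertainty[labeled] = -np.inf                    # never re-query labeled tracks
    labeled.append(int(np.argmax(uncertainty)))       # forward this track to the operator

print("tracks queried:", len(labeled), "| suspicious found:", int(oracle[labeled].sum()))
```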
Comparison of the hedonic general Labeled Magnitude Scale with the hedonic 9-point scale.
Kalva, Jaclyn J; Sims, Charles A; Puentes, Lorenzo A; Snyder, Derek J; Bartoshuk, Linda M
2014-02-01
The hedonic 9-point scale was designed to compare palatability among different food items; however, it has also been used occasionally to compare individuals and groups. Such comparisons can be invalid because scale labels (for example, "like extremely") can denote systematically different hedonic intensities across some groups. Addressing this problem, the hedonic general Labeled Magnitude Scale (gLMS) frames affective experience in terms of the strongest imaginable liking/disliking of any kind, which can yield valid group comparisons of food palatability provided extreme hedonic experiences are unrelated to food. For each scale, 200 panelists rated affect for remembered food products (including favorite and least favorite foods) and sampled foods; they also sampled taste stimuli (quinine, sucrose, NaCl, citric acid) and rated their intensity. Finally, subjects identified experiences representing the endpoints of the hedonic gLMS. Both scales were similar in their ability to detect within-subject hedonic differences across a range of food experiences, but group comparisons favored the hedonic gLMS. With the 9-point scale, extreme labels were strongly associated with extremes in food affect. In contrast, gLMS data showed that scale extremes referenced nonfood experiences. Perceived taste intensity significantly influenced differences in food liking/disliking (for example, those experiencing the most intense tastes, called supertasters, showed more extreme liking and disliking for their favorite and least favorite foods). Scales like the hedonic gLMS are suitable for across-group comparisons of food palatability. © 2014 Institute of Food Technologists®
Implementation Schedule for Soil Fumigant Safety Measures
This fact sheet summarizes the soil fumigant pesticide product label changes that are going into effect during each of two phases. New requirements cover worker protection, training, good agricultural practices, buffer zones, sign posting, and more.
1,3-Dichloropropene and Chloropicrin Combination Products Fumigant Safe Handling Guide
These soil fumigant pesticide products' labels require safety training according to the Worker Protection Standard (WPS). Steps to mitigate exposure include air monitoring, respiratory protection, and proper tarp perforation and removal.
40 CFR 1045.130 - What installation instructions must I give to vessel manufacturers?
Code of Federal Regulations, 2010 CFR
2010-07-01
... engine's emission control information label hard to read during normal engine maintenance, you must place... equivalent format. For example, you may post instructions on a publicly available Web site for downloading or...
40 CFR 1054.130 - What installation instructions must I give to equipment manufacturers?
Code of Federal Regulations, 2010 CFR
2010-07-01
... makes the engine's emission control information label hard to read during normal engine maintenance, you... in an equivalent format. For example, you may post instructions on a publicly available Web site for...
ERIC Educational Resources Information Center
Prater, Mary Anne
1987-01-01
A procedure for teaching concepts to elementary grade students includes the following four steps: (1) provide the definition and label; (2) present examples and nonexamples; (3) incorporate both instruction and practice; and (4) use a diagnostic classification test. (DB)
Technical Note: Deep learning based MRAC using rapid ultra-short echo time imaging.
Jang, Hyungseok; Liu, Fang; Zhao, Gengyan; Bradshaw, Tyler; McMillan, Alan B
2018-05-15
In this study, we explore the feasibility of a novel framework for MR-based attenuation correction for PET/MR imaging based on deep learning via convolutional neural networks, which enables fully automated and robust estimation of a pseudo CT image based on ultrashort echo time (UTE), fat, and water images obtained by a rapid MR acquisition. MR images for MRAC are acquired using dual echo ramped hybrid encoding (dRHE), where both UTE and out-of-phase echo images are obtained within a short single acquisition (35 sec). Tissue labeling of air, soft tissue, and bone in the UTE image is accomplished via a deep learning network that was pre-trained with T1-weighted MR images. UTE images are used as input to the network, which was trained using labels derived from co-registered CT images. The tissue labels estimated by deep learning are refined by a conditional random field based correction. The soft tissue labels are further separated into fat and water components using the two-point Dixon method. The estimated bone, air, fat, and water images are then assigned appropriate Hounsfield units, resulting in a pseudo CT image for PET attenuation correction. To evaluate the proposed MRAC method, PET/MR imaging of the head was performed on 8 human subjects, where Dice similarity coefficients of the estimated tissue labels and relative PET errors were evaluated through comparison to a registered CT image. Dice coefficients for air (within the head), soft tissue, and bone labels were 0.76±0.03, 0.96±0.006, and 0.88±0.01. In PET quantification, the proposed MRAC method produced relative PET errors less than 1% within most brain regions. The proposed MRAC method utilizing deep learning with transfer learning and an efficient dRHE acquisition enables reliable PET quantification with accurate and rapid pseudo CT generation. This article is protected by copyright. All rights reserved.
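The final pseudo-CT step (mapping estimated tissue labels to Hounsfield units) is simple enough to sketch directly. The label convention and the HU values below are approximate textbook numbers assumed for illustration, not the values used in the paper, and the label map is a random stand-in for a real segmentation slice.

```python
import numpy as np

# Assumed label convention: 0 = air, 1 = bone, 2 = fat, 3 = water/soft tissue.
hu_values = np.array([-1000.0, 700.0, -100.0, 0.0])    # approximate Hounsfield units

rng = np.random.default_rng(0)
label_map = rng.integers(0, 4, size=(8, 8))            # stand-in for an estimated label slice

pseudo_ct = hu_values[label_map]                       # per-voxel HU for attenuation correction
print(pseudo_ct)
```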
Hoffman, Keith B; Dimbil, Mo; Tatonetti, Nicholas P; Kyle, Robert F
2016-06-01
Many serious drug adverse events (AEs) only manifest well after regulatory approval. Therefore, the development of signaling methods to use with post-approval AE databases appears vital to comprehensively assess real-world drug safety. However, with millions of potential drug-AE pairs to analyze, the issue of focus is daunting. Our objective was to develop a signaling platform that focuses on AEs with historically demonstrated regulatory interest and to analyze such AEs with a disproportional reporting method that offers broad signal detection and acceptable false-positive rates. We analyzed over 1500 US FDA regulatory actions (safety communications and drug label changes) from 2008 to 2015 to construct a list of eligible signal AEs. The FDA Adverse Event Reporting System (FAERS) was used to evaluate disproportional reporting rates, constrained by minimum case counts and confidence interval limits, of these selected AEs for 109 training drugs. This step led to 45 AEs that appeared to have a low likelihood of being added to a label by FDA, so they were removed from the signal eligible list. We measured disproportional reporting for the final group of eligible AEs on a test group of 29 drugs that were not used in either the eligible list construction or the training steps. In a group of 29 test drugs, our model reduced the number of potential drug-AE signals from 41,834 to 97 and predicted 73 % of individual drug label changes. The model also predicted at least one AE-drug pair label change in 66 % of all the label changes for the test drugs. By concentrating on AE types with already demonstrated interest to FDA, we constructed a signaling system that provided focus regarding drug-AE pairs and suitable accuracy with regard to the issuance of FDA labeling changes. We suggest that focus on historical regulatory actions may increase the utility of pharmacovigilance signaling systems.
Technologies and combination therapies for enhancing movement training for people with a disability
2012-01-01
There has been a dramatic increase over the last decade in research on technologies for enhancing movement training and exercise for people with a disability. This paper reviews some of the recent developments in this area, using examples from a National Science Foundation initiated study of mobility research projects in Europe to illustrate important themes and key directions for future research. This paper also reviews several recent studies aimed at combining movement training with plasticity or regeneration therapies, again drawing in part from European research examples. Such combination therapies will likely involve complex interactions with motor training that must be understood in order to achieve the goal of eliminating severe motor impairment. PMID:22463132
Mayer, Annyce S; Brazile, William J; Erb, Samantha; Autenrieth, Daniel A; Serrano, Katherine; Van Dyke, Michael V
2015-05-01
In addition to formaldehyde, workers in salons can be exposed to other chemical irritants, sensitizers, carcinogens, reproductive hazards, infectious agents, ergonomic, and other physical hazards. Worker health and safety training is challenging because of current product labeling practices and the myriad of hazards portending risk for a wide variety of health effects. Through a Susan B. Harwood Targeted Topic Training grant from the Occupational Safety and Health Administration and assistance from salon development and training partners, we developed, delivered, and validated a health and safety training program using an iterative five-pronged approach. The training was well received and resulted in knowledge gain, improved workplace safety practices, and increased communication about health and safety. These training materials are available for download from the Occupational Safety and Health Administration's Susan B. Harwood Training Grant Program Web site.
Incremental Transductive Learning Approaches to Schistosomiasis Vector Classification
NASA Astrophysics Data System (ADS)
Fusco, Terence; Bi, Yaxin; Wang, Haiying; Browne, Fiona
2016-08-01
The key issue with collecting epidemic disease data for our analysis is that it is a labour-intensive, time-consuming and expensive process, resulting in only sparse sample data with which to develop prediction models. To address this sparse data issue, we present novel Incremental Transductive methods that circumvent the data collection process by applying previously acquired data to provide consistent, confidence-based labelling alternatives to field survey research. We investigated various reasoning approaches for semi-supervised machine learning, including Bayesian models, for labelling data. The results show that using the proposed methods, we can label instances of data with a class of vector density at a high level of confidence. By applying the Liberal and Strict Training Approaches, we provide a labelling and classification alternative to standalone algorithms. The methods in this paper are components in the process of reducing the proliferation of the Schistosomiasis disease and its effects.
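Confidence-based incremental labelling can be sketched as a self-training loop that only accepts predictions above a confidence threshold. The sketch below is illustrative, not the paper's method: the data are synthetic, a Gaussian naive Bayes classifier stands in for the Bayesian models mentioned above, and the "Liberal" and "Strict" thresholds are assumed values chosen only to show the contrast.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(40, 4))                 # sparse, manually surveyed samples
y_lab = rng.integers(0, 3, size=40)              # assumed vector-density classes
X_unl = rng.normal(size=(1000, 4))               # previously acquired, unlabelled data

def incremental_label(threshold, rounds=10):
    X, y = X_lab.copy(), y_lab.copy()
    pool = np.arange(len(X_unl))
    for _ in range(rounds):
        clf = GaussianNB().fit(X, y)
        probs = clf.predict_proba(X_unl[pool])
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break
        new_labels = clf.classes_[probs[confident].argmax(axis=1)]
        X = np.vstack([X, X_unl[pool][confident]])
        y = np.concatenate([y, new_labels])
        pool = pool[~confident]
    return len(X) - len(X_lab)

for name, t in [("Liberal", 0.7), ("Strict", 0.95)]:
    print(f"{name} approach (threshold {t}): {incremental_label(t)} instances labelled")
```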
NASA Astrophysics Data System (ADS)
Paredes, David; Saha, Ashirbani; Mazurowski, Maciej A.
2017-03-01
Deep learning and convolutional neural networks (CNNs) in particular are increasingly popular tools for segmentation and classification of medical images. CNNs were shown to be successful for segmentation of brain tumors into multiple regions or labels. However, in an environment which fosters data-sharing and collection of multi-institutional datasets, a question arises: does training with data from another institution with potentially different imaging equipment, contrast protocol, and patient population impact the segmentation performance of the CNN? Our study presents preliminary data towards answering this question. Specifically, we used MRI data of glioblastoma (GBM) patients for two institutions present in The Cancer Imaging Archive. We performed a process of training and testing a CNN multiple times such that half of the time the CNN was tested on data from the same institution that was used for training and half of the time it was tested on another institution, keeping the training and testing set size constant. We observed a decrease in performance as measured by Dice coefficient when the CNN was trained with data from a different institution as compared to training with data from the same institution. The changes in performance for the entire tumor and for four different labels within the tumor were: 0.72 to 0.65 (p=0.06), 0.61 to 0.58 (p=0.49), 0.54 to 0.51 (p=0.82), 0.31 to 0.24 (p<0.03), and 0.43 to 0.31 (p<0.003), respectively. In summary, we found that while data across institutions can be used for development of CNNs, this might be associated with a decrease in performance.
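For reference, the Dice coefficient used as the performance measure above can be computed as follows; this is the standard metric, not code from the study, and the masks are synthetic.

```python
import numpy as np

def dice(mask_a, mask_b, eps=1e-8):
    """Dice coefficient between two binary masks of the same shape."""
    mask_a, mask_b = mask_a.astype(bool), mask_b.astype(bool)
    return 2.0 * np.logical_and(mask_a, mask_b).sum() / (mask_a.sum() + mask_b.sum() + eps)

rng = np.random.default_rng(0)
ground_truth = rng.integers(0, 2, size=(64, 64))
prediction = ground_truth.copy()
prediction[:10] = 1 - prediction[:10]        # corrupt part of the prediction
print(f"Dice = {dice(ground_truth, prediction):.2f}")
```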
Emerging Themes in Image Informatics and Molecular Analysis for Digital Pathology.
Bhargava, Rohit; Madabhushi, Anant
2016-07-11
Pathology is essential for research in disease and development, as well as for clinical decision making. For more than 100 years, pathology practice has involved analyzing images of stained, thin tissue sections by a trained human using an optical microscope. Technological advances are now driving major changes in this paradigm toward digital pathology (DP). The digital transformation of pathology goes beyond recording, archiving, and retrieving images, providing new computational tools to inform better decision making for precision medicine. First, we discuss some emerging innovations in both computational image analytics and imaging instrumentation in DP. Second, we discuss molecular contrast in pathology. Molecular DP has traditionally been an extension of pathology with molecularly specific dyes. Label-free, spectroscopic images are rapidly emerging as another important information source, and we describe the benefits and potential of this evolution. Third, we describe multimodal DP, which is enabled by computational algorithms and combines the best characteristics of structural and molecular pathology. Finally, we provide examples of application areas in telepathology, education, and precision medicine. We conclude by discussing challenges and emerging opportunities in this area.
Adal, Kedir M; Sidibé, Désiré; Ali, Sharib; Chaum, Edward; Karnowski, Thomas P; Mériaudeau, Fabrice
2014-04-01
Despite several attempts, automated detection of microaneurysms (MAs) from digital fundus images remains an open problem. This is due to the subtle nature of MAs against the surrounding tissues. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs in an image, and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are introduced to characterize these blob regions. A semi-supervised learning approach, which requires only a few manually annotated examples, is also proposed to train a classifier that can detect true MAs. The developed system is built using only a few manually labeled retinal color fundus images and a large number of unlabeled ones. The performance of the overall system is evaluated on the Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the-art techniques as well as the applicability of the proposed features to analyze fundus images.
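Since the paper casts microaneurysm detection as finding blobs with automatic local-scale selection, a rough stand-in for the candidate-generation step can be sketched with scikit-image's Laplacian-of-Gaussian blob detector; the preprocessing, sigma range, and threshold here are illustrative assumptions rather than the authors' method.

```python
# Blob-candidate sketch for microaneurysm-like dark spots using a
# Laplacian-of-Gaussian detector with automatic scale selection.
import numpy as np
from skimage.feature import blob_log

def ma_candidates(green_channel, min_sigma=1, max_sigma=5, threshold=0.02):
    """Return (row, col, radius) candidates; MAs appear dark, so invert first."""
    img = green_channel.astype(float)
    img = (img.max() - img) / (np.ptp(img) + 1e-8)  # invert and normalise to [0, 1]
    blobs = blob_log(img, min_sigma=min_sigma, max_sigma=max_sigma,
                     num_sigma=10, threshold=threshold)
    blobs[:, 2] *= np.sqrt(2)  # convert the detected scale (sigma) to an approximate radius
    return blobs
```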
Clustering for unsupervised fault diagnosis in nuclear turbine shut-down transients
NASA Astrophysics Data System (ADS)
Baraldi, Piero; Di Maio, Francesco; Rigamonti, Marco; Zio, Enrico; Seraoui, Redouane
2015-06-01
Empirical methods for fault diagnosis usually entail a process of supervised training based on a set of examples of signal evolutions "labeled" with the corresponding, known classes of fault. However, in practice, the signals collected during plant operation are very often "unlabeled", i.e., the information on the corresponding type of fault that occurred is not available. To cope with this practical situation, in this paper we develop a methodology for the identification of transient signals showing similar characteristics, under the conjecture that operational/faulty transient conditions of the same type lead to similar behavior in the evolution of the measured signals. The methodology is founded on a feature extraction procedure, which feeds a spectral clustering technique embedding the unsupervised fuzzy C-means (FCM) algorithm, which evaluates the functional similarity among the different operational/faulty transients. A procedure for validating the plausibility of the obtained clusters, based on physical considerations, is also proposed. The methodology is applied to a real industrial case, on the basis of 148 shut-down transients of a Nuclear Power Plant (NPP) steam turbine.
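A structural sketch of the pipeline (feature extraction feeding a clustering step) is given below; it substitutes scikit-learn's crisp SpectralClustering for the embedded fuzzy C-means step of the paper, and the transient features are invented for illustration.

```python
# Illustrative sketch: extract simple features from shut-down transient signals
# and group them with spectral clustering.  The paper embeds an unsupervised
# fuzzy C-means step; a crisp SpectralClustering is used here as a simplification.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import SpectralClustering

def transient_features(signal):
    """A few coarse descriptors of one measured signal evolution."""
    s = np.asarray(signal, float)
    return [s.mean(), s.std(), s.min(), s.max(), s[-1] - s[0]]

def cluster_transients(transients, n_clusters=4):
    X = StandardScaler().fit_transform([transient_features(t) for t in transients])
    model = SpectralClustering(n_clusters=n_clusters, affinity="rbf", random_state=0)
    return model.fit_predict(X)  # one cluster index per transient

# Usage: labels = cluster_transients(list_of_1d_signal_arrays, n_clusters=4)
```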
Classification of document page images based on visual similarity of layout structures
NASA Astrophysics Data System (ADS)
Shin, Christian K.; Doermann, David S.
1999-12-01
Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout of a document contains a significant amount of information that can be used to classify a document's type in the absence of domain-specific models. A document type or genre can be defined by the user based primarily on layout structure. Our classification approach is based on 'visual similarity' of the layout structure, building a supervised classifier given examples of the class. We use image features such as the percentages of text and non-text (graphics, image, table, and ruling) content regions, column structures, variations in the point size of fonts, the density of content area, and various statistics on features of connected components, which can be derived from class samples without class knowledge. In order to obtain class labels for training samples, we conducted a user relevance test in which subjects ranked UW-I document images with respect to 12 representative images. We implemented our classification scheme using OC1, a decision tree classifier, and report our findings.
Dolman, Nick J; Kilgore, Jason A; Davidson, Michael W
2013-07-01
Fluorescent labeling of vesicular structures in cultured cells, particularly live cells, can be challenging for a number of reasons. The first challenge is to identify a reagent that is specific enough: some structures have a number of potential reagents, while others have very few options. The emergence of BacMam constructs has provided more easy-to-use choices. Presented here is a discussion of BacMam constructs as well as a review of commercially available reagents for labeling vesicular structures in cells, including endosomes, peroxisomes, lysosomes, and autophagosomes, complete with a featured reagent for each structure, a recommended protocol, a troubleshooting guide, and an example image.
Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine
NASA Technical Reports Server (NTRS)
Schwabacher, Mark A.; Aguilar, Robert; Figueroa, Fernando F.
2009-01-01
The goal of this work was to use data-driven methods to automatically detect and isolate faults in the J-2X rocket engine. It was decided to use decision trees, since they tend to be easier to interpret than other data-driven methods. The decision tree algorithm automatically "learns" a decision tree by performing a search through the space of possible decision trees to find one that fits the training data. The particular decision tree algorithm used is known as C4.5. Simulated J-2X data from a high-fidelity simulator developed at Pratt & Whitney Rocketdyne and known as the Detailed Real-Time Model (DRTM) was used to "train" and test the decision tree. Fifty-six DRTM simulations were performed for this purpose, with different leak sizes, different leak locations, and different times of leak onset. To make the simulations as realistic as possible, they included simulated sensor noise, and included a gradual degradation in both fuel and oxidizer turbine efficiency. A decision tree was trained using 11 of these simulations, and tested using the remaining 45 simulations. In the training phase, the C4.5 algorithm was provided with labeled examples of data from nominal operation and data including leaks in each leak location. From the data, it "learned" a decision tree that can classify unseen data as having no leak or having a leak in one of the five leak locations. In the test phase, the decision tree produced very low false alarm rates and low missed detection rates on the unseen data. It had very good fault isolation rates for three of the five simulated leak locations, but it tended to confuse the remaining two locations, perhaps because a large leak at one of these two locations can look very similar to a small leak at the other location.
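A minimal stand-in for this training step is sketched below; note that scikit-learn's DecisionTreeClassifier implements CART rather than C4.5, and the sensor features and synthetic data are assumptions, not DRTM outputs.

```python
# Sketch of training a decision tree to classify simulated engine data as
# nominal or as one of several leak locations.  scikit-learn's tree is
# CART-based, not C4.5, and the sensor feature names here are invented.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
n = 300
X = rng.normal(size=(n, 4))      # e.g. chamber pressure, fuel flow, ox flow, turbine speed
y = rng.integers(0, 6, size=n)   # 0 = nominal, 1..5 = leak locations (synthetic stand-in)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["pc", "fuel_flow", "ox_flow", "turb_speed"]))
```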
Shen, Wei-Bin; Vaccaro, Dennis E; Fishman, Paul S; Groman, Ernest V; Yarowsky, Paul
2016-05-01
This is the first report of the synthesis of a new nanoparticle, sans iron oxide rhodamine B (SIRB), an example of a new class of nanoparticles. SIRB is designed to provide all of the cell labeling properties of the ultrasmall superparamagnetic iron oxide (USPIO) nanoparticle Molday ION Rhodamine B (MIRB) without containing the iron oxide core. MIRB was developed to label cells and allow them to be tracked by MRI or to be manipulated by magnetic gradients. SIRB possesses a similar size, charge, and cross-linked dextran coating to MIRB. Of great interest is understanding the biological and physiological changes in cells after they are labeled with a USPIO. Whether these effects are due to the iron oxide buried within the nanoparticle or to the surface coating surrounding the iron oxide core has not been considered previously. MIRB and SIRB represent an ideal pairing of nanoparticles to identify the nanoparticle anatomy responsible for post-labeling cytotoxicity. Here we report the effects of SIRB labeling on the SH-SY5Y neuroblastoma cell line and primary human neuroprogenitor cells (hNPCs). These effects are contrasted with the effects of labeling SH-SY5Y cells and hNPCs with MIRB. We find that SIRB labeling, like MIRB labeling, (i) occurs without the use of transfection reagents, (ii) is packaged within lysosomes distributed within the cell cytoplasm, (iii) is retained within cells with no loss of label after cell storage, (iv) does not alter cellular viability or proliferation, and (v) allows SIRB-labeled hNPCs to differentiate normally into neurons or astrocytes.
Generating region proposals for histopathological whole slide image retrieval.
Ma, Yibing; Jiang, Zhiguo; Zhang, Haopeng; Xie, Fengying; Zheng, Yushan; Shi, Huaqiang; Zhao, Yu; Shi, Jun
2018-06-01
Content-based image retrieval is an effective method for histopathological image analysis. However, given a database of huge whole-slide images (WSIs), acquiring appropriate regions of interest (ROIs) for training is significant and difficult. Moreover, histopathological images can only be annotated by pathologists, resulting in a lack of labeling information. Therefore, it is an important and challenging task to generate ROIs from WSIs and retrieve images with few labels. This paper presents a novel unsupervised region proposing method for histopathological WSI based on Selective Search. Specifically, the WSI is over-segmented into regions which are hierarchically merged until the WSI becomes a single region. Nucleus-oriented similarity measures for region merging and a Nucleus-Cytoplasm color space for histopathological images are specially defined to generate accurate region proposals. Additionally, we propose a new semi-supervised hashing method for image retrieval. The semantic features of images are extracted with Latent Dirichlet Allocation and transformed into binary hashing codes with Supervised Hashing. The methods are tested on a large-scale multi-class database of breast histopathological WSIs. The results demonstrate that for one WSI, our region proposing method can generate 7.3 thousand contoured regions which fit well with 95.8% of the ROIs annotated by pathologists. The proposed hashing method can retrieve a query image among 136 thousand images in 0.29 s and reach a precision of 91% with only 10% of images labeled. The unsupervised region proposing method can generate regions as predictions of lesions in histopathological WSI. The region proposals can also serve as training samples to train machine-learning models for image retrieval. The proposed hashing method can achieve fast and precise image retrieval with a small amount of labels. Furthermore, the proposed methods can be potentially applied in online computer-aided-diagnosis systems.
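As a loose structural sketch of retrieval with compact codes, the example below extracts topic features with Latent Dirichlet Allocation and hashes them with random hyperplanes compared by Hamming distance; random-hyperplane hashing is a stand-in for the paper's Supervised Hashing, and the bag-of-feature count matrix is an assumed input.

```python
# Illustrative retrieval sketch: topic features (Latent Dirichlet Allocation)
# hashed to short binary codes and compared by Hamming distance.  Random
# hyperplane hashing replaces the paper's Supervised Hashing, so this is only
# a structural sketch, not the proposed method.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

def fit_hasher(counts, n_topics=16, n_bits=32, seed=0):
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed).fit(counts)
    planes = np.random.default_rng(seed).normal(size=(n_topics, n_bits))
    return lda, planes

def hash_codes(counts, lda, planes):
    theta = lda.transform(counts)                  # topic proportions per image/region
    return (theta @ planes > 0).astype(np.uint8)   # one bit per random hyperplane

def retrieve(query_code, db_codes, k=5):
    ham = (db_codes != query_code).sum(axis=1)     # Hamming distance to every database code
    return np.argsort(ham)[:k]                     # indices of the k nearest codes
```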
Query construction, entropy, and generalization in neural-network models
NASA Astrophysics Data System (ADS)
Sollich, Peter
1994-05-01
We study query construction algorithms, which aim at improving the generalization ability of systems that learn from examples by choosing optimal, nonredundant training sets. We set up a general probabilistic framework for deriving such algorithms from the requirement of optimizing a suitable objective function; specifically, we consider the objective functions entropy (or information gain) and generalization error. For two learning scenarios, the high-low game and the linear perceptron, we evaluate the generalization performance obtained by applying the corresponding query construction algorithms and compare it to training on random examples. We find qualitative differences between the two scenarios due to the different structure of the underlying rules (nonlinear and ``noninvertible'' versus linear); in particular, for the linear perceptron, random examples lead to the same generalization ability as a sequence of queries in the limit of an infinite number of examples. We also investigate learning algorithms which are ill matched to the learning environment and find that, in this case, minimum entropy queries can in fact yield a lower generalization ability than random examples. Finally, we study the efficiency of single queries and its dependence on the learning history, i.e., on whether the previous training examples were generated randomly or by querying, and the difference between globally and locally optimal query construction.
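In the spirit of the entropy objective discussed above, a generic maximum-entropy query selector can be sketched as follows; the probabilistic classifier interface is an assumption standing in for the paper's learner models.

```python
# Sketch of maximum-entropy query selection: among candidate inputs, query the
# one whose current predictive distribution is most uncertain.
import numpy as np

def predictive_entropy(proba):
    p = np.clip(proba, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_query(classifier, candidates):
    """Return the index of the candidate with maximum predictive entropy."""
    proba = classifier.predict_proba(candidates)
    return int(np.argmax(predictive_entropy(proba)))
```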
The Effect of Context on Training: Is Learning Situated?
1994-09-13
not underlie the central processes of ordinary everyday cognition? We think not." There are numerous examples where abstract instruction has been shown... instruction, concrete examples, and abstract rules and procedures. Claims made by proponents of Situated Learning Theory suggest that training must be... instruction. This argues against apprenticeship learning during early stages of acquisition for many skills. Further, too much fidelity in simulation may
Keiderling, Timothy A
2017-12-01
Isotope labeling has a long history in chemistry as a tool for probing structure, offering enhanced sensitivity, or enabling site selection with a wide range of spectroscopic tools. Chirality sensitive methods such as electronic circular dichroism are global structural tools and have intrinsically low resolution. Consequently, they are generally insensitive to modifications to enhance site selectivity. The use of isotope labeling to modify vibrational spectra with unique resolvable frequency shifts can provide useful site-specific sensitivity, and these methods have been recently more widely expanded in biopolymer studies. While the spectral shifts resulting from changes in isotopic mass can provide resolution of modes from specific parts of the molecule and can allow detection of local change in structure with perturbation, these shifts alone do not directly indicate structure or chirality. With vibrational circular dichroism (VCD), the shifted bands and their resultant sign patterns can be used to indicate local conformations in labeled biopolymers, particularly if multiple labels are used and if their coupling is theoretically modeled. This mini-review discusses selected examples of the use of labeling specific amides in peptides to develop local structural insight with VCD spectra.
Controlling off-label medication use.
Gillick, Muriel R
2009-03-03
Off-label prescribing may lead to innovative new uses of old medications, is essential in such fields as pediatrics, and avoids the lengthy and expensive process of modifying U.S. Food and Drug Administration (FDA) drug labeling. Using medications for unapproved indications, however, raises concerns about patient safety when the drugs have a high potential for toxicity and generates economic concerns when their cost is high. A possible means of controlling the use of off-label drugs is to focus on medications used off-label that are both expensive and potentially risky. These are principally biotechnology drugs, such as recombinant enzymes, cytokines, and monoclonal antibodies. This article suggests a 2-step process for controlling use of such drugs, analogous to that used for devices. Once a drug is FDA approved, it would undergo scrutiny using the Centers for Medicare & Medicaid Services (CMS) National Coverage Determination method if its cost exceeds a specified benchmark (for example, $12 000, which is the average cost of a pacemaker). The CMS would pay only for off-label uses for which there is adequate evidence in its National Coverage Determination process. Other insurance companies would probably adopt the recommendations of CMS.
Product Recommendation System Based on Personal Preference Model Using CAM
NASA Astrophysics Data System (ADS)
Murakami, Tomoko; Yoshioka, Nobukazu; Orihara, Ryohei; Furukawa, Koichi
A product recommendation system is realized by applying business rules acquired through data mining techniques. Business rules, such as demographic patterns of purchase, can cover the groups of users that have a tendency to purchase products, but it is difficult to recommend products adapted to various personal preferences by utilizing such rules alone. In addition, it is very costly to gather the large volume of high-quality survey data that is necessary for good recommendation based on a personal preference model. A method for collecting kansei information automatically, without questionnaire surveys, is required. Constructing a personal preference model from limited preference data is also necessary, since it is costly for the user to input such data. In this paper, we propose a product recommendation system based on kansei information extracted by text mining and a user preference model constructed by Category-guided Adaptive Modeling, CAM for short. CAM is a feature construction method that can generate new features, constructing a space in which examples with the same label are close together and examples with different labels are far apart, given some labeled examples. It is possible to construct a personal preference model by CAM despite limited information about liked and disliked categories. In the system, a retrieval agent gathers product specifications, and a user agent manages the preference model and the user's likes and dislikes. Kansei information about the products is obtained by applying text mining techniques to reputation documents about the products on web sites. We carry out experimental studies to confirm that the preference model obtained by our method performs effectively.
Huang, Yue; Zheng, Han; Liu, Chi; Ding, Xinghao; Rohde, Gustavo K
2017-11-01
Epithelium-stroma classification is a necessary preprocessing step in histopathological image analysis. Current deep learning based recognition methods for histology data require collection of large volumes of labeled data in order to train a new neural network when there are changes to the image acquisition procedure. However, it is extremely expensive for pathologists to manually label sufficient volumes of data for each pathology study in a professional manner, which limits real-world applications. A very simple but effective deep learning method, which introduces the concept of unsupervised domain adaptation to a simple convolutional neural network (CNN), is proposed in this paper. Inspired by transfer learning, our paper assumes that the training data and testing data follow different distributions, and an adaptation operation is applied to more accurately estimate the kernels of the CNN during feature extraction, in order to enhance performance by transferring knowledge from labeled data in the source domain to unlabeled data in the target domain. The model has been evaluated using three independent public epithelium-stroma datasets by cross-dataset validation. The experimental results demonstrate that for epithelium-stroma classification, the proposed framework outperforms the state-of-the-art deep neural network model, and it also achieves better performance than other existing deep domain adaptation methods. The proposed model can be considered a better option for real-world applications in histopathological image analysis, since there is no longer a requirement for large-scale labeled data in each specified domain.
Fusing Continuous-Valued Medical Labels Using a Bayesian Model.
Zhu, Tingting; Dunkley, Nic; Behar, Joachim; Clifton, David A; Clifford, Gari D
2015-12-01
With the rapid increase in volume of time series medical data available through wearable devices, there is a need to employ automated algorithms to label data. Examples of labels include interventions, changes in activity (e.g. sleep) and changes in physiology (e.g. arrhythmias). However, automated algorithms tend to be unreliable, resulting in lower quality care. Expert annotations are scarce, expensive, and prone to significant inter- and intra-observer variance. To address these problems, a Bayesian Continuous-valued Label Aggregator (BCLA) is proposed to provide a reliable estimation of label aggregation while accurately inferring the precision and bias of each algorithm. The BCLA was applied to QT interval (pro-arrhythmic indicator) estimation from the electrocardiogram using labels from the 2006 PhysioNet/Computing in Cardiology Challenge database. It was compared to mean and median voting and to a previously proposed Expectation Maximization (EM) label aggregation approach. While accurately predicting each labelling algorithm's bias and precision, the root-mean-square error of the BCLA was 11.78 ± 0.63 ms, significantly outperforming the best Challenge entry (15.37 ± 2.13 ms) as well as the EM, mean, and median voting strategies (14.76 ± 0.52, 17.61 ± 0.55, and 14.43 ± 0.57 ms, respectively, with p < 0.0001). The BCLA could therefore provide accurate estimation for medical continuous-valued label tasks in an unsupervised manner even when the ground truth is not available.
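A rough illustration of weighting annotators or algorithms by inferred bias and precision is sketched below; it is a simple iterative scheme, not the Bayesian BCLA model, and the initialisation and iteration count are assumptions.

```python
# Simple iterative precision/bias-weighted aggregation of continuous labels
# (e.g. QT estimates in ms) from several algorithms.  This only illustrates the
# idea of weighting annotators by inferred bias and precision; it is not BCLA.
import numpy as np

def aggregate(labels, n_iter=20):
    """labels: (n_annotators, n_records) array of continuous annotations."""
    labels = np.asarray(labels, float)
    estimate = np.median(labels, axis=0)            # robust initialisation
    for _ in range(n_iter):
        bias = (labels - estimate).mean(axis=1, keepdims=True)   # per-annotator bias
        resid = labels - bias - estimate
        var = resid.var(axis=1, keepdims=True) + 1e-9
        w = 1.0 / var                               # precision weights per annotator
        estimate = ((labels - bias) * w).sum(axis=0) / w.sum(axis=0)
    return estimate, bias.ravel(), np.sqrt(var).ravel()
```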
Spatially adapted augmentation of age-specific atlas-based segmentation using patch-based priors
NASA Astrophysics Data System (ADS)
Liu, Mengyuan; Seshamani, Sharmishtaa; Harrylock, Lisa; Kitsch, Averi; Miller, Steven; Chau, Van; Poskitt, Kenneth; Rousseau, Francois; Studholme, Colin
2014-03-01
One of the most common approaches to MRI brain tissue segmentation is to employ an atlas prior to initialize an Expectation-Maximization (EM) image labeling scheme using a statistical model of MRI intensities. This prior is commonly derived from a set of manually segmented training data from the population of interest. However, in cases where subject anatomy varies significantly from the prior anatomical average model (for example in the case where extreme developmental abnormalities or brain injuries occur), the prior tissue map does not provide adequate information about the observed MRI intensities to ensure the EM algorithm converges to an anatomically accurate labeling of the MRI. In this paper, we present a novel approach for automatic segmentation of such cases. This approach augments the atlas-based EM segmentation by exploring methods to build a hybrid tissue segmentation scheme that seeks to learn where an atlas prior fails (due to inadequate representation of anatomical variation in the statistical atlas) and utilize an alternative prior derived from a patch driven search of the atlas data. We describe a framework for incorporating this patch-based augmentation of EM (PBAEM) into a 4D age-specific atlas-based segmentation of developing brain anatomy. The proposed approach was evaluated on a set of MRI brain scans of premature neonates with ages ranging from 27.29 to 46.43 gestational weeks (GWs). Results indicated superior performance compared to the conventional atlas-based segmentation method, providing improved segmentation accuracy for gray matter, white matter, ventricles and sulcal CSF regions.
Costello, Nessan; Deighton, Kevin; Preston, Thomas; Matu, Jamie; Rowe, Joshua; Sawczuk, Thomas; Halkier, Matt; Read, Dale B; Weaving, Daniel; Jones, Ben
2018-06-01
Collision sports are characterised by frequent high-intensity collisions that induce substantial muscle damage, potentially increasing the energetic cost of recovery. Therefore, this study investigated the energetic cost of collision-based activity for the first time across any sport. Using a randomised crossover design, six professional young male rugby league players completed two different 5-day pre-season training microcycles. Players completed either a collision (COLL; 20 competitive one-on-one collisions) or non-collision (nCOLL; matched for kinematic demands, excluding collisions) training session on the first day of each microcycle, exactly 7 days apart. All remaining training sessions were matched and did not involve any collision-based activity. Total energy expenditure was measured using doubly labelled water, the literature gold standard. Collisions resulted in a very likely higher (4.96 ± 0.97 MJ; ES = 0.30 ± 0.07; p = 0.0021) total energy expenditure across the 5-day COLL training microcycle (95.07 ± 16.66 MJ) compared with the nCOLL training microcycle (90.34 ± 16.97 MJ). The COLL training session also resulted in a very likely higher (200 ± 102 AU; ES = 1.43 ± 0.74; p = 0.007) session rating of perceived exertion and a very likely greater (- 14.6 ± 3.3%; ES = - 1.60 ± 0.51; p = 0.002) decrease in wellbeing 24 h later. A single collision training session considerably increased total energy expenditure. This may explain the large energy expenditures of collision-sport athletes, which appear to exceed kinematic training and match demands. These findings suggest fuelling professional collision-sport athletes appropriately for the "muscle damage caused" alongside the kinematic "work required".
An evaluation of consensus techniques for diagnostic interpretation
NASA Astrophysics Data System (ADS)
Sauter, Jake N.; LaBarre, Victoria M.; Furst, Jacob D.; Raicu, Daniela S.
2018-02-01
Learning diagnostic labels from image content has been the standard in computer-aided diagnosis. Most computer-aided diagnosis systems use low-level image features extracted directly from image content to train and test machine learning classifiers for diagnostic label prediction. When the ground truth for the diagnostic labels is not available, reference truth is generated from the experts' diagnostic interpretations of the image/region of interest. More specifically, when the label is uncertain, e.g. when multiple experts label an image and their interpretations differ, techniques to handle the label variability are necessary. In this paper, we compare three consensus techniques that are typically used to encode the variability in the experts' labeling of medical data: mean, median and mode, and their effects on simple classifiers that can handle deterministic labels (decision trees) and probabilistic vectors of labels (belief decision trees). Given that the NIH/NCI Lung Image Database Consortium (LIDC) data provides interpretations for lung nodules by up to four radiologists, we leverage the LIDC data to evaluate and compare these consensus approaches when creating computer-aided diagnosis systems for lung nodules. First, low-level image features of nodules are extracted and paired with their radiologists' semantic ratings (1 = most likely benign, ..., 5 = most likely malignant); second, machine learning multi-class classifiers that handle deterministic labels (decision trees) and probabilistic vectors of labels (belief decision trees) are built to predict the lung nodules' semantic ratings. We show that the mean-based consensus generates the most robust classifier overall when compared to the median- and mode-based consensus. Lastly, the results of this study show that, when building CAD systems with uncertain diagnostic interpretation, it is important to evaluate different strategies for encoding and predicting the diagnostic label.
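The three consensus strategies compared above can be written compactly as follows; the handling of probabilistic label vectors for belief decision trees is omitted, and the missing-reader convention (NaN) is an assumption.

```python
# Mean, median, and mode consensus applied to per-nodule malignancy ratings
# from up to four radiologists (missing ratings encoded as NaN).
import numpy as np

def consensus(ratings):
    """ratings: (n_nodules, n_readers) array with NaN for missing readers."""
    r = np.asarray(ratings, float)
    mean = np.nanmean(r, axis=1)
    median = np.nanmedian(r, axis=1)
    mode = []
    for row in r:
        vals = row[~np.isnan(row)].astype(int)
        counts = np.bincount(vals, minlength=6)   # ratings are on a 1..5 scale
        mode.append(counts.argmax())              # ties resolved toward the lower rating
    return mean, median, np.array(mode)

# Example: consensus([[3, 4, 4, np.nan], [1, 2, 1, 1]])
```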
Exposure Control--OSHA's Bloodborne Pathogens Standard.
ERIC Educational Resources Information Center
Granville, Mark F.
1993-01-01
Explains schools' responsibilities in complying with the Occupational Safety and Health Administration's (OSHA) Bloodborne Pathogens Standard. Describes exposure determination plan, protective equipment, housekeeping practices, labeling of waste, training employees, hepatitis B vaccinations, postexposure evaluation and medical follow-up, and…
He, Dengchao; Zhang, Hongjun; Hao, Wenning; Zhang, Rui; Cheng, Kai
2017-07-01
Distant supervision, a widely applied approach in the field of relation extraction, can automatically generate large amounts of labeled training corpus with minimal manual effort. However, the labeled training corpus may contain many false-positive examples, which hurt the performance of relation extraction. Moreover, traditional feature-based distant supervised approaches rely on hand-designed features produced with natural language processing tools, which may also cause poor performance. To address these two shortcomings, we propose a customized attention-based long short-term memory network. Our approach adopts word-level attention to achieve better data representation for relation extraction without manually designed features, performing distant supervision instead of fully supervised relation extraction, and it utilizes instance-level attention to tackle the problem of false-positive data. Experimental results demonstrate that our proposed approach is effective and achieves better performance than traditional methods.
Emotion computing using Word Mover's Distance features based on Ren_CECps.
Ren, Fuji; Liu, Ning
2018-01-01
In this paper, we propose an emotion-separated method (SeTF·IDF) to assign the emotion labels of sentences different values, which yields a better visual effect than TF·IDF values in the visualization of the multi-label Chinese emotional corpus Ren_CECps. Inspired by the large improvement of the visualization map driven by the changed distances among the sentences, we are, to our knowledge, the first group to utilize the Word Mover's Distance (WMD) algorithm as a form of feature representation in Chinese text emotion classification. Our experiments show that, in both the 80%/20% and 50%/50% train/test splits of Ren_CECps, WMD features obtain the best f1 scores and show a greater improvement than feature vectors of the same dimension obtained by a dimension-reduced TF·IDF method. Comparison experiments on an English corpus also show the efficiency of WMD features in the cross-language setting.
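One readily available WMD implementation is gensim's KeyedVectors.wmdistance; the sketch below turns WMD into a feature vector by measuring distances to a fixed set of reference sentences, which is an illustrative construction rather than the authors' pipeline (the embedding path, reference set, and whitespace tokenization are assumptions, and gensim's wmdistance needs an optimal-transport backend such as POT installed).

```python
# Sketch: represent each sentence by its Word Mover's Distance to a fixed set
# of reference sentences, producing a WMD-based feature vector per sentence.
from gensim.models import KeyedVectors

def wmd_features(sentences, references, vectors_path="embeddings.bin"):
    kv = KeyedVectors.load_word2vec_format(vectors_path, binary=True)
    tokenized_refs = [r.split() for r in references]   # whitespace tokenization (assumption)
    feats = []
    for s in sentences:
        tokens = s.split()
        feats.append([kv.wmdistance(tokens, ref) for ref in tokenized_refs])
    return feats  # one row of WMD-based features per input sentence
```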
Relabeling exchange method (REM) for learning in neural networks
NASA Astrophysics Data System (ADS)
Wu, Wen; Mammone, Richard J.
1994-02-01
The supervised training of neural networks requires the use of output labels, which are usually arbitrarily assigned. In this paper it is shown that there is a significant difference in the rms error of learning when 'optimal' label assignment schemes are used. We have investigated two efficient random search algorithms to solve the relabeling problem: simulated annealing and the genetic algorithm. However, we found them to be computationally expensive. Therefore we introduce a new heuristic algorithm called the Relabeling Exchange Method (REM), which is computationally more attractive and produces optimal performance. REM has been used to organize the optimal structure for multi-layered perceptrons and neural tree networks. The method is a general one and can be implemented as a modification to standard training algorithms. The motivation for the new relabeling strategy is based on the present interpretation of dyslexia as an encoding problem.
40 CFR 1048.130 - What installation instructions must I give to equipment manufacturers?
Code of Federal Regulations, 2010 CFR
2010-07-01
... engine in a way that makes the engine's emission control information label hard to read during normal...) Provide instructions in writing or in an equivalent format. For example, you may post instructions on a...
40 CFR 1051.130 - What installation instructions must I give to vehicle manufacturers?
Code of Federal Regulations, 2010 CFR
2010-07-01
... you install the engine in a way that makes the engine's emission control information label hard to read.... (d) Provide instructions in writing or in an equivalent format. For example, you may post...
What Is It All For? A Conversation With Jonathan Kozol
ERIC Educational Resources Information Center
Journal of Current Social Issues, 1976
1976-01-01
The main complaint with the current educational system in the U.S. is that public schools are classic examples of consumer fraud in the most literal sense--false labels, half empty packages, and dangerous contents. (Author/AM)
Qiu, Wang-Ren; Zheng, Quan-Shu; Sun, Bi-Qian; Xiao, Xuan
2017-03-01
Predicting protein phosphorylation is a challenging problem, particularly when query proteins have multi-label features, meaning that they may be phosphorylated at two or more different types of amino acid residues. In fact, human proteins are usually phosphorylated at serine, threonine and tyrosine. By introducing the "multi-label learning" approach, a novel predictor has been developed that can be used to deal with systems containing both single- and multi-label phosphorylation proteins. Here we proposed a predictor called Multi-iPPseEvo by (1) incorporating the protein sequence evolutionary information into the general pseudo amino acid composition (PseAAC) via the grey system theory, (2) balancing out the skewed training datasets by the asymmetric bootstrap approach, and (3) constructing an ensemble predictor by fusing an array of individual random forest classifiers through a voting system. Rigorous cross-validations via a set of multi-label metrics indicate that the multi-label phosphorylation predictor is very promising and encouraging. The current approach represents a new strategy to deal with multi-label biological problems, and the software is freely available for academic use at http://www.jci-bioinfo.cn/Multi-iPPseEvo.
[Information perceived by consumers through food labeling on fats: a systematic review].
Sebastian-Ponce, Miren Itxaso; Sanz-Valero, Javier; Wanden-Berghe, Carmina
2014-11-22
To review the scientific literature related to the information given to consumers about different types of fats in foods through food labeling. Systematic review of the data found in the MEDLINE (via PubMed), EMBASE, CINAHL, FSTA, Web of Science, Cochrane Library, SCOPUS and LILACS databases, up to September 2013. The terms used as descriptors and free text were "dietary fats", "dietary fats, unsaturated" and "food labeling". The limit "human" was used. 549 references were retrieved, of which 36 articles were selected after applying the inclusion and exclusion criteria. The main effects related to labeling information were linked to the price and place of purchase/consumption, sensory dimensions, dietary habits, interpretation, and educational logos. Food labeling on fat content helps when making consumption decisions. Nutrition education and the meanings of food labels are essential and were effective, although the "informed consumer" is yet to be achieved. Training activities should be directed towards prior beliefs and attitudes of consumers in order to make the health and nutrition message consistent. Food labels should be homogeneous and truthful in terms of expressing composition or presenting logos, and messages included in the packaging should be clear and not misleading.
A Semantic Labeling of the Environment Based on What People Do.
Crespo, Jonathan; Gómez, Clara; Hernández, Alejandra; Barber, Ramón
2017-01-29
In this work, a system is developed for semantic labeling of locations based on what people do, which is useful for the semantic navigation of mobile robots. The system differentiates environments according to what people do in them. Background sound, the number of people in a room, and the amount of movement of those people are the cues considered when trying to tell whether people are performing different actions. These data are sampled, under the assumption that people behave differently and perform different actions in different environments. A support vector machine is trained with the obtained samples, allowing the system to identify the room. Finally, the results are discussed and support the hypothesis that the proposed system can help to semantically label a room.
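A minimal sketch of such a classifier, trained on the three cues named above (background sound level, number of people, amount of movement), might look like the following; the example data, scaling, and kernel are assumptions.

```python
# Minimal sketch of the room-labelling classifier: an SVM over three cues
# (background sound level, people count, movement index).  Example data and
# hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Each sample: [sound_level_dB, people_count, movement_index]; label = room type.
X = np.array([[35, 1, 0.1], [60, 8, 0.7], [45, 3, 0.4], [70, 15, 0.9]])
y = np.array(["office", "cafeteria", "lab", "gym"])

room_clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
room_clf.fit(X, y)
print(room_clf.predict([[58, 9, 0.6]]))  # expected to resemble the cafeteria sample
```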
Image segmentation via foreground and background semantic descriptors
NASA Astrophysics Data System (ADS)
Yuan, Ding; Qiang, Jingjing; Yin, Jihao
2017-09-01
In the field of image processing, it has been a challenging task to obtain a complete foreground that is not uniform in color or texture. Unlike other methods, which segment the image by only using low-level features, we present a segmentation framework, in which high-level visual features, such as semantic information, are used. First, the initial semantic labels were obtained by using the nonparametric method. Then, a subset of the training images, with a similar foreground to the input image, was selected. Consequently, the semantic labels could be further refined according to the subset. Finally, the input image was segmented by integrating the object affinity and refined semantic labels. State-of-the-art performance was achieved in experiments with the challenging MSRC 21 dataset.
Zhao, Qiang; Lv, Qin; Wang, Hailin
2015-08-15
We previously reported a fluorescence anisotropy (FA) approach for small molecules using tetramethylrhodamine (TMR) labeled aptamers. It relies on target-binding-induced change of the intramolecular interaction between TMR and a guanine (G) base. TMR-labeling sites are crucial for this approach. Only terminal ends and thymine (T) bases could be tested for TMR labeling in our previous work, possibly limiting the analysis of different targets with this FA strategy. Here, taking the analysis of adenosine triphosphate (ATP) as an example, we demonstrate successful conjugation of TMR to other aptamer bases, adenine (A) or cytosine (C), and a full mapping of the various labeling sites of the aptamer. We successfully constructed aptamer FA sensors for ATP. We conjugated a single TMR on adenine (A), cytosine (C), or thymine (T) bases or the terminals of a 25-mer aptamer against ATP and tested the FA responses of 14 TMR-labeled aptamers to ATP. The aptamers having TMR labeled on the 16th base C or the 23rd base A were screened out and exhibited significant FA-decreasing or FA-increasing responses upon ATP binding, respectively. These two favorable TMR-labeled aptamers enabled direct FA sensing of ATP with a detection limit of 1 µM and the analysis of ATP in diluted serum. The comprehensive screening of various TMR labeling sites of aptamers facilitates the successful construction of FA sensors using TMR-labeled aptamers. It will expand the application of the TMR-G interaction based aptamer FA strategy to a variety of targets.
Towards Autonomous Agriculture: Automatic Ground Detection Using Trinocular Stereovision
Reina, Giulio; Milella, Annalisa
2012-01-01
Autonomous driving is a challenging problem, particularly when the domain is unstructured, as in an outdoor agricultural setting. Thus, advanced perception systems are primarily required to sense and understand the surrounding environment recognizing artificial and natural structures, topology, vegetation and paths. In this paper, a self-learning framework is proposed to automatically train a ground classifier for scene interpretation and autonomous navigation based on multi-baseline stereovision. The use of rich 3D data is emphasized where the sensor output includes range and color information of the surrounding environment. Two distinct classifiers are presented, one based on geometric data that can detect the broad class of ground and one based on color data that can further segment ground into subclasses. The geometry-based classifier features two main stages: an adaptive training stage and a classification stage. During the training stage, the system automatically learns to associate geometric appearance of 3D stereo-generated data with class labels. Then, it makes predictions based on past observations. It serves as well to provide training labels to the color-based classifier. Once trained, the color-based classifier is able to recognize similar terrain classes in stereo imagery. The system is continuously updated online using the latest stereo readings, thus making it feasible for long range and long duration navigation, over changing environments. Experimental results, obtained with a tractor test platform operating in a rural environment, are presented to validate this approach, showing an average classification precision and recall of 91.0% and 77.3%, respectively.
Photoacoustic microscopy of single cells employing an intensity-modulated diode laser
NASA Astrophysics Data System (ADS)
Langer, Gregor; Buchegger, Bianca; Jacak, Jaroslaw; Dasa, Manoj Kumar; Klar, Thomas A.; Berer, Thomas
2018-02-01
In this work, we employ frequency-domain photoacoustic microscopy to obtain photoacoustic images of labeled and unlabeled cells. The photoacoustic microscope is based on an intensity-modulated diode laser in combination with a focused piezo-composite transducer and allows imaging of labeled cells without severe photo-bleaching. We demonstrate that frequency-domain photoacoustic microscopy realized with a diode laser is capable of recording photoacoustic images of single cells with sub-µm resolution. As examples, we present images of undyed human red blood cells, stained human epithelial cells, and stained yeast cells.
Joint Feature Selection and Classification for Multilabel Learning.
Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong
2018-03-01
Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.
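JFSC learns shared and label-specific features jointly; the sketch below only illustrates the simpler idea of per-label feature selection, fitting one L1-regularised logistic regression per label and reading off the nonzero coefficients. It is not the JFSC algorithm.

```python
# Not the JFSC algorithm: a simpler illustration of label-specific feature
# selection in multilabel learning, using one L1-regularised logistic
# regression per label and keeping features with nonzero coefficients.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, Y = make_multilabel_classification(n_samples=300, n_features=20,
                                      n_classes=5, random_state=0)
ovr = OneVsRestClassifier(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5)).fit(X, Y)

for label_idx, est in enumerate(ovr.estimators_):
    selected = np.flatnonzero(est.coef_.ravel())   # features kept for this label
    print(f"label {label_idx}: {len(selected)} features selected -> {selected[:8]}")
```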
Investigation of RNA Synthesis Using 5-Bromouridine Labelling and Immunoprecipitation.
Kofoed, Rikke H; Betzer, Cristine; Lykke-Andersen, Søren; Molska, Ewa; Jensen, Poul H
2018-05-03
When steady state RNA levels are compared between two conditions, it is not possible to distinguish whether changes are caused by alterations in production or degradation of RNA. This protocol describes a method for measurement of RNA production, using 5-Bromouridine labelling of RNA followed by immunoprecipitation, which enables investigation of RNA synthesized within a short timeframe (e.g., 1 h). The advantage of 5-Bromouridine-labelling and immunoprecipitation over the use of toxic transcriptional inhibitors, such as α-amanitin and actinomycin D, is that there are no or very low effects on cell viability during short-term use. However, because 5-Bromouridine-immunoprecipitation only captures RNA produced within the short labelling time, slowly produced as well as rapidly degraded RNA can be difficult to measure by this method. The 5-Bromouridine-labelled RNA captured by 5-Bromouridine-immunoprecipitation can be analyzed by reverse transcription, quantitative polymerase chain reaction, and next generation sequencing. All types of RNA can be investigated, and the method is not limited to measuring mRNA as is presented in this example.