Using machine learning for sequence-level automated MRI protocol selection in neuroradiology.
Brown, Andrew D; Marotta, Thomas R
2018-05-01
Incorrect imaging protocol selection can lead to important clinical findings being missed, contributing to both wasted health care resources and patient harm. We present a machine learning method for analyzing the unstructured text of clinical indications and patient demographics from magnetic resonance imaging (MRI) orders to automatically protocol MRI procedures at the sequence level. We compared 3 machine learning models - support vector machine, gradient boosting machine, and random forest - to a baseline model that predicted the most common protocol for all observations in our test set. The gradient boosting machine model significantly outperformed the baseline and demonstrated the best performance of the 3 models in terms of accuracy (95%), precision (86%), recall (80%), and Hamming loss (0.0487). This demonstrates the feasibility of automating sequence selection by applying machine learning to MRI orders. Automated sequence selection has important safety, quality, and financial implications and may facilitate improvements in the quality and safety of medical imaging service delivery.
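The Hamming loss quoted above is the fraction of individual label slots that are predicted incorrectly in a multi-label task. The following is a minimal sketch, assuming each protocol is encoded as a binary vector with one position per candidate sequence; the column names and example vectors are hypothetical and are not taken from the study.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label positions where prediction and truth disagree."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    wrong = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row pair must have the same number of labels")
        wrong += sum(t != p for t, p in zip(true_row, pred_row))
        total += len(true_row)
    return wrong / total


# Hypothetical label columns: T1, T2, FLAIR, DWI (1 = sequence is part of the protocol)
y_true = [[1, 1, 1, 0], [1, 0, 1, 1]]
y_pred = [[1, 1, 0, 0], [1, 0, 1, 1]]
loss = hamming_loss(y_true, y_pred)  # one wrong slot out of eight
print(loss)
```

A lower value is better; a loss of 0 means every sequence of every order was predicted correctly.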
Narula, Sukrit; Shameer, Khader; Salem Omar, Alaa Mabrouk; Dudley, Joel T; Sengupta, Partho P
2016-11-29
Machine-learning models may aid cardiac phenotypic recognition by using features of cardiac tissue deformation. This study investigated the diagnostic value of a machine-learning framework that incorporates speckle-tracking echocardiographic data for automated discrimination of hypertrophic cardiomyopathy (HCM) from physiological hypertrophy seen in athletes (ATH). Expert-annotated speckle-tracking echocardiographic datasets obtained from 77 ATH and 62 HCM patients were used for developing an automated system. An ensemble machine-learning model with 3 different machine-learning algorithms (support vector machines, random forests, and artificial neural networks) was developed and a majority voting method was used for conclusive predictions with further K-fold cross-validation. Feature selection using an information gain (IG) algorithm revealed that volume was the best predictor for differentiating between HCM and ATH (IG = 0.24) followed by mid-left ventricular segmental (IG = 0.134) and average longitudinal strain (IG = 0.131). The ensemble machine-learning model showed increased sensitivity and specificity compared with early-to-late diastolic transmitral velocity ratio (p < 0.01), average early diastolic tissue velocity (e') (p < 0.01), and strain (p = 0.04). Because ATH were younger, adjusted analysis was undertaken in younger HCM patients and compared with ATH with left ventricular wall thickness >13 mm. In this subgroup analysis, the automated model continued to show equal sensitivity, but increased specificity relative to early-to-late diastolic transmitral velocity ratio, e', and strain. Our results suggested that machine-learning algorithms can assist in the discrimination of physiological versus pathological patterns of hypertrophic remodeling. This effort represents a step toward the development of a real-time, machine-learning-based system for automated interpretation of echocardiographic images, which may help novice readers with limited experience.
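The majority voting step mentioned in this abstract can be illustrated with a small sketch. It assumes that each of the three classifiers emits one class label per patient and that the ensemble returns the most frequent label; the vote lists below are hypothetical, and the study's actual implementation is not described in the abstract.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    counts = Counter(predictions)
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical votes from three classifiers (SVM, random forest, neural network)
votes_per_patient = [
    ["HCM", "HCM", "ATH"],
    ["ATH", "ATH", "ATH"],
    ["ATH", "HCM", "ATH"],
]
ensemble = [majority_vote(votes) for votes in votes_per_patient]
print(ensemble)
```

With three voters and two classes a tie cannot occur, which is one reason an odd number of base classifiers is a convenient choice.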
Copyright © 2016 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
2011-01-01
Background: Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. Results: This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment.
Conclusions: AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements. PMID:21798025
Stålring, Jonna C; Carlsson, Lars A; Almeida, Pedro; Boyer, Scott
2011-07-28
Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment. 
AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements.
Applications of Machine Learning and Rule Induction
1995-02-15
An important area of application for machine learning is in automating the acquisition of knowledge bases required for expert systems. In this paper...we review the major paradigms for machine learning, including neural networks, instance-based methods, genetic learning, rule induction, and analytic
Automated discovery systems and the inductivist controversy
NASA Astrophysics Data System (ADS)
Giza, Piotr
2017-09-01
The paper explores possible influences that some developments in the branches of AI called automated discovery and machine learning systems might have upon some aspects of the old debate between Francis Bacon's inductivism and Karl Popper's falsificationism. Donald Gillies facetiously calls this controversy 'the duel of two English knights', and claims, after some analysis of historical cases of discovery, that Baconian induction had been used in science very rarely, or not at all, although he argues that the situation has changed with the advent of machine learning systems. (Some clarification of the terms machine learning and automated discovery is required here. The key idea of machine learning is that, given data with associated outcomes, software can be trained to make those associations in future cases, which typically amounts to inducing some rules from individual cases classified by the experts. Automated discovery (also called machine discovery) deals with uncovering new knowledge that is valuable for human beings, and its key idea is that discovery is like other intellectual tasks and that the general idea of heuristic search in problem spaces applies also to discovery tasks. However, since machine learning systems discover (very low-level) regularities in data, throughout this paper I use the generic term automated discovery for both kinds of systems. I will elaborate on this later on). Gillies's line of argument can be generalised: thanks to automated discovery systems, philosophers of science have at their disposal a new tool for empirically testing their philosophical hypotheses.
Accordingly, in the paper, I will address the question, which of the two philosophical conceptions of scientific method is better vindicated in view of the successes and failures of systems developed within three major research programmes in the field: machine learning systems in the Turing tradition, normative theory of scientific discovery formulated by Herbert Simon's group and the programme called HHNT, proposed by J. Holland, K. Holyoak, R. Nisbett and P. Thagard.
Machine learning of network metrics in ATLAS Distributed Data Management
NASA Astrophysics Data System (ADS)
Lassnig, Mario; Toler, Wesley; Vamosi, Ralf; Bogado, Joaquin; ATLAS Collaboration
2017-10-01
The increasing volume of physics data poses a critical challenge to the ATLAS experiment. In anticipation of high luminosity physics, automation of everyday data management tasks has become necessary. Previously many of these tasks required human decision-making and operation. Recent advances in hardware and software have made it possible to entrust more complicated duties to automated systems using models trained by machine learning algorithms. In this contribution we show results from one of our ongoing automation efforts that focuses on network metrics. First, we describe our machine learning framework built atop the ATLAS Analytics Platform. This framework can automatically extract and aggregate data, train models with various machine learning algorithms, and eventually score the resulting models and parameters. Second, we use these models to forecast metrics relevant for network-aware job scheduling and data brokering. We show the characteristics of the data and evaluate the forecasting accuracy of our models.
Neuromorphic Optical Signal Processing and Image Understanding for Automated Target Recognition
1989-12-01
Stochastic Learning Machine; Neuromorphic Target Identification; Cognitive Networks. 3. Conclusions. 4. Publications. 5. References. 6. Appendices: I. Optoelectronic Neural Networks and...Learning Machines. II. Stochastic Optical Learning Machine. III. Learning Network for Extrapolation and Radar Target Identification
Nakanishi, Rine; Sankaran, Sethuraman; Grady, Leo; Malpeso, Jenifer; Yousfi, Razik; Osawa, Kazuhiro; Ceponiene, Indre; Nazarat, Negin; Rahmani, Sina; Kissel, Kendall; Jayawardena, Eranthi; Dailing, Christopher; Zarins, Christopher; Koo, Bon-Kwon; Min, James K; Taylor, Charles A; Budoff, Matthew J
2018-03-23
Our goal was to evaluate the efficacy of a fully automated method for assessing the image quality (IQ) of coronary computed tomography angiography (CCTA). The machine learning method was trained using 75 CCTA studies by mapping features (noise, contrast, misregistration scores, and un-interpretability index) to an IQ score based on manual ground truth data. The automated method was validated on a set of 50 CCTA studies and subsequently tested on a new set of 172 CCTA studies against visual IQ scores on a 5-point Likert scale. The area under the curve in the validation set was 0.96. In the 172 CCTA studies, our method yielded a Cohen's kappa statistic for the agreement between automated and visual IQ assessment of 0.67 (p < 0.01). In the group where good to excellent (n = 163), fair (n = 6), and poor visual IQ scores (n = 3) were graded, 155, 5, and 2 of the patients received an automated IQ score > 50 %, respectively. Fully automated assessment of the IQ of CCTA data sets by machine learning was reproducible and provided similar results compared with visual analysis within the limits of inter-operator variability. • The proposed method enables automated and reproducible image quality assessment. • Machine learning and visual assessments yielded comparable estimates of image quality. • Automated assessment potentially allows for more standardised image quality. • Image quality assessment enables standardization of clinical trial results across different datasets.
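Cohen's kappa, used above to compare automated and visual grading, corrects the raw agreement rate for the agreement expected by chance. The sketch below computes the unweighted form; whether the study used a weighted or unweighted kappa is not stated in the abstract, so that choice is an assumption, and the grade lists are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must grade the same non-empty set of items")
    # Observed agreement: share of items given the same grade by both raters
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies per grade
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[grade] * freq_b[grade] for grade in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical grade throughout; treat as full agreement
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades, not data from the study
automated = ["good", "good", "fair", "poor", "good", "fair"]
visual = ["good", "good", "fair", "good", "good", "poor"]
kappa = cohens_kappa(automated, visual)
print(round(kappa, 3))
```

Here the raters agree on 4 of 6 items, chance agreement is 15/36, and kappa works out to 3/7.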
Automation of energy demand forecasting
NASA Astrophysics Data System (ADS)
Siddique, Sanzad
Automation of energy demand forecasting saves time and effort by searching automatically for an appropriate model in a candidate model space without manual intervention. This thesis introduces a search-based approach that improves the performance of the model searching process for econometrics models. Further improvements in the accuracy of the energy demand forecasting are achieved by integrating nonlinear transformations within the models. This thesis introduces machine learning techniques that are capable of modeling such nonlinearity. Algorithms for learning domain knowledge from time series data using the machine learning methods are also presented. The novel search based approach and the machine learning models are tested with synthetic data as well as with natural gas and electricity demand signals. Experimental results show that the model searching technique is capable of finding an appropriate forecasting model. Further experimental results demonstrate an improved forecasting accuracy achieved by using the novel machine learning techniques introduced in this thesis. This thesis presents an analysis of how the machine learning techniques learn domain knowledge. The learned domain knowledge is used to improve the forecast accuracy.
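The idea of searching a candidate model space without manual intervention can be sketched as follows. This is only an illustration under strong assumptions: the candidates are three simple forecasters, the score is the one-step-ahead mean absolute error, and the demand series is synthetic. The thesis's actual econometric models and search procedure are not described in the abstract, and all function names here are hypothetical.

```python
def last_value(history):
    """Forecast the next point as the most recent observation."""
    return history[-1]


def moving_average(window):
    """Build a forecaster that averages the last `window` observations."""
    def forecast(history):
        recent = history[-window:]
        return sum(recent) / len(recent)
    return forecast


def seasonal_naive(period):
    """Build a forecaster that repeats the value seen one period ago."""
    def forecast(history):
        return history[-period]
    return forecast


def one_step_mae(forecaster, series, start):
    """Mean absolute error of one-step-ahead forecasts from index `start` on."""
    errors = [abs(forecaster(series[:t]) - series[t]) for t in range(start, len(series))]
    return sum(errors) / len(errors)


def search(candidates, series, start):
    """Return the name of the candidate with the lowest one-step error."""
    return min(candidates, key=lambda name: one_step_mae(candidates[name], series, start))


# Synthetic demand signal with a repeating pattern of length 4
series = [10, 20, 30, 40] * 4
candidates = {
    "last_value": last_value,
    "moving_average_4": moving_average(4),
    "seasonal_naive_4": seasonal_naive(4),
}
best = search(candidates, series, start=4)
print(best)
```

On this perfectly periodic series the seasonal forecaster has zero error, so the search selects it; on real demand data the ranking would depend on the signal.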
Cardiac imaging: working towards fully-automated machine analysis & interpretation.
Slomka, Piotr J; Dey, Damini; Sitek, Arkadiusz; Motwani, Manish; Berman, Daniel S; Germano, Guido
2017-03-01
Non-invasive imaging plays a critical role in managing patients with cardiovascular disease. Although subjective visual interpretation remains the clinical mainstay, quantitative analysis facilitates objective, evidence-based management, and advances in clinical research. This has driven developments in computing and software tools aimed at achieving fully automated image processing and quantitative analysis. In parallel, machine learning techniques have been used to rapidly integrate large amounts of clinical and quantitative imaging data to provide highly personalized individual patient-based conclusions. Areas covered: This review summarizes recent advances in automated quantitative imaging in cardiology and describes the latest techniques which incorporate machine learning principles. The review focuses on the cardiac imaging techniques which are in wide clinical use. It also discusses key issues and obstacles for these tools to become utilized in mainstream clinical practice. Expert commentary: Fully-automated processing and high-level computer interpretation of cardiac imaging are becoming a reality. Application of machine learning to the vast amounts of quantitative data generated per scan and integration with clinical data also facilitates a move to more patient-specific interpretation. These developments are unlikely to replace interpreting physicians but will provide them with highly accurate tools to detect disease, risk-stratify, and optimize patient-specific treatment. However, with each technological advance, we move further from human dependence and closer to fully-automated machine interpretation.
Dixon, Steven L; Duan, Jianxin; Smith, Ethan; Von Bargen, Christopher D; Sherman, Woody; Repasky, Matthew P
2016-10-01
We introduce AutoQSAR, an automated machine-learning application to build, validate and deploy quantitative structure-activity relationship (QSAR) models. The process of descriptor generation, feature selection and the creation of a large number of QSAR models has been automated into a single workflow within AutoQSAR. The models are built using a variety of machine-learning methods, and each model is scored using a novel approach. Effectiveness of the method is demonstrated through comparison with literature QSAR models using identical datasets for six end points: protein-ligand binding affinity, solubility, blood-brain barrier permeability, carcinogenicity, mutagenicity and bioaccumulation in fish. AutoQSAR demonstrates similar or better predictive performance as compared with published results for four of the six endpoints while requiring minimal human time and expertise.
Burlina, Philippe; Billings, Seth; Joshi, Neil
2017-01-01
Objective: To evaluate the use of ultrasound coupled with machine learning (ML) and deep learning (DL) techniques for automated or semi-automated classification of myositis. Methods: Eighty subjects comprised of 19 with inclusion body myositis (IBM), 14 with polymyositis (PM), 14 with dermatomyositis (DM), and 33 normal (N) subjects were included in this study, where 3214 muscle ultrasound images of 7 muscles (observed bilaterally) were acquired. We considered three problems of classification including (A) normal vs. affected (DM, PM, IBM); (B) normal vs. IBM patients; and (C) IBM vs. other types of myositis (DM or PM). We studied the use of an automated DL method using deep convolutional neural networks (DL-DCNNs) for diagnostic classification and compared it with a semi-automated conventional ML method based on random forests (ML-RF) and “engineered” features. We used the known clinical diagnosis as the gold standard for evaluating performance of muscle classification. Results: The performance of the DL-DCNN method resulted in accuracies ± standard deviation of 76.2% ± 3.1% for problem (A), 86.6% ± 2.4% for (B) and 74.8% ± 3.9% for (C), while the ML-RF method led to accuracies of 72.3% ± 3.3% for problem (A), 84.3% ± 2.3% for (B) and 68.9% ± 2.5% for (C). Conclusions: This study demonstrates the application of machine learning methods for automatically or semi-automatically classifying inflammatory muscle disease using muscle ultrasound. Compared to the conventional random forest machine learning method used here, which has the drawback of requiring manual delineation of muscle/fat boundaries, DCNN-based classification by and large improved the accuracies in all classification problems while providing a fully automated approach to classification. PMID:28854220
Burlina, Philippe; Billings, Seth; Joshi, Neil; Albayda, Jemima
2017-01-01
To evaluate the use of ultrasound coupled with machine learning (ML) and deep learning (DL) techniques for automated or semi-automated classification of myositis. Eighty subjects comprised of 19 with inclusion body myositis (IBM), 14 with polymyositis (PM), 14 with dermatomyositis (DM), and 33 normal (N) subjects were included in this study, where 3214 muscle ultrasound images of 7 muscles (observed bilaterally) were acquired. We considered three problems of classification including (A) normal vs. affected (DM, PM, IBM); (B) normal vs. IBM patients; and (C) IBM vs. other types of myositis (DM or PM). We studied the use of an automated DL method using deep convolutional neural networks (DL-DCNNs) for diagnostic classification and compared it with a semi-automated conventional ML method based on random forests (ML-RF) and "engineered" features. We used the known clinical diagnosis as the gold standard for evaluating performance of muscle classification. The performance of the DL-DCNN method resulted in accuracies ± standard deviation of 76.2% ± 3.1% for problem (A), 86.6% ± 2.4% for (B) and 74.8% ± 3.9% for (C), while the ML-RF method led to accuracies of 72.3% ± 3.3% for problem (A), 84.3% ± 2.3% for (B) and 68.9% ± 2.5% for (C). This study demonstrates the application of machine learning methods for automatically or semi-automatically classifying inflammatory muscle disease using muscle ultrasound. Compared to the conventional random forest machine learning method used here, which has the drawback of requiring manual delineation of muscle/fat boundaries, DCNN-based classification by and large improved the accuracies in all classification problems while providing a fully automated approach to classification.
Cardiac imaging: working towards fully-automated machine analysis & interpretation
Slomka, Piotr J; Dey, Damini; Sitek, Arkadiusz; Motwani, Manish; Berman, Daniel S; Germano, Guido
2017-01-01
Introduction: Non-invasive imaging plays a critical role in managing patients with cardiovascular disease. Although subjective visual interpretation remains the clinical mainstay, quantitative analysis facilitates objective, evidence-based management, and advances in clinical research. This has driven developments in computing and software tools aimed at achieving fully automated image processing and quantitative analysis. In parallel, machine learning techniques have been used to rapidly integrate large amounts of clinical and quantitative imaging data to provide highly personalized individual patient-based conclusions. Areas covered: This review summarizes recent advances in automated quantitative imaging in cardiology and describes the latest techniques which incorporate machine learning principles. The review focuses on the cardiac imaging techniques which are in wide clinical use. It also discusses key issues and obstacles for these tools to become utilized in mainstream clinical practice. Expert commentary: Fully-automated processing and high-level computer interpretation of cardiac imaging are becoming a reality. Application of machine learning to the vast amounts of quantitative data generated per scan and integration with clinical data also facilitates a move to more patient-specific interpretation. These developments are unlikely to replace interpreting physicians but will provide them with highly accurate tools to detect disease, risk-stratify, and optimize patient-specific treatment. However, with each technological advance, we move further from human dependence and closer to fully-automated machine interpretation. PMID:28277804
Tackling the x-ray cargo inspection challenge using machine learning
NASA Astrophysics Data System (ADS)
Jaccard, Nicolas; Rogers, Thomas W.; Morton, Edward J.; Griffin, Lewis D.
2016-05-01
The current infrastructure for non-intrusive inspection of cargo containers cannot accommodate exploding commerce volumes and increasingly stringent regulations. There is a pressing need to develop methods to automate parts of the inspection workflow, enabling expert operators to focus on a manageable number of high-risk images. To tackle this challenge, we developed a modular framework for automated X-ray cargo image inspection. Employing state-of-the-art machine learning approaches, including deep learning, we demonstrate high performance for empty container verification and specific threat detection. This work constitutes a significant step towards the partial automation of X-ray cargo image inspection.
Designing Anticancer Peptides by Constructive Machine Learning.
Grisoni, Francesca; Neuhaus, Claudia S; Gabernet, Gisela; Müller, Alex T; Hiss, Jan A; Schneider, Gisbert
2018-04-21
Constructive (generative) machine learning enables the automated generation of novel chemical structures without the need for explicit molecular design rules. This study presents the experimental application of such a deep machine learning model to design membranolytic anticancer peptides (ACPs) de novo. A recurrent neural network with long short-term memory cells was trained on α-helical cationic amphipathic peptide sequences and then fine-tuned with 26 known ACPs by transfer learning. This optimized model was used to generate unique and novel amino acid sequences. Twelve of the peptides were synthesized and tested for their activity on MCF7 human breast adenocarcinoma cells and selectivity against human erythrocytes. Ten of these peptides were active against cancer cells. Six of the active peptides killed MCF7 cancer cells without affecting human erythrocytes with at least threefold selectivity. These results advocate constructive machine learning for the automated design of peptides with desired biological activities. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Mori, Kensaku; Ota, Shunsuke; Deguchi, Daisuke; Kitasaka, Takayuki; Suenaga, Yasuhito; Iwano, Shingo; Hasegawa, Yosihnori; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi
2009-01-01
This paper presents a method for the automated anatomical labeling of bronchial branches extracted from 3D CT images based on machine learning and combination optimization. We also show applications of anatomical labeling on a bronchoscopy guidance system. This paper performs automated labeling by using machine learning and combination optimization. The actual procedure consists of four steps: (a) extraction of tree structures of the bronchus regions extracted from CT images, (b) construction of AdaBoost classifiers, (c) computation of candidate names for all branches by using the classifiers, (d) selection of best combination of anatomical names. We applied the proposed method to 90 cases of 3D CT datasets. The experimental results showed that the proposed method can assign correct anatomical names to 86.9% of the bronchial branches up to the sub-segmental lobe branches. Also, we overlaid the anatomical names of bronchial branches on real bronchoscopic views to guide real bronchoscopy.
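Steps (c) and (d) above, scoring candidate names per branch and then choosing the best combination, can be sketched as an assignment problem. The sketch assumes that each anatomical name may be used at most once and that the best combination maximizes the sum of classifier scores; both are assumptions, since the abstract does not give the objective. The branch names and scores are hypothetical, and the exhaustive search shown is only practical for very small trees.

```python
from itertools import permutations


def best_labeling(scores, names):
    """Pick one distinct name per branch so the summed score is maximal.

    scores[i][name] is the (hypothetical) classifier score for giving
    `name` to branch i.
    """
    best_total = float("-inf")
    best_combo = ()
    # Try every way of assigning distinct names to the branches
    for combo in permutations(names, len(scores)):
        total = sum(scores[i][name] for i, name in enumerate(combo))
        if total > best_total:
            best_total = total
            best_combo = combo
    return list(best_combo)


names = ["RB1", "RB2", "RB3"]
scores = [
    {"RB1": 0.6, "RB2": 0.5, "RB3": 0.1},  # branch 0
    {"RB1": 0.7, "RB2": 0.2, "RB3": 0.1},  # branch 1
    {"RB1": 0.1, "RB2": 0.3, "RB3": 0.9},  # branch 2
]
labels = best_labeling(scores, names)
print(labels)
```

Taking the top-scoring name for each branch independently would label both branch 0 and branch 1 as RB1; the joint selection avoids that conflict by giving branch 0 its second choice.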
Kandaswamy, Umasankar; Rotman, Ziv; Watt, Dana; Schillebeeckx, Ian; Cavalli, Valeria; Klyachko, Vitaly
2013-01-01
High-resolution live-cell imaging studies of neuronal structure and function are characterized by large variability in image acquisition conditions due to background and sample variations as well as low signal-to-noise ratio. The lack of automated image analysis tools that can be generalized for varying image acquisition conditions represents one of the main challenges in the field of biomedical image analysis. Specifically, segmentation of the axonal/dendritic arborizations in brightfield or fluorescence imaging studies is extremely labor-intensive and still performed mostly manually. Here we describe a fully automated machine-learning approach based on textural analysis algorithms for segmenting neuronal arborizations in high-resolution brightfield images of live cultured neurons. We compare performance of our algorithm to manual segmentation and show that it combines 90% accuracy, with similarly high levels of specificity and sensitivity. Moreover, the algorithm maintains high performance levels under a wide range of image acquisition conditions indicating that it is largely condition-invariable. We further describe an application of this algorithm to fully automated synapse localization and classification in fluorescence imaging studies based on synaptic activity. Textural analysis-based machine-learning approach thus offers a high performance condition-invariable tool for automated neurite segmentation. PMID:23261652
ERIC Educational Resources Information Center
Nehm, Ross H.; Ha, Minsu; Mayfield, Elijah
2012-01-01
This study explored the use of machine learning to automatically evaluate the accuracy of students' written explanations of evolutionary change. Performance of the Summarization Integrated Development Environment (SIDE) program was compared to human expert scoring using a corpus of 2,260 evolutionary explanations written by 565 undergraduate…
Stone, Bryan L; Johnson, Michael D; Tarczy-Hornoch, Peter; Wilcox, Adam B; Mooney, Sean D; Sheng, Xiaoming; Haug, Peter J; Nkoy, Flory L
2017-01-01
Background: To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (aka, clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning has won most data science competitions and could support many clinical activities, yet only 15% of hospitals use it for even limited purposes. Despite familiarity with data, health care researchers often lack machine learning expertise to directly use clinical big data, creating a hurdle in realizing value from their data. Health care researchers can work with data scientists with deep machine learning knowledge, but it takes time and effort for both parties to communicate effectively. Facing a shortage in the United States of data scientists and hiring competition from companies with deep pockets, health care systems have difficulty recruiting data scientists. Building and generalizing a machine learning model often requires hundreds to thousands of manual iterations by data scientists to select the following: (1) hyper-parameter values and complex algorithms that greatly affect model accuracy and (2) operators and periods for temporally aggregating clinical attributes (eg, whether a patient’s weight kept rising in the past year). This process becomes infeasible with limited budgets. Objective: This study’s goal is to enable health care researchers to directly use clinical big data, make machine learning feasible with limited budgets and data scientist resources, and realize value from data.
Methods: This study will allow us to achieve the following: (1) finish developing the new software, Automated Machine Learning (Auto-ML), to automate model selection for machine learning with clinical big data and validate Auto-ML on seven benchmark modeling problems of clinical importance; (2) apply Auto-ML and novel methodology to two new modeling problems crucial for care management allocation and pilot one model with care managers; and (3) perform simulations to estimate the impact of adopting Auto-ML on US patient outcomes. Results: We are currently writing Auto-ML’s design document. We intend to finish our study by around the year 2022. Conclusions: Auto-ML will generalize to various clinical prediction/classification problems. With minimal help from data scientists, health care researchers can use Auto-ML to quickly build high-quality models. This will boost wider use of machine learning in health care and improve patient outcomes. PMID:28851678
Designing Contestability: Interaction Design, Machine Learning, and Mental Health
Hirsch, Tad; Merced, Kritzia; Narayanan, Shrikanth; Imel, Zac E.; Atkins, David C.
2017-01-01
We describe the design of an automated assessment and training tool for psychotherapists to illustrate challenges with creating interactive machine learning (ML) systems, particularly in contexts where human life, livelihood, and wellbeing are at stake. We explore how existing theories of interaction design and machine learning apply to the psychotherapy context, and identify “contestability” as a new principle for designing systems that evaluate human behavior. Finally, we offer several strategies for making ML systems more accountable to human actors. PMID:28890949
2017-03-01
neuro ICP care beyond trauma care. 15. SUBJECT TERMS Advanced machine learning techniques, intracranial pressure, vital signs, monitoring...death and disability in combat casualties [1,2]. Approximately 2 million head injuries occur annually in the United States, resulting in more than...editor. Machine learning and data mining in pattern recognition. Proceedings of the 8th International Workshop on Machine Learning and Data Mining in
Radio Frequency Interference Detection using Machine Learning.
NASA Astrophysics Data System (ADS)
Mosiane, Olorato; Oozeer, Nadeem; Aniyan, Arun; Bassett, Bruce A.
2017-05-01
Radio frequency interference (RFI) has long plagued radio astronomy, and the problem may be as bad or worse by the time the Square Kilometre Array (SKA) comes online. RFI can be either internal (generated by instruments) or external, originating from intentional or unintentional man-made radio emission. With the huge amount of data that will be available from upcoming radio telescopes, an automated approach will be required to detect RFI. In this paper, to automate this process, we present the results of applying machine learning techniques to cross-match RFI in Karoo Array Telescope (KAT-7) data. We found that not all the features selected to characterise RFI are always important. We further investigated 3 machine learning techniques and conclude that the random forest classifier achieves a 98% area under the curve and 91% recall in detecting RFI.
Machine learning for micro-tomography
NASA Astrophysics Data System (ADS)
Parkinson, Dilworth Y.; Pelt, Daniël M.; Perciano, Talita; Ushizima, Daniela; Krishnan, Harinarayan; Barnard, Harold S.; MacDowell, Alastair A.; Sethian, James
2017-09-01
Machine learning has revolutionized a number of fields, but many micro-tomography users have never used it for their work. The micro-tomography beamline at the Advanced Light Source (ALS), in collaboration with the Center for Applied Mathematics for Energy Research Applications (CAMERA) at Lawrence Berkeley National Laboratory, has now deployed a series of tools to automate data processing for ALS users using machine learning. This includes new reconstruction algorithms, feature extraction tools, and image classification and recommendation systems for scientific images. Some of these tools run either in automated pipelines that operate on data as it is collected or as stand-alone software. Others are deployed on computing resources at Berkeley Lab, from workstations to supercomputers, and made accessible to users through either scripting or easy-to-use graphical interfaces. This paper presents a progress report on this work.
ERIC Educational Resources Information Center
Nakamura, Christopher M.; Murphy, Sytil K.; Christel, Michael G.; Stevens, Scott M.; Zollman, Dean A.
2016-01-01
Computer-automated assessment of students' text responses to short-answer questions represents an important enabling technology for online learning environments. We have investigated the use of machine learning to train computer models capable of automatically classifying short-answer responses and assessed the results. Our investigations are part…
Topic categorisation of statements in suicide notes with integrated rules and machine learning.
Kovačević, Aleksandar; Dehghan, Azad; Keane, John A; Nenadic, Goran
2012-01-01
We describe and evaluate an automated approach used as part of the i2b2 2011 challenge to identify and categorise statements in suicide notes into one of 15 topics, including Love, Guilt, Thankfulness, Hopelessness and Instructions. The approach combines a set of lexico-syntactic rules with a set of models derived by machine learning from a training dataset. The machine learning models rely on named entities, lexical, lexico-semantic and presentation features, as well as the rules that are applicable to a given statement. On a testing set of 300 suicide notes, the approach showed the overall best micro F-measure of up to 53.36%. The best precision achieved was 67.17% when only rules are used, whereas best recall of 50.57% was with integrated rules and machine learning. While some topics (eg, Sorrow, Anger, Blame) prove challenging, the performance for relatively frequent (eg, Love) and well-scoped categories (eg, Thankfulness) was comparatively higher (precision between 68% and 79%), suggesting that automated text mining approaches can be effective in topic categorisation of suicide notes.
Solti, Imre; Cooke, Colin R; Xia, Fei; Wurfel, Mark M
2009-11-01
This paper compares the performance of keyword-based and machine learning-based chest x-ray report classification for Acute Lung Injury (ALI). ALI mortality is approximately 30 percent. High mortality is, in part, a consequence of delayed manual chest x-ray classification. An automated system could reduce the time to recognize ALI and lead to reductions in mortality. For our study, 96 and 857 chest x-ray reports in two corpora were labeled by domain experts for ALI. We developed a keyword-based and a Maximum Entropy-based classification system. Word unigrams and character n-grams provided the features for the machine learning system. The Maximum Entropy algorithm with character 6-grams achieved the highest performance (Recall=0.91, Precision=0.90 and F-measure=0.91) on the 857-report corpus. This study has shown that for the classification of ALI chest x-ray reports, the machine learning approach is superior to the keyword-based system and achieves results comparable to those of the highest-performing physician annotators.
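The feature types named in this abstract (word unigrams and character n-grams) can be sketched as follows. This is a minimal illustration only: the function names, the lower-casing step, and the sample phrase are assumptions, not details taken from the paper, and the Maximum Entropy classifier itself is not shown.

```python
from collections import Counter


def char_ngrams(text, n=6):
    """Count overlapping character n-grams in a lower-cased report.

    Lower-casing is an assumption made for this sketch.
    """
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_unigrams(text):
    """Count whitespace-separated word unigrams in a lower-cased report."""
    return Counter(text.lower().split())


# Hypothetical report fragment, used only to show the call.
features = char_ngrams("Bilateral infiltrates", n=6)
```

Each resulting count dictionary could then serve as a sparse feature vector for a classifier of the reader's choice.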
Solti, Imre; Cooke, Colin R.; Xia, Fei; Wurfel, Mark M.
2010-01-01
This paper compares the performance of keyword-based and machine learning-based chest x-ray report classification for Acute Lung Injury (ALI). ALI mortality is approximately 30 percent. High mortality is, in part, a consequence of delayed manual chest x-ray classification. An automated system could reduce the time to recognize ALI and lead to reductions in mortality. For our study, 96 and 857 chest x-ray reports in two corpora were labeled by domain experts for ALI. We developed a keyword-based and a Maximum Entropy-based classification system. Word unigrams and character n-grams provided the features for the machine learning system. The Maximum Entropy algorithm with character 6-grams achieved the highest performance (Recall=0.91, Precision=0.90 and F-measure=0.91) on the 857-report corpus. This study has shown that for the classification of ALI chest x-ray reports, the machine learning approach is superior to the keyword-based system and achieves results comparable to those of the highest-performing physician annotators. PMID:21152268
Pizarro, Ricardo A; Cheng, Xi; Barnett, Alan; Lemaitre, Herve; Verchinski, Beth A; Goldman, Aaron L; Xiao, Ena; Luo, Qian; Berman, Karen F; Callicott, Joseph H; Weinberger, Daniel R; Mattay, Venkata S
2016-01-01
High-resolution three-dimensional magnetic resonance imaging (3D-MRI) is being increasingly used to delineate morphological changes underlying neuropsychiatric disorders. Unfortunately, artifacts frequently compromise the utility of 3D-MRI yielding irreproducible results, from both type I and type II errors. It is therefore critical to screen 3D-MRIs for artifacts before use. Currently, quality assessment involves slice-wise visual inspection of 3D-MRI volumes, a procedure that is both subjective and time consuming. Automating the quality rating of 3D-MRI could improve the efficiency and reproducibility of the procedure. The present study is one of the first efforts to apply a support vector machine (SVM) algorithm in the quality assessment of structural brain images, using global and region of interest (ROI) automated image quality features developed in-house. SVM is a supervised machine-learning algorithm that can predict the category of test datasets based on the knowledge acquired from a learning dataset. The performance (accuracy) of the automated SVM approach was assessed, by comparing the SVM-predicted quality labels to investigator-determined quality labels. The accuracy for classifying 1457 3D-MRI volumes from our database using the SVM approach is around 80%. These results are promising and illustrate the possibility of using SVM as an automated quality assessment tool for 3D-MRI.
Lee, Unseok; Chang, Sungyul; Putra, Gian Anantrio; Kim, Hyoungseok; Kim, Dong Hwan
2018-01-01
A high-throughput plant phenotyping system automatically observes and grows many plant samples. Many plant sample images are acquired by the system to determine the characteristics of the plants (populations). Stable image acquisition and processing is very important to accurately determine the characteristics. However, hardware for acquiring plant images rapidly and stably, while minimizing plant stress, is lacking. Moreover, most software cannot adequately handle large-scale plant imaging. To address these problems, we developed a new, automated, high-throughput plant phenotyping system using simple and robust hardware, and an automated plant-imaging-analysis pipeline consisting of machine-learning-based plant segmentation. Our hardware acquires images reliably and quickly and minimizes plant stress. Furthermore, the images are processed automatically. In particular, large-scale plant-image datasets can be segmented precisely using a classifier developed using a superpixel-based machine-learning algorithm (Random Forest), and variations in plant parameters (such as area) over time can be assessed using the segmented images. We performed comparative evaluations to identify an appropriate learning algorithm for our proposed system, and tested three robust learning algorithms. We developed not only an automatic analysis pipeline but also a convenient means of plant-growth analysis that provides a learning data interface and visualization of plant growth trends. Thus, our system allows end-users such as plant biologists to analyze plant growth via large-scale plant image data easily.
Luo, Gang; Stone, Bryan L; Johnson, Michael D; Tarczy-Hornoch, Peter; Wilcox, Adam B; Mooney, Sean D; Sheng, Xiaoming; Haug, Peter J; Nkoy, Flory L
2017-08-29
To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (aka, clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning has won most data science competitions and could support many clinical activities, yet only 15% of hospitals use it for even limited purposes. Despite familiarity with data, health care researchers often lack machine learning expertise to directly use clinical big data, creating a hurdle in realizing value from their data. Health care researchers can work with data scientists with deep machine learning knowledge, but it takes time and effort for both parties to communicate effectively. Facing a shortage in the United States of data scientists and hiring competition from companies with deep pockets, health care systems have difficulty recruiting data scientists. Building and generalizing a machine learning model often requires hundreds to thousands of manual iterations by data scientists to select the following: (1) hyper-parameter values and complex algorithms that greatly affect model accuracy and (2) operators and periods for temporally aggregating clinical attributes (eg, whether a patient's weight kept rising in the past year). This process becomes infeasible with limited budgets. This study's goal is to enable health care researchers to directly use clinical big data, make machine learning feasible with limited budgets and data scientist resources, and realize value from data. 
This study will allow us to achieve the following: (1) finish developing the new software, Automated Machine Learning (Auto-ML), to automate model selection for machine learning with clinical big data and validate Auto-ML on seven benchmark modeling problems of clinical importance; (2) apply Auto-ML and novel methodology to two new modeling problems crucial for care management allocation and pilot one model with care managers; and (3) perform simulations to estimate the impact of adopting Auto-ML on US patient outcomes. We are currently writing Auto-ML's design document. We intend to finish our study by around the year 2022. Auto-ML will generalize to various clinical prediction/classification problems. With minimal help from data scientists, health care researchers can use Auto-ML to quickly build high-quality models. This will boost wider use of machine learning in health care and improve patient outcomes. ©Gang Luo, Bryan L Stone, Michael D Johnson, Peter Tarczy-Hornoch, Adam B Wilcox, Sean D Mooney, Xiaoming Sheng, Peter J Haug, Flory L Nkoy. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 29.08.2017.
One-Class Classification-Based Real-Time Activity Error Detection in Smart Homes.
Das, Barnan; Cook, Diane J; Krishnan, Narayanan C; Schmitter-Edgecombe, Maureen
2016-08-01
Caring for individuals with dementia is frequently associated with extreme physical and emotional stress, which often leads to depression. Smart home technology and advances in machine learning techniques can provide innovative solutions to reduce caregiver burden. One key service that caregivers provide is prompting individuals with memory limitations to initiate and complete daily activities. We hypothesize that sensor technologies combined with machine learning techniques can automate the process of providing reminder-based interventions. The first step towards automated interventions is to detect when an individual faces difficulty with activities. We propose machine learning approaches based on one-class classification that learn normal activity patterns. When we apply these classifiers to activity patterns that were not seen before, the classifiers are able to detect activity errors, which represent potential prompt situations. We validate our approaches on smart home sensor data obtained from older adult participants, some of whom faced difficulties performing routine activities and thus committed errors.
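The idea of one-class classification described above (learn only from normal activity, then flag what deviates) can be illustrated with a deliberately simple stand-in. The sketch below uses a z-score threshold on a single numeric feature; this is a swapped-in technique chosen for brevity, not the classifiers used by the authors, and the class name, the feature (an activity duration), and the threshold of 3.0 are all assumptions.

```python
import statistics


class ZScoreOneClass:
    """Toy one-class detector: models normal data by its mean and spread."""

    def fit(self, normal_values):
        # Train on examples of normal behaviour only.
        self.mean = statistics.fmean(normal_values)
        self.std = statistics.stdev(normal_values)
        if self.std == 0:
            raise ValueError("normal data must show some variation")
        return self

    def is_error(self, value, threshold=3.0):
        # Flag values far from the learned normal range as potential errors.
        return abs(value - self.mean) / self.std > threshold


# Hypothetical activity durations in minutes, for illustration only.
detector = ZScoreOneClass().fit([10, 11, 9, 10, 12, 8])
```

A flagged value would correspond to a potential prompt situation in the sense of the abstract.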
14 CFR 382.3 - What do the terms in this rule mean?
Code of Federal Regulations, 2014 CFR
2014-01-01
... devices and medications. Automated airport kiosk means a self-service transaction machine that a carrier... machine means a continuous positive airway pressure machine. Department or DOT means the United States..., emotional or mental illness, and specific learning disabilities. The term physical or mental impairment...
Machine learning and computer vision approaches for phenotypic profiling.
Grys, Ben T; Lo, Dara S; Sahin, Nil; Kraus, Oren Z; Morris, Quaid; Boone, Charles; Andrews, Brenda J
2017-01-02
With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. © 2017 Grys et al.
Machine learning and computer vision approaches for phenotypic profiling
Morris, Quaid
2017-01-01
With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. PMID:27940887
Niemeijer, Meindert; van Ginneken, Bram; Russell, Stephen R; Suttorp-Schulten, Maria S A; Abràmoff, Michael D
2007-05-01
To describe and evaluate a machine learning-based, automated system to detect exudates and cotton-wool spots in digital color fundus photographs and differentiate them from drusen, for early diagnosis of diabetic retinopathy. Three hundred retinal images from one eye of 300 patients with diabetes were selected from a diabetic retinopathy telediagnosis database (nonmydriatic camera, two-field photography): 100 with previously diagnosed bright lesions and 200 without. A machine learning computer program was developed that can identify and differentiate among drusen, (hard) exudates, and cotton-wool spots. A human expert standard for the 300 images was obtained by consensus annotation by two retinal specialists. Sensitivities and specificities of the annotations on the 300 images by the automated system and a third retinal specialist were determined. The system achieved an area under the receiver operating characteristic (ROC) curve of 0.95 and sensitivity/specificity pairs of 0.95/0.88 for the detection of bright lesions of any type, and 0.95/0.86, 0.70/0.93, and 0.77/0.88 for the detection of exudates, cotton-wool spots, and drusen, respectively. The third retinal specialist achieved pairs of 0.95/0.74 for bright lesions and 0.90/0.98, 0.87/0.98, and 0.92/0.79 per lesion type. A machine learning-based, automated system capable of detecting exudates and cotton-wool spots and differentiating them from drusen in color images obtained in community based diabetic patients has been developed and approaches the performance level of retinal experts. If the machine learning can be improved with additional training data sets, it may be useful for detecting clinically important bright lesions, enhancing early diagnosis, and reducing visual loss in patients with diabetes.
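The sensitivity/specificity pairs reported above follow the standard definitions: sensitivity is the fraction of truly positive images that are flagged, and specificity is the fraction of truly negative images that are not. A minimal sketch, with a hypothetical function name and boolean labels assumed as input:

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired boolean labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

Here `truth` would hold the expert consensus annotation and `predicted` the output of the automated system or of another reader.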
Ross, Elsie Gyang; Shah, Nigam H; Dalman, Ronald L; Nead, Kevin T; Cooke, John P; Leeper, Nicholas J
2016-11-01
A key aspect of the precision medicine effort is the development of informatics tools that can analyze and interpret "big data" sets in an automated and adaptive fashion while providing accurate and actionable clinical information. The aims of this study were to develop machine learning algorithms for the identification of disease and the prognostication of mortality risk and to determine whether such models perform better than classical statistical analyses. Focusing on peripheral artery disease (PAD), patient data were derived from a prospective, observational study of 1755 patients who presented for elective coronary angiography. We employed multiple supervised machine learning algorithms and used diverse clinical, demographic, imaging, and genomic information in a hypothesis-free manner to build models that could identify patients with PAD and predict future mortality. Comparison was made to standard stepwise linear regression models. Our machine-learned models outperformed stepwise logistic regression models both for the identification of patients with PAD (area under the curve, 0.87 vs 0.76, respectively; P = .03) and for the prediction of future mortality (area under the curve, 0.76 vs 0.65, respectively; P = .10). Both machine-learned models were markedly better calibrated than the stepwise logistic regression models, thus providing more accurate disease and mortality risk estimates. Machine learning approaches can produce more accurate disease classification and prediction models. These tools may prove clinically useful for the automated identification of patients with highly morbid diseases for which aggressive risk factor management can improve outcomes. Copyright © 2016 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
Ellis, Katherine; Godbole, Suneeta; Marshall, Simon; Lanckriet, Gert; Staudenmayer, John; Kerr, Jacqueline
2014-01-01
Active travel is an important area in physical activity research, but objective measurement of active travel is still difficult. Automated methods to measure travel behaviors will improve research in this area. In this paper, we present a supervised machine learning method for transportation mode prediction from global positioning system (GPS) and accelerometer data. We collected a dataset of about 150 h of GPS and accelerometer data from two research assistants following a protocol of prescribed trips consisting of five activities: bicycling, riding in a vehicle, walking, sitting, and standing. We extracted 49 features from 1-min windows of this data. We compared the performance of several machine learning algorithms and chose a random forest algorithm to classify the transportation mode. We used a moving average output filter to smooth the output predictions over time. The random forest algorithm achieved 89.8% cross-validated accuracy on this dataset. Adding the moving average filter to smooth output predictions increased the cross-validated accuracy to 91.9%. Machine learning methods are a viable approach for automating measurement of active travel, particularly for measuring travel activities that traditional accelerometer data processing methods misclassify, such as bicycling and vehicle travel.
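The output smoothing step mentioned above can be sketched as a centred moving window over the per-minute predictions. The abstract does not specify the filter, so the version below is an assumption: it averages one-hot votes in the window, which amounts to a sliding majority vote, and it keeps the original prediction on ties. The function name and window size are hypothetical.

```python
from collections import Counter


def smooth_predictions(labels, window=3):
    """Smooth a label sequence with a centred moving window of votes."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo, hi = max(0, i - half), min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        best = max(counts.values())
        winners = [lab for lab, c in counts.items() if c == best]
        # On a tie, keep the original prediction if it is among the winners.
        smoothed.append(labels[i] if labels[i] in winners else winners[0])
    return smoothed
```

An isolated prediction that disagrees with its neighbours, such as a single "bicycling" minute inside a walking trip, is replaced by the surrounding mode.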
Monitoring Hitting Load in Tennis Using Inertial Sensors and Machine Learning.
Whiteside, David; Cant, Olivia; Connolly, Molly; Reid, Machar
2017-10-01
Quantifying external workload is fundamental to training prescription in sport. In tennis, global positioning data are imprecise and fail to capture hitting loads. The current gold standard (manual notation) is time intensive and often not possible given players' heavy travel schedules. The purpose of this study was to develop an automated stroke-classification system to help quantify hitting load in tennis. Nineteen athletes wore an inertial measurement unit (IMU) on their wrist during 66 video-recorded training sessions. Video footage was manually notated such that known shot type (serve, rally forehand, slice forehand, forehand volley, rally backhand, slice backhand, backhand volley, smash, or false positive) was associated with the corresponding IMU data for 28,582 shots. Six types of machine-learning models were then constructed to classify true shot type from the IMU signals. Across 10-fold cross-validation, a cubic-kernel support vector machine classified binned shots (overhead, forehand, or backhand) with an accuracy of 97.4%. A second cubic-kernel support vector machine achieved 93.2% accuracy when classifying all 9 shot types. With a view to monitoring external load, the combination of miniature inertial sensors and machine learning offers a practical and automated method of quantifying shot counts and discriminating shot types in elite tennis players.
Hong, Weizhe; Kennedy, Ann; Burgos-Artizzu, Xavier P; Zelikowsky, Moriel; Navonne, Santiago G; Perona, Pietro; Anderson, David J
2015-09-22
A lack of automated, quantitative, and accurate assessment of social behaviors in mammalian animal models has limited progress toward understanding mechanisms underlying social interactions and their disorders such as autism. Here we present a new integrated hardware and software system that combines video tracking, depth sensing, and machine learning for automatic detection and quantification of social behaviors involving close and dynamic interactions between two mice of different coat colors in their home cage. We designed a hardware setup that integrates traditional video cameras with a depth camera, developed computer vision tools to extract the body "pose" of individual animals in a social context, and used a supervised learning algorithm to classify several well-described social behaviors. We validated the robustness of the automated classifiers in various experimental settings and used them to examine how genetic background, such as that of Black and Tan Brachyury (BTBR) mice (a previously reported autism model), influences social behavior. Our integrated approach allows for rapid, automated measurement of social behaviors across diverse experimental designs and also affords the ability to develop new, objective behavioral metrics.
Hong, Weizhe; Kennedy, Ann; Burgos-Artizzu, Xavier P.; Zelikowsky, Moriel; Navonne, Santiago G.; Perona, Pietro; Anderson, David J.
2015-01-01
A lack of automated, quantitative, and accurate assessment of social behaviors in mammalian animal models has limited progress toward understanding mechanisms underlying social interactions and their disorders such as autism. Here we present a new integrated hardware and software system that combines video tracking, depth sensing, and machine learning for automatic detection and quantification of social behaviors involving close and dynamic interactions between two mice of different coat colors in their home cage. We designed a hardware setup that integrates traditional video cameras with a depth camera, developed computer vision tools to extract the body “pose” of individual animals in a social context, and used a supervised learning algorithm to classify several well-described social behaviors. We validated the robustness of the automated classifiers in various experimental settings and used them to examine how genetic background, such as that of Black and Tan Brachyury (BTBR) mice (a previously reported autism model), influences social behavior. Our integrated approach allows for rapid, automated measurement of social behaviors across diverse experimental designs and also affords the ability to develop new, objective behavioral metrics. PMID:26354123
NASA Astrophysics Data System (ADS)
Ali, Salah M.; Hui, K. H.; Hee, L. M.; Salman Leong, M.; Al-Obaidi, M. A.; Ali, Y. H.; Abdelrhman, Ahmed M.
2018-03-01
Acoustic emission (AE) analysis has become a vital tool for initiating maintenance tasks in many industries. However, the analysis and interpretation process has been found to be highly dependent on experts. An automated monitoring method is therefore required to reduce the cost and time consumed in interpreting AE signals. This paper investigates the application of two of the most common machine learning approaches, namely artificial neural network (ANN) and support vector machine (SVM), to automate the diagnosis of valve faults in a reciprocating compressor based on AE signal parameters. Since accuracy is an essential factor in any automated diagnostic system, this paper also provides a comparative study based on the predictive performance of ANN and SVM. AE parameter data were acquired from a single-stage reciprocating air compressor under different operational and valve conditions. ANN and SVM diagnosis models were subsequently devised by combining AE parameters of different conditions. Results demonstrate that the ANN and SVM models achieve the same prediction accuracy. However, the SVM model is recommended for automated diagnosis of the valve condition because of its ability to handle a high number of input features with small sample data sets.
Automated Blazar Light Curves Using Machine Learning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Spencer James
2017-07-27
This presentation describes a problem and methodology pertaining to automated blazar light curves. Namely, optical variability patterns for blazars require the construction of light curves and in order to generate the light curves, data must be filtered before processing to ensure quality.
Data-Driven Property Estimation for Protective Clothing
2014-09-01
reliable predictions falls under the rubric “machine learning”. Inspired by the applications of machine learning in pharmaceutical drug design and...using genetic algorithms, for instance— descriptor selection can be automated as well. A well-known structured learning technique—Artificial Neural...descriptors automatically, by iteration, e.g., using a genetic algorithm [49]. 4.2.4 Avoiding Overfitting A peril of all regression—least squares as
Automated Data Assimilation and Flight Planning for Multi-Platform Observation Missions
NASA Technical Reports Server (NTRS)
Oza, Nikunj; Morris, Robert A.; Strawa, Anthony; Kurklu, Elif; Keely, Leslie
2008-01-01
This is a progress report on an effort in which our goal is to demonstrate the effectiveness of automated data mining and planning for the daily management of Earth Science missions. Currently, data mining and machine learning technologies are being used by scientists at research labs for validating Earth science models. However, few if any of these advanced techniques are currently being integrated into daily mission operations. Consequently, there are significant gaps in the knowledge that can be derived from the models and data that are used each day for guiding mission activities. The result can be sub-optimal observation plans, lack of useful data, and wasteful use of resources. Recent advances in data mining, machine learning, and planning make it feasible to migrate these technologies into the daily mission planning cycle. We describe the design of a closed loop system for data acquisition, processing, and flight planning that integrates the results of machine learning into the flight planning process.
Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin
2017-01-01
Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of varied machine learning methods in differentiating low-grade gliomas (LGGs) and high-grade gliomas (HGGs) as well as WHO grade II, III and IV gliomas based on multi-parametric MRI images was proposed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using a leave-one-out cross validation (LOOCV) strategy. In addition, the influences of parameter selection on the classifying performances were investigated. We found that support vector machine (SVM) exhibited superior performance to other classifiers. By combining all tumor attributes with synthetic minority over-sampling technique (SMOTE), the highest classifying accuracy of 0.945 for LGG versus HGG and 0.961 for grade II, III and IV gliomas was achieved. Application of the Recursive Feature Elimination (RFE) attribute selection strategy further improved the classifying accuracies. The performances of the LibSVM, SMO and IBk classifiers were also influenced by some key parameters such as kernel type, c, gamma, K, etc. SVM is a promising tool for developing an automated preoperative glioma grading system, especially when combined with the RFE strategy. Model parameters should be considered in glioma grading model optimization. PMID:28599282
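The leave-one-out cross validation (LOOCV) strategy named above holds out each sample in turn, trains on the rest, and reports the fraction of held-out samples classified correctly. The sketch below pairs it with a 1-nearest-neighbour rule purely for brevity; the function names and the toy feature vectors are hypothetical and are not taken from the study.

```python
def nearest_neighbour_label(train, query):
    """Return the label of the training sample closest to `query` (1-NN)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(train, key=lambda sample: sq_dist(sample[0], query))
    return label


def loocv_accuracy(samples):
    """Leave-one-out accuracy for a list of (features, label) pairs."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if nearest_neighbour_label(train, features) == label:
            correct += 1
    return correct / len(samples)


# Toy two-feature data, illustrative only.
toy = [((1.0, 2.0), "low"), ((1.2, 1.9), "low"), ((0.9, 2.1), "low"),
       ((5.0, 6.0), "high"), ((5.1, 5.8), "high"), ((4.9, 6.2), "high")]
```

With only 120 patients, as in this study, LOOCV uses nearly all of the data for training in every fold, at the cost of fitting one model per sample.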
Zhang, Xin; Yan, Lin-Feng; Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin
2017-07-18
Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of varied machine learning methods in differentiating low-grade gliomas (LGGs) and high-grade gliomas (HGGs) as well as WHO grade II, III and IV gliomas based on multi-parametric MRI images was proposed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using a leave-one-out cross validation (LOOCV) strategy. In addition, the influences of parameter selection on the classifying performances were investigated. We found that support vector machine (SVM) exhibited superior performance to other classifiers. By combining all tumor attributes with synthetic minority over-sampling technique (SMOTE), the highest classifying accuracy of 0.945 for LGG versus HGG and 0.961 for grade II, III and IV gliomas was achieved. Application of the Recursive Feature Elimination (RFE) attribute selection strategy further improved the classifying accuracies. The performances of the LibSVM, SMO and IBk classifiers were also influenced by some key parameters such as kernel type, c, gamma, K, etc. SVM is a promising tool for developing an automated preoperative glioma grading system, especially when combined with the RFE strategy. Model parameters should be considered in glioma grading model optimization.
NASA Astrophysics Data System (ADS)
Hoffmann, Achim; Mahidadia, Ashesh
The purpose of this chapter is to present fundamental ideas and techniques of machine learning suitable for the field of this book, i.e., for automated scientific discovery. The chapter focuses on those symbolic machine learning methods, which produce results that are suitable to be interpreted and understood by humans. This is particularly important in the context of automated scientific discovery as the scientific theories to be produced by machines are usually meant to be interpreted by humans. This chapter contains some of the most influential ideas and concepts in machine learning research to give the reader a basic insight into the field. After the introduction in Sect. 1, general ideas of how learning problems can be framed are given in Sect. 2. The section provides useful perspectives to better understand what learning algorithms actually do. Section 3 presents the Version space model which is an early learning algorithm as well as a conceptual framework, that provides important insight into the general mechanisms behind most learning algorithms. In section 4, a family of learning algorithms, the AQ family for learning classification rules is presented. The AQ family belongs to the early approaches in machine learning. The next, Sect. 5 presents the basic principles of decision tree learners. Decision tree learners belong to the most influential class of inductive learning algorithms today. Finally, a more recent group of learning systems are presented in Sect. 6, which learn relational concepts within the framework of logic programming. This is a particularly interesting group of learning systems since the framework allows also to incorporate background knowledge which may assist in generalisation. Section 7 discusses Association Rules - a technique that comes from the related field of Data mining. Section 8 presents the basic idea of the Naive Bayesian Classifier. 
While this is a very popular learning technique, the learning result is not well suited for human comprehension as it is essentially a large collection of probability values. In Sect. 9, we present a generic method for improving accuracy of a given learner by generating multiple classifiers using variations of the training data. While this works well in most cases, the resulting classifiers have significantly increased complexity and, hence, tend to destroy the human readability of the learning result that a single learner may produce. Section 10 contains a summary, mentions briefly other techniques not discussed in this chapter and presents an outlook on the potential of machine learning in the future.
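The idea of generating multiple classifiers from variations of the training data can be sketched as bootstrap aggregation (bagging). This is an assumption about which method Sect. 9 covers, since the abstract does not name it. The base learner here (1-nearest-neighbour on a single feature), the function name, and the data are all hypothetical.

```python
import random
from collections import Counter


def bagged_predict(data, x, n_models=25, seed=0):
    """Majority vote over base learners, each fit on a bootstrap resample of data."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Bootstrap: draw len(data) examples with replacement.
        sample = [rng.choice(data) for _ in data]
        # Base learner: label of the nearest resampled example (1-NN on one feature).
        nearest = min(sample, key=lambda pair: abs(pair[0] - x))
        votes[nearest[1]] += 1
    return votes.most_common(1)[0][0]


# Made-up (feature, label) pairs for illustration only.
data = [(0.1, "neg"), (0.2, "neg"), (0.3, "neg"),
        (0.8, "pos"), (0.9, "pos"), (1.0, "pos")]
print(bagged_predict(data, 0.95))
```

The ensemble's answer is a vote across 25 resampled models rather than one readable rule, which is the loss of human readability the abstract points out.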
A Machine Learning Approach to Automated Gait Analysis for the Noldus Catwalk System.
Frohlich, Holger; Claes, Kasper; De Wolf, Catherine; Van Damme, Xavier; Michel, Anne
2018-05-01
Gait analysis of animal disease models can provide valuable insights into in vivo compound effects and thus help in preclinical drug development. The purpose of this paper is to establish a computational gait analysis approach for the Noldus Catwalk system, in which footprints are automatically captured and stored. We present a - to our knowledge - first machine learning based approach for the Catwalk system, which comprises a step decomposition, definition and extraction of meaningful features, multivariate step sequence alignment, feature selection, and training of different classifiers (gradient boosting machine, random forest, and elastic net). Using animal-wise leave-one-out cross validation we demonstrate that with our method we can reliably separate movement patterns of a putative Parkinson's disease animal model and several control groups. Furthermore, we show that we can predict the time point after and the type of different brain lesions and can even forecast the brain region where the intervention was applied. We provide an in-depth analysis of the features involved in our classifiers via statistical techniques for model interpretation. A machine learning method for automated analysis of data from the Noldus Catwalk system was established. Our work shows the ability of machine learning to discriminate pharmacologically relevant animal groups based on their walking behavior in a multivariate manner. Further interesting aspects of the approach include the ability to learn from past experiments, improve with more data arriving and to make predictions for single animals in future studies.
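Animal-wise leave-one-out cross validation holds out all recordings of one animal at a time, so that no animal contributes to both training and testing. The sketch below shows only the splitting step. The function name and animal identifiers are hypothetical and not taken from the paper.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding out one animal per split."""
    for held_out in sorted(set(groups)):
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# One entry per recorded run; several runs may belong to the same animal.
animal_ids = ["rat1", "rat1", "rat2", "rat3", "rat3", "rat3"]
splits = list(leave_one_group_out(animal_ids))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```

Splitting by run instead of by animal would let the model see other runs of the test animal during training, which tends to inflate the estimated accuracy.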
PredicT-ML: a tool for automating machine learning model building with big clinical data.
Luo, Gang
2016-01-01
Predictive modeling is fundamental to transforming large clinical data sets, or "big clinical data," into actionable knowledge for various healthcare applications. Machine learning is a major predictive modeling approach, but two barriers make its use in healthcare challenging. First, a machine learning tool user must choose an algorithm and assign one or more model parameters called hyper-parameters before model training. The algorithm and hyper-parameter values used typically impact model accuracy by over 40 %, but their selection requires many labor-intensive manual iterations that can be difficult even for computer scientists. Second, many clinical attributes are repeatedly recorded over time, requiring temporal aggregation before predictive modeling can be performed. Many labor-intensive manual iterations are required to identify a good pair of aggregation period and operator for each clinical attribute. Both barriers result in time and human resource bottlenecks, and preclude healthcare administrators and researchers from asking a series of what-if questions when probing opportunities to use predictive models to improve outcomes and reduce costs. This paper describes our design of and vision for PredicT-ML (prediction tool using machine learning), a software system that aims to overcome these barriers and automate machine learning model building with big clinical data. The paper presents the detailed design of PredicT-ML. PredicT-ML will open the use of big clinical data to thousands of healthcare administrators and researchers and increase the ability to advance clinical research and improve healthcare.
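The temporal aggregation step mentioned in this abstract pairs an aggregation period with an operator for each repeatedly recorded attribute. The sketch below illustrates that idea only and is not PredicT-ML's design: the function name, the operator set, and the glucose readings are assumptions made for the example.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical operator set; a real tool may offer different or more operators.
OPERATORS = {
    "mean": mean,
    "max": max,
    "min": min,
    "last": lambda values: values[-1],
    "count": len,
}


def aggregate(records, index_date, period_days, operator):
    """Aggregate values recorded during the period_days before index_date."""
    start = index_date - timedelta(days=period_days)
    window = [v for d, v in sorted(records) if start <= d < index_date]
    return OPERATORS[operator](window) if window else None


# Made-up glucose readings as (date, value) pairs.
glucose = [(date(2015, 12, 1), 110), (date(2016, 1, 5), 140), (date(2016, 1, 20), 120)]
index = date(2016, 1, 31)
for period, op in [(30, "max"), (30, "mean"), (90, "count"), (7, "last")]:
    print(period, op, aggregate(glucose, index, period, op))
```

Each (period, operator) pair yields a different feature from the same raw series, which is why searching for a good pair by hand takes many iterations.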
Automated negotiation in environmental resource management: Review and assessment.
Eshragh, Faezeh; Pooyandeh, Majeed; Marceau, Danielle J
2015-10-01
Negotiation is an integral part of our daily life and plays an important role in resolving conflicts and facilitating human interactions. Automated negotiation, which aims at capturing the human negotiation process using artificial intelligence and machine learning techniques, is well-established in e-commerce, but its application in environmental resource management remains limited. This is due to the inherent uncertainties and complexity of environmental issues, along with the diversity of stakeholders' perspectives when dealing with these issues. The objective of this paper is to describe the main components of automated negotiation, review and compare machine learning techniques in automated negotiation, and provide a guideline for the selection of suitable methods in the particular context of stakeholders' negotiation over environmental resource issues. We advocate that automated negotiation can facilitate the involvement of stakeholders in the exploration of a plurality of solutions in order to reach a mutually satisfying agreement and contribute to informed decisions in environmental management along with the need for further studies to consolidate the potential of this modeling approach. Copyright © 2015 Elsevier Ltd. All rights reserved.
Automated assessment of cognitive health using smart home technologies.
Dawadi, Prafulla N; Cook, Diane J; Schmitter-Edgecombe, Maureen; Parsey, Carolyn
2013-01-01
The goal of this work is to develop intelligent systems to monitor the wellbeing of individuals in their home environments. This paper introduces a machine learning-based method to automatically predict activity quality in smart homes and automatically assess cognitive health based on activity quality. This paper describes an automated framework to extract a set of features from smart home sensor data that reflects the activity performance or ability of an individual to complete an activity and that can be input to machine learning algorithms. Outputs from learning algorithms including principal component analysis, support vector machine, and logistic regression algorithms are used to quantify activity quality for a complex set of smart home activities and predict cognitive health of participants. Smart home activity data was gathered from volunteer participants (n=263) who performed a complex set of activities in our smart home testbed. We compare our automated activity quality prediction and cognitive health prediction with direct observation scores and health assessment obtained from neuropsychologists. With all samples included, we obtained statistically significant correlation (r=0.54) between direct observation scores and predicted activity quality. Similarly, using a support vector machine classifier, we obtained reasonable classification accuracy (area under the ROC curve=0.80, g-mean=0.73) in classifying participants into two different cognitive classes, dementia and cognitively healthy. The results suggest that it is possible to automatically quantify the task quality of smart home activities and perform limited assessment of the cognitive health of individuals if smart home activities are properly chosen and learning algorithms are appropriately trained.
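The g-mean reported alongside the ROC area is commonly defined as the geometric mean of sensitivity and specificity. The abstract does not spell out its definition, so the sketch below assumes that usual one. The confusion-matrix counts are made up for illustration.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Made-up counts: 10 positive and 10 negative participants.
print(round(g_mean(tp=8, fn=2, tn=9, fp=1), 4))
# A classifier that always predicts the majority class scores 0.
print(g_mean(tp=0, fn=10, tn=90, fp=0))
```

Unlike plain accuracy, the measure collapses to zero when either class is never predicted correctly, which makes it informative for imbalanced groups.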
Automated Assessment of Cognitive Health Using Smart Home Technologies
Dawadi, Prafulla N.; Cook, Diane J.; Schmitter-Edgecombe, Maureen; Parsey, Carolyn
2014-01-01
BACKGROUND The goal of this work is to develop intelligent systems to monitor the well being of individuals in their home environments. OBJECTIVE This paper introduces a machine learning-based method to automatically predict activity quality in smart homes and automatically assess cognitive health based on activity quality. METHODS This paper describes an automated framework to extract a set of features from smart home sensor data that reflects the activity performance or ability of an individual to complete an activity and that can be input to machine learning algorithms. Outputs from learning algorithms including principal component analysis, support vector machine, and logistic regression algorithms are used to quantify activity quality for a complex set of smart home activities and predict cognitive health of participants. RESULTS Smart home activity data was gathered from volunteer participants (n=263) who performed a complex set of activities in our smart home testbed. We compare our automated activity quality prediction and cognitive health prediction with direct observation scores and health assessment obtained from neuropsychologists. With all samples included, we obtained statistically significant correlation (r=0.54) between direct observation scores and predicted activity quality. Similarly, using a support vector machine classifier, we obtained reasonable classification accuracy (area under the ROC curve = 0.80, g-mean = 0.73) in classifying participants into two different cognitive classes, dementia and cognitively healthy. CONCLUSIONS The results suggest that it is possible to automatically quantify the task quality of smart home activities and perform limited assessment of the cognitive health of individuals if smart home activities are properly chosen and learning algorithms are appropriately trained. PMID:23949177
Silva, Fabrício R; Vidotti, Vanessa G; Cremasco, Fernanda; Dias, Marcelo; Gomi, Edson S; Costa, Vital P
2013-01-01
To evaluate the sensitivity and specificity of machine learning classifiers (MLCs) for glaucoma diagnosis using Spectral Domain OCT (SD-OCT) and standard automated perimetry (SAP). Observational cross-sectional study. Sixty-two glaucoma patients and 48 healthy individuals were included. All patients underwent a complete ophthalmologic examination, achromatic standard automated perimetry (SAP) and retinal nerve fiber layer (RNFL) imaging with SD-OCT (Cirrus HD-OCT; Carl Zeiss Meditec Inc., Dublin, California). Receiver operating characteristic (ROC) curves were obtained for all SD-OCT parameters and global indices of SAP. Subsequently, the following MLCs were tested using parameters from the SD-OCT and SAP: Bagging (BAG), Naive-Bayes (NB), Multilayer Perceptron (MLP), Radial Basis Function (RBF), Random Forest (RAN), Ensemble Selection (ENS), Classification Tree (CTREE), Ada Boost M1 (ADA), Support Vector Machine Linear (SVML) and Support Vector Machine Gaussian (SVMG). Areas under the receiver operating characteristic curves (aROC) obtained for isolated SAP and OCT parameters were compared with MLCs using OCT+SAP data. Combining OCT and SAP data, MLCs' aROCs varied from 0.777 (CTREE) to 0.946 (RAN). The best OCT+SAP aROC obtained with RAN (0.946) was significantly larger than the best single OCT parameter (p<0.05), but was not significantly different from the aROC obtained with the best single SAP parameter (p=0.19). Machine learning classifiers trained on OCT and SAP data can successfully discriminate between healthy and glaucomatous eyes. The combination of OCT and SAP measurements improved the diagnostic accuracy compared with OCT data alone.
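The area under the ROC curve used to compare these classifiers equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way. The function name and the scores are made up for illustration, and the paper's own software may compute the area differently.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up classifier scores for glaucomatous (positive) and healthy (negative) eyes.
glaucoma_scores = [0.9, 0.8, 0.4]
healthy_scores = [0.5, 0.3, 0.2]
print(roc_auc(glaucoma_scores, healthy_scores))
```

An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives the reported range of 0.777 to 0.946 its scale.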
Automated edge finishing using an active XY table
Loucks, Clifford S.; Starr, Gregory P.
1993-01-01
The disclosure is directed to an apparatus and method for automated edge finishing using hybrid position/force control of an XY table. The disclosure is particularly directed to learning the trajectory of the edge of a workpiece by "guarded moves". Machining is done by controllably moving the XY table, with the workpiece mounted thereon, along the learned trajectory with feedback from a force sensor. Other similar workpieces can be mounted, without a fixture on the XY table, located and the learned trajectory adjusted
Advanced, Analytic, Automated (AAA) Measurement of Engagement During Learning
D’Mello, Sidney; Dieterle, Ed; Duckworth, Angela
2017-01-01
It is generally acknowledged that engagement plays a critical role in learning. Unfortunately, the study of engagement has been stymied by a lack of valid and efficient measures. We introduce the advanced, analytic, and automated (AAA) approach to measure engagement at fine-grained temporal resolutions. The AAA measurement approach is grounded in embodied theories of cognition and affect, which advocate a close coupling between thought and action. It uses machine-learned computational models to automatically infer mental states associated with engagement (e.g., interest, flow) from machine-readable behavioral and physiological signals (e.g., facial expressions, eye tracking, click-stream data) and from aspects of the environmental context. We present 15 case studies that illustrate the potential of the AAA approach for measuring engagement in digital learning environments. We discuss strengths and weaknesses of the AAA approach, concluding that it has significant promise to catalyze engagement research. PMID:29038607
Real-time detection of transients in OGLE-IV with application of machine learning
NASA Astrophysics Data System (ADS)
Klencki, Jakub; Wyrzykowski, Łukasz
2016-06-01
The current bottleneck of transient detection in most surveys is the problem of rejecting numerous artifacts from detected candidates. We present a triple-stage hierarchical machine learning system for automated artifact filtering in difference imaging, based on self-organizing maps. The classifier, when tested on the OGLE-IV Transient Detection System, accepts 97% of real transients while removing up to 97.5% of artifacts.
Ellis, Katherine; Godbole, Suneeta; Marshall, Simon; Lanckriet, Gert; Staudenmayer, John; Kerr, Jacqueline
2014-01-01
Background: Active travel is an important area in physical activity research, but objective measurement of active travel is still difficult. Automated methods to measure travel behaviors will improve research in this area. In this paper, we present a supervised machine learning method for transportation mode prediction from global positioning system (GPS) and accelerometer data. Methods: We collected a dataset of about 150 h of GPS and accelerometer data from two research assistants following a protocol of prescribed trips consisting of five activities: bicycling, riding in a vehicle, walking, sitting, and standing. We extracted 49 features from 1-min windows of this data. We compared the performance of several machine learning algorithms and chose a random forest algorithm to classify the transportation mode. We used a moving average output filter to smooth the output predictions over time. Results: The random forest algorithm achieved 89.8% cross-validated accuracy on this dataset. Adding the moving average filter to smooth output predictions increased the cross-validated accuracy to 91.9%. Conclusion: Machine learning methods are a viable approach for automating measurement of active travel, particularly for measuring travel activities that traditional accelerometer data processing methods misclassify, such as bicycling and vehicle travel. PMID:24795875
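The moving average output filter can be sketched as follows. The abstract does not say whether the filter acts on class probabilities or on labels, so this example assumes per-class probabilities averaged over a centered window. The function name, window length, and probabilities are hypothetical.

```python
def smooth_predictions(prob_rows, classes, window=3):
    """Average per-class probabilities over a centered window, then pick the top class."""
    half = window // 2
    smoothed = []
    for t in range(len(prob_rows)):
        lo = max(0, t - half)
        hi = min(len(prob_rows), t + half + 1)
        avg = [
            sum(row[c] for row in prob_rows[lo:hi]) / (hi - lo)
            for c in range(len(classes))
        ]
        smoothed.append(classes[avg.index(max(avg))])
    return smoothed


classes = ["walking", "bicycling"]
# Made-up probabilities for five consecutive 1-min windows.
probs = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.7, 0.3]]
raw = [classes[row.index(max(row))] for row in probs]
print(raw)
print(smooth_predictions(probs, classes, window=3))
```

The isolated "bicycling" prediction in the third minute is overruled by its neighbours, which is the kind of correction that would explain the reported gain from 89.8% to 91.9%.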
Automated inspection of bread and loaves
NASA Astrophysics Data System (ADS)
Batchelor, Bruce G.
1993-08-01
The prospects for building practical automated inspection machines, capable of detecting the following faults in ordinary, everyday loaves are reviewed: (1) foreign bodies, using X-rays, (2) texture changes, using glancing illumination, mathematical morphology and Neural Net learning techniques, and (3) shape deformations, using structured lighting and simple geometry.
Chemically intuited, large-scale screening of MOFs by machine learning techniques
NASA Astrophysics Data System (ADS)
Borboudakis, Giorgos; Stergiannakos, Taxiarchis; Frysali, Maria; Klontzas, Emmanuel; Tsamardinos, Ioannis; Froudakis, George E.
2017-10-01
A novel computational methodology for large-scale screening of MOFs is applied to gas storage with the use of machine learning technologies. This approach is a promising trade-off between the accuracy of ab initio methods and the speed of classical approaches, strategically combined with chemical intuition. The results demonstrate that the chemical properties of MOFs are indeed predictable (stochastically, not deterministically) using machine learning methods and automated analysis protocols, with the accuracy of predictions increasing with sample size. Our initial results indicate that this methodology is promising to apply not only to gas storage in MOFs but in many other material science projects.
Automated Cognitive Health Assessment Using Smart Home Monitoring of Complex Tasks
Dawadi, Prafulla N.; Cook, Diane J.; Schmitter-Edgecombe, Maureen
2014-01-01
One of the many services that intelligent systems can provide is the automated assessment of resident well-being. We hypothesize that the functional health of individuals, or ability of individuals to perform activities independently without assistance, can be estimated by tracking their activities using smart home technologies. In this paper, we introduce a machine learning-based method for assessing activity quality in smart homes. To validate our approach we quantify activity quality for 179 volunteer participants who performed a complex, interweaved set of activities in our smart home apartment. We observed a statistically significant correlation (r=0.79) between automated assessment of task quality and direct observation scores. Using machine learning techniques to predict the cognitive health of the participants based on task quality is accomplished with an AUC value of 0.64. We believe that this capability is an important step in understanding everyday functional health of individuals in their home environments. PMID:25530925
Automated Cognitive Health Assessment Using Smart Home Monitoring of Complex Tasks.
Dawadi, Prafulla N; Cook, Diane J; Schmitter-Edgecombe, Maureen
2013-11-01
One of the many services that intelligent systems can provide is the automated assessment of resident well-being. We hypothesize that the functional health of individuals, or ability of individuals to perform activities independently without assistance, can be estimated by tracking their activities using smart home technologies. In this paper, we introduce a machine learning-based method for assessing activity quality in smart homes. To validate our approach we quantify activity quality for 179 volunteer participants who performed a complex, interweaved set of activities in our smart home apartment. We observed a statistically significant correlation (r=0.79) between automated assessment of task quality and direct observation scores. Using machine learning techniques to predict the cognitive health of the participants based on task quality is accomplished with an AUC value of 0.64. We believe that this capability is an important step in understanding everyday functional health of individuals in their home environments.
Computational Analysis of Behavior.
Egnor, S E Roian; Branson, Kristin
2016-07-08
In this review, we discuss the emerging field of computational behavioral analysis: the use of modern methods from computer science and engineering to quantitatively measure animal behavior. We discuss aspects of experiment design important to both obtaining biologically relevant behavioral data and enabling the use of machine vision and learning techniques for automation. These two goals are often in conflict. Restraining or restricting the environment of the animal can simplify automatic behavior quantification, but it can also degrade the quality or alter important aspects of behavior. To enable biologists to design experiments to obtain better behavioral measurements, and computer scientists to pinpoint fruitful directions for algorithm improvement, we review known effects of artificial manipulation of the animal on behavior. We also review machine vision and learning techniques for tracking, feature extraction, automated behavior classification, and automated behavior discovery, the assumptions they make, and the types of data they work best with.
Libraries Can Learn from Banks.
ERIC Educational Resources Information Center
Lawrence, Gail H.
1983-01-01
The experiences of banks introducing computerized services to the public are described to provide some idea of what libraries can expect when they introduce online systems. Volume of use of Automated Teller Machines, types of users, introduction of machines, and user acceptance are highlighted. Thirty-two references are cited. (EJS)
Deep Interactive Learning with Sharkzor
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
Sharkzor is a web application for machine-learning assisted image sort and summary. Deep learning algorithms are leveraged to infer, augment, and automate the user’s mental model. Initially, images uploaded by the user are spread out on a canvas. The user then interacts with the images to impute their mental model into the application's algorithmic underpinnings. Methods of interaction within Sharkzor’s user interface and user experience support three primary user tasks: triage, organize and automate. The user triages the large pile of overlapping images by moving images of interest into proximity. The user then organizes said images into meaningful groups. After interacting with the images and groups, deep learning helps to automate the user’s interactions. The loop of interaction, automation, and response by the user allows the system to quickly make sense of large amounts of data.
AstroML: Python-powered Machine Learning for Astronomy
NASA Astrophysics Data System (ADS)
Vander Plas, Jake; Connolly, A. J.; Ivezic, Z.
2014-01-01
As astronomical data sets grow in size and complexity, automated machine learning and data mining methods are becoming an increasingly fundamental component of research in the field. The astroML project (http://astroML.org) provides a common repository for practical examples of the data mining and machine learning tools used and developed by astronomical researchers, written in Python. The astroML module contains a host of general-purpose data analysis and machine learning routines, loaders for openly-available astronomical datasets, and fast implementations of specific computational methods often used in astronomy and astrophysics. The associated website features hundreds of examples of these routines being used for analysis of real astronomical datasets, while the associated textbook provides a curriculum resource for graduate-level courses focusing on practical statistics, machine learning, and data mining approaches within Astronomical research. This poster will highlight several of the more powerful and unique examples of analysis performed with astroML, all of which can be reproduced in their entirety on any computer with the proper packages installed.
Automated analysis of high-content microscopy data with deep learning.
Kraus, Oren Z; Grys, Ben T; Ba, Jimmy; Chong, Yolanda; Frey, Brendan J; Boone, Charles; Andrews, Brenda J
2017-04-18
Existing computational pipelines for quantitative analysis of high-content microscopy data rely on traditional machine learning approaches that fail to accurately classify more than a single dataset without substantial tuning and training, requiring extensive analysis. Here, we demonstrate that the application of deep learning to biological image data can overcome the pitfalls associated with conventional machine learning classifiers. Using a deep convolutional neural network (DeepLoc) to analyze yeast cell images, we show improved performance over traditional approaches in the automated classification of protein subcellular localization. We also demonstrate the ability of DeepLoc to classify highly divergent image sets, including images of pheromone-arrested cells with abnormal cellular morphology, as well as images generated in different genetic backgrounds and in different laboratories. We offer an open-source implementation that enables updating DeepLoc on new microscopy datasets. This study highlights deep learning as an important tool for the expedited analysis of high-content microscopy data. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.
Combining Offline and Online Computation for Solving Partially Observable Markov Decision Process
2015-03-06
David Hsu and Wee Sun Lee, Monte Carlo Bayesian Reinforcement Learning, International Conference on Machine Learning (ICML), 2012. • Haoyu Bai, David...and Automation (ICRA), 2015. • Zhan Wei Lim, David Hsu, and Wee Sun Lee, Adaptive Informative Path Planning in Metric Spaces. Submitted to Int. J... Automation (ICRA), 2015. 2. Bai, H., Hsu, D., Kochenderfer, M. J., and Lee, W. S., Unmanned aircraft collision avoidance using continuous state POMDPs
Automated Scoring of Chinese Engineering Students' English Essays
ERIC Educational Resources Information Center
Liu, Ming; Wang, Yuqi; Xu, Weiwei; Liu, Li
2017-01-01
The number of Chinese engineering students has increased greatly since 1999. Rating the quality of these students' English essays has thus become time-consuming and challenging. This paper presents a novel automatic essay scoring algorithm called PSOSVR, based on a machine learning algorithm, Support Vector Machine for Regression (SVR), and a…
An Automated System for Skeletal Maturity Assessment by Extreme Learning Machines
Mansourvar, Marjan; Shamshirband, Shahaboddin; Raj, Ram Gopal; Gunalan, Roshan; Mazinani, Iman
2015-01-01
Assessing skeletal age is a subjective and tedious examination process. Hence, automated assessment methods have been developed to replace manual evaluation in medical applications. In this study, a new fully automated method based on content-based image retrieval and using extreme learning machines (ELM) is designed and adapted to assess skeletal maturity. The main novelty of this approach is that it overcomes the segmentation problem suffered by existing systems. The estimation results of ELM models are compared with those of genetic programming (GP) and artificial neural networks (ANNs) models. The experimental results signify improvement in assessment accuracy over GP and ANN, while generalization capability is possible with the ELM approach. Moreover, the results indicate that the ELM model developed can be used confidently in further work on formulating novel models of skeletal age assessment strategies. According to the experimental results, the newly presented method has the capacity to learn many hundreds of times faster than traditional learning methods and it has sufficient overall performance in many aspects. It has conclusively been found that applying ELM is particularly promising as an alternative method for evaluating skeletal age. PMID:26402795
NASA Astrophysics Data System (ADS)
Sopharak, Akara; Uyyanonvara, Bunyarit; Barman, Sarah; Williamson, Thomas
To prevent blindness from diabetic retinopathy, periodic screening and early diagnosis are necessary. Due to the lack of expert ophthalmologists in rural areas, automated early detection of exudates (one of the visible signs of diabetic retinopathy) could help to reduce the number of cases of blindness in diabetic patients. Traditional automatic exudate detection methods are based on specific parameter configurations, while machine learning approaches, which seem more flexible, may be computationally expensive. A comparative analysis of traditional and machine learning methods of exudate detection, namely mathematical morphology, fuzzy c-means clustering, naive Bayesian classifier, Support Vector Machine and Nearest Neighbor classifier, is presented. Detected exudates are validated against expert ophthalmologists' hand-drawn ground truths. The sensitivity, specificity, precision, accuracy and time complexity of each method are also compared.
Atkinson, Jonathan A; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E; Griffiths, Marcus; Wells, Darren M
2017-10-01
Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. © The Authors 2017. Published by Oxford University Press.
Atkinson, Jonathan A.; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E.; Griffiths, Marcus
2017-01-01
Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. PMID:29020748
Machine learning in cardiovascular medicine: are we there yet?
Shameer, Khader; Johnson, Kipp W; Glicksberg, Benjamin S; Dudley, Joel T; Sengupta, Partho P
2018-01-19
Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine learning in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strength and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Automated EEG artifact elimination by applying machine learning algorithms to ICA-based features.
Radüntz, Thea; Scouten, Jon; Hochmuth, Olaf; Meffert, Beate
2017-08-01
Biological and non-biological artifacts cause severe problems when dealing with electroencephalogram (EEG) recordings. Independent component analysis (ICA) is a widely used method for eliminating various artifacts from recordings. However, evaluating and classifying the calculated independent components (IC) as artifact or EEG is not fully automated at present. In this study, we propose a new approach for automated artifact elimination, which applies machine learning algorithms to ICA-based features. We compared the performance of our classifiers with the visual classification results given by experts. The best result with an accuracy rate of 95% was achieved using features obtained by range filtering of the topoplots and IC power spectra combined with an artificial neural network. Compared with the existing automated solutions, our proposed method is not limited to specific types of artifacts, electrode configurations, or number of EEG channels. The main advantage of the proposed method is that it provides an automatic, reliable, real-time capable, and practical tool, which avoids the need for the time-consuming manual selection of ICs during artifact removal.
Automated EEG artifact elimination by applying machine learning algorithms to ICA-based features
NASA Astrophysics Data System (ADS)
Radüntz, Thea; Scouten, Jon; Hochmuth, Olaf; Meffert, Beate
2017-08-01
Objective. Biological and non-biological artifacts cause severe problems when dealing with electroencephalogram (EEG) recordings. Independent component analysis (ICA) is a widely used method for eliminating various artifacts from recordings. However, evaluating and classifying the calculated independent components (IC) as artifact or EEG is not fully automated at present. Approach. In this study, we propose a new approach for automated artifact elimination, which applies machine learning algorithms to ICA-based features. Main results. We compared the performance of our classifiers with the visual classification results given by experts. The best result with an accuracy rate of 95% was achieved using features obtained by range filtering of the topoplots and IC power spectra combined with an artificial neural network. Significance. Compared with the existing automated solutions, our proposed method is not limited to specific types of artifacts, electrode configurations, or number of EEG channels. The main advantage of the proposed method is that it provides an automatic, reliable, real-time capable, and practical tool, which avoids the need for the time-consuming manual selection of ICs during artifact removal.
NASA Astrophysics Data System (ADS)
Rose, R.; Aizenman, H.; Mei, E.; Choudhury, N.
2013-12-01
High school students interested in the STEM fields benefit most when actively participating, so I created a series of learning modules on how to analyze complex systems using machine learning that give automated feedback to students. The automated feedback gives timely responses that encourage the students to continue testing and enhancing their programs. I have designed my modules to take the tactical learning approach in conveying the concepts behind correlation, linear regression, and vector distance based classification and clustering. On successful completion of these modules, students will learn how to calculate linear regression and Pearson's correlation, and how to apply classification and clustering techniques to a dataset. Working on these modules will allow the students to take what they've learned back to the classroom and apply it to the Earth Science curriculum. During my research this summer, we applied these lessons to analyzing river deltas; we looked at trends in the different variables over time, looked for similarities in NDVI, precipitation, inundation, runoff and discharge, and attempted to predict floods based on the precipitation, waves mean, area of discharge, NDVI, and inundation.
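The two calculations named in the abstract above, Pearson's correlation and a least-squares line, can be sketched as follows. This is an illustrative sketch only, not the modules' own code: the function names and the toy data are assumptions introduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy data (hypothetical): a perfectly linear relationship y = 2x + 1.
precipitation = [1.0, 2.0, 3.0, 4.0]
discharge = [3.0, 5.0, 7.0, 9.0]
slope, intercept = linear_fit(precipitation, discharge)
r = pearson_r(precipitation, discharge)
print(slope, intercept, r)
```

Both functions share the same centered sums, which makes the link between the two quantities visible: the slope and the correlation have the same sign and differ only in how the covariance term is normalized.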
Anomaly detection using temporal data mining in a smart home environment.
Jakkula, V; Cook, D J
2008-01-01
To many people, home is a sanctuary. With the maturing of smart home technologies, many people with cognitive and physical disabilities can lead independent lives in their own homes for extended periods of time. In this paper, we investigate the design of machine learning algorithms that support this goal. We hypothesize that machine learning algorithms can be designed to automatically learn models of resident behavior in a smart home, and that the results can be used to perform automated health monitoring and to detect anomalies. Specifically, our algorithms draw upon the temporal nature of sensor data collected in a smart home to build a model of expected activities and to detect unexpected, and possibly health-critical, events in the home. We validate our algorithms using synthetic data and real activity data collected from volunteers in an automated smart environment. The results from our experiments support our hypothesis that a model can be learned from observed smart home data and used to report anomalies, as they occur, in a smart home.
Intelligent machines in the twenty-first century: foundations of inference and inquiry.
Knuth, Kevin H
2003-12-15
The last century saw the application of Boolean algebra to the construction of computing machines, which work by applying logical transformations to information contained in their memory. The development of information theory and the generalization of Boolean algebra to Bayesian inference have enabled these computing machines, in the last quarter of the twentieth century, to be endowed with the ability to learn by making inferences from data. This revolution is just beginning as new computational techniques continue to make difficult problems more accessible. Recent advances in our understanding of the foundations of probability theory have revealed implications for areas other than logic. Of relevance to intelligent machines, we recently identified the algebra of questions as the free distributive algebra, which will now allow us to work with questions in a way analogous to that which Boolean algebra enables us to work with logical statements. In this paper, we examine the foundations of inference and inquiry. We begin with a history of inferential reasoning, highlighting key concepts that have led to the automation of inference in modern machine-learning systems. We then discuss the foundations of inference in more detail using a modern viewpoint that relies on the mathematics of partially ordered sets and the scaffolding of lattice theory. This new viewpoint allows us to develop the logic of inquiry and introduce a measure describing the relevance of a proposed question to an unresolved issue. Last, we will demonstrate the automation of inference, and discuss how this new logic of inquiry will enable intelligent machines to ask questions. 
Automation of both inference and inquiry promises to allow robots to perform science in the far reaches of our solar system and in other star systems by enabling them not only to make inferences from data, but also to decide which question to ask, which experiment to perform, or which measurement to take given what they have learned and what they are designed to understand.
Intelligent machines in the twenty-first century: foundations of inference and inquiry
NASA Technical Reports Server (NTRS)
Knuth, Kevin H.
2003-01-01
The last century saw the application of Boolean algebra to the construction of computing machines, which work by applying logical transformations to information contained in their memory. The development of information theory and the generalization of Boolean algebra to Bayesian inference have enabled these computing machines, in the last quarter of the twentieth century, to be endowed with the ability to learn by making inferences from data. This revolution is just beginning as new computational techniques continue to make difficult problems more accessible. Recent advances in our understanding of the foundations of probability theory have revealed implications for areas other than logic. Of relevance to intelligent machines, we recently identified the algebra of questions as the free distributive algebra, which will now allow us to work with questions in a way analogous to that which Boolean algebra enables us to work with logical statements. In this paper, we examine the foundations of inference and inquiry. We begin with a history of inferential reasoning, highlighting key concepts that have led to the automation of inference in modern machine-learning systems. We then discuss the foundations of inference in more detail using a modern viewpoint that relies on the mathematics of partially ordered sets and the scaffolding of lattice theory. This new viewpoint allows us to develop the logic of inquiry and introduce a measure describing the relevance of a proposed question to an unresolved issue. Last, we will demonstrate the automation of inference, and discuss how this new logic of inquiry will enable intelligent machines to ask questions. 
Automation of both inference and inquiry promises to allow robots to perform science in the far reaches of our solar system and in other star systems by enabling them not only to make inferences from data, but also to decide which question to ask, which experiment to perform, or which measurement to take given what they have learned and what they are designed to understand.
Jauregi Unanue, Iñigo; Zare Borzeshi, Ehsan; Piccardi, Massimo
2017-12-01
Previous state-of-the-art systems on Drug Name Recognition (DNR) and Clinical Concept Extraction (CCE) have focused on a combination of text "feature engineering" and conventional machine learning algorithms such as conditional random fields and support vector machines. However, developing good features is inherently heavily time-consuming. Conversely, more modern machine learning approaches such as recurrent neural networks (RNNs) have proved capable of automatically learning effective features from either random assignments or automated word "embeddings". (i) To create a highly accurate DNR and CCE system that avoids conventional, time-consuming feature engineering. (ii) To create richer, more specialized word embeddings by using health domain datasets such as MIMIC-III. (iii) To evaluate our systems over three contemporary datasets. Two deep learning methods, namely the Bidirectional LSTM and the Bidirectional LSTM-CRF, are evaluated. A CRF model is set as the baseline to compare the deep learning systems to a traditional machine learning approach. The same features are used for all the models. We have obtained the best results with the Bidirectional LSTM-CRF model, which has outperformed all previously proposed systems. The specialized embeddings have helped to cover unusual words in DrugBank and MedLine, but not in the i2b2/VA dataset. We present a state-of-the-art system for DNR and CCE. Automated word embeddings have allowed us to avoid costly feature engineering and achieve higher accuracy. Nevertheless, the embeddings need to be retrained over datasets that are adequate for the domain, in order to adequately cover the domain-specific vocabulary. Copyright © 2017 Elsevier Inc. All rights reserved.
Machine Learning Approaches in Cardiovascular Imaging.
Henglin, Mir; Stein, Gillian; Hushcha, Pavel V; Snoek, Jasper; Wiltschko, Alexander B; Cheng, Susan
2017-10-01
Cardiovascular imaging technologies continue to increase in their capacity to capture and store large quantities of data. Modern computational methods, developed in the field of machine learning, offer new approaches to leveraging the growing volume of imaging data available for analyses. Machine learning methods can now address data-related problems ranging from simple analytic queries of existing measurement data to the more complex challenges involved in analyzing raw images. To date, machine learning has been used in 2 broad and highly interconnected areas: automation of tasks that might otherwise be performed by a human and generation of clinically important new knowledge. Most cardiovascular imaging studies have focused on task-oriented problems, but more studies involving algorithms aimed at generating new clinical insights are emerging. Continued expansion in the size and dimensionality of cardiovascular imaging databases is driving strong interest in applying powerful deep learning methods, in particular, to analyze these data. Overall, the most effective approaches will require an investment in the resources needed to appropriately prepare such large data sets for analyses. Notwithstanding current technical and logistical challenges, machine learning and especially deep learning methods have much to offer and will substantially impact the future practice and science of cardiovascular imaging. © 2017 American Heart Association, Inc.
NASA Astrophysics Data System (ADS)
Singla, Neeru; Srivastava, Vishal; Singh Mehta, Dalip
2018-02-01
We report the first fully automated detection of human skin burn injuries in vivo, with the goal of automatic surgical margin assessment based on optical coherence tomography (OCT) images. Our proposed automated procedure entails building a machine-learning-based classifier by extracting quantitative features from normal and burn tissue images recorded by OCT. In this study, 56 samples (28 normal, 28 burned) were imaged by OCT and eight features were extracted. A linear model classifier was trained using 34 samples and 22 samples were used to test the model. Sensitivity of 91.6% and specificity of 90% were obtained. Our results demonstrate the capability of a computer-aided technique for accurately and automatically identifying burn tissue resection margins during surgical treatment.
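As a worked illustration of the two metrics reported above, the sketch below computes sensitivity and specificity from confusion-matrix counts. The abstract does not give the confusion matrix; the counts used here (11 of 12 burn samples and 9 of 10 normal samples classified correctly) are an assumption, chosen only because they are one split of 22 test samples that is consistent with the reported 91.6% and 90%.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of positive (burn) samples detected.
    specificity = TN / (TN + FP): fraction of negative (normal) samples cleared.
    """
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts, NOT taken from the paper: 12 burn and 10 normal test samples.
sens, spec = sensitivity_specificity(tp=11, fn=1, tn=9, fp=1)
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

With a test set this small, a single misclassified sample shifts either metric by roughly 8 to 10 percentage points, which is worth keeping in mind when reading the reported figures.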
Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady
2017-09-01
Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imaging and biomimetic sensors started gaining popularity as rapid and efficient methods for assessing food quality, safety and authentication, as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithms before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application, is presented, able to automate the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible for meat spoilage regardless of the packaging system applied. In particular, up to 7 regression methods were applied: ordinary least squares regression, stepwise linear regression, partial least square regression, principal component regression, support vector regression, random forest and k-nearest neighbours. "MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and a multispectral imaging instrument. Populations of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta were predicted. As a result, recommendations were obtained on which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case. The developed system is accessible via the link: www.sorfml.com. Copyright © 2017 Elsevier Ltd. All rights reserved.
Automated Essay Grading using Machine Learning Algorithm
NASA Astrophysics Data System (ADS)
Ramalingam, V. V.; Pandian, A.; Chetry, Prateek; Nigam, Himanshu
2018-04-01
Essays are paramount for assessing academic excellence, along with the ability to link different ideas and to recall, but they are notably time consuming to assess manually. Manual grading takes a significant amount of the evaluator's time and hence is an expensive process. Automated grading, if proven effective, will not only reduce the time for assessment but, by comparison with human scores, will also make the score realistic. The project aims to develop an automated essay assessment system using machine learning techniques that classify a corpus of textual entities into a small number of discrete categories corresponding to possible grades. A linear regression technique will be used for training the model, along with various other classification and clustering techniques. We intend to train classifiers on the training set, run them on the downloaded dataset, and then measure performance on our dataset by comparing the obtained values with the dataset values. We have implemented our model using Java.
Piccinini, Filippo; Balassa, Tamas; Szkalisity, Abel; Molnar, Csaba; Paavolainen, Lassi; Kujala, Kaisa; Buzas, Krisztina; Sarazova, Marie; Pietiainen, Vilja; Kutay, Ulrike; Smith, Kevin; Horvath, Peter
2017-06-28
High-content, imaging-based screens now routinely generate data on a scale that precludes manual verification and interrogation. Software applying machine learning has become an essential tool to automate analysis, but these methods require annotated examples to learn from. Efficiently exploring large datasets to find relevant examples remains a challenging bottleneck. Here, we present Advanced Cell Classifier (ACC), a graphical software package for phenotypic analysis that addresses these difficulties. ACC applies machine-learning and image-analysis methods to high-content data generated by large-scale, cell-based experiments. It features methods to mine microscopic image data, discover new phenotypes, and improve recognition performance. We demonstrate that these features substantially expedite the training process, successfully uncover rare phenotypes, and improve the accuracy of the analysis. ACC is extensively documented, designed to be user-friendly for researchers without machine-learning expertise, and distributed as a free open-source tool at www.cellclassifier.org. Copyright © 2017 Elsevier Inc. All rights reserved.
Applying machine learning classification techniques to automate sky object cataloguing
NASA Astrophysics Data System (ADS)
Fayyad, Usama M.; Doyle, Richard J.; Weir, W. Nick; Djorgovski, Stanislav
1993-08-01
We describe the application of Artificial Intelligence machine learning techniques to the development of an automated tool for the reduction of a large scientific data set. The 2nd Mt. Palomar Northern Sky Survey is nearly completed. This survey provides comprehensive coverage of the northern celestial hemisphere in the form of photographic plates. The plates are being transformed into digitized images whose quality will probably not be surpassed in the next ten to twenty years. The images are expected to contain on the order of 10^7 galaxies and 10^8 stars. Astronomers wish to determine which of these sky objects belong to various classes of galaxies and stars. Unfortunately, the size of this data set precludes analysis in an exclusively manual fashion. Our approach is to develop a software system which integrates the functions of independently developed techniques for image processing and data classification. Digitized sky images are passed through image processing routines to identify sky objects and to extract a set of features for each object. These routines are used to help select a useful set of attributes for classifying sky objects. Then GID3 (Generalized ID3) and O-B Tree, two inductive learning techniques, learn classification decision trees from examples. These classifiers will then be applied to new data. This development process is highly interactive, with astronomer input playing a vital role. Astronomers refine the feature set used to construct sky object descriptions, and evaluate the performance of the automated classification technique on new data. This paper gives an overview of the machine learning techniques with an emphasis on their general applicability, describes the details of our specific application, and reports the initial encouraging results. The results indicate that our machine learning approach is well-suited to the problem. The primary benefit of the approach is increased data reduction throughput. Another benefit is consistency of classification. The classification rules which are the product of the inductive learning techniques will form an objective, examinable basis for classifying sky objects. A final, not to be underestimated benefit is that astronomers will be freed from the tedium of an intensely visual task to pursue more challenging analysis and interpretation problems based on automatically catalogued data.
ERIC Educational Resources Information Center
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark
2018-01-01
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Automation in Vocational Training of the Mentally Retarded. Final Report.
ERIC Educational Resources Information Center
Platt, Henry; And Others
Various uses of automation in teaching were studied with mentally retarded (IQ 70 to 90) and/or emotionally disturbed (IQ 80 to 90) youth aged 16 to 20. Programed instruction was presented by six audiovisual devices and techniques: the Devereux Model 50 Teaching Aid, the Learn-Ease Teaching Device, the Mast Teaching Machine, the Graflex…
Autonomous Scanning Probe Microscopy in Situ Tip Conditioning through Machine Learning.
Rashidi, Mohammad; Wolkow, Robert A
2018-05-23
Atomic-scale characterization and manipulation with scanning probe microscopy rely upon the use of an atomically sharp probe. Here we present automated methods based on machine learning to automatically detect and recondition the quality of the probe of a scanning tunneling microscope. As a model system, we employ these techniques on the technologically relevant hydrogen-terminated silicon surface, training the network to recognize abnormalities in the appearance of surface dangling bonds. Of the machine learning methods tested, a convolutional neural network yielded the greatest accuracy, achieving a positive identification of degraded tips in 97% of the test cases. By using multiple points of comparison and majority voting, the accuracy of the method is improved beyond 99%.
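The gain from majority voting reported above can be illustrated with a short calculation. The sketch below assumes that each comparison point gives an independent vote that is correct with the same probability; the abstract does not state this, and the number of votes used in the study is not given, so the value of 3 below is purely illustrative.

```python
from math import comb

def majority_vote_accuracy(p, n):
    """Probability that a strict majority of n independent votes is correct,
    when each vote is correct with probability p (n should be odd)."""
    k_min = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Illustrative assumption: single-vote accuracy 0.97, three independent votes.
print(round(majority_vote_accuracy(0.97, 3), 6))  # 0.997354
```

Under these assumptions three votes already lift a 97% single-shot accuracy above 99%; correlated errors between comparison points would reduce the gain.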
Feasibility of Active Machine Learning for Multiclass Compound Classification.
Lang, Tobias; Flachsenberg, Florian; von Luxburg, Ulrike; Rarey, Matthias
2016-01-25
A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
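The iterative selection of informative training compounds described above can be sketched as a pool-based loop with uncertainty sampling. This is a minimal sketch, not the method of the paper: the nearest-centroid classifier, the margin criterion, and all names (`centroids`, `margin`, `active_learn`, `oracle`) are assumptions introduced for illustration, with `oracle` standing in for the human expert or experiment that supplies a class label.

```python
import math

def centroids(labeled):
    """Mean feature vector per class from (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def margin(x, cents):
    """Distance gap between the two nearest class centroids (small = uncertain)."""
    d = sorted(math.dist(x, c) for c in cents.values())
    return d[1] - d[0]

def active_learn(seed, pool, oracle, budget):
    """Query the oracle for the `budget` most uncertain pool items, one at a time.

    `seed` must contain at least one labeled example for two or more classes.
    """
    labeled, pool = list(seed), list(pool)
    for _ in range(budget):
        if not pool:
            break
        cents = centroids(labeled)  # retrain on everything labeled so far
        query = min(pool, key=lambda x: margin(x, cents))
        pool.remove(query)
        labeled.append((query, oracle(query)))  # ask the expert for the label
    return labeled

# Toy 1-D example with a hypothetical class boundary at 5.0.
seed = [((0.0,), "A"), ((10.0,), "B")]
pool = [(1.0,), (4.9,), (9.0,)]
result = active_learn(seed, pool, lambda x: "A" if x[0] < 5 else "B", budget=1)
print(result[-1])  # the item nearest the boundary is queried first
```

The design choice to illustrate is that labels are requested only where the current model is least certain, which is how such a procedure can get by with a fraction of the labeled data.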
A New Automated Design Method Based on Machine Learning for CMOS Analog Circuits
NASA Astrophysics Data System (ADS)
Moradi, Behzad; Mirzaei, Abdolreza
2016-11-01
A new simulation-based automated CMOS analog circuit design method which applies a multi-objective non-Darwinian-type evolutionary algorithm based on the Learnable Evolution Model (LEM) is proposed in this article. The multi-objective property of this automated design of CMOS analog circuits is governed by a modified Strength Pareto Evolutionary Algorithm (SPEA) incorporated in the LEM algorithm presented here. LEM includes a machine learning method, such as decision trees, that distinguishes between high- and low-fitness areas in the design space. The learning process can detect the right directions of the evolution and lead to large steps in the evolution of the individuals. The learning phase shortens the evolution process and markedly reduces the number of individual evaluations. The expert designer's knowledge of the circuit is applied in the design process in order to reduce the design space as well as the design time. The circuit evaluation is made by the HSPICE simulator. In order to improve the design accuracy, the bsim3v3 CMOS transistor model is adopted in the proposed design method. The proposed design method is tested on three different operational amplifier circuits. Its performance is verified by comparing it with the evolutionary strategy algorithm and other similar methods.
Automated structural classification of lipids by machine learning.
Taylor, Ryan; Miller, Ryan H; Miller, Ryan D; Porter, Michael; Dalgleish, James; Prince, John T
2015-03-01
Modern lipidomics is largely dependent upon structural ontologies because of the great diversity exhibited in the lipidome, but no automated lipid classification exists to facilitate this partitioning. The size of the putative lipidome far exceeds the number currently classified, despite a decade of work. Automated classification would benefit ongoing classification efforts by decreasing the time needed and increasing the accuracy of classification while providing classifications for mass spectral identification algorithms. We introduce a tool that automates classification into the LIPID MAPS ontology of known lipids with >95% accuracy and novel lipids with 63% accuracy. The classification is based upon simple chemical characteristics and modern machine learning algorithms. The decision trees produced are intelligible and can be used to clarify implicit assumptions about the current LIPID MAPS classification scheme. These characteristics and decision trees are made available to facilitate alternative implementations. We also discovered many hundreds of lipids that are currently misclassified in the LIPID MAPS database, strongly underscoring the need for automated classification. Source code and chemical characteristic lists as SMARTS search strings are available under an open-source license at https://www.github.com/princelab/lipid_classifier. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Kawata, Yasuo; Arimura, Hidetaka; Ikushima, Koujirou; Jin, Ze; Morita, Kento; Tokunaga, Chiaki; Yabu-Uchi, Hidetake; Shioyama, Yoshiyuki; Sasaki, Tomonari; Honda, Hiroshi; Sasaki, Masayuki
2017-10-01
The aim of this study was to investigate the impact of pixel-based machine learning (ML) techniques, i.e., fuzzy c-means clustering method (FCM), artificial neural network (ANN) and support vector machine (SVM), on an automated framework for delineation of gross tumor volume (GTV) regions of lung cancer for stereotactic body radiation therapy. The morphological and metabolic features for GTV regions, which were determined based on the knowledge of radiation oncologists, were fed on a pixel-by-pixel basis into the respective FCM, ANN, and SVM ML techniques. Then, the ML techniques were incorporated into the automated delineation framework of GTVs followed by an optimum contour selection (OCS) method, which we proposed in a previous study. The three ML-based frameworks were evaluated for 16 lung cancer cases (six solid, four ground glass opacity (GGO), six part-solid GGO) with the datasets of planning computed tomography (CT) and 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET)/CT images using the three-dimensional Dice similarity coefficient (DSC). DSC denotes the degree of region similarity between the GTVs contoured by radiation oncologists and those estimated using the automated framework. The FCM-based framework achieved the highest DSCs of 0.79±0.06, whereas DSCs of the ANN-based and SVM-based frameworks were 0.76±0.14 and 0.73±0.14, respectively. The FCM-based framework provided the highest segmentation accuracy and precision without a learning process (lowest calculation cost). Therefore, the FCM-based framework can be useful for delineation of tumor regions in practical treatment planning. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
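The Dice similarity coefficient used as the evaluation metric above is defined as DSC = 2|A ∩ B| / (|A| + |B|) for two regions A and B. A minimal sketch follows, assuming each region is represented as a set of (x, y, z) voxel indices; that representation and the toy regions are assumptions made for illustration, not details from the study.

```python
def dice_coefficient(region_a, region_b):
    """Dice similarity coefficient between two voxel sets.

    Each region is an iterable of (x, y, z) index tuples. Returns a value in
    [0, 1]; 1 means identical regions, 0 means no overlap.
    """
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty regions agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical toy regions: a 4-voxel reference contour and a 3-voxel estimate
# that share 2 voxels.
reference = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
estimate = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
print(dice_coefficient(reference, estimate))  # 2*2 / (4+3) = 4/7
```

Because the denominator sums both region sizes, the score penalizes over-segmentation and under-segmentation alike.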
Anesthesiology, automation, and artificial intelligence.
Alexander, John C; Joshi, Girish P
2018-01-01
There have been many attempts to incorporate automation into the practice of anesthesiology, though none have been successful. Fundamentally, these failures are due to the underlying complexity of anesthesia practice and the inability of rule-based feedback loops to fully master it. Recent innovations in artificial intelligence, especially machine learning, may usher in a new era of automation across many industries, including anesthesiology. It would be wise to consider the implications of such potential changes before they have been fully realized.
A Fully Automated Pipeline for Classification Tasks with an Application to Remote Sensing
NASA Astrophysics Data System (ADS)
Suzuki, K.; Claesen, M.; Takeda, H.; De Moor, B.
2016-06-01
Deep learning has recently been in the spotlight owing to its victories at major competitions, which has undeservedly pushed `shallow' machine learning methods, the relatively simple and handy algorithms commonly used by industrial engineers, into the background despite advantages such as the small amount of time and data they require for training. Taking a practical point of view, we used shallow learning algorithms to construct a learning pipeline that lets operators apply machine learning without special knowledge, an expensive computing environment, or a large amount of labelled data. The proposed pipeline automates the whole classification process, namely feature selection, feature weighting, and selection of the most suitable classifier with optimized hyperparameters. The configuration uses particle swarm optimization, a well-known metaheuristic algorithm chosen for its generally fast and fine optimization, which enables us not only to optimize (hyper)parameters but also to determine the features and classifier appropriate to the problem, a choice that has conventionally been made a priori from domain knowledge and either left untouched or handled with naïve algorithms such as grid search. In experiments with the MNIST and CIFAR-10 datasets, common computer vision datasets for character recognition and object recognition respectively, our automated learning approach delivers high performance considering its simple setting (i.e. a setting not specialized to the dataset), small amount of training data, and practical learning time. Moreover, compared with deep learning, the performance stays robust with almost no modification even on a remote sensing object recognition problem, which suggests that our approach is likely to contribute to general classification problems.
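As an illustration of how particle swarm optimization can search a hyperparameter space, the following is a minimal sketch of a textbook PSO loop. It is not the authors' pipeline: the function names, the inertia and acceleration coefficients, and the toy objective (a stand-in for a validation error over two log-scaled hyperparameters) are all assumptions made for the example.

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=20, n_iters=60,
                            inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize `objective` over a box given by `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_validation_error(params):
    """Hypothetical stand-in for a cross-validated error surface."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = particle_swarm_minimize(
    toy_validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
print(best_params, best_error)
```

In a real pipeline the objective would train and validate a classifier for each candidate setting, and discrete choices such as the feature subset or classifier type would need their own encoding, which this sketch does not attempt.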
Navarro, Pedro J.; Fernández, Carlos; Borraz, Raúl; Alonso, Diego
2016-01-01
This article describes an automated sensor-based system to detect pedestrians in an autonomous vehicle application. Although the vehicle is equipped with a broad set of sensors, the article focuses on the processing of the information generated by a Velodyne HDL-64E LIDAR sensor. The cloud of points generated by the sensor (more than 1 million points per revolution) is processed to detect pedestrians, by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in the cube. The work reports an exhaustive analysis of the performance of three different machine learning algorithms: k-Nearest Neighbours (kNN), Naïve Bayes classifier (NBC), and Support Vector Machine (SVM). These algorithms have been trained with 1931 samples. The final performance of the method, measured in a real traffic scene that contained 16 pedestrians and 469 samples of non-pedestrians, shows sensitivity (81.2%), accuracy (96.2%) and specificity (96.8%). PMID:28025565
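The three reported figures are standard functions of a binary confusion matrix, as the sketch below shows. The abstract does not give the confusion-matrix counts, so the values used here (13 of 16 pedestrians detected, 454 of 469 non-pedestrians rejected) are an assumption chosen only to be roughly consistent with the reported percentages.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # detected positives / all positives
    specificity = tn / (tn + fp)                # rejected negatives / all negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # correct decisions / all samples
    return sensitivity, specificity, accuracy


# Assumed counts, not reported in the abstract: 16 pedestrians, 469 non-pedestrians.
sens, spec, acc = binary_metrics(tp=13, fn=3, tn=454, fp=15)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f}")
```

With so few positive samples, each missed pedestrian shifts sensitivity by more than six percentage points, whereas accuracy is dominated by the much larger negative class.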
Navarro, Pedro J; Fernández, Carlos; Borraz, Raúl; Alonso, Diego
2016-12-23
This article describes an automated sensor-based system to detect pedestrians in an autonomous vehicle application. Although the vehicle is equipped with a broad set of sensors, the article focuses on the processing of the information generated by a Velodyne HDL-64E LIDAR sensor. The cloud of points generated by the sensor (more than 1 million points per revolution) is processed to detect pedestrians, by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in the cube. The work reports an exhaustive analysis of the performance of three different machine learning algorithms: k-Nearest Neighbours (kNN), Naïve Bayes classifier (NBC), and Support Vector Machine (SVM). These algorithms have been trained with 1931 samples. The final performance of the method, measured in a real traffic scene that contained 16 pedestrians and 469 samples of non-pedestrians, shows sensitivity (81.2%), accuracy (96.2%) and specificity (96.8%).
Epileptic seizure detection in EEG signal using machine learning techniques.
Jaiswal, Abeg Kumar; Banka, Haider
2018-03-01
Epilepsy is a well-known nervous system disorder characterized by seizures. Electroencephalograms (EEGs), which capture brain neural activity, can detect epilepsy. Traditional methods for analyzing an EEG signal for epileptic seizure detection are time-consuming. Recently, several automated seizure detection frameworks using machine learning techniques have been proposed to replace these traditional methods. The two basic steps involved in machine learning are feature extraction and classification. Feature extraction reduces the input pattern space by keeping informative features, and the classifier assigns the appropriate class label. In this paper, we propose two effective approaches involving subpattern-based PCA (SpPCA) and cross-subpattern correlation-based PCA (SubXPCA) with Support Vector Machine (SVM) for automated seizure detection in EEG signals. Feature extraction was performed using SpPCA and SubXPCA. Both techniques explore the subpattern correlation of EEG signals, which helps in the decision-making process. SVM is used for classification of seizure and non-seizure EEG signals. The SVM was trained with a radial basis kernel. All the experiments have been carried out on the benchmark epilepsy EEG dataset. The entire dataset consists of 500 EEG signals recorded under different scenarios. Seven different experimental cases for classification have been conducted. The classification accuracy was evaluated using tenfold cross-validation. The classification results of the proposed approaches have been compared with the results of some existing techniques proposed in the literature to establish the claim.
Araki, Tadashi; Jain, Pankaj K; Suri, Harman S; Londhe, Narendra D; Ikeda, Nobutaka; El-Baz, Ayman; Shrivastava, Vimal K; Saba, Luca; Nicolaides, Andrew; Shafique, Shoaib; Laird, John R; Gupta, Ajay; Suri, Jasjit S
2017-01-01
Stroke risk stratification based on grayscale morphology of the ultrasound carotid wall has recently shown promise in the classification of high risk versus low risk plaque or symptomatic versus asymptomatic plaques. In previous studies, this stratification has been mainly based on analysis of the far wall of the carotid artery. Due to the multifocal nature of atherosclerotic disease, the plaque growth is not restricted to the far wall alone. This paper presents a new approach for stroke risk assessment by integrating assessment of both the near and far walls of the carotid artery using grayscale morphology of the plaque. Further, this paper presents a scientific validation system for stroke risk assessment. Neither of these innovations has been presented before. The methodology consists of an automated segmentation system of the near wall and far wall regions in grayscale carotid B-mode ultrasound scans. Sixteen grayscale texture features are computed and fed into the machine learning system. The training system utilizes the lumen diameter to create ground truth labels for the stratification of stroke risk. The cross-validation procedure is adapted in order to obtain the machine learning testing classification accuracy through the use of three sets of partition protocols: (5, 10, and Jack Knife). The mean classification accuracy over all the sets of partition protocols for the automated system in the far and near walls is 95.08% and 93.47%, respectively. The corresponding accuracies for the manual system are 94.06% and 92.02%, respectively. The precision of merit of the automated machine learning system when compared against the manual risk assessment system is 98.05% and 97.53% for the far and near walls, respectively. The ROC of the risk assessment system for the far and near walls is close to 1.0, demonstrating high accuracy. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Lary, D. J.
2013-12-01
A BigData case study is described where multiple datasets from several satellites, high-resolution global meteorological data, social media and in-situ observations are combined using machine learning on a distributed cluster using an automated workflow. The global particulate dataset is relevant to global public health studies and would not be possible to produce without the use of the multiple big datasets, in-situ data and machine learning. To greatly reduce the development time and enhance the functionality, a high-level language capable of parallel processing has been used (Matlab). Key considerations for the system are high-speed access due to the large data volume, persistence of the large data volumes, and a precise process time scheduling capability.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-25
... Family Assistance (OFA) is interested in learning about how States deliver Temporary Assistance to Needy... types of restrictions on assistance usage. OFA also is interested in learning about States' current... as ``the use of a credit or debit card service, automated teller machine, point-of-sale terminal, or...
Are we at a crossroads or a plateau? Radiomics and machine learning in abdominal oncology imaging.
Summers, Ronald M
2018-05-05
Advances in radiomics and machine learning have driven a technology boom in the automated analysis of radiology images. For the past several years, expectations have been nearly boundless for these new technologies to revolutionize radiology image analysis and interpretation. In this editorial, I compare the expectations with the realities with particular attention to applications in abdominal oncology imaging. I explore whether these technologies will leave us at a crossroads to an exciting future or to a sustained plateau and disillusionment.
Anesthesiology, automation, and artificial intelligence
Alexander, John C.; Joshi, Girish P.
2018-01-01
There have been many attempts to incorporate automation into the practice of anesthesiology, though none have been successful. Fundamentally, these failures are due to the underlying complexity of anesthesia practice and the inability of rule-based feedback loops to fully master it. Recent innovations in artificial intelligence, especially machine learning, may usher in a new era of automation across many industries, including anesthesiology. It would be wise to consider the implications of such potential changes before they have been fully realized. PMID:29686578
[Application of Mass Spectrometry to the Diagnosis of Cancer--Chairman's Introductory Remarks].
Yatomi, Yutaka
2015-09-01
In this symposium, the latest application of mass spectrometry to laboratory medicine, i.e., to the early diagnosis of cancer, was introduced. Dr. Masaru YOSHIDA, who has been using metabolome analysis to discover biomarker candidates for gastroenterological diseases, presented an automated early diagnosis system for early stages of colon cancer based on metabolome analysis and using a minute amount of blood. On the other hand, Dr. Sen TAKEDA, who has developed a new approach by employing both mass spectrometry and machine-learning for cancer diagnosis, presented a device for the clinical diagnosis of cancer using probe electrospray ionization (PESI) and machine-learning called the dual penalized logistic regression machine (dPLRM).
Quantum ensembles of quantum classifiers.
Schuld, Maria; Petruccione, Francesco
2018-02-09
Quantum machine learning is witnessing an increasing number of quantum algorithms for data-driven decision making, a problem with potential applications ranging from automated image recognition to medical diagnosis. Many of those algorithms are implementations of quantum classifiers, or models for the classification of data inputs with a quantum computer. Following the success of collective decision making with ensembles in classical machine learning, this paper introduces the concept of quantum ensembles of quantum classifiers. Creating the ensemble corresponds to a state preparation routine, after which the quantum classifiers are evaluated in parallel and their combined decision is accessed by a single-qubit measurement. This framework naturally allows for exponentially large ensembles in which - similar to Bayesian learning - the individual classifiers do not have to be trained. As an example, we analyse an exponentially large quantum ensemble in which each classifier is weighted according to its performance in classifying the training data, leading to new results for quantum as well as classical machine learning.
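The idea of an ensemble whose members are never trained, but are instead weighted by their accuracy on the training data, has a simple classical analogue that can be sketched as follows. This is not the quantum routine described in the abstract: the 1-D data, the grid of threshold rules, and the function names are all hypothetical and serve only to show accuracy weighting at work.

```python
def accuracy(classifier, samples):
    """Fraction of (x, y) samples that the classifier labels correctly."""
    return sum(classifier(x) == y for x, y in samples) / len(samples)


def make_rule(threshold, sign):
    """Untrained threshold rule: predicts `sign` above the threshold, else -sign."""
    return lambda x: sign if x > threshold else -sign


def weighted_ensemble_predict(classifiers, weights, x):
    """Combine the +1/-1 votes, each scaled by its classifier's weight."""
    score = sum(w * c(x) for c, w in zip(classifiers, weights))
    return 1 if score >= 0 else -1


# Hypothetical 1-D training data with labels in {-1, +1}.
train = [(0.1, -1), (0.3, -1), (0.4, -1), (0.6, 1), (0.8, 1), (0.9, 1)]

# Every threshold on a coarse grid, in both orientations; none is fitted to the data.
classifiers = [make_rule(t / 10, s) for t in range(11) for s in (1, -1)]
weights = [accuracy(c, train) for c in classifiers]

print(weighted_ensemble_predict(classifiers, weights, 0.2))
print(weighted_ensemble_predict(classifiers, weights, 0.7))
```

Because each rule appears together with its mirror image, a rule with accuracy a contributes a net weight of 2a - 1, so rules no better than chance cancel out and only the informative ones shape the combined decision.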
Proceedings of the 1986 IEEE international conference on systems, man and cybernetics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1986-01-01
This book presents the papers given at a conference on man-machine systems. Topics considered at the conference included neural model-based cognitive theory and engineering, user interfaces, adaptive and learning systems, human interaction with robotics, decision making, the testing and evaluation of expert systems, software development, international conflict resolution, intelligent interfaces, automation in man-machine system design aiding, knowledge acquisition in expert systems, advanced architectures for artificial intelligence, pattern recognition, knowledge bases, and machine vision.
Montagna, Fabio; Buiatti, Marco; Benatti, Simone; Rossi, Davide; Farella, Elisabetta; Benini, Luca
2017-10-01
EEG is a standard non-invasive technique used in neural disease diagnostics and neurosciences. Frequency-tagging is an increasingly popular experimental paradigm that efficiently tests brain function by measuring EEG responses to periodic stimulation. Recently, frequency-tagging paradigms have proven successful with low stimulation frequencies (0.5-6 Hz), but the EEG signal is intrinsically noisy in this frequency range, requiring heavy signal processing and significant human intervention for response estimation. This limits the possibility of processing the EEG on resource-constrained systems and of designing smart EEG-based devices for automated diagnostics. We propose an algorithm for artifact removal and automated detection of frequency-tagging responses in a wide range of stimulation frequencies, which we test on a visual stimulation protocol. The algorithm is rooted in machine learning based pattern recognition techniques and is tailored for a new-generation parallel ultra low power processing platform (PULP), reaching more than 90% accuracy in frequency detection even for very low stimulation frequencies (<1 Hz) with a power budget of 56 mW. Copyright © 2017 Elsevier Inc. All rights reserved.
Joutsijoki, Henry; Haponen, Markus; Rasku, Jyrki; Aalto-Setälä, Katriina; Juhola, Martti
2016-01-01
The focus of this research is on automated identification of the quality of human induced pluripotent stem cell (iPSC) colony images. iPS cell technology is a contemporary method by which the patient's cells are reprogrammed back to stem cells and can then be differentiated into any desired cell type. In the future, iPS cell technology will be used for patient-specific drug screening, disease modeling, and tissue repair, for instance. However, there are technical challenges before iPS cell technology can be used in practice, and one of them is quality control of growing iPSC colonies, which is currently done manually but is an infeasible solution in large-scale cultures. The monitoring problem reduces to an image analysis and classification problem. In this paper, we tackle this problem using machine learning methods such as multiclass Support Vector Machines and several baseline methods together with Scale-Invariant Feature Transform (SIFT) based features. We perform over 80 test arrangements and do a thorough parameter value search. The best accuracy (62.4%) for classification was obtained by using a k-NN classifier, showing improved accuracy compared to earlier studies.
Intelligent Machines in the 21st Century: Automating the Processes of Inference and Inquiry
NASA Technical Reports Server (NTRS)
Knuth, Kevin H.
2003-01-01
The last century saw the application of Boolean algebra toward the construction of computing machines, which work by applying logical transformations to information contained in their memory. The development of information theory and the generalization of Boolean algebra to Bayesian inference have enabled these computing machines, in the last quarter of the twentieth century, to be endowed with the ability to learn by making inferences from data. This revolution is just beginning as new computational techniques continue to make difficult problems more accessible. However, modern intelligent machines work by inferring knowledge using only their pre-programmed prior knowledge and the data provided. They lack the ability to ask questions, or request data that would aid their inferences. Recent advances in understanding the foundations of probability theory have revealed implications for areas other than logic. Of relevance to intelligent machines, we identified the algebra of questions as the free distributive algebra, which now allows us to work with questions in a way analogous to that which Boolean algebra enables us to work with logical statements. In this paper we describe this logic of inference and inquiry using the mathematics of partially ordered sets and the scaffolding of lattice theory, discuss the far-reaching implications of the methodology, and demonstrate its application with current examples in machine learning. Automation of both inference and inquiry promises to allow robots to perform science in the far reaches of our solar system and in other star systems by enabling them to not only make inferences from data, but also decide which question to ask, experiment to perform, or measurement to take given what they have learned and what they are designed to understand.
Global Bathymetry: Machine Learning for Data Editing
NASA Astrophysics Data System (ADS)
Sandwell, D. T.; Tea, B.; Freund, Y.
2017-12-01
The accuracy of global bathymetry depends primarily on the coverage and accuracy of the sounding data and secondarily on the depth predicted from gravity. A main focus of our research is to add newly-available data to the global compilation. Most data sources contain 1-12% erroneous soundings caused by a wide array of blunders and measurement errors. Over the years we have hand-edited this data using undergraduate employees at UCSD (440 million soundings at 500 m resolution). We are developing a machine learning approach to refine the flagging of the older soundings and provide automated editing of newly-acquired soundings. The approach has three main steps: 1) Combine the sounding data with additional information that may inform the machine learning algorithm. The additional parameters include: depth predicted from gravity; distance to the nearest sounding from other cruises; seafloor age; spreading rate; sediment thickness; and vertical gravity gradient. 2) Use available edit decisions as training data sets for a boosted tree algorithm with a binary logistic objective function and L2 regularization. Initial results with poor quality single beam soundings show that the automated algorithm matches the hand-edited data 89% of the time. The results show that most of the information for detecting outliers comes from predicted depth, with secondary contributions from distance to the nearest sounding and longitude. A similar analysis using very high quality multibeam data shows that the automated algorithm matches the hand-edited data 93% of the time. Again, most of the information for detecting outliers comes from predicted depth, with secondary contributions from distance to the nearest sounding and longitude. 3) The third step in the process is to use the machine learning parameters, derived from the training data, to edit 12 million newly acquired single beam sounding data provided by the National Geospatial-Intelligence Agency.
The output of the learning algorithm will be confidence rated, indicating which edits the algorithm is confident about and which it is not. We expect the majority (90%) of edits to be confident and not require human intervention. Human intervention will be required only on the 10% of decisions that are not confident, thus reducing the amount of human work by a factor of 10 or more.
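The confidence-rated output described above can be pictured as a simple routing step on the classifier's predicted probabilities. The sketch below is an assumption about how such routing might look, not the authors' implementation: the function name, the probability values, and the 0.1 and 0.9 thresholds are hypothetical.

```python
def route_edits(outlier_probabilities, low=0.1, high=0.9):
    """Split soundings into automatic decisions and cases for human review.

    Probabilities at or above `high` are flagged as bad automatically, those at
    or below `low` are kept automatically, and the rest go to a human editor.
    """
    auto_flag, auto_keep, needs_review = [], [], []
    for index, p in enumerate(outlier_probabilities):
        if p >= high:
            auto_flag.append(index)
        elif p <= low:
            auto_keep.append(index)
        else:
            needs_review.append(index)
    return auto_flag, auto_keep, needs_review


# Hypothetical predicted probabilities that each sounding is erroneous.
probabilities = [0.01, 0.02, 0.97, 0.05, 0.55, 0.03, 0.99, 0.00, 0.04, 0.08]
flagged, kept, review = route_edits(probabilities)
print(flagged, kept, review)
```

Widening the gap between the two thresholds sends more soundings to human review, so the thresholds trade editing workload against the risk of an unreviewed wrong decision.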
Machine learning molecular dynamics for the simulation of infrared spectra.
Gastegger, Michael; Behler, Jörg; Marquetand, Philipp
2017-10-01
Machine learning has emerged as an invaluable tool in many research areas. In the present work, we harness this power to predict highly accurate molecular infrared spectra with unprecedented computational efficiency. To account for vibrational anharmonic and dynamical effects - typically neglected by conventional quantum chemistry approaches - we base our machine learning strategy on ab initio molecular dynamics simulations. While these simulations are usually extremely time consuming even for small molecules, we overcome these limitations by leveraging the power of a variety of machine learning techniques, not only accelerating simulations by several orders of magnitude, but also greatly extending the size of systems that can be treated. To this end, we develop a molecular dipole moment model based on environment dependent neural network charges and combine it with the neural network potential approach of Behler and Parrinello. Contrary to the prevalent big data philosophy, we are able to obtain very accurate machine learning models for the prediction of infrared spectra based on only a few hundred electronic structure reference points. This is made possible through the use of molecular forces during neural network potential training and the introduction of a fully automated sampling scheme. We demonstrate the power of our machine learning approach by applying it to model the infrared spectra of a methanol molecule, n-alkanes containing up to 200 atoms and the protonated alanine tripeptide, which at the same time represents the first application of machine learning techniques to simulate the dynamics of a peptide. In all of these case studies we find an excellent agreement between the infrared spectra predicted via machine learning models and the respective theoretical and experimental spectra.
Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953
12 CFR Appendix A to Part 205 - Model Disclosure Clauses and Forms
Code of Federal Regulations, 2011 CFR
2011-01-01
... maximum overdraft line of credit). If you tell us within 2 business days after you learn of the loss or... your permission.) If you do NOT tell us within 2 business days after you learn of the loss or theft of... [automated teller machines] [telephone bill-payment service] [point-of-sale transfer service]. (2) Fixed...
12 CFR Appendix A to Part 205 - Model Disclosure Clauses and Forms
Code of Federal Regulations, 2012 CFR
2012-01-01
... maximum overdraft line of credit). If you tell us within 2 business days after you learn of the loss or... your permission.) If you do NOT tell us within 2 business days after you learn of the loss or theft of... [automated teller machines] [telephone bill-payment service] [point-of-sale transfer service]. (2) Fixed...
12 CFR Appendix A to Part 205 - Model Disclosure Clauses and Forms
Code of Federal Regulations, 2013 CFR
2013-01-01
... maximum overdraft line of credit). If you tell us within 2 business days after you learn of the loss or... your permission.) If you do NOT tell us within 2 business days after you learn of the loss or theft of... [automated teller machines] [telephone bill-payment service] [point-of-sale transfer service]. (2) Fixed...
12 CFR Appendix A to Part 205 - Model Disclosure Clauses and Forms
Code of Federal Regulations, 2014 CFR
2014-01-01
... maximum overdraft line of credit). If you tell us within 2 business days after you learn of the loss or... your permission.) If you do NOT tell us within 2 business days after you learn of the loss or theft of... [automated teller machines] [telephone bill-payment service] [point-of-sale transfer service]. (2) Fixed...
Using machine learning techniques to automate sky survey catalog generation
NASA Technical Reports Server (NTRS)
Fayyad, Usama M.; Roden, J. C.; Doyle, R. J.; Weir, Nicholas; Djorgovski, S. G.
1993-01-01
We describe the application of machine classification techniques to the development of an automated tool for the reduction of a large scientific data set. The 2nd Palomar Observatory Sky Survey provides comprehensive photographic coverage of the northern celestial hemisphere. The photographic plates are being digitized into images containing on the order of 10^7 galaxies and 10^8 stars. Since the size of this data set precludes manual analysis and classification of objects, our approach is to develop a software system which integrates independently developed techniques for image processing and data classification. Image processing routines are applied to identify and measure features of sky objects. Selected features are used to determine the classification of each object. GID3* and O-BTree, two inductive learning techniques, are used to automatically learn classification decision trees from examples. We describe the techniques used, the details of our specific application, and the initial encouraging results which indicate that our approach is well-suited to the problem. The benefits of the approach are increased data reduction throughput, consistency of classification, and the automated derivation of classification rules that will form an objective, examinable basis for classifying sky objects. Furthermore, astronomers will be freed from the tedium of an intensely visual task to pursue more challenging analysis and interpretation problems given automatically cataloged data.
NASA Astrophysics Data System (ADS)
Ross, Z. E.; Meier, M. A.; Hauksson, E.
2017-12-01
Accurate first-motion polarities are essential for determining earthquake focal mechanisms, but are difficult to measure automatically because of picking errors and signal to noise issues. Here we develop an algorithm for reliable automated classification of first-motion polarities using machine learning algorithms. A classifier is designed to identify whether the first-motion polarity is up, down, or undefined by examining the waveform data directly. We first improve the accuracy of automatic P-wave onset picks by maximizing a weighted signal/noise ratio for a suite of candidate picks around the automatic pick. We then use the waveform amplitudes before and after the optimized pick as features for the classification. We demonstrate the method's potential by training and testing the classifier on tens of thousands of hand-made first-motion picks by the Southern California Seismic Network. The classifier assigned the same polarity as chosen by an analyst in more than 94% of the records. We show that the method is generalizable to a variety of learning algorithms, including neural networks and random forest classifiers. The method is suitable for automated processing of large seismic waveform datasets, and can potentially be used in real-time applications, e.g. for improving the source characterizations of earthquake early warning algorithms.
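The two processing steps described above, refining the onset pick by maximizing a signal-to-noise ratio over candidate picks and then taking the amplitudes around the refined pick as classifier features, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a plain (unweighted) energy ratio rather than the weighted ratio mentioned in the abstract, and the window lengths, function names, and synthetic waveform are hypothetical.

```python
def refine_pick(waveform, initial_pick, search=5, window=10):
    """Return the candidate sample index near `initial_pick` with the highest SNR."""
    def snr(pick):
        noise = waveform[pick - window:pick]
        signal = waveform[pick:pick + window]
        noise_energy = sum(v * v for v in noise) / len(noise)
        signal_energy = sum(v * v for v in signal) / len(signal)
        return signal_energy / (noise_energy + 1e-12)  # guard against zero noise

    # Only keep candidates whose noise and signal windows fit inside the trace.
    candidates = [p for p in range(initial_pick - search, initial_pick + search + 1)
                  if p - window >= 0 and p + window <= len(waveform)]
    return max(candidates, key=snr)


def polarity_features(waveform, pick, n_before=3, n_after=3):
    """Amplitudes just before and after the pick, to be fed to a classifier."""
    return waveform[pick - n_before:pick + n_after]


# Synthetic trace: 20 samples of low-amplitude noise, then an upward onset.
waveform = [0.01 * (-1) ** i for i in range(20)] + [1.0] * 15

pick = refine_pick(waveform, initial_pick=17)
features = polarity_features(waveform, pick)
print(pick, features)
```

The feature vector would then be passed to whichever classifier is used (the abstract mentions neural networks and random forests among the options) to label the polarity as up, down, or undefined.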
Classification of the Regional Ionospheric Disturbance Based on Machine Learning Techniques
NASA Astrophysics Data System (ADS)
Terzi, Merve Begum; Arikan, Orhan; Karatay, Secil; Arikan, Feza; Gulyaeva, Tamara
2016-08-01
In this study, Total Electron Content (TEC) estimated from GPS receivers is used to model the regional and local variability that differs from global activity, along with solar and geomagnetic indices. For the automated classification of regional disturbances, a classification technique based on Support Vector Machine (SVM), a robust machine learning technique that has found widespread use, is proposed. The performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. By applying the developed classification technique to Global Ionospheric Map (GIM) TEC data, which is provided by the NASA Jet Propulsion Laboratory (JPL), it is shown that SVM can be a suitable learning method to detect anomalies in TEC variations.
Yang, Yu Xin; Chong, Mei Sian; Tay, Laura; Yew, Suzanne; Yeo, Audrey; Tan, Cher Heng
2016-10-01
To develop and validate a machine learning based automated segmentation method that jointly analyzes the four contrasts provided by the Dixon MRI technique for improved thigh composition segmentation accuracy. The automatic detection of body composition is formulated as a three-class classification problem. Each image voxel in the training dataset is assigned a correct label. A voxel classifier is trained and subsequently used to predict unseen data. Morphological operations are finally applied to generate volumetric segmented images for different structures. We applied this algorithm on datasets of (1) four contrast images, (2) water and fat images, and (3) unsuppressed images acquired from 190 subjects. The proposed method using four contrasts achieved the most accurate and robust segmentation compared to the use of combined fat and water images and the use of unsuppressed images; average Dice coefficients of 0.94 ± 0.03, 0.96 ± 0.03, 0.80 ± 0.03, and 0.97 ± 0.01 were achieved for bone region, subcutaneous adipose tissue (SAT), inter-muscular adipose tissue (IMAT), and muscle, respectively. Our proposed method based on machine learning produces accurate tissue quantification and shows an effective use of the large amount of information provided by the four contrast images from Dixon MRI.
Mazzaferri, Javier; Larrivée, Bruno; Cakir, Bertan; Sapieha, Przemyslaw; Costantino, Santiago
2018-03-02
Preclinical studies of vascular retinal diseases rely on the assessment of developmental dystrophies in the oxygen induced retinopathy rodent model. The quantification of vessel tufts and avascular regions is typically computed manually from flat mounted retinas imaged using fluorescent probes that highlight the vascular network. Such manual measurements are time-consuming and hampered by user variability and bias, thus a rapid and objective method is needed. Here, we introduce a machine learning approach to segment and characterize vascular tufts, delineate the whole vasculature network, and identify and analyze avascular regions. Our quantitative retinal vascular assessment (QuRVA) technique uses a simple machine learning method and morphological analysis to provide reliable computations of vascular density and pathological vascular tuft regions, devoid of user intervention within seconds. We demonstrate the high degree of error and variability of manual segmentations, and designed, coded, and implemented a set of algorithms to perform this task in a fully automated manner. We benchmark and validate the results of our analysis pipeline using the consensus of several manually curated segmentations using commonly used computer tools. The source code of our implementation is released under version 3 of the GNU General Public License ( https://www.mathworks.com/matlabcentral/fileexchange/65699-javimazzaf-qurva ).
Guimaraes, Carolina V; Grzeszczuk, Robert; Bisset, George S; Donnelly, Lane F
2018-03-01
When implementing or monitoring department-sanctioned standardized radiology reports, feedback about individual faculty performance has been shown to be a useful driver of faculty compliance. Most commonly, these data are derived from manual audit, which can be both time-consuming and subject to sampling error. The purpose of this study was to evaluate whether a software program using natural language processing and machine learning could accurately audit radiologist compliance with the use of standardized reports compared with performed manual audits. Radiology reports from a 1-month period were loaded into such a software program, and faculty compliance with use of standardized reports was calculated. For that same period, manual audits were performed (25 reports audited for each of 42 faculty members). The mean compliance rates calculated by automated auditing were then compared with the confidence interval of the mean rate by manual audit. The mean compliance rate for use of standardized reports as determined by manual audit was 91.2% with a confidence interval between 89.3% and 92.8%. The mean compliance rate calculated by automated auditing was 92.0%, within that confidence interval. This study shows that by use of natural language processing and machine learning algorithms, an automated analysis can accurately define whether reports are compliant with use of standardized report templates and language, compared with manual audits. This may avoid significant labor costs related to conducting the manual auditing process. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.
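The comparison above, checking whether the automated rate falls inside the confidence interval of the manual rate, can be sketched as follows. The abstract does not say how the interval was computed, so this example assumes a 95% Wilson score interval on the pooled 25 × 42 audited reports; that choice, and the function name, are assumptions made for illustration only.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion p_hat observed over n trials."""
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


n_reports = 25 * 42            # 25 audited reports for each of 42 faculty members
low, high = wilson_interval(0.912, n_reports)   # manual-audit compliance rate
automated_rate = 0.920
within = low <= automated_rate <= high          # does the automated rate agree?
```

Under these assumptions the interval spans roughly 89% to 93%, and the automated rate of 92.0% lies inside it.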
An experimental result of estimating an application volume by machine learning techniques.
Hasegawa, Tatsuhito; Koshino, Makoto; Kimura, Haruhiko
2015-01-01
In this study, we improved the usability of smartphones by automating a user's operations. We developed an intelligent system using machine learning techniques that periodically detects a user's context on a smartphone. We selected the Android operating system because it has the largest market share and highest flexibility of its development environment. In this paper, we describe an application that automatically adjusts application volume. Adjusting the volume can be easily forgotten because users need to push the volume buttons to alter the volume depending on the given situation. Therefore, we developed an application that automatically adjusts the volume based on learned user settings. Application volume can be set differently from ringtone volume on Android devices, and these volume settings are associated with each specific application including games. Our application records a user's location, the volume setting, the foreground application name and other such attributes as learning data, thereby estimating whether the volume should be adjusted using machine learning techniques via Weka.
Integrating Machine Learning into a Crowdsourced Model for Earthquake-Induced Damage Assessment
NASA Technical Reports Server (NTRS)
Rebbapragada, Umaa; Oommen, Thomas
2011-01-01
On January 12th, 2010, a catastrophic 7.0M earthquake devastated the country of Haiti. In the aftermath of an earthquake, it is important to rapidly assess damaged areas in order to mobilize the appropriate resources. The Haiti damage assessment effort introduced a promising model that uses crowdsourcing to map damaged areas in freely available remotely-sensed data. This paper proposes the application of machine learning methods to improve this model. Specifically, we apply work on learning from multiple, imperfect experts to the assessment of volunteer reliability, and propose the use of image segmentation to automate the detection of damaged areas. We wrap both tasks in an active learning framework in order to shift volunteer effort from mapping a full catalog of images to the generation of high-quality training data. We hypothesize that the integration of machine learning into this model improves its reliability, maintains the speed of damage assessment, and allows the model to scale to higher data volumes.
NASA Technical Reports Server (NTRS)
Barrientos, Francesca; Castle, Joseph; McIntosh, Dawn; Srivastava, Ashok
2007-01-01
This document presents a preliminary evaluation of the utility of the FAA Safety Analytics Thesaurus (SAT) in enhancing automated document processing applications under development at NASA Ames Research Center (ARC). Current development efforts at ARC are described, including overviews of the statistical machine learning techniques that have been investigated. An analysis of opportunities for applying thesaurus knowledge to improving algorithm performance is then presented.
Gandola, Emanuele; Antonioli, Manuela; Traficante, Alessio; Franceschini, Simone; Scardi, Michele; Congestri, Roberta
2016-05-01
Toxigenic cyanobacteria are one of the main health risks associated with water resources worldwide, as their toxins can affect humans and fauna exposed via drinking water, aquaculture and recreation. Microscopy monitoring of cyanobacteria in water bodies and massive growth systems is a routine operation for cell abundance and growth estimation. Here we present ACQUA (Automated Cyanobacterial Quantification Algorithm), a new fully automated image analysis method designed for filamentous genera in Bright field microscopy. A pre-processing algorithm has been developed to highlight filaments of interest from background signals due to other phytoplankton and dust. A spline-fitting algorithm has been designed to recombine interrupted and crossing filaments in order to perform accurate morphometric analysis and to extract the surface pattern information of highlighted objects. In addition, 17 specific pattern indicators have been developed and used as input data for a machine-learning algorithm dedicated to the recognition between five widespread toxic or potentially toxic filamentous genera in freshwater: Aphanizomenon, Cylindrospermopsis, Dolichospermum, Limnothrix and Planktothrix. The method was validated using freshwater samples from three Italian volcanic lakes comparing automated vs. manual results. ACQUA proved to be a fast and accurate tool to rapidly assess freshwater quality and to characterize cyanobacterial assemblages in aquatic environments. Copyright © 2016 Elsevier B.V. All rights reserved.
FORESEE™ User-Centric Energy Automation
DOE Office of Scientific and Technical Information (OSTI.GOV)
FORESEE™ is a home energy management system (HEMS) that provides a user centric energy automation solution for residential building occupants. Built upon advanced control and machine learning algorithms, FORESEE intelligently manages the home appliances and distributed energy resources (DERs) such as photovoltaics and battery storage in a home. Unlike existing HEMS in the market, FORESEE provides a tailored home automation solution for individual occupants by learning and adapting to their preferences on cost, comfort, convenience and carbon. FORESEE improves not only the energy efficiency of the home but also its capability to provide grid services such as demand response. Highly reliable demand response services are likely to be incentivized by utility companies, making FORESEE economically viable for most homes.
NASA Astrophysics Data System (ADS)
Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin
2010-12-01
We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.
Automatic vetting of planet candidates from ground based surveys: Machine learning with NGTS
NASA Astrophysics Data System (ADS)
Armstrong, David J.; Günther, Maximilian N.; McCormac, James; Smith, Alexis M. S.; Bayliss, Daniel; Bouchy, François; Burleigh, Matthew R.; Casewell, Sarah; Eigmüller, Philipp; Gillen, Edward; Goad, Michael R.; Hodgkin, Simon T.; Jenkins, James S.; Louden, Tom; Metrailler, Lionel; Pollacco, Don; Poppenhaeger, Katja; Queloz, Didier; Raynard, Liam; Rauer, Heike; Udry, Stéphane; Walker, Simon R.; Watson, Christopher A.; West, Richard G.; Wheatley, Peter J.
2018-05-01
State of the art exoplanet transit surveys are producing ever increasing quantities of data. Making the best use of this resource, whether in detecting interesting planetary systems or in determining accurate planetary population statistics, requires new automated methods. Here we describe a machine learning algorithm that forms an integral part of the pipeline for the NGTS transit survey, demonstrating the efficacy of machine learning in selecting planetary candidates from multi-night ground based survey data. Our method uses a combination of random forests and self-organising-maps to rank planetary candidates, achieving an AUC score of 97.6% in ranking 12368 injected planets against 27496 false positives in the NGTS data. We build on past examples by using injected transit signals to form a training set, a necessary development for applying similar methods to upcoming surveys. We also make the autovet code used to implement the algorithm publicly accessible. autovet is designed to perform machine learned vetting of planetary candidates, and can utilise a variety of methods. The apparent robustness of machine learning techniques, whether on space-based or the qualitatively different ground-based data, highlights their importance to future surveys such as TESS and PLATO and the need to better understand their advantages and pitfalls in an exoplanetary context.
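The AUC figure quoted above measures how well a ranking separates injected planets from false positives. As a minimal sketch of that metric (not of the autovet code), AUC can be computed as the fraction of positive/negative pairs in which the positive receives the higher score; the scores below are invented for illustration and are not NGTS data.

```python
def auc_score(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for four injected planets and four false positives.
injected_planets = [0.9, 0.8, 0.6, 0.4]
false_positives = [0.7, 0.3, 0.2, 0.1]
auc = auc_score(injected_planets, false_positives)
```

The pairwise form is quadratic in the number of candidates, which is fine for a sketch; with tens of thousands of candidates a rank-based formulation would be the more practical choice.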
Abstracts of AF Materials Laboratory Reports
1975-09-01
NO: TITLE: AUTHOR(S): CONTRACT NO; CONTRACTOR: AFML-TR-73-307 200,397 IMPROVED AUTOMATED TAPE LAYING MACHINE M. Poullos, W. J. Murray, D.L...AUTOMATED IMPROVED AUTOMATED TAPE LAYING MACHINE AUTOMATION AUTOMATION OF COATING PROCESSES FOR GAS TURBINE BLADES AND VANES 203222/111 203072...IMPROVED TAPE LAYING MACHINE IMPROVED AUTOMATED TAPE LAYING MACHINE A STUDY OF THE STRESS-STRAIN BEHAVIOR OF GRAPHITE
Ranjith, G; Parvathy, R; Vikas, V; Chandrasekharan, Kesavadas; Nair, Suresh
2015-04-01
With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification. The aim of the study is to classify gliomas into benign and malignant types using MRI data. Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant. Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning. Three of the four machine learning algorithms gave an area under ROC curve in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%). The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences. © The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
De Tobel, J; Radesh, P; Vandermeulen, D; Thevissen, P W
2017-12-01
Automated methods to evaluate growth of hand and wrist bones on radiographs and magnetic resonance imaging have been developed. They can be applied to estimate age in children and subadults. Automated methods require the software to (1) recognise the region of interest in the image(s), (2) evaluate the degree of development and (3) correlate this to the age of the subject based on a reference population. For age estimation based on third molars an automated method for step (1) has been presented for 3D magnetic resonance imaging and is currently being optimised (Unterpirker et al. 2015). To develop an automated method for step (2) based on lower third molars on panoramic radiographs. A modified Demirjian staging technique including ten developmental stages was developed. Twenty panoramic radiographs per stage per gender were retrospectively selected for FDI element 38. Two observers decided in consensus about the stages. When necessary, a third observer acted as a referee to establish the reference stage for the considered third molar. This set of radiographs was used as training data for machine learning algorithms for automated staging. First, image contrast settings were optimised to evaluate the third molar of interest and a rectangular bounding box was placed around it in a standardised way using Adobe Photoshop CC 2017 software. This bounding box indicated the region of interest for the next step. Second, several machine learning algorithms available in MATLAB R2017a software were applied for automated stage recognition. Third, the classification performance was evaluated in a 5-fold cross-validation scenario, using different validation metrics (accuracy, Rank-N recognition rate, mean absolute difference, linear kappa coefficient). Transfer Learning as a type of Deep Learning Convolutional Neural Network approach outperformed all other tested approaches. 
Mean accuracy equalled 0.51, mean absolute difference was 0.6 stages and mean linearly weighted kappa was 0.82. The overall performance of the presented automated pilot technique to stage lower third molar development on panoramic radiographs was similar to staging by human observers. It will be further optimised in future research, since it represents a necessary step to achieve a fully automated dental age estimation method, which to date is not available.
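The validation metrics named above explain why a modest accuracy can coexist with a high weighted kappa: most errors are off by only one stage. The sketch below computes accuracy, mean absolute difference and linearly weighted kappa for two integer stagings. Numbering the ten stages 0 to 9 is an assumption made here, and the stage lists are invented, not study data.

```python
def staging_metrics(reference, predicted, n_stages):
    """Accuracy, mean absolute difference and linearly weighted kappa."""
    n = len(reference)
    accuracy = sum(r == p for r, p in zip(reference, predicted)) / n
    mad = sum(abs(r - p) for r, p in zip(reference, predicted)) / n

    # Marginal stage frequencies for the reference and the predicted staging.
    ref_freq = [reference.count(k) / n for k in range(n_stages)]
    pred_freq = [predicted.count(k) / n for k in range(n_stages)]

    # Linear disagreement weight |i - j| / (n_stages - 1).
    span = n_stages - 1
    observed = mad / span
    expected = sum(
        ref_freq[i] * pred_freq[j] * abs(i - j) / span
        for i in range(n_stages)
        for j in range(n_stages)
    )
    kappa = 1.0 - observed / expected
    return accuracy, mad, kappa


# Hypothetical reference and automated stages for ten radiographs.
reference = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
predicted = [0, 1, 3, 3, 4, 4, 6, 8, 8, 9]
accuracy, mad, kappa = staging_metrics(reference, predicted, n_stages=10)
```

The sketch does not handle the degenerate case in which expected disagreement is zero, for example when both stagings assign every radiograph to the same stage.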
Man-Robot Symbiosis: A Framework For Cooperative Intelligence And Control
NASA Astrophysics Data System (ADS)
Parker, Lynne E.; Pin, Francois G.
1988-10-01
The man-robot symbiosis concept has the fundamental objective of bridging the gap between fully human-controlled and fully autonomous systems to achieve true man-robot cooperative control and intelligence. Such a system would allow improved speed, accuracy, and efficiency of task execution, while retaining the man in the loop for innovative reasoning and decision-making. The symbiont would have capabilities for supervised and unsupervised learning, allowing an increase of expertise in a wide task domain. This paper describes a robotic system architecture facilitating the symbiotic integration of teleoperative and automated modes of task execution. The architecture reflects a unique blend of many disciplines of artificial intelligence into a working system, including job or mission planning, dynamic task allocation, man-robot communication, automated monitoring, and machine learning. These disciplines are embodied in five major components of the symbiotic framework: the Job Planner, the Dynamic Task Allocator, the Presenter/Interpreter, the Automated Monitor, and the Learning System.
12 CFR Appendix A to Part 1005 - Model Disclosure Clauses and Forms
Code of Federal Regulations, 2013 CFR
2013-01-01
... you learn of the loss or theft of your [card] [code], you can lose no more than $50 if someone used... learn of the loss or theft of your [card] [code], and we can prove we could have stopped someone from... each transfer you make using our [automated teller machines] [telephone bill-payment service] [point-of...
12 CFR Appendix A to Part 1005 - Model Disclosure Clauses and Forms
Code of Federal Regulations, 2014 CFR
2014-01-01
... line of credit). If you tell us within 2 business days after you learn of the loss or theft of your....) If you do NOT tell us within 2 business days after you learn of the loss or theft of your [card... our [automated teller machines] [telephone bill-payment service] [point-of-sale transfer service]. (2...
12 CFR Appendix A to Part 1005 - Model Disclosure Clauses and Forms
Code of Federal Regulations, 2012 CFR
2012-01-01
... you learn of the loss or theft of your [card] [code], you can lose no more than $50 if someone used... learn of the loss or theft of your [card] [code], and we can prove we could have stopped someone from... each transfer you make using our [automated teller machines] [telephone bill-payment service] [point-of...
Advanced methods in NDE using machine learning approaches
NASA Astrophysics Data System (ADS)
Wunderlich, Christian; Tschöpe, Constanze; Duckhorn, Frank
2018-04-01
Machine learning (ML) methods and algorithms have been applied recently with great success in quality control and predictive maintenance. Their goal, to build new and/or leverage existing algorithms that learn from training data and give accurate predictions or find patterns, particularly with new and unseen similar data, fits perfectly to Non-Destructive Evaluation. The advantages of ML in NDE are obvious in such tasks as pattern recognition in acoustic signals or automated processing of images from X-ray, ultrasonic or optical methods. Fraunhofer IKTS is using machine learning algorithms in acoustic signal analysis. The approach has been applied to a variety of tasks in quality assessment. The principal approach is based on acoustic signal processing with a primary and secondary analysis step followed by a cognitive system to create model data. Already in the second analysis step, unsupervised learning algorithms such as principal component analysis are used to simplify data structures. In the cognitive part of the software, further unsupervised and supervised learning algorithms are trained. Later, the sensor signals from unknown samples can be recognized and classified automatically by the previously trained algorithms. Recently the IKTS team was able to transfer the software for signal processing and pattern recognition to a small printed circuit board (PCB). Algorithms are still trained on an ordinary PC; however, the trained algorithms run on the Digital Signal Processor and the FPGA chip. The identical approach will be used for pattern recognition in image analysis of OCT pictures. Some key requirements have to be fulfilled, however. A sufficiently large set of training data, a high signal-to-noise ratio, and an optimized and exact fixation of components are required. The automated testing can be done subsequently by the machine. 
By integrating the test data of many components along the value chain further optimization including lifetime and durability prediction based on big data becomes possible, even if components are used in different versions or configurations. This is the promise behind German Industry 4.0.
Machine learning to parse breast pathology reports in Chinese.
Tang, Rong; Ouyang, Lizhi; Li, Clara; He, Yue; Griffin, Molly; Taghian, Alphonse; Smith, Barbara; Yala, Adam; Barzilay, Regina; Hughes, Kevin
2018-06-01
Large structured databases of pathology findings are valuable in deriving new clinical insights. However, they are labor intensive to create and generally require manual annotation. There has been some work in the bioinformatics community to support automating this work via machine learning in English. Our contribution is to provide an automated approach to construct such structured databases in Chinese, and to set the stage for extraction from other languages. We collected 2104 de-identified Chinese benign and malignant breast pathology reports from Hunan Cancer Hospital. Physicians with native Chinese proficiency reviewed the reports and annotated a variety of binary and numerical pathologic entities. After excluding 78 cases with a bilateral lesion in the same report, 1216 cases were used as a training set for the algorithm, which was then refined by 405 development cases. The Natural language processing algorithm was tested by using the remaining 405 cases to evaluate the machine learning outcome. The model was used to extract 13 binary entities and 8 numerical entities. When compared to physicians with native Chinese proficiency, the model showed a per-entity accuracy from 91 to 100% for all common diagnoses on the test set. The overall accuracy of binary entities was 98% and of numerical entities was 95%. In a per-report evaluation for binary entities with more than 100 training cases, 85% of all the testing reports were completely correct and 11% had an error in 1 out of 22 entities. We have demonstrated that Chinese breast pathology reports can be automatically parsed into structured data using standard machine learning approaches. The results of our study demonstrate that techniques effective in parsing English reports can be scaled to other languages.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ren, X; Gao, H; Sharp, G
Purpose: Accurate image segmentation is a crucial step during image guided radiation therapy. This work proposes a multi-atlas machine learning (MAML) algorithm for automated segmentation of head-and-neck CT images. Methods: As the first step, the algorithm utilizes normalized mutual information as the similarity metric, affine registration combined with multiresolution B-Spline registration, and then fuses together using the label fusion strategy via Plastimatch. As the second step, the following feature selection strategy is proposed to extract five feature components from reference or atlas images: intensity (I), distance map (D), box (B), center of gravity (C) and stable point (S). The box feature B is novel. It describes a relative position from each point to the minimum inscribed rectangle of the ROI. The center-of-gravity feature C is the 3D Euclidean distance from a sample point to the ROI center of gravity, and S is the distance of the sample point to the landmarks. Then, we adopt random forest (RF) in Scikit-learn, a Python module integrating a wide range of state-of-the-art machine learning algorithms, as the classifier. Different feature and atlas strategies are used for different ROIs for improved performance, such as a multi-atlas strategy with reference box for brainstem, and a single-atlas strategy with reference landmark for optic chiasm. Results: The algorithm was validated on a set of 33 CT images with manual contours using a leave-one-out cross-validation strategy. Dice similarity coefficients between manual contours and automated contours were calculated: the proposed MAML method had an improvement from 0.79 to 0.83 for brainstem and 0.11 to 0.52 for optic chiasm with respect to the multi-atlas segmentation method (MA). Conclusion: A MAML method has been proposed for automated segmentation of head-and-neck CT images with improved performance. It provides a comparable result in brainstem and an improved result in optic chiasm compared with MA. 
Xuhua Ren and Hao Gao were partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000), and the Shanghai Pujiang Talent Program (#14PJ1404500).
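The Dice similarity coefficient used to score the contours above is twice the overlap of two regions divided by the sum of their sizes. A minimal sketch follows, representing each contour as a set of voxel coordinates; the coordinates are invented for illustration, and treating two empty masks as perfect agreement is a convention chosen here.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity 2|A∩B| / (|A| + |B|) for two sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical 2D contours given as (row, col) voxel coordinates.
manual = {(0, 0), (0, 1), (1, 0), (1, 1)}
automated = {(0, 1), (1, 1), (1, 2)}
score = dice_coefficient(manual, automated)
```

Small structures such as the optic chiasm are penalised heavily by this metric, since a displacement of a few voxels removes most of the overlap.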
Oliveira, Bárbara L; Godinho, Daniela; O'Halloran, Martin; Glavin, Martin; Jones, Edward; Conceição, Raquel C
2018-05-19
Currently, breast cancer often requires invasive biopsies for diagnosis, motivating researchers to design and develop non-invasive and automated diagnosis systems. Recent microwave breast imaging studies have shown how backscattered signals carry relevant information about the shape of a tumour, and tumour shape is often used with current imaging modalities to assess malignancy. This paper presents a comprehensive analysis of microwave breast diagnosis systems which use machine learning to learn characteristics of benign and malignant tumours. The state-of-the-art, the main challenges still to overcome and potential solutions are outlined. Specifically, this work investigates the benefit of signal pre-processing on diagnostic performance, and proposes a new set of extracted features that capture the tumour shape information embedded in a signal. This work also investigates if a relationship exists between the antenna topology in a microwave system and diagnostic performance. Finally, a careful machine learning validation methodology is implemented to guarantee the robustness of the results and the accuracy of performance evaluation.
Automated classification of cell morphology by coherence-controlled holographic microscopy
NASA Astrophysics Data System (ADS)
Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim
2017-08-01
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity.
Automated classification of cell morphology by coherence-controlled holographic microscopy.
Strbkova, Lenka; Zicha, Daniel; Vesely, Pavel; Chmelik, Radim
2017-08-01
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Detection of longitudinal visual field progression in glaucoma using machine learning.
Yousefi, Siamak; Kiwaki, Taichi; Zheng, Yuhui; Suigara, Hiroki; Asaoka, Ryo; Murata, Hiroshi; Lemij, Hans; Yamanishi, Kenji
2018-06-16
Global indices of standard automated perimetry are insensitive to localized losses, while point-wise indices are sensitive but highly variable. Region-wise indices sit in between. This study introduces a machine-learning-based index for glaucoma progression detection that outperforms global, region-wise, and point-wise indices. Development and comparison of a prognostic index. Visual fields from 2085 eyes of 1214 subjects were used to identify glaucoma progression patterns using machine learning. Visual fields from 133 eyes of 71 glaucoma patients were collected 10 times over 10 weeks to provide a no-change, test-retest dataset. The parameters of all methods were identified using visual field sequences in the test-retest dataset to meet fixed 95% specificity. An independent dataset of 270 eyes of 136 glaucoma patients and survival analysis were utilized to compare methods. The time to detect progression in 25% of the eyes in the longitudinal dataset using global mean deviation (MD) was 5.2 years (95% confidence interval, 4.1 - 6.5 years); 4.5 years (4.0 - 5.5) using region-wise, 3.9 years (3.5 - 4.6) using point-wise, and 3.5 years (3.1 - 4.0) using machine learning analysis. The time until 25% of eyes showed subsequently confirmed progression after two additional visits were included was 6.6 years (5.6 - 7.4 years), 5.7 years (4.8 - 6.7), 5.6 years (4.7 - 6.5), and 5.1 years (4.5 - 6.0) for global, region-wise, point-wise, and machine learning analyses, respectively. Machine learning analysis detects progressing eyes earlier than other methods consistently, with or without confirmation visits. In particular, machine learning detects more slowly progressing eyes than other methods. Copyright © 2018 Elsevier Inc. All rights reserved.
Active machine learning-driven experimentation to determine compound effects on protein patterns.
Naik, Armaghan W; Kangas, Joshua D; Sullivan, Devin P; Murphy, Robert F
2016-02-03
High throughput screening determines the effects of many conditions on a given biological target. Currently, estimating the effects of those conditions on other targets requires either strong modeling assumptions (e.g. similarities among targets) or separate screens. Ideally, data-driven experimentation could be used to learn accurate models for many conditions and targets without doing all possible experiments. We have previously described an active machine learning algorithm that can iteratively choose small sets of experiments to learn models of multiple effects. We now show that, with no prior knowledge and with liquid handling robotics and automated microscopy under its control, this learner accurately learned the effects of 48 chemical compounds on the subcellular localization of 48 proteins while performing only 29% of all possible experiments. The results represent the first practical demonstration of the utility of active learning-driven biological experimentation in which the set of possible phenotypes is unknown in advance.
ALE: automated label extraction from GEO metadata.
Giles, Cory B; Brown, Chase A; Ripperger, Michael; Dennis, Zane; Roopnarinesingh, Xiavan; Porter, Hunter; Perz, Aleksandra; Wren, Jonathan D
2017-12-28
NCBI's Gene Expression Omnibus (GEO) is a rich community resource containing millions of gene expression experiments from human, mouse, rat, and other model organisms. However, information about each experiment (metadata) is in the format of an open-ended, non-standardized textual description provided by the depositor. Thus, classification of experiments for meta-analysis by factors such as gender, age of the sample donor, and tissue of origin is not feasible without assigning labels to the experiments. Automated approaches are preferable for this, primarily because of the size and volume of the data to be processed, but also because they ensure standardization and consistency. While some of these labels can be extracted directly from the textual metadata, many of the data available do not contain explicit text informing the researcher about the age and gender of the subjects in the study. To bridge this gap, machine-learning methods can be trained to use the gene expression patterns associated with the text-derived labels to refine label-prediction confidence. Our analysis shows only 26% of metadata text contains information about gender and 21% about age. In order to ameliorate the lack of available labels for these data sets, we first extract labels from the textual metadata for each GEO RNA dataset and evaluate the performance against a gold standard of manually curated labels. We then use machine-learning methods to predict labels, based upon gene expression of the samples and compare this to the text-based method. Here we present an automated method to extract labels for age, gender, and tissue from textual metadata and GEO data using both a heuristic approach and machine learning. We show the two methods together improve accuracy of label assignment to GEO samples.
Critical Assessment of Small Molecule Identification 2016: automated methods.
Schymanski, Emma L; Ruttkies, Christoph; Krauss, Martin; Brouard, Céline; Kind, Tobias; Dührkop, Kai; Allen, Felicity; Vaniya, Arpana; Verdegem, Dries; Böcker, Sebastian; Rousu, Juho; Shen, Huibin; Tsugawa, Hiroshi; Sajed, Tanvir; Fiehn, Oliver; Ghesquière, Bart; Neumann, Steffen
2017-03-27
The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest ( www.casmi-contest.org ) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification. The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in "Category 2: Best Automatic Structural Identification-In Silico Fragmentation Only", won by Team Brouard with 41% challenge wins. The winner of "Category 3: Best Automatic Structural Identification-Full Information" was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways. The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for "known unknowns". As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for "real life" annotations. The true "unknown unknowns" remain to be evaluated in future CASMI contests.
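The Top 1 and Top 10 figures quoted in this abstract are rank-based success rates. A minimal sketch of how such a rate could be computed is given here, assuming each challenge yields the 1-based rank of the correct candidate, or None when the correct structure was not among the candidates. The function name and the sample ranks are hypothetical and are not taken from the contest data.

```python
from typing import Optional, Sequence


def top_k_rate(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of challenges whose correct candidate is ranked within the top k."""
    if not ranks:
        raise ValueError("at least one challenge is required")
    # A rank of None means the correct candidate was missing, so it never counts.
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Hypothetical ranks of the correct structure in five challenges.
example_ranks = [1, 3, 12, None, 1]
print(top_k_rate(example_ranks, 1))   # share ranked first
print(top_k_rate(example_ranks, 10))  # share ranked in the top 10
```

Counting a missing candidate as a failure rather than dropping the challenge keeps the denominator fixed, which makes rates comparable between methods.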
Pesesky, Mitchell W; Hussain, Tahir; Wallace, Meghan; Patel, Sanket; Andleeb, Saadia; Burnham, Carey-Ann D; Dantas, Gautam
2016-01-01
The time-to-result for culture-based microorganism recovery and phenotypic antimicrobial susceptibility testing necessitates initial use of empiric (frequently broad-spectrum) antimicrobial therapy. If the empiric therapy is not optimal, this can lead to adverse patient outcomes and contribute to increasing antibiotic resistance in pathogens. New, more rapid technologies are emerging to meet this need. Many of these are based on identifying resistance genes, rather than directly assaying resistance phenotypes, and thus require interpretation to translate the genotype into treatment recommendations. These interpretations, like other parts of clinical diagnostic workflows, are likely to be increasingly automated in the future. We set out to evaluate the two major approaches that could be amenable to automation pipelines: rules-based methods and machine learning methods. The rules-based algorithm makes predictions based upon current, curated knowledge of Enterobacteriaceae resistance genes. The machine-learning algorithm predicts resistance and susceptibility based on a model built from a training set of variably resistant isolates. As our test set, we used whole genome sequence data from 78 clinical Enterobacteriaceae isolates, previously identified to represent a variety of phenotypes, from fully-susceptible to pan-resistant strains for the antibiotics tested. We tested three antibiotic resistance determinant databases for their utility in identifying the complete resistome for each isolate. The predictions of the rules-based and machine learning algorithms for these isolates were compared to results of phenotype-based diagnostics. The rules-based and machine-learning predictions achieved agreement with standard-of-care phenotypic diagnostics of 89.0 and 90.3%, respectively, across twelve antibiotic agents from six major antibiotic classes. Several sources of disagreement between the algorithms were identified.
Novel variants of known resistance factors and incomplete genome assembly confounded the rules-based algorithm, resulting in predictions based on gene family, rather than on knowledge of the specific variant found. Low-frequency resistance caused errors in the machine-learning algorithm because those genes were not seen or seen infrequently in the test set. We also identified an example of variability in the phenotype-based results that led to disagreement with both genotype-based methods. Genotype-based antimicrobial susceptibility testing shows great promise as a diagnostic tool, and we outline specific research goals to further refine this methodology.
NASA Astrophysics Data System (ADS)
Fern, Lisa Carolynn
This dissertation examines the challenges inherent in designing and regulating to support human-automation interaction for new technologies that will be deployed into complex systems. A key question for new technologies with increasingly capable automation is how work will be accomplished by human and machine agents. This question has traditionally been framed as how functions should be allocated between humans and machines. Such framing misses the coordination and synchronization that is needed for the different human and machine roles in the system to accomplish their goals. Coordination and synchronization demands are driven by the underlying human-automation architecture of the new technology, which is typically not specified explicitly by designers. The human-machine interface (HMI), which is intended to facilitate human-machine interaction and cooperation, typically is defined explicitly and therefore serves as a proxy for human-automation cooperation requirements with respect to technical standards for technologies. Unfortunately, mismatches between the HMI and the coordination and synchronization demands of the underlying human-automation architecture can lead to system breakdowns. A methodology is needed that both designers and regulators can utilize to evaluate the predicted performance of a new technology given potential human-automation architectures. Three experiments were conducted to inform the minimum HMI requirements for a detect and avoid (DAA) system for unmanned aircraft systems (UAS). The results of the experiments provided empirical input to specific minimum operational performance standards that UAS manufacturers will have to meet in order to operate UAS in the National Airspace System (NAS). These studies represent a success story for how to objectively and systematically evaluate prototype technologies as part of the process for developing regulatory requirements.
They also provide an opportunity to reflect on the lessons learned in order to improve the methodology for defining technology requirements for regulators in the future. The biggest shortcoming of the presented research program was the absence of the explicit definition, generation and analysis of potential human-automation architectures. Failure to execute this step in the research process resulted in less efficient evaluation of the candidate prototype technologies in addition to a lack of exploration of different approaches to human-automation cooperation. Defining potential human-automation architectures a priori also allows regulators to develop scenarios that will stress the performance boundaries of the technology during the evaluation phase. The importance of adding this step of generating and evaluating candidate human-automation architectures prior to formal empirical evaluation is discussed. This document concludes with a look at both the importance of, and the challenges facing, the inclusion of examining human-automation coordination issues as part of the safety assurance activities of new technologies.
Learning to recognize rat social behavior: Novel dataset and cross-dataset application.
Lorbach, Malte; Kyriakou, Elisavet I; Poppe, Ronald; van Dam, Elsbeth A; Noldus, Lucas P J J; Veltkamp, Remco C
2018-04-15
Social behavior is an important aspect of rodent models. Automated measuring tools that make use of video analysis and machine learning are an increasingly attractive alternative to manual annotation. Because machine learning-based methods need to be trained, it is important that they are validated using data from different experiment settings. To develop and validate automated measuring tools, there is a need for annotated rodent interaction datasets. Currently, the availability of such datasets is limited to two mouse datasets. We introduce the first publicly available rat social interaction dataset, RatSI. We demonstrate the practical value of the novel dataset by using it as the training set for a rat interaction recognition method. We show that behavior variations induced by the experiment setting can lead to reduced performance, which illustrates the importance of cross-dataset validation. Consequently, we add a simple adaptation step to our method and improve the recognition performance. Most existing methods are trained and evaluated in one experimental setting, which limits the predictive power of the evaluation to that particular setting. We demonstrate that cross-dataset experiments provide more insight into the performance of classifiers. With our novel, public dataset we encourage the development and validation of automated recognition methods. We are convinced that cross-dataset validation enhances our understanding of rodent interactions and facilitates the development of more sophisticated recognition methods. Combining them with adaptation techniques may enable us to apply automated recognition methods to a variety of animals and experiment settings. Copyright © 2017 Elsevier B.V. All rights reserved.
Personalized Physical Activity Coaching: A Machine Learning Approach
Dijkhuis, Talko B.; van Ittersum, Miriam W.; Velthuijsen, Hugo
2018-01-01
Living a sedentary lifestyle is one of the major causes of numerous health problems. To encourage employees to lead a less sedentary life, the Hanze University started a health promotion program. One of the interventions in the program was the use of an activity tracker to record participants' daily step count. The daily step count served as input for a fortnightly coaching session. In this paper, we investigate the possibility of automating part of the coaching procedure on physical activity by providing personalized feedback throughout the day on a participant’s progress in achieving a personal step goal. The gathered step count data was used to train eight different machine learning algorithms to make hourly estimations of the probability of achieving a personalized, daily steps threshold. In 80% of the individual cases, the Random Forest algorithm was the best performing algorithm (mean accuracy = 0.93, range = 0.88–0.99, and mean F1-score = 0.90, range = 0.87–0.94). To demonstrate the practical usefulness of these models, we developed a proof-of-concept Web application that provides personalized feedback about whether a participant is expected to reach his or her daily threshold. We argue that the use of machine learning could become an invaluable asset in the process of automated personalized coaching. The individualized algorithms allow for predicting physical activity during the day and provide the possibility to intervene in time. PMID:29463052
Crowe, Simon F; Mahony, Kate; Jackson, Martin
2004-08-01
The purpose of the current study was to explore whether performance on standardised neuropsychological measures could predict functional ability with automated machines and services among people with an acquired brain injury (ABI). Participants were 45 individuals who met the criteria for mild, moderate or severe ABI and 15 control participants matched on demographic variables including age and education. Each participant was required to complete a battery of neuropsychological tests, as well as performing three automated service delivery tasks: a transport automated ticketing machine, an automated teller machine (ATM) and an automated telephone service. The results showed a consistently high relationship between the neuropsychological measures, both as single predictors and in combination, and level of competency with the automated machines. Automated machines are part of a relatively new phenomenon in service delivery and offer an ecologically valid functional measure of performance that represents a true indication of functional disability.
NASA Technical Reports Server (NTRS)
Shewhart, Mark
1991-01-01
Statistical Process Control (SPC) charts are one of several tools used in quality control. Other tools include flow charts, histograms, cause and effect diagrams, check sheets, Pareto diagrams, graphs, and scatter diagrams. A control chart is simply a graph which indicates process variation over time. The purpose of drawing a control chart is to detect any changes in the process signalled by abnormal points or patterns on the graph. The Artificial Intelligence Support Center (AISC) of the Acquisition Logistics Division has developed a hybrid machine learning expert system prototype which automates the process of constructing and interpreting control charts.
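The detection step described in this abstract, flagging abnormal points on a control chart, can be illustrated with a short sketch. It assumes the common three-sigma rule with limits estimated from an in-control baseline sample; the function names and the data are hypothetical. The AISC prototype's own rules are not given in the abstract, and pattern-based rules such as runs or trends are not covered.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def control_limits(baseline: Sequence[float], n_sigma: float = 3.0) -> Tuple[float, float, float]:
    """Return (lower limit, center line, upper limit) from in-control baseline data."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation; an assumed estimator
    return center - n_sigma * spread, center, center + n_sigma * spread


def out_of_control(points: Sequence[float], lower: float, upper: float) -> List[int]:
    """Indices of points that fall outside the control limits."""
    return [i for i, x in enumerate(points) if x < lower or x > upper]


# Hypothetical measurements: a stable baseline, then new points to monitor.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
lower, center, upper = control_limits(baseline)
new_points = [10.1, 9.9, 11.5, 10.0]
print(out_of_control(new_points, lower, upper))
```

Estimating the limits from a separate baseline, rather than from the points being judged, prevents a large excursion from widening its own limits.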
Assessing Creative Problem-Solving with Automated Text Grading
ERIC Educational Resources Information Center
Wang, Hao-Chuan; Chang, Chun-Yen; Li, Tsai-Yen
2008-01-01
The work aims to improve the assessment of creative problem-solving in science education by employing language technologies and computational-statistical machine learning methods to grade students' natural language responses automatically. To evaluate constructs like creative problem-solving with validity, open-ended questions that elicit…
A Starter's Guide to Artificial Intelligence.
ERIC Educational Resources Information Center
McConnell, Barry A.; McConnell, Nancy J.
1988-01-01
Discussion of the history and development of artificial intelligence (AI) highlights a bibliography of introductory books on various aspects of AI, including AI programing; problem solving; automated reasoning; game playing; natural language; expert systems; machine learning; robotics and vision; critics of AI; and representative software. (LRW)
Automated placement of interfaces in conformational kinetics calculations using machine learning
NASA Astrophysics Data System (ADS)
Grazioli, Gianmarc; Butts, Carter T.; Andricioaei, Ioan
2017-10-01
Several recent implementations of algorithms for sampling reaction pathways employ a strategy for placing interfaces or milestones across the reaction coordinate manifold. Interfaces can be introduced such that the full feature space describing the dynamics of a macromolecule is divided into Voronoi (or other) cells, and the global kinetics of the molecular motions can be calculated from the set of fluxes through the interfaces between the cells. Although some methods of this type are exact for an arbitrary set of cells, in practice, the calculations will converge fastest when the interfaces are placed in regions where they can best capture transitions between configurations corresponding to local minima. The aim of this paper is to introduce a fully automated machine-learning algorithm for defining a set of cells for use in kinetic sampling methodologies based on subdividing the dynamical feature space; the algorithm requires no intuition about the system or input from the user and scales to high-dimensional systems.
Automated placement of interfaces in conformational kinetics calculations using machine learning.
Grazioli, Gianmarc; Butts, Carter T; Andricioaei, Ioan
2017-10-21
Several recent implementations of algorithms for sampling reaction pathways employ a strategy for placing interfaces or milestones across the reaction coordinate manifold. Interfaces can be introduced such that the full feature space describing the dynamics of a macromolecule is divided into Voronoi (or other) cells, and the global kinetics of the molecular motions can be calculated from the set of fluxes through the interfaces between the cells. Although some methods of this type are exact for an arbitrary set of cells, in practice, the calculations will converge fastest when the interfaces are placed in regions where they can best capture transitions between configurations corresponding to local minima. The aim of this paper is to introduce a fully automated machine-learning algorithm for defining a set of cells for use in kinetic sampling methodologies based on subdividing the dynamical feature space; the algorithm requires no intuition about the system or input from the user and scales to high-dimensional systems.
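The division of feature space into Voronoi cells mentioned in this abstract can be sketched as a nearest-center assignment, with crossings between cells counted along a trajectory. This is only an illustration under stated assumptions: the cell centers are supplied by hand here, whereas the paper's algorithm chooses them automatically, and the raw crossing counts are merely an ingredient of a flux estimate, not the kinetics calculation itself. All names and data are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def voronoi_cell(point: Point, centers: Sequence[Point]) -> int:
    """Index of the nearest center, i.e. the Voronoi cell containing the point."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def count_crossings(trajectory: Sequence[Point], centers: Sequence[Point]) -> List[List[int]]:
    """counts[a][b] is the number of consecutive frames moving from cell a to cell b."""
    n = len(centers)
    counts = [[0] * n for _ in range(n)]
    cells = [voronoi_cell(p, centers) for p in trajectory]
    for a, b in zip(cells, cells[1:]):
        if a != b:  # only frames that cross an interface are counted
            counts[a][b] += 1
    return counts


# Hypothetical one-dimensional-like trajectory between two cells.
centers = [(0.0, 0.0), (1.0, 0.0)]
trajectory = [(0.1, 0.0), (0.4, 0.0), (0.7, 0.0), (0.9, 0.0), (0.2, 0.0)]
print(count_crossings(trajectory, centers))
```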
NASA Astrophysics Data System (ADS)
Shprits, Y.; Zhelavskaya, I. S.; Kellerman, A. C.; Spasojevic, M.; Kondrashov, D. A.; Ghil, M.; Aseev, N.; Castillo Tibocha, A. M.; Cervantes Villa, J. S.; Kletzing, C.; Kurth, W. S.
2017-12-01
The increasing volume of satellite measurements requires deployment of new tools that can utilize such vast amounts of data. Satellite measurements are usually limited to a single location in space, which complicates the data analysis geared towards reproducing the global state of the space environment. In this study we show how measurements can be combined by means of data assimilation and how machine learning can help analyze large amounts of data and can help develop global models that are trained on single-point measurements. Data Assimilation: Manual analysis of the satellite measurements is a challenging task, while automated analysis is complicated by the fact that measurements are given at various locations in space, have different instrumental errors, and often vary by orders of magnitude. We show results of the long-term reanalysis of radiation belt measurements along with fully operational real-time predictions using the data-assimilative VERB code. Machine Learning: We present an application of machine learning tools to the analysis of NASA Van Allen Probes upper-hybrid frequency measurements. Using the obtained data set we train a new global predictive neural network. The results for the Van Allen Probes based neural network are compared with historical IMAGE satellite observations. We also show examples of predictions of geomagnetic indices using neural networks. Combination of machine learning and data assimilation: We discuss how data assimilation tools and machine learning tools can be combined so that physics-based insight into the dynamics of the particular system can be combined with empirical knowledge of its non-linear behavior.
NASA Astrophysics Data System (ADS)
Nakamura, Christopher M.; Murphy, Sytil K.; Christel, Michael G.; Stevens, Scott M.; Zollman, Dean A.
2016-06-01
Computer-automated assessment of students' text responses to short-answer questions represents an important enabling technology for online learning environments. We have investigated the use of machine learning to train computer models capable of automatically classifying short-answer responses and assessed the results. Our investigations are part of a project to develop and test an interactive learning environment designed to help students learn introductory physics concepts. The system is designed around an interactive video tutoring interface. We have analyzed 9 questions with about 150 responses or fewer. For 4 of the 9, we observe automated assessment with interrater agreement of 70% or better with the human rater. This level of agreement may represent a baseline for practical utility in instruction and indicates that the method warrants further investigation for use in this type of application. Our results also suggest strategies that may be useful for writing activities and questions that are more appropriate for automated assessment. These strategies include building activities that have relatively few conceptually distinct ways of perceiving the physical behavior of relatively few physical objects. Further success in this direction may allow us to promote interactivity and better provide feedback in online learning systems. These capabilities could enable our system to function more like a real tutor.
Fuzzy Logic-Based Audio Pattern Recognition
NASA Astrophysics Data System (ADS)
Malcangi, M.
2008-11-01
Audio and audio-pattern recognition is becoming one of the most important technologies to automatically control embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to rapidly and economically model such applications. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost and deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules manually tuned or automatically tuned by a self-learning process.
Automated Tape Laying Machine for Composite Structures.
The invention comprises an automated tape laying machine, for laying tape on a composite structure. The tape laying machine has a tape laying head...neatly cut. The automated tape laying device utilizes narrow width tape to increase machine flexibility and reduce wastage.
Young, Sean D; Yu, Wenchao; Wang, Wei
2017-02-01
"Social big data" from technologies such as social media, wearable devices, and online searches continue to grow and can be used as tools for HIV research. Although researchers can uncover patterns and insights associated with HIV trends and transmission, the review process is time consuming and resource intensive. Machine learning methods derived from computer science might be used to assist HIV domain experts by learning how to rapidly and accurately identify patterns associated with HIV from a large set of social data. Using an existing social media data set that was associated with HIV and coded by an HIV domain expert, we tested whether 4 commonly used machine learning methods could learn the patterns associated with HIV risk behavior. We used the 10-fold cross-validation method to examine the speed and accuracy of these models in applying that knowledge to detect HIV content in social media data. Logistic regression and random forest resulted in the highest accuracy in detecting HIV-related social data (85.3%), whereas the Ridge Regression Classifier resulted in the lowest accuracy. Logistic regression yielded the fastest processing time (16.98 seconds). Machine learning can enable social big data to become a new and important tool in HIV research, helping to create a new field of "digital HIV epidemiology." If a domain expert can identify patterns in social data associated with HIV risk or HIV transmission, machine learning models could quickly and accurately learn those associations and identify potential HIV patterns in large social data sets.
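The 10-fold cross-validation mentioned in this abstract partitions the data into ten folds and holds each one out in turn for testing. A minimal sketch of the index bookkeeping follows, assuming a simple shuffled split without stratification; the function name, the seed, and the sample size are hypothetical, and the study's own tooling is not described in the abstract.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each sample is used for testing exactly once across the ten folds.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the ten scores averaged to give the reported accuracy.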
A machine-learning apprentice for the completion of repetitive forms
NASA Technical Reports Server (NTRS)
Hermens, Leonard A.; Schlimmer, Jeffrey C.
1994-01-01
Forms of all types are used in businesses and government agencies, and most of them are filled in by hand. Yet much time and effort has been expended to automate form-filling by programming specific systems or computers. The high cost of programmers and other resources prohibits many organizations from benefiting from efficient office automation. A learning apprentice can be used for such repetitious form-filling tasks. In this paper, we establish the need for learning apprentices, describe a framework for such a system, explain the difficulties of form-filling, and present empirical results of a form-filling system used in our department from September 1991 to April 1992. The form-filling apprentice saves up to 87 percent in keystroke effort and correctly predicts nearly 90 percent of the values on the form.
Saba, Luca; Jain, Pankaj K; Suri, Harman S; Ikeda, Nobutaka; Araki, Tadashi; Singh, Bikesh K; Nicolaides, Andrew; Shafique, Shoaib; Gupta, Ajay; Laird, John R; Suri, Jasjit S
2017-06-01
Severe atherosclerosis disease in carotid arteries causes stenosis, which in turn leads to stroke. Machine learning systems have been previously developed for plaque wall risk assessment using morphology-based characterization. The fundamental assumption in such systems is the extraction of the grayscale features of the plaque region. Even though these systems have the ability to perform risk stratification, they lack the ability to achieve higher performance due to their inability to select and retain dominant features. This paper introduces a polling-based principal component analysis (PCA) strategy embedded in the machine learning framework to select and retain dominant features, resulting in superior performance. This leads to more stability and reliability. The automated system uses offline image data along with the ground truth labels to generate the parameters, which are then used to transform the online grayscale features to predict the risk of stroke. A set of sixteen grayscale plaque features is computed. Utilizing the cross-validation protocol (K = 10), and the PCA cutoff of 0.995, the machine learning system is able to achieve an accuracy of 98.55% and 98.83%, corresponding to the carotid far wall and near wall plaques, respectively. The corresponding reliability of the system was 94.56% and 95.63%, respectively. The automated system was validated against the manual risk assessment system, and the precision of merit for the same cross-validation settings and PCA cutoffs is 98.28% and 93.92% for the far and the near wall, respectively. PCA-embedded morphology-based plaque characterization offers a powerful strategy for risk assessment and can be adapted in clinical settings.
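One common reading of a PCA cutoff such as 0.995 is the share of total variance that the retained components must explain. The sketch below illustrates that reading only; it is an assumption, and the paper's polling-based selection is not reproduced. The eigenvalues are taken as already computed from the feature covariance matrix, and the function name and numbers are hypothetical.

```python
from typing import Sequence


def components_for_cutoff(eigenvalues: Sequence[float], cutoff: float = 0.995) -> int:
    """Smallest number of leading components whose variance share reaches the cutoff."""
    if not 0.0 < cutoff <= 1.0:
        raise ValueError("cutoff must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= cutoff:
            return k
    return len(ordered)  # fallback if rounding keeps the share just under the cutoff


# Hypothetical covariance eigenvalues for five features.
eigenvalues = [8.0, 1.5, 0.4, 0.08, 0.02]
print(components_for_cutoff(eigenvalues, 0.995))
```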
CRD's Daniela Ushizima Receives DOE Early Career Award
Science. The award will fund research into developing new methods to help scientists extract more -the-art data analysis methods with emphasis on pattern recognition and machine learning emerging sources, multidisciplinary teams to interpret the data and the computational methods to automate some of
Semi-Supervised Clustering for High-Dimensional and Sparse Features
ERIC Educational Resources Information Center
Yan, Su
2010-01-01
Clustering is one of the most common data mining tasks, used frequently for data organization and analysis in various application domains. Traditional machine learning approaches to clustering are fully automated and unsupervised where class labels are unknown a priori. In real application domains, however, some "weak" form of side…
NASA Technical Reports Server (NTRS)
Lum, Henry, Jr.
1988-01-01
Information on systems autonomy is given in viewgraph form. Information is given on space systems integration, intelligent autonomous systems, automated systems for in-flight mission operations, the Systems Autonomy Demonstration Project on the Space Station Thermal Control System, the architecture of an autonomous intelligent system, artificial intelligence research issues, machine learning, and real-time image processing.
Data-Driven Learning of Total and Local Energies in Elemental Boron
NASA Astrophysics Data System (ADS)
Deringer, Volker L.; Pickard, Chris J.; Csányi, Gábor
2018-04-01
The allotropes of boron continue to challenge structural elucidation and solid-state theory. Here we use machine learning combined with random structure searching (RSS) algorithms to systematically construct an interatomic potential for boron. Starting from ensembles of randomized atomic configurations, we use alternating single-point quantum-mechanical energy and force computations, Gaussian approximation potential (GAP) fitting, and GAP-driven RSS to iteratively generate a representation of the element's potential-energy surface. Beyond the total energies of the very different boron allotropes, our model readily provides atom-resolved, local energies and thus deepened insight into the frustrated β-rhombohedral boron structure. Our results open the door for the efficient and automated generation of GAPs, and other machine-learning-based interatomic potentials, and suggest their usefulness as a tool for materials discovery.
Pai, Yun Suen; Yap, Hwa Jen; Md Dawal, Siti Zawiah; Ramesh, S.; Phoon, Sin Ye
2016-01-01
This study presents a modular-based implementation of augmented reality to provide an immersive experience in learning or teaching the planning phase, control system, and machining parameters of a fully automated work cell. The architecture of the system consists of three code modules that can operate independently or combined to create a complete system that is able to guide engineers from the layout planning phase to the prototyping of the final product. The layout planning module determines the best possible arrangement in a layout for the placement of various machines, in this case a conveyor belt for transportation, a robot arm for pick-and-place operations, and a computer numerical control milling machine to generate the final prototype. The robotic arm module simulates the pick-and-place operation offline from the conveyor belt to a computer numerical control (CNC) machine utilising collision detection and inverse kinematics. Finally, the CNC module performs virtual machining based on the Uniform Space Decomposition method and axis aligned bounding box collision detection. The conducted case study revealed that given the situation, a semi-circle shaped arrangement is desirable, whereas the pick-and-place system and the final generated G-code produced the highest deviation of 3.83 mm and 5.8 mm respectively. PMID:27271840
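The abstract names axis-aligned bounding box (AABB) collision detection for the CNC module. A minimal sketch of the standard AABB overlap test follows, with hypothetical class and function names; it is not the authors' implementation. Two boxes overlap when their extents overlap on every axis, and here boxes that merely touch are counted as colliding, which is an illustrative choice.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class AABB:
    """Axis-aligned box given by its minimum and maximum corners (x, y, z)."""
    min_corner: Point
    max_corner: Point


def aabb_overlap(a: AABB, b: AABB) -> bool:
    """Return True when the two boxes overlap (or touch) on all three axes."""
    return all(
        a.min_corner[axis] <= b.max_corner[axis]
        and b.min_corner[axis] <= a.max_corner[axis]
        for axis in range(3)
    )


# Hypothetical tool and workpiece volumes.
tool = AABB((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
workpiece = AABB((0.5, 0.5, 0.5), (2.0, 2.0, 2.0))
print(aabb_overlap(tool, workpiece))
```

The test needs only six comparisons per pair, which is why it is commonly used as a cheap first pass before more exact geometry checks.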
Gestural cue analysis in automated semantic miscommunication annotation
Inoue, Masashi; Ogihara, Mitsunori; Hanada, Ryoko; Furuyama, Nobuhiro
2011-01-01
The automated annotation of conversational video by semantic miscommunication labels is a challenging topic. Although miscommunications are often obvious to the speakers as well as the observers, it is difficult for machines to detect them from the low-level features. We investigate the utility of gestural cues in this paper among various non-verbal features. Compared with gesture recognition tasks in human-computer interaction, this process is difficult due to the lack of understanding of which cues contribute to miscommunications and the implicitness of gestures. Nine simple gestural features are taken from gesture data, and both simple and complex classifiers are constructed using machine learning. The experimental results suggest that there is no single gestural feature that can predict or explain the occurrence of semantic miscommunication in our setting. PMID:23585724
Macedo, Nayana Damiani; Buzin, Aline Rodrigues; de Araujo, Isabela Bastos Binotti Abreu; Nogueira, Breno Valentim; de Andrade, Tadeu Uggere; Endringer, Denise Coutinho; Lenz, Dominik
2017-02-01
The current study proposes an automated machine learning approach for the quantification of cells in cell death pathways according to DNA fragmentation. A total of 17 images of kidney histological slide samples from male Wistar rats were used. The slides were photographed using an Axio Zeiss Vert.A1 microscope with a 40x objective lens coupled with an Axio Cam MRC Zeiss camera and Zen 2012 software. The images were analyzed using CellProfiler (version 2.1.1) and CellProfiler Analyst open-source software. Out of the 10,378 objects, 4970 (47.9%) were identified as TUNEL positive, and 5408 (52.1%) were identified as TUNEL negative. On average, the sensitivity and specificity values of the machine learning approach were 0.80 and 0.77, respectively. Image cytometry provides a quantitative analytical alternative to the more traditional qualitative methods more commonly used in studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
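Sensitivity and specificity, as reported above, are ratios over a confusion matrix. The sketch below shows the arithmetic and also re-derives the class percentages from the object counts given in the abstract. The confusion-matrix counts are hypothetical, chosen only to reproduce values of 0.80 and 0.77; the abstract does not report them.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true positives that were detected.
    specificity = TN / (TN + FP): share of true negatives that were rejected.
    """
    return tp / (tp + fn), tn / (tn + fp)


def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Object counts reported in the abstract.
print(percent(4970, 10378), percent(5408, 10378))

# Hypothetical confusion-matrix counts (not from the abstract).
print(sensitivity_specificity(tp=80, fn=20, tn=77, fp=23))
```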
Golla, Gowtham Kumar; Carlson, Jordan A; Huan, Jun; Kerr, Jacqueline; Mitchell, Tarrah; Borner, Kelsey
2016-10-01
Sedentary behavior of youth is an important determinant of health. However, better measures are needed to improve understanding of this relationship and the mechanisms at play, as well as to evaluate health promotion interventions. Wearable accelerometers are considered the standard for assessing physical activity in research, but do not perform well for assessing posture (i.e., sitting vs. standing), a critical component of sedentary behavior. The machine learning algorithms that we propose for assessing sedentary behavior will allow us to re-examine existing accelerometer data to better understand the association between sedentary time and health in various populations. We collected two datasets, a laboratory-controlled dataset and a free-living dataset. We trained machine learning classifiers separately on each dataset and compared performance across datasets. The classifiers predict five postures: sit, stand, sit-stand, stand-sit, and stand/walk. We compared a manually constructed Hidden Markov model (HMM) with an automated HMM from existing software. The manually constructed HMM achieved a higher macro-F1 score on both datasets.
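The macro-F1 score used to compare the two HMMs is the unweighted mean of the per-class F1 scores, so rare postures count as much as common ones. A minimal sketch follows; the posture labels come from the abstract, while the function name and the example sequences are hypothetical.

```python
POSTURES = ["sit", "stand", "sit-stand", "stand-sit", "stand/walk"]


def macro_f1(true_labels, predicted_labels, classes):
    """Unweighted mean of per-class F1 scores.

    A class with no true and no predicted instances contributes 0.0 here,
    which is one common convention (an assumption of this sketch).
    """
    pairs = list(zip(true_labels, predicted_labels))
    scores = []
    for cls in classes:
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN)
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# Hypothetical two-class example restricted to "sit" and "stand".
truth = ["sit", "sit", "stand", "stand"]
guess = ["sit", "stand", "stand", "stand"]
print(macro_f1(truth, guess, ["sit", "stand"]))
```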
Digital imaging biomarkers feed machine learning for melanoma screening.
Gareau, Daniel S; Correa da Rosa, Joel; Yagerman, Sarah; Carucci, John A; Gulati, Nicholas; Hueto, Ferran; DeFazio, Jennifer L; Suárez-Fariñas, Mayte; Marghoob, Ashfaq; Krueger, James G
2017-07-01
We developed an automated approach for generating quantitative image analysis metrics (imaging biomarkers) that are then analysed with a set of 13 machine learning algorithms to generate an overall risk score that is called a Q-score. These methods were applied to a set of 120 "difficult" dermoscopy images of dysplastic nevi and melanomas that were subsequently excised/classified. This approach yielded 98% sensitivity and 36% specificity for melanoma detection, approaching sensitivity/specificity of expert lesion evaluation. Importantly, we found strong spectral dependence of many imaging biomarkers in blue or red colour channels, suggesting the need to optimize spectral evaluation of pigmented lesions. © 2016 The Authors. Experimental Dermatology Published by John Wiley & Sons Ltd.
Tcheng, David K.; Nayak, Ashwin K.; Fowlkes, Charless C.; Punyasena, Surangi W.
2016-01-01
Discriminating between black and white spruce (Picea mariana and Picea glauca) is a difficult palynological classification problem that, if solved, would provide valuable data for paleoclimate reconstructions. We developed an open-source visual recognition software (ARLO, Automated Recognition with Layered Optimization) capable of differentiating between these two species at an accuracy on par with human experts. The system applies pattern recognition and machine learning to the analysis of pollen images and discovers general-purpose image features, defined by simple features of lines and grids of pixels taken at different dimensions, size, spacing, and resolution. It adapts to a given problem by searching for the most effective combination of both feature representation and learning strategy. This results in a powerful and flexible framework for image classification. We worked with images acquired using an automated slide scanner. We first applied a hash-based “pollen spotting” model to segment pollen grains from the slide background. We next tested ARLO’s ability to reconstruct black to white spruce pollen ratios using artificially constructed slides of known ratios. We then developed a more scalable hash-based method of image analysis that was able to distinguish between the pollen of black and white spruce with an estimated accuracy of 83.61%, comparable to human expert performance. Our results demonstrate the capability of machine learning systems to automate challenging taxonomic classifications in pollen analysis, and our success with simple image representations suggests that our approach is generalizable to many other object recognition problems. PMID:26867017
1991-09-05
"Learning from Learning: Principles for Supporting Drivers" J A Groeger, MRC Applied Psychology Unit, UK; "Argos: A Driver Behaviour Analysis System...Technology (CEST), UK; MISCELLANEOUS: "Modular Sensor System for Guiding Handling Machines" J Geit and J Heinrich, TZN Forschungs, FRG; "Flexible...Implementation Strategies; Systems engineering; PART III: Validation through Pilot
Active machine learning-driven experimentation to determine compound effects on protein patterns
Naik, Armaghan W; Kangas, Joshua D; Sullivan, Devin P; Murphy, Robert F
2016-01-01
High throughput screening determines the effects of many conditions on a given biological target. Currently, to estimate the effects of those conditions on other targets requires either strong modeling assumptions (e.g. similarities among targets) or separate screens. Ideally, data-driven experimentation could be used to learn accurate models for many conditions and targets without doing all possible experiments. We have previously described an active machine learning algorithm that can iteratively choose small sets of experiments to learn models of multiple effects. We now show that, with no prior knowledge and with liquid handling robotics and automated microscopy under its control, this learner accurately learned the effects of 48 chemical compounds on the subcellular localization of 48 proteins while performing only 29% of all possible experiments. The results represent the first practical demonstration of the utility of active learning-driven biological experimentation in which the set of possible phenotypes is unknown in advance. DOI: http://dx.doi.org/10.7554/eLife.10047.001 PMID:26840049
Merritt, Stephanie M; Ilgen, Daniel R
2008-04-01
We provide an empirical demonstration of the importance of attending to human user individual differences in examinations of trust and automation use. Past research has generally supported the notions that machine reliability predicts trust in automation, and trust in turn predicts automation use. However, links between user personality and perceptions of the machine with trust in automation have not been empirically established. On our X-ray screening task, 255 students rated trust and made automation use decisions while visually searching for weapons in X-ray images of luggage. We demonstrate that individual differences affect perceptions of machine characteristics when actual machine characteristics are constant, that perceptions account for 52% of trust variance above the effects of actual characteristics, and that perceptions mediate the effects of actual characteristics on trust. Importantly, we also demonstrate that when administered at different times, the same six trust items reflect two types of trust (dispositional trust and history-based trust) and that these two trust constructs are differentially related to other variables. Interactions were found among user characteristics, machine characteristics, and automation use. Our results suggest that increased specificity in the conceptualization and measurement of trust is required, future researchers should assess user perceptions of machine characteristics in addition to actual machine characteristics, and incorporation of user extraversion and propensity to trust machines can increase prediction of automation use decisions. Potential applications include the design of flexible automation training programs tailored to individuals who differ in systematic ways.
CRIE: An automated analyzer for Chinese texts.
Sung, Yao-Ting; Chang, Tao-Hsing; Lin, Wei-Chun; Hsieh, Kuan-Sheng; Chang, Kuo-En
2016-12-01
Textual analysis has been applied to various fields, such as discourse analysis, corpus studies, text leveling, and automated essay evaluation. Several tools have been developed for analyzing texts written in alphabetic languages such as English and Spanish. However, currently there is no tool available for analyzing Chinese-language texts. This article introduces a tool for the automated analysis of simplified and traditional Chinese texts, called the Chinese Readability Index Explorer (CRIE). Composed of four subsystems and incorporating 82 multilevel linguistic features, CRIE is able to conduct the major tasks of segmentation, syntactic parsing, and feature extraction. Furthermore, the integration of linguistic features with machine learning models enables CRIE to provide leveling and diagnostic information for texts in language arts, texts for learning Chinese as a foreign language, and texts with domain knowledge. The usage and validation of the functions provided by CRIE are also introduced.
Coupling Matched Molecular Pairs with Machine Learning for Virtual Compound Optimization.
Turk, Samo; Merget, Benjamin; Rippmann, Friedrich; Fulle, Simone
2017-12-26
Matched molecular pair (MMP) analyses are widely used in compound optimization projects to gain insights into structure-activity relationships (SAR). The analysis is traditionally done via statistical methods but can also be employed together with machine learning (ML) approaches to extrapolate to novel compounds. The MMP/ML method introduced here combines a fragment-based MMP implementation with different machine learning methods to obtain automated SAR decomposition and prediction. To test the prediction capabilities and model transferability, two different compound optimization scenarios were designed: (1) "new fragments", which occurs when exploring new fragments for a defined compound series, and (2) "new static core and transformations", which resembles, for instance, the identification of a new compound series. Very good results were achieved by all employed machine learning methods, especially for the new fragments case, but overall deep neural network models performed best, allowing reliable predictions also for the new static core and transformations scenario, where comprehensive SAR knowledge of the compound series is missing. Furthermore, we show that models trained on all available data have a higher generalizability compared to models trained on focused series and can extend beyond chemical space covered in the training data. Thus, coupling MMP with deep neural networks provides a promising approach to make high quality predictions on various data sets and in different compound optimization scenarios.
Technology assessment of advanced automation for space missions
NASA Technical Reports Server (NTRS)
1982-01-01
Six general classes of technology requirements derived during the mission definition phase of the study were identified as having maximum importance and urgency, including autonomous world model based information systems, learning and hypothesis formation, natural language and other man-machine communication, space manufacturing, teleoperators and robot systems, and computer science and technology.
ERIC Educational Resources Information Center
Prieto, L. P.; Sharma, K.; Kidzinski, L.; Rodríguez-Triana, M. J.; Dillenbourg, P.
2018-01-01
The pedagogical modelling of everyday classroom practice is an interesting kind of evidence, both for educational research and teachers' own professional development. This paper explores the usage of wearable sensors and machine learning techniques to automatically extract orchestration graphs (teaching activities and their social plane over time)…
The Roles of Suprasegmental Features in Predicting English Oral Proficiency with an Automated System
ERIC Educational Resources Information Center
Kang, Okim; Johnson, David
2018-01-01
Suprasegmental features have received growing attention in the field of oral assessment. In this article we describe a set of computer algorithms that automatically scores the oral proficiency of non-native speakers using unconstrained English speech. The algorithms employ machine learning and 11 suprasegmental measures divided into four groups…
ERIC Educational Resources Information Center
Allen, William H.; And Others
This study compared the relative effectiveness of an automated teaching machine with instructor presented instruction in graduate dental teaching. The objectives were to: (1) determine the effects of 3 laboratory instructional procedures used in combination with 2 lectures on the acquisition of manual operative skills, the learning of information…
A Cognitive Systems Engineering Approach to Developing HMI Requirements for New Technologies
NASA Technical Reports Server (NTRS)
Fern, Lisa Carolynn
2016-01-01
This document examines the challenges inherent in designing and regulating to support human-automation interaction for new technologies that will be deployed into complex systems. A key question for new technologies is how work will be accomplished by the human and machine agents. This question has traditionally been framed as how functions should be allocated between humans and machines. Such framing misses the coordination and synchronization that is needed for the different human and machine roles in the system to accomplish their goals. Coordination and synchronization demands are driven by the underlying human-automation architecture of the new technology, which is typically not specified explicitly by the designers. The human machine interface (HMI), which is intended to facilitate human-machine interaction and cooperation, however, typically is defined explicitly and therefore serves as a proxy for human-automation cooperation requirements with respect to technical standards for technologies. Unfortunately, mismatches between the HMI and the coordination and synchronization demands of the underlying human-automation architecture can lead to system breakdowns. A methodology is needed that both designers and regulators can utilize to evaluate the expected performance of a new technology given potential human-automation architectures. Three experiments were conducted to inform the minimum HMI requirements for a detect and avoid system for unmanned aircraft systems (UAS). The results of the experiments provided empirical input to specific minimum operational performance standards that UAS manufacturers will have to meet in order to operate UAS in the National Airspace System (NAS). These studies represent a success story for how to objectively and systematically evaluate prototype technologies as part of the process for developing regulatory requirements.
They also provide an opportunity to reflect on the lessons learned from a recent research effort in order to improve the methodology for defining technology requirements for regulators in the future. The biggest shortcoming of the presented research program was the absence of the explicit definition, generation and analysis of potential human-automation architectures. Failure to execute this step in the research process resulted in less efficient evaluation of the candidate prototype technologies in addition to the complete absence of different approaches to human-automation cooperation. For example, all of the prototype technologies that were evaluated in the research program assumed a human-automation architecture that relied on serial processing from the automation to the human. While this type of human-automation architecture is typical across many different technologies and in many different domains, it ignores different architectures where humans and automation work in parallel. Defining potential human-automation architectures a priori also allows regulators to develop scenarios that will stress the performance boundaries of the technology during the evaluation phase. The importance of adding this step of generating and evaluating candidate human-automation architectures prior to formal empirical evaluation is discussed.
Machine learning plus optical flow: a simple and sensitive method to detect cardioactive drugs
NASA Astrophysics Data System (ADS)
Lee, Eugene K.; Kurokawa, Yosuke K.; Tu, Robin; George, Steven C.; Khine, Michelle
2015-07-01
Current preclinical screening methods do not adequately detect cardiotoxicity. Using human induced pluripotent stem cell-derived cardiomyocytes (iPS-CMs), more physiologically relevant preclinical or patient-specific screening to detect potential cardiotoxic effects of drug candidates may be possible. However, one of the persistent challenges for developing a high-throughput drug screening platform using iPS-CMs is the need to develop a simple and reliable method to measure key electrophysiological and contractile parameters. To address this need, we have developed a platform that pairs machine learning with brightfield optical flow as a simple and robust tool that can automate the detection of cardiomyocyte drug effects. Using three cardioactive drugs of different mechanisms, including those with primarily electrophysiological effects, we demonstrate the general applicability of this screening method to detect subtle changes in cardiomyocyte contraction. Requiring only brightfield images of cardiomyocyte contractions, we detect changes in cardiomyocyte contraction comparable to - and even superior to - fluorescence readouts. This automated method serves as a widely applicable screening tool to characterize the effects of drugs on cardiomyocyte function.
Fantuzzo, J. A.; Mirabella, V. R.; Zahn, J. D.
2017-01-01
Synapse formation analyses can be performed by imaging and quantifying fluorescent signals of synaptic markers. Traditionally, these analyses are done using simple or multiple thresholding and segmentation approaches or by labor-intensive manual analysis by a human observer. Here, we describe Intellicount, a high-throughput, fully-automated synapse quantification program which applies a novel machine learning (ML)-based image processing algorithm to systematically improve region of interest (ROI) identification over simple thresholding techniques. Through processing large datasets from both human and mouse neurons, we demonstrate that this approach allows image processing to proceed independently of carefully set thresholds, thus reducing the need for human intervention. As a result, this method can efficiently and accurately process large image datasets with minimal interaction by the experimenter, making it less prone to bias and less liable to human error. Furthermore, Intellicount is integrated into an intuitive graphical user interface (GUI) that provides a set of valuable features, including automated and multifunctional figure generation, routine statistical analyses, and the ability to run full datasets through nested folders, greatly expediting the data analysis process. PMID:29218324
Lee, Eugene K; Tran, David D; Keung, Wendy; Chan, Patrick; Wong, Gabriel; Chan, Camie W; Costa, Kevin D; Li, Ronald A; Khine, Michelle
2017-11-14
Accurately predicting cardioactive effects of new molecular entities for therapeutics remains a daunting challenge. Immense research effort has been focused toward creating new screening platforms that utilize human pluripotent stem cell (hPSC)-derived cardiomyocytes and three-dimensional engineered cardiac tissue constructs to better recapitulate human heart function and drug responses. As these new platforms become increasingly sophisticated and high throughput, the drug screens result in larger multidimensional datasets. Improved automated analysis methods must therefore be developed in parallel to fully comprehend the cellular response across a multidimensional parameter space. Here, we describe the use of machine learning to comprehensively analyze 17 functional parameters derived from force readouts of hPSC-derived ventricular cardiac tissue strips (hvCTS) electrically paced at a range of frequencies and exposed to a library of compounds. A generated metric is effective for then determining the cardioactivity of a given drug. Furthermore, we demonstrate a classification model that can automatically predict the mechanistic action of an unknown cardioactive drug. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Jongin Kim; Boreom Lee
2017-07-01
The classification of neuroimaging data for the diagnosis of Alzheimer's Disease (AD) is one of the main research goals of the neuroscience and clinical fields. In this study, we used an extreme learning machine (ELM) classifier to discriminate AD and mild cognitive impairment (MCI) from normal controls (NC). We compared the performance of ELM with that of a linear kernel support vector machine (SVM) on 718 structural MRI images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The data consisted of normal control, MCI converter (MCI-C), MCI non-converter (MCI-NC), and AD. We employed an SVM-based recursive feature elimination (RFE-SVM) algorithm to find the optimal subset of features. We found that the RFE-SVM feature selection approach in combination with ELM shows classification accuracy superior to that of the linear kernel SVM for structural T1 MRI data.
Machine learning vortices at the Kosterlitz-Thouless transition
NASA Astrophysics Data System (ADS)
Beach, Matthew J. S.; Golubeva, Anna; Melko, Roger G.
2018-01-01
Efficient and automated classification of phases from minimally processed data is one goal of machine learning in condensed-matter and statistical physics. Supervised algorithms trained on raw samples of microstates can successfully detect conventional phase transitions via learning a bulk feature such as an order parameter. In this paper, we investigate whether neural networks can learn to classify phases based on topological defects. We address this question on the two-dimensional classical XY model which exhibits a Kosterlitz-Thouless transition. We find significant feature engineering of the raw spin states is required to convincingly claim that features of the vortex configurations are responsible for learning the transition temperature. We further show a single-layer network does not correctly classify the phases of the XY model, while a convolutional network easily performs classification by learning the global magnetization. Finally, we design a deep network capable of learning vortices without feature engineering. We demonstrate the detection of vortices does not necessarily result in the best classification accuracy, especially for lattices of less than approximately 1000 spins. For larger systems, it remains a difficult task to learn vortices.
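The "feature engineering of the raw spin states" mentioned above can be illustrated with the standard lattice definition of vorticity: sum the angle differences around a plaquette, each wrapped into [-π, π), and divide by 2π. The sketch below assumes a square lattice of spin angles with periodic boundaries; the function names and the traversal orientation (which fixes the sign) are choices of this example, not details given in the abstract.

```python
import math


def wrap(angle):
    """Map an angle difference into the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def plaquette_vorticity(spins, i, j):
    """Integer winding number of the plaquette whose first corner is (i, j).

    `spins` is a square list of lists of angles in radians; indices wrap
    around (periodic boundary conditions).
    """
    size = len(spins)
    i2, j2 = (i + 1) % size, (j + 1) % size
    # Walk the four corners in a fixed order and return to the start.
    corners = [spins[i][j], spins[i][j2], spins[i2][j2], spins[i2][j]]
    total = sum(wrap(corners[(k + 1) % 4] - corners[k]) for k in range(4))
    return round(total / (2 * math.pi))


# Hypothetical 2x2 configuration whose angles rotate by pi/2 per corner.
vortex = [[0.0, math.pi / 2], [3 * math.pi / 2, math.pi]]
print(plaquette_vorticity(vortex, 0, 0))
```

A map of these integers over all plaquettes is the kind of engineered input that makes vortices explicit, in contrast to feeding raw angles to a network.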
Refining fuzzy logic controllers with machine learning
NASA Technical Reports Server (NTRS)
Berenji, Hamid R.
1994-01-01
In this paper, we describe the GARIC (Generalized Approximate Reasoning-Based Intelligent Control) architecture, which learns from its past performance and modifies the labels in the fuzzy rules to improve performance. It uses fuzzy reinforcement learning which is a hybrid method of fuzzy logic and reinforcement learning. This technology can simplify and automate the application of fuzzy logic control to a variety of systems. GARIC has been applied in simulation studies of the Space Shuttle rendezvous and docking experiments. It has the potential of being applied in other aerospace systems as well as in consumer products such as appliances, cameras, and cars.
Jin, Bo; Krishnan, Balu; Adler, Sophie; Wagstyl, Konrad; Hu, Wenhan; Jones, Stephen; Najm, Imad; Alexopoulos, Andreas; Zhang, Kai; Zhang, Jianguo; Ding, Meiping; Wang, Shuang; Wang, Zhong Irene
2018-05-01
Focal cortical dysplasia (FCD) is a major pathology in patients undergoing surgical resection to treat pharmacoresistant epilepsy. Magnetic resonance imaging (MRI) postprocessing methods may provide essential help for detection of FCD. In this study, we utilized surface-based MRI morphometry and machine learning for automated lesion detection in a mixed cohort of patients with FCD type II from 3 different epilepsy centers. Sixty-one patients with pharmacoresistant epilepsy and histologically proven FCD type II were included in the study. The patients had been evaluated at 3 different epilepsy centers using 3 different MRI scanners. A T1-volumetric sequence was used for postprocessing. A normal database was constructed with 120 healthy controls. We also included 35 healthy test controls and 15 disease test controls with histologically confirmed hippocampal sclerosis to assess specificity. Features were calculated and incorporated into a nonlinear neural network classifier, which was trained to identify lesional clusters. We optimized the threshold of the output probability map from the classifier by performing receiver operating characteristic (ROC) analyses. Success of detection was defined by overlap between the final cluster and the manual labeling. Performance was evaluated using k-fold cross-validation. A threshold of 0.9 showed optimal sensitivity of 73.7% and specificity of 90.0%. The area under the curve for the ROC analysis was 0.75, which suggests a discriminative classifier. Sensitivity and specificity were not significantly different for patients from different centers, suggesting robustness of performance. The correct detection rate was significantly lower in patients with initially normal MRI than in patients with unequivocally positive MRI. Subgroup analysis showed the size of the training group and normal control database impacted classifier performance.
Automated surface-based MRI morphometry equipped with machine learning showed robust performance across cohorts from different centers and scanners. The proposed method may be a valuable tool to improve FCD detection in presurgical evaluation for patients with pharmacoresistant epilepsy. Wiley Periodicals, Inc. © 2018 International League Against Epilepsy.
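The threshold optimization described in the abstract above can be sketched as a sweep over candidate cut-offs on the classifier's output probability. The abstract does not state which criterion was optimized, so Youden's J is used here purely as an assumption; the function names and the toy scores are hypothetical.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(scores, labels, candidates):
    """Pick the candidate maximising Youden's J = sensitivity + specificity - 1.

    The criterion is an assumption; the source only says ROC analyses were used.
    """
    def youden(t):
        sens, spec = sensitivity_specificity(scores, labels, t)
        return sens + spec - 1
    return max(candidates, key=youden)


# Toy data (made up): 1 = lesional, 0 = control.
scores = [0.95, 0.92, 0.60, 0.97, 0.40, 0.30, 0.91, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
chosen = best_threshold(scores, labels, [0.5, 0.7, 0.9])
```

On real data the candidate list would typically be every distinct score, and the sweep would be run inside each cross-validation fold so that the threshold is not tuned on test cases.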
Burlina, Philippe; Pacheco, Katia D; Joshi, Neil; Freund, David E; Bressler, Neil M
2017-03-01
When left untreated, age-related macular degeneration (AMD) is the leading cause of vision loss in people over fifty in the US. It is currently estimated that about eight million US individuals have the intermediate stage of AMD, which is often asymptomatic with regard to visual deficit. These individuals are at high risk for progressing to the advanced stage, where the often treatable choroidal neovascular form of AMD can occur. Careful monitoring to detect the onset of the neovascular form, prompt treatment of it, and dietary supplementation can reduce the risk of vision loss from AMD; therefore, preferred practice patterns recommend identifying individuals with the intermediate stage in a timely manner. Past automated retinal image analysis (ARIA) methods applied to fundus imagery have relied on engineered and hand-designed visual features. We instead detail the novel application of a machine learning approach using deep learning for the problem of ARIA and AMD analysis. We use transfer learning and universal features derived from deep convolutional neural networks (DCNN). We address clinically relevant 4-class, 3-class, and 2-class AMD severity classification problems. Using 5664 color fundus images from the NIH AREDS dataset and DCNN universal features, we obtain values for accuracy for the (4-, 3-, 2-) class classification problem of (79.4%, 81.5%, 93.4%) for machine vs. (75.8%, 85.0%, 95.2%) for physician grading. This study demonstrates the efficacy of machine grading based on deep universal features/transfer learning when applied to ARIA and is a promising step in providing a pre-screener to identify individuals with intermediate AMD and also a tool that can facilitate identifying such individuals for clinical studies aimed at developing improved therapies. It also demonstrates comparable performance between computer and physician grading. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Ota, Shunsuke; Deguchi, Daisuke; Kitasaka, Takayuki; Mori, Kensaku; Suenaga, Yasuhito; Hasegawa, Yoshinori; Imaizumi, Kazuyoshi; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi
2008-03-01
This paper presents a method for automated anatomical labeling of bronchial branches (ALBB) extracted from 3D CT datasets. The proposed method constructs classifiers that output anatomical names of bronchial branches by employing the machine-learning approach. We also present its application to a bronchoscopy guidance system. Since the bronchus has a complex tree structure, bronchoscopists can easily become disoriented and lose their way to a target location. A bronchoscopy guidance system is therefore needed to assist bronchoscopists. In such a guidance system, automated presentation of anatomical names is quite useful for bronchoscopy. Although several methods for automated ALBB were reported, most of them constructed models taking only variations of branching patterns into account and did not consider those of running directions. Since the running directions of bronchial branches differ greatly between individuals, these methods could not perform ALBB accurately when the running directions of bronchial branches were different from those of the models. Our method tries to solve such problems by utilizing the machine-learning approach. The procedure consists of three steps: (a) extraction of bronchial tree structures from 3D CT datasets, (b) construction of classifiers using the multi-class AdaBoost technique, and (c) automated classification of bronchial branches by using the constructed classifiers. We applied the proposed method to 51 cases of 3D CT datasets. The constructed classifiers were evaluated by a leave-one-out scheme. The experimental results showed that the proposed method assigned correct anatomical names to 89.1% of bronchial branches up to the segmental lobe branches. We also confirmed that presenting anatomical names of bronchial branches on real bronchoscopic views was quite useful for assisting bronchoscopy.
Counterfeit Electronics Detection Using Image Processing and Machine Learning
NASA Astrophysics Data System (ADS)
Asadizanjani, Navid; Tehranipoor, Mark; Forte, Domenic
2017-01-01
Counterfeiting is an increasing concern for businesses and governments as greater numbers of counterfeit integrated circuits (IC) infiltrate the global market. There is an ongoing effort in experimental and national labs inside the United States to detect and prevent such counterfeits in the most efficient time period. However, there is still a missing piece to automatically detect and properly keep record of detected counterfeit ICs. Here, we introduce a web application database that allows users to share previous examples of counterfeits through an online database and to obtain statistics regarding the prevalence of known defects. We also investigate automated techniques based on image processing and machine learning to detect different physical defects and to determine whether or not an IC is counterfeit.
Large-scale deep learning for robotically gathered imagery for science
NASA Astrophysics Data System (ADS)
Skinner, K.; Johnson-Roberson, M.; Li, J.; Iscar, E.
2016-12-01
With the explosion of computing power, the intelligence and capability of mobile robotics has dramatically increased over the last two decades. Today, we can deploy autonomous robots to achieve observations in a variety of environments ripe for scientific exploration. These platforms are capable of gathering a volume of data previously unimaginable. Additionally, optical cameras, driven by mobile phones and consumer photography, have rapidly improved in size, power consumption, and quality making their deployment cheaper and easier. Finally, in parallel we have seen the rise of large-scale machine learning approaches, particularly deep neural networks (DNNs), increasing the quality of the semantic understanding that can be automatically extracted from optical imagery. In concert this enables new science using a combination of machine learning and robotics. This work will discuss the application of new low-cost high-performance computing approaches and the associated software frameworks to enable scientists to rapidly extract useful science data from millions of robotically gathered images. The automated analysis of imagery on this scale opens up new avenues of inquiry unavailable using more traditional manual or semi-automated approaches. We will use a large archive of millions of benthic images gathered with an autonomous underwater vehicle to demonstrate how these tools enable new scientific questions to be posed.
Fully automated, deep learning segmentation of oxygen-induced retinopathy images
Xiao, Sa; Bucher, Felicitas; Wu, Yue; Rokem, Ariel; Lee, Cecilia S.; Marra, Kyle V.; Fallon, Regis; Diaz-Aguilar, Sophia; Aguilar, Edith; Friedlander, Martin; Lee, Aaron Y.
2017-01-01
Oxygen-induced retinopathy (OIR) is a widely used model to study ischemia-driven neovascularization (NV) in the retina and to serve in proof-of-concept studies in evaluating antiangiogenic drugs for ocular, as well as nonocular, diseases. The primary parameters that are analyzed in this mouse model include the percentage of retina with vaso-obliteration (VO) and NV areas. However, quantification of these two key variables comes with a great challenge due to the requirement of human experts to read the images. Human readers are costly, time-consuming, and subject to bias. Using recent advances in machine learning and computer vision, we trained deep learning neural networks using over a thousand segmentations to fully automate segmentation in OIR images. While determining the percentage area of VO, our algorithm achieved a similar range of correlation coefficients to that of expert inter-human correlation coefficients. In addition, our algorithm achieved a higher range of correlation coefficients compared with inter-expert correlation coefficients for quantification of the percentage area of neovascular tufts. In summary, we have created an open-source, fully automated pipeline for the quantification of key values of OIR images using deep learning neural networks. PMID:29263301
NASA Astrophysics Data System (ADS)
Hengl, Tomislav
2016-04-01
Preliminary results of predicting the distribution of organic soils (Histosols) and soil organic carbon stock (in tonnes per ha) using global compilations of soil profiles (about 150,000 points) and covariates at 250 m spatial resolution (about 150 covariates; mainly MODIS seasonal land products, SRTM DEM derivatives, climatic images, lithological and land cover and landform maps) are presented. We focus on using a data-driven approach, i.e., machine learning techniques that often require no knowledge about the distribution of the target variable or about the possible relationships. Other advantages of using machine learning are (DOI: 10.1371/journal.pone.0125814): All rules required to produce outputs are formalized. The whole procedure is documented (the statistical model and associated computer script), enabling reproducible research. Predicted surfaces can make use of various information sources and can be optimized relative to all available quantitative point and covariate data. There is more flexibility in terms of the spatial extent, resolution and support of requested maps. Automated mapping is also more cost-effective: once the system is operational, maintenance and production of updates are an order of magnitude faster and cheaper. Consequently, prediction maps can be updated and improved at shorter and shorter time intervals. Some disadvantages of automated soil mapping based on machine learning are: Models are data-driven, and any serious blunders or artifacts in the input data can propagate to order-of-magnitude larger errors than in the case of expert-based systems. Fitting machine learning models is computationally an order of magnitude more demanding; computing effort can be tens of thousands of times higher than with, e.g., linear geostatistics.
Many machine learning models are fairly complex and often abstract, so interpreting them is not trivial and requires special multidimensional / multivariable plotting and data mining tools. Results of model fitting using the R packages nnet, randomForest and the h2o software (machine learning functions) show that significant models can be fitted for soil classes, bulk density (R-square 0.76), soil organic carbon (R-square 0.62) and coarse fragments (R-square 0.59). Consequently, we were able to estimate soil organic carbon stock for the majority of the land mask (excluding permanent ice) and detect patches of landscape containing mainly organic soils (peat and similar). Our results confirm that hotspots of soil organic carbon in the tropics are peatlands in Indonesia, the north of Peru, the west Amazon and the Congo river basin. The majority of the world's soil organic carbon stock is likely in the northern latitudes (tundra and taiga of the north). The distribution of Histosols seems to be mainly controlled by climatic conditions (especially temperature regime and water vapor) and hydrologic position in the landscape. Predicted distributions of organic soils (probability of occurrence) and total soil organic carbon stock at resolutions of 1 km and 250 m are available via the SoilGrids.org project homepage.
12 CFR 205.16 - Disclosures at automated teller machines.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 12 Banks and Banking 2 2011-01-01 2011-01-01 false Disclosures at automated teller machines. 205.16 Section 205.16 Banks and Banking FEDERAL RESERVE SYSTEM BOARD OF GOVERNORS OF THE FEDERAL RESERVE SYSTEM ELECTRONIC FUND TRANSFERS (REGULATION E) § 205.16 Disclosures at automated teller machines. (a...
The Historical Evolution of Educational Software.
ERIC Educational Resources Information Center
Troutner, Joanne
This paper establishes the roots of computers and automated teaching in the field of psychology and describes Dr. S. L. Pressey's presentation of the teaching machine; B. F. Skinner's teaching machine; Meyer's steps in composing a program for the automated teaching machine; IBM's beginning research on automated courses and the development of the…
Burgansky-Eliash, Zvia; Wollstein, Gadi; Chu, Tianjiao; Ramsey, Joseph D.; Glymour, Clark; Noecker, Robert J.; Ishikawa, Hiroshi; Schuman, Joel S.
2007-01-01
Purpose: Machine-learning classifiers are trained computerized systems with the ability to detect the relationship between multiple input parameters and a diagnosis. The present study investigated whether the use of machine-learning classifiers improves optical coherence tomography (OCT) glaucoma detection. Methods: Forty-seven patients with glaucoma (47 eyes) and 42 healthy subjects (42 eyes) were included in this cross-sectional study. Of the glaucoma patients, 27 had early disease (visual field mean deviation [MD] ≥ −6 dB) and 20 had advanced glaucoma (MD < −6 dB). Machine-learning classifiers were trained to discriminate between glaucomatous and healthy eyes using parameters derived from OCT output. The classifiers were trained with all 38 parameters as well as with only 8 parameters that correlated best with the visual field MD. Five classifiers were tested: linear discriminant analysis, support vector machine, recursive partitioning and regression tree, generalized linear model, and generalized additive model. For the last two classifiers, a backward feature selection was used to find the minimal number of parameters that resulted in the best and most simple prediction. The cross-validated receiver operating characteristic (ROC) curve and accuracies were calculated. Results: The largest area under the ROC curve (AROC) for glaucoma detection was achieved with the support vector machine using eight parameters (0.981). The sensitivity at 80% and 95% specificity was 97.9% and 92.5%, respectively. This classifier also performed best when judged by cross-validated accuracy (0.966). The best classification between early glaucoma and advanced glaucoma was obtained with the generalized additive model using only three parameters (AROC = 0.854). Conclusions: Automated machine classifiers of OCT data might be useful for enhancing the utility of this technology for detecting glaucomatous abnormality. PMID:16249492
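The area under the ROC curve reported above has a simple interpretation: it is the probability that a randomly chosen diseased eye receives a higher classifier score than a randomly chosen healthy eye. A minimal sketch of that pairwise definition follows; the function name and the toy scores are hypothetical and unrelated to the study's data.

```python
def area_under_roc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy scores (made up): higher means "more likely glaucomatous".
glaucoma_scores = [0.9, 0.8]
healthy_scores = [0.1, 0.85]
auc_value = area_under_roc(glaucoma_scores, healthy_scores)
```

The double loop is quadratic in the number of eyes, which is fine for cohorts of this size; rank-based formulas give the same value faster on large datasets.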
NASA Astrophysics Data System (ADS)
Mølgaard, Lasse L.; Buus, Ole T.; Larsen, Jan; Babamoradi, Hamid; Thygesen, Ida L.; Laustsen, Milan; Munk, Jens Kristian; Dossi, Eleftheria; O'Keeffe, Caroline; Lässig, Lina; Tatlow, Sol; Sandström, Lars; Jakobsen, Mogens H.
2017-05-01
We present a data-driven machine learning approach to detect drug and explosives precursors using colorimetric sensor technology for air sampling. The sensing technology has been developed in the context of the CRIM-TRACK project. At present, a fully integrated portable prototype for air sampling with disposable sensing chips and automated data acquisition has been developed. The prototype allows for fast, user-friendly sampling, which has made it possible to produce large datasets of colorimetric data for different target analytes in laboratory and simulated real-world application scenarios. To make use of the highly multivariate data produced from the colorimetric chip, a number of machine learning techniques are employed to reliably distinguish target analytes from confounders found in the air streams. We demonstrate that a data-driven machine learning method using dimensionality reduction in combination with a probabilistic classifier makes it possible to produce informative features and a high detection rate of analytes. Furthermore, the probabilistic machine learning approach provides a means of automatically identifying unreliable measurements that could produce false predictions. The robustness of the colorimetric sensor has been evaluated in a series of experiments focusing on the amphetamine precursor phenylacetone as well as the improvised explosives precursor hydrogen peroxide. The analysis demonstrates that the system is able to detect analytes in clean air and when mixed with substances that occur naturally in real-world sampling scenarios. The technology under development in CRIM-TRACK has potential as an effective tool for controlling trafficking of illegal drugs, for explosives detection, and for other law enforcement applications.
Machine learning in a graph framework for subcortical segmentation
NASA Astrophysics Data System (ADS)
Guo, Zhihui; Kashyap, Satyananda; Sonka, Milan; Oguz, Ipek
2017-02-01
Automated and reliable segmentation of subcortical structures from human brain magnetic resonance images is of great importance for volumetric and shape analyses in quantitative neuroimaging studies. However, poor boundary contrast and variable shape of these structures make the automated segmentation a tough task. We propose a 3D graph-based machine learning method, called LOGISMOS-RF, to segment the caudate and the putamen from brain MRI scans in a robust and accurate way. An atlas-based tissue classification and bias-field correction method is applied to the images to generate an initial segmentation for each structure. Then a 3D graph framework is utilized to construct a geometric graph for each initial segmentation. A locally trained random forest classifier is used to assign a cost to each graph node. The max-flow algorithm is applied to solve the segmentation problem. Evaluation was performed on a dataset of T1-weighted MRIs of 62 subjects, with 42 images used for training and 20 images for testing. For comparison, FreeSurfer, FSL and BRAINSCut approaches were also evaluated using the same dataset. Dice overlap coefficients and surface-to-surface distances between the automated segmentation and expert manual segmentations indicate the results of our method are statistically significantly more accurate than the three other methods, for both the caudate (Dice: 0.89 +/- 0.03) and the putamen (0.89 +/- 0.03).
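For readers unfamiliar with the Dice overlap coefficient used in the evaluation above, it is twice the number of voxels shared by two segmentations divided by the total number of voxels in both. The sketch below represents each segmentation as a set of voxel coordinates; the function name, the empty-mask convention, and the toy voxels are assumptions for illustration only.

```python
def dice_coefficient(seg_a, seg_b):
    """Dice overlap: 2 * |A intersect B| / (|A| + |B|) for two voxel sets."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # two empty masks agree perfectly (convention assumed here)
    return 2 * len(a & b) / (len(a) + len(b))


# Toy masks (made up): (x, y, z) voxel indices labelled as the structure.
automated = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
manual = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
overlap = dice_coefficient(automated, manual)
```

A value of 1 means identical masks and 0 means no shared voxels, so the reported 0.89 indicates substantial but not perfect agreement with the manual tracing.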
Wu, Mon-Ju; Passos, Ives Cavalcante; Bauer, Isabelle E; Lavagnino, Luca; Cao, Bo; Zunta-Soares, Giovana B; Kapczinski, Flávio; Mwangi, Benson; Soares, Jair C
2016-03-01
Previous studies have reported that patients with bipolar disorder (BD) present with cognitive impairments during mood episodes as well as during the euthymic phase. However, it is still unknown whether reported neurocognitive abnormalities can objectively distinguish individual BD patients from healthy controls (HC). A total of 21 euthymic BD patients and 21 demographically matched HC were included in the current study. Participants performed the computerized Cambridge Neurocognitive Test Automated Battery (CANTAB) to assess cognitive performance. The least absolute shrinkage and selection operator (LASSO) machine learning algorithm was implemented to identify neurocognitive signatures to distinguish individual BD patients from HC. The LASSO machine learning algorithm distinguished individual BD patients from HC with an accuracy of 71% and an area under the receiver operating characteristic curve of 0.7143, significant at p=0.0053. The LASSO algorithm assigned each subject a probability score (0-healthy, 1-patient). Patients with rapid cycling (RC) were assigned increased probability scores as compared to patients without RC. A multivariate pattern of neurocognitive abnormalities comprising the affective Go/No-go and the Cambridge gambling task was relevant in distinguishing individual patients from HC. Our study sample was small as we only considered euthymic BD patients and demographically matched HC. Neurocognitive abnormalities can distinguish individual euthymic BD patients from HC with relatively high accuracy. In addition, patients with RC had more cognitive impairments compared to patients without RC. The predictive neurocognitive signature identified in the current study can potentially be used to provide individualized clinical inferences on BD patients. Copyright © 2015 Elsevier B.V. All rights reserved.
Lannin, Timothy B; Thege, Fredrik I; Kirby, Brian J
2016-10-01
Advances in rare cell capture technology have made possible the interrogation of circulating tumor cells (CTCs) captured from whole patient blood. However, locating captured cells in the device by manual counting bottlenecks data processing by being tedious (hours per sample) and compromises the results by being inconsistent and prone to user bias. Some recent work has been done to automate the cell location and classification process to address these problems, employing image processing and machine learning (ML) algorithms to locate and classify cells in fluorescent microscope images. However, the type of machine learning method used is a part of the design space that has not been thoroughly explored. Thus, we have trained four ML algorithms on three different datasets. The trained ML algorithms locate and classify thousands of possible cells in a few minutes rather than a few hours, representing an order of magnitude increase in processing speed. Furthermore, some algorithms have a significantly (P < 0.05) higher area under the receiver operating characteristic curve than do other algorithms. Additionally, significant (P < 0.05) losses to performance occur when training on cell lines and testing on CTCs (and vice versa), indicating the need to train on a system that is representative of future unlabeled data. Optimal algorithm selection depends on the peculiarities of the individual dataset, indicating the need for careful comparison and optimization of algorithms for individual image classification tasks. © 2016 International Society for Advancement of Cytometry.
Empirical Analysis and Automated Classification of Security Bug Reports
NASA Technical Reports Server (NTRS)
Tyo, Jacob P.
2016-01-01
With the ever-expanding amount of sensitive data being placed into computer systems, the need for effective cybersecurity is of utmost importance. However, there is a shortage of detailed empirical studies of security vulnerabilities from which cybersecurity metrics and best practices could be determined. This thesis has two main research goals: (1) to explore the distribution and characteristics of security vulnerabilities based on the information provided in bug tracking systems and (2) to develop data analytics approaches for automatic classification of bug reports as security or non-security related. This work is based on using three NASA datasets as case studies. The empirical analysis showed that the majority of software vulnerabilities belong only to a small number of types. Addressing these types of vulnerabilities will consequently lead to cost-efficient improvement of software security. Since this analysis requires labeling of each bug report in the bug tracking system, we explored using machine learning to automate the classification of each bug report as security or non-security related (two-class classification), as well as of each security related bug report as a specific security type (multiclass classification). In addition to using supervised machine learning algorithms, a novel unsupervised machine learning approach is proposed. An accuracy of 92%, recall of 96%, precision of 92%, probability of false alarm of 4%, F-Score of 81% and G-Score of 90% were the best results achieved during two-class classification. Furthermore, an accuracy of 80%, recall of 80%, precision of 94%, and F-score of 85% were the best results achieved during multiclass classification.
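The two-class metrics listed above all derive from the four cells of a confusion matrix. The sketch below shows one common set of definitions. The abstract does not define its G-Score, so the harmonic mean of recall and (1 - probability of false alarm) used here is an assumption, as are the function name and the toy counts.

```python
def two_class_metrics(tp, fp, fn, tn):
    """Compute common two-class metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)            # also called probability of detection
    precision = tp / (tp + fp)
    false_alarm = fp / (fp + tn)       # probability of false alarm
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_score = 2 * precision * recall / (precision + recall)
    # Assumed definition: harmonic mean of recall and (1 - false_alarm).
    g_score = 2 * recall * (1 - false_alarm) / (recall + (1 - false_alarm))
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "false_alarm": false_alarm,
        "f_score": f_score,
        "g_score": g_score,
    }


# Toy counts (made up): 10 security reports and 90 non-security reports.
metrics = two_class_metrics(tp=8, fp=2, fn=2, tn=88)
```

Because the F-Score is a harmonic mean, it always lies between precision and recall when both come from the same run; figures reported as separate "best results" may therefore come from different classifier configurations.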
Automatic classification of protein structures using physicochemical parameters.
Mohan, Abhilash; Rao, M Divya; Sunderrajan, Shruthi; Pennathur, Gautam
2014-09-01
Protein classification is the first step to functional annotation; SCOP and Pfam databases are currently the most relevant protein classification schemes. However, the disproportion in the number of three dimensional (3D) protein structures generated versus their classification into relevant superfamilies/families emphasizes the need for automated classification schemes. Predicting function of novel proteins based on sequence information alone has proven to be a major challenge. The present study focuses on the use of physicochemical parameters in conjunction with machine learning algorithms (Naive Bayes, Decision Trees, Random Forest and Support Vector Machines) to classify proteins into their respective SCOP superfamily/Pfam family, using sequence derived information. Spectrophores™, a 1D descriptor of the 3D molecular field surrounding a structure was used as a benchmark to compare the performance of the physicochemical parameters. The machine learning algorithms were modified to select features based on information gain for each SCOP superfamily/Pfam family. The effect of combining physicochemical parameters and spectrophores on classification accuracy (CA) was studied. Machine learning algorithms trained with the physicochemical parameters consistently classified SCOP superfamilies and Pfam families with a classification accuracy above 90%, while spectrophores performed with a CA of around 85%. Feature selection improved classification accuracy for both physicochemical parameters and spectrophores based machine learning algorithms. Combining both attributes resulted in a marginal loss of performance. Physicochemical parameters were able to classify proteins from both schemes with classification accuracy ranging from 90-96%. These results suggest the usefulness of this method in classifying proteins from amino acid sequences.
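Feature selection by information gain, as mentioned in the abstract above, ranks each feature by how much knowing its value reduces the entropy of the class label. A minimal sketch for discrete features follows; the function names and the toy data are hypothetical, and continuous physicochemical parameters would first need to be binned, a step not shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (made up): two protein families and two candidate features.
families = ["A", "A", "B", "B"]
informative = ["high", "high", "low", "low"]    # separates the families
uninformative = ["high", "low", "high", "low"]  # independent of the family
gain_good = information_gain(informative, families)
gain_bad = information_gain(uninformative, families)
```

Features would then be sorted by gain and the top-scoring ones kept for each superfamily or family before training the classifier.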
Automated Creation of Labeled Pointcloud Datasets in Support of Machine-Learning Based Perception
2017-12-01
computationally intensive 3D vector math and took more than ten seconds to segment a single LIDAR frame from the HDL-32e with the Dell XPS15 9650’s Intel...Core i7 CPU. Depth Clustering avoids the computationally intensive 3D vector math of Euclidean Clustering-based DON segmentation and, instead
Managing Quality, Identity and Adversaries in Public Discourse with Machine Learning
ERIC Educational Resources Information Center
Brennan, Michael
2012-01-01
Automation can mitigate issues when scaling and managing quality and identity in public discourse on the web. Discourse needs to be curated and filtered. Anonymous speech has to be supported while handling adversaries. Reliance on human curators or analysts does not scale and content can be missed. These scaling and management issues include the…
ERIC Educational Resources Information Center
Mu, Jin; Stegmann, Karsten; Mayfield, Elijah; Rose, Carolyn; Fischer, Frank
2012-01-01
Research related to online discussions frequently faces the problem of analyzing huge corpora. Natural Language Processing (NLP) technologies may allow automating this analysis. However, the state-of-the-art in machine learning and text mining approaches yields models that do not transfer well between corpora related to different topics. Also,…
ICE: An Automated Tool for Teaching Advanced C Programming
ERIC Educational Resources Information Center
Gonzalez, Ruben
2017-01-01
There are many difficulties with learning and teaching programming that can be alleviated with the use of software tools. Most of these tools have focused on the teaching of introductory programming concepts where commonly code fragments or small user programs are run in a sandbox or virtual machine, often in the cloud. These do not permit user…
Nandi, Sutanu; Subramanian, Abhishek; Sarkar, Ram Rup
2017-07-25
Prediction of essential genes helps to identify a minimal set of genes that are absolutely required for the appropriate functioning and survival of a cell. The available machine learning techniques for essential gene prediction have inherent problems, like imbalanced provision of training datasets, biased choice of the best model for a given balanced dataset, choice of a complex machine learning algorithm, and data-based automated selection of biologically relevant features for classification. Here, we propose a simple support vector machine-based learning strategy for the prediction of essential genes in Escherichia coli K-12 MG1655 metabolism that integrates a non-conventional combination of an appropriate sample balanced training set, a unique organism-specific genotype, phenotype attributes that characterize essential genes, and optimal parameters of the learning algorithm to generate the best machine learning model (the model with the highest accuracy among all the models trained for different sample training sets). For the first time, we also introduce flux-coupled metabolic subnetwork-based features for enhancing the classification performance. Our strategy proves to be superior as compared to previous SVM-based strategies in obtaining a biologically relevant classification of genes with high sensitivity and specificity. This methodology was also trained with datasets of other recent supervised classification techniques for essential gene classification and tested using reported test datasets. The testing accuracy was always high as compared to the known techniques, proving that our method outperforms known methods. Observations from our study indicate that essential genes are conserved among homologous bacterial species, demonstrate high codon usage bias, GC content and gene expression, and predominantly possess a tendency to form physiological flux modules in metabolism.
InPRO: Automated Indoor Construction Progress Monitoring Using Unmanned Aerial Vehicles
NASA Astrophysics Data System (ADS)
Hamledari, Hesam
In this research, an envisioned automated intelligent robotic solution for automated indoor data collection and inspection that employs a series of unmanned aerial vehicles (UAV), entitled "InPRO", is presented. InPRO consists of four stages, namely: 1) automated path planning; 2) autonomous UAV-based indoor inspection; 3) automated computer vision-based assessment of progress; and, 4) automated updating of 4D building information models (BIM). The works presented in this thesis address the third stage of InPRO. A series of computer vision-based methods that automate the assessment of construction progress using images captured at indoor sites are introduced. The proposed methods employ computer vision and machine learning techniques to detect the components of under-construction indoor partitions. In particular, framing (studs), insulation, electrical outlets, and different states of drywall sheets (installing, plastering, and painting) are automatically detected using digital images. High accuracy rates, real-time performance, and operation without a priori information are indicators of the methods' promising performance.
Support patient search on pathology reports with interactive online learning based data extraction.
Zheng, Shuai; Lu, James J; Appin, Christina; Brat, Daniel; Wang, Fusheng
2015-01-01
Structural reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entries, information in pathology reports remains primarily in narrative free text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often constructed with a fixed training dataset and not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable semi-automated data extraction system, which can adjust and self-improve by learning from a user's interaction with minimal human effort. We have developed an online machine learning based information extraction system called IDEAL-X. With its graphical user interface, the system's data extraction engine automatically annotates values for users to review upon loading each report text. The system analyzes users' corrections regarding these annotations with online machine learning, and incrementally enhances and refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be applied to conveniently define queries based on extracted structured data. We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographical data, diagnosis, genetic marker, and procedure. The system achieves F-1 scores of around 95% for the majority of tests.
Extracting data from pathology reports could enable more accurate knowledge to support biomedical research and clinical diagnosis. IDEAL-X provides a bridge that takes advantage of online machine learning based data extraction and the knowledge from human feedback. By combining iterative online learning and adaptive controlled vocabularies, IDEAL-X can deliver highly adaptive and accurate data extraction to support patient search.
NASA Astrophysics Data System (ADS)
Steinberg, P. D.; Brener, G.; Duffy, D.; Nearing, G. S.; Pelissier, C.
2017-12-01
Hyperparameterization of statistical models, i.e. automated model scoring and selection using methods such as evolutionary algorithms, grid searches, and randomized searches, can improve forecast model skill by reducing errors associated with model parameterization, model structure, and statistical properties of training data. Ensemble Learning Models (Elm), and the related Earthio package, provide a flexible interface for automating the selection of parameters and model structure for machine learning models common in climate science and land cover classification, offering convenient tools for loading NetCDF, HDF, Grib, or GeoTiff files, decomposition methods like PCA and manifold learning, and parallel training and prediction with unsupervised and supervised classification, clustering, and regression estimators. Continuum Analytics is using Elm to experiment with statistical soil moisture forecasting based on meteorological forcing data from NASA's North American Land Data Assimilation System (NLDAS). There, Elm is using the NSGA-2 multiobjective optimization algorithm to optimize statistical preprocessing of forcing data to improve goodness-of-fit for statistical models (i.e. feature engineering). This presentation will discuss Elm and its components, including dask (distributed task scheduling), xarray (data structures for n-dimensional arrays), and scikit-learn (statistical preprocessing, clustering, classification, regression), and it will show how NSGA-2 is being used to automate the selection of soil moisture forecast statistical models for North America.
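The grid and randomized searches named in the abstract above can be illustrated with a minimal, library-free sketch. It does not use Elm's API, which the abstract does not describe; the function names `grid_search` and `random_search`, the parameter names `alpha` and `depth`, and the toy scoring function are all hypothetical and chosen only for illustration.

```python
import itertools
import random

def grid_search(score, grid):
    """Evaluate every parameter combination and return (best_params, best_score)."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score

def random_search(score, grid, n_iter, seed=0):
    """Evaluate n_iter randomly drawn combinations and return (best_params, best_score)."""
    rng = random.Random(seed)
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {n: rng.choice(grid[n]) for n in names}
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score

def toy_score(params):
    # Hypothetical stand-in for a cross-validated model score;
    # it peaks at alpha=0.1, depth=3 (higher is better).
    return -((params["alpha"] - 0.1) ** 2) - (params["depth"] - 3) ** 2

GRID = {"alpha": [0.01, 0.1, 1.0], "depth": [1, 3, 5]}

grid_best, grid_best_score = grid_search(toy_score, GRID)
rand_best, rand_best_score = random_search(toy_score, GRID, n_iter=5)
```

A grid search visits all nine combinations here, whereas the randomized search samples only five of them, so its best score can never exceed the grid result. An evolutionary method such as NSGA-2 would instead evolve a population of candidates against several objectives, which this sketch does not attempt.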
NMRNet: A deep learning approach to automated peak picking of protein NMR spectra.
Klukowski, Piotr; Augoff, Michal; Zieba, Maciej; Drwal, Maciej; Gonczarek, Adam; Walczak, Michal J
2018-03-14
Automated selection of signals in protein NMR spectra, known as peak picking, has been studied for over 20 years; nevertheless, existing peak picking methods are still largely deficient. Accurate and precise automated peak picking would accelerate the structure calculation, and analysis of dynamics and interactions of macromolecules. Recent advances in handling big data, together with a surge of machine learning techniques, offer an opportunity to tackle the peak picking problem substantially faster than manual picking and on par with human accuracy. In particular, deep learning has proven to systematically achieve human-level performance in various recognition tasks, and thus emerges as an ideal tool to address automated identification of NMR signals. We have applied a convolutional neural network for visual analysis of multidimensional NMR spectra. A comprehensive test on 31 manually-annotated spectra has demonstrated top-tier average precision (AP) of 0.9596, 0.9058 and 0.8271 for backbone, side-chain and NOESY spectra, respectively. Furthermore, a combination of extracted peak lists with automated assignment routine, FLYA, outperformed other methods, including the manual one, and led to correct resonance assignment at the levels of 90.40%, 89.90% and 90.20% for three benchmark proteins. The proposed model is part of the Dumpling software (a platform for protein NMR data analysis) and is available at https://dumpling.bio/. Contact: michaljerzywalczak@gmail.com, piotr.klukowski@pwr.edu.pl. Supplementary data are available at Bioinformatics online.
NASA Astrophysics Data System (ADS)
Wang, Ke; Guo, Ping; Luo, A.-Li
2017-03-01
Spectral feature extraction is a crucial procedure in automated spectral analysis. This procedure starts from the spectral data and produces informative and non-redundant features, facilitating the subsequent automated processing and analysis with machine-learning and data-mining techniques. In this paper, we present a new automated feature extraction method for astronomical spectra, with application in spectral classification and defective spectra recovery. The basic idea of our approach is to train a deep neural network to extract features of spectra with different levels of abstraction in different layers. The deep neural network is trained with a fast layer-wise learning algorithm in an analytical way without any iterative optimization procedure. We evaluate the performance of the proposed scheme on real-world spectral data. The results demonstrate that our method is superior regarding its comprehensive performance, and the computational cost is significantly lower than that for other methods. The proposed method can be regarded as a new valid alternative general-purpose feature extraction method for various tasks in spectral data analysis.
Integrating human and machine intelligence in galaxy morphology classification tasks
NASA Astrophysics Data System (ADS)
Beck, Melanie R.; Scarlata, Claudia; Fortson, Lucy F.; Lintott, Chris J.; Simmons, B. D.; Galloway, Melanie A.; Willett, Kyle W.; Dickinson, Hugh; Masters, Karen L.; Marshall, Philip J.; Wright, Darryl
2018-06-01
Quantifying galaxy morphology is a challenging yet scientifically rewarding task. As the scale of data continues to increase with upcoming surveys, traditional classification methods will struggle to handle the load. We present a solution through an integration of visual and automated classifications, preserving the best features of both human and machine. We demonstrate the effectiveness of such a system through a re-analysis of visual galaxy morphology classifications collected during the Galaxy Zoo 2 (GZ2) project. We reprocess the top-level question of the GZ2 decision tree with a Bayesian classification aggregation algorithm dubbed SWAP, originally developed for the Space Warps gravitational lens project. Through a simple binary classification scheme, we increase the classification rate nearly 5-fold classifying 226 124 galaxies in 92 d of GZ2 project time while reproducing labels derived from GZ2 classification data with 95.7 per cent accuracy. We next combine this with a Random Forest machine learning algorithm that learns on a suite of non-parametric morphology indicators widely used for automated morphologies. We develop a decision engine that delegates tasks between human and machine and demonstrate that the combined system provides at least a factor of 8 increase in the classification rate, classifying 210 803 galaxies in just 32 d of GZ2 project time with 93.1 per cent accuracy. As the Random Forest algorithm requires a minimal amount of computational cost, this result has important implications for galaxy morphology identification tasks in the era of Euclid and other large-scale surveys.
A neural network controller for automated composite manufacturing
NASA Technical Reports Server (NTRS)
Lichtenwalner, Peter F.
1994-01-01
At McDonnell Douglas Aerospace (MDA), an artificial neural network based control system has been developed and implemented to control laser heating for the fiber placement composite manufacturing process. This neurocontroller learns an approximate inverse model of the process on-line to provide performance that improves with experience and exceeds that of conventional feedback control techniques. When untrained, the control system behaves as a proportional plus integral (PI) controller. However after learning from experience, the neural network feedforward control module provides control signals that greatly improve temperature tracking performance. Faster convergence to new temperature set points and reduced temperature deviation due to changing feed rate have been demonstrated on the machine. A Cerebellar Model Articulation Controller (CMAC) network is used for inverse modeling because of its rapid learning performance. This control system is implemented in an IBM compatible 386 PC with an A/D board interface to the machine.
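The control structure described in the abstract above, a proportional plus integral (PI) loop with an added feedforward signal, can be sketched as follows. This is only an illustration under stated assumptions: the class name `PIController`, the gains, and the time step are hypothetical, and the feedforward value is passed in as a plain number rather than produced by a trained CMAC network. With `feedforward=0.0` the controller reduces to a plain PI law, mirroring the untrained behavior the abstract mentions.

```python
class PIController:
    """Discrete-time PI controller with an optional additive feedforward term."""

    def __init__(self, kp, ki, dt):
        self.kp = kp          # proportional gain (hypothetical value in the demo)
        self.ki = ki          # integral gain (hypothetical value in the demo)
        self.dt = dt          # sample period in seconds
        self.integral = 0.0   # running integral of the error

    def update(self, setpoint, measurement, feedforward=0.0):
        # Feedback part: proportional and integral action on the tracking error.
        error = setpoint - measurement
        self.integral += error * self.dt
        feedback = self.kp * error + self.ki * self.integral
        # Feedforward part: in the abstract this comes from a learned inverse
        # model; here it is simply a number supplied by the caller.
        return feedback + feedforward

controller = PIController(kp=2.0, ki=1.0, dt=0.5)
u1 = controller.update(setpoint=1.0, measurement=0.0)                    # pure PI
u2 = controller.update(setpoint=1.0, measurement=0.0)                    # integral grows
u3 = controller.update(setpoint=1.0, measurement=0.0, feedforward=0.5)   # PI + feedforward
```

Keeping the feedforward term additive means the feedback loop keeps working unchanged while a learned model gradually takes over most of the control effort.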
NASA Astrophysics Data System (ADS)
Dubey, Kavita; Srivastava, Vishal; Singh Mehta, Dalip
2018-04-01
Early identification of fungal infection on the human scalp is crucial for avoiding hair loss. The diagnosis of fungal infection on the human scalp is based on a visual assessment by trained experts or doctors. Optical coherence tomography (OCT) has the ability to capture fungal infection information from the human scalp with a high resolution. In this study, we present a fully automated, non-contact, non-invasive optical method for rapid detection of fungal infections based on the extracted features from A-line and B-scan images of OCT. A multilevel ensemble machine model is designed to perform automated classification, which shows the superiority of our classifier to the best classifier based on the features extracted from OCT images. In this study, 60 samples (30 fungal, 30 normal) were imaged by OCT and eight features were extracted. The classification algorithm had an average sensitivity, specificity and accuracy of 92.30, 90.90 and 91.66%, respectively, for identifying fungal and normal human scalps. This remarkable classifying ability makes the proposed model readily applicable to classifying the human scalp.
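For readers unfamiliar with the three figures quoted in the abstract above, the sketch below shows how sensitivity, specificity, and accuracy follow from a binary confusion matrix. The counts used in the example are hypothetical and are not taken from the study, which reports averaged values only.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # fraction of infected samples detected
    specificity = tn / (tn + fp)               # fraction of normal samples cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all samples correct
    return sensitivity, specificity, accuracy

# Hypothetical counts for one evaluation split (illustration only).
sens, spec, acc = binary_metrics(tp=12, fn=1, tn=10, fp=1)
percentages = tuple(round(100 * m, 2) for m in (sens, spec, acc))
```

Reporting all three together matters because accuracy alone can hide an imbalance between missed infections and false alarms.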
Chang, Ni-Bin; Bai, Kaixu; Chen, Chi-Farn
2017-10-01
Monitoring water quality changes in lakes, reservoirs, estuaries, and coastal waters is critical in response to the needs for sustainable development. This study develops a remote sensing-based multiscale modeling system by integrating multi-sensor satellite data merging and image reconstruction algorithms in support of feature extraction with machine learning, with the aim of automating continuous water quality monitoring in environmentally sensitive regions. This new Earth observation platform, termed "cross-mission data merging and image reconstruction with machine learning" (CDMIM), is capable of merging imagery from multiple satellites to provide daily water quality monitoring through a series of image processing, enhancement, reconstruction, and data mining/machine learning techniques. Two existing key algorithms, including Spectral Information Adaptation and Synthesis Scheme (SIASS) and SMart Information Reconstruction (SMIR), are highlighted to support feature extraction and content-based mapping. Whereas SIASS can support various data merging efforts to merge images collected from cross-mission satellite sensors, SMIR can overcome data gaps by reconstructing the information of value-missing pixels due to impacts such as cloud obstruction. Practical implementation of CDMIM was assessed by predicting the water quality over seasons in terms of the concentrations of nutrients and chlorophyll-a, as well as water clarity in Lake Nicaragua, providing synergistic efforts to better monitor the aquatic environment and offer insightful lake watershed management strategies. Copyright © 2017 Elsevier Ltd. All rights reserved.
He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; de Vries, Theo
2017-03-01
Patients' narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms (decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model) were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners' diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients' self-expression behavior, thus helping clinicians identify potential patients from an early stage.
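The n-gram representation mentioned in the abstract above can be sketched in a few lines. The tokenization (lowercasing and whitespace splitting), the function names, and the example sentence are assumptions made for illustration; the study's actual preprocessing is not described in the abstract.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_counts(text, n_values=(1, 2)):
    """Count unigrams and bigrams (by default) in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        counts.update(ngrams(tokens, n))
    return counts

# Invented example sentence, not drawn from any patient narrative.
counts = ngram_counts("I could not sleep I could not eat")
```

The resulting counts would serve as the feature vector handed to a classifier; unigrams capture single words, while bigrams and longer multigrams retain short phrases such as negations.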
NASA Astrophysics Data System (ADS)
Diaz, Kristians; Castaneda, Benjamin
2008-03-01
This paper presents a semi-automated algorithm for prostate boundary segmentation from three-dimensional (3D) ultrasound (US) images. The US volume is sampled into 72 slices which go through the center of the prostate gland and are separated at a uniform angular spacing of 2.5 degrees. The approach requires the user to select four points from slices (at 0, 45, 90 and 135 degrees) which are used to initialize a discrete dynamic contour (DDC) algorithm. Four support vector machines (SVMs) are trained over the output of the DDC and classify the rest of the slices. The output of the SVMs is refined using binary morphological operations and DDC to produce the final result. The algorithm was tested on seven ex vivo 3D US images of prostate glands embedded in an agar mold. Results show good agreement with manual segmentation.
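The slice geometry in the abstract above can be checked with a short calculation: 72 slices at 2.5 degrees cover 0 to 177.5 degrees, and the four user-initialized slices fall at regular positions within that set. The function and constant names below are hypothetical; only the numbers 72, 2.5, and the four angles come from the abstract.

```python
N_SLICES = 72                      # number of radial slices (from the abstract)
SPACING_DEG = 2.5                  # uniform angular spacing (from the abstract)
INIT_ANGLES_DEG = (0, 45, 90, 135)  # slices initialized by the user

def slice_angles(n_slices=N_SLICES, spacing=SPACING_DEG):
    """Angle in degrees of every radial slice through the gland center."""
    return [i * spacing for i in range(n_slices)]

def slice_index(angle_deg, spacing=SPACING_DEG):
    """Index of the slice located at a given angle."""
    return int(round(angle_deg / spacing))

angles = slice_angles()
init_indices = [slice_index(a) for a in INIT_ANGLES_DEG]
remaining = N_SLICES - len(init_indices)  # slices left for automatic processing
```

The initialized slices sit 18 indices apart, so no remaining slice is more than 9 positions (22.5 degrees) away from a user-initialized one.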
Manifold learning in machine vision and robotics
NASA Astrophysics Data System (ADS)
Bernstein, Alexander
2017-02-01
Smart algorithms are used in machine vision and robotics to organize or extract high-level information from the available data. Nowadays, machine learning is an essential and ubiquitous tool for automating the extraction of patterns or regularities from data (images in machine vision; camera, laser, and sonar sensor data in robotics) in order to solve various subject-oriented tasks such as understanding and classifying image content, navigating mobile autonomous robots in uncertain environments, robot manipulation in medical robotics and computer-assisted surgery, and others. Such data usually have high dimensionality; however, due to various dependencies between their components and constraints imposed by physical reasons, all "feasible and usable data" occupy only a very small part of the high-dimensional "observation space" and have a smaller intrinsic dimensionality. The generally accepted model of such data is the manifold model, according to which the data lie on or near an unknown manifold (surface) of lower dimensionality embedded in an ambient high-dimensional observation space; real-world high-dimensional data obtained from "natural" sources meet, as a rule, this model. The use of manifold learning techniques in machine vision and robotics, which discover the low-dimensional structure of high-dimensional data and result in effective algorithms for solving a large number of subject-oriented tasks, is the subject of the conference plenary speech, some topics of which are covered in this paper.
Deep machine learning provides state-of-the-art performance in image-based plant phenotyping.
Pound, Michael P; Atkinson, Jonathan A; Townsend, Alexandra J; Wilson, Michael H; Griffiths, Marcus; Jackson, Aaron S; Bulat, Adrian; Tzimiropoulos, Georgios; Wells, Darren M; Murchie, Erik H; Pridmore, Tony P; French, Andrew P
2017-10-01
In plant phenotyping, it has become important to be able to measure many features on large image sets in order to aid genetic discovery. The size of the datasets, now often captured robotically, often precludes manual inspection, hence the motivation for finding a fully automated approach. Deep learning is an emerging field that promises unparalleled results on many data analysis problems. Building on artificial neural networks, deep approaches have many more hidden layers in the network, and hence have greater discriminative and predictive power. We demonstrate the use of such approaches as part of a plant phenotyping pipeline. We show the success offered by such techniques when applied to the challenging problem of image-based plant phenotyping and demonstrate state-of-the-art results (>97% accuracy) for root and shoot feature identification and localization. We use fully automated trait identification using deep learning to identify quantitative trait loci in root architecture datasets. The majority (12 out of 14) of manually identified quantitative trait loci were also discovered using our automated approach based on deep learning detection to locate plant features. We have shown deep learning-based phenotyping to have very good detection and localization accuracy in validation and testing image sets. We have shown that such features can be used to derive meaningful biological traits, which in turn can be used in quantitative trait loci discovery pipelines. This process can be completely automated. We predict a paradigm shift in image-based phenotyping brought about by such deep learning approaches, given sufficient training sets. © The Authors 2017. Published by Oxford University Press.
On the impact of approximate computation in an analog DeSTIN architecture.
Young, Steven; Lu, Junjie; Holleman, Jeremy; Arel, Itamar
2014-05-01
Deep machine learning (DML) holds the potential to revolutionize machine learning by automating rich feature extraction, which has become the primary bottleneck of human engineering in pattern recognition systems. However, the heavy computational burden renders DML systems implemented on conventional digital processors impractical for large-scale problems. The highly parallel computations required to implement large-scale deep learning systems are well suited to custom hardware. Analog computation has demonstrated power efficiency advantages of multiple orders of magnitude relative to digital systems while performing nonideal computations. In this paper, we investigate typical error sources introduced by analog computational elements and their impact on system-level performance in DeSTIN--a compositional deep learning architecture. These inaccuracies are evaluated on a pattern classification benchmark, clearly demonstrating the robustness of the underlying algorithm to the errors introduced by analog computational elements. A clear understanding of the impacts of nonideal computations is necessary to fully exploit the efficiency of analog circuits.
NASA Astrophysics Data System (ADS)
Bonfanti, C. E.; Stewart, J.; Lee, Y. J.; Govett, M.; Trailovic, L.; Etherton, B.
2017-12-01
One of the goals of the National Oceanic and Atmospheric Administration (NOAA) is to provide timely and reliable weather forecasts to support important decisions when and where people need them for safety, emergencies, and planning of day-to-day activities. Satellite data is essential for areas lacking in-situ observations for use as initial conditions in Numerical Weather Prediction (NWP) models, such as spans of the ocean or remote areas of land. Currently only about 7% of total received satellite data is selected for use, and of that, an even smaller percentage is ever assimilated into NWP models. With machine learning, the computational and time costs needed for satellite data selection can be greatly reduced. We study various machine learning approaches to process orders of magnitude more satellite data in significantly less time, allowing for a greater quantity and more intelligent selection of data to be used for assimilation purposes. Given the satellite launches planned for the upcoming years, machine learning can be applied for better selection of Regions of Interest (ROI) in the orders of magnitude more satellite data that will be received. This paper discusses the background of machine learning methods as applied to weather forecasting and the challenges of creating a "labeled dataset" for training and testing purposes. In the training stage of supervised machine learning, labeled data are important to identify a ROI as either true or false so that the model knows what signatures in satellite data to identify. The authors have selected cyclones, including tropical cyclones and mid-latitude lows, as ROI for their machine learning purposes and created a labeled dataset of true or false for ROI from Global Forecast System (GFS) reanalysis data. A dataset like this does not yet exist and, given the need for a high quantity of samples, it was decided this was best done with automation.
This process was done by developing a program similar to the National Center for Environmental Prediction (NCEP) tropical cyclone tracker by Marchok that was used to identify cyclones based on their physical characteristics. We will discuss the methods and challenges of creating this dataset and the dataset's use for our current supervised machine learning model as well as use for future work on events such as convection initiation.
Wang, Zhuo; Camino, Acner; Hagag, Ahmed M; Wang, Jie; Weleber, Richard G; Yang, Paul; Pennesi, Mark E; Huang, David; Li, Dengwang; Jia, Yali
2018-05-01
Optical coherence tomography (OCT) can demonstrate early deterioration of the photoreceptor integrity caused by inherited retinal degeneration diseases (IRDs). A machine learning method based on random forests was developed to automatically detect continuous areas of preserved ellipsoid zone structure (an easily recognizable part of the photoreceptors on OCT) in 16 eyes of patients with choroideremia (a type of IRD). Pseudopodial extensions protruding from the preserved ellipsoid zone areas are detected separately by a local active contour routine. The algorithm is implemented on en face images with minimum segmentation requirements, only needing delineation of the Bruch's membrane, thus evading the inaccuracies and technical challenges associated with automatic segmentation of the ellipsoid zone in eyes with severe retinal degeneration. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Technical Reports Server (NTRS)
Wild, Christian; Eckhardt, Dave
1987-01-01
The development of a methodology for the production of highly reliable software is one of the greatest challenges facing the computer industry. Meeting this challenge will undoubtedly involve the integration of many technologies. This paper describes the use of Artificial Intelligence technologies in the automated analysis of the formal algebraic specifications of abstract data types. These technologies include symbolic execution of specifications using techniques of automated deduction and machine learning through the use of examples. On-going research into the role of knowledge representation and problem solving in the process of developing software is also discussed.
Classification of Variable Objects in Massive Sky Monitoring Surveys
NASA Astrophysics Data System (ADS)
Woźniak, Przemek; Wyrzykowski, Łukasz; Belokurov, Vasily
2012-03-01
The era of great sky surveys is upon us. Over the past decade we have seen rapid progress toward a continuous photometric record of the optical sky. Numerous sky surveys are discovering and monitoring variable objects by the hundreds of thousands. Advances in detector, computing, and networking technology are driving applications of all shapes and sizes ranging from small all sky monitors, through networks of robotic telescopes of modest size, to big glass facilities equipped with giga-pixel CCD mosaics. The Large Synoptic Survey Telescope will be the first peta-scale astronomical survey [18]. It will expand the volume of the parameter space available to us by three orders of magnitude and explore the mutable heavens down to an unprecedented level of sensitivity. Proliferation of large, multidimensional astronomical data sets is stimulating the work on new methods and tools to handle the identification and classification challenge [3]. Given exponentially growing data rates, automated classification of variability types is quickly becoming a necessity. Taking humans out of the loop not only eliminates the subjective nature of visual classification, but is also an enabling factor for time-critical applications. Full automation is especially important for studies of explosive phenomena such as γ-ray bursts that require rapid follow-up observations before the event is over. While there is a general consensus that machine learning will provide a viable solution, the available algorithmic toolbox remains underutilized in astronomy by comparison with other fields such as genomics or market research. Part of the problem is the nature of astronomical data sets that tend to be dominated by a variety of irregularities. Not all algorithms can gracefully handle uneven time sampling, missing features, or sparsely populated high-dimensional spaces.
More sophisticated algorithms and better tools available in standard software packages are required to facilitate the adoption of machine learning in astronomy. The goal of this chapter is to show a number of successful applications of state-of-the-art machine learning methodology to time-resolved astronomical data, illustrate what is possible today, and help identify areas for further research and development. After a brief comparison of the utility of various machine learning classifiers, the discussion focuses on support vector machines (SVM), neural nets, and self-organizing maps. Traditionally, to detect and classify transient variability astronomers used ad hoc scan statistics. These methods will remain important as feature extractors for input into generic machine learning algorithms. Experience shows that the performance of machine learning tools on astronomical data critically depends on the definition and quality of the input features, and that a considerable amount of preprocessing is required before standard algorithms can be applied. However, with continued investments of effort by a growing number of astro-informatics savvy computer scientists and astronomers the much-needed expertise and infrastructure are growing faster than ever.
Building "e-rater"® Scoring Models Using Machine Learning Methods. Research Report. ETS RR-16-04
ERIC Educational Resources Information Center
Chen, Jing; Fife, James H.; Bejar, Isaac I.; Rupp, André A.
2016-01-01
The "e-rater"® automated scoring engine used at Educational Testing Service (ETS) scores the writing quality of essays. In the current practice, e-rater scores are generated via a multiple linear regression (MLR) model as a linear combination of various features evaluated for each essay and human scores as the outcome variable. This…
ERIC Educational Resources Information Center
Collis, Betty; Muir, Walter
The first of four major sections in this report presents an overview of the background and evolution of computer applications to learning and teaching. It begins with the early attempts toward "automated teaching" of the 1920s, and the "teaching machines" of B. F. Skinner of the 1940s through the 1960s. It then traces the…
Literature classification for semi-automated updating of biological knowledgebases
2013-01-01
Background: As the output of biological assays increases in resolution and volume, the body of specialized biological data, such as functional annotations of gene and protein sequences, enables extraction of higher-level knowledge needed for practical application in bioinformatics. Whereas common types of biological data, such as sequence data, are extensively stored in biological databases, functional annotations, such as immunological epitopes, are found primarily in semi-structured formats or free text embedded in primary scientific literature. Results: We defined and applied a machine learning approach for literature classification to support updating of TANTIGEN, a knowledgebase of tumor T-cell antigens. Abstracts from PubMed were downloaded and classified as either "relevant" or "irrelevant" for database update. Training and five-fold cross-validation of a k-NN classifier on 310 abstracts yielded classification accuracy of 0.95, thus showing significant value in support of data extraction from the literature. Conclusion: We here propose a conceptual framework for semi-automated extraction of epitope data embedded in scientific literature using principles from text mining and machine learning. The addition of such data will aid in the transition of biological databases to knowledgebases. PMID:24564403
Motwani, Manish; Dey, Damini; Berman, Daniel S.; Germano, Guido; Achenbach, Stephan; Al-Mallah, Mouaz H.; Andreini, Daniele; Budoff, Matthew J.; Cademartiri, Filippo; Callister, Tracy Q.; Chang, Hyuk-Jae; Chinnaiyan, Kavitha; Chow, Benjamin J.W.; Cury, Ricardo C.; Delago, Augustin; Gomez, Millie; Gransar, Heidi; Hadamitzky, Martin; Hausleiter, Joerg; Hindoyan, Niree; Feuchtner, Gudrun; Kaufmann, Philipp A.; Kim, Yong-Jin; Leipsic, Jonathon; Lin, Fay Y.; Maffei, Erica; Marques, Hugo; Pontone, Gianluca; Raff, Gilbert; Rubinshtein, Ronen; Shaw, Leslee J.; Stehli, Julia; Villines, Todd C.; Dunning, Allison; Min, James K.; Slomka, Piotr J.
2017-01-01
Aims Traditional prognostic risk assessment in patients undergoing non-invasive imaging is based upon a limited selection of clinical and imaging findings. Machine learning (ML) can consider a greater number and complexity of variables. Therefore, we investigated the feasibility and accuracy of ML to predict 5-year all-cause mortality (ACM) in patients undergoing coronary computed tomographic angiography (CCTA), and compared the performance to existing clinical or CCTA metrics. Methods and results The analysis included 10 030 patients with suspected coronary artery disease and 5-year follow-up from the COronary CT Angiography EvaluatioN For Clinical Outcomes: An InteRnational Multicenter registry. All patients underwent CCTA as their standard of care. Twenty-five clinical and 44 CCTA parameters were evaluated, including segment stenosis score (SSS), segment involvement score (SIS), modified Duke index (DI), number of segments with non-calcified, mixed or calcified plaques, age, sex, gender, standard cardiovascular risk factors, and Framingham risk score (FRS). Machine learning involved automated feature selection by information gain ranking, model building with a boosted ensemble algorithm, and 10-fold stratified cross-validation. Seven hundred and forty-five patients died during 5-year follow-up. Machine learning exhibited a higher area-under-curve compared with the FRS or CCTA severity scores alone (SSS, SIS, DI) for predicting all-cause mortality (ML: 0.79 vs. FRS: 0.61, SSS: 0.64, SIS: 0.64, DI: 0.62; P< 0.001). Conclusions Machine learning combining clinical and CCTA data was found to predict 5-year ACM significantly better than existing clinical or CCTA metrics alone. PMID:27252451
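The feature-selection step mentioned above, ranking variables by information gain, can be illustrated with a small sketch. Everything concrete here is an assumption made for the example: the feature names, the toy cohort, and the restriction to already-discretized features are hypothetical and are not taken from the registry data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy cohort: 1 = died during follow-up, 0 = survived.
outcome = [1, 1, 1, 0, 0, 0, 0, 0]
features = {
    "high_stenosis": [1, 1, 1, 0, 0, 0, 0, 0],  # perfectly informative here
    "smoker":        [1, 0, 1, 0, 1, 0, 1, 0],  # only weakly informative here
}
ranking = sorted(
    features,
    key=lambda name: information_gain(features[name], outcome),
    reverse=True,
)
```

Continuous variables such as age would need to be binned before a gain of this kind can be computed; how the study handled that is not stated in the abstract.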
Motwani, Manish; Dey, Damini; Berman, Daniel S; Germano, Guido; Achenbach, Stephan; Al-Mallah, Mouaz H; Andreini, Daniele; Budoff, Matthew J; Cademartiri, Filippo; Callister, Tracy Q; Chang, Hyuk-Jae; Chinnaiyan, Kavitha; Chow, Benjamin J W; Cury, Ricardo C; Delago, Augustin; Gomez, Millie; Gransar, Heidi; Hadamitzky, Martin; Hausleiter, Joerg; Hindoyan, Niree; Feuchtner, Gudrun; Kaufmann, Philipp A; Kim, Yong-Jin; Leipsic, Jonathon; Lin, Fay Y; Maffei, Erica; Marques, Hugo; Pontone, Gianluca; Raff, Gilbert; Rubinshtein, Ronen; Shaw, Leslee J; Stehli, Julia; Villines, Todd C; Dunning, Allison; Min, James K; Slomka, Piotr J
2017-02-14
Traditional prognostic risk assessment in patients undergoing non-invasive imaging is based upon a limited selection of clinical and imaging findings. Machine learning (ML) can consider a greater number and complexity of variables. Therefore, we investigated the feasibility and accuracy of ML to predict 5-year all-cause mortality (ACM) in patients undergoing coronary computed tomographic angiography (CCTA), and compared the performance to existing clinical or CCTA metrics. The analysis included 10 030 patients with suspected coronary artery disease and 5-year follow-up from the COronary CT Angiography EvaluatioN For Clinical Outcomes: An InteRnational Multicenter registry. All patients underwent CCTA as their standard of care. Twenty-five clinical and 44 CCTA parameters were evaluated, including segment stenosis score (SSS), segment involvement score (SIS), modified Duke index (DI), number of segments with non-calcified, mixed or calcified plaques, age, sex, gender, standard cardiovascular risk factors, and Framingham risk score (FRS). Machine learning involved automated feature selection by information gain ranking, model building with a boosted ensemble algorithm, and 10-fold stratified cross-validation. Seven hundred and forty-five patients died during 5-year follow-up. Machine learning exhibited a higher area-under-curve compared with the FRS or CCTA severity scores alone (SSS, SIS, DI) for predicting all-cause mortality (ML: 0.79 vs. FRS: 0.61, SSS: 0.64, SIS: 0.64, DI: 0.62; P< 0.001). Machine learning combining clinical and CCTA data was found to predict 5-year ACM significantly better than existing clinical or CCTA metrics alone. Published on behalf of the European Society of Cardiology.
The machine intelligence Hex project
NASA Astrophysics Data System (ADS)
Chalup, Stephan K.; Mellor, Drew; Rosamond, Fran
2005-12-01
Hex is a challenging strategy board game for two players. To enhance students’ progress in acquiring understanding and practical experience with complex machine intelligence and programming concepts we developed the Machine Intelligence Hex (MIHex) project. The associated undergraduate student assignment is about designing and implementing Hex players and evaluating them in an automated tournament of all programs developed by the class. This article surveys educational aspects of the MIHex project. Additionally, fundamental techniques for game programming as well as specific concepts for Hex board evaluation are reviewed. The MIHex game server and possibilities of tournament organisation are described. We summarise and discuss our experiences from running the MIHex project assignment over four consecutive years. The impact on student motivation and learning benefits are evaluated using questionnaires and interviews.
Fraccaro, Paolo; Nicolo, Massimo; Bonetto, Monica; Giacomini, Mauro; Weller, Peter; Traverso, Carlo Enrico; Prosperi, Mattia; OSullivan, Dympna
2015-01-27
To investigate machine learning methods, ranging from simpler interpretable techniques to complex (non-linear) "black-box" approaches, for automated diagnosis of Age-related Macular Degeneration (AMD). Data from healthy subjects and patients diagnosed with AMD or other retinal diseases were collected during routine visits via an Electronic Health Record (EHR) system. Patients' attributes included demographics and, for each eye, presence/absence of major AMD-related clinical signs (soft drusen, retinal pigment epithelium, defects/pigment mottling, depigmentation area, subretinal haemorrhage, subretinal fluid, macula thickness, macular scar, subretinal fibrosis). Interpretable techniques known as white box methods, including logistic regression and decision trees, as well as less interpretable techniques known as black box methods, such as support vector machines (SVM), random forests and AdaBoost, were used to develop models (trained and validated on unseen data) to diagnose AMD. The gold standard was confirmed diagnosis of AMD by physicians. Sensitivity, specificity and area under the receiver operating characteristic curve (AUC) were used to assess performance. Study population included 487 patients (912 eyes). In terms of AUC, random forests, logistic regression and AdaBoost showed a mean performance of 0.92, followed by SVM and decision trees (0.90). All machine learning models identified soft drusen and age as the most discriminating variables in clinicians' decision pathways to diagnose AMD. Both black-box and white box methods performed well in identifying diagnoses of AMD and their decision pathways. Machine learning models developed through the proposed approach, relying on clinical signs identified by retinal specialists, could be embedded into EHR to provide physicians with real time (interpretable) support.
Using Neural Networks to Classify Digitized Images of Galaxies
NASA Astrophysics Data System (ADS)
Goderya, S. N.; McGuire, P. C.
2000-12-01
Automated classification of Galaxies into Hubble types is of paramount importance to study the large scale structure of the Universe, particularly as survey projects like the Sloan Digital Sky Survey complete their data acquisition of one million galaxies. At present it is not possible to find robust and efficient artificial intelligence based galaxy classifiers. In this study we will summarize progress made in the development of automated galaxy classifiers using neural networks as machine learning tools. We explore the Bayesian linear algorithm, the higher order probabilistic network, the multilayer perceptron neural network and Support Vector Machine Classifier. The performance of any machine classifier is dependent on the quality of the parameters that characterize the different groups of galaxies. Our effort is to develop geometric and invariant moment based parameters as input to the machine classifiers instead of the raw pixel data. Such an approach reduces the dimensionality of the classifier considerably, and removes the effects of scaling and rotation, and makes it easier to solve for the unknown parameters in the galaxy classifier. To judge the quality of training and classification we develop the concept of Mathews coefficients for the galaxy classification community. Mathews coefficients are single numbers that quantify classifier performance even with unequal prior probabilities of the classes.
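Assuming the coefficient described above is what is usually called the Matthews correlation coefficient, its two-class form can be sketched as below. The function name, the zero-denominator convention, and the example counts are assumptions for illustration; the abstract concerns multi-class galaxy classification and does not give the exact formulation used.

```python
import math

def matthews_coefficient(tp, tn, fp, fn):
    """Two-class Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: report 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Hypothetical imbalanced sample: 5 positives, 95 negatives.
# A classifier that always answers "negative" is 95% accurate but uninformative.
always_negative = matthews_coefficient(tp=0, tn=95, fp=0, fn=5)
perfect = matthews_coefficient(tp=5, tn=95, fp=0, fn=0)
```

The contrast between the two cases shows why a single number of this kind is useful when class priors are unequal: plain accuracy would rate the trivial classifier at 0.95.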
Moreno-Duarte, Ingrid; Montenegro, Julio; Balonov, Konstantin; Schumann, Roman
2017-04-15
Most modern anesthesia workstations provide automated checkout, which indicates the readiness of the anesthesia machine. In this case report, an anesthesia machine passed the automated machine checkout. Minutes after the induction of general anesthesia, we observed a mismatch between the selected and delivered tidal volumes in the volume auto flow mode with increased inspiratory resistance during manual ventilation. Endotracheal tube kinking, circuit obstruction, leaks, and patient-related factors were ruled out. Further investigation revealed a broken internal insert within the CO2 absorbent canister that allowed absorbent granules to cause a partial obstruction to inspiratory and expiratory flow triggering contradictory alarms. We concluded that even when the automated machine checkout indicates machine readiness, unforeseen equipment failure due to unexpected events can occur and require providers to remain vigilant.
Fuller, John A; Berlinicke, Cynthia A; Inglese, James; Zack, Donald J
2016-01-01
High content analysis (HCA) has become a leading methodology in phenotypic drug discovery efforts. Typical HCA workflows include imaging cells using an automated microscope and analyzing the data using algorithms designed to quantify one or more specific phenotypes of interest. Due to the richness of high content data, unappreciated phenotypic changes may be discovered in existing image sets using interactive machine-learning based software systems. Primary postnatal day four retinal cells from the photoreceptor (PR) labeled QRX-EGFP reporter mice were isolated, seeded, treated with a set of 234 profiled kinase inhibitors and then cultured for 1 week. The cells were imaged with an Acumen plate-based laser cytometer to determine the number and intensity of GFP-expressing, i.e. PR, cells. Wells displaying intensities and counts above threshold values of interest were re-imaged at a higher resolution with an INCell2000 automated microscope. The images were analyzed with an open source HCA analysis tool, PhenoRipper (Rajaram et al., Nat Methods 9:635-637, 2012), to identify the high GFP-inducing treatments that additionally resulted in diverse phenotypes compared to the vehicle control samples. The pyrimidinopyrimidone kinase inhibitor CHEMBL-1766490, a pan kinase inhibitor whose major known targets are p38α and the Src family member lck, was identified as an inducer of photoreceptor neuritogenesis by using the open-source HCA program PhenoRipper. This finding was corroborated using a cell-based method of image analysis that measures quantitative differences in the mean neurite length in GFP expressing cells. Interacting with data using machine learning algorithms may complement traditional HCA approaches by leading to the discovery of small molecule-induced cellular phenotypes in addition to those upon which the investigator is initially focusing.
TEES 2.2: Biomedical Event Extraction for Diverse Corpora
2015-01-01
Background The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. Results The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. Conclusions The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction are presented. PMID:26551925
Ertosun, Mehmet Günhan; Rubin, Daniel L
2015-01-01
Brain glioma is the most common primary malignant brain tumor in adults with different pathologic subtypes: Lower Grade Glioma (LGG) Grade II, Lower Grade Glioma (LGG) Grade III, and Glioblastoma Multiforme (GBM) Grade IV. The survival and treatment options are highly dependent on this glioma grade. We propose a deep learning-based, modular classification pipeline for automated grading of gliomas using digital pathology images. Whole tissue digitized images of pathology slides obtained from The Cancer Genome Atlas (TCGA) were used to train our deep learning modules. Our modular pipeline provides diagnostic quality statistics, such as precision, sensitivity and specificity, of the individual deep learning modules, and (1) facilitates training given the limited data in this domain, (2) enables exploration of different deep learning structures for each module, (3) leads to developing less complex modules that are simpler to analyze, and (4) provides flexibility, permitting use of single modules within the framework or use of other modeling or machine learning applications, such as probabilistic graphical models or support vector machines. Our modular approach helps us meet the requirements of minimum accuracy levels that are demanded by the context of different decision points within a multi-class classification scheme. Convolutional Neural Networks are trained for each module for each sub-task with more than 90% classification accuracies on validation data set, and achieved classification accuracy of 96% for the task of GBM vs LGG classification, 71% for further identifying the grade of LGG into Grade II or Grade III on independent data set coming from new patients from the multi-institutional repository.
Ertosun, Mehmet Günhan; Rubin, Daniel L.
2015-01-01
Brain glioma is the most common primary malignant brain tumor in adults with different pathologic subtypes: Lower Grade Glioma (LGG) Grade II, Lower Grade Glioma (LGG) Grade III, and Glioblastoma Multiforme (GBM) Grade IV. The survival and treatment options are highly dependent on this glioma grade. We propose a deep learning-based, modular classification pipeline for automated grading of gliomas using digital pathology images. Whole tissue digitized images of pathology slides obtained from The Cancer Genome Atlas (TCGA) were used to train our deep learning modules. Our modular pipeline provides diagnostic quality statistics, such as precision, sensitivity and specificity, of the individual deep learning modules, and (1) facilitates training given the limited data in this domain, (2) enables exploration of different deep learning structures for each module, (3) leads to developing less complex modules that are simpler to analyze, and (4) provides flexibility, permitting use of single modules within the framework or use of other modeling or machine learning applications, such as probabilistic graphical models or support vector machines. Our modular approach helps us meet the requirements of minimum accuracy levels that are demanded by the context of different decision points within a multi-class classification scheme. Convolutional Neural Networks are trained for each module for each sub-task with more than 90% classification accuracies on validation data set, and achieved classification accuracy of 96% for the task of GBM vs LGG classification, 71% for further identifying the grade of LGG into Grade II or Grade III on independent data set coming from new patients from the multi-institutional repository. PMID:26958289
TEES 2.2: Biomedical Event Extraction for Diverse Corpora.
Björne, Jari; Salakoski, Tapio
2015-01-01
The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction are presented.
Design Methodology for Automated Construction Machines
1987-12-11
along with the design of a pair of machines which automate framework installation... Development Assistant Professor of Civil Engineering and Laura A. Demsetz, David H. Levy, Bruce Schena, Graduate Research Assistants, December 11, 1987, U.S... are discussed along with the design of a pair of machines which automate framework installation. Preliminary analysis and testing indicate that these
Performance Evaluation of the UT Automated Road Maintenance Machine
DOT National Transportation Integrated Search
1997-10-01
This final report focuses mainly on evaluating the overall performance of The University of Texas' Automated Road Maintenance Machine (ARMM). It was concluded that the introduction of automated methods to the pavement crack-sealing process will impro...
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-20
..., LLC, Subsidiary of Mag Industrial Automation Systems, Machesney Park, IL; Notice of Negative... automation equipment and machine tools did not contribute to worker separations at the subject facility and...' firm's declining customers. The survey revealed no imports of automation equipment and machine tools by...
A machine learning-based framework to identify type 2 diabetes through electronic health records
Zheng, Tao; Xie, Wei; Xu, Liling; He, Xiaoying; Zhang, Ya; You, Mingrong; Yang, Gong; Chen, You
2016-01-01
Objective To discover diverse genotype-phenotype associations affiliated with Type 2 Diabetes Mellitus (T2DM) via genome-wide association study (GWAS) and phenome-wide association study (PheWAS), more cases (T2DM subjects) and controls (subjects without T2DM) are required to be identified (e.g., via Electronic Health Records (EHR)). However, existing expert based identification algorithms often suffer from a low recall rate and could miss a large number of valuable samples under conservative filtering standards. The goal of this work is to develop a semi-automated framework based on machine learning as a pilot study to liberalize filtering criteria to improve recall rate while keeping a low false positive rate. Materials and methods We propose a data informed framework for identifying subjects with and without T2DM from EHR via feature engineering and machine learning. We evaluate and contrast the identification performance of widely-used machine learning models within our framework, including k-Nearest-Neighbors, Naïve Bayes, Decision Tree, Random Forest, Support Vector Machine and Logistic Regression. Our framework was conducted on 300 patient samples (161 cases, 60 controls and 79 unconfirmed subjects), randomly selected from 23,281 diabetes related cohort retrieved from a regional distributed EHR repository ranging from 2012 to 2014. Results We apply top-performing machine learning algorithms on the engineered features. We benchmark and contrast the accuracy, precision, AUC, sensitivity and specificity of classification models against the state-of-the-art expert algorithm for identification of T2DM subjects. Our results indicate that the framework achieved high identification performances (∼0.98 in average AUC), which are much higher than the state-of-the-art algorithm (0.71 in AUC). Discussion Expert algorithm-based identification of T2DM subjects from EHR is often hampered by the high missing rates due to their conservative selection criteria. 
Our framework leverages machine learning and feature engineering to loosen such selection criteria to achieve a high identification rate of cases and controls. Conclusions Our proposed framework demonstrates a more accurate and efficient approach for identifying subjects with and without T2DM from EHR. PMID:27919371
A machine learning-based framework to identify type 2 diabetes through electronic health records.
Zheng, Tao; Xie, Wei; Xu, Liling; He, Xiaoying; Zhang, Ya; You, Mingrong; Yang, Gong; Chen, You
2017-01-01
To discover diverse genotype-phenotype associations affiliated with Type 2 Diabetes Mellitus (T2DM) via genome-wide association study (GWAS) and phenome-wide association study (PheWAS), more cases (T2DM subjects) and controls (subjects without T2DM) are required to be identified (e.g., via Electronic Health Records (EHR)). However, existing expert based identification algorithms often suffer from a low recall rate and could miss a large number of valuable samples under conservative filtering standards. The goal of this work is to develop a semi-automated framework based on machine learning as a pilot study to liberalize filtering criteria to improve recall rate while keeping a low false positive rate. We propose a data informed framework for identifying subjects with and without T2DM from EHR via feature engineering and machine learning. We evaluate and contrast the identification performance of widely-used machine learning models within our framework, including k-Nearest-Neighbors, Naïve Bayes, Decision Tree, Random Forest, Support Vector Machine and Logistic Regression. Our framework was conducted on 300 patient samples (161 cases, 60 controls and 79 unconfirmed subjects), randomly selected from 23,281 diabetes related cohort retrieved from a regional distributed EHR repository ranging from 2012 to 2014. We apply top-performing machine learning algorithms on the engineered features. We benchmark and contrast the accuracy, precision, AUC, sensitivity and specificity of classification models against the state-of-the-art expert algorithm for identification of T2DM subjects. Our results indicate that the framework achieved high identification performances (∼0.98 in average AUC), which are much higher than the state-of-the-art algorithm (0.71 in AUC). Expert algorithm-based identification of T2DM subjects from EHR is often hampered by the high missing rates due to their conservative selection criteria. 
Our framework leverages machine learning and feature engineering to loosen such selection criteria to achieve a high identification rate of cases and controls. Our proposed framework demonstrates a more accurate and efficient approach for identifying subjects with and without T2DM from EHR.
Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan
2013-02-01
The aim of this paper is to address the development of computer assisted malaria parasite characterization and classification using a machine learning approach based on light microscopic images of peripheral blood smears. In doing this, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker controlled watershed transformation, and subsequently a total of ninety-six features describing the shape, size and texture of erythrocytes are extracted with respect to parasitemia-infected versus non-infected cells. Ninety-four features are found to be statistically significant in discriminating six classes. Here a feature selection-cum-classification scheme has been devised by combining the F-statistic with statistical learning techniques, i.e., Bayesian learning and support vector machine (SVM), in order to provide higher classification accuracy using the best set of discriminating features. Results show that the Bayesian approach provides the highest accuracy, i.e., 84%, for malaria classification by selecting the 19 most significant features, while SVM provides its highest accuracy, i.e., 83.5%, with the 9 most significant features. Finally, the performance of these two classifiers under the feature selection framework has been compared for malaria parasite classification.
Specimen coordinate automated measuring machine/fiducial automated measuring machine
Hedglen, Robert E.; Jacket, Howard S.; Schwartz, Allan I.
1991-01-01
The Specimen coordinate Automated Measuring Machine (SCAMM) and the Fiducial Automated Measuring Machine (FAMM) is a computer controlled metrology system capable of measuring length, width, and thickness, and of locating fiducial marks. SCAMM and FAMM have many similarities in their designs, and they can be converted from one to the other without taking them out of the hot cell. Both have means for: supporting a plurality of samples and a standard; controlling the movement of the samples in the +/- X and Y directions; determining the coordinates of the sample; compensating for temperature effects; and verifying the accuracy of the measurements and repeating as necessary. SCAMM and FAMM are designed to be used in hot cells.
iPTF report of bright transients
NASA Astrophysics Data System (ADS)
Cannella, Chris; Kuesters, Daniel; Ferretti, Raphael; Blagorodnova, Nadejda; Adams, Scott; Kupfer, Thomas; Neill, James D.; Walters, Richard; Yan, Lin; Kulkarni, Shri
2017-02-01
The intermediate Palomar Transient Factory (iPTF; ATel #4807) reports the following bright ( Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R), and RB5 (Wozniak et al. 2013AAS...22143105W).
AUTOMATING ASSET KNOWLEDGE WITH MTCONNECT.
Venkatesh, Sid; Ly, Sidney; Manning, Martin; Michaloski, John; Proctor, Fred
2016-01-01
In order to maximize assets, manufacturers should use real-time knowledge garnered from ongoing and continuous collection and evaluation of factory-floor machine status data. In discrete parts manufacturing, factory machine monitoring has been difficult, due primarily to closed, proprietary automation equipment that makes integration difficult. Recently, there has been a push in applying the data acquisition concepts of MTConnect to the real-time acquisition of machine status data. MTConnect is an open, free specification aimed at overcoming the "Islands of Automation" dilemma on the shop floor. With automated asset analysis, manufacturers can improve production to become lean, efficient, and effective. The focus of this paper will be on the deployment of MTConnect to collect real-time machine status to automate asset management. In addition, we will leverage the ISO 22400 standard, which defines an asset and quantifies asset performance metrics. In conjunction with these goals, the deployment of MTConnect in a large aerospace manufacturing facility will be studied with emphasis on asset management and understanding the impact of machine Overall Equipment Effectiveness (OEE) on manufacturing.
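For readers unfamiliar with OEE, the widely used textbook form multiplies availability, performance, and quality. The sketch below follows that form under stated assumptions: the function name, parameter names, and shift figures are hypothetical, and the exact terms and definitions in ISO 22400 or in the paper's MTConnect deployment may differ.

```python
def overall_equipment_effectiveness(planned_time, run_time, ideal_cycle_time,
                                    total_count, good_count):
    """OEE as availability x performance x quality (common textbook form)."""
    availability = run_time / planned_time                      # share of planned time spent running
    performance = (ideal_cycle_time * total_count) / run_time   # actual speed vs. ideal speed
    quality = good_count / total_count                          # share of parts that pass inspection
    return availability * performance * quality

# Hypothetical shift: 480 planned minutes, 400 minutes running,
# ideal cycle of 1 minute per part, 360 parts made, 342 of them good.
oee = overall_equipment_effectiveness(
    planned_time=480, run_time=400, ideal_cycle_time=1.0,
    total_count=360, good_count=342,
)
```

In a monitoring setup of the kind described, the run time and part counts would be derived from the collected machine status stream rather than entered by hand.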
Learning diagnostic models using speech and language measures.
Peintner, Bart; Jarrold, William; Vergyriy, Dimitra; Richey, Colleen; Tempini, Maria Luisa Gorno; Ogar, Jennifer
2008-01-01
We describe results that show the effectiveness of machine learning in the automatic diagnosis of certain neurodegenerative diseases, several of which alter speech and language production. We analyzed audio from 9 control subjects and 30 patients diagnosed with one of three subtypes of Frontotemporal Lobar Degeneration. From this data, we extracted features of the audio signal and the words the patient used, which were obtained using our automated transcription technologies. We then automatically learned models that predict the diagnosis of the patient using these features. Our results show that learned models over these features predict diagnosis with accuracy significantly better than random. Future studies using higher quality recordings will likely improve these results.
An intelligent CNC machine control system architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, D.J.; Loucks, C.S.
1996-10-01
Intelligent, agile manufacturing relies on automated programming of digitally controlled processes. Currently, processes such as Computer Numerically Controlled (CNC) machining are difficult to automate because of highly restrictive controllers and poor software environments. It is also difficult to utilize sensors and process models for adaptive control, or to integrate machining processes with other tasks within a factory floor setting. As part of a Laboratory Directed Research and Development (LDRD) program, a CNC machine control system architecture based on object-oriented design and graphical programming has been developed to address some of these problems and to demonstrate automated agile machining applications using platform-independent software.
Robust Machine Learning-Based Correction on Automatic Segmentation of the Cerebellum and Brainstem
Wang, Jun Yi; Ngo, Michael M.; Hessl, David; Hagerman, Randi J.; Rivera, Susan M.
2016-01-01
Automated segmentation is a useful method for studying large brain structures such as the cerebellum and brainstem. However, automated segmentation may produce inaccurate and/or undesirable boundaries. The goal of the present study was to investigate whether SegAdapter, a machine learning-based method, is useful for automatically correcting large segmentation errors and disagreement in anatomical definition. We further assessed the robustness of the method with respect to training set size, differences in head coil usage, and amount of brain atrophy. High resolution T1-weighted images were acquired from 30 healthy controls scanned with either an 8-channel or 32-channel head coil. Ten patients, who suffered from brain atrophy because of fragile X-associated tremor/ataxia syndrome, were scanned using the 32-channel head coil. The initial segmentations of the cerebellum and brainstem were generated automatically using Freesurfer. Subsequently, Freesurfer’s segmentations were both manually corrected to serve as the gold standard and automatically corrected by SegAdapter. Using only 5 scans in the training set, spatial overlap with manual segmentation, measured by the Dice coefficient, improved significantly from 0.956 (for Freesurfer segmentation) to 0.978 (for SegAdapter-corrected segmentation) for the cerebellum and from 0.821 to 0.954 for the brainstem. Reducing the training set size to 2 scans decreased the Dice coefficient by only ≤0.002 for the cerebellum and ≤0.005 for the brainstem compared to a training set size of 5 scans in corrective learning. The method was also robust in handling differences between the training set and the test set in head coil usage and the amount of brain atrophy, which reduced spatial overlap by only <0.01.
These results suggest that the combination of automated segmentation and corrective learning provides a valuable method for accurate and efficient segmentation of the cerebellum and brainstem, particularly in large-scale neuroimaging studies, and potentially for segmenting other neural regions as well. PMID:27213683
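The Dice coefficient used above to compare automated and manual segmentations can be illustrated with a minimal sketch. It assumes the two segmentations are given as flat sequences of 0/1 voxel labels; the function name and the toy masks are hypothetical and are not taken from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as structure in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * intersection / (size_a + size_b)


# Hypothetical toy masks, not data from the paper.
manual = [0, 1, 1, 1, 0, 0]
automatic = [0, 1, 1, 0, 1, 0]
print(dice_coefficient(manual, automatic))
```

A value of 1 means the two masks coincide exactly and 0 means they share no voxels, which is why the reported changes of a few thousandths correspond to very small boundary differences.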
Intelligent software for laboratory automation.
Whelan, Ken E; King, Ross D
2004-09-01
The automation of laboratory techniques has greatly increased the number of experiments that can be carried out in the chemical and biological sciences. Until recently, this automation has focused primarily on improving hardware. Here we argue that future advances will concentrate on intelligent software to integrate physical experimentation and results analysis with hypothesis formulation and experiment planning. To illustrate our thesis, we describe the 'Robot Scientist' - the first physically implemented example of such a closed loop system. In the Robot Scientist, experimentation is performed by a laboratory robot, hypotheses concerning the results are generated by machine learning and experiments are allocated and selected by a combination of techniques derived from artificial intelligence research. The performance of the Robot Scientist has been evaluated by a rediscovery task based on yeast functional genomics. The Robot Scientist is proof that the integration of programmable laboratory hardware and intelligent software can be used to develop increasingly automated laboratories.
Deep convolutional neural networks for classifying GPR B-scans
NASA Astrophysics Data System (ADS)
Besaw, Lance E.; Stimac, Philip J.
2015-05-01
Symmetric and asymmetric buried explosive hazards (BEHs) present real, persistent, deadly threats on the modern battlefield. Current approaches to mitigate these threats rely on highly trained operatives to reliably detect BEHs with reasonable false alarm rates using handheld Ground Penetrating Radar (GPR) and metal detectors. As computers become smaller, faster and more efficient, there exists greater potential for automated threat detection based on state-of-the-art machine learning approaches, reducing the burden on the field operatives. Recent advancements in machine learning, specifically deep learning artificial neural networks, have led to significantly improved performance in pattern recognition tasks, such as object classification in digital images. Deep convolutional neural networks (CNNs) are used in this work to extract meaningful signatures from 2-dimensional (2-D) GPR B-scans and classify threats. The CNNs skip the traditional "feature engineering" step often associated with machine learning, and instead learn the feature representations directly from the 2-D data. A multi-antennae, handheld GPR with centimeter-accurate positioning data was used to collect shallow subsurface data over prepared lanes containing a wide range of BEHs. Several heuristics were used to prevent over-training, including cross validation, network weight regularization, and "dropout." Our results show that CNNs can extract meaningful features and accurately classify complex signatures contained in GPR B-scans, complementing existing GPR feature extraction and classification techniques.
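Of the heuristics listed above for preventing over-training, "dropout" is the simplest to show in isolation. The sketch below is a generic inverted-dropout step in plain Python, assuming a list of activations and a fixed drop probability; it is an illustration of the technique, not the network code used in the study, and all names are hypothetical.

```python
import random


def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each activation with probability drop_prob and
    rescale the survivors so the expected value of each unit is unchanged."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)             # fixed seed so the sketch is repeatable
hidden = [0.5, 1.0, 1.5, 2.0]      # hypothetical hidden-layer activations
print(dropout(hidden, 0.5, rng))
```

Dropout is applied only during training; at test time the activations are passed through unchanged, which the rescaling by the keep probability makes consistent.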
Yang, Xiaofeng; Wu, Ning; Cheng, Guanghui; Zhou, Zhengyang; Yu, David S; Beitler, Jonathan J; Curran, Walter J; Liu, Tian
2014-12-01
To develop an automated magnetic resonance imaging (MRI) parotid segmentation method to monitor radiation-induced parotid gland changes in patients after head and neck radiation therapy (RT). The proposed method combines the atlas registration method, which captures the global variation of anatomy, with a machine learning technology, which captures the local statistical features, to automatically segment the parotid glands from the MRIs. The segmentation method consists of 3 major steps. First, an atlas (pre-RT MRI and manually contoured parotid gland mask) is built for each patient. A hybrid deformable image registration is used to map the pre-RT MRI to the post-RT MRI, and the transformation is applied to the pre-RT parotid volume. Second, the kernel support vector machine (SVM) is trained with the subject-specific atlas pair consisting of multiple features (intensity, gradient, and others) from the aligned pre-RT MRI and the transformed parotid volume. Third, the well-trained kernel SVM is used to differentiate the parotid from surrounding tissues in the post-RT MRIs by statistically matching multiple texture features. A longitudinal study of 15 patients undergoing head and neck RT was conducted: baseline MRI was acquired prior to RT, and the post-RT MRIs were acquired at 3-, 6-, and 12-month follow-up examinations. The resulting segmentations were compared with the physicians' manual contours. Successful parotid segmentation was achieved for all 15 patients (42 post-RT MRIs). The average percentage of volume differences between the automated segmentations and those of the physicians' manual contours were 7.98% for the left parotid and 8.12% for the right parotid. The average volume overlap was 91.1% ± 1.6% for the left parotid and 90.5% ± 2.4% for the right parotid. The parotid gland volume reduction at follow-up was 25% at 3 months, 27% at 6 months, and 16% at 12 months. We have validated our automated parotid segmentation algorithm in a longitudinal study. 
This segmentation method may be useful in future studies to address radiation-induced xerostomia in head and neck radiation therapy. Copyright © 2014 Elsevier Inc. All rights reserved.
(Machine-)Learning to analyze in vivo microscopy: Support vector machines.
Wang, Michael F Z; Fernandez-Gonzalez, Rodrigo
2017-11-01
The development of new microscopy techniques for super-resolved, long-term monitoring of cellular and subcellular dynamics in living organisms is revealing new fundamental aspects of tissue development and repair. However, new microscopy approaches present several challenges. In addition to unprecedented requirements for data storage, the analysis of high resolution, time-lapse images is too complex to be done manually. Machine learning techniques are ideally suited for the (semi-)automated analysis of multidimensional image data. In particular, support vector machines (SVMs) have emerged as an efficient method to analyze microscopy images obtained from animals. Here, we discuss the use of SVMs to analyze in vivo microscopy data. We introduce the mathematical framework behind SVMs, and we describe the metrics used by SVMs and other machine learning approaches to classify image data. We discuss the influence of different SVM parameters in the context of an algorithm for cell segmentation and tracking. Finally, we describe how the application of SVMs has been critical to study protein localization in yeast screens, for lineage tracing in C. elegans, or to determine the developmental stage of Drosophila embryos to investigate gene expression dynamics. We propose that SVMs will become central tools in the analysis of the complex image data that novel microscopy modalities have made possible. This article is part of a Special Issue entitled: Biophysics in Canada, edited by Lewis Kay, John Baenziger, Albert Berghuis and Peter Tieleman. Copyright © 2017 Elsevier B.V. All rights reserved.
Toward accelerating landslide mapping with interactive machine learning techniques
NASA Astrophysics Data System (ADS)
Stumpf, André; Lachiche, Nicolas; Malet, Jean-Philippe; Kerle, Norman; Puissant, Anne
2013-04-01
Despite important advances in the development of more automated methods for landslide mapping from optical remote sensing images, the elaboration of inventory maps after major triggering events remains a tedious task. Image classification with expert-defined rules typically still requires significant manual labour for the elaboration and adaptation of rule sets for each particular case. Machine learning algorithms, in contrast, can learn and identify complex image patterns from labelled examples but may require relatively large amounts of training data. To reduce the amount of required training data, active learning has evolved as a key concept to guide the sampling for applications such as document classification, genetics and remote sensing. The general underlying idea of most active learning approaches is to initialize a machine learning model with a small training set, and to subsequently exploit the model state and/or the data structure to iteratively select the most valuable samples that should be labelled by the user and added to the training set. With relatively few queries and labelled samples, an active learning strategy should ideally yield at least the same accuracy as an equivalent classifier trained with many randomly selected samples. Our study was dedicated to the development of an active learning approach for landslide mapping from very high resolution (VHR) remote sensing images with special consideration of the spatial distribution of the samples. The developed approach is a region-based query heuristic that guides the user's attention towards a few compact spatial batches rather than distributed points, resulting in time savings of 50% and more compared to standard active learning techniques. The approach was tested with multi-temporal and multi-sensor satellite images capturing recent large-scale triggering events in Brazil and China and demonstrated balanced user's and producer's accuracies between 74% and 80%.
The assessment also included an experimental evaluation of the uncertainties of manual mappings from multiple experts and demonstrated strong relationships between the uncertainty of the experts and the machine learning model.
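The query step that active learning relies on can be sketched as follows. This is plain uncertainty sampling for a binary classifier, assuming the model exposes a predicted probability per sample; it is not the region-based query heuristic developed in the study, and the function name and values are hypothetical.

```python
def select_queries(probabilities, labelled, batch_size):
    """Return indices of the unlabelled samples whose predicted class
    probability is closest to 0.5, i.e. the ones the model is least sure of."""
    candidates = [i for i in range(len(probabilities)) if i not in labelled]
    candidates.sort(key=lambda i: abs(probabilities[i] - 0.5))
    return candidates[:batch_size]


# Hypothetical landslide probabilities for five image samples.
probs = [0.95, 0.52, 0.10, 0.45, 0.70]
already_labelled = {0}
print(select_queries(probs, already_labelled, 2))
```

In a full loop the selected samples would be labelled by the user, added to the training set, and the model retrained before the next query round. A spatially aware variant would group candidates into compact regions rather than returning isolated points.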
Towards an Automated Classification of Transient Events in Synoptic Sky Surveys
NASA Technical Reports Server (NTRS)
Djorgovski, S. G.; Donalek, C.; Mahabal, A. A.; Moghaddam, B.; Turmon, M.; Graham, M. J.; Drake, A. J.; Sharma, N.; Chen, Y.
2011-01-01
We describe the development of a system for automated, iterative, real-time classification of transient events discovered in synoptic sky surveys. The system under development incorporates a number of machine learning techniques, mostly using Bayesian approaches, due to the sparse nature, heterogeneity, and variable incompleteness of the available data. The classifications are improved iteratively as new measurements are obtained. One novel feature is the development of an automated follow-up recommendation engine that suggests those measurements that would be the most advantageous in terms of resolving classification ambiguities and/or characterizing the astrophysically most interesting objects, given a set of available follow-up assets and their cost functions. This illustrates the symbiotic relationship of astronomy and applied computer science through the emerging discipline of AstroInformatics.
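The iterative refinement of a classification as new measurements arrive can be illustrated with a minimal Bayesian update. The sketch assumes a discrete set of candidate classes and a per-class likelihood for each measurement; the class names and numbers are hypothetical and do not come from the system described.

```python
def update_posterior(prior, likelihood):
    """One Bayes update: posterior[c] is proportional to prior[c] * likelihood[c]."""
    unnormalised = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("measurement has zero likelihood under every class")
    return {c: v / total for c, v in unnormalised.items()}


# Start from an uninformative belief over two hypothetical transient classes.
belief = {"supernova": 0.5, "variable_star": 0.5}
# Each entry is P(measurement | class) for one new follow-up observation.
measurements = [
    {"supernova": 0.8, "variable_star": 0.4},
    {"supernova": 0.6, "variable_star": 0.2},
]
for likelihood in measurements:
    belief = update_posterior(belief, likelihood)
print(belief)
```

A follow-up recommendation engine of the kind described could, under the same assumptions, rank candidate measurements by how much they are expected to sharpen this posterior relative to their cost.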
Nguyen, Su; Zhang, Mengjie; Tan, Kay Chen
2017-09-01
Automated design of dispatching rules for production systems has been an interesting research topic over the last several years. Machine learning, especially genetic programming (GP), has been a powerful approach to dealing with this design problem. However, intensive computational requirements, accuracy and interpretability are still its limitations. This paper aims to develop a new surrogate-assisted GP to help improve the quality of the evolved rules without significant computational costs. The experiments have verified the effectiveness and efficiency of the proposed algorithms as compared to those in the literature. Furthermore, new simplification and visualisation approaches have also been developed to improve the interpretability of the evolved rules. These approaches have shown great potential and proved to be a critical part of the automated design system.
Applying machine learning to pattern analysis for automated in-design layout optimization
NASA Astrophysics Data System (ADS)
Cain, Jason P.; Fakhry, Moutaz; Pathak, Piyush; Sweis, Jason; Gennari, Frank; Lai, Ya-Chieh
2018-04-01
Building on previous work for cataloging unique topological patterns in an integrated circuit physical design, a new process is defined in which a risk scoring methodology is used to rank patterns based on manufacturing risk. Patterns with high risk are then mapped to functionally equivalent patterns with lower risk. The higher risk patterns are then replaced in the design with their lower risk equivalents. The pattern selection and replacement is fully automated and suitable for use for full-chip designs. Results from 14nm product designs show that the approach can identify and replace risk patterns with quantifiable positive impact on the risk score distribution after replacement.
Automated detection of new impact sites on Martian surface from HiRISE images
NASA Astrophysics Data System (ADS)
Xin, Xin; Di, Kaichang; Wang, Yexin; Wan, Wenhui; Yue, Zongyu
2017-10-01
In this study, an automated method for detecting new Martian impact sites from single images is presented. It first extracts dark areas in the full high-resolution image, then detects new impact craters within the dark areas using a cascade classifier that combines local binary pattern features and Haar-like features and is trained by an AdaBoost machine learning algorithm. Experimental results using 100 HiRISE images show that the overall detection rate of the proposed method is 84.5%, with a true positive rate of 86.9%. The detection rate and true positive rate in the flat regions are 93.0% and 91.5%, respectively.
An automated diagnosis system of liver disease using artificial immune and genetic algorithms.
Liang, Chunlin; Peng, Lingxi
2013-04-01
The rising cost of health care is one of the world's most important problems. Disease prediction is also a vibrant research area. Researchers have approached this problem using various techniques such as support vector machines and artificial neural networks. This study exploits the immune system's characteristics of learning and memory to solve the problem of liver disease diagnosis. The proposed system combines two methods, an artificial immune system and a genetic algorithm, to diagnose liver disease. The system architecture is based on an artificial immune system. The learning procedure of the system adopts a genetic algorithm to intervene in the evolution of the antibody population. The experiments use two benchmark datasets acquired from the UCI machine learning repository. The obtained diagnosis accuracies are very promising compared with other diagnosis systems in the literature. These results suggest that this system may be a useful automatic diagnosis tool for liver disease.
CMU DeepLens: deep learning for automatic image-based galaxy-galaxy strong lens finding
NASA Astrophysics Data System (ADS)
Lanusse, François; Ma, Quanbin; Li, Nan; Collett, Thomas E.; Li, Chun-Liang; Ravanbakhsh, Siamak; Mandelbaum, Rachel; Póczos, Barnabás
2018-01-01
Galaxy-scale strong gravitational lensing can not only provide a valuable probe of the dark matter distribution of massive galaxies, but also provide valuable cosmological constraints, either by studying the population of strong lenses or by measuring time delays in lensed quasars. Due to the rarity of galaxy-scale strongly lensed systems, fast and reliable automated lens finding methods will be essential in the era of large surveys such as Large Synoptic Survey Telescope, Euclid and Wide-Field Infrared Survey Telescope. To tackle this challenge, we introduce CMU DeepLens, a new fully automated galaxy-galaxy lens finding method based on deep learning. This supervised machine learning approach does not require any tuning after the training step which only requires realistic image simulations of strongly lensed systems. We train and validate our model on a set of 20 000 LSST-like mock observations including a range of lensed systems of various sizes and signal-to-noise ratios (S/N). We find on our simulated data set that for a rejection rate of non-lenses of 99 per cent, a completeness of 90 per cent can be achieved for lenses with Einstein radii larger than 1.4 arcsec and S/N larger than 20 on individual g-band LSST exposures. Finally, we emphasize the importance of realistically complex simulations for training such machine learning methods by demonstrating that the performance of models of significantly different complexities cannot be distinguished on simpler simulations. We make our code publicly available at https://github.com/McWilliamsCenter/CMUDeepLens.
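The two figures of merit quoted above, completeness for lenses and rejection rate for non-lenses, both follow from a labelled test set once a score threshold is chosen. The sketch below assumes completeness means the fraction of true lenses recovered and rejection rate means the fraction of non-lenses correctly discarded; the labels, scores and threshold are hypothetical, not results from the paper.

```python
def completeness_and_rejection(labels, scores, threshold):
    """labels: 1 for lens, 0 for non-lens; scores: classifier outputs.
    Returns (completeness, non-lens rejection rate) at the given threshold.
    Assumes the test set contains at least one lens and one non-lens."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical mock catalogue: four lenses followed by four non-lenses.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.2, 0.1, 0.05]
print(completeness_and_rejection(labels, scores, 0.5))
```

Raising the threshold trades completeness for a higher rejection rate, so a statement such as "90 per cent completeness at 99 per cent rejection" identifies one operating point on that trade-off curve.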
You, Zhu-Hong; Lei, Ying-Ke; Zhu, Lin; Xia, Junfeng; Wang, Bing
2013-01-01
Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although a large amount of PPI data for different species has been generated by high-throughput experimental techniques, current PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and further, the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions using only the information of protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequence information. Focusing on dimension reduction, the feature extraction method PCA was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machines removes the dependence of results on initial random weights and improves the prediction performance. When applied to the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at a precision of 87.59%. Extensive experiments were performed to compare our method with the state-of-the-art support vector machine (SVM) technique. Experimental results demonstrate that the proposed PCA-EELM outperforms the SVM method under 5-fold cross-validation. In addition, PCA-EELM runs faster than the PCA-SVM based method. Consequently, the proposed approach can be considered a promising and powerful new tool for predicting PPIs with excellent performance and less time.
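The aggregation of several classifiers into a consensus by majority voting can be sketched in a few lines. The example assumes each model has already produced one label per protein pair; the model outputs below are hypothetical and the extreme learning machines themselves are not implemented here.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """predictions_per_model: one equal-length list of labels per model.
    Returns the most frequent label for each sample."""
    consensus = []
    for votes in zip(*predictions_per_model):
        # most_common(1) yields [(label, count)] for the winning label.
        consensus.append(Counter(votes).most_common(1)[0][0])
    return consensus


# Hypothetical outputs of three models on four protein pairs
# (1 = interacting, 0 = non-interacting).
model_outputs = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
print(majority_vote(model_outputs))
```

Using an odd number of models with binary labels avoids ties, and averaging over differently initialised models is what reduces the dependence on any single set of random weights.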
Mining the Galaxy Zoo Database: Machine Learning Applications
NASA Astrophysics Data System (ADS)
Borne, Kirk D.; Wallin, J.; Vedachalam, A.; Baehr, S.; Lintott, C.; Darg, D.; Smith, A.; Fortson, L.
2010-01-01
The new Zooniverse initiative is addressing the data flood in the sciences through a transformative partnership between professional scientists, volunteer citizen scientists, and machines. As part of this project, we are exploring the application of machine learning techniques to data mining problems associated with the large and growing database of volunteer science results gathered by the Galaxy Zoo citizen science project. We will describe the basic challenge, some machine learning approaches, and early results. One of the motivators for this study is the acquisition (through the Galaxy Zoo results database) of approximately 100 million classification labels for roughly one million galaxies, yielding a tremendously large and rich set of training examples for improving automated galaxy morphological classification algorithms. In our first case study, the goal is to learn which morphological and photometric features in the Sloan Digital Sky Survey (SDSS) database correlate most strongly with user-selected galaxy morphological class. As a corollary to this study, we are also aiming to identify which galaxy parameters in the SDSS database correspond to galaxies that have been the most difficult to classify (based upon large dispersion in their volunteer-provided classifications). Our second case study will focus on similar data mining analyses and machine learning algorithms applied to the Galaxy Zoo catalog of merging and interacting galaxies. The outcomes of this project will have applications in future large sky surveys, such as the LSST (Large Synoptic Survey Telescope) project, which will generate a catalog of 20 billion galaxies and will produce an additional astronomical alert database of approximately 100 thousand events each night for 10 years -- the capabilities and algorithms that we are exploring will assist in the rapid characterization and classification of such massive data streams. This research has been supported in part through NSF award #0941610.
DOT National Transportation Integrated Search
1974-08-01
Volume 3 describes the methodology for man-machine task allocation. It contains a description of man and machine performance capabilities and an explanation of the methodology employed to allocate tasks to human or automated resources. It also presen...
Yadav, Kabir; Sarioglu, Efsun; Choi, Hyeong Ah; Cartwright, Walter B; Hinds, Pamela S; Chamberlain, James M
2016-02-01
The authors have previously demonstrated highly reliable automated classification of free-text computed tomography (CT) imaging reports using a hybrid system that pairs linguistic (natural language processing) and statistical (machine learning) techniques. Previously performed for identifying the outcome of orbital fracture in unprocessed radiology reports from a clinical data repository, the performance has not been replicated for more complex outcomes. To validate automated outcome classification performance of a hybrid natural language processing (NLP) and machine learning system for brain CT imaging reports. The hypothesis was that our system has performance characteristics for identifying pediatric traumatic brain injury (TBI). This was a secondary analysis of a subset of 2,121 CT reports from the Pediatric Emergency Care Applied Research Network (PECARN) TBI study. For that project, radiologists dictated CT reports as free text, which were then deidentified and scanned as PDF documents. Trained data abstractors manually coded each report for TBI outcome. Text was extracted from the PDF files using optical character recognition. The data set was randomly split evenly for training and testing. Training patient reports were used as input to the Medical Language Extraction and Encoding (MedLEE) NLP tool to create structured output containing standardized medical terms and modifiers for negation, certainty, and temporal status. A random subset stratified by site was analyzed using descriptive quantitative content analysis to confirm identification of TBI findings based on the National Institute of Neurological Disorders and Stroke (NINDS) Common Data Elements project. Findings were coded for presence or absence, weighted by frequency of mentions, and past/future/indication modifiers were filtered. After combining with the manual reference standard, a decision tree classifier was created using data mining tools WEKA 3.7.5 and Salford Predictive Miner 7.0. 
Performance of the decision tree classifier was evaluated on the test patient reports. The prevalence of TBI in the sampled population was 159 of 2,217 (7.2%). The automated classification for pediatric TBI is comparable to our prior results, with the notable exception of lower positive predictive value. Manual review of misclassified reports, 95.5% of which were false-positives, revealed that a sizable number of false-positive errors were due to differing outcome definitions between NINDS TBI findings and PECARN clinically important TBI findings and report ambiguity not meeting definition criteria. A hybrid NLP and machine learning automated classification system continues to show promise in coding free-text electronic clinical data. For complex outcomes, it can reliably identify negative reports, but manual review of positive reports may be required. As such, it can still streamline data collection for clinical research and performance improvement. © 2016 by the Society for Academic Emergency Medicine.
PMID:26766600
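The lower positive predictive value reported for this outcome is consistent with how PPV depends on prevalence. The sketch below applies Bayes' rule using the 159 of 2,217 prevalence stated in the abstract; the sensitivity and specificity values are hypothetical placeholders, since the abstract does not report them.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule: true positives over all positive calls."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


prevalence = 159 / 2217  # prevalence of TBI stated in the abstract
# Assumed values for illustration only; not reported in the abstract.
ppv = positive_predictive_value(0.95, 0.95, prevalence)
print(round(ppv, 3))
```

Under these assumed values the PPV comes out at roughly 0.59 even though sensitivity and specificity are both 0.95, which shows why a classifier can reliably rule out negative reports while its positive calls still benefit from manual review.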
Automatic Feature Selection and Improved Classification in SICADA Counterfeit Electronics Detection
2017-03-20
The SICADA methodology was developed to detect such counterfeit microelectronics by collecting power side channel data and applying machine learning...to identify counterfeits. This methodology has been extended to include a two-step automated feature selection process and now uses a one-class SVM...classifier. We describe this methodology and show results for empirical data collected from several types of Microchip dsPIC33F microcontrollers
Event detection for car park entries by video-surveillance
NASA Astrophysics Data System (ADS)
Coquin, Didier; Tailland, Johan; Cintract, Michel
2007-10-01
Intelligent surveillance has become an important research issue due to the high cost and low efficiency of human supervisors, and machine intelligence is required to provide a solution for automated event detection. In this paper we describe a real-time system that has been used for detecting car park entries, using an adaptive background learning algorithm and two indicators representing activity and identity to overcome the difficulty of tracking objects.
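An adaptive background model of the kind mentioned above is often built as an exponential running average of incoming frames. The abstract does not specify the update rule, so the sketch below is an assumed, generic version operating on a flat list of grey-level pixels; the learning rate, threshold and pixel values are hypothetical.

```python
def update_background(background, frame, learning_rate):
    """Blend the new frame into the background model pixel by pixel."""
    return [(1.0 - learning_rate) * b + learning_rate * f
            for b, f in zip(background, frame)]


def foreground_mask(background, frame, threshold):
    """Mark pixels that differ from the background by more than threshold."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]


# Hypothetical four-pixel scene: the third pixel is covered by a vehicle.
background = [10.0, 10.0, 10.0, 10.0]
frame = [10.0, 12.0, 80.0, 9.0]
mask = foreground_mask(background, frame, 25.0)
background = update_background(background, frame, 0.1)
print(mask, background)
```

A small learning rate lets the model absorb slow lighting changes while a passing vehicle still stands out as foreground. Indicators such as the activity measure described in the paper could then be computed from the foreground mask.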
NASA Astrophysics Data System (ADS)
Nieten, Joseph L.; Burke, Roger
1993-03-01
The system diagnostic builder (SDB) is an automated knowledge acquisition tool using state-of-the-art artificial intelligence (AI) technologies. The SDB uses an inductive machine learning technique to generate rules from data sets that are classified by a subject matter expert (SME). Thus, data is captured from the subject system, classified by an expert, and used to drive the rule generation process. These rule-bases are used to represent the observable behavior of the subject system, and to represent knowledge about this system. The rule-bases can be used in any knowledge based system which monitors or controls a physical system or simulation. The SDB has demonstrated the utility of using inductive machine learning technology to generate reliable knowledge bases. In fact, we have discovered that the knowledge captured by the SDB can be used in any number of applications. For example, the knowledge bases captured from the SMS can be used as black box simulations by intelligent computer aided training devices. We can also use the SDB to construct knowledge bases for the process control industry, such as chemical production, or oil and gas production. These knowledge bases can be used in automated advisory systems to ensure safety, productivity, and consistency.
Automated Item Generation with Recurrent Neural Networks.
von Davier, Matthias
2018-03-12
Utilizing technology for automated item generation is not a new idea. However, test items used in commercial testing programs or in research are still predominantly written by humans, in most cases by content experts or professional item writers. Human experts are a limited resource and testing agencies incur high costs in the process of continuous renewal of item banks to sustain testing programs. Using algorithms instead holds the promise of providing unlimited resources for this crucial part of assessment development. The approach presented here deviates in several ways from previous attempts to solve this problem. In the past, automatic item generation relied either on generating clones of narrowly defined item types such as those found in language-free intelligence tests (e.g., Raven's progressive matrices) or on an extensive analysis of task components and derivation of schemata to produce items with pre-specified variability that are hoped to have predictable levels of difficulty. It is somewhat unlikely that researchers utilizing these previous approaches would look at the proposed approach with favor; however, recent applications of machine learning show success in solving tasks that seemed impossible for machines not too long ago. The proposed approach uses deep learning to implement probabilistic language models, not unlike what Google Brain and Amazon Alexa use for language processing and generation.
Oscar, Nels; Fox, Pamela A; Croucher, Racheal; Wernick, Riana; Keune, Jessica; Hooker, Karen
2017-09-01
Social scientists need practical methods for harnessing large, publicly available datasets that inform the social context of aging. We describe our development of a semi-automated text coding method and use a content analysis of Alzheimer's disease (AD) and dementia portrayal on Twitter to demonstrate its use. The approach improves feasibility of examining large publicly available datasets. Machine learning techniques modeled stigmatization expressed in 31,150 AD-related tweets collected via Twitter's search API based on 9 AD-related keywords. Two researchers manually coded 311 random tweets on 6 dimensions. This input from 1% of the dataset was used to train a classifier against the tweet text and code the remaining 99% of the dataset. Our automated process identified that 21.13% of the AD-related tweets used AD-related keywords to perpetuate public stigma, which could impact stereotypes and negative expectations for individuals with the disease and increase "excess disability". This technique could be applied to questions in social gerontology related to how social media outlets reflect and shape attitudes bearing on other developmental outcomes. Recommendations for the collection and analysis of large Twitter datasets are discussed. © The Author 2017. Published by Oxford University Press on behalf of The Gerontological Society of America.
Automated Tissue Classification Framework for Reproducible Chronic Wound Assessment
Mukherjee, Rashmi; Manohar, Dhiraj Dhane; Das, Dev Kumar; Achar, Arun; Mitra, Analava; Chakraborty, Chandan
2014-01-01
The aim of this paper was to develop a computer assisted tissue classification (granulation, necrotic, and slough) scheme for chronic wound (CW) evaluation using medical image processing and statistical machine learning techniques. The red-green-blue (RGB) wound images captured with a standard digital camera were first transformed into HSI (hue, saturation, and intensity) color space and subsequently the “S” component of HSI color channels was selected as it provided higher contrast. Wound areas from 6 different types of CW were segmented from whole images using fuzzy divergence based thresholding by minimizing edge ambiguity. A set of color and textural features describing granulation, necrotic, and slough tissues in the segmented wound area were extracted using various mathematical techniques. Finally, statistical learning algorithms, namely, Bayesian classification and support vector machine (SVM), were trained and tested for wound tissue classification in different CW images. The performance of the wound area segmentation protocol was further validated by ground truth images labeled by clinical experts. It was observed that SVM with 3rd order polynomial kernel provided the highest accuracies, that is, 86.94%, 90.47%, and 75.53%, for classifying granulation, slough, and necrotic tissues, respectively. The proposed automated tissue classification technique achieved the highest overall accuracy, that is, 87.61%, with highest kappa statistic value (0.793). PMID:25114925
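The "S" component mentioned above can be computed per pixel with the standard HSI saturation formula, S = 1 - 3 min(R, G, B) / (R + G + B). The sketch below assumes that textbook definition; the paper's exact conversion is not given in the abstract, and the function name and pixel values are illustrative only.

```python
def saturation(r, g, b):
    """Saturation of the HSI model: S = 1 - 3*min(R, G, B) / (R + G + B)."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: saturation is undefined, 0 by convention
    return 1.0 - 3.0 * min(r, g, b) / total


pure_red = saturation(255, 0, 0)      # fully saturated
mid_gray = saturation(128, 128, 128)  # no color content
reddish = saturation(200, 100, 100)   # partially saturated
```

Gray and white regions map to values near 0 while strongly colored tissue maps toward 1, which is consistent with the abstract's remark that this channel gave higher contrast.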
Advanced Airframe Structural Materials: A Primer and Cost Estimating Methodology
1991-01-01
laying machines for larger, mildly contoured parts such as wing and stabilizer skins. For such parts, automated tape laying machines can operate many...heat guns (90-130°F). However, thermoplastics require as much as 650°F for forming. Automated tape laying machines for these materials use warm...cycles to properly seat the plies onto the tool. This time-consuming process can sometimes be eliminated or reduced by the use of automated tape laying procedures
Translation: Aids, Robots, and Automation.
ERIC Educational Resources Information Center
Andreyewsky, Alexander
1981-01-01
Examines electronic aids to translation both as ways to automate it and as an approach to solve problems resulting from shortage of qualified translators. Describes the limitations of robotic MT (Machine Translation) systems, viewing MAT (Machine-Aided Translation) as the only practical solution and the best vehicle for further automation. (MES)
Industrial Arts Curriculum Guide for Automated Machining in Metals Technology.
ERIC Educational Resources Information Center
1985
This curriculum guide is designed to be used for creating programs in automated machining education in Connecticut. The first sections of the guide are introductory, explaining the importance of computer-numerically controlled machines, describing the industrial arts scope and sequence for kindergarten through adult levels, describing the…
NASA Astrophysics Data System (ADS)
Wang, Dongyi; Vinson, Robert; Holmes, Maxwell; Seibel, Gary; Tao, Yang
2018-04-01
The Atlantic blue crab is among the highest-valued seafood found on the American Eastern Seaboard. Currently, the crab processing industry is highly dependent on manual labor. However, there is great potential for vision-guided intelligent machines to automate the meat picking process. Studies show that the back-fin knuckles are robust features containing information about a crab's size, orientation, and the position of the crab's meat compartments. Our studies also make it clear that detecting the knuckles reliably in images is challenging due to the knuckle's small size, anomalous shape, and similarity to joints in the legs and claws. An accurate and reliable computer vision algorithm was proposed to detect the crab's back-fin knuckles in digital images. Convolutional neural networks (CNNs) can localize rough knuckle positions with 97.67% accuracy, transforming a global detection problem into a local detection problem. Compared to the rough localization based on human experience or other machine learning classification methods, the CNN shows the best localization results. Within the rough knuckle position, a k-means clustering method is able to further extract the exact knuckle positions based on the back-fin knuckle color features. The exact knuckle position can help us to generate a crab cutline in the XY plane using a template matching method. This is a pioneering research project in crab image analysis and offers advanced machine intelligence for automated crab processing.
Automated expert modeling for automated student evaluation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abbott, Robert G.
The 8th International Conference on Intelligent Tutoring Systems provides a leading international forum for the dissemination of original results in the design, implementation, and evaluation of intelligent tutoring systems and related areas. The conference draws researchers from a broad spectrum of disciplines ranging from artificial intelligence and cognitive science to pedagogy and educational psychology. The conference explores intelligent tutoring systems' increasing real-world impact on an increasingly global scale. Improved authoring tools and learning object standards enable fielding systems and curricula in real-world settings on an unprecedented scale. Researchers deploy ITSs in ever larger studies and increasingly use data from real students, tasks, and settings to guide new research. With high volumes of student interaction data, data mining, and machine learning, tutoring systems can learn from experience and improve their teaching performance. The increasing number of realistic evaluation studies also broadens researchers' knowledge about the educational contexts for which ITSs are best suited. At the same time, researchers explore how to expand and improve ITS/student communications, for example, how to achieve more flexible and responsive discourse with students, help students integrate Web resources into learning, use mobile technologies and games to enhance student motivation and learning, and address multicultural perspectives.
Automated melanoma recognition in dermoscopic images based on extreme learning machine (ELM)
NASA Astrophysics Data System (ADS)
Rahman, Md. Mahmudur; Alpaslan, Nuh
2017-03-01
Melanoma is considered a major health problem since it is the deadliest form of skin cancer. Early diagnosis through periodic screening with dermoscopic images can significantly improve the survival rate as well as reduce the treatment cost and consequent suffering of patients. Dermoscopy, or skin surface microscopy, provides in vivo inspection of color and morphologic structures of pigmented skin lesions (PSLs), rendering higher accuracy for detecting suspicious cases than is possible with naked-eye inspection. However, interpretation of dermoscopic images is time consuming and subjective, even for trained dermatologists. Therefore, there is currently great interest in the development of computer-aided diagnosis (CAD) systems for automated melanoma recognition. However, the majority of CAD systems are still in the early development stage, lacking descriptive feature generation and benchmark evaluation on ground-truth datasets. This work addresses the various issues related to the development of such a CAD system, with effective feature extraction from the Non-Subsampled Contourlet Transform (NSCT) and Eig(Hess) histogram of oriented gradients (HOG), lesion classification with an efficient Extreme Learning Machine (ELM), chosen for its good generalization ability and high learning efficiency, and evaluation of its effectiveness on a benchmark data set of dermoscopic images, toward the goal of realistic comparison and real clinical integration. The proposed research on melanoma recognition has huge potential for offering powerful services that would significantly benefit present Biomedical Information Systems.
Saha, Sajib Kumar; Fernando, Basura; Cuadros, Jorge; Xiao, Di; Kanagasingam, Yogesan
2018-04-27
Fundus images in a telemedicine program are acquired at different sites and captured by people with varying levels of experience. This results in a relatively high percentage of images which are later marked as unreadable by graders. Unreadable images require a recapture, which is time and cost intensive. An automated method that determines the image quality during acquisition is an effective alternative. To determine the image quality during acquisition, we describe here an automated method for the assessment of image quality in the context of diabetic retinopathy (DR). The method explicitly applies machine learning techniques to assess the image and to determine 'accept' and 'reject' categories. An image in the 'reject' category requires a recapture. A deep convolutional neural network is trained to grade the images automatically. A large representative set of 7000 colour fundus images was used for the experiment, obtained from EyePACS and made available by the California Healthcare Foundation. Three retinal image analysis experts were employed to categorise these images into 'accept' and 'reject' classes based on the precise definition of image quality in the context of DR. The network was trained using 3428 images. The method shows an accuracy of 100% in categorising 'accept' and 'reject' images, which is about 2% higher than the traditional machine learning method. In a clinical trial, the proposed method shows 97% agreement with a human grader. The method can be easily incorporated into the fundus image capturing system in the acquisition centre and can guide the photographer on whether a recapture is necessary.
Unresolved Galaxy Classifier for ESA/Gaia mission: Support Vector Machines approach
NASA Astrophysics Data System (ADS)
Bellas-Velidis, Ioannis; Kontizas, Mary; Dapergolas, Anastasios; Livanou, Evdokia; Kontizas, Evangelos; Karampelas, Antonios
A software package, Unresolved Galaxy Classifier (UGC), is being developed for the ground-based pipeline of ESA's Gaia mission. It aims to provide automated taxonomic classification and specific parameter estimation by analyzing Gaia BP/RP instrument low-dispersion spectra of unresolved galaxies. The UGC algorithm is based on a supervised learning technique, Support Vector Machines (SVM). The software is implemented in Java as two separate modules. An offline learning module provides functions for training SVM models. Once trained, the set of models can be repeatedly applied to unknown galaxy spectra by the pipeline's application module. A library of synthetic galaxy model spectra, simulated for the BP/RP instrument, is used to train and test the modules. Science tests show a very good classification performance of UGC and relatively good regression performance, except for some of the parameters. Possible approaches to improve the performance are discussed.
Structural damage detection using deep learning of ultrasonic guided waves
NASA Astrophysics Data System (ADS)
Melville, Joseph; Alguri, K. Supreet; Deemer, Chris; Harley, Joel B.
2018-04-01
Structural health monitoring using ultrasonic guided waves relies on accurate interpretation of guided wave propagation to distinguish damage state indicators. However, traditional physics based models do not provide an accurate representation, and classic data driven techniques, such as a support vector machine, are too simplistic to capture the complex nature of ultrasonic guided waves. To address this challenge, this paper uses a deep learning interpretation of ultrasonic guided waves to achieve fast, accurate, and automated structural damage detection. To achieve this, full wavefield scans of thin metal plates are used, half from the undamaged state and half from the damaged state. This data is used to train our deep network to predict the damage state of a plate with 99.98% accuracy given signals from just 10 spatial locations on the plate, as compared to that of a support vector machine (SVM), which achieved a 62% accuracy.
Learning Motion Features for Example-Based Finger Motion Estimation for Virtual Characters
NASA Astrophysics Data System (ADS)
Mousas, Christos; Anagnostopoulos, Christos-Nikolaos
2017-09-01
This paper presents a methodology for estimating the motion of a character's fingers based on the use of motion features provided by a virtual character's hand. In the presented methodology, firstly, the motion data is segmented into discrete phases. Then, a number of motion features are computed for each motion segment of a character's hand. The motion features are pre-processed using restricted Boltzmann machines, and by using the different variations of semantically similar finger gestures in a support vector machine learning mechanism, the optimal weights for each feature assigned to a metric are computed. The advantages of the presented methodology in comparison to previous solutions are the following: First, we automate the computation of optimal weights that are assigned to each motion feature counted in our metric. Second, the presented methodology achieves an increase (about 17%) in correctly estimated finger gestures in comparison to a previous method.
Machine learning in motion control
NASA Technical Reports Server (NTRS)
Su, Renjeng; Kermiche, Noureddine
1989-01-01
The existing methodologies for robot programming originate primarily from robotic applications to manufacturing, where uncertainties of the robots and their task environment may be minimized by repeated off-line modeling and identification. In space applications of robots, however, a higher degree of automation is required for robot programming because of the desire to minimize human intervention. We discuss a new paradigm of robotic programming which is based on the concept of machine learning. The goal is to let robots practice tasks by themselves and to use the operational data to automatically improve their motion performance. The underlying mathematical problem is to solve the dynamical inverse problem by iterative methods. One of the key questions is how to ensure the convergence of the iterative process. A few small steps have been taken toward this important approach to robot programming. We give a representative result on the convergence problem.
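The convergence question raised above can be illustrated with a toy example. This is a sketch under strong assumptions, not the authors' formulation: the "robot" is a hypothetical static plant y = 2u that the learner cannot see, and the update rule u <- u + gain * error is a generic iterative-learning step. For this plant the error shrinks by the factor (1 - 2 * gain) on every trial, so practice converges only when |1 - 2 * gain| < 1.

```python
def plant(u):
    """Hypothetical robot response, unknown to the learner: y = 2.0 * u."""
    return 2.0 * u


def learn_input(target, gain, trials):
    """Repeat the task, correcting the input by gain * error after each trial."""
    u = 0.0
    errors = []
    for _ in range(trials):
        error = target - plant(u)
        errors.append(abs(error))
        u += gain * error
    return u, errors


# gain = 0.3 gives a contraction factor of 0.4: the error shrinks each trial.
u_good, errors_good = learn_input(target=1.0, gain=0.3, trials=50)

# gain = 1.2 gives a factor of -1.4: the error grows and practice diverges.
u_bad, errors_bad = learn_input(target=1.0, gain=1.2, trials=10)
```

The learner recovers the inverse of the plant (u close to 0.5 for a target of 1.0) without ever modeling it, which is the appeal of the approach; the second run shows why a convergence condition is needed.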
Applying data fusion techniques for benthic habitat mapping and monitoring in a coral reef ecosystem
NASA Astrophysics Data System (ADS)
Zhang, Caiyun
2015-06-01
Accurate mapping and effective monitoring of benthic habitat in the Florida Keys are critical in developing management strategies for this valuable coral reef ecosystem. For this study, a framework was designed for automated benthic habitat mapping by combining multiple data sources (hyperspectral, aerial photography, and bathymetry data) and four contemporary imagery processing techniques (data fusion, Object-based Image Analysis (OBIA), machine learning, and ensemble analysis). In the framework, 1-m digital aerial photograph was first merged with 17-m hyperspectral imagery and 10-m bathymetry data using a pixel/feature-level fusion strategy. The fused dataset was then preclassified by three machine learning algorithms (Random Forest, Support Vector Machines, and k-Nearest Neighbor). Final object-based habitat maps were produced through ensemble analysis of outcomes from three classifiers. The framework was tested for classifying a group-level (3-class) and code-level (9-class) habitats in a portion of the Florida Keys. Informative and accurate habitat maps were achieved with an overall accuracy of 88.5% and 83.5% for the group-level and code-level classifications, respectively.
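One simple way to combine the outputs of three classifiers is a per-object majority vote. The sketch below assumes that scheme for illustration; the abstract says only "ensemble analysis" and does not specify the fusion rule, and the class names and predictions are made up.

```python
from collections import Counter


def majority_vote(*predictions):
    """Fuse per-object labels from several classifiers by majority vote.

    When every classifier disagrees, the label from the first classifier
    wins, because Counter.most_common keeps first-seen order among ties.
    """
    fused = []
    for labels in zip(*predictions):
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


# Hypothetical labels for three image objects from each classifier.
random_forest = ["coral", "sand", "seagrass"]
svm = ["coral", "seagrass", "seagrass"]
knn = ["sand", "sand", "seagrass"]

habitat_map = majority_vote(random_forest, svm, knn)
```

Ordering the arguments by expected reliability is one way to make the tie-breaking behavior deliberate rather than accidental.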
Automated annotation of functional imaging experiments via multi-label classification
Turner, Matthew D.; Chakrabarti, Chayan; Jones, Thomas B.; Xu, Jiawei F.; Fox, Peter T.; Luger, George F.; Laird, Angela R.; Turner, Jessica A.
2013-01-01
Identifying the experimental methods in human neuroimaging papers is important for grouping meaningfully similar experiments for meta-analyses. Currently, this can only be done by human readers. We present the performance of common machine learning (text mining) methods applied to the problem of automatically classifying or labeling this literature. Labeling terms are from the Cognitive Paradigm Ontology (CogPO), the text corpora are abstracts of published functional neuroimaging papers, and the methods use the performance of a human expert as training data. We aim to replicate the expert's annotation of multiple labels per abstract identifying the experimental stimuli, cognitive paradigms, response types, and other relevant dimensions of the experiments. We use several standard machine learning methods: naive Bayes (NB), k-nearest neighbor, and support vector machines (specifically SMO or sequential minimal optimization). Exact match performance ranged from only 15% in the worst cases to 78% in the best cases. NB methods combined with binary relevance transformations performed strongly and were robust to overfitting. This collection of results demonstrates what can be achieved with off-the-shelf software components and little to no pre-processing of raw text. PMID:24409112
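Exact match is a strict multi-label score: an abstract counts as correct only if its full predicted label set equals the expert's. The sketch below contrasts it with Hamming loss, which gives partial credit per label. The label names are hypothetical placeholders rather than actual CogPO terms, and the data are invented.

```python
def exact_match_ratio(true_labels, predicted_labels):
    """Share of items whose predicted label set equals the true set exactly."""
    matches = sum(
        1 for t, p in zip(true_labels, predicted_labels) if set(t) == set(p)
    )
    return matches / len(true_labels)


def hamming_loss(true_labels, predicted_labels, vocabulary):
    """Share of individual label decisions (item x label) that are wrong."""
    wrong = sum(
        len(set(t) ^ set(p)) for t, p in zip(true_labels, predicted_labels)
    )
    return wrong / (len(true_labels) * len(vocabulary))


vocabulary = ["visual", "auditory", "button_press", "n_back"]
truth = [{"visual", "button_press"}, {"auditory"}]
predicted = [{"visual", "button_press"}, {"auditory", "n_back"}]

exact = exact_match_ratio(truth, predicted)
loss = hamming_loss(truth, predicted, vocabulary)
```

Here one spurious label on the second abstract halves the exact-match score while changing only one of eight label decisions, which helps explain why exact-match figures can look low even when most labels are right.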
Kuhn, Stefan; Egert, Björn; Neumann, Steffen; Steinbeck, Christoph
2008-09-25
Current efforts in Metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to leverage this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods to perform predictions of proton NMR spectra based on data from our open database NMRShiftDB. A mean absolute error of 0.18 ppm was achieved for the prediction of proton NMR shifts ranging from 0 to 11 ppm. Random forest, J48 decision tree and support vector machines achieved similar overall errors. HOSE codes being a notably simple method achieved a comparatively good result of 0.17 ppm mean absolute error. NMR prediction methods applied in the course of this work delivered precise predictions which can serve as a building block for Computer-Assisted Structure Elucidation for biological metabolites.
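The mean absolute error reported above is the average magnitude of the difference between observed and predicted shifts, in ppm. A minimal sketch follows; the shift values are invented for illustration and are not taken from NMRShiftDB.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical proton shifts in ppm.
observed_ppm = [7.26, 3.70, 1.20]
predicted_ppm = [7.10, 3.90, 1.32]

mae = mean_absolute_error(observed_ppm, predicted_ppm)
```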
Moody, Daniela I.; Brumby, Steven P.; Rowland, Joel C.; ...
2014-12-09
We present results from an ongoing effort to extend neuromimetic machine vision algorithms to multispectral data using adaptive signal processing combined with compressive sensing and machine learning techniques. Our goal is to develop a robust classification methodology that will allow for automated discretization of the landscape into distinct units based on attributes such as vegetation, surface hydrological properties, and topographic/geomorphic characteristics. We use a Hebbian learning rule to build spectral-textural dictionaries that are tailored for classification. We learn our dictionaries from millions of overlapping multispectral image patches and then use a pursuit search to generate classification features. Land cover labels are automatically generated using unsupervised clustering of sparse approximations (CoSA). We demonstrate our method on multispectral WorldView-2 data from a coastal plain ecosystem in Barrow, Alaska. We explore learning from both raw multispectral imagery and normalized band difference indices. We explore a quantitative metric to evaluate the spectral properties of the clusters in order to potentially aid in assigning land cover categories to the cluster labels. In this study, our results suggest CoSA is a promising approach to unsupervised land cover classification in high-resolution satellite imagery.
Deep learning based classification of breast tumors with shear-wave elastography.
Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong
2016-12-01
This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer. Copyright © 2016 Elsevier B.V. All rights reserved.
Automation and robotics technology for intelligent mining systems
NASA Technical Reports Server (NTRS)
Welsh, Jeffrey H.
1989-01-01
The U.S. Bureau of Mines is approaching the problems of accidents and efficiency in the mining industry through the application of automation and robotics to mining systems. This technology can increase safety by removing workers from hazardous areas of the mines or from performing hazardous tasks. The short-term goal of the Automation and Robotics program is to develop technology that can be implemented in the form of an autonomous mining machine using current continuous mining machine equipment. In the longer term, the goal is to conduct research that will lead to new intelligent mining systems that capitalize on the capabilities of robotics. The Bureau of Mines Automation and Robotics program has been structured to produce the technology required for the short- and long-term goals. The short-term goal of application of automation and robotics to an existing mining machine, resulting in autonomous operation, is expected to be accomplished within five years. Key technology elements required for an autonomous continuous mining machine are well underway and include machine navigation systems, coal-rock interface detectors, machine condition monitoring, and intelligent computer systems. The Bureau of Mines program is described, including status of key technology elements for an autonomous continuous mining machine, the program schedule, and future work. Although the program is directed toward underground mining, much of the technology being developed may have applications for space systems or mining on the Moon or other planets.
A Deep Learning Approach for Fault Diagnosis of Induction Motors in Manufacturing
NASA Astrophysics Data System (ADS)
Shao, Si-Yu; Sun, Wen-Jun; Yan, Ru-Qiang; Wang, Peng; Gao, Robert X.
2017-11-01
Extracting features from original signals is a key procedure for traditional fault diagnosis of induction motors, as it directly influences the performance of fault recognition. However, high quality features need expert knowledge and human intervention. In this paper, a deep learning approach based on deep belief networks (DBN) is developed to learn features from frequency distribution of vibration signals with the purpose of characterizing working status of induction motors. It combines feature extraction procedure with classification task together to achieve automated and intelligent fault diagnosis. The DBN model is built by stacking multiple-units of restricted Boltzmann machine (RBM), and is trained using layer-by-layer pre-training algorithm. Compared with traditional diagnostic approaches where feature extraction is needed, the presented approach has the ability of learning hierarchical representations, which are suitable for fault classification, directly from frequency distribution of the measurement data. The structure of the DBN model is investigated as the scale and depth of the DBN architecture directly affect its classification performance. Experimental study conducted on a machine fault simulator verifies the effectiveness of the deep learning approach for fault diagnosis of induction motors. This research proposes an intelligent diagnosis method for induction motor which utilizes deep learning model to automatically learn features from sensor data and realize working status recognition.
Pain Intensity Recognition Rates via Biopotential Feature Patterns with Support Vector Machines
Gruss, Sascha; Treister, Roi; Werner, Philipp; Traue, Harald C.; Crawcour, Stephen; Andrade, Adriano; Walter, Steffen
2015-01-01
Background: The clinically used methods of pain diagnosis do not allow for objective and robust measurement, and physicians must rely on the patient’s report of the pain sensation. Verbal scales, visual analog scales (VAS) or numeric rating scales (NRS) count among the most common tools, which are restricted to patients with normal mental abilities. There also exist instruments for pain assessment in people with verbal and/or cognitive impairments and instruments for pain assessment in people who are sedated and mechanically ventilated. However, all these diagnostic methods either have limited reliability and validity or are very time-consuming. In contrast, biopotentials can be automatically analyzed with machine learning algorithms to provide a surrogate measure of pain intensity. Methods: In this context, we created a database of biopotentials to advance an automated pain recognition system, determine its theoretical testing quality, and optimize its performance. Eighty-five participants were subjected to painful heat stimuli (baseline, pain threshold, two intermediate thresholds, and pain tolerance threshold) under controlled conditions and the signals of electromyography, skin conductance level, and electrocardiography were collected. A total of 159 features were extracted from the mathematical groupings of amplitude, frequency, stationarity, entropy, linearity, variability, and similarity. Results: We achieved classification rates of 90.94% for baseline vs. pain tolerance threshold and 79.29% for baseline vs. pain threshold. The most selected pain features stemmed from the amplitude and similarity group and were derived from facial electromyography. Conclusion: The machine learning measurement of pain in patients could provide valuable information for a clinical team and thus support the treatment assessment. PMID:26474183
Li, Yanxin; Knoll, Joan H; Wilkins, Ruth C; Flegal, Farrah N; Rogan, Peter K
2016-05-01
Dose from radiation exposure can be estimated from dicentric chromosome (DC) frequencies in metaphase cells of peripheral blood lymphocytes. We automated DC detection by extracting features in Giemsa-stained metaphase chromosome images and classifying objects by machine learning (ML). DC detection involves (i) intensity thresholded segmentation of metaphase objects, (ii) chromosome separation by watershed transformation and elimination of inseparable chromosome clusters, fragments and staining debris using a morphological decision tree filter, (iii) determination of chromosome width and centreline, (iv) derivation of centromere candidates, and (v) distinction of DCs from monocentric chromosomes (MC) by ML. Centromere candidates are inferred from 14 image features input to a Support Vector Machine (SVM). Sixteen features derived from these candidates are then supplied to a Boosting classifier and a second SVM which determines whether a chromosome is either a DC or MC. The SVM was trained with 292 DCs and 3135 MCs, and then tested with cells exposed to either low (1 Gy) or high (2-4 Gy) radiation dose. Results were then compared with those of 3 experts. True positive rates (TPR) and positive predictive values (PPV) were determined for the tuning parameter, σ. At larger σ, PPV decreases and TPR increases. At high dose, for σ = 1.3, TPR = 0.52 and PPV = 0.83, while at σ = 1.6, the TPR = 0.65 and PPV = 0.72. At low dose and σ = 1.3, TPR = 0.67 and PPV = 0.26. The algorithm differentiates DCs from MCs, overlapped chromosomes and other objects with acceptable accuracy over a wide range of radiation exposures. © 2016 Wiley Periodicals, Inc.
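The two reported metrics follow directly from detection counts: TPR = TP / (TP + FN) is the share of real dicentrics that were found, and PPV = TP / (TP + FP) is the share of flagged objects that were real. A minimal sketch with invented counts (not the study's data) is shown below.

```python
def true_positive_rate(tp, fn):
    """Share of actual positives that were detected (sensitivity, recall)."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Share of detections that are actual positives (precision)."""
    return tp / (tp + fp)


# Hypothetical counts: 40 dicentrics found, 10 missed, 20 false alarms.
tpr = true_positive_rate(tp=40, fn=10)
ppv = positive_predictive_value(tp=40, fp=20)
```

Loosening a detection threshold typically moves objects from the missed column to the found column while also adding false alarms, which matches the trade-off the abstract reports as σ grows.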
Memarian, Negar; Kim, Sally; Dewar, Sandra; Engel, Jerome; Staba, Richard J
2015-09-01
This study sought to predict postsurgical seizure freedom from pre-operative diagnostic test results and clinical information using a rapid automated approach, based on supervised learning methods in patients with drug-resistant focal seizures suspected to begin in temporal lobe. We applied machine learning, specifically a combination of mutual information-based feature selection and supervised learning classifiers on multimodal data, to predict surgery outcome retrospectively in 20 presurgical patients (13 female; mean age±SD, in years 33±9.7 for females, and 35.3±9.4 for males) who were diagnosed with mesial temporal lobe epilepsy (MTLE) and subsequently underwent standard anteromesial temporal lobectomy. The main advantage of the present work over previous studies is the inclusion of the extent of ipsilateral neocortical gray matter atrophy and spatiotemporal properties of depth electrode-recorded seizures as training features for individual patient surgery planning. A maximum relevance minimum redundancy (mRMR) feature selector identified the following features as the most informative predictors of postsurgical seizure freedom in this study's sample of patients: family history of epilepsy, ictal EEG onset pattern (positive correlation with seizure freedom), MRI-based gray matter thickness reduction in the hemisphere ipsilateral to seizure onset, proportion of seizures that first appeared in ipsilateral amygdala to total seizures, age, epilepsy duration, delay in the spread of ipsilateral ictal discharges from site of onset, gender, and number of electrode contacts at seizure onset (negative correlation with seizure freedom). Using these features in combination with a least square support vector machine (LS-SVM) classifier compared to other commonly used classifiers resulted in very high surgical outcome prediction accuracy (95%). 
Supervised machine learning using multimodal compared to unimodal data accurately predicted postsurgical outcome in patients with atypical MTLE. Published by Elsevier Ltd.
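As a rough illustration of the maximum relevance minimum redundancy idea mentioned in this abstract, the sketch below greedily selects discrete features by mutual information with the target minus mean mutual information with features already chosen. This is a simplified difference-form variant written for illustration, not the authors' implementation; the feature names and toy data are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def mrmr(features, target, n_select):
    """Greedy selection: relevance to target minus mean redundancy."""
    selected = []
    remaining = sorted(features)  # sorted so ties break deterministically
    while remaining and len(selected) < n_select:
        def score(name):
            relevance = mutual_information(features[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            ) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (assumption): feat_a_copy duplicates feat_a, so it is redundant.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "feat_a": [0, 0, 0, 0, 1, 1, 1, 0],
    "feat_a_copy": [0, 0, 0, 0, 1, 1, 1, 0],
    "feat_b": [0, 1, 0, 1, 0, 1, 1, 1],
}
selected = mrmr(features, target, n_select=2)
```

In this toy case the duplicate feature is skipped in favor of a weaker but non-redundant one, which is the behavior that distinguishes this kind of selector from ranking by relevance alone.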
Applying Machine Learning to Star Cluster Classification
NASA Astrophysics Data System (ADS)
Fedorenko, Kristina; Grasha, Kathryn; Calzetti, Daniela; Mahadevan, Sridhar
2016-01-01
Catalogs describing populations of star clusters are essential in investigating a range of important issues, from star formation to galaxy evolution. Star cluster catalogs are typically created in a two-step process: in the first step, a catalog of sources is automatically produced; in the second step, each of the extracted sources is visually inspected by 3-to-5 human classifiers and assigned a category. Classification by humans is labor-intensive and time-consuming; it thus creates a bottleneck and substantially slows down progress in star cluster research. We seek to automate the process of labeling star clusters (the second step) by applying supervised machine learning techniques. This will provide a fast, objective, and reproducible classification. Our data are HST (WFC3 and ACS) images of galaxies in the distance range of 3.5-12 Mpc, with a few thousand star clusters already classified by humans as a part of the LEGUS (Legacy ExtraGalactic UV Survey) project. The classification is based on 4 labels (Class 1 - symmetric, compact cluster; Class 2 - concentrated object with some degree of asymmetry; Class 3 - multiple peak system, diffuse; and Class 4 - spurious detection). We start by looking at basic machine learning methods such as decision trees. We then proceed to evaluate the performance of more advanced techniques, focusing on convolutional neural networks and other Deep Learning methods. We analyze the results and suggest several directions for further improvement.
Screening Electronic Health Record-Related Patient Safety Reports Using Machine Learning.
Marella, William M; Sparnon, Erin; Finley, Edward
2017-03-01
The objective of this study was to develop a semiautomated approach to screening cases that describe hazards associated with the electronic health record (EHR) from a mandatory, population-based patient safety reporting system. Potentially relevant cases were identified through a query of the Pennsylvania Patient Safety Reporting System. A random sample of cases were manually screened for relevance and divided into training, testing, and validation data sets to develop a machine learning model. This model was used to automate screening of remaining potentially relevant cases. Of the 4 algorithms tested, a naive Bayes kernel performed best, with an area under the receiver operating characteristic curve of 0.927 ± 0.023, accuracy of 0.855 ± 0.033, and F score of 0.877 ± 0.027. The machine learning model and text mining approach described here are useful tools for identifying and analyzing adverse event and near-miss reports. Although reporting systems are beginning to incorporate structured fields on health information technology and the EHR, these methods can identify related events that reporters classify in other ways. These methods can facilitate analysis of legacy safety reports by retrieving health information technology-related and EHR-related events from databases without fields and controlled values focused on this subject and distinguishing them from reports in which the EHR is mentioned only in passing. Machine learning and text mining are useful additions to the patient safety toolkit and can be used to semiautomate screening and analysis of unstructured text in safety reports from frontline staff.
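The division of manually screened cases into training, testing, and validation sets described in this abstract might look like the sketch below. The split fractions, the seed, and the function name are assumptions for illustration; the abstract does not state the proportions used.

```python
import random

def three_way_split(items, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle items reproducibly and cut them into three disjoint sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return train, test, validation

# 100 hypothetical case identifiers
train, test, validation = three_way_split(range(100))
```

Keeping the three sets disjoint is what allows the validation figures to be read as an estimate of performance on reports the model has not seen.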
Meng, Qier; Kitasaka, Takayuki; Nimura, Yukitaka; Oda, Masahiro; Ueno, Junji; Mori, Kensaku
2017-02-01
Airway segmentation plays an important role in analyzing chest computed tomography (CT) volumes for computerized lung cancer detection, emphysema diagnosis and pre- and intra-operative bronchoscope navigation. However, obtaining a complete 3D airway tree structure from a CT volume is quite a challenging task. Several researchers have proposed automated airway segmentation algorithms based on region growing and machine learning techniques. However, these methods fail to detect the peripheral bronchial branches, which results in a large amount of leakage. This paper presents a novel approach for more accurate extraction of the complex airway tree. The proposed segmentation method is composed of three steps. First, Hessian analysis is utilized to enhance the tube-like structure in CT volumes; then, an adaptive multiscale cavity enhancement filter is employed to detect the cavity-like structure with different radii. In the second step, support vector machine learning is utilized to remove the false positive (FP) regions from the result obtained in the previous step. Finally, the graph-cut algorithm is used to refine the candidate voxels to form an integrated airway tree. A test dataset including 50 standard-dose chest CT volumes was used for evaluating our proposed method. The average extraction rate was about 79.1% with a significantly decreased FP rate. A new method of airway segmentation based on local intensity structure and machine learning technique was developed. The method was shown to be feasible for airway segmentation in a computer-aided diagnosis system for a lung and bronchoscope guidance system.
Whole brain white matter connectivity analysis using machine learning: An application to autism.
Zhang, Fan; Savadjiev, Peter; Cai, Weidong; Song, Yang; Rathi, Yogesh; Tunç, Birkan; Parker, Drew; Kapur, Tina; Schultz, Robert T; Makris, Nikos; Verma, Ragini; O'Donnell, Lauren J
2018-05-15
In this paper, we propose an automated white matter connectivity analysis method for machine learning classification and characterization of white matter abnormality via identification of discriminative fiber tracts. The proposed method uses diffusion MRI tractography and a data-driven approach to find fiber clusters corresponding to subdivisions of the white matter anatomy. Features extracted from each fiber cluster describe its diffusion properties and are used for machine learning. The method is demonstrated by application to a pediatric neuroimaging dataset from 149 individuals, including 70 children with autism spectrum disorder (ASD) and 79 typically developing controls (TDC). A classification accuracy of 78.33% is achieved in this cross-validation study. We investigate the discriminative diffusion features based on a two-tensor fiber tracking model. We observe that the mean fractional anisotropy from the second tensor (associated with crossing fibers) is most affected in ASD. We also find that local along-tract (central cores and endpoint regions) differences between ASD and TDC are helpful in differentiating the two groups. These altered diffusion properties in ASD are associated with multiple robustly discriminative fiber clusters, which belong to several major white matter tracts including the corpus callosum, arcuate fasciculus, uncinate fasciculus and aslant tract; and the white matter structures related to the cerebellum, brain stem, and ventral diencephalon. These discriminative fiber clusters, a small part of the whole brain tractography, represent the white matter connections that could be most affected in ASD. Our results indicate the potential of a machine learning pipeline based on white matter fiber clustering. Copyright © 2017 Elsevier Inc. All rights reserved.
Machine-Aided Indexing of Technical Literature
ERIC Educational Resources Information Center
Klingbiel, Paul H.
1973-01-01
To index at the Defense Documentation Center (DDC), an automated system must choose single words or phrases rapidly and economically. Automation of DDC's indexing has been machine-aided from its inception. A machine-aided indexing system is described that indexes one million words of text per hour of CPU time. (22 references) (Author/SJ)
Improving semi-automated segmentation by integrating learning with active sampling
NASA Astrophysics Data System (ADS)
Huo, Jing; Okada, Kazunori; Brown, Matthew
2012-02-01
Interactive segmentation algorithms such as GrowCut usually require quite a few user interactions to perform well, and have poor repeatability. In this study, we developed a novel technique to boost the performance of the interactive segmentation method GrowCut involving: 1) a novel "focused sampling" approach for supervised learning, as opposed to conventional random sampling; 2) boosting GrowCut using the machine learned results. We applied the proposed technique to the glioblastoma multiforme (GBM) brain tumor segmentation, and evaluated on a dataset of ten cases from a multiple center pharmaceutical drug trial. The results showed that the proposed system has the potential to reduce user interaction while maintaining similar segmentation accuracy.
Kusano, Kristofer; Gabler, Hampton C
2014-01-01
The odds of death for a seriously injured crash victim are drastically reduced if he or she receives care at a trauma center. Advanced automated crash notification (AACN) algorithms are postcrash safety systems that use data measured by the vehicles during the crash to predict the likelihood of occupants being seriously injured. The accuracy of these models is crucial to the success of an AACN. The objective of this study was to compare the predictive performance of competing injury risk models and algorithms: logistic regression, random forest, AdaBoost, naïve Bayes, support vector machine, and classification k-nearest neighbors. This study compared machine learning algorithms to the widely adopted logistic regression modeling approach. Machine learning algorithms have not been commonly studied in the motor vehicle injury literature. Machine learning algorithms may have higher predictive power than logistic regression, despite the drawback of lacking the ability to perform statistical inference. To evaluate the performance of these algorithms, data on 16,398 vehicles involved in non-rollover collisions were extracted from the NASS-CDS. Vehicles with any occupants having an Injury Severity Score (ISS) of 15 or greater were defined as those requiring victims to be treated at a trauma center. The performance of each model was evaluated using cross-validation. Cross-validation assesses how a model will perform in the future given new data not used for model training. The crash ΔV (change in velocity during the crash), damage side (struck side of the vehicle), seat belt use, vehicle body type, number of events, occupant age, and occupant sex were used as predictors in each model. Logistic regression slightly outperformed the machine learning algorithms based on sensitivity and specificity of the models.
Previous studies on AACN risk curves used the same data to train and test the power of the models and as a result had higher sensitivity compared to the cross-validated results from this study. Future studies should account for future data, for example by using cross-validation, or risk presenting optimistic predictions of field performance. Past algorithms have been criticized for relying on age and sex, which are difficult to measure by vehicle sensors, and for inaccuracies in classifying damage side. The models with accurate damage side and including age/sex did outperform models with less accurate damage side and without age/sex, but the differences were small, suggesting that the success of AACN is not reliant on these predictors.
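The cross-validation procedure this abstract relies on can be sketched as a simple k-fold index splitter. This is a generic illustration, not the study's code; the function name, the contiguous fold layout, and the sample sizes are assumptions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# Ten hypothetical vehicles split into three folds
folds = list(k_fold_indices(10, 3))
```

Each sample is held out exactly once and never appears in the training indices of the fold that tests it, which is why cross-validated sensitivity tends to be lower, and more realistic, than sensitivity measured on the training data itself.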
Automated source classification of new transient sources
NASA Astrophysics Data System (ADS)
Oertel, M.; Kreikenbohm, A.; Wilms, J.; DeLuca, A.
2017-10-01
The EXTraS project harvests the hitherto unexplored temporal domain information buried in the serendipitous data collected by the European Photon Imaging Camera (EPIC) onboard the ESA XMM-Newton mission since its launch. This includes a search for fast transients, missed by standard image analysis, and a search and characterization of variability in hundreds of thousands of sources. We present an automated classification scheme for new transient sources in the EXTraS project. The method is as follows: source classification features of a training sample are used to train machine learning algorithms (performed in R; randomForest (Breiman, 2001) in supervised mode) which are then tested on a sample of known source classes and used for classification.
Modelling of human-machine interaction in equipment design of manufacturing cells
NASA Astrophysics Data System (ADS)
Cochran, David S.; Arinez, Jorge F.; Collins, Micah T.; Bi, Zhuming
2017-08-01
This paper proposes a systematic approach to model human-machine interactions (HMIs) in supervisory control of machining operations; it characterises the coexistence of machines and humans for an enterprise to balance the goals of automation/productivity and flexibility/agility. In the proposed HMI model, an operator is associated with a set of behavioural roles as a supervisor for multiple, semi-automated manufacturing processes. The model is innovative in the sense that (1) it represents an HMI based on its functions for process control but provides the flexibility for ongoing improvements in the execution of manufacturing processes; (2) it provides a computational tool to define functional requirements for an operator in HMIs. The proposed model can be used to design production systems at different levels of an enterprise architecture, particularly at the machine level in a production system where operators interact with semi-automation to accomplish the goal of 'autonomation' - automation that augments the capabilities of human beings.
Automated Planning and Scheduling for Space Mission Operations
NASA Technical Reports Server (NTRS)
Chien, Steve; Jonsson, Ari; Knight, Russell
2005-01-01
Research Trends: a) Finite-capacity scheduling under more complex constraints and increased problem dimensionality (subcontracting, overtime, lot splitting, inventory, etc.). b) Integrated planning and scheduling. c) Mixed-initiative frameworks. d) Management of uncertainty (proactive and reactive). e) Autonomous agent architectures and distributed production management. f) Integration of machine learning capabilities. g) Wider scope of applications: 1) analysis of supplier/buyer protocols & tradeoffs; 2) integration of strategic & tactical decision-making; and 3) enterprise integration.
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Papadogiannakis, S.; Taddia, F.; Petrushevska, T.; Ferretti, R.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Roy, R.; Hangard, L.; Vreeswijk, P.; Horesh, A.; Manulis, I.; Rubin, A.; Yaron, O.; Leloudas, G.; Khazov, D.; Soumagnac, M.; Knezevic, S.; Johansson, J.; Nir, G.; Cao, Y.; Blagorodnova, N.; Kulkarni, S.
2016-05-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artefacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
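The real-bogus vetting described in these reports assigns each candidate a score between 0.0 (bogus) and 1.0 (real). A minimal sketch of applying a cut to such scores follows; the threshold value, candidate identifiers, and scores are hypothetical, and the RB2, RB4, and RB5 classifiers themselves are not reproduced here.

```python
def vet_candidates(scores, threshold=0.5):
    """Return candidate IDs whose real-bogus score meets the threshold."""
    return [cid for cid, score in scores.items() if score >= threshold]

# Hypothetical scores (assumption): 1.0 means real source, 0.0 means artifact.
example_scores = {"cand_a": 0.93, "cand_b": 0.12, "cand_c": 0.58}
kept = vet_candidates(example_scores)
```

Raising the threshold would pass fewer artifacts to human scanners at the risk of discarding faint real transients.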
iPTF Discoveries of Recent Core-Collapse Supernovae
NASA Astrophysics Data System (ADS)
Taddia, F.; Ferretti, R.; Papadogiannakis, S.; Petrushevska, T.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Roy, R.; Hangard, L.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Bar, I.; Cao, Y.; Kulkarni, S.; Blagorodnova, N.
2016-05-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following core-collapse SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Core-Collapse Supernovae
NASA Astrophysics Data System (ADS)
Taddia, F.; Ferretti, R.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Petrushevska, T.; Roy, R.; Hangard, L.; De Cia, A.; Vreeswijk, P.; Horesh, A.; Manulis, I.; Sagiv, I.; Rubin, A.; Yaron, O.; Leloudas, G.; Khazov, D.; Soumagnac, M.; Bilgi, P.
2015-04-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Core-Collapse SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernova
NASA Astrophysics Data System (ADS)
Petrushevska, T.; Ferretti, R.; Fremling, C.; Hangard, L.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Roy, R.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Bilgi, P.; Cao, Y.; Duggan, G.; Lunnan, R.; Andreoni, I.
2015-10-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent SNe Ia
NASA Astrophysics Data System (ADS)
Ferretti, R.; Fremling, C.; Johansson, J.; Karamehmetoglu, E.; Migotto, K.; Nyholm, A.; Papadogiannakis, S.; Taddia, F.; Petrushevska, T.; Roy, R.; Ben-Ami, S.; De Cia, A.; Dzigan, Y.; Horesh, A.; Khazov, D.; Manulis, I.; Rubin, A.; Sagiv, I.; Vreeswijk, P.; Yaron, O.; Bilgi, P.; Cao, Y.; Duggan, G.
2015-02-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Papadogiannakis, S.; Taddia, F.; Ferretti, R.; Fremling, C.; Karamehmetoglu, E.; Petrushevska, T.; Nyholm, A.; Roy, R.; Hangard, L.; Vreeswijk, P.; Horesh, A.; Manulis, I.; Rubin, A.; Yaron, O.; Leloudas, G.; Khazov, D.; Soumagnac, M.; Knezevic, S.; Johansson, J.; Lunnan, R.; Blagorodnova, N.; Cao, Y.; Cenk, S. B.
2016-01-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Ferretti, R.; Fremling, C.; Hangard, L.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Petrushevska, T.; Roy, R.; Taddia, F.; Horesh, A.; Khazov, D.; Knezevic, S.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Cao, Y.; Duggan, G.; Lunnan, R.; Blagorodnova, N.
2015-11-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Core-Collapse Supernovae
NASA Astrophysics Data System (ADS)
Taddia, F.; Ferretti, R.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Petrushevska, T.; Roy, R.; Hangard, L.; Vreeswijk, P.; Horesh, A.; Manulis, I.; Rubin, A.; Yaron, O.; Leloudas, G.; Khazov, D.; Soumagnac, M.; Knezevic, S.; Johansson, J.; Duggan, G.; Lunnan, R.; Cao, Y.
2015-09-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Core-Collapse SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discovery of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Hangard, L.; Ferretti, R.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Petrushevska, T.; Roy, R.; Bar, I.; Horesh, A.; Johansson, J.; Khazov, D.; Knezevic, S.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Cao, Y.; Kulkarni, S.; Lunnan, R.; Ravi, V.; Vedantham, H. K.; Yan, L.
2016-04-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Core-Collapse Supernovae
NASA Astrophysics Data System (ADS)
Taddia, F.; Ferretti, R.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Petrushevska, T.; Roy, R.; Hangard, L.; Vreeswijk, P.; Horesh, A.; Manulis, I.; Rubin, A.; Yaron, O.; Leloudas, G.; Khazov, D.; Soumagnac, M.; Knezevic, S.; Johansson, J.; Lunnan, R.; Cao, Y.; Miller, A.
2015-11-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Core-Collapse SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Petrushevska, T.; Ferretti, R.; Fremling, C.; Hangard, L.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Roy, R.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Bilgi, P.; Cao, Y.; Duggan, G.; Lunnan, R.
2016-02-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discovery of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Hangard, L.; Taddia, F.; Ferretti, R.; Papadogiannakis, S.; Petrushevska, T.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Roy, R.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Bar, I.; Lunnan, R.; Cenk, S. B.
2016-02-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Papadogiannakis, S.; Fremling, C.; Hangard, L.; Karamehmetoglu, E.; Nyholm, A.; Ferretti, R.; Petrushevska, T.; Roy, R.; Taddia, F.; Bar, I.; Horesh, A.; Johansson, J.; Knezevic, S.; Leloudas, G.; Manulis, I.; Nir, G.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Arcavi, I.; Howell, D. A.; McCully, C.; Hosseinzadeh, G.; Valenti, S.; Blagorodnova, N.; Cao, Y.; Duggan, G.; Ravi, V.; Lunnan, R.
2016-03-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF discoveries of recent type Ia supernovae
NASA Astrophysics Data System (ADS)
Papadogiannakis, S.; Ferretti, R.; Fremling, C.; Hangard, L.; Karamehmetoglu, E.; Nyholm, A.; Petrushevska, T.; Roy, R.; De Cia, A.; Vreeswijk, P.; Horesh, A.; Manulis, I.; Sagiv, I.; Rubin, A.; Yaron, O.; Leloudas, G.; Khazov, D.; Soumagnac, M.; Knezevic, S.; Cenko, S. B.; Capone, J.; Bartakk, M.
2015-09-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discovery of Recent Type Ia Supernova
NASA Astrophysics Data System (ADS)
Hangard, L.; Petrushevska, T.; Papadogiannakis, S.; Ferretti, R.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Roy, R.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Kasliwal, M.
2015-10-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Petrushevska, T.; Ferretti, R.; Fremling, C.; Hangard, L.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Roy, R.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Bilgi, P.; Cao, Y.; Duggan, G.; Lunnan, R.; Neill, J. D.; Walters, R.
2016-04-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Papadogiannakis, S.; Taddia, F.; Petrushevska, T.; Fremling, C.; Hangard, L.; Johansson, J.; Karamehmetoglu, E.; Migotto, K.; Nyholm, A.; Roy, R.; Ben-Ami, S.; De Cia, A.; Dzigan, Y.; Horesh, A.; Khazov, D.; Soumagnac, M.; Manulis, I.; Rubin, A.; Sagiv, I.; Vreeswijk, P.; Yaron, O.; Bond, H.; Bilgi, P.; Cao, Y.; Duggan, G.
2015-03-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discovery of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Hangard, L.; Ferretti, R.; Papadogiannakis, S.; Petrushevska, T.; Fremling, C.; Karamehmetoglu, E.; Nyholm, A.; Roy, R.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Cook, D.
2015-12-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
iPTF Discoveries of Recent Type Ia Supernova
NASA Astrophysics Data System (ADS)
Petrushevska, T.; Ferretti, R.; Fremling, C.; Hangard, L.; Karamehmetoglu, E.; Nyholm, A.; Papadogiannakis, S.; Roy, R.; Horesh, A.; Khazov, D.; Knezevic, S.; Johansson, J.; Leloudas, G.; Manulis, I.; Rubin, A.; Soumagnac, M.; Vreeswijk, P.; Yaron, O.; Bilgi, P.; Cao, Y.; Duggan, G.; Lunnan, R.; Jencson, J.
2015-11-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W).
Impact of the macroeconomic factors on university budgeting the US and Russia
NASA Astrophysics Data System (ADS)
Bogomolova, Arina; Balk, Igor; Ivachenko, Natalya; Temkin, Anatoly
2017-10-01
This paper discusses the impact of macroeconomic factors on university budgeting. Modern developments in the area of data science and machine learning have made it possible to utilise automated techniques to address several problems of humankind, ranging from genetic engineering and particle physics to sociology and economics. This paper is the first step toward creating a robust toolkit that will help universities withstand macroeconomic challenges by utilising modern predictive analytics techniques.
Fall 2014 Data-Intensive Systems
2014-10-29
Oct 2014 © 2014 Carnegie Mellon University Big Data Systems NoSQL and horizontal scaling are changing architecture principles by creating...University Status LEAP4BD • Ready to pilot QuABase • Prototype is complete – covers 8 NoSQL/NewSQL implementations • Completing validation testing Big...machine learning to automate population of knowledge base • Initial focus on NoSQL/NewSQL technology domain • Extend to create knowledge bases in other
Analysis of large space structures assembly: Man/machine assembly analysis
NASA Technical Reports Server (NTRS)
1983-01-01
Procedures for analyzing large space structures assembly via three primary modes: manual, remote and automated are outlined. Data bases on each of the assembly modes and a general data base on the shuttle capabilities to support structures assembly are presented. Task element times and structure assembly component costs are given to provide a basis for determining the comparative economics of assembly alternatives. The lessons learned from simulations of space structures assembly are detailed.
Morota, Gota; Ventura, Ricardo V; Silva, Fabyano F; Koyama, Masanori; Fernando, Samodha C
2018-04-14
Precision animal agriculture is poised to rise to prominence in the livestock enterprise in the domains of management, production, welfare, sustainability, health surveillance, and environmental footprint. Considerable progress has been made in the use of tools to routinely monitor and collect information from animals and farms in a less laborious manner than before. These efforts have enabled the animal sciences to embark on information technology-driven discoveries to improve animal agriculture. However, the growing amount and complexity of data generated by fully automated, high-throughput data recording or phenotyping platforms, including digital images, sensor and sound data, unmanned systems, and information obtained from real-time noninvasive computer vision, pose challenges to the successful implementation of precision animal agriculture. The emerging fields of machine learning and data mining are expected to be instrumental in helping meet the daunting challenges facing global agriculture. Yet, their impact and potential in "big data" analysis have not been adequately appreciated in the animal science community, where this recognition has remained only fragmentary. To address such knowledge gaps, this article outlines a framework for machine learning and data mining and offers a glimpse into how they can be applied to solve pressing problems in animal sciences.
wACSF—Weighted atom-centered symmetry functions as descriptors in machine learning potentials
NASA Astrophysics Data System (ADS)
Gastegger, M.; Schwiedrzik, L.; Bittermann, M.; Berzsenyi, F.; Marquetand, P.
2018-06-01
We introduce weighted atom-centered symmetry functions (wACSFs) as descriptors of a chemical system's geometry for use in the prediction of chemical properties such as enthalpies or potential energies via machine learning. The wACSFs are based on conventional atom-centered symmetry functions (ACSFs) but overcome the undesirable scaling of the latter with an increasing number of different elements in a chemical system. The performance of these two descriptors is compared using them as inputs in high-dimensional neural network potentials (HDNNPs), employing the molecular structures and associated enthalpies of the 133 855 molecules containing up to five different elements reported in the QM9 database as reference data. A substantially smaller number of wACSFs than ACSFs is needed to obtain a comparable spatial resolution of the molecular structures. At the same time, this smaller set of wACSFs leads to a significantly better generalization performance in the machine learning potential than the large set of conventional ACSFs. Furthermore, we show that the intrinsic parameters of the descriptors can in principle be optimized with a genetic algorithm in a highly automated manner. For the wACSFs employed here, we find however that using a simple empirical parametrization scheme is sufficient in order to obtain HDNNPs with high accuracy.
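As a rough illustration of the descriptor idea in the abstract above, the sketch below evaluates one radial symmetry function for a single atom, once without and once with an element-dependent weight. The Gaussian-times-cosine-cutoff form, the use of the atomic number as the weight, and all names and parameter values (`eta`, `mu`, `r_cut`) are assumptions made for illustration; the abstract itself does not give these details.

```python
import math

def cutoff(r, r_cut):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)

def radial_sf(center, neighbors, eta, mu, r_cut, weighted=True):
    """Radial symmetry function for one atom (illustrative form).

    neighbors: list of (atomic_number, (x, y, z)) tuples.
    weighted=True scales each neighbor's term by its atomic number
    (an assumed weighting); weighted=False ignores the element type.
    """
    total = 0.0
    for z, pos in neighbors:
        r = math.dist(center, pos)
        weight = float(z) if weighted else 1.0
        total += weight * math.exp(-eta * (r - mu) ** 2) * cutoff(r, r_cut)
    return total

# Hypothetical environment: one hydrogen (Z=1) and one carbon (Z=6),
# both 1.0 length units from the central atom.
env = [(1, (1.0, 0.0, 0.0)), (6, (0.0, 1.0, 0.0))]
plain = radial_sf((0.0, 0.0, 0.0), env, eta=1.0, mu=1.0, r_cut=2.0, weighted=False)
with_weights = radial_sf((0.0, 0.0, 0.0), env, eta=1.0, mu=1.0, r_cut=2.0)
```

In this sketch a single weighted function separates the two elements by the size of their contributions, whereas the unweighted form would need a separate function per element to tell them apart, which is the scaling problem the abstract refers to.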
Huff, Trevor J; Ludwig, Parker E; Zuniga, Jorge M
2018-05-01
3D-printed anatomical models play an important role in medical and research settings. The recent successes of 3D anatomical models in healthcare have led many institutions to adopt the technology. However, there remain several issues that must be addressed before it can become more widespread. Of importance are the problems of cost and time of manufacturing. Machine learning (ML) could be utilized to solve these issues by streamlining the 3D modeling process through rapid medical image segmentation and improved patient selection and image acquisition. The current challenges, potential solutions, and future directions for ML and 3D anatomical modeling in healthcare are discussed. Areas covered: This review covers research articles in the field of machine learning as related to 3D anatomical modeling. Topics discussed include automated image segmentation, cost reduction, and related time constraints. Expert commentary: ML-based segmentation of medical images could potentially improve the process of 3D anatomical modeling. However, until more research is done to validate these technologies in clinical practice, their impact on patient outcomes will remain unknown. We have the necessary computational tools to tackle the problems discussed. The difficulty now lies in our ability to collect sufficient data.
Biomimetic molecular design tools that learn, evolve, and adapt.
Winkler, David A
2017-01-01
A dominant hallmark of living systems is their ability to adapt to changes in the environment by learning and evolving. Nature does this so superbly that intensive research efforts are now attempting to mimic biological processes. Initially this biomimicry involved developing synthetic methods to generate complex bioactive natural products. Recent work is attempting to understand how molecular machines operate so their principles can be copied, and learning how to employ biomimetic evolution and learning methods to solve complex problems in science, medicine and engineering. Automation, robotics, artificial intelligence, and evolutionary algorithms are now converging to generate what might broadly be called in silico-based adaptive evolution of materials. These methods are being applied to organic chemistry to systematize reactions, create synthesis robots to carry out unit operations, and to devise closed loop flow self-optimizing chemical synthesis systems. Most scientific innovations and technologies pass through the well-known "S curve", with slow beginning, an almost exponential growth in capability, and a stable applications period. Adaptive, evolving, machine learning-based molecular design and optimization methods are approaching the period of very rapid growth and their impact is already being described as potentially disruptive. This paper describes new developments in biomimetic adaptive, evolving, learning computational molecular design methods and their potential impacts in chemistry, engineering, and medicine.
Biomimetic molecular design tools that learn, evolve, and adapt
2017-01-01
A dominant hallmark of living systems is their ability to adapt to changes in the environment by learning and evolving. Nature does this so superbly that intensive research efforts are now attempting to mimic biological processes. Initially this biomimicry involved developing synthetic methods to generate complex bioactive natural products. Recent work is attempting to understand how molecular machines operate so their principles can be copied, and learning how to employ biomimetic evolution and learning methods to solve complex problems in science, medicine and engineering. Automation, robotics, artificial intelligence, and evolutionary algorithms are now converging to generate what might broadly be called in silico-based adaptive evolution of materials. These methods are being applied to organic chemistry to systematize reactions, create synthesis robots to carry out unit operations, and to devise closed loop flow self-optimizing chemical synthesis systems. Most scientific innovations and technologies pass through the well-known “S curve”, with slow beginning, an almost exponential growth in capability, and a stable applications period. Adaptive, evolving, machine learning-based molecular design and optimization methods are approaching the period of very rapid growth and their impact is already being described as potentially disruptive. This paper describes new developments in biomimetic adaptive, evolving, learning computational molecular design methods and their potential impacts in chemistry, engineering, and medicine. PMID:28694872
Yang, Kamie K; Lewis, Ian H
2014-06-15
Various equipment malfunctions of anesthesia gas delivery systems have been previously reported. Our profession increasingly uses technology as a means to prevent these errors. We report a case of a near-total anesthesia circuit obstruction that went undetected before the induction of anesthesia despite the use of automated machine check technology. This case highlights that automated machine check modules can fail to detect severe equipment failure and demonstrates how, even in this era of expanding technology, manual checks still remain essential components of safe care.
Erdoğdu, Utku; Tan, Mehmet; Alhajj, Reda; Polat, Faruk; Rokne, Jon; Demetrick, Douglas
2013-01-01
The availability of enough samples for effective analysis and knowledge discovery has been a challenge in the research community, especially in the area of gene expression data analysis. Thus, the approaches being developed for data analysis have mostly suffered from the lack of enough data to train and test the constructed models. We argue that the process of sample generation could be successfully automated by employing some sophisticated machine learning techniques. An automated sample generation framework could successfully complement the actual sample generation from real cases. This argument is validated in this paper by describing a framework that integrates multiple models (perspectives) for sample generation. We illustrate its applicability for producing new gene expression data samples, a highly demanding area that has not received attention. The three perspectives employed in the process are based on models that are not closely related. This independence avoids the bias of an approach that covers only certain characteristics of the domain and therefore produces samples skewed in one direction. The first model is based on the Probabilistic Boolean Network (PBN) representation of the gene regulatory network underlying the given gene expression data. The second model integrates Hierarchical Markov Model (HIMM) and the third model employs a genetic algorithm in the process. Each model learns as many characteristics of the domain being analysed as possible and tries to incorporate the learned characteristics in generating new samples. In other words, the models base their analysis on domain knowledge implicitly present in the data itself. The developed framework has been extensively tested by checking how the new samples complement the original samples. The produced results are very promising in showing the effectiveness, usefulness and applicability of the proposed multi-model framework.
Photometric Supernova Classification with Machine Learning
NASA Astrophysics Data System (ADS)
Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.
2016-08-01
Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
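The AUC metric quoted in the abstract above can be read as the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name and the toy labels and scores are illustrative and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. Type Ia), 0 otherwise.
    scores: classifier outputs, higher meaning "more likely positive".
    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: one positive is ranked below one negative.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])  # 3 of 4 pairs ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for a small check; for large samples a rank-based formulation or a library routine would be the more practical choice.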
Formal verification of human-automation interaction
NASA Technical Reports Server (NTRS)
Degani, Asaf; Heymann, Michael
2002-01-01
This paper discusses a formal and rigorous approach to the analysis of operator interaction with machines. It addresses the acute problem of detecting design errors in human-machine interaction and focuses on verifying the correctness of the interaction in complex and automated control systems. The paper describes a systematic methodology for evaluating whether the interface provides the necessary information about the machine to enable the operator to perform a specified task successfully and unambiguously. It also addresses the adequacy of information provided to the user via training material (e.g., user manual) about the machine's behavior. The essentials of the methodology, which can be automated and applied to the verification of large systems, are illustrated by several examples and through a case study of pilot interaction with an autopilot aboard a modern commercial aircraft. The expected application of this methodology is an augmentation and enhancement, by formal verification, of human-automation interfaces.
Robust snow avalanche detection using machine learning on infrasonic array data
NASA Astrophysics Data System (ADS)
Thüring, Thomas; Schoch, Marcel; van Herwijnen, Alec; Schweizer, Jürg
2014-05-01
Snow avalanches may threaten people and infrastructure in mountain areas. Automated detection of avalanche activity would be highly desirable, in particular during times of poor visibility, to improve hazard assessment, but also to monitor the effectiveness of avalanche control by explosives. In the past, a variety of remote sensing techniques and instruments for the automated detection of avalanche activity have been reported, which are based on radio waves (radar), seismic signals (geophone), optical signals (imaging sensor) or infrasonic signals (microphone). Optical imagery allows avalanche activity to be assessed with very high spatial resolution; however, it is strongly weather dependent. Radar and geophone-based detection typically provide robust avalanche detection for all weather conditions, but are very limited in the size of the monitoring area. On the other hand, due to the long propagation distance of infrasound through air, the monitoring area of infrasonic sensors can cover a large territory using a single sensor (or an array). In addition, they are far more cost-effective than radars or optical imaging systems. Unfortunately, the reliability of infrasonic sensor systems has so far been rather low due to the strong variation of ambient noise (e.g. wind) causing a high false alarm rate. We analyzed the data collected by a low-cost infrasonic array system consisting of four sensors for the automated detection of avalanche activity at Lavin in the eastern Swiss Alps. A comparatively large array aperture (~350 m) allows highly accurate time delay estimations of signals which arrive at different times at the sensors, enabling precise source localization. An array of four sensors is sufficient for the time-resolved source localization of signals in full 3D space, which is an excellent method to anticipate true avalanche activity. Robust avalanche detection is then achieved by using machine learning methods such as support vector machines. 
The system is initially trained by using characteristic data features from known avalanche and non-avalanche events. Data features are obtained from output signals of the source localization algorithm or from Fourier or time domain processing and support the learning phase of the system. A significantly improved detection rate as well as a reduction of the false alarm rate was achieved compared to previous approaches.
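The time delay estimation between sensors mentioned in this abstract is commonly done by cross-correlating pairs of signals and picking the lag with the highest correlation. The abstract does not say which estimator the authors used, so the sketch below is a generic illustration under that assumption; the function name and the toy signals are hypothetical.

```python
def estimate_delay(sig_a, sig_b, max_lag):
    """Lag (in samples) by which sig_b trails sig_a.

    Returns the lag in [-max_lag, max_lag] that maximizes
    sum_t sig_a[t] * sig_b[t + lag]. A positive result means the
    waveform reaches sensor B later than sensor A.
    """
    n = min(len(sig_a), len(sig_b))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                score += sig_a[t] * sig_b[u]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Toy pulse that reaches sensor B three samples after sensor A.
sensor_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
sensor_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
lag_samples = estimate_delay(sensor_a, sensor_b, max_lag=5)
```

Dividing the lag by the sampling rate gives the delay in seconds; delays from several sensor pairs, together with the array geometry, are what a localization step would then consume.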
Naik, Hsiang Sing; Zhang, Jiaoping; Lofquist, Alec; Assefa, Teshale; Sarkar, Soumik; Ackerman, David; Singh, Arti; Singh, Asheesh K; Ganapathysubramanian, Baskar
2017-01-01
Phenotyping is a critical component of plant research. Accurate and precise trait collection, when integrated with genetic tools, can greatly accelerate the rate of genetic gain in crop improvement. However, efficient and automatic phenotyping of traits across large populations is a challenge, which is further exacerbated by the necessity of sampling multiple environments and growing replicated trials. A promising approach is to leverage current advances in imaging technology, data analytics and machine learning to enable automated and fast phenotyping and subsequent decision support. In this context, the workflow for phenotyping (image capture → data storage and curation → trait extraction → machine learning/classification → models/apps for decision support) has to be carefully designed and efficiently executed to minimize resource usage and maximize utility. We illustrate such an end-to-end phenotyping workflow for the case of plant stress severity phenotyping in soybean, with a specific focus on the rapid and automatic assessment of iron deficiency chlorosis (IDC) severity on thousands of field plots. We showcase this analytics framework by extracting IDC features from a set of ~4500 unique canopies representing a diverse germplasm base that have different levels of IDC, and subsequently training a variety of classification models to predict plant stress severity. The best classifier is then deployed as a smartphone app for rapid and real time severity rating in the field. We investigated 10 different classification approaches, with the best classifier being a hierarchical classifier with a mean per-class accuracy of ~96%. We construct a phenotypically meaningful 'population canopy graph', connecting the automatically extracted canopy trait features with plant stress severity rating. 
We incorporated this image capture → image processing → classification workflow into a smartphone app that enables automated real-time evaluation of IDC scores using digital images of the canopy. We expect this high-throughput framework to help increase the rate of genetic gain by providing a robust extendable framework for other abiotic and biotic stresses. We further envision this workflow embedded onto a high throughput phenotyping ground vehicle and unmanned aerial system that will allow real-time, automated stress trait detection and quantification for plant research, breeding and stress scouting applications.
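The mean per-class accuracy reported in this abstract differs from plain accuracy when severity classes are unevenly represented: it averages the fraction of correct predictions within each class, so rare classes count as much as common ones. A minimal sketch follows; the function name and the toy severity ratings are illustrative only.

```python
from collections import defaultdict

def mean_per_class_accuracy(y_true, y_pred):
    """Average, over classes, of the fraction of that class predicted correctly."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)

# Toy ratings on a hypothetical 1-3 severity scale; class 3 is rare and missed.
truth = [1, 1, 1, 1, 2, 2, 3]
pred = [1, 1, 1, 2, 2, 2, 1]
mpca = mean_per_class_accuracy(truth, pred)                      # (0.75 + 1.0 + 0.0) / 3
plain_accuracy = sum(t == p for t, p in zip(truth, pred)) / len(truth)  # 5 / 7
```

Here the plain accuracy looks better than the per-class mean because the single missed rare-class plot barely affects the overall count.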
Automating annotation of information-giving for analysis of clinical conversation.
Mayfield, Elijah; Laws, M Barton; Wilson, Ira B; Penstein Rosé, Carolyn
2014-02-01
Coding of clinical communication for fine-grained features such as speech acts has produced a substantial literature. However, annotation by humans is laborious and expensive, limiting application of these methods. We aimed to show that through machine learning, computers could code certain categories of speech acts with sufficient reliability to make useful distinctions among clinical encounters. The data were transcripts of 415 routine outpatient visits of HIV patients which had previously been coded for speech acts using the Generalized Medical Interaction Analysis System (GMIAS); 50 had also been coded for larger scale features using the Comprehensive Analysis of the Structure of Encounters System (CASES). We aggregated selected speech acts into information-giving and requesting, then trained the machine to automatically annotate using logistic regression classification. We evaluated reliability by per-speech act accuracy. We used multiple regression to predict patient reports of communication quality from post-visit surveys using the patient and provider information-giving to information-requesting ratio (briefly, information-giving ratio) and patient gender. Automated coding produces moderate reliability with human coding (accuracy 71.2%, κ=0.57), with high correlation between machine and human prediction of the information-giving ratio (r=0.96). The regression significantly predicted four of five patient-reported measures of communication quality (r=0.263-0.344). The information-giving ratio is a useful and intuitive measure for predicting patient perception of provider-patient communication quality. These predictions can be made with automated annotation, which is a practical option for studying large collections of clinical encounters with objectivity, consistency, and low cost, providing greater opportunity for training and reflection for care providers.
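The abstract above reports both raw accuracy and κ for machine-versus-human coding. Assuming κ here denotes Cohen's kappa, the sketch below shows how chance agreement is subtracted from observed agreement; the function name and the toy speech-act labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    if not coder_a or len(coder_a) != len(coder_b):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(coder_a)
    # Fraction of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected if each coder labelled at random with their own label rates.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)

# Toy example: human vs. machine labels for four utterances.
human = ["give", "give", "ask", "ask"]
machine = ["give", "ask", "ask", "ask"]
kappa = cohens_kappa(human, machine)  # observed 0.75, expected 0.5
```

Because the chance term depends on how often each label is used, a raw accuracy in the low 70% range can coexist with only moderate kappa, consistent with the pattern the abstract describes.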
Translations from Kommunist, Number 13, September 1978
1978-10-30
programmed machine tool here is merely a component of a more complex reprogrammable technological system. This includes the robot machine tools with...sufficient possibilities for changing technological operations and processes and automated technological lines. 52 The reprogrammable automated sets will...simulate the possibilities of such sets. A new technological level will be developed in industry related to reprogrammable automated sets, their design
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Xiaofeng; Wu, Ning; Cheng, Guanghui
Purpose: To develop an automated magnetic resonance imaging (MRI) parotid segmentation method to monitor radiation-induced parotid gland changes in patients after head and neck radiation therapy (RT). Methods and Materials: The proposed method combines the atlas registration method, which captures the global variation of anatomy, with a machine learning technology, which captures the local statistical features, to automatically segment the parotid glands from the MRIs. The segmentation method consists of 3 major steps. First, an atlas (pre-RT MRI and manually contoured parotid gland mask) is built for each patient. A hybrid deformable image registration is used to map the pre-RT MRI to the post-RT MRI, and the transformation is applied to the pre-RT parotid volume. Second, the kernel support vector machine (SVM) is trained with the subject-specific atlas pair consisting of multiple features (intensity, gradient, and others) from the aligned pre-RT MRI and the transformed parotid volume. Third, the well-trained kernel SVM is used to differentiate the parotid from surrounding tissues in the post-RT MRIs by statistically matching multiple texture features. A longitudinal study of 15 patients undergoing head and neck RT was conducted: baseline MRI was acquired prior to RT, and the post-RT MRIs were acquired at 3-, 6-, and 12-month follow-up examinations. The resulting segmentations were compared with the physicians' manual contours. Results: Successful parotid segmentation was achieved for all 15 patients (42 post-RT MRIs). The average percentage of volume differences between the automated segmentations and those of the physicians' manual contours were 7.98% for the left parotid and 8.12% for the right parotid. The average volume overlap was 91.1% ± 1.6% for the left parotid and 90.5% ± 2.4% for the right parotid. The parotid gland volume reduction at follow-up was 25% at 3 months, 27% at 6 months, and 16% at 12 months. 
Conclusions: We have validated our automated parotid segmentation algorithm in a longitudinal study. This segmentation method may be useful in future studies to address radiation-induced xerostomia in head and neck radiation therapy.
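The two agreement measures quoted in this abstract, percentage volume difference and volume overlap, can be illustrated on binary masks. The abstract does not define either measure exactly, so the sketch below assumes the volume difference is taken relative to the manual contour and that the overlap is the Dice coefficient; the function names and the toy voxel sets are hypothetical.

```python
def volume_difference_pct(auto_mask, manual_mask):
    """Absolute volume difference as a percentage of the manual volume.

    Masks are sets of voxel index tuples; volume is the voxel count.
    """
    if not manual_mask:
        raise ValueError("manual mask must not be empty")
    return 100.0 * abs(len(auto_mask) - len(manual_mask)) / len(manual_mask)

def dice_overlap(auto_mask, manual_mask):
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|), from 0 to 1."""
    if not auto_mask and not manual_mask:
        raise ValueError("at least one mask must be non-empty")
    shared = len(auto_mask & manual_mask)
    return 2.0 * shared / (len(auto_mask) + len(manual_mask))

# Toy masks: 4 manually contoured voxels, 5 automatically segmented, 3 shared.
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
auto = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)}
vol_diff = volume_difference_pct(auto, manual)
dice = dice_overlap(auto, manual)
```

The two numbers answer different questions: two masks can have nearly equal volumes yet overlap poorly if one is shifted, which is presumably why the abstract reports both.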
Active learning of neuron morphology for accurate automated tracing of neurites
Gala, Rohan; Chapeton, Julio; Jitesh, Jayant; Bhavsar, Chintan; Stepanyants, Armen
2014-01-01
Automating the process of neurite tracing from light microscopy stacks of images is essential for large-scale or high-throughput quantitative studies of neural circuits. While the general layout of labeled neurites can be captured by many automated tracing algorithms, it is often not possible to differentiate reliably between the processes belonging to different cells. The reason is that some neurites in the stack may appear broken due to imperfect labeling, while others may appear fused due to the limited resolution of optical microscopy. Trained neuroanatomists routinely resolve such topological ambiguities during manual tracing tasks by combining information about distances between branches, branch orientations, intensities, calibers, tortuosities, colors, as well as the presence of spines or boutons. Likewise, to evaluate different topological scenarios automatically, we developed a machine learning approach that combines many of the above-mentioned features. A specifically designed confidence measure was used to actively train the algorithm during the user-assisted tracing procedure. Active learning significantly reduces the training time and makes it possible to obtain generalization error rates of less than 1% from only a few training examples. To evaluate the overall performance of the algorithm, a number of image stacks were reconstructed automatically, as well as manually by several trained users, making it possible to compare the automated traces to the baseline inter-user variability. Several geometrical and topological features of the traces were selected for the comparisons. These features include the total trace length, the total numbers of branch and terminal points, the affinity of corresponding traces, and the distances between corresponding branch and terminal points. Our results show that when the density of labeled neurites is sufficiently low, automated traces are not significantly different from manual reconstructions obtained by trained users. 
PMID:24904306
Analysis of Big Data in Gait Biomechanics: Current Trends and Future Directions.
Phinyomark, Angkoon; Petri, Giovanni; Ibáñez-Marcelo, Esther; Osis, Sean T; Ferber, Reed
2018-01-01
The increasing amount of data in biomechanics research has greatly increased the importance of developing advanced multivariate analysis and machine learning techniques, which are better able to handle "big data". Consequently, advances in data science methods will expand the knowledge for testing new hypotheses about biomechanical risk factors associated with walking and running gait-related musculoskeletal injury. This paper begins with a brief introduction to an automated three-dimensional (3D) biomechanical gait data collection system: 3D GAIT, followed by how the studies in the field of gait biomechanics fit the quantities in the 5 V's definition of big data: volume, velocity, variety, veracity, and value. Next, we provide a review of recent research and development in multivariate and machine learning methods-based gait analysis that can be applied to big data analytics. These modern biomechanical gait analysis methods include several main modules such as initial input features, dimensionality reduction (feature selection and extraction), and learning algorithms (classification and clustering). Finally, a promising big data exploration tool called "topological data analysis" and directions for future research are outlined and discussed.
Das, Nilakash; Topalovic, Marko; Janssens, Wim
2018-03-01
The application of artificial intelligence in the diagnosis of obstructive lung diseases is an exciting phenomenon. Artificial intelligence algorithms work by finding patterns in data obtained from diagnostic tests, which can be used to predict clinical outcomes or to detect obstructive phenotypes. The purpose of this review is to describe the latest trends and to discuss the future potential of artificial intelligence in the diagnosis of obstructive lung diseases. Machine learning has been successfully used in automated interpretation of pulmonary function tests for differential diagnosis of obstructive lung diseases. Deep learning models such as convolutional neural networks are state-of-the-art for obstructive pattern recognition in computed tomography. Machine learning has also been applied in other diagnostic approaches such as forced oscillation test, breath analysis, lung sound analysis and telemedicine with promising results in small-scale studies. Overall, the application of artificial intelligence has produced encouraging results in the diagnosis of obstructive lung diseases. However, large-scale studies are still required to validate current findings and to boost its adoption by the medical community.
Application of Deep Learning in Automated Analysis of Molecular Images in Cancer: A Survey
Xue, Yong; Chen, Shihui; Liu, Yong
2017-01-01
Molecular imaging enables the visualization and quantitative analysis of the alterations of biological procedures at molecular and/or cellular level, which is of great significance for early detection of cancer. In recent years, deep learning has been widely used in medical imaging analysis, as it overcomes the limitations of visual assessment and traditional machine learning techniques by extracting hierarchical features with powerful representation capability. Research on cancer molecular images using deep learning techniques is also increasing dynamically. Hence, in this paper, we review the applications of deep learning in molecular imaging in terms of tumor lesion segmentation, tumor classification, and survival prediction. We also outline some future directions in which researchers may develop more powerful deep learning models for better performance in the applications in cancer molecular imaging. PMID:29114182
Unbiased classification of spatial strategies in the Barnes maze.
Illouz, Tomer; Madar, Ravit; Clague, Charlotte; Griffioen, Kathleen J; Louzoun, Yoram; Okun, Eitan
2016-11-01
Spatial learning is one of the most widely studied cognitive domains in neuroscience. The Morris water maze and the Barnes maze are the most commonly used techniques to assess spatial learning and memory in rodents. Despite the fact that these tasks are well-validated paradigms for testing spatial learning abilities, manual categorization of performance into behavioral strategies is subject to individual interpretation, and thus to bias. We have previously described an unbiased machine-learning algorithm to classify spatial strategies in the Morris water maze. Here, we offer a support vector machine-based, automated, Barnes-maze unbiased strategy (BUNS) classification algorithm, as well as a cognitive score scale that can be used for memory acquisition, reversal training and probe trials. The BUNS algorithm can greatly benefit Barnes maze users as it provides a standardized method of strategy classification and cognitive scoring scale, which cannot be derived from typical Barnes maze data analysis. Freely available on the web at http://okunlab.wix.com/okunlab as a MATLAB application. Contact: eitan.okun@biu.ac.il. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Eavesdropping on the Arctic: Automated bioacoustics reveal dynamics in songbird breeding phenology.
Oliver, Ruth Y; Ellis, Daniel P W; Chmura, Helen E; Krause, Jesse S; Pérez, Jonathan H; Sweet, Shannan K; Gough, Laura; Wingfield, John C; Boelman, Natalie T
2018-06-01
Bioacoustic networks could vastly expand the coverage of wildlife monitoring to complement satellite observations of climate and vegetation. This approach would enable global-scale understanding of how climate change influences phenomena such as migratory timing of avian species. The enormous data sets that autonomous recorders typically generate demand automated analyses that remain largely undeveloped. We devised automated signal processing and machine learning approaches to estimate dates on which songbird communities arrived at arctic breeding grounds. Acoustically estimated dates agreed well with those determined via traditional surveys and were strongly related to the landscape's snow-free dates. We found that environmental conditions heavily influenced daily variation in songbird vocal activity, especially before egg laying. Our novel approaches demonstrate that variation in avian migratory arrival can be detected autonomously. Large-scale deployment of this innovation in wildlife monitoring would enable the coverage necessary to assess and forecast changes in bird migration in the face of climate change.
iPTF Discoveries of Recent Type Ia Supernovae
NASA Astrophysics Data System (ADS)
Petrushevska, T.; Ferretti, R.; Fremling, C.; Hangard, L.; Johansson, J.; Migotto, K.; Nyholm, A.; Papadogiannakis, S.; Ben-Ami, S.; De Cia, A.; Dzigan, Y.; Horesh, A.; Leloudas, G.; Manulis, I.; Rubin, A.; Sagiv, I.; Vreeswijk, P.; Yaron, O.; Cao, Y.; Perley, D.; Miller, A.; Waszczak, A.; Kasliwal, M. M.; Hosseinzadeh, G.; Cenko, S. B.; Quimby, R.
2015-05-01
The intermediate Palomar Transient Factory (ATel #4807) reports the discovery and classification of the following Type Ia SNe. Our automated candidate vetting to distinguish a real astrophysical source (1.0) from bogus artifacts (0.0) is powered by three generations of machine learning algorithms: RB2 (Brink et al. 2013MNRAS.435.1047B), RB4 (Rebbapragada et al. 2015AAS...22543402R) and RB5 (Wozniak et al. 2013AAS...22143105W). See ATel #7112 for additional details.
1990-09-12
electronics reading to the next. To test this hypothesis and the suitability of EBL to acquiring schemas, I have implemented an automated reader/learner as...used. For example, testing the utility of a kidnapping schema using several readings about kidnapping can only go so far toward establishing the...the cost of carrying the new rules while processing unrelated material will be underestimated. The present research tests the utility of new schemas in
Metzger, Marie-Hélène; Tvardik, Nastassia; Gicquel, Quentin; Bouvry, Côme; Poulet, Emmanuel; Potinet-Pagliaroli, Véronique
2017-06-01
The aim of this study was to determine whether an expert system based on automated processing of electronic health records (EHRs) could provide a more accurate estimate of the annual rate of emergency department (ED) visits for suicide attempts in France, as compared to the current national surveillance system based on manual coding by emergency practitioners. A feasibility study was conducted at Lyon University Hospital, using data for all ED patient visits in 2012. After automatic data extraction and pre-processing, including automatic coding of medical free-text through use of the Unified Medical Language System, seven different machine-learning methods were used to classify the reasons for ED visits into "suicide attempts" versus "other reasons". The performance of these different methods was compared by using the F-measure. In a test sample of 444 patients admitted to the ED in 2012 (98 suicide attempts, 48 cases of suicidal ideation, and 292 controls with no recorded non-fatal suicidal behaviour), the F-measure for automatic detection of suicide attempts ranged from 70.4% to 95.3%. The random forest and naïve Bayes methods performed best. This study demonstrates that machine-learning methods can improve the quality of epidemiological indicators as compared to current national surveillance of suicide attempts. Copyright © 2016 John Wiley & Sons, Ltd.
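The F-measure used above to compare classifiers can be illustrated with a minimal sketch. The function name `f_measure` and the toy labels are hypothetical and are not taken from the study; with `beta=1` the function returns the harmonic mean of precision and recall for the positive class (here standing in for "suicide attempt").

```python
def f_measure(true_labels, predicted_labels, positive=1, beta=1.0):
    """F-beta score for the positive class (beta=1 gives the usual F1)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy data, invented for illustration: 1 = positive class, 0 = other reasons.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
score = f_measure(y_true, y_pred)
print(f"F-measure: {score:.3f}")
```

Because the F-measure ignores true negatives, it is a reasonable choice when the positive class is a minority of visits, as in the test sample described above.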
Machine learning for fab automated diagnostics
NASA Astrophysics Data System (ADS)
Giollo, Manuel; Lam, Auguste; Gkorou, Dimitra; Liu, Xing Lan; van Haren, Richard
2017-06-01
Process optimization depends largely on field engineers' knowledge and expertise. However, this practice turns out to be less sustainable due to the fab complexity which is continuously increasing in order to support the extreme miniaturization of Integrated Circuits. On the one hand, process optimization and root cause analysis of tools is necessary for a smooth fab operation. On the other hand, the growth in number of wafer processing steps is adding a considerable new source of noise which may have a significant impact at the nanometer scale. This paper explores the ability of historical process data and Machine Learning to support field engineers in production analysis and monitoring. We implement an automated workflow in order to analyze a large volume of information, and build a predictive model of overlay variation. The proposed workflow addresses significant problems that are typical in fab production, like missing measurements, small number of samples, confounding effects due to heterogeneity of data, and subpopulation effects. We evaluate the proposed workflow on a real use case and we show that it is able to predict overlay excursions observed in Integrated Circuits manufacturing. The chosen design focuses on linear and interpretable models of the wafer history, which highlight the process steps that are causing defective products. This is a fundamental feature for diagnostics, as it supports process engineers in the continuous improvement of the production line.
Griffin, Kingsley J; Hedge, Luke H; González-Rivero, Manuel; Hoegh-Guldberg, Ove I; Johnston, Emma L
2017-07-01
Historically, marine ecologists have lacked efficient tools that are capable of capturing detailed species distribution data over large areas. Emerging technologies such as high-resolution imaging and associated machine-learning image-scoring software are providing new tools to map species over large areas in the ocean. Here, we combine a novel diver propulsion vehicle (DPV) imaging system with free-to-use machine-learning software to semi-automatically generate dense and widespread abundance records of a habitat-forming alga over ~5,000 m² of temperate reef. We employ replicable spatial techniques to test the effectiveness of traditional diver-based sampling, and better understand the distribution and spatial arrangement of one key algal species. We found that the effectiveness of a traditional survey depended on the level of spatial structuring, and generally 10-20 transects (50 × 1 m) were required to obtain reliable results. This represents 2-20 times greater replication than has been collected in previous studies. Furthermore, we demonstrate the usefulness of fine-resolution distribution modeling for understanding patterns in canopy algae cover at multiple spatial scales, and discuss applications to other marine habitats. Our analyses demonstrate that semi-automated methods of data gathering and processing provide more accurate results than traditional methods for describing habitat structure at seascape scales, and therefore represent vastly improved techniques for understanding and managing marine seascapes.
NASA Astrophysics Data System (ADS)
Pinales, J. C.; Graber, H. C.; Hargrove, J. T.; Caruso, M. J.
2016-02-01
Previous studies have demonstrated the ability to detect and classify marine hydrocarbon films with spaceborne synthetic aperture radar (SAR) imagery. The dampening effects of hydrocarbon discharges on small surface capillary-gravity waves render the ocean surface "radar dark" compared with the standard wind-borne ocean surfaces. Given the scope and impact of events like the Deepwater Horizon oil spill, the need for improved, automated and expedient monitoring of hydrocarbon-related marine anomalies has become a pressing and complex issue for governments and the extraction industry. The research presented here describes the development, training, and utilization of an algorithm that detects marine oil spills in an automated, semi-supervised manner, utilizing X-, C-, or L-band SAR data as the primary input. Ancillary datasets include related radar-borne variables (incidence angle, etc.), environmental data (wind speed, etc.) and textural descriptors. Shapefiles produced by an experienced human analyst served as targets (validation) during the training portion of the investigation. Training and testing datasets were chosen for development and assessment of algorithm effectiveness as well as optimal conditions for oil detection in SAR data. The algorithm detects oil spills by following a 3-step methodology: object detection, feature extraction, and classification. Previous oil spill detection and classification methodologies such as machine learning algorithms, artificial neural networks (ANN), and multivariate classification methods like partial least squares-discriminant analysis (PLS-DA) are evaluated and compared. Statistical, transform, and model-based image texture techniques, commonly used for object mapping directly or as inputs for more complex methodologies, are explored to determine optimal textures for an oil spill detection system. The influence of the ancillary variables is explored, with a particular focus on the role of strong vs. weak wind forcing.
AUTOMATING ASSET KNOWLEDGE WITH MTCONNECT
Venkatesh, Sid; Ly, Sidney; Manning, Martin; Michaloski, John; Proctor, Fred
2017-01-01
In order to maximize assets, manufacturers should use real-time knowledge garnered from ongoing and continuous collection and evaluation of factory-floor machine status data. In discrete parts manufacturing, factory machine monitoring has been difficult, due primarily to closed, proprietary automation equipment that makes integration difficult. Recently, there has been a push in applying the data acquisition concepts of MTConnect to the real-time acquisition of machine status data. MTConnect is an open, free specification aimed at overcoming the “Islands of Automation” dilemma on the shop floor. With automated asset analysis, manufacturers can improve production to become lean, efficient, and effective. The focus of this paper will be on the deployment of MTConnect to collect real-time machine status to automate asset management. In addition, we will leverage the ISO 22400 standard, which defines an asset and quantifies asset performance metrics. In conjunction with these goals, the deployment of MTConnect in a large aerospace manufacturing facility will be studied with emphasis on asset management and understanding the impact of machine Overall Equipment Effectiveness (OEE) on manufacturing. PMID:28691121
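As a rough illustration of the OEE metric mentioned above, the sketch below uses the widely quoted decomposition of OEE into availability, performance, and quality factors. This is an assumption for illustration: the exact term names and formulas in ISO 22400 may differ, and the function `oee`, its parameters, and the shift figures are hypothetical rather than drawn from the paper or from MTConnect data.

```python
def oee(planned_time, run_time, ideal_cycle_time, total_count, good_count):
    """Return the three OEE factors and their product.

    Times share one unit (e.g. minutes); ideal_cycle_time is the time per part.
    """
    if min(planned_time, run_time, ideal_cycle_time, total_count) <= 0:
        raise ValueError("times and total_count must be positive")
    availability = run_time / planned_time                    # share of planned time spent running
    performance = ideal_cycle_time * total_count / run_time   # actual output vs. ideal output
    quality = good_count / total_count                        # share of parts that are good
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


# Invented figures for one 8-hour shift on a single machine.
shift = oee(planned_time=480.0, run_time=400.0, ideal_cycle_time=1.0,
            total_count=360, good_count=342)
print({name: round(value, 4) for name, value in shift.items()})
```

In a monitoring deployment, the run time and part counts would come from collected machine status data rather than hand-entered constants.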
Accuracy of automated classification of major depressive disorder as a function of symptom severity.
Ramasubbu, Rajamannar; Brown, Matthew R G; Cortese, Filmeno; Gaxiola, Ismael; Goodyear, Bradley; Greenshaw, Andrew J; Dursun, Serdar M; Greiner, Russell
2016-01-01
Growing evidence documents the potential of machine learning for developing brain based diagnostic methods for major depressive disorder (MDD). As symptom severity may influence brain activity, we investigated whether the severity of MDD affected the accuracies of machine learned MDD-vs-Control diagnostic classifiers. Forty-five medication-free patients with DSM-IV defined MDD and 19 healthy controls participated in the study. Based on depression severity as determined by the Hamilton Rating Scale for Depression (HRSD), MDD patients were sorted into three groups: mild to moderate depression (HRSD 14-19), severe depression (HRSD 20-23), and very severe depression (HRSD ≥ 24). We collected functional magnetic resonance imaging (fMRI) data during both resting-state and an emotional-face matching task. Patients in each of the three severity groups were compared against controls in separate analyses, using either the resting-state or task-based fMRI data. We use each of these six datasets with linear support vector machine (SVM) binary classifiers for identifying individuals as patients or controls. The resting-state fMRI data showed statistically significant classification accuracy only for the very severe depression group (accuracy 66%, p = 0.012 corrected), while mild to moderate (accuracy 58%, p = 1.0 corrected) and severe depression (accuracy 52%, p = 1.0 corrected) were only at chance. With task-based fMRI data, the automated classifier performed at chance in all three severity groups. Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.
Investigation of automated task learning, decomposition and scheduling
NASA Technical Reports Server (NTRS)
Livingston, David L.; Serpen, Gursel; Masti, Chandrashekar L.
1990-01-01
The details and results of research conducted in the application of neural networks to task planning and decomposition are presented. Task planning and decomposition are operations that humans perform in a reasonably efficient manner. Without the use of good heuristics and usually much human interaction, automatic planners and decomposers generally do not perform well due to the intractable nature of the problems under consideration. The human-like performance of neural networks has shown promise for generating acceptable solutions to intractable problems such as planning and decomposition. This was the primary reasoning behind attempting the study. The basis for the work is the use of state machines to model tasks. State machine models provide a useful means for examining the structure of tasks since many formal techniques have been developed for their analysis and synthesis. It is the approach to integrate the strong algebraic foundations of state machines with the heretofore trial-and-error approach to neural network synthesis.
Kim, Eun Young; Magnotta, Vincent A; Liu, Dawei; Johnson, Hans J
2014-09-01
Machine learning (ML)-based segmentation methods are a common technique in the medical image processing field. In spite of numerous research groups that have investigated ML-based segmentation frameworks, there remain unanswered aspects of performance variability for the choice of two key components: ML algorithm and intensity normalization. This investigation reveals that the choice of those elements plays a major part in determining segmentation accuracy and generalizability. The approach we have used in this study aims to evaluate relative benefits of the two elements within a subcortical MRI segmentation framework. Experiments were conducted to contrast eight machine-learning algorithm configurations and 11 normalization strategies for our brain MR segmentation framework. For the intensity normalization, a Stable Atlas-based Mapped Prior (STAMP) was utilized to take better account of contrast along boundaries of structures. Comparing eight machine learning algorithms on down-sampled segmentation MR data, it was obvious that a significant improvement was obtained using ensemble-based ML algorithms (i.e., random forest) or ANN algorithms. Further investigation between these two algorithms also revealed that the random forest results provided exceptionally good agreement with manual delineations by experts. Additional experiments showed that the effect of STAMP-based intensity normalization also improved the robustness of segmentation for multicenter data sets. The constructed framework obtained good multicenter reliability and was successfully applied on a large multicenter MR data set (n>3000). Less than 10% of automated segmentations were recommended for minimal expert intervention. These results demonstrate the feasibility of using the ML-based segmentation tools for processing large amounts of multicenter MR images. We demonstrated dramatically different result profiles in segmentation accuracy according to the choice of ML algorithm and intensity normalization. Copyright © 2014 Elsevier Inc. All rights reserved.
Automated Detection of Epileptic Biomarkers in Resting-State Interictal MEG Data
Soriano, Miguel C.; Niso, Guiomar; Clements, Jillian; Ortín, Silvia; Carrasco, Sira; Gudín, María; Mirasso, Claudio R.; Pereda, Ernesto
2017-01-01
Certain differences between brain networks of healthy and epileptic subjects have been reported even during the interictal activity, in which no epileptic seizures occur. Here, magnetoencephalography (MEG) data recorded in the resting state is used to discriminate between healthy subjects and patients with either idiopathic generalized epilepsy or frontal focal epilepsy. Signal features extracted from interictal periods without any epileptiform activity are used to train a machine learning algorithm to draw a diagnosis. This is potentially relevant to patients without frequent or easily detectable spikes. To analyze the data, we use an up-to-date machine learning algorithm and explore the benefits of including different features obtained from the MEG data as inputs to the algorithm. We find that the relative power spectral density of the MEG time-series is sufficient to distinguish between healthy and epileptic subjects with a high prediction accuracy. We also find that a combination of features such as the phase-locked value and the relative power spectral density allows discrimination between generalized and focal epilepsy, when these features are calculated over a filtered version of the signals in certain frequency bands. Machine learning algorithms are currently being applied to the analysis and classification of brain signals. It is, however, less evident how to identify the proper features of these signals that are suitable for use in such machine learning algorithms. Here, we evaluate the influence of the input feature selection on a clinical scenario to distinguish between healthy and epileptic subjects. Our results indicate that such distinction is possible with a high accuracy (86%), allowing the discrimination between idiopathic generalized and frontal focal epilepsy types. PMID:28713260
Using machine learning to predict snow water equivalent in the Sierra Nevada USA and Afghanistan
NASA Astrophysics Data System (ADS)
Bair, N.; Rittger, K.; Dozier, J.
2017-12-01
In many mountain regions, snowmelt provides most of the runoff. Ranges such as the Sierra Nevada USA benefit from hundreds of manual and automated snow measurement stations as well as basin-wide snow water equivalent (SWE) estimates from new platforms like the Airborne Snow Observatory. Thus, we have been able to use the Sierra Nevada as a testbed to validate an approach called SWE reconstruction, where the snowpack is built up in reverse using downscaled energy balance forcings. Our past work has shown that SWE reconstruction produces some of the most accurate basin-wide SWE estimates, comparable in accuracy to a snow pillow/course interpolation, but requires no in situ measurements, which is its main advantage. The disadvantages are that reconstruction cannot be used for a forecast and is only valid during the ablation period. To address these shortcomings, we have used machine learning trained on reconstructed SWE in the Sierra Nevada and Afghanistan, where there are no accessible snowpack measurements. Predictors are physiographic and remotely-sensed variables, including brightness temperatures from a new enhanced resolution passive microwave dataset. Two machine learning techniques, bagged regression trees and feed-forward neural networks, were used. Results show little bias on average and < 100 mm RMSE. For both areas, daily SWE climatology and fractional snow-covered area were the most important predictors. As expected, the passive microwave brightness temperatures showed some predictive power in Afghanistan, with its almost nonexistent tree cover, but no predictive power in the Sierra Nevada, with its extensive canopy-covered snowpack. In the Sierra, we also explored how our machine learning approach performed outside of the training period, i.e. the ablation period.
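The two error statistics quoted above, mean bias and RMSE, can be computed as in the following minimal sketch. The function name `bias_and_rmse` and the SWE values (in mm) are invented for illustration and do not come from the study; bias is taken here as the mean of predicted minus observed.

```python
import math


def bias_and_rmse(predicted, observed):
    """Mean error (bias) and root-mean-square error of paired values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for p, o in zip(predicted, observed)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Hypothetical SWE values in mm: model predictions vs. reconstructed SWE.
predicted_swe = [310.0, 420.0, 150.0, 505.0]
reference_swe = [300.0, 450.0, 160.0, 500.0]
bias, rmse = bias_and_rmse(predicted_swe, reference_swe)
print(f"bias = {bias:.2f} mm, RMSE = {rmse:.2f} mm")
```

A small bias with a larger RMSE, as in this toy case, indicates errors that largely cancel on average but remain sizeable for individual samples.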
Multichannel Convolutional Neural Network for Biological Relation Extraction.
Quan, Chanqin; Hua, Lei; Sun, Xiao; Bai, Wenjun
2016-01-01
The plethora of biomedical relations which are embedded in medical logs (records) demands researchers' attention. Previous theoretical and practical focuses were restricted to traditional machine learning techniques. However, these methods are susceptible to the issues of "vocabulary gap" and data sparseness and the unattainable automation process in feature extraction. To address the aforementioned issues, in this work, we propose a multichannel convolutional neural network (MCCNN) for automated biomedical relation extraction. The proposed model has the following two contributions: (1) it enables the fusion of multiple (e.g., five) versions in word embeddings; (2) the need for manual feature engineering can be obviated by automated feature learning with convolutional neural network (CNN). We evaluated our model on two biomedical relation extraction tasks: drug-drug interaction (DDI) extraction and protein-protein interaction (PPI) extraction. For the DDI task, our system achieved an overall F-score of 70.2% compared to the standard linear SVM based system (e.g., 67.0%) on the DDIExtraction 2013 challenge dataset. For the PPI task, we evaluated our system on the AIMed and BioInfer PPI corpora; our system exceeded the state-of-the-art ensemble SVM system by 2.7% and 5.6% on F-scores.
A Framework for Modeling Human-Machine Interactions
NASA Technical Reports Server (NTRS)
Shafto, Michael G.; Rosekind, Mark R. (Technical Monitor)
1996-01-01
Modern automated flight-control systems employ a variety of different behaviors, or modes, for managing the flight. While developments in cockpit automation have resulted in workload reduction and economical advantages, they have also given rise to an ill-defined class of human-machine problems, sometimes referred to as 'automation surprises'. Our interest in applying formal methods for describing human-computer interaction stems from our ongoing research on cockpit automation. In this area of aeronautical human factors, there is much concern about how flight crews interact with automated flight-control systems, so that the likelihood of making errors, in particular mode-errors, is minimized and the consequences of such errors are contained. The goal of the ongoing research on formal methods in this context is: (1) to develop a framework for describing human interaction with control systems; (2) to formally categorize such automation surprises; and (3) to develop tests for identification of these categories early in the specification phase of a new human-machine system.
Teaching an Old Log New Tricks with Machine Learning.
Schnell, Krista; Puri, Colin; Mahler, Paul; Dukatz, Carl
2014-03-01
To most people, the log file would not be considered an exciting area in technology today. However, these relatively benign, slowly growing data sources can drive large business transformations when combined with modern-day analytics. Accenture Technology Labs has built a new framework that helps to expand existing vendor solutions to create new methods of gaining insights from these benevolent information springs. This framework provides a systematic and effective machine-learning mechanism to understand, analyze, and visualize heterogeneous log files. These techniques enable an automated approach to analyzing log content in real time, learning relevant behaviors, and creating actionable insights applicable in traditionally reactive situations. Using this approach, companies can now tap into a wealth of knowledge residing in log file data that is currently being collected but underutilized because of its overwhelming variety and volume. By using log files as an important data input into the larger enterprise data supply chain, businesses have the opportunity to enhance their current operational log management solution and generate entirely new business insights, no longer limited to the realm of reactive IT management but extending from proactive product improvement to defense from attacks. As we will discuss, this solution has immediate relevance in the telecommunications and security industries. However, the most forward-looking companies can take it even further. How? By thinking beyond the log file and applying the same machine-learning framework to other log file use cases (including logistics, social media, and consumer behavior) and any other transactional data source.
Automated Protocol for Large-Scale Modeling of Gene Expression Data.
Hall, Michelle Lynn; Calkins, David; Sherman, Woody
2016-11-28
With the continued rise of phenotypic- and genotypic-based screening projects, computational methods to analyze, process, and ultimately make predictions in this field take on growing importance. Here we show how automated machine learning workflows can produce models that are predictive of differential gene expression as a function of a compound structure using data from A673 cells as a proof of principle. In particular, we present predictive models with an average accuracy of greater than 70% across a highly diverse ∼1000 gene expression profile. In contrast to the usual in silico design paradigm, where one interrogates a particular target-based response, this work opens the opportunity for virtual screening and lead optimization for desired multitarget gene expression profiles.
GeneRIF indexing: sentence selection based on machine learning.
Jimeno-Yepes, Antonio J; Sticco, J Caitlin; Mork, James G; Aronson, Alan R
2013-05-31
A Gene Reference Into Function (GeneRIF) describes novel functionality of genes. GeneRIFs are available from the National Center for Biotechnology Information (NCBI) Gene database. GeneRIF indexing is performed manually, and the intention of our work is to provide methods to support creating the GeneRIF entries. The creation of GeneRIF entries involves the identification of the genes mentioned in MEDLINE® citations and the sentences describing a novel function. We have compared several learning algorithms and several features extracted or derived from MEDLINE sentences to determine if a sentence should be selected for GeneRIF indexing. Features are derived from the sentences or using mechanisms to augment the information provided by them: assigning a discourse label using a previously trained model, for example. We show that machine learning approaches with specific feature combinations achieve results close to one of the annotators. We have evaluated different feature sets and learning algorithms. In particular, Naïve Bayes achieves better performance with a selection of features similar to one used in related work, which considers the location of the sentence, the discourse of the sentence and the functional terminology in it. The current performance is at a level similar to human annotation and it shows that machine learning can be used to automate the task of sentence selection for GeneRIF annotation. The current experiments are limited to the human species. We would like to see how the methodology can be extended to other species, specifically the normalization of gene mentions in other species.
Automated visual imaging interface for the plant floor
NASA Astrophysics Data System (ADS)
Wutke, John R.
1991-03-01
The paper will provide an overview of the challenges facing a user of automated visual imaging ("AVI") machines and the philosophies that should be employed in designing them. As manufacturing tools and equipment become more sophisticated, it is increasingly difficult to maintain an efficient interaction between the operator and machine. The typical user of an AVI machine in a production environment is technically unsophisticated. Also, operator and machine ergonomics are often a neglected or poorly addressed part of an efficient manufacturing process. This paper presents a number of man-machine interface design techniques and philosophies that effectively solve these problems.
Classification of ROTSE Variable Stars using Machine Learning
NASA Astrophysics Data System (ADS)
Wozniak, P. R.; Akerlof, C.; Amrose, S.; Brumby, S.; Casperson, D.; Gisler, G.; Kehoe, R.; Lee, B.; Marshall, S.; McGowan, K. E.; McKay, T.; Perkins, S.; Priedhorsky, W.; Rykoff, E.; Smith, D. A.; Theiler, J.; Vestrand, W. T.; Wren, J.; ROTSE Collaboration
2001-12-01
We evaluate several Machine Learning algorithms as potential tools for automated classification of variable stars. Using the ROTSE sample of ~1800 variables from a pilot study of 5% of the whole sky, we compare the effectiveness of a supervised technique (Support Vector Machines, SVM) versus unsupervised methods (K-means and Autoclass). There are 8 types of variables in the sample: RR Lyr AB, RR Lyr C, Delta Scuti, Cepheids, detached eclipsing binaries, contact binaries, Miras and LPVs. Preliminary results suggest a very high (~95%) efficiency of SVM in isolating a few best defined classes against the rest of the sample, and good accuracy (~70-75%) for all classes considered simultaneously. This includes some degeneracies, irreducible with the information at hand. Supervised methods naturally outperform unsupervised methods, in terms of final error rate, but unsupervised methods offer many advantages for large sets of unlabeled data. Therefore, both types of methods should be considered as promising tools for mining vast variability surveys. We project that there are more than 30,000 periodic variables in the ROTSE-I data base covering the entire local sky between V=10 and 15.5 mag. This sample size is already stretching the time capabilities of human analysts.
Task-focused modeling in automated agriculture
NASA Astrophysics Data System (ADS)
Vriesenga, Mark R.; Peleg, K.; Sklansky, Jack
1993-01-01
Machine vision systems analyze image data to carry out automation tasks. Our interest is in machine vision systems that rely on models to achieve their designed task. When the model is interrogated from an a priori menu of questions, the model need not be complete. Instead, the machine vision system can use a partial model that contains a large amount of information in regions of interest and less information elsewhere. We propose an adaptive modeling scheme for machine vision, called task-focused modeling, which constructs a model having just sufficient detail to carry out the specified task. The model is detailed in regions of interest to the task and is less detailed elsewhere. This focusing effect saves time and reduces the computational effort expended by the machine vision system. We illustrate task-focused modeling by an example involving real-time micropropagation of plants in automated agriculture.
Large robotized turning centers described
NASA Astrophysics Data System (ADS)
Kirsanov, V. V.; Tsarenko, V. I.
1985-09-01
The introduction of numerical control (NC) machine tools has made it possible to automate machining in series and small series production. The organization of automated production sections merged NC machine tools with automated transport systems. However, both the one and the other require the presence of an operative at the machine for low skilled operations. Industrial robots perform a number of auxiliary operations, such as equipment loading-unloading and control, changing cutting and auxiliary tools, controlling workpieces and parts, and cleaning of location surfaces. When used with a group of equipment they perform transfer operations between the machine tools. Industrial robots eliminate the need for workers to perform auxiliary operations. This underscores the importance of developing robotized manufacturing centers providing for minimal human participation in production and creating conditions for two and three shift operation of equipment. Work carried out at several robotized manufacturing centers for series and small series production is described.
2013-01-01
Background Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although a large amount of PPI data for different species has been generated by high-throughput experimental techniques, current PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and further, the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. Results We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions using only the information of protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequence information. Focusing on dimension reduction, an effective feature extraction method, PCA, was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machines removes the dependence of results on initial random weights and improves the prediction performance. Conclusions When applied to the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at the precision of 87.59%. Extensive experiments were performed to compare our method with the state-of-the-art Support Vector Machine (SVM) technique. Experimental results demonstrate that the proposed PCA-EELM outperforms the SVM method under 5-fold cross-validation. In addition, PCA-EELM runs faster than the PCA-SVM based method.
Consequently, the proposed approach can be considered a promising and powerful new tool for predicting PPIs with excellent performance and less time. PMID:23815620
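The aggregation step described in this abstract, combining several trained classifiers into a consensus classifier by majority voting, can be sketched as follows. This is a minimal illustration only: the function name `majority_vote` and the toy predictions are assumptions, and the extreme learning machines themselves are not reproduced here.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one consensus label list.

    predictions: list of lists, one inner list of labels per classifier,
    all of the same length (one label per sample).
    """
    n_samples = len(predictions[0])
    consensus = []
    for i in range(n_samples):
        # Count the labels that the classifiers assigned to sample i.
        votes = Counter(clf_preds[i] for clf_preds in predictions)
        consensus.append(votes.most_common(1)[0][0])
    return consensus

# Three hypothetical classifiers voting on four protein pairs
# (1 = interacting, 0 = non-interacting); the values are made up.
votes = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
print(majority_vote(votes))  # [1, 0, 1, 1]
```

Using an odd number of binary classifiers, as in this toy example, avoids tied votes.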
Nelwan, Erni J; Indrasanti, Evi; Sinto, Robert; Nurchaida, Farida; Sosrosumihardjo, Rustadi
2016-01-01
To evaluate the performance of the Vitek2 compact machine (Biomerieux Inc. ver 04.02, France) in reference to manual methods for susceptibility testing for Candida resistance among HIV/AIDS patients. A comparison study was done to evaluate the Vitek2 compact machine (Biomerieux Inc. ver 04.02, France) in reference to manual methods for susceptibility testing for Candida resistance among HIV/AIDS patients. Categorical agreement between manual disc diffusion and the Vitek2 machine was calculated using predefined criteria. Time to susceptibility result for automated and manual methods was measured. There were 137 Candida isolates comprising eight Candida species, with C. albicans and C. glabrata as the first (56.2%) and second (15.3%) most common species, respectively. For fluconazole, among the C. albicans isolates, 2.6% were found resistant by manual disc diffusion and none were found resistant by the Vitek2 machine, whereas 100% of C. krusei isolates were identified as resistant by both methods. Resistance rates for C. glabrata to fluconazole, voriconazole and amphotericin B were 52.4%, 23.8%, 23.8% by manual disc diffusion vs. 9.5%, 9.5%, 4.8% by the Vitek2 machine, respectively. Time to susceptibility result was shorter for the automated Vitek2 machine than for manual methods for all Candida species. There is good categorical agreement between manual disc diffusion and the Vitek2 machine in measuring antifungal resistance, except for C. glabrata. Time to susceptibility result for automated methods is shorter for all Candida species.
Automated Cognitive Health Assessment From Smart Home-Based Behavior Data.
Dawadi, Prafulla Nath; Cook, Diane Joyce; Schmitter-Edgecombe, Maureen
2016-07-01
Smart home technologies offer potential benefits for assisting clinicians by automating health monitoring and well-being assessment. In this paper, we examine the actual benefits of smart home-based analysis by monitoring daily behavior in the home and predicting clinical scores of the residents. To accomplish this goal, we propose a clinical assessment using activity behavior (CAAB) approach to model a smart home resident's daily behavior and predict the corresponding clinical scores. CAAB uses statistical features that describe characteristics of a resident's daily activity performance to train machine learning algorithms that predict the clinical scores. We evaluate the performance of CAAB utilizing smart home sensor data collected from 18 smart homes over two years. We obtain a statistically significant correlation (r = 0.72) between CAAB-predicted and clinician-provided cognitive scores and a statistically significant correlation (r = 0.45) between CAAB-predicted and clinician-provided mobility scores. These prediction results suggest that it is feasible to predict clinical scores using smart home sensor data and learning-based data analysis.
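The correlation values reported above compare predicted scores with clinician-provided scores. Assuming r denotes the Pearson correlation coefficient (the abstract does not name the statistic), it could be computed as in the sketch below; the function name and sample values are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up predicted vs. clinician-provided scores, for illustration only.
predicted = [10.0, 12.0, 15.0, 11.0, 18.0]
clinician = [11.0, 12.5, 14.0, 10.0, 19.0]
print(round(pearson_r(predicted, clinician), 3))
```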
Assessing the quality of activities in a smart environment.
Cook, Diane J; Schmitter-Edgecombe, M
2009-01-01
Pervasive computing technology can provide valuable health monitoring and assistance technology to help individuals live independent lives in their own homes. As a critical part of this technology, our objective is to design software algorithms that recognize and assess the consistency of activities of daily living that individuals perform in their own homes. We have designed algorithms that automatically learn Markov models for each class of activity. These models are used to recognize activities that are performed in a smart home and to identify errors and inconsistencies in the performed activity. We validate our approach using data collected from 60 volunteers who performed a series of activities in our smart apartment testbed. The results indicate that the algorithms correctly label the activities and successfully assess the completeness and consistency of the performed task. Our results indicate that activity recognition and assessment can be automated using machine learning algorithms and smart home technology. These algorithms will be useful for automating remote health monitoring and interventions.
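One way to picture "a Markov model for each class of activity" is a first-order Markov chain over sensor events, trained per activity and used to label a new event sequence by highest log-likelihood. The sketch below makes that assumption; the state names, the add-one smoothing, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

def train_markov(sequences, states, alpha=1.0):
    """Estimate smoothed transition probabilities from event sequences."""
    counts = {s: defaultdict(float) for s in states}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for s in states:
        # Additive (Laplace) smoothing so unseen transitions keep nonzero probability.
        total = sum(counts[s].values()) + alpha * len(states)
        model[s] = {t: (counts[s][t] + alpha) / total for t in states}
    return model

def log_likelihood(model, seq):
    """Log-probability of the transitions in seq under one activity model."""
    return sum(math.log(model[p][n]) for p, n in zip(seq, seq[1:]))

def classify(models, seq):
    """Return the activity label whose model best explains the sequence."""
    return max(models, key=lambda label: log_likelihood(models[label], seq))

# Hypothetical sensor states and training sequences.
states = ["kitchen", "sink", "stove", "phone"]
models = {
    "cooking": train_markov(
        [["kitchen", "stove", "sink", "stove"], ["kitchen", "sink", "stove", "stove"]],
        states,
    ),
    "phone_call": train_markov(
        [["phone", "kitchen", "phone", "phone"], ["phone", "phone", "kitchen", "phone"]],
        states,
    ),
}
print(classify(models, ["kitchen", "stove", "sink"]))  # cooking
```

An unusually low likelihood under the best-matching model could serve as a rough signal of an inconsistently performed activity, although the paper's actual error-detection criteria are not given in the abstract.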
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ibragimov, B; Pernus, F; Strojan, P
Purpose: Accurate and efficient delineation of tumor target and organs-at-risk is essential for the success of radiotherapy. In reality, despite decades of intense research efforts, auto-segmentation has not yet become clinical practice. In this study, we present, for the first time, a deep learning-based classification algorithm for autonomous segmentation in head and neck (HaN) treatment planning. Methods: Fifteen HaN datasets of CT, MR and PET images with manual annotation of organs-at-risk (OARs) including spinal cord, brainstem, optic nerves, chiasm, eyes, mandible, tongue, parotid glands were collected and saved in a library of plans. We also have ten super-resolution MR images of the tongue area, where the genioglossus and inferior longitudinalis tongue muscles are defined as organs of interest. We applied the concepts of random forest- and deep learning-based object classification for automated image annotation with the aim of using machine learning to facilitate the head and neck radiotherapy planning process. In this new paradigm of segmentation, random forests were used for landmark-assisted segmentation of super-resolution MR images. As an alternative to auto-segmentation with random forest-based landmark detection, deep convolutional neural networks were developed for voxel-wise segmentation of OARs in single and multi-modal images. The network consisted of three pairs of convolution and pooling layers, one ReLU layer and a softmax layer. Results: We present a comprehensive study on using machine learning concepts for auto-segmentation of OARs and tongue muscles for HaN radiotherapy planning. An accuracy of 81.8% in terms of Dice coefficient was achieved for segmentation of genioglossus and inferior longitudinalis tongue muscles. Preliminary results of OAR segmentation also indicate that deep learning affords unprecedented opportunities to improve the accuracy and robustness of radiotherapy planning.
Conclusion: A novel machine learning framework has been developed for image annotation and structure segmentation. Our results indicate the great potential of deep learning in radiotherapy treatment planning.
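The Dice coefficient used above to report segmentation accuracy is twice the overlap between the predicted and reference regions divided by the sum of their sizes. A minimal sketch, assuming each segmentation is represented as a set of voxel coordinates (a representation chosen here for illustration):

```python
def dice_coefficient(predicted, reference):
    """Dice overlap between two segmentations given as collections of voxels."""
    pred_set, ref_set = set(predicted), set(reference)
    if not pred_set and not ref_set:
        return 1.0  # two empty segmentations agree trivially
    overlap = len(pred_set & ref_set)
    return 2.0 * overlap / (len(pred_set) + len(ref_set))

# Toy 2D example: three voxels each, two of them shared.
predicted = {(0, 0), (0, 1), (1, 1)}
reference = {(0, 1), (1, 1), (1, 0)}
print(round(dice_coefficient(predicted, reference), 4))  # 0.6667
```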
JPRS Report, Science & Technology, Europe & Latin America.
1988-01-22
Rex Malik; ZERO UN INFORMATIQUE, 31 Aug 87) 25 FACTORY AUTOMATION, ROBOTICS West Europe Seeks To Halt Japanese Inroads in Machine Tool Sector...aircraft. 25048 CSO: 3698/A014 26 FACTORY AUTOMATION, ROBOTICS WEST EUROPE WEST EUROPE SEEKS TO HALT JAPANESE INROADS IN MACHINE TOOL SECTOR...Trumpf, by the same journalist; first paragraph is L’USINE NOUVELLE introduction] [Excerpts] European machine-tool builders are stepping up mutual
One of My Favorite Assignments: Automated Teller Machine Simulation.
ERIC Educational Resources Information Center
Oberman, Paul S.
2001-01-01
Describes an assignment for an introductory computer science class that requires the student to write a software program that simulates an automated teller machine. Highlights include an algorithm for the assignment; sample file contents; language features used; assignment variations; and discussion points. (LRW)
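The abstract does not reproduce the assignment's algorithm or file formats, so the following is only a guess at the kind of program such an assignment might ask for. The class name, the in-memory account layout, and the choice to track balances in cents are all assumptions.

```python
class ATM:
    """Toy automated teller machine; balances are kept in whole cents."""

    def __init__(self, accounts):
        # accounts maps an account id to (pin, balance_in_cents); hypothetical layout.
        self._accounts = {
            acct: {"pin": pin, "balance": balance}
            for acct, (pin, balance) in accounts.items()
        }
        self._current = None

    def login(self, account_id, pin):
        """Select an account if the PIN matches; return True on success."""
        account = self._accounts.get(account_id)
        if account is None or account["pin"] != pin:
            return False
        self._current = account
        return True

    def _require_login(self):
        if self._current is None:
            raise PermissionError("log in first")

    def balance(self):
        self._require_login()
        return self._current["balance"]

    def deposit(self, cents):
        self._require_login()
        if cents <= 0:
            raise ValueError("deposit must be positive")
        self._current["balance"] += cents

    def withdraw(self, cents):
        self._require_login()
        if cents <= 0:
            raise ValueError("withdrawal must be positive")
        if cents > self._current["balance"]:
            raise ValueError("insufficient funds")
        self._current["balance"] -= cents

atm = ATM({"alice": ("1234", 10_000)})
atm.login("alice", "1234")
atm.deposit(2_500)
atm.withdraw(4_000)
print(atm.balance())  # 8500
```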
POOL server: machine learning application for functional site prediction in proteins.
Somarowthu, Srinivas; Ondrechen, Mary Jo
2012-08-01
We present an automated web server for partial order optimum likelihood (POOL), a machine learning application that combines computed electrostatic and geometric information for high-performance prediction of catalytic residues from 3D structures. Input features consist of THEMATICS electrostatics data and pocket information from ConCavity. THEMATICS measures deviation from typical, sigmoidal titration behavior to identify functionally important residues and ConCavity identifies binding pockets by analyzing the surface geometry of protein structures. Neither THEMATICS nor ConCavity (structure only) requires the query protein to have any sequence or structure similarity to other proteins. Hence, POOL is applicable to proteins with novel folds and engineered proteins. As an additional option for cases where sequence homologues are available, users can include evolutionary information from INTREPID for enhanced accuracy in site prediction. The web site is free and open to all users with no login requirements at http://www.pool.neu.edu. Contact: m.ondrechen@neu.edu. Supplementary data are available at Bioinformatics online.
Automated discovery and construction of surface phase diagrams using machine learning
Ulissi, Zachary W.; Singh, Aayush R.; Tsai, Charlie; ...
2016-08-24
Surface phase diagrams are necessary for understanding surface chemistry in electrochemical catalysis, where a range of adsorbates and coverages exist at varying applied potentials. These diagrams are typically constructed using intuition, which risks missing complex coverages and configurations at potentials of interest. More accurate cluster expansion methods are often difficult to implement quickly for new surfaces. We adopt a machine learning approach to rectify both issues. Using a Gaussian process regression model, the free energy of all possible adsorbate coverages for surfaces is predicted for a finite number of adsorption sites. Our result demonstrates a rational, simple, and systematic approach for generating accurate free-energy diagrams with reduced computational resources. Finally, the Pourbaix diagram for the IrO2(110) surface (with nine coverages from fully hydrogenated to fully oxygenated surfaces) is reconstructed using just 20 electronic structure relaxations, compared to approximately 90 using typical search methods. Similar efficiency is demonstrated for the MoS2 surface.
Imaging, Health Record, and Artificial Intelligence: Hype or Hope?
Mazzanti, Marco; Shirka, Ervina; Gjergo, Hortensia; Hasimi, Endri
2018-05-10
The review is focused on "digital health", which means advanced analytics based on multi-modal data. The "Health Care Internet of Things", which uses sensors, apps, and remote monitoring could provide continuous clinical information in the cloud that enables clinicians to access the information they need to care for patients everywhere. Greater standardization of acquisition protocols will be needed to maximize the potential gains from automation and machine learning. Recent artificial intelligence applications on cardiac imaging will not be diagnosing patients and replacing doctors but will be augmenting their ability to find key relevant data they need to care for a patient and present it in a concise, easily digestible format. Risk stratification will transition from oversimplified population-based risk scores to machine learning-based metrics incorporating a large number of patient-specific clinical and imaging variables in real-time beyond the limits of human cognition. This will deliver highly accurate and individual personalized risk assessments and facilitate tailored management plans.
Flexible software architecture for user-interface and machine control in laboratory automation.
Arutunian, E B; Meldrum, D R; Friedman, N A; Moody, S E
1998-10-01
We describe a modular, layered software architecture for automated laboratory instruments. The design consists of a sophisticated user interface, a machine controller and multiple individual hardware subsystems, each interacting through a client-server architecture built entirely on top of open Internet standards. In our implementation, the user-interface components are built as Java applets that are downloaded from a server integrated into the machine controller. The user-interface client can thereby provide laboratory personnel with a familiar environment for experiment design through a standard World Wide Web browser. Data management and security are seamlessly integrated at the machine-controller layer using QNX, a real-time operating system. This layer also controls hardware subsystems through a second client-server interface. This architecture has proven flexible and relatively easy to implement and allows users to operate laboratory automation instruments remotely through an Internet connection. The software architecture was implemented and demonstrated on the Acapella, an automated fluid-sample-processing system that is under development at the University of Washington.
ERIC Educational Resources Information Center
Seitz, Sue; Morris, Dan
In a study on short term memory, 32 educable mentally retarded subjects (mean IQ 62.68, mean mental age 103.78 months) were randomly assigned to each of the four experimental conditions. An automated machine presented the stimuli (32 three-letter words) and the interference items (a list of random numbers read aloud between stimuli presentations).…
2012-02-29
surface and Swiss roll) and real-world data sets (UCI Machine Learning Repository [12] and USPS digit handwriting data). In our experiments, we use...less than µn ( say µ = 0.8), we can first use screening technique to select µn candidate nodes, and then apply BIPS on them for further selection and...identified from node j to node i. So we can say the probability for the existence of this connection is approximately 82%. Given the probability matrix
PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer
Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
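The AUC metric used above can be read as the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that pairwise definition; the labels and scores are made up, and the function name is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # 0.75
```

The double loop is quadratic in the number of examples, which is fine for illustration but slower than rank-based implementations on large test sets.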
Application of machine learning for the evaluation of turfgrass plots using aerial images
NASA Astrophysics Data System (ADS)
Ding, Ke; Raheja, Amar; Bhandari, Subodh; Green, Robert L.
2016-05-01
Historically, investigation of turfgrass characteristics has been limited to visual ratings. Although relevant information may result from such evaluations, final inferences may be questionable because of the subjective nature in which the data are collected. Recent advances in computer vision techniques allow researchers to objectively measure turfgrass characteristics such as percent ground cover, turf color, and turf quality from digital images. This paper focuses on developing a methodology for automated assessment of turfgrass quality from aerial images. Images of several turfgrass plots of varying quality were gathered using a camera mounted on an unmanned aerial vehicle. The quality of these plots was also evaluated based on visual ratings. The goal was to use the aerial images to generate quality evaluations on a regular basis for the optimization of water treatment. Aerial images are used to train a neural network so that appropriate features such as intensity, color, and texture of the turfgrass are extracted from these images. A neural network is a nonlinear classifier commonly used in machine learning. The output of the trained neural network model is a set of ratings of the grass, which are compared to the visual ratings. Currently, the quality and the color of turfgrass, measured as the greenness of the grass, are evaluated. The textures are calculated using the Gabor filter and co-occurrence matrix. Other classifiers such as support vector machines and simpler linear regression models such as Ridge regression and LARS regression are also used. The performance of each model is compared. The results show encouraging potential for using machine learning techniques for the evaluation of turfgrass quality and color.
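A gray-level co-occurrence matrix, one of the two texture tools named above, counts how often a pixel with gray level i has a neighbor with gray level j at a fixed offset. The sketch below builds such a matrix for a tiny quantized image and derives a contrast feature from it. The offset, the number of gray levels, and the function names are assumptions, since the abstract does not give these details.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count gray-level pairs (i, j) at pixel offset (dy, dx).

    image: list of rows of integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts

def contrast(counts):
    """Contrast texture feature: mean squared gray-level difference of pairs."""
    total = sum(sum(row) for row in counts)
    return sum(
        (i - j) ** 2 * n / total
        for i, row in enumerate(counts)
        for j, n in enumerate(row)
    )

# Toy 3x3 image quantized to three gray levels.
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
matrix = cooccurrence_matrix(image, levels=3)
print(matrix)  # [[1, 2, 0], [0, 0, 2], [0, 0, 1]]
print(round(contrast(matrix), 4))  # 0.6667
```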
Xu, Lina; Tetteh, Giles; Lipkova, Jana; Zhao, Yu; Li, Hongwei; Christ, Patrick; Piraud, Marie; Buck, Andreas; Shi, Kuangyu; Menze, Bjoern H
2018-01-01
The identification of bone lesions is crucial in the diagnostic assessment of multiple myeloma (MM). 68Ga-Pentixafor PET/CT can capture the abnormal molecular expression of CXCR-4 in addition to anatomical changes. However, whole-body detection of dozens of lesions on hybrid imaging is tedious and error prone. It is even more difficult to identify lesions with a large heterogeneity. This study employed deep learning methods to automatically combine characteristics of PET and CT for whole-body MM bone lesion detection in a 3D manner. Two convolutional neural networks (CNNs), V-Net and W-Net, were adopted to segment and detect the lesions. The feasibility of deep learning for lesion detection on 68Ga-Pentixafor PET/CT was first verified on digital phantoms generated using realistic PET simulation methods. Then the proposed methods were evaluated on real 68Ga-Pentixafor PET/CT scans of MM patients. The preliminary results showed that deep learning methods can leverage multimodal information for spatial feature representation, and W-Net obtained the best result for segmentation and lesion detection. It also outperformed traditional machine learning methods such as random forest classifier (RF), k-Nearest Neighbors (k-NN), and support vector machine (SVM). The proof-of-concept study encourages further development of deep learning approaches for MM lesion detection in population studies.
Tetteh, Giles; Lipkova, Jana; Zhao, Yu; Li, Hongwei; Christ, Patrick; Buck, Andreas; Menze, Bjoern H.
2018-01-01
The identification of bone lesions is crucial in the diagnostic assessment of multiple myeloma (MM). 68Ga-Pentixafor PET/CT can capture the abnormal molecular expression of CXCR-4 in addition to anatomical changes. However, whole-body detection of dozens of lesions on hybrid imaging is tedious and error prone. It is even more difficult to identify lesions with a large heterogeneity. This study employed deep learning methods to automatically combine characteristics of PET and CT for whole-body MM bone lesion detection in a 3D manner. Two convolutional neural networks (CNNs), V-Net and W-Net, were adopted to segment and detect the lesions. The feasibility of deep learning for lesion detection on 68Ga-Pentixafor PET/CT was first verified on digital phantoms generated using realistic PET simulation methods. Then the proposed methods were evaluated on real 68Ga-Pentixafor PET/CT scans of MM patients. The preliminary results showed that deep learning methods can leverage multimodal information for spatial feature representation, and W-Net obtained the best result for segmentation and lesion detection. It also outperformed traditional machine learning methods such as random forest classifier (RF), k-Nearest Neighbors (k-NN), and support vector machine (SVM). The proof-of-concept study encourages further development of deep learning approaches for MM lesion detection in population studies. PMID:29531504
Misra, Dharitri; Chen, Siyuan; Thoma, George R
2009-01-01
One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U.S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system.
Taylor, R Andrew; Pare, Joseph R; Venkatesh, Arjun K; Mowafi, Hani; Melnick, Edward R; Fleischman, William; Hall, M Kennedy
2016-03-01
Predictive analytics in emergency care has mostly been limited to the use of clinical decision rules (CDRs) in the form of simple heuristics and scoring systems. In the development of CDRs, limitations in analytic methods and concerns with usability have generally constrained models to a preselected small set of variables judged to be clinically relevant and to rules that are easily calculated. Furthermore, CDRs frequently suffer from questions of generalizability, take years to develop, and lack the ability to be updated as new information becomes available. Newer analytic and machine learning techniques capable of harnessing the large number of variables that are already available through electronic health records (EHRs) may better predict patient outcomes and facilitate automation and deployment within clinical decision support systems. In this proof-of-concept study, a local, big data-driven, machine learning approach is compared to existing CDRs and traditional analytic methods using the prediction of sepsis in-hospital mortality as the use case. This was a retrospective study of adult ED visits admitted to the hospital meeting criteria for sepsis from October 2013 to October 2014. Sepsis was defined as meeting criteria for systemic inflammatory response syndrome with an infectious admitting diagnosis in the ED. ED visits were randomly partitioned into an 80%/20% split for training and validation. A random forest model (machine learning approach) was constructed using over 500 clinical variables from data available within the EHRs of four hospitals to predict in-hospital mortality. The machine learning prediction model was then compared to a classification and regression tree (CART) model, logistic regression model, and previously developed prediction tools on the validation data set using area under the receiver operating characteristic curve (AUC) and chi-square statistics. There were 5,278 visits among 4,676 unique patients who met criteria for sepsis. 
Of the 4,222 patients in the training group, 210 (5.0%) died during hospitalization, and of the 1,056 patients in the validation group, 50 (4.7%) died during hospitalization. The AUCs with 95% confidence intervals (CIs) for the different models were as follows: random forest model, 0.86 (95% CI = 0.82 to 0.90); CART model, 0.69 (95% CI = 0.62 to 0.77); logistic regression model, 0.76 (95% CI = 0.69 to 0.82); CURB-65, 0.73 (95% CI = 0.67 to 0.80); MEDS, 0.71 (95% CI = 0.63 to 0.77); and mREMS, 0.72 (95% CI = 0.65 to 0.79). The random forest model AUC was statistically different from all other models (p ≤ 0.003 for all comparisons). In this proof-of-concept study, a local big data-driven, machine learning approach outperformed existing CDRs as well as traditional analytic techniques for predicting in-hospital mortality of ED patients with sepsis. Future research should prospectively evaluate the effectiveness of this approach and whether it translates into improved clinical outcomes for high-risk sepsis patients. The methods developed serve as an example of a new model for predictive analytics in emergency care that can be automated, applied to other clinical outcomes of interest, and deployed in EHRs to enable locally relevant clinical predictions. © 2015 by the Society for Academic Emergency Medicine.
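The random 80%/20% partition of ED visits described in this record can be sketched with the standard library alone. The function name, the fixed seed, and the use of integer visit ids are assumptions made for illustration; the study's actual partitioning procedure is not described beyond the split ratio.

```python
import random

def random_partition(items, train_fraction=0.8, seed=0):
    """Shuffle items reproducibly and split them into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# 5,278 placeholder visit ids, matching the visit count reported above.
visit_ids = range(5278)
train, validation = random_partition(visit_ids)
print(len(train), len(validation))  # 4222 1056
```

With 5,278 visits, truncating 80% gives 4,222 training and 1,056 validation records, which matches the group sizes reported in the abstract.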
Taylor, R. Andrew; Pare, Joseph R.; Venkatesh, Arjun K.; Mowafi, Hani; Melnick, Edward R.; Fleischman, William; Hall, M. Kennedy
2018-01-01
Objectives Predictive analytics in emergency care has mostly been limited to the use of clinical decision rules (CDRs) in the form of simple heuristics and scoring systems. In the development of CDRs, limitations in analytic methods and concerns with usability have generally constrained models to a preselected small set of variables judged to be clinically relevant and to rules that are easily calculated. Furthermore, CDRs frequently suffer from questions of generalizability, take years to develop, and lack the ability to be updated as new information becomes available. Newer analytic and machine learning techniques capable of harnessing the large number of variables that are already available through electronic health records (EHRs) may better predict patient outcomes and facilitate automation and deployment within clinical decision support systems. In this proof-of-concept study, a local, big data–driven, machine learning approach is compared to existing CDRs and traditional analytic methods using the prediction of sepsis in-hospital mortality as the use case. Methods This was a retrospective study of adult ED visits admitted to the hospital meeting criteria for sepsis from October 2013 to October 2014. Sepsis was defined as meeting criteria for systemic inflammatory response syndrome with an infectious admitting diagnosis in the ED. ED visits were randomly partitioned into an 80%/20% split for training and validation. A random forest model (machine learning approach) was constructed using over 500 clinical variables from data available within the EHRs of four hospitals to predict in-hospital mortality. The machine learning prediction model was then compared to a classification and regression tree (CART) model, logistic regression model, and previously developed prediction tools on the validation data set using area under the receiver operating characteristic curve (AUC) and chi-square statistics. 
Results There were 5,278 visits among 4,676 unique patients who met criteria for sepsis. Of the 4,222 patients in the training group, 210 (5.0%) died during hospitalization, and of the 1,056 patients in the validation group, 50 (4.7%) died during hospitalization. The AUCs with 95% confidence intervals (CIs) for the different models were as follows: random forest model, 0.86 (95% CI = 0.82 to 0.90); CART model, 0.69 (95% CI = 0.62 to 0.77); logistic regression model, 0.76 (95% CI = 0.69 to 0.82); CURB-65, 0.73 (95% CI = 0.67 to 0.80); MEDS, 0.71 (95% CI = 0.63 to 0.77); and mREMS, 0.72 (95% CI = 0.65 to 0.79). The random forest model AUC was statistically different from all other models (p ≤ 0.003 for all comparisons). Conclusions In this proof-of-concept study, a local big data–driven, machine learning approach outperformed existing CDRs as well as traditional analytic techniques for predicting in-hospital mortality of ED patients with sepsis. Future research should prospectively evaluate the effectiveness of this approach and whether it translates into improved clinical outcomes for high-risk sepsis patients. The methods developed serve as an example of a new model for predictive analytics in emergency care that can be automated, applied to other clinical outcomes of interest, and deployed in EHRs to enable locally relevant clinical predictions. PMID:26679719
Jurrus, Elizabeth; Watanabe, Shigeki; Giuly, Richard J.; Paiva, Antonio R. C.; Ellisman, Mark H.; Jorgensen, Erik M.; Tasdizen, Tolga
2013-01-01
Neuroscientists are developing new imaging techniques and generating large volumes of data in an effort to understand the complex structure of the nervous system. The complexity and size of this data make human interpretation a labor-intensive task. To aid in the analysis, new segmentation techniques for identifying neurons in these feature rich datasets are required. This paper presents a method for neuron boundary detection and nonbranching process segmentation in electron microscopy images and visualizing them in three dimensions. It combines automated segmentation techniques with a graphical user interface for correction of mistakes in the automated process. The automated process first uses machine learning and image processing techniques to identify neuron membranes that delineate the cells in each two-dimensional section. To segment nonbranching processes, the cell regions in each two-dimensional section are connected in 3D using correlation of regions between sections. The combination of this method with a graphical user interface specially designed for this purpose enables users to quickly segment cellular processes in large volumes. PMID:22644867
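Connecting cell regions across adjacent two-dimensional sections can be illustrated with a simple overlap score. Note that the sketch below substitutes pixel overlap (intersection over union) for the region correlation measure used in the paper, whose exact form is not given in the abstract. The function name, the threshold, and the region ids are hypothetical.

```python
def link_regions(section_a, section_b, min_overlap=0.5):
    """Link each region in section_a to its best-overlapping region in section_b.

    Each section maps a region id to a set of (row, col) pixels.
    Regions whose best overlap falls below min_overlap stay unlinked.
    """
    links = {}
    for id_a, pixels_a in section_a.items():
        best_id, best_score = None, 0.0
        for id_b, pixels_b in section_b.items():
            union = len(pixels_a | pixels_b)
            score = len(pixels_a & pixels_b) / union if union else 0.0
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= min_overlap:
            links[id_a] = best_id
    return links

# Two toy sections: region a1 continues as b1, region a2 has no counterpart.
section_a = {"a1": {(0, 0), (0, 1), (1, 0), (1, 1)}, "a2": {(5, 5), (5, 6)}}
section_b = {"b1": {(0, 1), (1, 0), (1, 1), (1, 2)}, "b2": {(9, 9)}}
print(link_regions(section_a, section_b))  # {'a1': 'b1'}
```

Chaining such links section by section would trace a nonbranching process through the volume, with unlinked regions left for manual correction.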
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jurrus, Elizabeth R.; Watanabe, Shigeki; Giuly, Richard J.
2013-01-01
Neuroscientists are developing new imaging techniques and generating large volumes of data in an effort to understand the complex structure of the nervous system. The complexity and size of these data make human interpretation a labor-intensive task. To aid in the analysis, new segmentation techniques for identifying neurons in these feature-rich datasets are required. This paper presents a method for neuron boundary detection and nonbranching process segmentation in electron microscopy images and visualizing them in three dimensions. It combines automated segmentation techniques with a graphical user interface for correction of mistakes in the automated process. The automated process first uses machine learning and image processing techniques to identify neuron membranes that delineate the cells in each two-dimensional section. To segment nonbranching processes, the cell regions in each two-dimensional section are connected in 3D using correlation of regions between sections. The combination of this method with a graphical user interface specially designed for this purpose enables users to quickly segment cellular processes in large volumes.
Going deeper in the automated identification of Herbarium specimens.
Carranza-Rojas, Jose; Goeau, Herve; Bonnet, Pierre; Mata-Montero, Erick; Joly, Alexis
2017-08-11
Hundreds of herbarium collections have accumulated a valuable heritage and knowledge of plants over several centuries. Recent initiatives started ambitious preservation plans to digitize this information and make it available to botanists and the general public through web portals. However, thousands of sheets are still unidentified at the species level while numerous sheets should be reviewed and updated following more recent taxonomic knowledge. These annotations and revisions require an unrealistic amount of work for botanists to carry out in a reasonable time. Computer vision and machine learning approaches applied to herbarium sheets are promising but are still not well studied compared to automated species identification from leaf scans or pictures of plants in the field. In this work, we propose to study and evaluate the accuracy with which herbarium images can be potentially exploited for species identification with deep learning technology. In addition, we propose to study if the combination of herbarium sheets with photos of plants in the field is relevant in terms of accuracy, and finally, we explore if herbarium images from one region that has one specific flora can be used to do transfer learning to another region with other species; for example, on a region under-represented in terms of collected data. This is, to our knowledge, the first study that uses deep learning to analyze a big dataset with thousands of species from herbaria. Results show the potential of Deep Learning on herbarium species identification, particularly by training and testing across different datasets from different herbaria. This could potentially lead to the creation of a semi, or even fully automated system to help taxonomists and experts with their annotation, classification, and revision works.
A Santos, Jose C; Nassif, Houssam; Page, David; Muggleton, Stephen H; E Sternberg, Michael J
2012-07-11
There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions. The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues cys and leu. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature. In addition to confirming literature results, ProGolem's model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners.
Supporting the Growing Needs of the GIS Industry
NASA Technical Reports Server (NTRS)
2003-01-01
Visual Learning Systems, Inc. (VLS), of Missoula, Montana, has developed a commercial software application called Feature Analyst. Feature Analyst was conceived under a Small Business Innovation Research (SBIR) contract with NASA's Stennis Space Center, and through the Montana State University TechLink Center, an organization funded by NASA and the U.S. Department of Defense to link regional companies with Federal laboratories for joint research and technology transfer. The software provides a paradigm shift to automated feature extraction, as it utilizes spectral, spatial, temporal, and ancillary information to model the feature extraction process; presents the ability to remove clutter; incorporates advanced machine learning techniques to supply unparalleled levels of accuracy; and includes an exceedingly simple interface for feature extraction.
Tewary, S; Arun, I; Ahmed, R; Chatterjee, S; Chakraborty, C
2017-11-01
In prognostic evaluation of breast cancer, the immunohistochemical (IHC) markers oestrogen receptor (ER) and progesterone receptor (PR) are widely used. The expert pathologist qualitatively examines the stained tissue slide under a microscope to provide the Allred score, which is used clinically for therapeutic decision making. Such qualitative judgment is time-consuming and tedious, and often suffers from interobserver variability. As a result, it leads to imprecise IHC scores for ER and PR. To overcome this, there is an urgent need to develop a reliable and efficient IHC quantifier for high-throughput decision making. In view of this, our study aims at developing an automated IHC profiler for quantitative assessment of ER and PR molecular expression from stained tissue images. We propose to use the CMYK colour space to extract positively and negatively stained cells for the proportion score. Colour features are also used for quantitative assessment of intensity scoring among the positively stained cells. Five different machine learning models, namely artificial neural network, Naïve Bayes, K-nearest neighbours, decision tree and random forest, are considered for learning the colour features using average red, green and blue pixel values of positively stained cell patches. Fifty cases of ER- and PR-stained tissues have been evaluated for validation with the expert pathologist's score. All five models perform adequately, with random forest showing the best correlation with the expert's score (Pearson's correlation coefficient = 0.9192). In the proposed approach the average variation of diaminobenzidine (DAB) to nuclear area from the expert's score is found to be 7.58%, as compared to 27.83% for state-of-the-art ImmunoRatio software. © 2017 The Authors Journal of Microscopy © 2017 Royal Microscopical Society.
Radio Galaxy Zoo: Machine learning for radio source host galaxy cross-identification
NASA Astrophysics Data System (ADS)
Alger, M. J.; Banfield, J. K.; Ong, C. S.; Rudnick, L.; Wong, O. I.; Wolf, C.; Andernach, H.; Norris, R. P.; Shabala, S. S.
2018-05-01
We consider the problem of determining the host galaxies of radio sources by cross-identification. This has traditionally been done manually, which will be intractable for wide-area radio surveys like the Evolutionary Map of the Universe (EMU). Automated cross-identification will be critical for these future surveys, and machine learning may provide the tools to develop such methods. We apply a standard approach from computer vision to cross-identification, introducing one possible way of automating this problem, and explore the pros and cons of this approach. We apply our method to the 1.4 GHz Australian Telescope Large Area Survey (ATLAS) observations of the Chandra Deep Field South (CDFS) and the ESO Large Area ISO Survey South 1 (ELAIS-S1) fields by cross-identifying them with the Spitzer Wide-area Infrared Extragalactic (SWIRE) survey. We train our method with two sets of data: expert cross-identifications of CDFS from the initial ATLAS data release and crowdsourced cross-identifications of CDFS from Radio Galaxy Zoo. We found that a simple strategy of cross-identifying a radio component with the nearest galaxy performs comparably to our more complex methods, though our estimated best-case performance is near 100 per cent. ATLAS contains 87 complex radio sources that have been cross-identified by experts, so there are not enough complex examples to learn how to cross-identify them accurately. Much larger datasets are therefore required for training methods like ours. We also show that training our method on Radio Galaxy Zoo cross-identifications gives comparable results to training on expert cross-identifications, demonstrating the value of crowdsourced training data.
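The nearest-galaxy baseline mentioned in the abstract above can be sketched in a few lines: each radio component is assigned to the catalogue galaxy with the smallest angular separation. This is an illustrative sketch only; the function names, object names, and coordinates are assumptions made up for the example and do not come from ATLAS or SWIRE.

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def nearest_galaxy(component, galaxies):
    """Return the name of the catalogue galaxy closest to a radio component."""
    ra, dec = component
    return min(galaxies,
               key=lambda name: angular_separation(ra, dec, *galaxies[name]))


# Hypothetical (RA, Dec) positions in degrees, invented for illustration.
galaxies = {"G1": (52.900, -28.100), "G2": (52.910, -28.095), "G3": (53.200, -27.800)}
radio_components = {"R1": (52.909, -28.096), "R2": (53.190, -27.805)}

matches = {r: nearest_galaxy(pos, galaxies) for r, pos in radio_components.items()}
print(matches)
```

A linear scan over the catalogue is adequate for a toy example; a survey-scale catalogue would need a spatial index, and this baseline by construction cannot handle complex multi-component sources.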
On the virtues of automated quantitative structure-activity relationship: the new kid on the block.
de Oliveira, Marcelo T; Katekawa, Edson
2018-02-01
Quantitative structure-activity relationship (QSAR) has proved to be an invaluable tool in medicinal chemistry. Data availability at unprecedented levels through various databases has contributed to a resurgence of interest in QSAR. In this context, rapid generation of quality predictive models is highly desirable for hit identification and lead optimization. We showcase the application of an automated QSAR approach, which randomly selects multiple training/test sets and utilizes machine-learning algorithms to generate predictive models. Results demonstrate that AutoQSAR produces models of improved or similar quality to those generated by practitioners in the field but in just a fraction of the time. Despite the concept's potential to benefit the community, the AutoQSAR opportunity has been largely undervalued.
Space Station man-machine automation trade-off analysis
NASA Technical Reports Server (NTRS)
Zimmerman, W. F.; Bard, J.; Feinberg, A.
1985-01-01
The man-machine automation tradeoff methodology presented is one of four research tasks comprising the autonomous spacecraft system technology (ASST) project. ASST was established to identify and study system-level design problems for autonomous spacecraft. Using the Space Station as an example spacecraft system requiring a certain level of autonomous control, a system-level, man-machine automation tradeoff methodology is presented that: (1) optimizes man-machine mixes for different ground and on-orbit crew functions subject to cost, safety, weight, power, and reliability constraints, and (2) plots the best incorporation plan for new, emerging technologies by weighing cost, relative availability, reliability, safety, importance to out-year missions, and ease of retrofit. Although the methodology takes a fairly straightforward approach to valuing human productivity, it is still sensitive to the important subtleties associated with designing a well-integrated man-machine system. These subtleties include considerations such as crew preference to retain certain spacecraft control functions, or valuing human integration/decision capabilities over equivalent hardware/software where appropriate.
Respiratory Artefact Removal in Forced Oscillation Measurements: A Machine Learning Approach.
Pham, Thuy T; Thamrin, Cindy; Robinson, Paul D; McEwan, Alistair L; Leong, Philip H W
2017-08-01
Respiratory artefact removal for the forced oscillation technique can be treated as an anomaly detection problem. Manual removal is currently considered the gold standard, but this approach is laborious and subjective. Most existing automated techniques used simple statistics and/or rejected anomalous data points. Unfortunately, simple statistics are insensitive to numerous artefacts, leading to low reproducibility of results. Furthermore, rejecting anomalous data points causes an imbalance between the inspiratory and expiratory contributions. From a machine learning perspective, such methods are unsupervised and can be considered simple feature extraction. We hypothesize that supervised techniques can be used to find improved features that are more discriminative and more highly correlated with the desired output. Features thus found are then used for anomaly detection by applying quartile thresholding, which rejects complete breaths if one of its features is out of range. The thresholds are determined by both saliency and performance metrics rather than qualitative assumptions as in previous works. Feature ranking indicates that our new landmark features are among the highest scoring candidates regardless of age across saliency criteria. F1-scores, receiver operating characteristic, and variability of the mean resistance metrics show that the proposed scheme outperforms previous simple feature extraction approaches. Our subject-independent detector, 1IQR-SU, demonstrated approval rates of 80.6% for adults and 98% for children, higher than existing methods. Our new features are more relevant. Our removal is objective and comparable to the manual method. This is a critical work to automate forced oscillation technique quality control.
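The quartile-thresholding step described above, in which a complete breath is rejected if any one of its features falls out of range, can be sketched as follows. This is a hedged illustration, not the authors' detector: the feature names, the toy values, and the choice of a 1×IQR margin (suggested only by the detector's name, 1IQR-SU) are assumptions.

```python
import statistics


def iqr_bounds(values, k=1.0):
    """Acceptance range [Q1 - k*IQR, Q3 + k*IQR] for one feature."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def accept_breaths(breaths, k=1.0):
    """Return one boolean per breath; a breath is rejected as a whole
    if any of its features lies outside that feature's range."""
    names = list(breaths[0].keys())
    bounds = {n: iqr_bounds([b[n] for b in breaths], k) for n in names}
    return [all(bounds[n][0] <= b[n] <= bounds[n][1] for n in names)
            for b in breaths]


# Hypothetical per-breath features, invented for illustration.
resistance = [3.0, 3.1, 2.9, 3.2, 3.0, 3.1, 6.5, 3.0]
peak_flow = [0.50, 0.52, 0.49, 0.51, 0.10, 0.50, 0.51, 0.48]
breaths = [{"resistance": r, "peak_flow": f} for r, f in zip(resistance, peak_flow)]

accepted = accept_breaths(breaths)
print(accepted)
```

Rejecting whole breaths rather than individual data points is what keeps the inspiratory and expiratory contributions balanced, which is the concern the abstract raises about point-wise rejection.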
DOE Office of Scientific and Technical Information (OSTI.GOV)
Masci, Frank J.; Grillmair, Carl J.; Cutri, Roc M.
2014-07-01
We describe a methodology to classify periodic variable stars identified using photometric time-series measurements constructed from the Wide-field Infrared Survey Explorer (WISE) full-mission single-exposure Source Databases. This will assist in the future construction of a WISE Variable Source Database that assigns variables to specific science classes as constrained by the WISE observing cadence with statistically meaningful classification probabilities. We have analyzed the WISE light curves of 8273 variable stars identified in previous optical variability surveys (MACHO, GCVS, and ASAS) and show that Fourier decomposition techniques can be extended into the mid-IR to assist with their classification. Combined with other periodic light-curve features, this sample is then used to train a machine-learned classifier based on the random forest (RF) method. Consistent with previous classification studies of variable stars in general, the RF machine-learned classifier is superior to other methods in terms of accuracy, robustness against outliers, and relative immunity to features that carry little or redundant class information. For the three most common classes identified by WISE: Algols, RR Lyrae, and W Ursae Majoris type variables, we obtain classification efficiencies of 80.7%, 82.7%, and 84.5% respectively using cross-validation analyses, with 95% confidence intervals of approximately ±2%. These accuracies are achieved at purity (or reliability) levels of 88.5%, 96.2%, and 87.8% respectively, similar to that achieved in previous automated classification studies of periodic variable stars.
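The per-class figures quoted above can be illustrated with a small sketch. It assumes the common definitions, which the abstract itself does not spell out: efficiency is the fraction of true members of a class that the classifier recovers, and purity is the fraction of objects assigned to a class that truly belong to it. The labels below are made up for the example.

```python
from collections import Counter


def efficiency_and_purity(true_labels, predicted_labels, cls):
    """Return (efficiency, purity) for one class.

    Assumed definitions: efficiency = TP / (all true members of cls),
    purity = TP / (all objects predicted as cls).
    """
    pairs = Counter(zip(true_labels, predicted_labels))
    tp = pairs[(cls, cls)]
    n_true = sum(c for (t, _), c in pairs.items() if t == cls)
    n_pred = sum(c for (_, p), c in pairs.items() if p == cls)
    efficiency = tp / n_true if n_true else float("nan")
    purity = tp / n_pred if n_pred else float("nan")
    return efficiency, purity


# Toy labels, invented for illustration only.
true_labels = ["Algol", "Algol", "Algol", "RRLyr", "RRLyr", "WUMa", "WUMa", "WUMa"]
predicted_labels = ["Algol", "Algol", "WUMa", "RRLyr", "RRLyr", "WUMa", "WUMa", "Algol"]

for cls in ("Algol", "RRLyr", "WUMa"):
    print(cls, efficiency_and_purity(true_labels, predicted_labels, cls))
```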
Supervised machine learning and active learning in classification of radiology reports.
Nguyen, Dung H M; Patrick, Jon D
2014-01-01
This paper presents an automated system for classifying the results of imaging examinations (CT, MRI, positron emission tomography) into reportable and non-reportable cancer cases. This system is part of an industrial-strength processing pipeline built to extract content from radiology reports for use in the Victorian Cancer Registry. In addition to traditional supervised learning methods such as conditional random fields and support vector machines, active learning (AL) approaches were investigated to optimize training production and further improve classification performance. The project involved two pilot sites in Victoria, Australia (Lake Imaging (Ballarat) and Peter MacCallum Cancer Centre (Melbourne)) and, in collaboration with the NSW Central Registry, one pilot site at Westmead Hospital (Sydney). The reportability classifier performance achieved 98.25% sensitivity and 96.14% specificity on the cancer registry's held-out test set. Up to 92% of training data needed for supervised machine learning can be saved by AL. AL is a promising method for optimizing the supervised training production used in classification of radiology reports. When an AL strategy is applied during the data selection process, the cost of manual classification can be reduced significantly. The most important practical application of the reportability classifier is that it can dramatically reduce human effort in identifying relevant reports from the large imaging pool for further investigation of cancer. The classifier is built on a large real-world dataset and can achieve high performance in filtering relevant reports to support cancer registries. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Automated EEG-based screening of depression using deep convolutional neural network.
Acharya, U Rajendra; Oh, Shu Lih; Hagiwara, Yuki; Tan, Jen Hong; Adeli, Hojjat; Subha, D P
2018-07-01
In recent years, advanced neurocomputing and machine learning techniques have been used for Electroencephalogram (EEG)-based diagnosis of various neurological disorders. In this paper, a novel computer model is presented for EEG-based screening of depression using a deep neural network machine learning approach, known as Convolutional Neural Network (CNN). The proposed technique does not require a semi-manually-selected set of features to be fed into a classifier for classification. It learns automatically and adaptively from the input EEG signals to differentiate EEGs obtained from depressive and normal subjects. The model was tested using EEGs obtained from 15 normal and 15 depressed patients. The algorithm attained accuracies of 93.5% and 96.0% using EEG signals from the left and right hemisphere, respectively. It was discovered in this research that the EEG signals from the right hemisphere are more distinctive in depression than those from the left hemisphere. This discovery is consistent with recent research and revelation that the depression is associated with a hyperactive right hemisphere. An exciting extension of this research would be diagnosis of different stages and severity of depression and development of a Depression Severity Index (DSI). Copyright © 2018 Elsevier B.V. All rights reserved.
Machine assisted histogram classification
NASA Astrophysics Data System (ADS)
Benyó, B.; Gaspar, C.; Somogyi, P.
2010-04-01
LHCb is one of the four major experiments under completion at the Large Hadron Collider (LHC). Monitoring the quality of the acquired data is important, because it allows the verification of the detector performance. Anomalies, such as missing values or unexpected distributions, can be indicators of a malfunctioning detector, resulting in poor data quality. Spotting faulty or ageing components can be done either visually using instruments, such as the LHCb Histogram Presenter, or with the help of automated tools. In order to assist detector experts in handling the vast monitoring information resulting from the sheer size of the detector, we propose a graph-based clustering tool combined with a machine learning algorithm and demonstrate its use by processing histograms representing 2D hitmap events. We prove the concept by detecting ion feedback events in the LHCb experiment's RICH subdetector.
Machine Vision Systems for Processing Hardwood Lumber and Logs
Philip A. Araman; Daniel L. Schmoldt; Tai-Hoon Cho; Dongping Zhu; Richard W. Conners; D. Earl Kline
1992-01-01
Machine vision and automated processing systems are under development at Virginia Tech University with support and cooperation from the USDA Forest Service. Our goals are to help U.S. hardwood producers automate, reduce costs, increase product volume and value recovery, and market higher value, more accurately graded and described products. Any vision system is...
Using microwave Doppler radar in automated manufacturing applications
NASA Astrophysics Data System (ADS)
Smith, Gregory C.
Since the beginning of the Industrial Revolution, manufacturers worldwide have used automation to improve productivity, gain market share, and meet growing or changing consumer demand for manufactured products. To stimulate further industrial productivity, manufacturers need more advanced automation technologies: "smart" part handling systems, automated assembly machines, CNC machine tools, and industrial robots that use new sensor technologies, advanced control systems, and intelligent decision-making algorithms to "see," "hear," "feel," and "think" at the levels needed to handle complex manufacturing tasks without human intervention. The investigator's dissertation offers three methods that could help make "smart" CNC machine tools and industrial robots possible: (1) A method for detecting acoustic emission using a microwave Doppler radar detector, (2) A method for detecting tool wear on a CNC lathe using a Doppler radar detector, and (3) An online non-contact method for detecting industrial robot position errors using a microwave Doppler radar motion detector. The dissertation studies indicate that microwave Doppler radar could be quite useful in automated manufacturing applications. In particular, the methods developed may help solve two difficult problems that hinder further progress in automating manufacturing processes: (1) Automating metal-cutting operations on CNC machine tools by providing a reliable non-contact method for detecting tool wear, and (2) Fully automating robotic manufacturing tasks by providing a reliable low-cost non-contact method for detecting on-line position errors. In addition, the studies offer a general non-contact method for detecting acoustic emission that may be useful in many other manufacturing and non-manufacturing areas, as well (e.g., monitoring and nondestructively testing structures, materials, manufacturing processes, and devices). 
By advancing the state of the art in manufacturing automation, the studies may help stimulate future growth in industrial productivity, which also promises to fuel economic growth and promote economic stability. The study also benefits the Department of Industrial Technology at Iowa State University and the field of Industrial Technology by contributing to the ongoing "smart" machine research program within the Department of Industrial Technology and by stimulating research into new sensor technologies within the University and within the field of Industrial Technology.
Eavesdropping on the Arctic: Automated bioacoustics reveal dynamics in songbird breeding phenology
Ellis, Daniel P. W.; Pérez, Jonathan H.; Wingfield, John C.; Boelman, Natalie T.
2018-01-01
Bioacoustic networks could vastly expand the coverage of wildlife monitoring to complement satellite observations of climate and vegetation. This approach would enable global-scale understanding of how climate change influences phenomena such as migratory timing of avian species. The enormous data sets that autonomous recorders typically generate demand automated analyses that remain largely undeveloped. We devised automated signal processing and machine learning approaches to estimate dates on which songbird communities arrived at arctic breeding grounds. Acoustically estimated dates agreed well with those determined via traditional surveys and were strongly related to the landscape’s snow-free dates. We found that environmental conditions heavily influenced daily variation in songbird vocal activity, especially before egg laying. Our novel approaches demonstrate that variation in avian migratory arrival can be detected autonomously. Large-scale deployment of this innovation in wildlife monitoring would enable the coverage necessary to assess and forecast changes in bird migration in the face of climate change. PMID:29938220
Sankar, Martial; Nieminen, Kaisa; Ragni, Laura; Xenarios, Ioannis; Hardtke, Christian S
2014-02-11
Among various advantages, their small size makes model organisms preferred subjects of investigation. Yet, even in model systems detailed analysis of numerous developmental processes at cellular level is severely hampered by their scale. For instance, secondary growth of Arabidopsis hypocotyls creates a radial pattern of highly specialized tissues that comprises several thousand cells starting from a few dozen. This dynamic process is difficult to follow because of its scale and because it can only be investigated invasively, precluding comprehensive understanding of the cell proliferation, differentiation, and patterning events involved. To overcome such limitation, we established an automated quantitative histology approach. We acquired hypocotyl cross-sections from tiled high-resolution images and extracted their information content using custom high-throughput image processing and segmentation. Coupled with automated cell type recognition through machine learning, we could establish a cellular resolution atlas that reveals vascular morphodynamics during secondary growth, for example equidistant phloem pole formation. DOI: http://dx.doi.org/10.7554/eLife.01567.001.
Automated diagnosis of fetal alcohol syndrome using 3D facial image analysis
Fang, Shiaofen; McLaughlin, Jason; Fang, Jiandong; Huang, Jeffrey; Autti-Rämö, Ilona; Fagerlund, Åse; Jacobson, Sandra W.; Robinson, Luther K.; Hoyme, H. Eugene; Mattson, Sarah N.; Riley, Edward; Zhou, Feng; Ward, Richard; Moore, Elizabeth S.; Foroud, Tatiana
2012-01-01
Objectives Use three-dimensional (3D) facial laser scanned images from children with fetal alcohol syndrome (FAS) and controls to develop an automated diagnosis technique that can reliably and accurately identify individuals prenatally exposed to alcohol. Methods A detailed dysmorphology evaluation, history of prenatal alcohol exposure, and 3D facial laser scans were obtained from 149 individuals (86 FAS; 63 Control) recruited from two study sites (Cape Town, South Africa and Helsinki, Finland). Computer graphics, machine learning, and pattern recognition techniques were used to automatically identify a set of facial features that best discriminated individuals with FAS from controls in each sample. Results An automated feature detection and analysis technique was developed and applied to the two study populations. A unique set of facial regions and features were identified for each population that accurately discriminated FAS and control faces without any human intervention. Conclusion Our results demonstrate that computer algorithms can be used to automatically detect facial features that can discriminate FAS and control faces. PMID:18713153
Yang, Jianji J; Cohen, Aaron M; Cohen, Aaron; McDonagh, Marian S
2008-11-06
Automatic document classification can be valuable in increasing the efficiency in updating systematic reviews (SR). In order for the machine learning process to work well, it is critical to create and maintain high-quality training datasets consisting of expert SR inclusion/exclusion decisions. This task can be laborious, especially when the number of topics is large and source data format is inconsistent. To approach this problem, we build an automated system to streamline the required steps, from initial notification of update in source annotation files to loading the data warehouse, along with a web interface to monitor the status of each topic. In our current collection of 26 SR topics, we were able to standardize almost all of the relevance judgments and recovered PMIDs for over 80% of all articles. Of those PMIDs, over 99% were correct in a manual random sample study. Our system performs an essential function in creating training and evaluation data sets for SR text mining research.
Yang, Jianji J.; Cohen, Aaron M.; McDonagh, Marian S.
2008-01-01
Automatic document classification can be valuable in increasing the efficiency in updating systematic reviews (SR). In order for the machine learning process to work well, it is critical to create and maintain high-quality training datasets consisting of expert SR inclusion/exclusion decisions. This task can be laborious, especially when the number of topics is large and source data format is inconsistent. To approach this problem, we build an automated system to streamline the required steps, from initial notification of update in source annotation files to loading the data warehouse, along with a web interface to monitor the status of each topic. In our current collection of 26 SR topics, we were able to standardize almost all of the relevance judgments and recovered PMIDs for over 80% of all articles. Of those PMIDs, over 99% were correct in a manual random sample study. Our system performs an essential function in creating training and evaluation datasets for SR text mining research. PMID:18999194
Automated Low-Cost Smartphone-Based Lateral Flow Saliva Test Reader for Drugs-of-Abuse Detection.
Carrio, Adrian; Sampedro, Carlos; Sanchez-Lopez, Jose Luis; Pimienta, Miguel; Campoy, Pascual
2015-11-24
Lateral flow assay tests are nowadays becoming powerful, low-cost diagnostic tools. Obtaining a result is usually subject to visual interpretation of colored areas on the test by a human operator, introducing subjectivity and the possibility of errors in the extraction of the results. While automated test readers providing a result-consistent solution are widely available, they usually lack portability. In this paper, we present a smartphone-based automated reader for drug-of-abuse lateral flow assay tests, consisting of an inexpensive light box and a smartphone device. Test images captured with the smartphone camera are processed in the device using computer vision and machine learning techniques to perform automatic extraction of the results. A deep validation of the system has been carried out showing the high accuracy of the system. The proposed approach, applicable to any line-based or color-based lateral flow test in the market, effectively reduces the manufacturing costs of the reader and makes it portable and massively available while providing accurate, reliable results.
Sankar, Martial; Nieminen, Kaisa; Ragni, Laura; Xenarios, Ioannis; Hardtke, Christian S
2014-01-01
Among various advantages, their small size makes model organisms preferred subjects of investigation. Yet, even in model systems detailed analysis of numerous developmental processes at cellular level is severely hampered by their scale. For instance, secondary growth of Arabidopsis hypocotyls creates a radial pattern of highly specialized tissues that comprises several thousand cells starting from a few dozen. This dynamic process is difficult to follow because of its scale and because it can only be investigated invasively, precluding comprehensive understanding of the cell proliferation, differentiation, and patterning events involved. To overcome such limitation, we established an automated quantitative histology approach. We acquired hypocotyl cross-sections from tiled high-resolution images and extracted their information content using custom high-throughput image processing and segmentation. Coupled with automated cell type recognition through machine learning, we could establish a cellular resolution atlas that reveals vascular morphodynamics during secondary growth, for example equidistant phloem pole formation. DOI: http://dx.doi.org/10.7554/eLife.01567.001 PMID:24520159
A review of active learning approaches to experimental design for uncovering biological networks
2017-01-01
Various types of biological knowledge describe networks of interactions among elementary entities. For example, transcriptional regulatory networks consist of interactions among proteins and genes. Current knowledge about the exact structure of such networks is highly incomplete, and laboratory experiments that manipulate the entities involved are conducted to test hypotheses about these networks. In recent years, various automated approaches to experiment selection have been proposed. Many of these approaches can be characterized as active machine learning algorithms. Active learning is an iterative process in which a model is learned from data, hypotheses are generated from the model to propose informative experiments, and the experiments yield new data that is used to update the model. This review describes the various models, experiment selection strategies, validation techniques, and successful applications described in the literature; highlights common themes and notable distinctions among methods; and identifies likely directions of future research and open problems in the area. PMID:28570593
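The iterative loop described in this abstract (learn a model, propose the most informative experiment, fold the result back into the model) can be sketched as below. This is a minimal illustration under stated assumptions, not a method taken from the review: the model is just a table of edge probabilities, informativeness is measured by binary entropy, and the gene names, prior values, and the `run_experiment` callback are hypothetical stand-ins for real laboratory work.

```python
import math


def entropy(p):
    """Binary entropy in bits; zero when the belief is already certain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def active_learning(beliefs, run_experiment, budget):
    """Repeatedly test the candidate interaction with the most uncertain belief.

    beliefs: dict mapping a candidate edge to the prior probability it exists.
    run_experiment: callable(edge) -> bool, a stand-in for a lab experiment.
    budget: maximum number of experiments to run.
    """
    beliefs = dict(beliefs)  # do not mutate the caller's prior
    tested = []
    for _ in range(budget):
        # Propose the experiment whose outcome is currently least predictable.
        edge = max(beliefs, key=lambda e: entropy(beliefs[e]))
        if entropy(beliefs[edge]) == 0.0:
            break  # every candidate is already resolved
        outcome = run_experiment(edge)
        # Update the model with the new data (assumes a noise-free experiment).
        beliefs[edge] = 1.0 if outcome else 0.0
        tested.append(edge)
    return beliefs, tested


# Hypothetical toy regulatory network and prior beliefs.
truth = {("geneA", "geneB"): True, ("geneA", "geneC"): False, ("geneB", "geneC"): True}
prior = {("geneA", "geneB"): 0.9, ("geneA", "geneC"): 0.5, ("geneB", "geneC"): 0.6}

posterior, order = active_learning(prior, lambda e: truth[e], budget=2)
print(order)  # the two most uncertain edges are tested first
```

With a budget of two experiments, the sketch spends them on the edges whose prior is closest to 0.5 and leaves the confident prior untouched, which is the basic economy that active learning is meant to provide.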
Automated fiber pigtailing machine
Strand, Oliver T.; Lowry, Mark E.
1999-01-01
The Automated Fiber Pigtailing Machine (AFPM) aligns and attaches optical fibers to optoelectronic (OE) devices such as laser diodes, photodiodes, and waveguide devices without operator intervention. The so-called pigtailing process is completed with sub-micron accuracies in less than 3 minutes. The AFPM operates unattended for one hour, is modular in design and is compatible with a mass production manufacturing environment. This machine can be used to build components which are used in military aircraft navigation systems, computer systems, communications systems and in the construction of diagnostics and experimental systems.
Visual Recognition Software for Binary Classification and its Application to Pollen Identification
NASA Astrophysics Data System (ADS)
Punyasena, S. W.; Tcheng, D. K.; Nayak, A.
2014-12-01
An underappreciated source of uncertainty in paleoecology is the uncertainty of palynological identifications. The confidence of any given identification is not regularly reported in published results, so it cannot be incorporated into subsequent meta-analyses. Automated identification systems potentially provide a means of objectively measuring the confidence of a given count or single identification, as well as a mechanism for increasing sample sizes and throughput. We developed the software ARLO (Automated Recognition with Layered Optimization) to tackle difficult visual classification problems such as pollen identification. ARLO applies pattern recognition and machine learning to the analysis of pollen images. The features that the system discovers are not the traditional features of pollen morphology. Instead, general purpose image features, such as pixel lines and grids of different dimensions, size, spacing, and resolution, are used. ARLO adapts to a given problem by searching for the most effective combination of feature representation and learning strategy. We present a two-phase approach which uses our machine learning process to first segment pollen grains from the background and then classify pollen pixels and report species ratios. We conducted two separate experiments that utilized two distinct sets of algorithms and optimization procedures. The first analysis focused on reconstructing black and white spruce pollen ratios, training and testing our classification model at the slide level. This allowed us to directly compare our automated counts and expert counts to slides of known spruce ratios. Our second analysis focused on maximizing classification accuracy at the individual pollen grain level. Instead of predicting ratios of given slides, we predicted the species represented in a given image window. The resulting analysis was more scalable, as we were able to adapt the most efficient parts of the methodology from our first analysis.
ARLO was able to distinguish between the pollen of black and white spruce with an accuracy of ~83.61%. This compared favorably to human expert performance. At the writing of this abstract, we are also experimenting with the analysis of higher diversity samples, including modern tropical pollen material collected from ground pollen traps.
Levitt, Joshua; Nitenson, Adam; Koyama, Suguru; Heijmans, Lonne; Curry, James; Ross, Jason T; Kamerling, Steven; Saab, Carl Y
2018-06-23
Electroencephalography (EEG) invariably contains extra-cranial artifacts that are commonly dealt with based on qualitative and subjective criteria. Failure to account for EEG artifacts compromises data interpretation. We have developed a quantitative and automated support vector machine (SVM)-based algorithm to accurately classify artifactual EEG epochs in awake rodent, canine and human subjects. An embodiment of this method also enables the determination of 'eyes open/closed' states in human subjects. The levels of SVM accuracy for artifact classification in humans, Sprague Dawley rats and beagle dogs were 94.17%, 83.68%, and 85.37%, respectively, whereas 'eyes open/closed' states in humans were labeled with 88.60% accuracy. Each of these results was significantly higher than chance. Comparison with Existing Methods: Other existing methods, like those dependent on Independent Component Analysis, have not been tested in non-human subjects, and require full EEG montages, whereas this method needs only single channels. We conclude that our EEG artifact detection algorithm provides a valid and practical solution to a common problem in the quantitative analysis and assessment of EEG in pre-clinical research settings across evolutionary spectra. Copyright © 2018. Published by Elsevier B.V.
Automated diagnosis of Alzheimer's disease with multi-atlas based whole brain segmentations
NASA Astrophysics Data System (ADS)
Luo, Yuan; Tang, Xiaoying
2017-03-01
Voxel-based analysis is widely used in quantitative analysis of structural brain magnetic resonance imaging (MRI) and automated disease detection, such as Alzheimer's disease (AD). However, noise at the voxel level may cause low sensitivity to AD-induced structural abnormalities. This can be addressed with the use of a whole brain structural segmentation approach which greatly reduces the dimension of features (the number of voxels). In this paper, we propose an automatic AD diagnosis system that combines such whole brain segmentations with advanced machine learning methods. We used a multi-atlas segmentation technique to parcellate T1-weighted images into 54 distinct brain regions and extract their structural volumes to serve as the features for principal-component-analysis-based dimension reduction and support-vector-machine-based classification. The relationship between the number of retained principal components (PCs) and the diagnosis accuracy was systematically evaluated, in a leave-one-out fashion, based on 28 AD subjects and 23 age-matched healthy subjects. Our approach yielded good classification results, with 96.08% overall accuracy being achieved using the three foremost PCs. In addition, our approach yielded 96.43% specificity, 100% sensitivity, and 0.9891 area under the receiver operating characteristic curve.
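The leave-one-out evaluation mentioned in this abstract, together with the accuracy, sensitivity, and specificity it reports, can be sketched as below. This is an illustration under loud assumptions: a nearest-centroid classifier is swapped in for the PCA plus SVM pipeline of the paper so that the example needs only the standard library, and the two-feature vectors and the "AD"/"HC" labels are made-up toy values, not study data.

```python
def fit_centroids(X, y):
    """Compute the mean feature vector of each class (stand-in classifier)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest in Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def leave_one_out(X, y, positive):
    """Hold out each subject once, train on the rest, and tally the outcomes."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        model = fit_centroids(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        pred = predict(model, X[i])
        if y[i] == positive:
            tp += pred == positive
            fn += pred != positive
        else:
            fp += pred == positive
            tn += pred != positive
    return {
        "accuracy": (tp + tn) / len(X),
        "sensitivity": tp / (tp + fn),  # fraction of positives detected
        "specificity": tn / (tn + fp),  # fraction of negatives rejected
    }


# Toy regional-volume features (hypothetical values).
X = [[1.0, 0.9], [1.2, 1.1], [0.9, 1.0], [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
y = ["HC", "HC", "HC", "AD", "AD", "AD"]
print(leave_one_out(X, y, positive="AD"))
```

Because every subject is predicted by a model that never saw it, the three metrics are out-of-sample estimates, which matters for cohorts as small as the one in the abstract.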
Automated Classification of Heritage Buildings for As-Built Bim Using Machine Learning Techniques
NASA Astrophysics Data System (ADS)
Bassier, M.; Vergauwen, M.; Van Genechten, B.
2017-08-01
Semantically rich three dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the required information to varying stakeholders during the different stages of the historic building's life cycle, which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be processed more efficiently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVM) are proposed for the classification. The algorithm is evaluated using real data of a variety of existing buildings. The results show that the classifier recognizes the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.
Automated reliability assessment for spectroscopic redshift measurements
NASA Astrophysics Data System (ADS)
Jamal, S.; Le Brun, V.; Le Fèvre, O.; Vibert, D.; Schmitt, A.; Surace, C.; Copin, Y.; Garilli, B.; Moresco, M.; Pozzetti, L.
2018-03-01
Context. Future large-scale surveys, such as the ESA Euclid mission, will produce a large set of galaxy redshifts (≥10^6) that will require fully automated data-processing pipelines to analyze the data, extract crucial information and ensure that all requirements are met. A fundamental element in these pipelines is to associate to each galaxy redshift measurement a quality, or reliability, estimate. Aim. In this work, we introduce a new approach to automate the spectroscopic redshift reliability assessment based on machine learning (ML) and characteristics of the redshift probability density function. Methods: We propose to rephrase the spectroscopic redshift estimation into a Bayesian framework, in order to incorporate all sources of information and uncertainties related to the redshift estimation process and produce a redshift posterior probability density function (PDF). To automate the assessment of a reliability flag, we exploit key features in the redshift posterior PDF and machine learning algorithms. Results: As a working example, public data from the VIMOS VLT Deep Survey is exploited to present and test this new methodology. We first tried to reproduce the existing reliability flags using supervised classification in order to describe different types of redshift PDFs, but due to the subjective definition of these flags (classification accuracy 58%), we soon opted for a new homogeneous partitioning of the data into distinct clusters via unsupervised classification. After assessing the accuracy of the new clusters via resubstitution and test predictions (classification accuracy 98%), we projected unlabeled data from preliminary mock simulations for the Euclid space mission into this mapping to predict their redshift reliability labels.
Conclusions: Through the development of a methodology in which a system can build its own experience to assess the quality of a parameter, we are able to set a preliminary basis of an automated reliability assessment for spectroscopic redshift measurements. This newly-defined method is very promising for next-generation large spectroscopic surveys from the ground and in space, such as Euclid and WFIRST. A table of the reclassified VVDS redshifts and reliability is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/611/A53
Automated pixel-wise brain tissue segmentation of diffusion-weighted images via machine learning.
Ciritsis, Alexander; Boss, Andreas; Rossi, Cristina
2018-04-26
The diffusion-weighted (DW) MR signal sampled over a wide range of b-values potentially allows for tissue differentiation in terms of cellularity, microstructure, perfusion, and T2 relaxivity. This study aimed to implement a machine learning algorithm for automatic brain tissue segmentation from DW-MRI datasets, and to determine the optimal sub-set of features for accurate segmentation. DWI was performed at 3 T in eight healthy volunteers using 15 b-values and 20 diffusion-encoding directions. The pixel-wise signal attenuation, as well as the trace and fractional anisotropy (FA) of the diffusion tensor, were used as features to train a support vector machine classifier for gray matter, white matter, and cerebrospinal fluid classes. The datasets of two volunteers were used for validation. For each subject, tissue classification was also performed on 3D T1-weighted data sets with a probabilistic framework. Confusion matrices were generated for quantitative assessment of image classification accuracy in comparison with the reference method. DWI-based tissue segmentation resulted in an accuracy of 82.1% on the validation dataset and of 82.2% on the training dataset, excluding relevant model over-fitting. A mean Dice coefficient (DSC) of 0.79 ± 0.08 was found. About 50% of the classification performance was attributable to five features (i.e. signal measured at b-values of 5/10/500/1200 s/mm² and the FA). This reduced set of features led to almost identical performances for the validation (82.2%) and the training (81.4%) datasets (DSC = 0.79 ± 0.08). Machine learning techniques applied to DWI data allow for accurate brain tissue segmentation based on both morphological and functional information. Copyright © 2018 John Wiley & Sons, Ltd.
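The Dice coefficient reported in this abstract measures the overlap between a predicted segmentation and a reference one, DSC = 2|P ∩ R| / (|P| + |R|), computed per tissue class. A minimal sketch follows; the flat label lists and the class names "GM", "WM", and "CSF" are illustrative assumptions, and real data would be pixel-wise label maps.

```python
def dice(pred, ref, label):
    """Dice coefficient for one class between two equally long label sequences."""
    p = {i for i, v in enumerate(pred) if v == label}  # pixels predicted as label
    r = {i for i, v in enumerate(ref) if v == label}   # pixels labelled in reference
    if not p and not r:
        return 1.0  # class absent from both: treat as perfect agreement
    return 2 * len(p & r) / (len(p) + len(r))


# Hypothetical six-pixel example: one gray-matter pixel is mislabelled.
pred = ["GM", "GM", "WM", "CSF", "WM", "GM"]
ref = ["GM", "WM", "WM", "CSF", "WM", "GM"]

scores = {label: dice(pred, ref, label) for label in ("GM", "WM", "CSF")}
print(scores)  # {'GM': 0.8, 'WM': 0.8, 'CSF': 1.0}
print(sum(scores.values()) / len(scores))  # mean DSC across classes
```

A single mislabelled pixel lowers the score of both classes involved, which is why a mean DSC is usually quoted alongside plain accuracy.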
Al-Kaysi, Alaa M; Al-Ani, Ahmed; Loo, Colleen K; Powell, Tamara Y; Martin, Donel M; Breakspear, Michael; Boonstra, Tjeerd W
2017-01-15
Transcranial direct current stimulation (tDCS) is a promising treatment for major depressive disorder (MDD). Standard tDCS treatment involves numerous sessions running over a few weeks. However, not all participants respond to this type of treatment. This study aims to investigate the feasibility of identifying MDD patients that respond to tDCS treatment based on resting-state electroencephalography (EEG) recorded prior to treatment commencing. We used machine learning to predict improvement in mood and cognition during tDCS treatment from baseline EEG power spectra. Ten participants with a current diagnosis of MDD were included. Power spectral density was assessed in five frequency bands: delta (0.5-4Hz), theta (4-8Hz), alpha (8-12Hz), beta (13-30Hz) and gamma (30-100Hz). Improvements in mood and cognition were assessed using the Montgomery-Åsberg Depression Rating Scale and Symbol Digit Modalities Test, respectively. We trained the classifiers using three algorithms (support vector machine, extreme learning machine and linear discriminant analysis) and a leave-one-out cross-validation approach. Mood labels were accurately predicted in 8 out of 10 participants using EEG channels FC4-AF8 (accuracy=76%, p=0.034). Cognition labels were accurately predicted in 10 out of 10 participants using channel pair CPz-CP2 (accuracy=92%, p=0.004). Due to the limited number of participants (n=10), the presented results mainly aim to serve as a proof of concept. These findings demonstrate the feasibility of using machine learning to identify patients that will respond to tDCS treatment. These promising results warrant a larger study to determine the clinical utility of this approach. Copyright © 2016 Elsevier B.V. All rights reserved.
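The band-wise power features named in this abstract can be illustrated with a short sketch. Everything beyond the five band edges is an assumption: the abstract does not say how power spectral density was estimated, so a plain one-sided periodogram from a naive discrete Fourier transform is used here, and the sampling rate and the synthetic 10 Hz test signal are hypothetical.

```python
import cmath
import math

# Band edges in Hz, as listed in the abstract.
BANDS = {
    "delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 12),
    "beta": (13, 30), "gamma": (30, 100),
}


def band_powers(signal, fs):
    """Sum periodogram power per band using a naive O(n^2) DFT.

    signal: list of samples; fs: sampling rate in Hz.
    Only positive-frequency bins are used, so values are relative, not absolute.
    """
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n  # centre frequency of bin k
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2 / n ** 2
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += power
    return powers


# One second of a synthetic 10 Hz oscillation sampled at 200 Hz.
fs = 200
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
powers = band_powers(signal, fs)
print(max(powers, key=powers.get))  # alpha
```

In practice a windowed estimator such as Welch's method would normally replace the raw periodogram, and the resulting band powers per channel would form the feature vector passed to the classifier.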
Gender classification of running subjects using full-body kinematics
NASA Astrophysics Data System (ADS)
Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.
2016-05-01
This paper proposes novel automated gender classification of subjects while engaged in running activity. The machine learning techniques include preprocessing steps using principal component analysis followed by classification with linear discriminant analysis, nonlinear support vector machines, and decision stumps with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each) all equipped with approximately 80 retroreflective markers. The trials are reflective of the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by means of implementing a nonlinear decision stump with AdaBoost classifier. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.
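The reason boosted decision stumps can beat a linear classifier is that a weighted vote of single-threshold rules can carve out regions that no single threshold separates. The sketch below is a generic textbook AdaBoost with stumps, not the authors' implementation; the one-feature toy data and the labels +1/-1 are hypothetical.

```python
import math


def stump_output(x, feature, threshold, polarity):
    """A stump votes `polarity` at or above the threshold, the opposite below."""
    return polarity if x[feature] >= threshold else -polarity


def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if stump_output(x, j, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def adaboost(X, y, rounds):
    """Train `rounds` stumps, reweighting samples after each round."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, j, thr, pol))
        # Misclassified samples gain weight, correctly classified ones lose it.
        w = [wi * math.exp(-alpha * yi * stump_output(x, j, thr, pol))
             for x, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(a * stump_output(x, j, thr, pol) for a, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data: the negative class sits between two positive groups,
# so no single threshold classifies every point correctly.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
print([predict(model, x) for x in X])  # [1, 1, -1, -1, 1, 1]
```

On this toy set the best single stump still misclassifies two of six points, while three boosted stumps reproduce all labels, which mirrors the gap between linear and nonlinear classifiers reported in the abstract.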
Open source machine-learning algorithms for the prediction of optimal cancer drug therapies.
Huang, Cai; Mezencev, Roman; McDonald, John F; Vannberg, Fredrik
2017-01-01
Precision medicine is a rapidly growing area of modern medical science and open source machine-learning codes promise to be a critical component for the successful development of standardized and automated analysis of patient data. One important goal of precision cancer medicine is the accurate prediction of optimal drug therapies from the genomic profiles of individual patient tumors. We introduce here an open source software platform that employs a highly versatile support vector machine (SVM) algorithm combined with a standard recursive feature elimination (RFE) approach to predict personalized drug responses from gene expression profiles. Drug specific models were built using gene expression and drug response data from the National Cancer Institute panel of 60 human cancer cell lines (NCI-60). The models are highly accurate in predicting the drug responsiveness of a variety of cancer cell lines including those comprising the recent NCI-DREAM Challenge. We demonstrate that predictive accuracy is optimized when the learning dataset utilizes all probe-set expression values from a diversity of cancer cell types without pre-filtering for genes generally considered to be "drivers" of cancer onset/progression. Application of our models to publicly available ovarian cancer (OC) patient gene expression datasets generated predictions consistent with observed responses previously reported in the literature. By making our algorithm "open source", we hope to facilitate its testing in a variety of cancer types and contexts leading to community-driven improvements and refinements in subsequent applications.
Machine vision for various manipulation tasks
NASA Astrophysics Data System (ADS)
Domae, Yukiyasu
2017-03-01
Bin-picking, re-grasping, pick-and-place, kitting, and so on: there are many manipulation tasks in the automation of factories, warehouses, and similar settings. The main problem for automation is that the target objects (items/parts) have various shapes, weights and surface materials. In my talk, I will show the latest machine vision systems and algorithms that address this problem.
Semi-automated surface mapping via unsupervised classification
NASA Astrophysics Data System (ADS)
D'Amore, M.; Le Scaon, R.; Helbert, J.; Maturilli, A.
2017-09-01
Due to the increasing volume of data returned from space missions, the human search for correlations and identification of interesting features becomes more and more unfeasible. Statistical extraction of features via machine learning methods will increase the scientific output of remote sensing missions and aid the discovery of yet unknown features hidden in datasets. Those methods exploit algorithms trained on features from multiple instruments, returning classification maps that explore intra-dataset correlation, allowing for the discovery of unknown features. We present two applications, one for Mercury and one for Vesta.
Machine learning in smart home control systems - Algorithms and new opportunities
NASA Astrophysics Data System (ADS)
Berg, Ivan A.; Khorev, Oleg E.; Matvevnina, Arina I.; Prisjazhnyj, Alexey V.
2017-11-01
Worldwide, more and more attention is being paid to issues related to the smart home. Whereas in 2000 Scopus registered 25 publications about the "smart house", by 2016 their number had increased to 1600. The top three countries by interest in smart home technologies are the United States, China and India. Corporations are beginning to offer packaged solutions for intelligent home automation, and dozens of start-ups are being established around the technology. Where does such interest come from? What can intelligent home technologies offer? What can an end user receive?
Machine learning properties of binary wurtzite superlattices
Pilania, G.; Liu, X. -Y.
2018-01-12
The burgeoning paradigm of high-throughput computations and materials informatics brings new opportunities in terms of targeted materials design and discovery. The discovery process can be significantly accelerated and streamlined if one can learn effectively from available knowledge and past data to predict materials properties efficiently. Indeed, a very active area in materials science research is to develop machine learning based methods that can deliver automated and cross-validated predictive models using either already available materials data or new data generated in a targeted manner. In the present paper, we show that fast and accurate predictions of a wide range of properties of binary wurtzite superlattices, formed by a diverse set of chemistries, can be made by employing state-of-the-art statistical learning methods trained on quantum mechanical computations in combination with a judiciously chosen numerical representation to encode materials’ similarity. These surrogate learning models then allow for efficient screening of vast chemical spaces by providing instant predictions of the targeted properties. Moreover, the models can be systematically improved in an adaptive manner, incorporate properties computed at different levels of fidelities and are naturally amenable to inverse materials design strategies. Finally, while the learning approach to make predictions for a wide range of properties (including structural, elastic and electronic properties) is demonstrated here for a specific example set containing more than 1200 binary wurtzite superlattices, the adopted framework is equally applicable to other classes of materials as well.
Computer vision cracks the leaf code
Wilf, Peter; Zhang, Shengping; Chikkerur, Sharat; Little, Stefan A.; Wing, Scott L.; Serre, Thomas
2016-01-01
Understanding the extremely variable, complex shape and venation characters of angiosperm leaves is one of the most challenging problems in botany. Machine learning offers opportunities to analyze large numbers of specimens, to discover novel leaf features of angiosperm clades that may have phylogenetic significance, and to use those characters to classify unknowns. Previous computer vision approaches have primarily focused on leaf identification at the species level. It remains an open question whether learning and classification are possible among major evolutionary groups such as families and orders, which usually contain hundreds to thousands of species each and exhibit many times the foliar variation of individual species. Here, we tested whether a computer vision algorithm could use a database of 7,597 leaf images from 2,001 genera to learn features of botanical families and orders, then classify novel images. The images are of cleared leaves, specimens that are chemically bleached, then stained to reveal venation. Machine learning was used to learn a codebook of visual elements representing leaf shape and venation patterns. The resulting automated system learned to classify images into families and orders with a success rate many times greater than chance. Of direct botanical interest, the responses of diagnostic features can be visualized on leaf images as heat maps, which are likely to prompt recognition and evolutionary interpretation of a wealth of novel morphological characters. With assistance from computer vision, leaves are poised to make numerous new contributions to systematic and paleobotanical studies. PMID:26951664
Ball, Oliver; Robinson, Sarah; Bure, Kim; Brindley, David A; Mccall, David
2018-04-01
Phacilitate held a Special Interest Group workshop event in Edinburgh, UK, in May 2017. The event brought together leading stakeholders in the cell therapy bioprocessing field to identify present and future challenges and propose potential solutions to automation in cell therapy bioprocessing. Here, we review and summarize discussions from the event. Deep biological understanding of a product, its mechanism of action and indication pathogenesis underpin many factors relating to bioprocessing and automation. To fully exploit the opportunities of bioprocess automation, therapeutics developers must closely consider whether an automation strategy is applicable, how to design an 'automatable' bioprocess and how to implement process modifications with minimal disruption. Major decisions around bioprocess automation strategy should involve all relevant stakeholders; communication between technical and business strategy decision-makers is of particular importance. Developers should leverage automation to implement in-process testing, in turn applicable to process optimization, quality assurance (QA)/ quality control (QC), batch failure control, adaptive manufacturing and regulatory demands, but a lack of precedent and technical opportunities can complicate such efforts. Sparse standardization across product characterization, hardware components and software platforms is perceived to complicate efforts to implement automation. The use of advanced algorithmic approaches such as machine learning may have application to bioprocess and supply chain optimization. Automation can substantially de-risk the wider supply chain, including tracking and traceability, cryopreservation and thawing and logistics. The regulatory implications of automation are currently unclear because few hardware options exist and novel solutions require case-by-case validation, but automation can present attractive regulatory incentives. Copyright © 2018 International Society for Cellular Therapy. 
Published by Elsevier Inc. All rights reserved.
Spitzer observatory operations: increasing efficiency in mission operations
NASA Astrophysics Data System (ADS)
Scott, Charles P.; Kahr, Bolinda E.; Sarrel, Marc A.
2006-06-01
This paper explores the hows and whys of the Spitzer Mission Operations System's (MOS) success, efficiency, and affordability in comparison to other observatory-class missions. MOS exploits today's flight, ground, and operations capabilities, embraces automation, and balances both risk and cost. With operational efficiency as the primary goal, MOS maintains a strong control process by translating lessons learned into efficiency improvements, thereby enabling the MOS processes, teams, and procedures to rapidly evolve from concept (through thorough validation) into in-flight implementation. Operational teaming, planning, and execution are designed to enable re-use. Mission changes, unforeseen events, and continuous improvement have often forced us to learn to fly anew. Collaborative spacecraft operations and remote science and instrument teams have become well integrated, and worked together to improve and optimize each human, machine, and software-system element. Adaptation to tighter spacecraft margins has facilitated continuous operational improvements via automated and autonomous software coupled with improved human analysis. Based upon what we now know and what we need to improve, adapt, or fix, the projected mission lifetime continues to grow - as does the opportunity for numerous scientific discoveries.
Software for Partly Automated Recognition of Targets
NASA Technical Reports Server (NTRS)
Opitz, David; Blundell, Stuart; Bain, William; Morris, Matthew; Carlson, Ian; Mangrich, Mark; Selinsky, T.
2002-01-01
The Feature Analyst is a computer program for assisted (partially automated) recognition of targets in images. This program was developed to accelerate the processing of high-resolution satellite image data for incorporation into geographic information systems (GIS). This program creates an advanced user interface that embeds proprietary machine-learning algorithms in commercial image-processing and GIS software. A human analyst provides samples of target features from multiple sets of data, then the software develops a data-fusion model that automatically extracts the remaining features from selected sets of data. The program thus leverages the natural ability of humans to recognize objects in complex scenes, without requiring the user to explain the human visual recognition process by means of lengthy software. Two major subprograms are the reactive agent and the thinking agent. The reactive agent strives to quickly learn the user's tendencies while the user is selecting targets and to increase the user's productivity by immediately suggesting the next set of pixels that the user may wish to select. The thinking agent utilizes all available resources, taking as much time as needed, to produce the most accurate autonomous feature-extraction model possible.
Nyholm, Sven
2017-07-18
Many ethicists writing about automated systems (e.g. self-driving cars and autonomous weapons systems) attribute agency to these systems. Not only that; they seemingly attribute an autonomous or independent form of agency to these machines. This leads some ethicists to worry about responsibility-gaps and retribution-gaps in cases where automated systems harm or kill human beings. In this paper, I consider what sorts of agency it makes sense to attribute to most current forms of automated systems, in particular automated cars and military robots. I argue that whereas it indeed makes sense to attribute different forms of fairly sophisticated agency to these machines, we ought not to regard them as acting on their own, independently of any human beings. Rather, the right way to understand the agency exercised by these machines is in terms of human-robot collaborations, where the humans involved initiate, supervise, and manage the agency of their robotic collaborators. This means, I argue, that there is much less room for justified worries about responsibility-gaps and retribution-gaps than many ethicists think.
An open-source solution for advanced imaging flow cytometry data analysis using machine learning.
Hennig, Holger; Rees, Paul; Blasi, Thomas; Kamentsky, Lee; Hung, Jane; Dao, David; Carpenter, Anne E; Filby, Andrew
2017-01-01
Imaging flow cytometry (IFC) enables the high throughput collection of morphological and spatial information from hundreds of thousands of single cells. This high content, information rich image data can in theory resolve important biological differences among complex, often heterogeneous biological samples. However, data analysis is often performed in a highly manual and subjective manner using very limited image analysis techniques in combination with conventional flow cytometry gating strategies. This approach is not scalable to the hundreds of available image-based features per cell and thus makes use of only a fraction of the spatial and morphometric information. As a result, the quality, reproducibility and rigour of results are limited by the skill, experience and ingenuity of the data analyst. Here, we describe a pipeline using open-source software that leverages the rich information in digital imagery using machine learning algorithms. Compensated and corrected raw image files (.rif) data files from an imaging flow cytometer (the proprietary .cif file format) are imported into the open-source software CellProfiler, where an image processing pipeline identifies cells and subcellular compartments allowing hundreds of morphological features to be measured. This high-dimensional data can then be analysed using cutting-edge machine learning and clustering approaches using "user-friendly" platforms such as CellProfiler Analyst. Researchers can train an automated cell classifier to recognize different cell types, cell cycle phases, drug treatment/control conditions, etc., using supervised machine learning. This workflow should enable the scientific community to leverage the full analytical power of IFC-derived data sets. It will help to reveal otherwise unappreciated populations of cells based on features that may be hidden to the human eye that include subtle measured differences in label free detection channels such as bright-field and dark-field imagery. 
Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Misra, Dharitri; Chen, Siyuan; Thoma, George R.
2010-01-01
One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system. PMID:21179386
NASA Astrophysics Data System (ADS)
Gan, Yu; Tsay, David; Amir, Syed B.; Marboe, Charles C.; Hendon, Christine P.
2016-03-01
Remodeling of the myocardium is associated with increased risk of arrhythmia and heart failure. Our objective is to automatically identify regions of fibrotic myocardium, dense collagen, and adipose tissue, which can serve as a way to guide radiofrequency ablation therapy or endomyocardial biopsies. Using computer vision and machine learning, we present an automated algorithm to classify tissue compositions from cardiac optical coherence tomography (OCT) images. Three-dimensional OCT volumes were obtained from 15 human hearts ex vivo within 48 hours of donor death (source, NDRI). We first segmented B-scans using a graph searching method. We estimated the boundary of each region by minimizing a cost function, which consisted of intensity, gradient, and contour smoothness. Then, features, including texture analysis, optical properties, and statistics of high moments, were extracted. We used a statistical model, the relevance vector machine, and trained this model with the abovementioned features to classify tissue compositions. To validate our method, we applied our algorithm to 77 volumes. The datasets for validation were manually segmented and classified by two investigators who were blind to our algorithm results and identified the tissues based on trichrome histology and pathology. The difference between automated and manual segmentation was 51.78 +/- 50.96 μm. Experiments showed that the attenuation coefficients of dense collagen were significantly different from other tissue types (P < 0.05, ANOVA). Importantly, myocardial fibrosis tissues were different from normal myocardium in entropy and kurtosis. The tissue types were classified with an accuracy of 84%. The results show good agreement with histology.
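The abstract above names entropy and kurtosis as features that separated fibrotic from normal myocardium. The following is a minimal sketch of how such statistics could be computed from a list of quantized intensity values; the function names and the sample patch are hypothetical, and the authors' actual feature definitions may differ.

```python
import math
from collections import Counter


def kurtosis(values):
    """Population kurtosis: fourth standardized moment (3.0 for a Gaussian).

    Assumes the values are not all identical (variance must be non-zero).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical quantized intensities from one image patch.
patch = [0, 0, 1, 1, 2, 2, 3, 3]
patch_kurtosis = kurtosis(patch)
patch_entropy = shannon_entropy(patch)
```

In a pipeline like the one described, values such as these would be only two entries in a larger feature vector passed to the classifier.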
First Annual Workshop on Space Operations Automation and Robotics (SOAR 87)
NASA Technical Reports Server (NTRS)
Griffin, Sandy (Editor)
1987-01-01
Several topics relative to automation and robotics technology are discussed. Automation of checkout, ground support, and logistics; automated software development; man-machine interfaces; neural networks; systems engineering and distributed/parallel processing architectures; and artificial intelligence/expert systems are among the topics covered.
Using Machine Learning to Enable Big Data Analysis within Human Review Time Budgets
NASA Astrophysics Data System (ADS)
Bue, B.; Rebbapragada, U.; Wagstaff, K.; Thompson, D. R.
2014-12-01
The quantity of astronomical observations collected by today's instruments far exceeds the capability of manual inspection by domain experts. Scientists often have a fixed time budget of a few hours to spend on the monotonous task of scanning through a live stream or data dump of candidates that must be prioritized for follow-up analysis. Today's and next-generation astronomical instruments produce millions of candidate detections per day, and necessitate the use of automated classifiers that serve as "data triage" in order to filter out spurious signals. Automated data triage enables increased science return by prioritizing interesting or anomalous observations for follow-up inspection, while also expediting analysis by filtering out noisy or redundant observations. We describe three specific astronomical investigations that are currently benefiting from data triage techniques in their respective processing pipelines.
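As a rough illustration of the triage idea above, the sketch below ranks candidates by a classifier score and keeps only as many as fit within a reviewer's time budget. The function name, the per-candidate review time, and the scores are hypothetical and are not taken from the abstract.

```python
def triage(candidates, budget_seconds, seconds_per_review):
    """Return the highest-scoring candidates that fit in the review budget.

    candidates: list of (candidate_id, score) pairs; a higher score means
    the candidate is more interesting.
    """
    if seconds_per_review <= 0:
        raise ValueError("seconds_per_review must be positive")
    capacity = int(budget_seconds // seconds_per_review)
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return ranked[:capacity]


# Hypothetical scores from an upstream classifier.
scored = [("c1", 0.10), ("c2", 0.95), ("c3", 0.40), ("c4", 0.80)]
shortlist = triage(scored, budget_seconds=60, seconds_per_review=25)
```

With a 60-second budget and 25 seconds per review, only the two top-ranked candidates reach the human reviewer.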
NASA Astrophysics Data System (ADS)
Singla, Neeru; Dubey, Kavita; Srivastava, Vishal; Ahmad, Azeem; Mehta, D. S.
2018-02-01
We developed an automated high-resolution full-field spatial coherence tomography (FF-SCT) microscope for quantitative phase imaging that is based on spatial, rather than temporal, coherence gating. Red and green laser light was used to obtain quantitative phase images of unstained human red blood cells (RBCs). This study uses morphological parameters of unstained RBC phase images to distinguish between normal and infected cells. We recorded a single interferogram with the FF-SCT microscope for the red and green wavelengths and averaged the two phase images to further reduce noise artifacts. To distinguish anemia-infected cells from normal cells, different morphological features were extracted, and these features were used to train a machine learning ensemble model to classify RBCs with high accuracy.
A Machine Learning Classifier for Fast Radio Burst Detection at the VLBA
NASA Astrophysics Data System (ADS)
Wagstaff, Kiri L.; Tang, Benyang; Thompson, David R.; Khudikyan, Shakeh; Wyngaard, Jane; Deller, Adam T.; Palaniswamy, Divya; Tingay, Steven J.; Wayth, Randall B.
2016-08-01
Time domain radio astronomy observing campaigns frequently generate large volumes of data. Our goal is to develop automated methods that can identify events of interest buried within the larger data stream. The V-FASTR fast transient system was designed to detect rare fast radio bursts within data collected by the Very Long Baseline Array. The resulting event candidates constitute a significant burden in terms of subsequent human reviewing time. We have trained and deployed a machine learning classifier that marks each candidate detection as a pulse from a known pulsar, an artifact due to radio frequency interference, or a potential new discovery. The classifier maintains high reliability by restricting its predictions to those with at least 90% confidence. We have also implemented several efficiency and usability improvements to the V-FASTR web-based candidate review system. Overall, we found that time spent reviewing decreased and the fraction of interesting candidates increased. The classifier now classifies (and therefore filters) 80%-90% of the candidates, with an accuracy greater than 98%, leaving only the 10%-20% most promising candidates to be reviewed by humans.
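The confidence restriction described above can be sketched as a classifier that abstains whenever its top prediction falls below a threshold, leaving those candidates for human review. This is an illustrative sketch only: the function name, label strings, and probabilities are hypothetical, and the V-FASTR implementation may differ.

```python
def classify_with_abstention(probabilities, threshold=0.90):
    """Return the most probable label, or None when confidence is below threshold.

    probabilities: dict mapping label -> predicted probability.
    """
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    return label if confidence >= threshold else None


# Hypothetical class probabilities for two candidate detections.
confident = classify_with_abstention({"pulsar": 0.97, "rfi": 0.02, "candidate": 0.01})
uncertain = classify_with_abstention({"pulsar": 0.55, "rfi": 0.40, "candidate": 0.05})
```

A return value of None marks a candidate that the classifier declines to label and that would stay in the human review queue.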
Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm.
Al-Saffar, Ahmed; Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-Bared, Mohammed
2018-01-01
Sentiment analysis techniques are increasingly exploited to categorize the opinion text to one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performances based on the semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned with a synonym and stored with the help of more than one Malay native speaker, and the polarity is manually allotted with a score. In addition, the supervised machine learning approaches and lexicon knowledge method are combined for Malay sentiment classification with evaluating thirteen features. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In experimental results, a wide-range of comparative experiments is conducted on a Malay Reviews Corpus (MRC), and it demonstrates that the feature extraction improves the performance of Malay sentiment analysis based on the combined classification. However, the results depend on three factors, the features, the number of features and the classification approach. PMID:29684036
NASA Astrophysics Data System (ADS)
Moran, Niklas; Nieland, Simon; Tintrup gen. Suntrup, Gregor; Kleinschmit, Birgit
2017-02-01
Manual field surveys for nature conservation management are expensive and time-consuming and could be supplemented and streamlined by using Remote Sensing (RS). RS is critical to meet requirements of existing laws such as the EU Habitats Directive (HabDir) and more importantly to meet future challenges. The full potential of RS has yet to be harnessed as different nomenclatures and procedures hinder interoperability, comparison and provenance. Therefore, automated tools are needed to use RS data to produce comparable, empirical data outputs that lend themselves to data discovery and provenance. These issues are addressed by a novel, semi-automatic ontology-based classification method that uses machine learning algorithms and Web Ontology Language (OWL) ontologies and yields traceable, interoperable and observation-based classification outputs. The method was tested on European Union Nature Information System (EUNIS) grasslands in Rhineland-Palatinate, Germany. The developed methodology is a first step in developing observation-based ontologies in the field of nature conservation. The tests show promising results for the determination of the grassland indicators wetness and alkalinity with an overall accuracy of 85% for alkalinity and 76% for wetness.
Jin, Jing; Dauwels, Justin; Cash, Sydney; Westover, M. Brandon
2015-01-01
Detection of interictal discharges is a key element of interpreting EEGs during the diagnosis and management of epilepsy. Because interpretation of clinical EEG data is time-intensive and reliant on experts who are in short supply, there is a great need for automated spike detectors. However, attempts to develop general-purpose spike detectors have so far been severely limited by a lack of expert-annotated data. Huge databases of interictal discharges are therefore in great demand for the development of general-purpose detectors. Detailed manual annotation of interictal discharges is time consuming, which severely limits the willingness of experts to participate. To address such problems, a graphical user interface “SpikeGUI” was developed in our work for the purposes of EEG viewing and rapid interictal discharge annotation. “SpikeGUI” substantially speeds up the task of annotating interictal discharges using a custom-built algorithm based on a combination of template matching and online machine learning techniques. While the algorithm is currently tailored to annotation of interictal epileptiform discharges, it can easily be generalized to other waveforms and signal types. PMID:25570976
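The template-matching half of the approach described above can be illustrated with a sliding normalized correlation. This is a minimal sketch under stated assumptions: the function names, the 0.9 threshold, and the toy waveform are hypothetical, and the online learning component of SpikeGUI is not modeled.

```python
import math


def normalized_correlation(window, template):
    """Pearson correlation between a signal window and a template of equal length."""
    n = len(template)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, template))
    den = math.sqrt(sum((w - mean_w) ** 2 for w in window)
                    * sum((t - mean_t) ** 2 for t in template))
    return num / den if den > 0 else 0.0


def match_template(signal, template, threshold=0.9):
    """Return the start indices where the signal correlates with the template."""
    n = len(template)
    return [i for i in range(len(signal) - n + 1)
            if normalized_correlation(signal[i:i + n], template) >= threshold]


# Toy spike template and a signal containing one scaled copy of it.
template = [0.0, 1.0, 4.0, 1.0, 0.0]
signal = [0.0, 0.1, 0.0, 0.0, 2.0, 8.0, 2.0, 0.0, 0.1, 0.0]
hits = match_template(signal, template)
```

Because the correlation is normalized, the match is insensitive to amplitude scaling, which is why the doubled-amplitude spike is still found.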
Valleron, Alain-Jacques
2017-08-15
Automation of laboratory tests, bioinformatic analysis of biological sequences, and professional data management are used routinely in a modern university hospital-based infectious diseases institute. This dates back to at least the 1980s. However, the scientific methods of this 21st century are changing with the increased power and speed of computers, with the "big data" revolution having already happened in genomics and environment, and eventually arriving in medical informatics. The research will be increasingly "data driven," and the powerful machine learning methods whose efficiency is demonstrated in daily life will also revolutionize medical research. A university-based institute of infectious diseases must therefore not only gather excellent computer scientists and statisticians (as in the past, and as in any medical discipline), but also fully integrate the biologists and clinicians with these computer scientists, statisticians, and mathematical modelers having a broad culture in machine learning, knowledge representation, and knowledge discovery. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.
Automated isotope identification algorithm using artificial neural networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamuda, Mark; Stinnett, Jacob; Sullivan, Clair
2017-04-12
There is a need to develop an algorithm that can determine the relative activities of radio-isotopes in a large dataset of low-resolution gamma-ray spectra that contains a mixture of many radio-isotopes. Low-resolution gamma-ray spectra that contain mixtures of radio-isotopes often exhibit feature overlap, requiring algorithms that can analyze these features when overlap occurs. While machine learning and pattern recognition algorithms have shown promise for the problem of radio-isotope identification, their ability to identify and quantify mixtures of radio-isotopes has not been studied. Because machine learning algorithms use abstract features of the spectrum, such as the shape of overlapping peaks and Compton continuum, they are a natural choice for analyzing radio-isotope mixtures. An artificial neural network (ANN) has been trained to calculate the relative activities of 32 radio-isotopes in a spectrum. Furthermore, the ANN is trained with simulated gamma-ray spectra, allowing easy expansion of the library of target radio-isotopes. In this paper we present our initial algorithms based on an ANN and evaluate them against a series of measured and simulated spectra.
NASA Astrophysics Data System (ADS)
Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang
2016-03-01
Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics, which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood, and the long-term prognosis of TS is difficult to estimate accurately. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms to automate the classification of healthy children and TS children. Here we apply the Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age- and gender-matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) was used to select the most salient voxels for subsequent classification with a support vector machine (SVM). We use nested cross-validation to yield an unbiased assessment of the classification method and prevent overestimation. Peak performance of the SVM classifier, with an accuracy of 88.04%, a sensitivity of 88.64%, and a specificity of 87.50%, was achieved using the axial diffusion (AD) metric, demonstrating the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results suggest that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.
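The three reported percentages can be checked against the group sizes with a short calculation. The confusion-matrix counts below (39 of 44 TS children and 42 of 48 healthy children classified correctly) are an assumption: they are not stated in the abstract, but they are the counts that reproduce the reported figures.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity as percentages."""
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts: 39 of 44 TS children and 42 of 48 controls classified correctly.
acc, sens, spec = classification_metrics(tp=39, fn=5, tn=42, fp=6)
summary = f"{acc:.2f} {sens:.2f} {spec:.2f}"
```

Under these assumed counts the summary string reads "88.04 88.64 87.50", matching the abstract.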
Automated image segmentation using support vector machines
NASA Astrophysics Data System (ADS)
Powell, Stephanie; Magnotta, Vincent A.; Andreasen, Nancy C.
2007-03-01
Neurodegenerative and neurodevelopmental diseases demonstrate problems associated with brain maturation and aging. Automated methods to delineate brain structures of interest are required to analyze large amounts of imaging data like that being collected in several ongoing multi-center studies. We have previously reported on using artificial neural networks (ANN) to define subcortical brain structures including the thalamus (0.88), caudate (0.85) and the putamen (0.81). In this work, a priori probability information was generated using Thirion's demons registration algorithm. The input vector consisted of a priori probability, spherical coordinates, and an iris of surrounding signal intensity values. We have applied the support vector machine (SVM) machine learning algorithm to automatically segment subcortical and cerebellar regions using the same input vector information. SVM architecture was derived from the ANN framework. Training was completed using a radial-basis function kernel with gamma equal to 5.5. Training was performed using 15,000 vectors collected from 15 training images in approximately 10 minutes. The resulting support vectors were applied to delineate 10 images not part of the training set. Relative overlap calculated for the subcortical structures was 0.87 for the thalamus, 0.84 for the caudate, 0.84 for the putamen, and 0.72 for the hippocampus. Relative overlap for the cerebellar lobes ranged from 0.76 to 0.86. The reliability of the SVM based algorithm was similar to the inter-rater reliability between manual raters and can be achieved without rater intervention.
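The abstract does not define "relative overlap". A common definition, assumed here, is the intersection of the automated and manual segmentations divided by their union. The sketch below computes it for two hypothetical sets of voxel coordinates.

```python
def relative_overlap(auto_voxels, manual_voxels):
    """Intersection over union of two sets of voxel coordinates."""
    auto_voxels, manual_voxels = set(auto_voxels), set(manual_voxels)
    union = auto_voxels | manual_voxels
    if not union:
        return 1.0  # both segmentations empty: treat as perfect agreement
    return len(auto_voxels & manual_voxels) / len(union)


# Hypothetical segmentations sharing three of five distinct voxels.
auto = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)}
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1)}
score = relative_overlap(auto, manual)
```

Under this definition a score of 1.0 means the two segmentations are identical and 0.0 means they share no voxels.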
Toward Intelligent Software Defect Detection
NASA Technical Reports Server (NTRS)
Benson, Markland J.
2011-01-01
Source code level software defect detection has gone from state of the art to a software engineering best practice. Automated code analysis tools streamline many of the aspects of formal code inspections but have the drawback of being difficult to construct and either prone to false positives or severely limited in the set of defects that can be detected. Machine learning technology provides the promise of learning software defects by example, easing construction of detectors and broadening the range of defects that can be found. Pinpointing software defects with the same level of granularity as prominent source code analysis tools distinguishes this research from past efforts, which focused on analyzing software engineering metrics data with granularity limited to that of a particular function rather than a line of code.
ERIC Educational Resources Information Center
Zhang, Mo; Chen, Jing; Ruan, Chunyi
2016-01-01
Successful detection of unusual responses is critical for using machine scoring in the assessment context. This study evaluated the utility of approaches to detecting unusual responses in automated essay scoring. Two research questions were pursued. One question concerned the performance of various prescreening advisory flags, and the other…
ERIC Educational Resources Information Center
Sedaghat, Ahmad; AlJundub, Mohammad; Eilaghi, Armin; Bani-Hani, Ehab; Sabri, Farhad; Mbarki, Raouf; Assad, M. El Haj
2017-01-01
The PBL unit of fluid and electrical drive systems is taught in the final semester of the undergraduate program in the mechanical engineering department of the Australian College of Kuwait (ACK). The recent project on an automated punching machine was found to be more appealing to both students and instructors in triggering new ideas and producing satisfying end results. In…
Investigating the Human Computer Interaction Problems with Automated Teller Machine Navigation Menus
ERIC Educational Resources Information Center
Curran, Kevin; King, David
2008-01-01
Purpose: The automated teller machine (ATM) has become an integral part of our society. However, using the ATM can often be a frustrating experience as people frequently reinsert cards to conduct multiple transactions. This has led to the research question of whether ATM menus are designed in an optimal manner. This paper aims to address the…
NASA Technical Reports Server (NTRS)
Roske-Hofstrand, Renate J.
1990-01-01
The man-machine interface and its influence on the characteristics of computer displays in automated air traffic is discussed. The graphical presentation of spatial relationships and the problems it poses for air traffic control, and the solution of such problems are addressed. Psychological factors involved in the man-machine interface are stressed.
Living systematic reviews: 2. Combining human and machine effort.
Thomas, James; Noel-Storr, Anna; Marshall, Iain; Wallace, Byron; McDonald, Steven; Mavergames, Chris; Glasziou, Paul; Shemilt, Ian; Synnot, Anneliese; Turner, Tari; Elliott, Julian
2017-11-01
New approaches to evidence synthesis, which use human effort and machine automation in mutually reinforcing ways, can enhance the feasibility and sustainability of living systematic reviews. Human effort is a scarce and valuable resource, required when automation is impossible or undesirable, and includes contributions from online communities ("crowds") as well as more conventional contributions from review authors and information specialists. Automation can assist with some systematic review tasks, including searching, eligibility assessment, identification and retrieval of full-text reports, extraction of data, and risk of bias assessment. Workflows can be developed in which human effort and machine automation can each enable the other to operate in more effective and efficient ways, offering substantial enhancement to the productivity of systematic reviews. This paper describes and discusses the potential-and limitations-of new ways of undertaking specific tasks in living systematic reviews, identifying areas where these human/machine "technologies" are already in use, and where further research and development is needed. While the context is living systematic reviews, many of these enabling technologies apply equally to standard approaches to systematic reviewing. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Gangal, M. D.; Isenberg, L.; Lewis, E. V.
1985-01-01
Proposed system offers safety and large return on investment. System, operating by year 2000, employs machines and processes based on proven principles. According to concept, line of parallel machines, connected in groups of four to service modules, attacks face of coal seam. High-pressure water jets and central auger on each machine break face. Jaws scoop up coal chunks, and auger grinds them and forces fragments into slurry-transport system. Slurry pumped through pipeline to point of use. Concept for highly automated coal-mining system increases productivity, makes mining safer, and protects health of mine workers.
Automated solar panel assembly line
NASA Technical Reports Server (NTRS)
Somberg, H.
1981-01-01
The initial stage of the automated solar panel assembly line program was devoted to concept development and proof of approach through simple experimental verification. In this phase, laboratory bench models were built to demonstrate and verify concepts. Following this phase was machine design and integration of the various machine elements. The third phase was machine assembly and debugging. In this phase, the various elements were operated as a unit and modifications were made as required. The final stage of development was the demonstration of the equipment in a pilot production operation.
Automated fiber pigtailing machine
Strand, O.T.; Lowry, M.E.
1999-01-05
The Automated Fiber Pigtailing Machine (AFPM) aligns and attaches optical fibers to optoelectronic (OE) devices such as laser diodes, photodiodes, and waveguide devices without operator intervention. The so-called pigtailing process is completed with sub-micron accuracies in less than 3 minutes. The AFPM operates unattended for one hour, is modular in design and is compatible with a mass production manufacturing environment. This machine can be used to build components which are used in military aircraft navigation systems, computer systems, communications systems and in the construction of diagnostics and experimental systems. 26 figs.
Movahedi, Faezeh; Coyle, James L; Sejdic, Ervin
2018-05-01
Deep learning, a relatively new branch of machine learning, has been investigated for use in a variety of biomedical applications. Deep learning algorithms have been used to analyze different physiological signals and gain a better understanding of human physiology for automated diagnosis of abnormal conditions. In this paper, we provide an overview of deep learning approaches with a focus on deep belief networks in electroencephalography applications. We investigate the state-of-the-art algorithms for deep belief networks and then cover the application of these algorithms and their performances in electroencephalographic applications. We covered various applications of electroencephalography in medicine, including emotion recognition, sleep stage classification, and seizure detection, in order to understand how deep learning algorithms could be modified to better suit the tasks desired. This review is intended to provide researchers with a broad overview of the currently existing deep belief network methodology for electroencephalography signals, as well as to highlight potential challenges for future research.
NASA Astrophysics Data System (ADS)
Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Benien, Parul; Ozcan, Aydogan
2017-06-01
Giardia lamblia is a waterborne parasite that affects millions of people every year worldwide, causing a diarrheal illness known as giardiasis. Timely detection of the presence of the cysts of this parasite in drinking water is important to prevent the spread of the disease, especially in resource-limited settings. Here we provide extended experimental testing and evaluation of the performance and repeatability of a field-portable and cost-effective microscopy platform for automated detection and counting of Giardia cysts in water samples, including tap water, non-potable water, and pond water. This compact platform is based on our previous work, and is composed of a smartphone-based fluorescence microscope, a disposable sample processing cassette, and a custom-developed smartphone application. Our mobile phone microscope has a large field of view of 0.8 cm² and weighs only 180 g, excluding the phone. A custom-developed smartphone application provides a user-friendly graphical interface, guiding the users to capture a fluorescence image of the sample filter membrane and analyze it automatically at our servers using an image processing algorithm and training data, consisting of >30,000 images of cysts and >100,000 images of other fluorescent particles that are captured, including, e.g. dust. The total time that it takes from sample preparation to automated cyst counting is less than an hour for each 10 ml of water sample that is tested. We compared the sensitivity and the specificity of our platform using multiple supervised classification models, including support vector machines and nearest neighbors, and demonstrated that a bootstrap aggregating (i.e. bagging) approach using raw image file format provides the best performance for automated detection of Giardia cysts. We evaluated the performance of this machine learning enabled pathogen detection device with water samples taken from different sources (e.g. tap water, non-potable water, pond water) and achieved a limit of detection of 12 cysts per 10 ml, an average cyst capture efficiency of 79%, and an accuracy of 95%. Providing rapid detection and quantification of waterborne pathogens without the need for a microbiology expert, this field-portable imaging and sensing platform running on a smartphone could be very useful for water quality monitoring in resource-limited settings.
Hättenschwiler, Nicole; Sterchi, Yanik; Mendes, Marcia; Schwaninger, Adrian
2018-10-01
Bomb attacks on civil aviation make detecting improvised explosive devices and explosive material in passenger baggage a major concern. In the last few years, explosive detection systems for cabin baggage screening (EDSCB) have become available. Although a number of airports use them, most countries have not yet implemented these systems on a wide scale. We investigated the benefits of EDSCB with two different levels of automation currently being discussed by regulators and airport operators: automation as a diagnostic aid with an on-screen alarm resolution by the airport security officer (screener) or EDSCB with an automated decision by the machine. The two experiments reported here tested and compared both scenarios and a condition without automation as baseline. Participants were screeners at two international airports who differed in both years of work experience and familiarity with automation aids. Results showed that experienced screeners were good at detecting improvised explosive devices even without EDSCB. EDSCB increased only their detection of bare explosives. In contrast, screeners with less experience (tenure < 1 year) benefitted substantially from EDSCB in detecting both improvised explosive devices and bare explosives. A comparison of all three conditions showed that automated decision provided better human-machine detection performance than on-screen alarm resolution and no automation. This came at the cost of slightly higher false alarm rates on the human-machine system level, which would still be acceptable from an operational point of view. Results indicate that a wide-scale implementation of EDSCB would increase the detection of explosives in passenger bags, and that automated decision, instead of automation as a diagnostic aid with on-screen alarm resolution, should be considered. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Confidence Preserving Machine for Facial Action Unit Detection
Zeng, Jiabei; Chu, Wen-Sheng; De la Torre, Fernando; Cohn, Jeffrey F.; Xiong, Zhang
2016-01-01
Facial action unit (AU) detection from video has been a long-standing problem in automated facial expression analysis. While progress has been made, accurate detection of facial AUs remains challenging due to ubiquitous sources of errors, such as inter-personal variability, pose, and low-intensity AUs. In this paper, we refer to samples causing such errors as hard samples, and the remaining as easy samples. To address learning with the hard samples, we propose the Confidence Preserving Machine (CPM), a novel two-stage learning framework that combines multiple classifiers following an “easy-to-hard” strategy. During the training stage, CPM learns two confident classifiers. Each classifier focuses on separating easy samples of one class from all else, and thus preserves confidence on predicting each class. During the testing stage, the confident classifiers provide “virtual labels” for easy test samples. Given the virtual labels, we propose a quasi-semi-supervised (QSS) learning strategy to learn a person-specific (PS) classifier. The QSS strategy employs a spatio-temporal smoothness that encourages similar predictions for samples within a spatio-temporal neighborhood. In addition, to further improve detection performance, we introduce two CPM extensions: iCPM that iteratively augments training samples to train the confident classifiers, and kCPM that kernelizes the original CPM model to promote nonlinearity. Experiments on four spontaneous datasets GFT [15], BP4D [56], DISFA [42], and RU-FACS [3] illustrate the benefits of the proposed CPM models over baseline methods and state-of-the-art semisupervised learning and transfer learning methods. PMID:27479964
2012-01-01
Background There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions. Results The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues cys and leu. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature. Conclusions In addition to confirming literature results, ProGolem’s model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners. PMID:22783946
Managing Multi-center Flow Cytometry Data for Immune Monitoring
White, Scott; Laske, Karoline; Welters, Marij JP; Bidmon, Nicole; van der Burg, Sjoerd H; Britten, Cedrik M; Enzor, Jennifer; Staats, Janet; Weinhold, Kent J; Gouttefangeas, Cécile; Chan, Cliburn
2014-01-01
With the recent results of promising cancer vaccines and immunotherapy [1–5], immune monitoring has become increasingly relevant for measuring treatment-induced effects on T cells, and an essential tool for shedding light on the mechanisms responsible for a successful treatment. Flow cytometry is the canonical multi-parameter assay for the fine characterization of single cells in solution, and is ubiquitously used in pre-clinical tumor immunology and in cancer immunotherapy trials. Current state-of-the-art polychromatic flow cytometry involves multi-step, multi-reagent assays followed by sample acquisition on sophisticated instruments capable of capturing up to 20 parameters per cell at a rate of tens of thousands of cells per second. Given the complexity of flow cytometry assays, reproducibility is a major concern, especially for multi-center studies. A promising approach for improving reproducibility is the use of automated analysis borrowing from statistics, machine learning and information visualization [21–23], as these methods directly address the subjectivity, operator dependence, labor intensiveness and low fidelity of manual analysis. However, it is quite time-consuming to investigate and test new automated analysis techniques on large data sets without some centralized information management system. For large-scale automated analysis to be practical, the presence of consistent and high-quality data linked to the raw FCS files is indispensable. In particular, the use of machine-readable standard vocabularies to characterize channel metadata is essential when constructing analytic pipelines to avoid errors in processing, analysis and interpretation of results. For automation, this high-quality metadata needs to be programmatically accessible, implying the need for a consistent Application Programming Interface (API). 
In this manuscript, we propose that upfront time spent normalizing flow cytometry data to conform to carefully designed data models enables automated analysis, potentially saving time in the long run. The ReFlow informatics framework was developed to address these data management challenges. PMID:26085786
Real-time bioacoustics monitoring and automated species identification.
Aide, T Mitchell; Corrada-Bravo, Carlos; Campos-Cerqueira, Marconi; Milan, Carlos; Vega, Giovany; Alvarez, Rafael
2013-01-01
Traditionally, animal species diversity and abundance are assessed using a variety of methods that are generally costly and limited in space and time and, most importantly, rarely include a permanent record. Given the urgency of climate change and the loss of habitat, it is vital that we use new technologies to improve and expand global biodiversity monitoring to thousands of sites around the world. In this article, we describe the acoustical component of the Automated Remote Biodiversity Monitoring Network (ARBIMON), a novel combination of hardware and software for automating data acquisition, data management, and species identification based on audio recordings. The major components of the cyberinfrastructure include a solar-powered remote monitoring station that sends 1-min recordings every 10 min to a base station, which relays the recordings in real time to the project server, where the recordings are processed and uploaded to the project website (arbimon.net). Along with a module for viewing, listening, and annotating recordings, the website includes a species identification interface to help users create machine learning algorithms to automate species identification. To demonstrate the system we present data on the vocal activity patterns of birds, frogs, insects, and mammals from Puerto Rico and Costa Rica.
Accelerating the discovery of materials for clean energy in the era of smart automation
NASA Astrophysics Data System (ADS)
Tabor, Daniel P.; Roch, Loïc M.; Saikin, Semion K.; Kreisbeck, Christoph; Sheberla, Dennis; Montoya, Joseph H.; Dwaraknath, Shyam; Aykol, Muratahan; Ortiz, Carlos; Tribukait, Hermann; Amador-Bedolla, Carlos; Brabec, Christoph J.; Maruyama, Benji; Persson, Kristin A.; Aspuru-Guzik, Alán
2018-05-01
The discovery and development of novel materials in the field of energy are essential to accelerate the transition to a low-carbon economy. Bringing recent technological innovations in automation, robotics and computer science together with current approaches in chemistry, materials synthesis and characterization will act as a catalyst for revolutionizing traditional research and development in both industry and academia. This Perspective provides a vision for an integrated artificial intelligence approach towards autonomous materials discovery, which, in our opinion, will emerge within the next 5 to 10 years. The approach we discuss requires the integration of the following tools, which have already seen substantial development to date: high-throughput virtual screening, automated synthesis planning, automated laboratories and machine learning algorithms. In addition to reducing the time to deployment of new materials by an order of magnitude, this integrated approach is expected to lower the cost associated with the initial discovery. Thus, the price of the final products (for example, solar panels, batteries and electric vehicles) will also decrease. This in turn will enable industries and governments to meet more ambitious targets in terms of reducing greenhouse gas emissions at a faster pace.
Large-Scale Image Analytics Using Deep Learning
NASA Astrophysics Data System (ADS)
Ganguly, S.; Nemani, R. R.; Basu, S.; Mukhopadhyay, S.; Michaelis, A.; Votava, P.
2014-12-01
High resolution land cover classification maps are needed to increase the accuracy of current land ecosystem and climate model outputs. Few studies demonstrate the state of the art in deriving very high resolution (VHR) land cover products. In addition, most methods rely heavily on commercial software that is difficult to scale given the region of study (e.g. continents to globe). Complexities in present approaches relate to (a) scalability of the algorithm, (b) large image data processing (compute and memory intensive), (c) computational cost, (d) massively parallel architecture, and (e) machine learning automation. In addition, VHR satellite datasets are of the order of terabytes and features extracted from these datasets are of the order of petabytes. In our present study, we have acquired the National Agricultural Imaging Program (NAIP) dataset for the Continental United States at a spatial resolution of 1-m. This data comes as image tiles (a total of a quarter million image scenes with ~60 million pixels) and has a total size of ~100 terabytes for a single acquisition. Features extracted from the entire dataset would amount to ~8-10 petabytes. In our proposed approach, we have implemented a novel semi-automated machine learning algorithm rooted in the principles of "deep learning" to delineate the percentage of tree cover. In order to perform image analytics in such a granular system, it is mandatory to devise an intelligent archiving and query system for image retrieval, file structuring, metadata processing and filtering of all available image scenes. Using the Open NASA Earth Exchange (NEX) initiative, which is a partnership with Amazon Web Services (AWS), we have developed an end-to-end architecture for designing the database and the deep belief network (following the distbelief computing model) to solve a grand challenge of scaling this process across a quarter million NAIP tiles that cover the entire Continental United States. 
The AWS core components that we use to solve this problem are DynamoDB along with S3 for database query and storage, ElastiCache shared memory architecture for image segmentation, Elastic Map Reduce (EMR) for image feature extraction, and the memory optimized Elastic Cloud Compute (EC2) for the learning algorithm.
Using Twitter to Examine Smoking Behavior and Perceptions of Emerging Tobacco Products
Myslín, Mark; Zhu, Shu-Hong; Chapman, Wendy
2013-01-01
Background Social media platforms such as Twitter are rapidly becoming key resources for public health surveillance applications, yet little is known about Twitter users’ levels of informedness and sentiment toward tobacco, especially with regard to the emerging tobacco control challenges posed by hookah and electronic cigarettes. Objective To develop a content and sentiment analysis of tobacco-related Twitter posts and build machine learning classifiers to detect tobacco-relevant posts and sentiment towards tobacco, with a particular focus on new and emerging products like hookah and electronic cigarettes. Methods We collected 7362 tobacco-related Twitter posts at 15-day intervals from December 2011 to July 2012. Each tweet was manually classified using a triaxial scheme, capturing genre, theme, and sentiment. Using the collected data, machine-learning classifiers were trained to detect tobacco-related vs irrelevant tweets as well as positive vs negative sentiment, using Naïve Bayes, k-nearest neighbors, and Support Vector Machine (SVM) algorithms. Finally, phi contingency coefficients were computed between each of the categories to discover emergent patterns. Results The most prevalent genres were first- and second-hand experience and opinion, and the most frequent themes were hookah, cessation, and pleasure. Sentiment toward tobacco was overall more positive (1939/4215, 46% of tweets) than negative (1349/4215, 32%) or neutral among tweets mentioning it, even excluding the 9% of tweets categorized as marketing. Three separate metrics converged to support an emergent distinction between, on one hand, hookah and electronic cigarettes corresponding to positive sentiment, and on the other hand, traditional tobacco products and more general references corresponding to negative sentiment. 
These metrics included correlations between categories in the annotation scheme (φ(hookah, positive)=0.39; φ(e-cigs, positive)=0.19); correlations between search keywords and sentiment (χ²(4)=414.50, P<.001, Cramer’s V=0.36), and the most discriminating unigram features for positive and negative sentiment ranked by log odds ratio in the machine learning component of the study. In the automated classification tasks, SVMs using a relatively small number of unigram features (500) achieved best performance in discriminating tobacco-related from unrelated tweets (F score=0.85). Conclusions Novel insights available through Twitter for tobacco surveillance are attested through the high prevalence of positive sentiment. This positive sentiment is correlated in complex ways with social image, personal experience, and recently popular products such as hookah and electronic cigarettes. Several apparent perceptual disconnects between these products and their health effects suggest opportunities for tobacco control education. Finally, machine classification of tobacco-related posts shows a promising edge over strictly keyword-based approaches, yielding an improved signal-to-noise ratio in Twitter data and paving the way for automated tobacco surveillance applications. PMID:23989137
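The phi coefficients and sentiment shares quoted in this abstract can be illustrated with a minimal sketch. It assumes the standard 2x2 contingency-table definition of phi; the function names and the convention of returning 0.0 for a degenerate table are illustrative choices, not taken from the study. Only the counts 1939/4215 and 1349/4215 come from the abstract.

```python
import math


def phi_coefficient(n11, n10, n01, n00):
    """Phi coefficient for a 2x2 table of counts.

    n11: both labels present, n10: first only,
    n01: second only, n00: neither.
    """
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    denom = math.sqrt(row1 * row0 * col1 * col0)
    if denom == 0:
        # Assumption: report 0.0 when a margin is empty (phi is undefined there).
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


def share_percent(count, total):
    """Share of `count` in `total`, rounded to a whole percent."""
    return round(100 * count / total)


# Sentiment shares reported in the abstract.
positive = share_percent(1939, 4215)
negative = share_percent(1349, 4215)
```

With the abstract's counts, `positive` and `negative` reproduce the reported 46% and 32%. The per-category tables behind the phi values are not given in the abstract, so they are not reconstructed here.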
Asakura, Kota; Azechi, Takuya; Sasano, Hiroshi; Matsui, Hidehito; Hanaki, Hideaki; Miyazaki, Motoyasu; Takata, Tohru; Sekine, Miwa; Takaku, Tomoiku; Ochiai, Tomonori; Komatsu, Norio; Shibayama, Keigo; Katayama, Yuki; Yahara, Koji
2018-01-01
Vancomycin-intermediately resistant Staphylococcus aureus (VISA) and heterogeneous VISA (hVISA) are associated with treatment failure. hVISA contains only a subpopulation of cells with increased minimal inhibitory concentrations, and its detection is problematic because it is classified as vancomycin-susceptible by standard susceptibility testing and the gold-standard method for its detection is impractical in clinical microbiology laboratories. Recently, a research group developed a machine-learning classifier to distinguish VISA and hVISA from vancomycin-susceptible S. aureus (VSSA) according to matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) data. Nonetheless, the sensitivity of hVISA classification was found to be 76%, and the program was not completely automated with a graphical user interface. Here, we developed a more accurate machine-learning classifier for discrimination of hVISA from VSSA and VISA among MRSA isolates in Japanese hospitals by means of MALDI-TOF MS data. The classifier showed 99% sensitivity of hVISA classification. Furthermore, we clarified the procedures for preparing samples and obtaining MALDI-TOF MS data and developed all-in-one software, hVISA Classifier, with a graphical user interface that automates the classification and is easy for medical workers to use; it is publicly available at https://github.com/bioprojects/hVISAclassifier. This system is useful and practical for screening MRSA isolates for the hVISA phenotype in clinical microbiology laboratories and thus should improve treatment of MRSA infections. PMID:29522576
NASA Astrophysics Data System (ADS)
Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.; Bloom, Joshua S.; Butler, Nathaniel R.; Brink, Henrik; Crellin-Quick, Arien
2012-12-01
With growing data volumes from synoptic surveys, astronomers necessarily must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of classification purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In addition to producing accurate classifications, we show how to estimate calibrated class probabilities and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All-Sky Automated Survey (ASAS), and release the Machine-learned ASAS Classification Catalog (MACC), a 28 class probabilistic classification catalog of 50,124 ASAS sources in the ASAS Catalog of Variable Stars. We estimate that MACC achieves a sub-20% classification error rate and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes.
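What "reasonably calibrated" class probabilities means can be sketched with a simple reliability table. This is a generic illustration, not the procedure used to build MACC: the equal-width binning, the bin count, and the expected-calibration-error summary are all assumptions made here for the example.

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Bin predictions by probability and compare with observed frequency.

    probs: predicted probability that the assigned class is correct.
    outcomes: 1 if the assigned class was correct, else 0.
    Returns (mean predicted probability, observed frequency, count)
    for every non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            freq = sum(y for _, y in members) / len(members)
            table.append((mean_p, freq, len(members)))
    return table


def expected_calibration_error(probs, outcomes, n_bins=5):
    """Count-weighted mean gap between predicted and observed frequency."""
    n = len(probs)
    return sum(count / n * abs(mean_p - freq)
               for mean_p, freq, count in reliability_table(probs, outcomes, n_bins))
```

A well-calibrated classifier has small gaps in every bin: among sources assigned a class with probability near 0.9, roughly 90% should turn out to belong to that class.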
Automated classification of radiology reports to facilitate retrospective study in radiology.
Zhou, Yihua; Amundson, Per K; Yu, Fang; Kessler, Marcus M; Benzinger, Tammie L S; Wippold, Franz J
2014-12-01
Retrospective research is an important tool in radiology. Identifying imaging examinations appropriate for a given research question from the unstructured radiology reports is extremely useful, but labor-intensive. Using the machine learning text-mining methods implemented in LingPipe [1], we evaluated the performance of the dynamic language model (DLM) and the Naïve Bayesian (NB) classifiers in classifying radiology reports to facilitate identification of radiological examinations for research projects. The training dataset consisted of 14,325 sentences from 11,432 radiology reports randomly selected from a database of 5,104,594 reports in all disciplines of radiology. The training sentences were categorized manually into six categories (Positive, Differential, Post Treatment, Negative, Normal, and History). A 10-fold cross-validation [2] was used to evaluate the performance of the models, which were tested in classification of radiology reports for cases of sellar or suprasellar masses and colloid cysts. The average accuracies for the DLM and NB classifiers were 88.5% with 95% confidence interval (CI) of 1.9% and 85.9% with 95% CI of 2.0%, respectively. The DLM performed slightly better and was used to classify 1,397 radiology reports containing the keywords "sellar or suprasellar mass", or "colloid cyst". The DLM model produced an accuracy of 88.2% with 95% CI of 2.1% for 959 reports that contain "sellar or suprasellar mass" and an accuracy of 86.3% with 95% CI of 2.5% for 437 reports of "colloid cyst". We conclude that automated classification of radiology reports using machine learning techniques can effectively facilitate the identification of cases suitable for retrospective research.
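The 10-fold cross-validation mentioned in this abstract partitions the training sentences into ten held-out folds. The following is a minimal sketch of such a partition, assuming contiguous folds over already-shuffled data; the function name is hypothetical and this is not LingPipe's API.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Fold sizes differ by at most one item. The caller is assumed to have
    shuffled the data beforehand.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# 14,325 training sentences, as reported in the abstract.
folds = list(k_fold_indices(14325, k=10))
```

Each sentence is held out exactly once, and the per-fold accuracies are then averaged to give a single estimate.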
NASA Astrophysics Data System (ADS)
Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen
2011-03-01
RNAi-based high-throughput microscopy screens have become an important tool in biological sciences in order to decrypt mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens since the number of image data sets can often be in the hundreds of thousands. Reliable automated tools are thus required to analyse the fluorescence microscopy image data sets, which usually contain two or more reaction channels. The image analysis tool presented here is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels and the investigated cells can appear in different phenotypes. The main issue of the image processing task is an automatic cell segmentation, which has to be robust and accurate for all different phenotypes, followed by a phenotype classification. The cell segmentation is done in two steps by segmenting the cell nuclei first and then using a classifier-enhanced region growing on the basis of the cell nuclei to segment the cells. The classification of the cells is realized by a support vector machine which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant, allowing for different staining quality, and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied to an RNAi screen containing three hundred thousand image data sets and the SVM-extended version is designed for additional screens.
Automated analysis of retinal imaging using machine learning techniques for computer vision.
De Fauw, Jeffrey; Keane, Pearse; Tomasev, Nenad; Visentin, Daniel; van den Driessche, George; Johnson, Mike; Hughes, Cian O; Chu, Carlton; Ledsam, Joseph; Back, Trevor; Peto, Tunde; Rees, Geraint; Montgomery, Hugh; Raine, Rosalind; Ronneberger, Olaf; Cornebise, Julien
2016-01-01
There are almost two million people in the United Kingdom living with sight loss, including around 360,000 people who are registered as blind or partially sighted. Sight-threatening diseases, such as diabetic retinopathy and age-related macular degeneration, have contributed to the 40% increase in outpatient attendances in the last decade but are amenable to early detection and monitoring. With early and appropriate intervention, blindness may be prevented in many cases. Ophthalmic imaging provides a way to diagnose and objectively assess the progression of a number of pathologies including neovascular ("wet") age-related macular degeneration (wet AMD) and diabetic retinopathy. Two methods of imaging are commonly used: digital photographs of the fundus (the 'back' of the eye) and Optical Coherence Tomography (OCT, a modality that uses light waves in a similar way to how ultrasound uses sound waves). Changes in population demographics and expectations and the changing pattern of chronic diseases create a rising demand for such imaging. Meanwhile, interrogation of such images is time consuming, costly, and prone to human error. The application of novel analysis methods may provide a solution to these challenges. This research will focus on applying novel machine learning algorithms to automatic analysis of both digital fundus photographs and OCT in Moorfields Eye Hospital NHS Foundation Trust patients. Through analysis of the images used in ophthalmology, along with relevant clinical and demographic information, DeepMind Health will investigate the feasibility of automated grading of digital fundus photographs and OCT and provide novel quantitative measures for specific disease features and for monitoring therapeutic success.
Figuera, Carlos; Irusta, Unai; Morgado, Eduardo; Aramendi, Elisabete; Ayala, Unai; Wik, Lars; Kramer-Johansen, Jo; Eftestøl, Trygve; Alonso-Atienza, Felipe
2016-01-01
Early recognition of ventricular fibrillation (VF) and electrical therapy are key for the survival of out-of-hospital cardiac arrest (OHCA) patients treated with automated external defibrillators (AED). AED algorithms for VF-detection are customarily assessed using Holter recordings from public electrocardiogram (ECG) databases, which may be different from the ECG seen during OHCA events. This study evaluates VF-detection using data from both OHCA patients and public Holter recordings. ECG-segments of 4-s and 8-s duration were analyzed. For each segment 30 features were computed and fed to state of the art machine learning (ML) algorithms. ML-algorithms with built-in feature selection capabilities were used to determine the optimal feature subsets for both databases. Patient-wise bootstrap techniques were used to evaluate algorithm performance in terms of sensitivity (Se), specificity (Sp) and balanced error rate (BER). Performance was significantly better for public data with a mean Se of 96.6%, Sp of 98.8% and BER 2.2% compared to a mean Se of 94.7%, Sp of 96.5% and BER 4.4% for OHCA data. OHCA data required two times more features than the data from public databases for an accurate detection (6 vs 3). No significant differences in performance were found for different segment lengths, the BER differences were below 0.5-points in all cases. Our results show that VF-detection is more challenging for OHCA data than for data from public databases, and that accurate VF-detection is possible with segments as short as 4-s. PMID:27441719
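The three performance measures named in this abstract are related by a simple formula. The sketch below assumes the standard definition of balanced error rate, BER = 1 - (Se + Sp) / 2; the abstract does not spell out its definition, and the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of VF segments correctly detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-VF segments correctly rejected."""
    return true_neg / (true_neg + false_pos)


def balanced_error_rate(se, sp):
    """Mean of the two per-class error rates (assumed standard definition)."""
    return 1.0 - (se + sp) / 2.0


# Mean Se and Sp reported for the OHCA data.
ber_ohca = balanced_error_rate(0.947, 0.965)
# Mean Se and Sp reported for the public data.
ber_public = balanced_error_rate(0.966, 0.988)
```

Under this definition the OHCA figures give a BER of 4.4%, matching the abstract. The public-data figures give 2.3% against the reported 2.2%; the small gap may come from rounding or from averaging BER over bootstrap replicates, though that explanation is a guess.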
Tsipouras, Markos G; Giannakeas, Nikolaos; Tzallas, Alexandros T; Tsianou, Zoe E; Manousou, Pinelopi; Hall, Andrew; Tsoulos, Ioannis; Tsianos, Epameinondas
2017-03-01
Collagen proportional area (CPA) extraction in liver biopsy images provides the degree of fibrosis expansion in liver tissue, which is the most characteristic histological alteration in hepatitis C virus (HCV). Assessment of the fibrotic tissue is currently based on semiquantitative staging scores such as Ishak and Metavir. Since its introduction as a fibrotic tissue assessment technique, CPA calculation based on image analysis techniques has proven to be more accurate than semiquantitative scores. However, CPA has yet to reach everyday clinical practice, since the lack of standardized and robust methods for computerized image analysis for CPA assessment has proven to be a major limitation. The current work introduces a three-stage fully automated methodology for CPA extraction based on machine learning techniques. Specifically, clustering algorithms have been employed for background-tissue separation, as well as for fibrosis detection in liver tissue regions, in the first and the third stage of the methodology, respectively. Due to the existence of several types of tissue regions in the image (such as blood clots, muscle tissue, structural collagen, etc.), classification algorithms have been employed to identify liver tissue regions and exclude all other non-liver tissue regions from CPA computation. For the evaluation of the methodology, 79 liver biopsy images have been employed, obtaining 1.31% mean absolute CPA error, with 0.923 concordance correlation coefficient. The proposed methodology is designed to (i) avoid manual threshold-based and region selection processes, widely used in similar approaches presented in the literature, and (ii) minimize CPA calculation time.
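The evaluation quantities in the CPA abstract can be sketched as follows. The sketch assumes that CPA is the collagen area expressed as a percentage of the liver tissue area, and that the concordance correlation coefficient is Lin's coefficient computed with population (1/n) moments. The function names and the pixel counts in the usage line are hypothetical; the clustering and classification stages of the methodology are not reproduced.

```python
from statistics import fmean
from typing import Sequence


def cpa_percent(collagen_pixels: int, liver_tissue_pixels: int) -> float:
    """Collagen area as a percentage of the liver tissue area."""
    if liver_tissue_pixels <= 0:
        raise ValueError("liver tissue area must be positive")
    return 100.0 * collagen_pixels / liver_tissue_pixels


def mean_absolute_error(estimated: Sequence[float], reference: Sequence[float]) -> float:
    """Average absolute difference between automated and reference CPA values."""
    return fmean([abs(e - r) for e, r in zip(estimated, reference)])


def concordance_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient with 1/n moments."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical pixel counts for one biopsy image.
print(cpa_percent(collagen_pixels=150, liver_tissue_pixels=1000))
```

Unlike the Pearson correlation, the concordance coefficient is penalized by a constant offset between the two sets of measurements, so it rewards agreement rather than mere linear association.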
DOE Office of Scientific and Technical Information (OSTI.GOV)
Richards, Joseph W.; Starr, Dan L.; Miller, Adam A.
2012-12-15
With growing data volumes from synoptic surveys, astronomers necessarily must become more abstracted from the discovery and introspection processes. Given the scarcity of follow-up resources, there is a particularly sharp onus on the frameworks that replace these human roles to provide accurate and well-calibrated probabilistic classification catalogs. Such catalogs inform the subsequent follow-up, allowing consumers to optimize the selection of specific sources for further study and permitting rigorous treatment of classification purities and efficiencies for population studies. Here, we describe a process to produce a probabilistic classification catalog of variability with machine learning from a multi-epoch photometric survey. In addition to producing accurate classifications, we show how to estimate calibrated class probabilities and motivate the importance of probability calibration. We also introduce a methodology for feature-based anomaly detection, which allows discovery of objects in the survey that do not fit within the predefined class taxonomy. Finally, we apply these methods to sources observed by the All-Sky Automated Survey (ASAS), and release the Machine-learned ASAS Classification Catalog (MACC), a 28 class probabilistic classification catalog of 50,124 ASAS sources in the ASAS Catalog of Variable Stars. We estimate that MACC achieves a sub-20% classification error rate and demonstrate that the class posterior probabilities are reasonably calibrated. MACC classifications compare favorably to the classifications of several previous domain-specific ASAS papers and to the ASAS Catalog of Variable Stars, which had classified only 24% of those sources into one of 12 science classes.
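What "reasonably calibrated" means can be made concrete with a small reliability check. This is a generic sketch for the probability assigned to a single class, not the procedure used to build MACC: predictions are grouped into equal-width probability bins, and within each bin the mean predicted probability is compared with the observed fraction of positives. The bin count and the toy inputs are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def reliability_bins(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted probability, observed frequency, count) per non-empty bin."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    rows = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            rows.append((mean_p, observed, len(members)))
    return rows


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> float:
    """Count-weighted mean gap between predicted probability and observed frequency."""
    total = len(probs)
    return sum(
        count / total * abs(mean_p - observed)
        for mean_p, observed, count in reliability_bins(probs, outcomes, n_bins)
    )


# Toy inputs for illustration only: 1 means the source truly belongs to the class.
toy_probs = [0.1, 0.1, 0.9, 0.9]
toy_outcomes = [0, 0, 1, 1]
for row in reliability_bins(toy_probs, toy_outcomes):
    print(row)
```

A well-calibrated classifier produces rows in which the first two entries are close to each other, so that a stated probability of 0.8 corresponds to roughly 80% of such sources truly belonging to the class.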
Abu, Arpah; Leow, Lee Kien; Ramli, Rosli; Omar, Hasmahzaiti
2016-12-22
Taxonomists frequently identify specimens from various populations based on morphological characteristics and molecular data. This study looks into another invasive process in identification of house shrew (Suncus murinus) using image analysis and machine learning approaches. Thus, an automated identification system is developed to assist and simplify this task. In this study, seven shape-based descriptors (area, convex area, major axis length, minor axis length, perimeter, equivalent diameter, and extent) are used as features to represent digital images of the skull in dorsal, lateral, and jaw views for each specimen. An Artificial Neural Network (ANN) is used as the classifier to classify the skulls of S. murinus based on region (northern and southern populations of Peninsular Malaysia) and sex (adult male and female). Specimen classification using the Training data set and identification using the Testing data set were performed through two stages of ANNs. At present, the classifier used has achieved an accuracy of 100% based on skulls' views. Classification and identification to regions and sexes have also attained 72.5%, 87.5% and 80.0% accuracy for dorsal, lateral, and jaw views, respectively. These results show that the shape characteristic features used are substantial because they can differentiate the specimens based on regions and sexes up to the accuracy of 80% and above. Finally, an application was developed and can be used by the scientific community. This automated system demonstrates the practicability of using computer-assisted systems in providing an interesting alternative approach for quick and easy identification of unknown species.
Finding Waldo: Learning about Users from their Interactions.
Brown, Eli T; Ottley, Alvitta; Zhao, Helen; Quan Lin; Souvenir, Richard; Endert, Alex; Chang, Remco
2014-12-01
Visual analytics is inherently a collaboration between human and computer. However, in current visual analytics systems, the computer has limited means of knowing about its users and their analysis processes. While existing research has shown that a user's interactions with a system reflect a large amount of the user's reasoning process, there has been limited advancement in developing automated, real-time techniques that mine interactions to learn about the user. In this paper, we demonstrate that we can accurately predict a user's task performance and infer some user personality traits by using machine learning techniques to analyze interaction data. Specifically, we conduct an experiment in which participants perform a visual search task, and apply well-known machine learning algorithms to three encodings of the users' interaction data. We achieve, depending on algorithm and encoding, between 62% and 83% accuracy at predicting whether each user will be fast or slow at completing the task. Beyond predicting performance, we demonstrate that using the same techniques, we can infer aspects of the user's personality factors, including locus of control, extraversion, and neuroticism. Further analyses show that strong results can be attained with limited observation time: in one case 95% of the final accuracy is gained after a quarter of the average task completion time. Overall, our findings show that interactions can provide information to the computer about its human collaborator, and establish a foundation for realizing mixed-initiative visual analytics systems.
Szlosek, Donald A; Ferrett, Jonathan
2016-01-01
As the number of clinical decision support systems (CDSSs) incorporated into electronic medical records (EMRs) increases, so does the need to evaluate their effectiveness. The use of medical record review and similar manual methods for evaluating decision rules is laborious and inefficient. The authors use machine learning and Natural Language Processing (NLP) algorithms to accurately evaluate a clinical decision support rule through an EMR system, and they compare it against manual evaluation. Modeled after the EMR system EPIC at Maine Medical Center, we developed a dummy data set containing physician notes in free text for 3,621 artificial patient records undergoing a head computed tomography (CT) scan for mild traumatic brain injury after the incorporation of an electronic best practice approach. We validated the accuracy of the Best Practice Advisories (BPA) using three machine learning algorithms, C-Support Vector Classification (SVC), Decision Tree Classifier (DecisionTreeClassifier), and k-nearest neighbors classifier (KNeighborsClassifier), by comparing their accuracy for adjudicating the occurrence of a mild traumatic brain injury against manual review. We then used the best of the three algorithms to evaluate the effectiveness of the BPA, and we compared the algorithm's evaluation of the BPA to that of manual review. The electronic best practice approach was found to have a sensitivity of 98.8 percent (96.83-100.0), specificity of 10.3 percent, PPV = 7.3 percent, and NPV = 99.2 percent when reviewed manually by abstractors. Though all the machine learning algorithms were observed to have a high level of prediction, the SVC displayed the highest with a sensitivity of 93.33 percent (92.49-98.84), specificity of 97.62 percent (96.53-98.38), PPV = 50.00, NPV = 99.83.
The SVC algorithm was observed to have a sensitivity of 97.9 percent (94.7-99.86), specificity 10.30 percent, PPV 7.25 percent, and NPV 99.2 percent for evaluating the best practice approach, after accounting for 17 cases (0.66 percent) where the patient records had to be reviewed manually due to the NLP system's inability to capture the proper diagnosis. CDSSs incorporated into EMRs can be evaluated in an automatic fashion by using NLP and machine learning techniques.
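The combination of very high sensitivity with a PPV of only about 7 percent follows from Bayes' rule once prevalence is taken into account. The sketch below derives PPV and NPV from sensitivity, specificity, and prevalence. The prevalence value is an assumption: it is not given in the abstract and was chosen here only because it yields predictive values close to those reported for manual review.

```python
from typing import Tuple


def predictive_values(se: float, sp: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = se * prevalence
    false_neg = (1.0 - se) * prevalence
    true_neg = sp * (1.0 - prevalence)
    false_pos = (1.0 - sp) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Se and Sp are the manual-review figures from the abstract;
# the prevalence of 0.0667 is an assumed value, not a reported one.
ppv, npv = predictive_values(se=0.988, sp=0.103, prevalence=0.0667)
print(f"PPV={ppv:.1%}  NPV={npv:.1%}")
```

With a specificity near 10 percent, almost every patient without the condition still triggers the advisory, so positives are dominated by false alarms and the PPV stays close to the prevalence itself.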
Distributed state machine supervision for long-baseline gravitational-wave detectors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rollins, Jameson Graef, E-mail: jameson.rollins@ligo.org
The Laser Interferometer Gravitational-wave Observatory (LIGO) consists of two identical yet independent, widely separated, long-baseline gravitational-wave detectors. Each Advanced LIGO detector consists of complex optical-mechanical systems isolated from the ground by multiple layers of active seismic isolation, all controlled by hundreds of fast, digital, feedback control systems. This article describes a novel state machine-based automation platform developed to handle the automation and supervisory control challenges of these detectors. The platform, called Guardian, consists of distributed, independent, state machine automaton nodes organized hierarchically for full detector control. User code is written in standard Python and the platform is designed to facilitate the fast-paced development process associated with commissioning the complicated Advanced LIGO instruments. While developed specifically for the Advanced LIGO detectors, Guardian is a generic state machine automation platform that is useful for experimental control at all levels, from simple table-top setups to large-scale multi-million dollar facilities.
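The idea of an independent state machine automaton node can be sketched generically in Python. This is not Guardian's API: the class, the state names, the "request" and "error_signal" keys, and the 0.1 threshold are all hypothetical and serve only to show a node that repeatedly runs its current state and follows the transition that state returns.

```python
from typing import Callable, Dict, Optional

# A state is a function that inspects a shared context and returns the name
# of the next state, or None to remain in the current state.
StateFn = Callable[[dict], Optional[str]]


class StateMachineNode:
    """Hypothetical automaton node; illustrative only, not the Guardian interface."""

    def __init__(self, name: str, states: Dict[str, StateFn], initial: str) -> None:
        if initial not in states:
            raise ValueError(f"unknown initial state: {initial}")
        self.name = name
        self.states = states
        self.current = initial

    def step(self, context: dict) -> str:
        """Run the current state once and apply any transition it requests."""
        next_state = self.states[self.current](context)
        if next_state is not None:
            if next_state not in self.states:
                raise KeyError(f"unknown state: {next_state}")
            self.current = next_state
        return self.current


def down(context: dict) -> Optional[str]:
    return "ACQUIRING" if context["request"] == "LOCKED" else None


def acquiring(context: dict) -> Optional[str]:
    return "LOCKED" if context["error_signal"] < 0.1 else None


def locked(context: dict) -> Optional[str]:
    return "DOWN" if context["error_signal"] >= 0.1 else None


node = StateMachineNode(
    "cavity_lock", {"DOWN": down, "ACQUIRING": acquiring, "LOCKED": locked}, "DOWN"
)
context = {"request": "LOCKED", "error_signal": 1.0}
print(node.step(context))  # the node leaves DOWN once a lock is requested
```

In a hierarchical arrangement, a supervising node could be modeled as one whose states write the "request" entry read by subordinate nodes; that coupling is likewise an assumption of this sketch.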
Unmanned Mine of the 21st Centuries
NASA Astrophysics Data System (ADS)
Semykina, Irina; Grigoryev, Aleksandr; Gargayev, Andrey; Zavyalov, Valeriy
2017-11-01
The article is analytical. It considers the principles for constructing an automation system structure that realizes the concept of the «unmanned mine». These principles are intended to address problems caused by the continually increasing complexity of mining and geological conditions at coal mines, such as labor safety and health protection, the weak integration of different mining automation subsystems, and the lack of an optimal balance between the resources and energy consumed by mining machines and their throughput. The authors describe the main problems and bottlenecks in making mining machines and automation subsystems autonomous. The article surveys «unmanned technology» applied in mining, such as remotely operated autonomous complexes and underground positioning systems for mining machines that use infrared radiation in mine workings. The concept of the «unmanned mine» is illustrated with the example of a robotic road heading machine. Finally, the authors analyze the techniques and methods that could solve the task of underground mining without human labor.
Records Management Handbook; Source Data Automation Equipment Guide.
ERIC Educational Resources Information Center
National Archives and Records Service (GSA), Washington, DC. Office of Records Management.
A detailed guide to selecting appropriate source data automation equipment is presented. Source data automation equipment is used to prepare data for electronic data processing or computerized recordkeeping. The guide contains specifications, performance data, cost, and pictures of the major types of machines used in source data automation.…
Woldegebriel, Michael; Zomer, Paul; Mol, Hans G J; Vivó-Truyols, Gabriel
2016-08-02
In this work, we introduce an automated, efficient, and elegant model to combine all pieces of evidence (e.g., expected retention times, peak shapes, isotope distributions, fragment-to-parent ratio) obtained from liquid chromatography-tandem mass spectrometry (LC-MS/MS) data for screening purposes. Combining all these pieces of evidence requires a careful assessment of the uncertainties in the analytical system as well as all possible outcomes. To date, the majority of the existing algorithms are highly dependent on user input parameters. Additionally, the screening process is tackled as a deterministic problem. In this work we present a Bayesian framework to deal with the combination of all these pieces of evidence. Contrary to conventional algorithms, the information is treated in a probabilistic way, and a final probability assessment of the presence/absence of a compound feature is computed. Additionally, all the necessary parameters except the chromatographic band broadening for the method are learned from the data in the training and learning phase of the algorithm, avoiding the introduction of a large number of user-defined parameters. The proposed method was validated with a large data set and has shown improved sensitivity and specificity in comparison to a threshold-based commercial software package.
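The general idea of combining several pieces of evidence into one presence/absence probability can be sketched with Bayes' rule in odds form. This is a simplified illustration rather than the model described in the abstract: it assumes the pieces of evidence are conditionally independent given presence or absence, and the prior and likelihood ratios used below are hypothetical numbers.

```python
import math
from typing import Iterable


def posterior_presence(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Posterior probability that a compound is present.

    Each likelihood ratio is P(evidence | present) / P(evidence | absent).
    Conditional independence between the pieces of evidence is assumed.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)  # evidence adds up in log-odds space
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical likelihood ratios for retention time, peak shape, and isotope pattern.
evidence = [4.0, 0.5, 3.0]
print(round(posterior_presence(prior=0.5, likelihood_ratios=evidence), 3))
```

A ratio above 1 pushes the probability toward presence and a ratio below 1 toward absence, so weak or contradictory evidence lowers the final probability gradually instead of failing a hard threshold.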
Open-source software for collision detection in external beam radiation therapy
NASA Astrophysics Data System (ADS)
Suriyakumar, Vinith M.; Xu, Renee; Pinter, Csaba; Fichtinger, Gabor
2017-03-01
PURPOSE: Collision detection for external beam radiation therapy (RT) is important for eliminating the need for dry runs that aim to ensure patient safety. Commercial treatment planning systems (TPS) offer this feature but they are expensive and proprietary. Cobalt-60 RT machines are a viable solution to RT practice in low-budget scenarios. However, such clinics are hesitant to invest in these machines due to a lack of affordable treatment planning software. We propose the creation of an open-source room's eye view visualization module with automated collision detection as part of the development of an open-source TPS. METHODS: An openly accessible linac 3D geometry model is sliced into the different components of the treatment machine. The model's movements are based on the International Electrotechnical Commission standard. Automated collision detection is implemented between the treatment machine's components. RESULTS: The room's eye view module was built in C++ as part of SlicerRT, an RT research toolkit built on 3D Slicer. The module was tested using head and neck and prostate RT plans. These tests verified that the module accurately modeled the movements of the treatment machine and radiation beam. Automated collision detection was verified using tests where geometric parameters of the machine's components were changed, demonstrating accurate collision detection. CONCLUSION: Room's eye view visualization and automated collision detection are essential in a Cobalt-60 treatment planning system. Development of these features will advance the creation of an open-source TPS that will potentially help increase the feasibility of adopting Cobalt-60 RT.
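Pairwise collision checking between machine components can be illustrated with a coarse sketch based on axis-aligned bounding boxes. The abstract does not describe the detection method used in the SlicerRT module, so this is an assumed stand-in and not that implementation; the component names and dimensions are hypothetical. Boxes that merely touch are reported as colliding, which errs on the side of caution.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box given by its minimum and maximum corners."""
    min_corner: Vec3
    max_corner: Vec3


def boxes_overlap(a: Box, b: Box) -> bool:
    """Two boxes overlap only if their extents overlap on all three axes."""
    return all(
        a.min_corner[i] <= b.max_corner[i] and b.min_corner[i] <= a.max_corner[i]
        for i in range(3)
    )


def find_collisions(components: Dict[str, Box]) -> List[Tuple[str, str]]:
    """Return every pair of component names whose boxes overlap."""
    return [
        (name_a, name_b)
        for (name_a, box_a), (name_b, box_b) in combinations(components.items(), 2)
        if boxes_overlap(box_a, box_b)
    ]


# Hypothetical geometry in centimetres; not taken from any real machine model.
room = {
    "gantry_head": Box((-30.0, -30.0, 40.0), (30.0, 30.0, 100.0)),
    "couch": Box((-25.0, -100.0, -10.0), (25.0, 100.0, 0.0)),
    "patient": Box((-20.0, -90.0, 1.0), (20.0, 90.0, 25.0)),
}
print(find_collisions(room))

# Lowering the gantry head into the patient volume produces a reported pair.
room["gantry_head"] = Box((-30.0, -30.0, 20.0), (30.0, 30.0, 80.0))
print(find_collisions(room))
```

Bounding boxes overestimate the volume of rotated or curved parts, so a check of this kind would typically serve as a fast first pass before a finer test on the actual component meshes.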
2008-09-01
Abbreviations ATM automated teller machine BEA business enterprise architecture DOD...Limitations Automated Teller Machines (ATMs)-At-Sea 1988 Localized, shipboard ATMs that received and accounted for a portion of sailors’ and...use smart card technology for electronic retail transactions and (2) economically justified on the basis of reliable analyses of estimated costs and
Performance of Color Camera Machine Vision in Automated Furniture Rough Mill Systems
D. Earl Kline; Agus Widoyoko; Janice K. Wiedenbeck; Philip A. Araman
1998-01-01
The objective of this study was to evaluate the performance of color camera machine vision for lumber processing in a furniture rough mill. The study used 134 red oak boards to compare the performance of automated gang-rip-first rough mill yield based on a prototype color camera lumber inspection system developed at Virginia Tech with both estimated optimum rough mill...
Production planning, production systems for flexible automation
NASA Astrophysics Data System (ADS)
Spur, G.; Mertins, K.
1982-09-01
Trends in flexible manufacturing system (FMS) applications are reviewed. Machining systems contain machines which complement each other and can replace each other. Computer controlled storage systems are widespread, with central storage capacity ranging from 20 pallet spaces to 200 magazine spaces. Handling function is fulfilled by pallet chargers in over 75% of FMS's. Data system degree of automation varies considerably. No trends are noted for transport systems.
Automation's Effect on Library Personnel.
ERIC Educational Resources Information Center
Dakshinamurti, Ganga
1985-01-01
Reports on survey studying the human-machine interface in Canadian university, public, and special libraries. Highlights include position category and educational background of 118 participants, participants' feelings toward automation, physical effects of automation, diffusion in decision making, interpersonal communication, future trends,…