Science.gov

Sample records for machine learning tools

  1. Machine learning: an indispensable tool in bioinformatics.

    PubMed

    Inza, Iñaki; Calvo, Borja; Armañanzas, Rubén; Bengoetxea, Endika; Larrañaga, Pedro; Lozano, José A

    2010-01-01

    The increase in the number and complexity of biological databases has raised the need for modern and powerful data analysis tools and techniques. In order to fulfill these requirements, the machine learning discipline has become an everyday tool in bio-laboratories. The use of machine learning techniques has been extended to a wide spectrum of bioinformatics applications. It is broadly used to investigate the underlying mechanisms and interactions between biological molecules in many diseases, and it is an essential tool in any biomarker discovery process. In this chapter, we provide a basic taxonomy of machine learning algorithms, and the characteristics of main data preprocessing, supervised classification, and clustering techniques are shown. Feature selection, classifier evaluation, and two supervised classification topics that have a deep impact on current bioinformatics are presented. We make the interested reader aware of a set of popular web resources, open source software tools, and benchmarking data repositories that are frequently used by the machine learning community. PMID:19957143

  2. Machine Learning: A Crucial Tool for Sensor Design

    PubMed Central

    Zhao, Weixiang; Bhushan, Abhinav; Santamaria, Anthony D.; Simon, Melinda G.; Davis, Cristina E.

    2009-01-01

    Sensors have been widely used for disease diagnosis, environmental quality monitoring, food quality control, industrial process analysis and control, and other related fields. As a key tool for sensor data analysis, machine learning is becoming a core part of novel sensor design. Dividing a complete machine learning process into three steps: data pre-treatment, feature extraction and dimension reduction, and system modeling, this paper provides a review of the methods that are widely used for each step. For each method, the principles and the key issues that affect modeling results are discussed. After reviewing the potential problems in machine learning processes, this paper gives a summary of current algorithms in this field and provides some feasible directions for future studies. PMID:20191110

  3. Advancing Research in Second Language Writing through Computational Tools and Machine Learning Techniques: A Research Agenda

    ERIC Educational Resources Information Center

    Crossley, Scott A.

    2013-01-01

    This paper provides an agenda for replication studies focusing on second language (L2) writing and the use of natural language processing (NLP) tools and machine learning algorithms. Specifically, it introduces a range of the available NLP tools and machine learning algorithms and demonstrates how these could be used to replicate seminal studies…

  4. Characterizing EMG data using machine-learning tools.

    PubMed

    Yousefi, Jamileh; Hamilton-Wright, Andrew

    2014-08-01

    Effective electromyographic (EMG) signal characterization is critical in the diagnosis of neuromuscular disorders. Machine-learning based pattern classification algorithms are commonly used to produce such characterizations. Several classifiers have been investigated to develop accurate and computationally efficient strategies for EMG signal characterization. This paper provides a critical review of some of the classification methodologies used in EMG characterization, and presents the state-of-the-art accomplishments in this field, emphasizing neuromuscular pathology. The techniques studied are grouped by their methodology, and a summary of the salient findings associated with each method is presented.

  5. The use of machine learning and nonlinear statistical tools for ADME prediction.

    PubMed

    Sakiyama, Yojiro

    2009-02-01

    Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning as well as nonlinear statistical tools has been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it would be a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future.

  6. The use of machine learning and nonlinear statistical tools for ADME prediction.

    PubMed

    Sakiyama, Yojiro

    2009-02-01

    Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning as well as nonlinear statistical tools has been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it would be a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future. PMID:19239395
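
The abstract above mentions a small synthetic nonlinear data set and a k nearest neighbour classifier, but it does not publish either. The following is a minimal sketch under stated assumptions: the XOR-style points, the labels, and the function name `knn_predict` are all hypothetical and chosen only to illustrate how a neighbour-based method handles a pattern that is not linearly separable.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical XOR-style data: opposite corners of the unit square share a label,
# so no single straight line separates class 0 from class 1.
train = [
    ((0.1, 0.1), 0), ((0.2, 0.0), 0), ((0.0, 0.2), 0),
    ((0.9, 0.9), 0), ((1.0, 0.8), 0), ((0.8, 1.0), 0),
    ((0.1, 0.9), 1), ((0.0, 0.8), 1), ((0.2, 1.0), 1),
    ((0.9, 0.1), 1), ((1.0, 0.2), 1), ((0.8, 0.0), 1),
]

near_origin = knn_predict(train, (0.15, 0.1))      # lies in a class-0 corner
bottom_right = knn_predict(train, (0.9, 0.15))     # lies in a class-1 corner
```

Because each query is decided only by its local neighbourhood, the classifier follows the corner structure without needing a global linear boundary.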

  7. Recent progresses in the exploration of machine learning methods as in-silico ADME prediction tools.

    PubMed

    Tao, L; Zhang, P; Qin, C; Chen, S Y; Zhang, C; Chen, Z; Zhu, F; Yang, S Y; Wei, Y Q; Chen, Y Z

    2015-06-23

    In-silico methods have been explored as potential tools for assessing ADME and ADME regulatory properties particularly in early drug discovery stages. Machine learning methods, with their ability in classifying diverse structures and complex mechanisms, are well suited for predicting ADME and ADME regulatory properties. Recent efforts have been directed at the broadening of application scopes and the improvement of predictive performance with particular focuses on the coverage of ADME properties, and exploration of more diversified training data, appropriate molecular features, and consensus modeling. Moreover, several online machine learning ADME prediction servers have emerged. Here we review these progresses and discuss the performances, application prospects and challenges of exploring machine learning methods as useful tools in predicting ADME and ADME regulatory properties.

  8. Machine Learning

    NASA Astrophysics Data System (ADS)

    Hoffmann, Achim; Mahidadia, Ashesh

    The purpose of this chapter is to present fundamental ideas and techniques of machine learning suitable for the field of this book, i.e., for automated scientific discovery. The chapter focuses on those symbolic machine learning methods that produce results suitable to be interpreted and understood by humans. This is particularly important in the context of automated scientific discovery, as the scientific theories to be produced by machines are usually meant to be interpreted by humans. This chapter contains some of the most influential ideas and concepts in machine learning research to give the reader a basic insight into the field. After the introduction in Sect. 1, general ideas of how learning problems can be framed are given in Sect. 2. The section provides useful perspectives to better understand what learning algorithms actually do. Section 3 presents the version space model, which is an early learning algorithm as well as a conceptual framework that provides important insight into the general mechanisms behind most learning algorithms. In Sect. 4, a family of learning algorithms, the AQ family for learning classification rules, is presented. The AQ family belongs to the early approaches in machine learning. Next, Sect. 5 presents the basic principles of decision tree learners. Decision tree learners belong to the most influential class of inductive learning algorithms today. Finally, a more recent group of learning systems is presented in Sect. 6, which learn relational concepts within the framework of logic programming. This is a particularly interesting group of learning systems, since the framework also allows background knowledge to be incorporated, which may assist in generalisation. Section 7 discusses association rules, a technique that comes from the related field of data mining. Section 8 presents the basic idea of the Naive Bayesian Classifier. While this is a very popular learning technique, the learning result is not well suited for…

  9. Of Genes and Machines: Application of a Combination of Machine Learning Tools to Astronomy Data Sets

    NASA Astrophysics Data System (ADS)

    Heinis, S.; Kumar, S.; Gezari, S.; Burgett, W. S.; Chambers, K. C.; Draper, P. W.; Flewelling, H.; Kaiser, N.; Magnier, E. A.; Metcalfe, N.; Waters, C.

    2016-04-01

    We apply a combination of genetic algorithm (GA) and support vector machine (SVM) machine learning algorithms to solve two important problems faced by the astronomical community: star–galaxy separation and photometric redshift estimation of galaxies in survey catalogs. We use the GA to select the relevant features in the first step, followed by optimization of SVM parameters in the second step to obtain an optimal set of parameters to classify or regress, in the process of which we avoid overfitting. We apply our method to star–galaxy separation in Pan-STARRS1 data. We show that our method correctly classifies 98% of objects down to i_P1 = 24.5, with a completeness (or true positive rate) of 99% for galaxies and 88% for stars. By combining colors with morphology, our star–galaxy separation method yields better results than the new SExtractor classifier spread_model, in particular at the faint end (i_P1 > 22). We also use our method to derive photometric redshifts for galaxies in the COSMOS bright multiwavelength data set down to an error in (1+z) of σ = 0.013, which compares well with estimates from spectral energy distribution fitting on the same data (σ = 0.007) while making a significantly smaller number of assumptions.

  10. Of Genes and Machines: Application of a Combination of Machine Learning Tools to Astronomy Data Sets

    NASA Astrophysics Data System (ADS)

    Heinis, S.; Kumar, S.; Gezari, S.; Burgett, W. S.; Chambers, K. C.; Draper, P. W.; Flewelling, H.; Kaiser, N.; Magnier, E. A.; Metcalfe, N.; Waters, C.

    2016-04-01

    We apply a combination of genetic algorithm (GA) and support vector machine (SVM) machine learning algorithms to solve two important problems faced by the astronomical community: star-galaxy separation and photometric redshift estimation of galaxies in survey catalogs. We use the GA to select the relevant features in the first step, followed by optimization of SVM parameters in the second step to obtain an optimal set of parameters to classify or regress, in the process of which we avoid overfitting. We apply our method to star-galaxy separation in Pan-STARRS1 data. We show that our method correctly classifies 98% of objects down to i_P1 = 24.5, with a completeness (or true positive rate) of 99% for galaxies and 88% for stars. By combining colors with morphology, our star-galaxy separation method yields better results than the new SExtractor classifier spread_model, in particular at the faint end (i_P1 > 22). We also use our method to derive photometric redshifts for galaxies in the COSMOS bright multiwavelength data set down to an error in (1+z) of σ = 0.013, which compares well with estimates from spectral energy distribution fitting on the same data (σ = 0.007) while making a significantly smaller number of assumptions.
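
The abstract above reports overall accuracy together with per-class completeness (true positive rate). As a rough illustration of how those two figures relate, here is a minimal sketch; the label lists are invented toy data, not Pan-STARRS1 results, and the function names are hypothetical.

```python
def completeness(true_labels, predicted_labels, target):
    """Fraction of objects whose true class is target that were predicted as target."""
    relevant = [p for t, p in zip(true_labels, predicted_labels) if t == target]
    if not relevant:
        raise ValueError(f"no objects with true class {target!r}")
    return sum(p == target for p in relevant) / len(relevant)

def accuracy(true_labels, predicted_labels):
    """Fraction of all objects that were assigned their true class."""
    pairs = list(zip(true_labels, predicted_labels))
    return sum(t == p for t, p in pairs) / len(pairs)

# Toy catalogue (assumption): 8 galaxies and 4 stars.
truth = ["galaxy"] * 8 + ["star"] * 4
# 7 of the 8 galaxies and 3 of the 4 stars are classified correctly.
predicted = ["galaxy"] * 7 + ["star"] + ["star"] * 3 + ["galaxy"]

galaxy_completeness = completeness(truth, predicted, "galaxy")
star_completeness = completeness(truth, predicted, "star")
overall_accuracy = accuracy(truth, predicted)
```

When one class dominates the catalogue, overall accuracy tracks the completeness of that class closely, which is why the per-class figures are worth reporting separately.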

  11. Developing Prognosis Tools to Identify Learning Difficulties in Children Using Machine Learning Technologies.

    PubMed

    Loizou, Antonis; Laouris, Yiannis

    2011-09-01

    The Mental Attributes Profiling System was developed in 2002 (Laouris and Makris, Proceedings of multilingual & cross-cultural perspectives on Dyslexia, Omni Shoreham Hotel, Washington, D.C, 2002), to provide a multimodal evaluation of the learning potential and abilities of young children's brains. The method is based on the assessment of non-verbal abilities using video-like interfaces and was compared to more established methodologies in (Papadopoulos, Laouris, Makris, Proceedings of IDA 54th annual conference, San Diego, 2003), such as the Wechsler Intelligence Scale for Children (Watkins et al., Psychol Sch 34(4):309-319, 1997). To do so, various tests have been applied to a population of 134 children aged 7-12 years old. This paper addresses the issue of identifying a minimal set of variables that are able to accurately predict the learning abilities of a given child. The use of Machine Learning technologies to do this provides the advantage of making no prior assumptions about the nature of the data and eliminating natural bias associated with data processing carried out by humans. Kohonen's Self Organising Maps (Kohonen, Biol Cybern 43:59-69, 1982) algorithm is able to split a population into groups based on large and complex sets of observations. Once the population is split, the individual groups can then be probed for their defining characteristics providing insight into the rationale of the split. The characteristics identified form the basis of classification systems that are able to accurately predict which group an individual will belong to, using only a small subset of the tests available. The specifics of this methodology are detailed herein, and the resulting classification systems provide an effective tool to prognose the learning abilities of new subjects.
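
The abstract above relies on Kohonen's Self Organising Maps to split a population into groups. The following is a minimal sketch of that idea with a one-dimensional map of two units; the sample points, the initial weights, the decay schedule, and the function names are all assumptions made for illustration and do not reproduce the study's configuration or data.

```python
import math

def best_unit(weights, sample):
    """Index of the map unit whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], sample))

def train_som(samples, weights, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a 1-D self-organising map; units are indexed along a line."""
    weights = [list(w) for w in weights]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # linear decay from 1 towards 0
        lr = lr0 * frac                        # learning rate shrinks over time
        sigma = max(sigma0 * frac, 0.1)        # neighbourhood width shrinks too
        for sample in samples:
            bmu = best_unit(weights, sample)
            for i, w in enumerate(weights):
                # Units near the best-matching unit on the map move more.
                influence = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * influence * (sample[d] - w[d])
    return weights

# Hypothetical two-group population described by two test scores each.
low_group = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
high_group = [(10.0, 10.0), (9.0, 10.0), (10.0, 9.0), (9.0, 9.0)]

trained = train_som(low_group + high_group, [(4.0, 4.0), (6.0, 6.0)])
low_units = {best_unit(trained, s) for s in low_group}
high_units = {best_unit(trained, s) for s in high_group}
```

After training, each group maps to its own unit, and the unit's weight vector can be inspected to see which variables characterise the group.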

  12. A planning quality evaluation tool for prostate adaptive IMRT based on machine learning

    SciTech Connect

    Zhu Xiaofeng; Ge Yaorong; Li Taoran; Thongphiew, Danthai; Yin Fangfang; Wu, Q Jackie

    2011-02-15

    Purpose: To ensure plan quality for adaptive IMRT of the prostate, we developed a quantitative evaluation tool using a machine learning approach. This tool generates dose volume histograms (DVHs) of organs-at-risk (OARs) based on prior plans as a reference, to be compared with the adaptive plan derived from fluence map deformation. Methods: Under the same configuration using seven-field 15 MV photon beams, DVHs of OARs (bladder and rectum) were estimated based on anatomical information of the patient and a model learned from a database of high quality prior plans. In this study, the anatomical information was characterized by the organ volumes and distance-to-target histogram (DTH). The database consists of 198 high quality prostate plans and was validated with 14 cases outside the training pool. Principal component analysis (PCA) was applied to DVHs and DTHs to quantify their salient features. Then, support vector regression (SVR) was implemented to establish the correlation between the features of the DVH and the anatomical information. Results: DVH/DTH curves could be characterized sufficiently using only two or three truncated principal components; thus, patient anatomical information was quantified with reduced numbers of variables. The evaluation of the model using the test data set demonstrated its accuracy (approximately 80%) in prediction and its effectiveness in improving ART planning quality. Conclusions: An adaptive IMRT plan quality evaluation tool based on machine learning has been developed, which estimates OAR sparing and provides reference in evaluating ART.

  13. Machine tool locator

    DOEpatents

    Hanlon, John A.; Gill, Timothy J.

    2001-01-01

    Machine tools can be accurately measured and positioned on manufacturing machines within very small tolerances by use of an autocollimator on a 3-axis mount on a manufacturing machine and positioned so as to focus on a reference tooling ball or a machine tool, a digital camera connected to the viewing end of the autocollimator, and a marker and measure generator for receiving digital images from the camera, then displaying or measuring distances between the projection reticle and the reference reticle on the monitoring screen, and relating the distances to the actual position of the autocollimator relative to the reference tooling ball. The images and measurements are used to set the position of the machine tool and to measure the size and shape of the machine tool tip, and examine cutting edge wear.

  14. Gaussian Process Regression as a machine learning tool for predicting organic carbon from soil spectra - a machine learning comparison study

    NASA Astrophysics Data System (ADS)

    Schmidt, Andreas; Lausch, Angela; Vogel, Hans-Jörg

    2016-04-01

    Diffuse reflectance spectroscopy is increasingly used as a soil analytical tool. There is a wide range of possible applications ranging from the point scale (e.g. simple soil samples, drill cores, vertical profile scans) through the field scale to the regional and even global scale (UAV, airborne and space borne instruments, soil reflectance databases). The basic idea is that the soil's reflectance spectrum holds information about its properties (like organic matter content or mineral composition). The relation between soil properties and the observable spectrum is usually not exactly known and is typically derived from statistical methods. Nowadays these methods are grouped under the term machine learning, which comprises a vast pool of algorithms and methods for learning the relationship between pairs of input-output data (training data set). Within this pool of methods, Gaussian Process Regression (GPR) is a newly emerging method (originating from Bayesian statistics) that is increasingly applied in different fields. For example, it was successfully used to predict vegetation parameters from hyperspectral remote sensing data. In this study we apply GPR to predict soil organic carbon from soil spectroscopy data (400 - 2500 nm). We compare it to more traditional and widely used methods such as Partial Least Squares Regression (PLSR), Random Forest (RF) and Gradient Boosted Regression Trees (GBRT). All these methods have the common ability to calculate a measure for the variable importance (wavelengths importance). The main advantage of GPR is its ability to also predict the variance of the target parameter. This makes it easy to see whether a prediction is reliable or not. The ability to choose from various covariance functions makes GPR a flexible method. This allows for including different assumptions or a priori knowledge about the data. For this study we use samples from three different locations to test the prediction accuracies. One…
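
The abstract above singles out the predictive variance as the main advantage of GPR. To show where that variance comes from, here is a minimal one-dimensional sketch with a squared-exponential covariance function; the training values, the noise level, the length scale, and all function names are assumptions for illustration, and real soil spectra would have many input dimensions.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gp_predict(xs, ys, x_new, noise=1e-4, length=1.0):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    gram = [[rbf(a, b, length) + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new, length) for a in xs]
    alpha = solve(gram, ys)                  # (K + noise*I)^-1 y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(gram, k_star)                  # (K + noise*I)^-1 k_star
    var = rbf(x_new, x_new, length) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)

# Hypothetical training data: four samples of a smooth function.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.sin(x) for x in xs]

mean_near, var_near = gp_predict(xs, ys, 1.0)    # at a training input
mean_far, var_far = gp_predict(xs, ys, 10.0)     # far from all training inputs
```

The variance is small where training data exist and returns to the prior variance far away from them, which is the behaviour that lets a user judge whether a prediction is reliable.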

  15. Machine Tool Software

    NASA Technical Reports Server (NTRS)

    1988-01-01

    A NASA-developed software package has played a part in the technical education of students who major in Mechanical Engineering Technology at William Rainey Harper College. Professor Hack has been using Automatically Programmed Tool (APT) software since 1969 in his Computer Aided Design and Manufacturing (CAD/CAM) curriculum. Professor Hack teaches the use of APT programming languages for control of metal cutting machines. Machine tool instructions are geometry definitions written in APT Language to constitute a "part program." The part program is processed by the machine tool. CAD/CAM students go from writing a program to cutting steel in the course of a semester.

  16. Machine tools get smarter

    SciTech Connect

    Valenti, M.

    1995-11-01

    This article describes how, using software, sensors, and controllers, a new generation of intelligent machine tools is optimizing grinding, milling, and molding processes. A paradox of manufacturing parts is that the faster the parts are made, the less accurate they are, and vice versa. However, a combination of software, sensors, controllers, and mechanical innovations is being used to create a new generation of intelligent machine tools capable of optimizing their own grinding, milling, and molding processes. These brainy tools are allowing manufacturers to machine more-complex, higher-quality parts in shorter cycle times. The technology also lowers scrap rates and reduces or eliminates the need for polishing inadequately finished parts.

  17. Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools

    PubMed Central

    Jia, Lei; Yarlagadda, Ramya; Reed, Charles C.

    2015-01-01

    Thermostability issue of protein point mutations is a common occurrence in protein engineering. An application which predicts the thermostability of mutants can be helpful for guiding decision making process in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find “hot spots” in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html) is a public database that consists of thousands of protein mutants’ experimentally measured thermostability. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG) and melting temperature change (dTm) were obtained from this database. Folding free energy change calculation from Rosetta, structural information of the point mutations as well as amino acid physical properties were obtained for building thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor) and partial least squares regression are used for building the prediction models. Binary and ternary classifications as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods were discussed. Rosetta calculated folding free energy change ranked as the most influential features in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models. PMID:26361227

  18. Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools.

    PubMed

    Jia, Lei; Yarlagadda, Ramya; Reed, Charles C

    2015-01-01

    Thermostability issue of protein point mutations is a common occurrence in protein engineering. An application which predicts the thermostability of mutants can be helpful for guiding decision making process in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find "hot spots" in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html) is a public database that consists of thousands of protein mutants' experimentally measured thermostability. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG) and melting temperature change (dTm) were obtained from this database. Folding free energy change calculation from Rosetta, structural information of the point mutations as well as amino acid physical properties were obtained for building thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor) and partial least squares regression are used for building the prediction models. Binary and ternary classifications as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods were discussed. Rosetta calculated folding free energy change ranked as the most influential features in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models.

  19. Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools.

    PubMed

    Jia, Lei; Yarlagadda, Ramya; Reed, Charles C

    2015-01-01

    Thermostability issue of protein point mutations is a common occurrence in protein engineering. An application which predicts the thermostability of mutants can be helpful for guiding decision making process in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find "hot spots" in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html) is a public database that consists of thousands of protein mutants' experimentally measured thermostability. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG) and melting temperature change (dTm) were obtained from this database. Folding free energy change calculation from Rosetta, structural information of the point mutations as well as amino acid physical properties were obtained for building thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor) and partial least squares regression are used for building the prediction models. Binary and ternary classifications as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods were discussed. Rosetta calculated folding free energy change ranked as the most influential features in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models. PMID:26361227

  20. Using machine learning tools to model complex toxic interactions with limited sampling regimes.

    PubMed

    Bertin, Matthew J; Moeller, Peter; Guillette, Louis J; Chapman, Robert W

    2013-03-19

    A major impediment to understanding the impact of environmental stress, including toxins and other pollutants, on organisms, is that organisms are rarely challenged by one or a few stressors in natural systems. Thus, linking laboratory experiments that are limited by practical considerations to a few stressors and a few levels of these stressors to real world conditions is constrained. In addition, while the existence of complex interactions among stressors can be identified by current statistical methods, these methods do not provide a means to construct mathematical models of these interactions. In this paper, we offer a two-step process by which complex interactions of stressors on biological systems can be modeled in an experimental design that is within the limits of practicality. We begin with the notion that environment conditions circumscribe an n-dimensional hyperspace within which biological processes or end points are embedded. We then randomly sample this hyperspace to establish experimental conditions that span the range of the relevant parameters and conduct the experiment(s) based upon these selected conditions. Models of the complex interactions of the parameters are then extracted using machine learning tools, specifically artificial neural networks. This approach can rapidly generate highly accurate models of biological responses to complex interactions among environmentally relevant toxins, identify critical subspaces where nonlinear responses exist, and provide an expedient means of designing traditional experiments to test the impact of complex mixtures on biological responses. Further, this can be accomplished with an astonishingly small sample size.
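
The first step of the two-step process described above is to sample the n-dimensional hyperspace of environmental conditions at random. A minimal sketch of that sampling step follows; the stressor names, their ranges, and the function name are hypothetical, and uniform sampling is an assumption, since the abstract does not state the distribution used.

```python
import random

def sample_conditions(ranges, n_samples, seed=0):
    """Draw n_samples random points from the box defined by per-variable (low, high) ranges."""
    rng = random.Random(seed)   # fixed seed so the experimental design is reproducible
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_samples)
    ]

# Hypothetical stressors and ranges, for illustration only.
ranges = {
    "temperature_c": (10.0, 30.0),
    "salinity_ppt": (0.0, 35.0),
    "toxin_ug_per_l": (0.0, 5.0),
}

design = sample_conditions(ranges, n_samples=12)
```

Each dictionary in `design` is one experimental condition; the measured responses at these conditions would then form the training set for the model-fitting step.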

  1. Automatically-Programed Machine Tools

    NASA Technical Reports Server (NTRS)

    Purves, L.; Clerman, N.

    1985-01-01

    Software produces cutter location files for numerically-controlled machine tools. APT, acronym for Automatically Programed Tools, is among most widely used software systems for computerized machine tools. APT developed for explicit purpose of providing effective software system for programing NC machine tools. APT system includes specification of APT programing language and language processor, which executes APT statements and generates NC machine-tool motions specified by APT statements.

  2. Introduction to machine learning.

    PubMed

    Baştanlar, Yalin; Ozuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has exhibited an impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning such as feature assessment, unsupervised versus supervised learning and types of classification. Then, we point out the main issues of designing machine learning experiments and their performance evaluation. Finally, we introduce some supervised learning methods. PMID:24272434

  3. Introduction to machine learning.

    PubMed

    Baştanlar, Yalin; Ozuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has exhibited an impressive development recently with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning such as feature assessment, unsupervised versus supervised learning and types of classification. Then, we point out the main issues of designing machine learning experiments and their performance evaluation. Finally, we introduce some supervised learning methods.
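
The abstract above mentions the design of machine learning experiments and their performance evaluation without naming a specific protocol. One widely used protocol is k-fold cross-validation, sketched below as an assumption about the kind of technique meant; the function name and the choice of contiguous, unshuffled folds are illustrative only.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k (train, test) pairs with contiguous test folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds

folds = k_fold_indices(n_samples=10, k=3)
```

Every sample appears in exactly one test fold, so averaging a metric over the folds uses all the data for evaluation while never testing on a sample the model was trained on. In practice the indices are usually shuffled first.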

  4. Diamond machine tool face lapping machine

    DOEpatents

    Yetter, H.H.

    1985-05-06

    An apparatus for shaping, sharpening and polishing diamond-tipped single-point machine tools. The isolation of a rotating grinding wheel from its driving apparatus using an air bearing and causing the tool to be shaped, polished or sharpened to be moved across the surface of the grinding wheel so that it does not remain at one radius for more than a single rotation of the grinding wheel has been found to readily result in machine tools of a quality which can only be obtained by the most tedious and costly processing procedures, and previously unattainable by simple lapping techniques.

  5. Machine learning and radiology.

    PubMed

    Wang, Shijun; Summers, Ronald M

    2012-07-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has been shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers.

  6. Machine Learning and Radiology

    PubMed Central

    Wang, Shijun; Summers, Ronald M.

    2012-01-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has been shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077

  7. Development of a State Machine Sequencer for the Keck Interferometer: Evolution, Development and Lessons Learned using a CASE Tool Approach

    NASA Technical Reports Server (NTRS)

    Reder, Leonard J.; Booth, Andrew; Hsieh, Jonathan; Summers, Kellee

    2004-01-01

    This paper presents a discussion of the evolution of a sequencer from a simple EPICS (Experimental Physics and Industrial Control System) based sequencer into a complex implementation designed utilizing UML (Unified Modeling Language) methodologies and a CASE (Computer Aided Software Engineering) tool approach. The main purpose of the sequencer (called the IF Sequencer) is to provide overall control of the Keck Interferometer to enable science operations to be carried out by a single operator (and/or observer). The interferometer links the two 10m telescopes of the W. M. Keck Observatory at Mauna Kea, Hawaii. The IF Sequencer is a high-level, multi-threaded, Harel finite state machine software program designed to orchestrate several lower-level hardware and software hard real-time subsystems that must perform their work in a specific and sequential order. The sequencing need not be done in hard real-time. Each state machine thread commands either a high-speed real-time multiple mode embedded controller via CORBA, or slower controllers via EPICS Channel Access interfaces. The overall operation of the system is simplified by the automation. The UML is discussed and our use of it to implement the sequencer is presented. The decision to use the Rhapsody product as our CASE tool is explained and reflected upon. Most importantly, a section on lessons learned is presented and the difficulty of integrating CASE tool automatically generated C++ code into a large control system consisting of multiple infrastructures is presented.

  8. Development of a state machine sequencer for the Keck Interferometer: evolution, development, and lessons learned using a CASE tool approach

    NASA Astrophysics Data System (ADS)

    Reder, Leonard J.; Booth, Andrew; Hsieh, Jonathan; Summers, Kellee R.

    2004-09-01

    This paper presents a discussion of the evolution of a sequencer from a simple Experimental Physics and Industrial Control System (EPICS) based sequencer into a complex implementation designed utilizing UML (Unified Modeling Language) methodologies and a Computer Aided Software Engineering (CASE) tool approach. The main purpose of the Interferometer Sequencer (called the IF Sequencer) is to provide overall control of the Keck Interferometer to enable science operations to be carried out by a single operator (and/or observer). The interferometer links the two 10m telescopes of the W. M. Keck Observatory at Mauna Kea, Hawaii. The IF Sequencer is a high-level, multi-threaded, Harel finite state machine software program designed to orchestrate several lower-level hardware and software hard real-time subsystems that must perform their work in a specific and sequential order. The sequencing need not be done in hard real-time. Each state machine thread commands either a high-speed real-time multiple mode embedded controller via CORBA, or slower controllers via EPICS Channel Access interfaces. The overall operation of the system is simplified by the automation. The UML is discussed and our use of it to implement the sequencer is presented. The decision to use the Rhapsody product as our CASE tool is explained and reflected upon. Most importantly, a section on lessons learned is presented and the difficulty of integrating CASE tool automatically generated C++ code into a large control system consisting of multiple infrastructures is presented.

  9. Slide system for machine tools

    DOEpatents

    Douglass, Spivey S.; Green, Walter L.

    1982-01-01

    The present invention relates to a machine tool which permits the machining of nonaxisymmetric surfaces on a workpiece while rotating the workpiece about a central axis of rotation. The machine tool comprises a conventional two-slide system (X-Y) with one of these slides being provided with a relatively short travel high-speed auxiliary slide which carries the material-removing tool. The auxiliary slide is synchronized with the spindle speed and the position of the other two slides and provides a high-speed reciprocating motion required for the displacement of the cutting tool for generating a nonaxisymmetric surface at a selected location on the workpiece.

  10. Slide system for machine tools

    DOEpatents

    Douglass, S.S.; Green, W.L.

    1980-06-12

    The present invention relates to a machine tool which permits the machining of nonaxisymmetric surfaces on a workpiece while rotating the workpiece about a central axis of rotation. The machine tool comprises a conventional two-slide system (X-Y) with one of these slides being provided with a relatively short travel high-speed auxiliary slide which carries the material-removing tool. The auxiliary slide is synchronized with the spindle speed and the position of the other two slides and provides a high-speed reciprocating motion required for the displacement of the cutting tool for generating a nonaxisymmetric surface at a selected location on the workpiece.

  11. Application of Machine Learning tools to recognition of molecular patterns in STM images

    NASA Astrophysics Data System (ADS)

    Maksov, Artem; Ziatdinov, Maxim; Fujii, Shintaro; Kiguchi, Manabu; Higashibayashi, Shuhei; Sakurai, Hidehiro; Kalinin, Sergei; Sumpter, Bobby

    The ability to utilize individual molecules and molecular assemblies as data storage elements has motivated scientists for years, concurrent with the continuous effort to shrink the size of data storage devices in the microelectronics industry. One of the critical issues in this effort lies in being able to identify individual molecular assembly units (patterns) on a large scale, in an automated fashion, with complete information extraction. Here we present a novel method of applying machine learning techniques for extraction of positional and rotational information from scanning tunneling microscopy (STM) images of π-bowl sumanene molecules on gold. We use a Markov Random Field (MRF) model to decode the polar rotational states for each molecule in a large-scale STM image of a molecular film. We further develop an algorithm that uses a convolutional neural network combined with MRF and input from density functional theory to classify molecules into different azimuthal rotational classes. Our results demonstrate that a molecular film is partitioned into distinctive azimuthal rotational domains consisting typically of 20-30 molecules. In each domain, the "bowl-down" molecules are generally surrounded by six nearest neighbor molecules in "bowl-up" configuration, and the resultant overall structure forms a periodic lattice of rotational and polar states within each domain. Research was supported by the US Department of Energy.

  12. Paradigms for machine learning

    NASA Technical Reports Server (NTRS)

    Schlimmer, Jeffrey C.; Langley, Pat

    1991-01-01

    Five paradigms are described for machine learning: connectionist (neural network) methods, genetic algorithms and classifier systems, empirical methods for inducing rules and decision trees, analytic learning methods, and case-based approaches. Some dimensions are considered along which these paradigms vary in their approach to learning, and the basic methods used within each framework are reviewed, together with open research issues. It is argued that the similarities among the paradigms are more important than their differences, and that future work should attempt to bridge the existing boundaries. Finally, some recent developments in the field of machine learning are discussed, and their impact on both research and applications is examined.

  13. Automated cell analysis tool for a genome-wide RNAi screen with support vector machine based supervised learning

    NASA Astrophysics Data System (ADS)

    Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen

    2011-03-01

    RNAi-based high-throughput microscopy screens have become an important tool in biological sciences in order to decrypt mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens since the number of image data sets can often be in the hundreds of thousands. Reliable automated tools are thus required to analyse the fluorescence microscopy image data sets, which usually contain two or more reaction channels. The image analysis tool presented here is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels and the investigated cells can appear in different phenotypes. The main issue of the image processing task is an automatic cell segmentation, which has to be robust and accurate for all different phenotypes, followed by a phenotype classification. The cell segmentation is done in two steps by segmenting the cell nuclei first and then using a classifier-enhanced region growing on the basis of the cell nuclei to segment the cells. The classification of the cells is realized by a support vector machine which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant, allowing different staining quality, and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied for an RNAi screen containing three hundred thousand image data sets and the SVM extended version is designed for additional screens.

  14. Modeling Plan-Related Clinical Complications Using Machine Learning Tools in a Multiplan IMRT Framework

    SciTech Connect

    Zhang, Hao H.; D'Souza, Warren D.; Shi, Leyuan; Meyer, Robert R.

    2009-08-01

    Purpose: To predict organ-at-risk (OAR) complications as a function of dose-volume (DV) constraint settings without explicit plan computation in a multiplan intensity-modulated radiotherapy (IMRT) framework. Methods and Materials: Several plans were generated by varying the DV constraints (input features) on the OARs (multiplan framework), and the DV levels achieved by the OARs in the plans (plan properties) were modeled as a function of the imposed DV constraint settings. OAR complications were then predicted for each of the plans by using the imposed DV constraints alone (features) or in combination with modeled DV levels (plan properties) as input to machine learning (ML) algorithms. These ML approaches were used to model two OAR complications after head-and-neck and prostate IMRT: xerostomia, and Grade 2 rectal bleeding. Two-fold cross-validation was used for model verification and mean errors are reported. Results: Errors for modeling the achieved DV values as a function of constraint settings were 0-6%. In the head-and-neck case, the mean absolute prediction error of the saliva flow rate normalized to the pretreatment saliva flow rate was 0.42% with a 95% confidence interval of (0.41-0.43%). In the prostate case, an average prediction accuracy of 97.04% with a 95% confidence interval of (96.67-97.41%) was achieved for Grade 2 rectal bleeding complications. Conclusions: ML can be used for predicting OAR complications during treatment planning allowing for alternative DV constraint settings to be assessed within the planning framework.
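    The two-fold cross-validation with mean errors mentioned in the abstract above can be sketched as follows. This is a minimal illustration only, not the authors' method: the helper names (`two_fold_mae`, `fit_line`, `predict_line`), the alternating fold split, the least-squares line used as a stand-in model, and the synthetic data are all assumptions introduced here.

```python
from statistics import mean


def two_fold_mae(xs, ys, fit, predict):
    """Two-fold cross-validation: train on one half, test on the other, then swap.

    Returns the mean absolute error averaged over the two folds.
    """
    # Assumed split: even-indexed samples form fold 0, odd-indexed samples fold 1.
    folds = [list(range(0, len(xs), 2)), list(range(1, len(xs), 2))]
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [abs(predict(model, xs[i]) - ys[i]) for i in test_idx]
        fold_errors.append(mean(errors))
    return mean(fold_errors)


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (hypothetical stand-in model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def predict_line(model, x):
    """Evaluate the fitted line at x."""
    a, b = model
    return a * x + b


# Synthetic, exactly linear data (not taken from the study).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [2.0 * x + 1.0 for x in xs]
cv_error = two_fold_mae(xs, ys, fit_line, predict_line)
```

    Because every sample is held out exactly once, the averaged error estimates how the model behaves on data it was not fitted to; with only two folds, each model is trained on half of the data, so the estimate tends to be pessimistic for small data sets.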

  15. Improved tool grinding machine

    DOEpatents

    Dial, C.E. Sr.

    The present invention relates to an improved tool grinding mechanism for grinding single point diamond cutting tools to precise roundness and radius specifications. The present invention utilizes a tool holder which is longitudinally displaced with respect to the remainder of the grinding system due to contact of the tool with the grinding surface with this displacement being monitored so that any variation in the grinding of the cutting surface such as caused by crystal orientation or tool thicknesses may be compensated for during the grinding operation to assure the attainment of the desired cutting tool face specifications.

  16. Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights.

    PubMed

    Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi; Segata, Nicola

    2016-07-01

    Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the "healthy" microbiome can be considered a first step toward defining general microbial dysbiosis. The software framework, microbiome profiles, and metadata for thousands of samples are publicly

  17. Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights

    PubMed Central

    Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi

    2016-01-01

    Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the “healthy” microbiome can be considered a first step toward defining general microbial dysbiosis. The software framework, microbiome profiles, and metadata for thousands of samples are publicly

  19. Machine Learning in Medicine.

    PubMed

    Deo, Rahul C

    2015-11-17

    Spurred by advances in processing power, memory, storage, and an unprecedented wealth of data, computers are being asked to tackle increasingly complex learning tasks, often with astonishing success. Computers have now mastered a popular variant of poker, learned the laws of physics from experimental data, and become experts in video games - tasks that would have been deemed impossible not too long ago. In parallel, the number of companies centered on applying complex data analysis to varying industries has exploded, and it is thus unsurprising that some analytic companies are turning attention to problems in health care. The purpose of this review is to explore what problems in medicine might benefit from such learning approaches and use examples from the literature to introduce basic concepts in machine learning. It is important to note that seemingly large enough medical data sets and adequate learning algorithms have been available for many decades, and yet, although there are thousands of papers applying machine learning algorithms to medical data, very few have contributed meaningfully to clinical care. This lack of impact stands in stark contrast to the enormous relevance of machine learning to many other industries. Thus, part of my effort will be to identify what obstacles there may be to changing the practice of medicine through statistical learning approaches, and discuss how these might be overcome. PMID:26572668

  1. Deformation Twin Nucleation and Growth Characterization in Magnesium Alloys Using Novel EBSD Pattern Analysis and Machine Learning Tools

    NASA Astrophysics Data System (ADS)

    Rampton, Travis M.

    Deformation twinning in Magnesium alloys both facilitates slip and forms sites for failure. Currently, basic studies of twinning in Mg are facilitated by electron backscatter diffraction (EBSD) which is able to extract a myriad of information relating to crystalline microstructures. Although much information is available via EBSD, various problems relating to deformation twinning have not been solved. This dissertation provides new insights into deformation twinning in Mg alloys, with particular focus on AZ31. These insights were gained through the development of new EBSD and related machine learning tools that extract more information beyond what is currently accessed. The first tool relating to characterization of deformed and twinned materials focuses on surface topography crack detection. The intensity map across EBSD images contains vital information that can be used to detect evolution of surface roughness and crack formation, which typically occurs at twin boundaries. The method of topography recovery resulted in reconstruction errors as low as 2% over a 500 μm length. The method was then applied to a 3 μm × 3 μm area of twinned Tantalum which experienced topographic alterations. The topography of Ta correlated with other measured changes in the microstructure. Additionally, EBSD images were used to identify the presence of cracks in Nickel microstructures. Several cracks were identified on the Ni specimen, demonstrating that cracks as thin as 34 nm could be measured. A further EBSD based tool developed for this study was used to identify thin compression twins in Mg; these are often missed in a traditional EBSD scan due to their size relative to the electron probe. This tool takes advantage of crystallographic relationships that exist between parent and twinned grains; common planes that exist in both grains lead to bands of consistent intensity as a scan crosses a twin. Hence, twin boundaries in a microstructure can be recognized, even when

  2. Machine Tool Operation, Course Description.

    ERIC Educational Resources Information Center

    Denny, Walter E.; Anderson, Floyd L.

    Prepared by an instructor and curriculum specialists, this course of study was designed to meet the individual needs of the dropout and/or hard-core unemployed youth by providing them skill training, related information, and supportive services knowledge in machine tool operation. The achievement level of each student is determined at entry, and…

  3. Data Mining and Machine Learning Tools for Combinatorial Material Science of All-Oxide Photovoltaic Cells.

    PubMed

    Yosipof, Abraham; Nahum, Oren E; Anderson, Assaf Y; Barad, Hannah-Noa; Zaban, Arie; Senderowitz, Hanoch

    2015-06-01

    Growth in energy demands, coupled with the need for clean energy, is likely to make solar cells an important part of future energy resources. In particular, cells entirely made of metal oxides (MOs) have the potential to provide clean and affordable energy if their power conversion efficiencies are improved. Such improvements require the development of new MOs which could benefit from combining combinatorial material sciences for producing solar cell libraries with data mining tools to direct synthesis efforts. In this work we developed a data mining workflow and applied it to the analysis of two recently reported solar cell libraries based on Titanium and Copper oxides. Our results demonstrate that QSAR models with good prediction statistics for multiple solar cell properties could be developed and that these models highlight important factors affecting these properties in accord with experimental findings. The resulting models are therefore suitable for designing better solar cells.

  4. Advances of implementing NC machine tools discussed

    NASA Astrophysics Data System (ADS)

    Kukuyev, Y. P.; Trukhan, Y. V.

    1984-11-01

    Numerical control (NC) machine tools, which are one of the principal means of re-equipping, mechanizing and automating small-series and series production in machine building, were examined. The continually increasing volume of NC machine tools produced and introduced makes their commissioning and the organization of their effective use economically significant. Organizational and technical measures were directed at solving these problems. To ensure the fastest introduction of NC machine tools into operation and their technical maintenance, a number of setup organizations were established. Setup services are also provided by the plants manufacturing the NC machine tools, and appropriate subdivisions are created for this purpose.

  5. Quantum-Enhanced Machine Learning

    NASA Astrophysics Data System (ADS)

    Dunjko, Vedran; Taylor, Jacob M.; Briegel, Hans J.

    2016-09-01

    The emerging field of quantum machine learning has the potential to substantially aid in the problems and scope of artificial intelligence. This is only enhanced by recent successes in the field of classical machine learning. In this work we propose an approach for the systematic treatment of machine learning, from the perspective of quantum information. Our approach is general and covers all three main branches of machine learning: supervised, unsupervised, and reinforcement learning. While quantum improvements in supervised and unsupervised learning have been reported, reinforcement learning has received much less attention. Within our approach, we tackle the problem of quantum enhancements in reinforcement learning as well, and propose a systematic scheme for providing improvements. As an example, we show that quadratic improvements in learning efficiency, and exponential improvements in performance over limited time periods, can be obtained for a broad class of learning problems.

  6. Standardized Curriculum for Machine Tool Operation/Machine Shop.

    ERIC Educational Resources Information Center

    Mississippi State Dept. of Education, Jackson. Office of Vocational, Technical and Adult Education.

    Standardized vocational education course titles and core contents for two courses in Mississippi are provided: machine tool operation/machine shop I and II. The first course contains the following units: (1) orientation; (2) shop safety; (3) shop math; (4) measuring tools and instruments; (5) hand and bench tools; (6) blueprint reading; (7)…

  7. MLViS: A Web Tool for Machine Learning-Based Virtual Screening in Early-Phase of Drug Discovery and Development

    PubMed Central

    Korkmaz, Selcuk; Zararsiz, Gokmen; Goksuluk, Dincer

    2015-01-01

    Virtual screening is an important step in the early phase of the drug discovery process. Since there are thousands of compounds, this step should be both fast and effective in order to distinguish drug-like and nondrug-like molecules. Statistical machine learning methods are widely used in drug discovery studies for classification purposes. Here, we aim to develop a new tool that can classify molecules as drug-like or nondrug-like based on various machine learning methods, including discriminant, tree-based, kernel-based, ensemble and other algorithms. To construct this tool, the performances of twenty-three different machine learning algorithms were first compared by ten different measures; then the ten best performing algorithms were selected based on principal component and hierarchical cluster analysis results. Besides classification, this application can also create heat maps and dendrograms for visual inspection of the molecules through hierarchical cluster analysis. Moreover, users can connect to the PubChem database to download molecular information and to create two-dimensional structures of compounds. This application is freely available through www.biosoft.hacettepe.edu.tr/MLViS/. PMID:25928885

  8. Machine tools and fixtures: A compilation

    NASA Technical Reports Server (NTRS)

    1971-01-01

    As part of NASA's Technology Utilization Program, a compilation was made of technological developments regarding machine tools, jigs, and fixtures that have been produced, modified, or adapted to meet requirements of the aerospace program. The compilation is divided into three sections that include: (1) a variety of machine tool applications that offer easier and more efficient production techniques; (2) methods, techniques, and hardware that aid in the setup, alignment, and control of machines and machine tools to further quality assurance in finished products; and (3) jigs, fixtures, and adapters that are ancillary to basic machine tools and aid in realizing their greatest potential.

  9. Prediction of Machine Tool Condition Using Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Wang, Peigong; Meng, Qingfeng; Zhao, Jian; Li, Junjie; Wang, Xiufeng

    2011-07-01

    Condition monitoring and prediction of CNC machine tools are investigated in this paper. Because only small numbers of samples are often available for CNC machine tools, a condition prediction method based on support vector machines (SVMs) is proposed, and one-step and multi-step condition prediction models are constructed. The SVM prediction models are used to predict the trends of the working condition of a certain type of CNC worm wheel and gear grinding machine from sequence data of the vibration signal collected during machining. The relationship between different eigenvalues of the CNC vibration signal and machining quality is also discussed. The test results show that the trend of the peak-to-peak value of the vibration signal in the surface normal direction is most relevant to the trend of the surface roughness value. In trend prediction of working condition, the support vector machine has higher prediction accuracy in both short-term (one-step) and long-term (multi-step) prediction than the autoregressive (AR) model and the RBF neural network. Experimental results show that it is feasible to apply support vector machines to CNC machine tool condition prediction.
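
    The difference between one-step and multi-step prediction mentioned in this abstract can be illustrated with a small sketch. It is not the authors' method: for brevity it fits a first-order autoregressive (AR) model, the baseline named in the abstract, rather than an SVM, and the peak-to-peak values are made-up numbers. All function and variable names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] ~ a * x[t] + b (a simple AR(1) model)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def predict_one_step(model, last_value):
    """One-step prediction: use the last observed value directly."""
    a, b = model
    return a * last_value + b


def predict_multi_step(model, last_value, horizon):
    """Multi-step prediction: feed each prediction back in as the next input."""
    predictions = []
    current = last_value
    for _ in range(horizon):
        current = predict_one_step(model, current)
        predictions.append(current)
    return predictions


# Hypothetical peak-to-peak vibration values (illustrative only).
peak_to_peak = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
model = fit_ar1(peak_to_peak)
one_step = predict_one_step(model, peak_to_peak[-1])
multi_step = predict_multi_step(model, peak_to_peak[-1], horizon=3)
```

    Because multi-step prediction reuses its own outputs, errors can accumulate over the horizon, which is why the two settings are usually evaluated separately.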

  10. Machine Learning for Biological Trajectory Classification Applications

    NASA Technical Reports Server (NTRS)

    Sbalzarini, Ivo F.; Theriot, Julie; Koumoutsakos, Petros

    2002-01-01

    Machine-learning techniques, including clustering algorithms, support vector machines and hidden Markov models, are applied to the task of classifying trajectories of moving keratocyte cells. The different algorithms are compared to each other as well as to expert and non-expert test persons, using concepts from signal-detection theory. The algorithms performed very well as compared to humans, suggesting a robust tool for trajectory classification in biological applications.
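
    The abstract does not say which signal-detection measures were used. One common choice for comparing classifiers with human raters is the sensitivity index d', sketched below under that assumption. The hit and false-alarm rates are hypothetical values, not results from the paper.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates for an algorithm and a human rater (illustrative only).
algorithm_d = d_prime(hit_rate=0.9, false_alarm_rate=0.1)
human_d = d_prime(hit_rate=0.8, false_alarm_rate=0.2)
```

    A larger d' means the rater separates the two classes better, independent of how liberal or conservative the decision threshold is.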

  11. Chip breaking system for automated machine tool

    DOEpatents

    Arehart, Theodore A.; Carey, Donald O.

    1987-01-01

    The invention is a rotary selectively directional valve assembly for use in an automated turret lathe for directing a stream of high pressure liquid machining coolant to the interface of a machine tool and workpiece for breaking up ribbon-shaped chips during the formation thereof so as to inhibit scratching or other marring of the machined surfaces by these ribbon-shaped chips. The valve assembly is provided by a manifold arrangement having a plurality of circumferentially spaced apart ports each coupled to a machine tool. The manifold is rotatable with the turret when the turret is positioned for alignment of a machine tool in a machining relationship with the workpiece. The manifold is connected to a non-rotational header having a single passageway therethrough which conveys the high pressure coolant to only the port in the manifold which is in registry with the tool disposed in a working relationship with the workpiece. To position the machine tools the turret is rotated and one of the tools is placed in a material-removing relationship of the workpiece. The passageway in the header and one of the ports in the manifold arrangement are then automatically aligned to supply the machining coolant to the machine tool workpiece interface for breaking up of the chips as well as cooling the tool and workpiece during the machining operation.

  12. How To Teach Common Characteristics of Machine Tools

    ERIC Educational Resources Information Center

    Kazanas, H. C.

    1970-01-01

    Organizes machine tools and machine operations into commonalities in order to help the student visualize and distinguish the common characteristics which exist between machine tools and operations. (GR)

  13. Model-based machine learning.

    PubMed

    Bishop, Christopher M

    2013-02-13

    Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application, a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this study, we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning do not have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this study, we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications. PMID:23277612

  15. Web Mining: Machine Learning for Web Applications.

    ERIC Educational Resources Information Center

    Chen, Hsinchun; Chau, Michael

    2004-01-01

    Presents an overview of machine learning research and reviews methods used for evaluating machine learning systems. Ways that machine-learning algorithms were used in traditional information retrieval systems in the "pre-Web" era are described, and the field of Web mining and how machine learning has been used in different Web mining applications…

  16. Paradigms for Realizing Machine Learning Algorithms.

    PubMed

    Agneeswaran, Vijay Srinivas; Tonpay, Pranay; Tiwary, Jayati

    2013-12-01

    The article explains the three generations of machine learning algorithms, all three of which try to operate on big data. The first-generation tools are SAS, SPSS, etc., while second-generation realizations include Mahout and RapidMiner (which work over Hadoop), and the third-generation paradigms include Spark and GraphLab, among others. The essence of the article is that, for a number of machine learning algorithms, it is important to look beyond Hadoop's MapReduce paradigm in order to make them work on big data. A number of promising contenders have emerged in the third generation that can be exploited to realize deep analytics on big data.

  17. Topics in Machine Learning for Astronomers

    NASA Astrophysics Data System (ADS)

    Cisewski, Jessi

    2016-01-01

    As astronomical datasets continue to increase in size and complexity, innovative statistical and machine learning tools are required to address the scientific questions of interest in a computationally efficient manner. I will introduce some tools that astronomers can employ for such problems with a focus on clustering and classification techniques. I will introduce standard methods, but also get into more recent developments that may be of use to the astronomical community.

  18. National Machine Tool Partnership (NMTP) FY 1998

    SciTech Connect

    1997-12-01

    The Department of Energy (DOE) Defense Programs (DP) National Machine Tool Partnership (NMTP) program has been active since February 1993. The NMTP program is an element of the DP Technology Partnership Program. The NMTP has assisted the Association of Manufacturing Technology (AMT) in the formulation of a technology roadmap for the machine tool industry. This roadmap has been developed to provide a clearer step-by-step plan for technology development and implementation to help close the gap between user requirements and industry implementation. The document outlines a suggested path for the development of technologies for the machine tool industry. The plan details the technology issues or needs analysis facing the machine tool industry. In a parallel effort, the NMTP has prepared a needs analysis of machine tool related technologies needed in various DP laboratory weapons core programs, including the Advanced Design and Production Technologies (ADaPT) initiative.

  19. Machine Shop. Student Learning Guide.

    ERIC Educational Resources Information Center

    Palm Beach County Board of Public Instruction, West Palm Beach, FL.

    This student learning guide contains eight modules for completing a course in machine shop. It is designed especially for use in Palm Beach County, Florida. Each module covers one task, and consists of a purpose, performance objective, enabling objectives, learning activities and resources, information sheets, student self-check with answer key,…

  20. Machine learning for neuroimaging with scikit-learn.

    PubMed

    Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël

    2014-01-01

    Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain. PMID:24600388

  1. Game-powered machine learning

    PubMed Central

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-01-01

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the “wisdom of the crowds.” Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., “funky jazz with saxophone,” “spooky electronica,” etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data. PMID:22460786
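
    The third step, in which the learning system directs the games toward the most useful new data, is a form of active learning. The abstract does not specify the selection rule. The sketch below assumes simple uncertainty sampling, where the songs whose predicted tag probability is closest to 0.5 are sent to the game first. Song names and probabilities are hypothetical.

```python
def select_for_annotation(tag_probabilities, batch_size):
    """Return the songs whose predicted tag probability is most uncertain.

    tag_probabilities maps a song identifier to the model's predicted
    probability that a given tag applies to that song.
    """
    ranked = sorted(
        tag_probabilities,
        key=lambda song: abs(tag_probabilities[song] - 0.5),
    )
    return ranked[:batch_size]


# Hypothetical model outputs for the tag "funky" (illustrative only).
predicted = {"song_a": 0.95, "song_b": 0.52, "song_c": 0.10, "song_d": 0.40}
next_batch = select_for_annotation(predicted, batch_size=2)
```

    Songs the model already labels confidently (here `song_a` and `song_c`) are skipped, so human effort goes where a new label is expected to change the model most.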

  2. Game-powered machine learning.

    PubMed

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-04-24

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the "wisdom of the crowds." Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., "funky jazz with saxophone," "spooky electronica," etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.

  3. Machine learning methods in chemoinformatics

    PubMed Central

    Mitchell, John B O

    2014-01-01

    Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183 PMID:25285160
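
    As a minimal illustration of the supervised setting described here, the sketch below implements k-Nearest Neighbors, one of the methods the abstract lists. The two-dimensional descriptor vectors and activity labels are invented for the example and do not come from the article.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training instances, using Euclidean distance between descriptors."""
    neighbors = sorted(
        training_set,
        key=lambda item: math.dist(item[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical (descriptor vector, label) pairs standing in for molecules.
training_set = [
    ((0.10, 0.20), "inactive"),
    ((0.20, 0.10), "inactive"),
    ((0.90, 0.80), "active"),
    ((0.80, 0.90), "active"),
    ((0.85, 0.75), "active"),
]
prediction = knn_predict(training_set, query=(0.80, 0.80), k=3)
```

    The known labels of the training set play the role of the "known values" in the abstract, and the query plays the role of a test-set molecule.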

  4. Method for machining steel with diamond tools

    DOEpatents

    Casstevens, John M.

    1986-01-01

    The present invention is directed to a method for machining optical quality finishes and contour accuracies of workpieces of carbon-containing metals such as steel with diamond tooling. The wear rate of the diamond tooling is significantly reduced by saturating the atmosphere at the interface of the workpiece and the diamond tool with a gaseous hydrocarbon during the machining operation. The presence of the gaseous hydrocarbon effectively eliminates the deterioration of the diamond tool by inhibiting or preventing the conversion of the diamond carbon to graphite carbon at the point of contact between the cutting tool and the workpiece.

  5. Method for machining steel with diamond tools

    DOEpatents

    Casstevens, J.M.

    1984-01-01

    The present invention is directed to a method for machining optical quality finishes and contour accuracies of workpieces of carbon-containing metals such as steel with diamond tooling. The wear rate of the diamond tooling is significantly reduced by saturating the atmosphere at the interface of the workpiece and the diamond tool with a gaseous hydrocarbon during the machining operation. The presence of the gaseous hydrocarbon effectively eliminates the deterioration of the diamond tool by inhibiting or preventing the conversion of the diamond carbon to graphite carbon at the point of contact between the cutting tool and the workpiece.

  6. Speed-Selector Guard For Machine Tool

    NASA Technical Reports Server (NTRS)

    Shakhshir, Roda J.; Valentine, Richard L.

    1992-01-01

    Simple guardplate prevents accidental reversal of direction of rotation or sudden change of speed of lathe, milling machine, or other machine tool. Custom-made for specific machine and control settings. Allows control lever to be placed at only one setting. Operator uses handle to slide guard to engage or disengage control lever. Protects personnel from injury and equipment from damage occurring if speed- or direction-control lever inadvertently placed in wrong position.

  7. Numerically Controlled Machine Tools and Worker Skills.

    ERIC Educational Resources Information Center

    Keefe, Jeffrey H.

    1991-01-01

    Analysis of data from "Industry Wage Surveys of Machinery Manufacturers" on the skill levels of 57 machining jobs found that introduction of numerically controlled machine tools has resulted in a very small reduction in skill levels or no significant change, supporting neither the deskilling argument nor argument that skill levels increase with…

  8. Machine Tool Series. Duty Task List.

    ERIC Educational Resources Information Center

    Oklahoma State Dept. of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This task list is intended for use in planning and/or evaluating a competency-based course to prepare machine tool, drill press, grinding machine, lathe, mill, and/or power saw operators. The listing is divided into six sections, with each one outlining the tasks required to perform the duties that have been identified for the given occupation.…

  9. Refrigerated cutting tools improve machining of superalloys

    NASA Technical Reports Server (NTRS)

    Dudley, G. M.

    1971-01-01

    Freon-12 applied to tool cutting edge evaporates quickly, leaves no residue, and permits higher cutting rate than with conventional coolants. This technique increases cutting rate on Rene-41 threefold and improves finish of machined surface.

  10. Applications of Machine Learning in Information Retrieval.

    ERIC Educational Resources Information Center

    Cunningham, Sally Jo; Witten, Ian H.; Littin, James

    1999-01-01

    Introduces the basic ideas that underpin applications of machine learning to information retrieval. Describes applications of machine learning to text categorization. Considers how machine learning can be applied to the query-formulation process. Examines methods of document filtering, where the user specifies a query that is to be applied to an…

  11. Learning Extended Finite State Machines

    NASA Technical Reports Server (NTRS)

    Cassel, Sofia; Howar, Falk; Jonsson, Bengt; Steffen, Bernhard

    2014-01-01

    We present an active learning algorithm for inferring extended finite state machines (EFSM)s, combining data flow and control behavior. Key to our learning technique is a novel learning model based on so-called tree queries. The learning algorithm uses the tree queries to infer symbolic data constraints on parameters, e.g., sequence numbers, time stamps, identifiers, or even simple arithmetic. We describe sufficient conditions for the properties that the symbolic constraints provided by a tree query in general must have to be usable in our learning model. We have evaluated our algorithm in a black-box scenario, where tree queries are realized through (black-box) testing. Our case studies include connection establishment in TCP and a priority queue from the Java Class Library.

  12. Learning Machine Learning: A Case Study

    ERIC Educational Resources Information Center

    Lavesson, N.

    2010-01-01

    This correspondence reports on a case study conducted in the Master's-level Machine Learning (ML) course at Blekinge Institute of Technology, Sweden. The students participated in a self-assessment test and a diagnostic test of prerequisite subjects, and their results on these tests are correlated with their achievement of the course's learning…

  13. The Higgs Machine Learning Challenge

    NASA Astrophysics Data System (ADS)

    Adam-Bourdarios, C.; Cowan, G.; Germain-Renaud, C.; Guyon, I.; Kégl, B.; Rousseau, D.

    2015-12-01

    The Higgs Machine Learning Challenge was an open data analysis competition that took place between May and September 2014. Samples of simulated data from the ATLAS Experiment at the LHC corresponding to signal events with Higgs bosons decaying to τ+τ- together with background events were made available to the public through the website of the data science organization Kaggle (kaggle.com). Participants attempted to identify the search region in a space of 30 kinematic variables that would maximize the expected discovery significance of the signal process. One of the primary goals of the Challenge was to promote communication of new ideas between the Machine Learning (ML) and HEP communities. In this regard it was a resounding success, with almost 2,000 participants from HEP, ML and other areas. The process of understanding and integrating the new ideas, particularly from ML into HEP, is currently underway.
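
    The abstract does not state how the expected discovery significance was computed. The sketch below assumes the approximate median significance (AMS) form generally associated with the challenge, with a regularization term of 10 added to the background count. Treat the formula and the constant as assumptions to check against the challenge documentation. The event counts are hypothetical.

```python
import math


def approximate_median_significance(signal, background, b_reg=10.0):
    """AMS = sqrt(2 * ((s + b + b_reg) * ln(1 + s / (b + b_reg)) - s)).

    `signal` (s) and `background` (b) are the expected weighted numbers of
    signal and background events falling inside the selected search region.
    """
    b_total = background + b_reg
    inner = (signal + b_total) * math.log(1.0 + signal / b_total) - signal
    return math.sqrt(2.0 * inner)


# Hypothetical expected counts in a candidate search region.
ams = approximate_median_significance(signal=10.0, background=990.0)
```

    When the signal is small relative to the background, the value is close to s / sqrt(b + b_reg), the familiar rough estimate of significance.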

  14. Paradigms for Realizing Machine Learning Algorithms.

    PubMed

    Agneeswaran, Vijay Srinivas; Tonpay, Pranay; Tiwary, Jayati

    2013-12-01

    The article explains the three generations of machine learning algorithms, all three of which try to operate on big data. The first-generation tools are SAS, SPSS, etc., while second-generation realizations include Mahout and RapidMiner (which work over Hadoop), and the third-generation paradigms include Spark and GraphLab, among others. The essence of the article is that, for a number of machine learning algorithms, it is important to look beyond Hadoop's MapReduce paradigm in order to make them work on big data. A number of promising contenders have emerged in the third generation that can be exploited to realize deep analytics on big data. PMID:27447253

  15. Machine learning for medical images analysis.

    PubMed

    Criminisi, A

    2016-10-01

    This article discusses the application of machine learning for the analysis of medical images. Specifically: (i) We show how a special type of learning models can be thought of as automatically optimized, hierarchically-structured, rule-based algorithms, and (ii) We discuss how the issue of collecting large labelled datasets applies to both conventional algorithms as well as machine learning techniques. The size of the training database is a function of model complexity rather than a characteristic of machine learning methods.

  16. Machine-Tool Technology Instructor's Sourcebook.

    ERIC Educational Resources Information Center

    Tammer, Anthony M.

    This document lists and annotates commercial and noncommercial resources pertaining to machine-tool technology. Following an introduction that explains how the document came to be written, the subjects of succeeding chapters are (1) periodicals; (2) associations; (3) audiovisual resources, including a subject index; (4) publishers, including a…

  17. ATST telescope mount: telescope of machine tool

    NASA Astrophysics Data System (ADS)

    Jeffers, Paul; Stolz, Günter; Bonomi, Giovanni; Dreyer, Oliver; Kärcher, Hans

    2012-09-01

    The Advanced Technology Solar Telescope (ATST) will be the largest solar telescope in the world, and will be able to provide the sharpest views ever taken of the solar surface. The telescope has a 4m aperture primary mirror, however due to the off axis nature of the optical layout, the telescope mount has proportions similar to an 8 meter class telescope. The technology normally used in this class of telescope is well understood in the telescope community and has been successfully implemented in numerous projects. The world of large machine tools has developed in a separate realm with similar levels of performance requirement but different boundary conditions. In addition the competitive nature of private industry has encouraged development and usage of more cost effective solutions both in initial capital cost and thru-life operating cost. Telescope mounts move relatively slowly with requirements for high stability under external environmental influences such as wind buffeting. Large machine tools operate under high speed requirements coupled with high application of force through the machine but with little or no external environmental influences. The benefits of these parallel development paths and the ATST system requirements are being combined in the ATST Telescope Mount Assembly (TMA). The process of balancing the system requirements with new technologies is based on the experience of the ATST project team, Ingersoll Machine Tools who are the main contractor for the TMA and MT Mechatronics who are their design subcontractors. This paper highlights a number of these proven technologies from the commercially driven machine tool world that are being introduced to the TMA design. Also the challenges of integrating and ensuring that the differences in application requirements are accounted for in the design are discussed.

  18. Introducing Machine Learning Concepts with WEKA.

    PubMed

    Smith, Tony C; Frank, Eibe

    2016-01-01

    This chapter presents an introduction to data mining with machine learning. It gives an overview of various types of machine learning, along with some examples. It explains how to download, install, and run the WEKA data mining toolkit on a simple data set, then proceeds to explain how one might approach a bioinformatics problem. Finally, it includes a brief summary of machine learning algorithms for other types of data mining problems, and provides suggestions about where to find additional information.

  19. MLZ: Machine Learning for photo-Z

    NASA Astrophysics Data System (ADS)

    Carrasco Kind, Matias; Brunner, Robert

    2014-03-01

    The parallel Python framework MLZ (Machine Learning and photo-Z) computes fast and robust photometric redshift PDFs using machine learning algorithms. It uses a supervised technique with prediction trees and random forest through TPZ, which can be used for a regression or a classification problem, or an unsupervised method with self-organizing maps and random atlas called SOMz. These machine learning implementations can be efficiently combined into a more powerful one, resulting in robust and accurate probability distributions for photometric redshifts.

  20. An investigation of chatter and tool wear when machining titanium

    NASA Technical Reports Server (NTRS)

    Sutherland, I. A.

    1974-01-01

    The low thermal conductivity of titanium, together with the low contact area between chip and tool and the unusually high chip velocities, gives rise to high tool tip temperatures and accelerated tool wear. Machining speeds have to be considerably reduced to avoid these high temperatures with a consequential loss of productivity. Restoring this lost productivity involves increasing other machining variables, such as feed and depth-of-cut, and can lead to another machining problem commonly known as chatter. This work is to acquaint users with these problems, to examine the variables that may be encountered when machining a material like titanium, and to advise the machine tool user on how to maximize the output from the machines and tooling available to him. Recommendations are made on ways of improving tolerances, reducing machine tool instability or chatter, and improving productivity. New tool materials, tool coatings, and coolants are reviewed and their relevance examined when machining titanium.

  1. Machine learning of user profiles: Representational issues

    SciTech Connect

    Bloedorn, E.; Mani, I.; MacMillan, T.R.

    1996-12-31

    As more information becomes available electronically, tools for finding information of interest to users become increasingly important. The goal of the research described here is to build a system for generating comprehensible user profiles that accurately capture user interest with minimum user interaction. The research described here focuses on the importance of a suitable generalization hierarchy and representation for learning profiles which are predictively accurate and comprehensible. In our experiments we evaluated both traditional features based on weighted term vectors as well as subject features corresponding to categories which could be drawn from a thesaurus. Our experiments, conducted in the context of a content-based profiling system for on-line newspapers on the World Wide Web (the IDD News Browser), demonstrate the importance of a generalization hierarchy and the promise of combining natural language processing techniques with machine learning (ML) to address an information retrieval (IR) problem.
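
    The "traditional features based on weighted term vectors" can be pictured with a small sketch. The abstract does not name the weighting scheme, so the code assumes a plain TF-IDF weighting with whitespace tokenization. The article snippets are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dictionary per document.

    weight = (term count / document length) * ln(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical article snippets a user has marked as interesting.
articles = ["stock market news", "sports news"]
vectors = tfidf_vectors(articles)
```

    A term that appears in every article (here "news") gets weight zero, so the profile is driven by the terms that distinguish one article from another.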

  2. Machine learning research 1989-90

    NASA Technical Reports Server (NTRS)

    Porter, Bruce W.; Souther, Arthur

    1990-01-01

    Multifunctional knowledge bases offer a significant advance in artificial intelligence because they can support numerous expert tasks within a domain. As a result they amortize the costs of building a knowledge base over multiple expert systems and they reduce the brittleness of each system. Due to the inevitable size and complexity of multifunctional knowledge bases, their construction and maintenance require knowledge engineering and acquisition tools that can automatically identify interactions between new and existing knowledge. Furthermore, their use requires software for accessing those portions of the knowledge base that coherently answer questions. Considerable progress was made in developing software for building and accessing multifunctional knowledge bases. A language was developed for representing knowledge, along with software tools for editing and displaying knowledge, a machine learning program for integrating new information into existing knowledge, and a question answering system for accessing the knowledge base.

  3. Defect Classification Using Machine Learning

    SciTech Connect

    Carr, A; Kegelmeyer, L; Liao, Z M; Abdulla, G; Cross, D; Kegelmeyer, W P; Raviza, F; Carr, C W

    2008-10-24

    Laser-induced damage growth on the surface of fused silica optics has been extensively studied and has been found to depend on a number of factors including fluence and the surface on which the damage site resides. It has been demonstrated that damage sites as small as a few tens of microns can be detected and tracked on optics installed on a fusion-class laser; however, determining in situ the surface of an optic on which a damage site resides can be a significant challenge. In this work we demonstrate that a machine-learning algorithm can successfully predict the surface location of the damage site using an expanded set of characteristics for each damage site, some of which are not historically associated with growth rate.

  4. Machine learning in motion control

    NASA Technical Reports Server (NTRS)

    Su, Renjeng; Kermiche, Noureddine

    1989-01-01

    The existing methodologies for robot programming originate primarily from robotic applications to manufacturing, where uncertainties of the robots and their task environment may be minimized by repeated off-line modeling and identification. In space applications of robots, however, a higher degree of automation is required for robot programming because of the desire to minimize human intervention. We discuss a new paradigm of robotic programming which is based on the concept of machine learning. The goal is to let robots practice tasks by themselves, with the operational data used to automatically improve their motion performance. The underlying mathematical problem is to solve the dynamical inverse problem by iterative methods. One of the key questions is how to ensure the convergence of the iterative process. A few small steps have been taken toward this important approach to robot programming. We give a representative result on the convergence problem.
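
    The practice-and-improve idea in this abstract can be illustrated with a small sketch. The abstract does not state the update law, so the code below assumes a generic proportional iterative-learning rule on a hypothetical scalar plant (output = plant_gain * command); the function name and all parameters are illustrative only. Under these assumptions the error is multiplied by (1 - learning_gain * plant_gain) on every trial, so the iteration converges only when that factor has magnitude below 1.

```python
def iterative_learning(plant_gain, target, learning_gain, trials):
    """Refine a feedforward command over repeated practice trials.

    Assumed (hypothetical) plant: output = plant_gain * command.
    Assumed update rule: command += learning_gain * error.
    """
    command = 0.0
    errors = []
    for _ in range(trials):
        output = plant_gain * command      # run one practice trial
        error = target - output            # tracking error observed in that trial
        errors.append(abs(error))
        command += learning_gain * error   # correct the command for the next trial
    return command, errors


# Contraction factor 1 - 0.3 * 2.0 = 0.4, so the error shrinks every trial.
command, errors = iterative_learning(plant_gain=2.0, target=1.0,
                                     learning_gain=0.3, trials=20)

# Contraction factor 1 - 1.5 * 2.0 = -2.0, so the error grows every trial.
_, diverging = iterative_learning(plant_gain=2.0, target=1.0,
                                  learning_gain=1.5, trials=5)
```

    The second call shows why convergence is the key question: the same rule with too large a learning gain makes each trial worse than the last.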

  5. Automatic tool path generation for finish machining

    SciTech Connect

    Kwok, Kwan S.; Loucks, C.S.; Driessen, B.J.

    1997-03-01

    A system for automatic tool path generation was developed at Sandia National Laboratories for finish machining operations. The system consists of a commercially available 5-axis milling machine controlled by Sandia developed software. This system was used to remove overspray on cast turbine blades. A laser-based, structured-light sensor, mounted on a tool holder, is used to collect 3D data points around the surface of the turbine blade. Using the digitized model of the blade, a tool path is generated which will drive a 0.375 inch diameter CBN grinding pin around the tip of the blade. A fuzzified digital filter was developed to properly eliminate false sensor readings caused by burrs, holes and overspray. The digital filter was found to successfully generate the correct tool path for a blade with intentionally scanned holes and defects. The fuzzified filter improved the computation efficiency by a factor of 25. For application to general parts, an adaptive scanning algorithm was developed and presented with simulation results. A right pyramid and an ellipsoid were scanned successfully with the adaptive algorithm.

  6. Adaptive Learning Systems: Beyond Teaching Machines

    ERIC Educational Resources Information Center

    Kara, Nuri; Sevim, Nese

    2013-01-01

    Since the 1950s, teaching machines have changed a lot. Today, we have different ideas about how people learn and what instructors should do to help students during their learning process. We have adaptive learning technologies that can create much more student-oriented learning environments. The purpose of this article is to present these changes and its…

  7. Machine learning: An artificial intelligence approach

    SciTech Connect

    Michalski, R.S.; Carbonell, J.G.; Mitchell, T.M.

    1983-01-01

    This book contains tutorial overviews and research papers on contemporary trends in the area of machine learning viewed from an AI perspective. Research directions covered include: learning from examples, modeling human learning strategies, knowledge acquisition for expert systems, learning heuristics, discovery systems, and conceptual data analysis.

  9. Probabilistic machine learning and artificial intelligence.

    PubMed

    Ghahramani, Zoubin

    2015-05-28

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery. PMID:26017444

  10. Probabilistic machine learning and artificial intelligence

    NASA Astrophysics Data System (ADS)

    Ghahramani, Zoubin

    2015-05-01

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  11. Machine vision systems using machine learning for industrial product inspection

    NASA Astrophysics Data System (ADS)

    Lu, Yi; Chen, Tie Q.; Chen, Jie; Zhang, Jian; Tisler, Anthony

    2002-02-01

    Machine vision inspection requires efficient processing time and accurate results. In this paper, we present a machine vision inspection architecture, SMV (Smart Machine Vision). SMV decomposes a machine vision inspection problem into two stages, Learning Inspection Features (LIF), and On-Line Inspection (OLI). The LIF is designed to learn visual inspection features from design data and/or from inspection products. During the OLI stage, the inspection system uses the knowledge learnt by the LIF component to inspect the visual features of products. In this paper we will present two machine vision inspection systems developed under the SMV architecture for two different types of products, Printed Circuit Board (PCB) and Vacuum Fluorescent Display (VFD) boards. In the VFD board inspection system, the LIF component learns inspection features from a VFD board and its displaying patterns. In the PCB board inspection system, the LIF learns the inspection features from the CAD file of a PCB board. In both systems, the LIF component also incorporates interactive learning to make the inspection system more powerful and efficient. The VFD system has been deployed successfully in three different manufacturing companies and the PCB inspection system is in the process of being deployed in a manufacturing plant.

  12. Evaluation of machine learning tools as a statistical downscaling tool: temperatures projections for multi-stations for Thames River Basin, Canada

    NASA Astrophysics Data System (ADS)

    Goyal, Manish Kumar; Burn, Donald H.; Ojha, C. S. P.

    2012-05-01

    Many impact studies require climate change information at a finer resolution than that provided by global climate models (GCMs). This paper investigates the performance of existing state-of-the-art rule induction and tree algorithms, namely the single conjunctive rule learner, decision table, M5 model tree, and REPTree, and explores the impact of climate change on maximum and minimum temperatures (i.e., predictands) of 14 meteorological stations in the Upper Thames River Basin, Ontario, Canada. The data used for evaluation were large-scale predictor variables extracted from the National Centers for Environmental Prediction/National Center for Atmospheric Research reanalysis dataset and the simulations from the third-generation Canadian coupled global climate model. Data for four grid points covering the study region were used for developing the downscaling model. The M5 model tree algorithm was found to yield better performance than the other learning techniques explored in the present study. Hence, this technique was applied to project predictands generated from GCM using three scenarios (A1B, A2, and B1) for the periods (2046-2065 and 2081-2100). A simple multiplicative shift was used for correcting predictand values. The potential of the downscaling models in simulating predictands was evaluated, and downscaling results reveal that the proposed downscaling model can reproduce local daily predictands from large-scale weather variables. The trend of projected maximum and minimum temperatures was studied for historical as well as downscaled values using GCM and scenario uncertainty. There is likely an increasing trend for T max and T min for A1B, A2, and B1 scenarios while a decreasing trend has been observed for B1 scenarios during 2081-2100.

  13. Photonic Neurocomputers And Learning Machines

    NASA Astrophysics Data System (ADS)

    Farhat, Nabil H.

    1990-05-01

    The study of complex multidimensional nonlinear dynamical systems and the modeling and emulation of cognitive brain-like processing of sensory information (neural network research), including the study of chaos and its role in such systems, would benefit immensely from the development of a new generation of programmable analog computers capable of carrying out collective, nonlinear and iterative computations at very high speed. The massive interconnectivity and nonlinearity needed in such analog computing structures indicate that a mix of optics and electronics mediated by judicious choice of device physics offers benefits for realizing networks with the following desirable properties: (a) large scale nets, i.e. nets with a high number of decision making elements (neurons), (b) modifiable structure, i.e. ability to partition the net into any desired number of layers of prescribed size (number of neurons per layer) with any prescribed pattern of communications between them (e.g. feed forward or feedback (recurrent)), (c) programmable and/or adaptive connectivity weights between the neurons for self-organization and learning, (d) both synchronous and asynchronous update rules are possible, (e) high speed update, i.e. neurons with μsec response time to enable rapid iteration and convergence, (f) can be used in the study and evaluation of a variety of adaptive learning algorithms, (g) can be used in rapid solution by fast simulated annealing of complex optimization problems of the kind encountered in adaptive learning, pattern recognition, and image processing. The aim of this paper is to describe recent efforts and progress made towards achieving these desirable attributes in analog photonic (optoelectronic and/or electron optical) hardware that utilizes primarily incoherent light. A specific example, hardware implementation of a stochastic Boltzmann learning machine, is used as a vehicle for identifying generic issues and clarifying research and development areas for further

  14. The application of discriminant analysis and Machine Learning methods as tools to identify and classify compounds with potential as transdermal enhancers.

    PubMed

    Moss, G P; Shah, A J; Adams, R G; Davey, N; Wilkinson, S C; Pugh, W J; Sun, Y

    2012-01-23

    Discriminant analysis (DA) has previously been shown to allow the proposal of simple guidelines for the classification of 73 chemical enhancers of percutaneous absorption. Pugh et al. employed DA to classify such enhancers into simple categories, based on the physicochemical properties of the enhancer molecules (Pugh et al., 2005). While this approach provided a reasonable accuracy of classification it was unable to provide a consistently reliable estimate of enhancement ratio (ER, defined as the amount of hydrocortisone transferred after 24h, relative to control). Machine Learning methods, including Gaussian process (GP) regression, have recently been employed in the prediction of percutaneous absorption of exogenous chemicals (Moss et al., 2009; Lam et al., 2010; Sun et al., 2011). They have shown that they provide more accurate predictions of these phenomena. In this study several Machine Learning methods, including the K-nearest-neighbour (KNN) regression, single layer networks, radial basis function networks and the SVM classifier were applied to an enhancer dataset reported previously. The SMOTE sampling method was used to oversample chemical compounds with ER>10 in each training set in order to improve estimation of GP and KNN. Results show that models using five physicochemical descriptors exhibit better performance than those with three features. The best classification result was obtained by using the SVM method without dealing with imbalanced data. Following over-sampling, GP gives the best result. It correctly assigned 8 of the 12 "good" (ER>10) enhancers and 56 of the 59 "poor" enhancers (ER<10). Overall success rates were similar. However, the pharmaceutical advantages of the Machine Learning methods are that they can provide more accurate classification of enhancer type with fewer false-positive results and that, unlike discriminant analysis, they are able to make predictions of enhancer ability.
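
    The over-sampling step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation; the function name, the choice of k, and the two-descriptor toy points are all hypothetical and carry no chemical meaning.

```python
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical "good enhancer" samples described by two made-up descriptors.
good_enhancers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_oversample(good_enhancers, n_new=5)
```

    Because the synthetic points are interpolations, they stay inside the region spanned by the existing minority samples, which is why the technique is applied to the training set only.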

  15. A New Approach to Precision Design for Machine Tools

    NASA Astrophysics Data System (ADS)

    Li, Baodong; Jiao, Aisheng; Yi, Xiangbin; Xu, Yanwei

    Precision of the NC axes is an important aspect of machine tool design. Conventionally, the precision specification of machine tools is empirically determined, resulting in poor designs with insufficient or excessive precision. To provide a cost-effective precision specification for machine tools, an active precision design approach is proposed to generate the specification of the positioning repeatability of NC axes to meet the designated working precision requirements of the machine tools. Finally, the approach is demonstrated and validated through a case study of precision design for a gear milling machine.

  16. Memristor models for machine learning.

    PubMed

    Carbajal, Juan Pablo; Dambre, Joni; Hermans, Michiel; Schrauwen, Benjamin

    2015-03-01

    In the quest for alternatives to traditional complementary metal-oxide-semiconductor, it is being suggested that digital computing efficiency and power can be improved by matching the precision to the application. Many applications do not need the high precision that is being used today. In particular, large gains in area and power efficiency could be achieved by dedicated analog realizations of approximate computing engines. In this work we explore the use of memristor networks for analog approximate computation, based on a machine learning framework called reservoir computing. Most experimental investigations on the dynamics of memristors focus on their nonvolatile behavior. Hence, the volatility that is present in the developed technologies is usually unwanted and is not included in simulation models. In contrast, in reservoir computing, volatility is not only desirable but necessary. Therefore, in this work, we propose two different ways to incorporate it into memristor simulation models. The first is an extension of Strukov's model, and the second is an equivalent Wiener model approximation. We analyze and compare the dynamical properties of these models and discuss their implications for the memory and the nonlinear processing capacity of memristor networks. Our results indicate that device variability, increasingly causing problems in traditional computer design, is an asset in the context of reservoir computing. We conclude that although both models could lead to useful memristor-based reservoir computing systems, their computational performance will differ. Therefore, experimental modeling research is required for the development of accurate volatile memristor models.

  17. Machine Learning and Cosmological Simulations

    NASA Astrophysics Data System (ADS)

    Kamdar, Harshil; Turk, Matthew; Brunner, Robert

    2016-01-01

    We explore the application of machine learning (ML) to the problem of galaxy formation and evolution in a hierarchical universe. Our motivations are two-fold: (1) presenting a new, promising technique to study galaxy formation, and (2) quantitatively evaluating the extent of the influence of dark matter halo properties on small-scale structure formation. For our analyses, we use both semi-analytical models (Millennium simulation) and N-body + hydrodynamical simulations (Illustris simulation). The ML algorithms are trained on important dark matter halo properties (inputs) and galaxy properties (outputs). The trained models are able to robustly predict the gas mass, stellar mass, black hole mass, star formation rate, $g-r$ color, and stellar metallicity. Moreover, the ML simulated galaxies obey fundamental observational constraints implying that the population of ML predicted galaxies is physically and statistically robust. Next, ML algorithms are trained on an N-body + hydrodynamical simulation and applied to an N-body only simulation (Dark Sky simulation, Illustris Dark), populating this new simulation with galaxies. We can examine how structure formation changes with different cosmological parameters and are able to mimic a full-blown hydrodynamical simulation in a computation time that is orders of magnitude smaller. We find that the set of ML simulated galaxies in Dark Sky obey the same observational constraints, further solidifying ML's place as an intriguing and promising technique in future galaxy formation studies and rapid mock galaxy catalog creation.

  18. 13. TOOL ROOM SHOWING W. ROBERTSON MACHINE & FOUNDRY CO. ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    13. TOOL ROOM SHOWING W. ROBERTSON MACHINE & FOUNDRY CO. NO. 5 POWER HACKSAW (FOREGROUND) AND WELLS METAL BAND SAW (BACKGROUND). VIEW SOUTHEAST - Oldman Boiler Works, Office/Machine Shop, 32 Illinois Street, Buffalo, Erie County, NY

  19. Machine Learning Through Signature Trees. Applications to Human Speech.

    ERIC Educational Resources Information Center

    White, George M.

    A signature tree is a binary decision tree used to classify unknown patterns. An attempt was made to develop a computer program for manipulating signature trees as a general research tool for exploring machine learning and pattern recognition. The program was applied to the problem of speech recognition to test its effectiveness for a specific…

  20. Machine Translation-Assisted Language Learning: Writing for Beginners

    ERIC Educational Resources Information Center

    Garcia, Ignacio; Pena, Maria Isabel

    2011-01-01

    The few studies that deal with machine translation (MT) as a language learning tool focus on its use by advanced learners, never by beginners. Yet, freely available MT engines (i.e. Google Translate) and MT-related web initiatives (i.e. Gabble-on.com) position themselves to cater precisely to the needs of learners with a limited command of a…

  1. Machine learning applications in genetics and genomics.

    PubMed

    Libbrecht, Maxwell W; Noble, William Stafford

    2015-06-01

    The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets. Here, we provide an overview of machine learning applications for the analysis of genome sequencing data sets, including the annotation of sequence elements and epigenetic, proteomic or metabolomic data. We present considerations and recurrent challenges in the application of supervised, semi-supervised and unsupervised machine learning methods, as well as of generative and discriminative modelling approaches. We provide general guidelines to assist in the selection of these machine learning methods and their practical application for the analysis of genetic and genomic data sets. PMID:25948244

  3. Interpolator for numerically controlled machine tools

    DOEpatents

    Bowers, Gary L.; Davenport, Clyde M.; Stephens, Albert E.

    1976-01-01

    A digital differential analyzer circuit is provided that depending on the embodiment chosen can carry out linear, parabolic, circular or cubic interpolation. In the embodiment for parabolic interpolations, the circuit provides pulse trains for the X and Y slide motors of a two-axis machine to effect tool motion along a parabolic path. The pulse trains are generated by the circuit in such a way that parabolic tool motion is obtained from information contained in only one block of binary input data. A part contour may be approximated by one or more parabolic arcs. Acceleration and initial velocity values from a data block are set in fixed bit size registers for each axis separately but simultaneously and the values are integrated to obtain the movement along the respective axis as a function of time. Integration is performed by continual addition at a specified rate of an integrand value stored in one register to the remainder temporarily stored in another identical size register. Overflows from the addition process are indicative of the integral. The overflow output pulses from the second integration may be applied to motors which position the respective machine slides according to a parabolic motion in time to produce a parabolic machine tool motion in space. An additional register for each axis is provided in the circuit to allow "floating" of the radix points of the integrand registers and the velocity increment to improve position accuracy and to reduce errors encountered when the acceleration integrand magnitudes are small when compared to the velocity integrands. A divider circuit is provided in the output of the circuit to smooth the output pulse spacing and prevent motor stall, because the overflow pulses produced in the binary addition process are spaced unevenly in time. The divider has the effect of passing only every nth motor drive pulse, with n being specifiable. The circuit inputs (integrands, rates, etc.) are scaled to give exactly n times the
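
    The register-and-overflow integration described in this abstract can be sketched in software. The code below is a simplified model under stated assumptions, not the patented circuit: it uses one hypothetical register width for both stages, omits the floating radix point and the output divider, and assumes the integrands stay below 2**bits so that at most one overflow occurs per tick. All names are illustrative.

```python
def dda_axis_pulses(velocity, acceleration, ticks, bits=8):
    """Simulate one axis of a two-stage digital differential analyzer.

    Stage 1 adds the acceleration integrand to a remainder register; each
    overflow increments the velocity integrand. Stage 2 adds the velocity
    integrand to a second remainder register; each overflow is one motor
    drive pulse. Returns a list with 1 for a pulse and 0 otherwise.
    """
    modulus = 1 << bits          # capacity of a fixed-size register
    accel_remainder = 0
    vel_remainder = 0
    pulses = []
    for _ in range(ticks):
        accel_remainder += acceleration
        if accel_remainder >= modulus:   # overflow of the first integration
            accel_remainder -= modulus
            velocity += 1
        vel_remainder += velocity
        if vel_remainder >= modulus:     # overflow of the second integration
            vel_remainder -= modulus
            pulses.append(1)
        else:
            pulses.append(0)
    return pulses


# X axis: constant velocity, no acceleration. Y axis: starts at rest, accelerates.
x_pulses = dda_axis_pulses(velocity=64, acceleration=0, ticks=256)
y_pulses = dda_axis_pulses(velocity=0, acceleration=128, ticks=256)
```

    In this sketch the X pulse count grows linearly with time while the Y pulse count grows roughly quadratically, so the two axes together trace an approximately parabolic path. The pulses are also unevenly spaced in time, which is the motivation the abstract gives for the output divider.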

  4. An introduction to quantum machine learning

    NASA Astrophysics Data System (ADS)

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2015-04-01

    Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers have investigated whether quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.

  5. Possibilities of Multi-Function Machining Systems as Tools for Complete Machining

    NASA Astrophysics Data System (ADS)

    Sajgalik, Michal; Czan, Andrej; Rakoci, Jozef

    2014-12-01

    This article deals with the use of a multi-function system for complete machining. It compares the use of conventional tools with multi-function system on the basis of main indicators of the quality of machining.

  6. New tools for learning.

    PubMed

    Dickinson, D

    1999-01-01

    In the last twenty-five years more has been learned about the human brain than in the past history of mankind. Through the use of new technologies such as PET and CAT scans and functional MRIs, it is now possible to see and learn much about the human brain while it is in the process of thinking. The research of neuroscientists, such as Marian Diamond, has demonstrated that the brain changes physiologically as a result of learning and experience--for better or worse--and that plasticity can continue throughout the lifespan. It appears that there are particular kinds of environments that are most conducive to the development of good mental equipment. They are positive, nurturing, stimulating, and encourage action and interaction. Many of the most effective schools and training programs have created such high-challenge low-threat environments. It is also very clear that intelligence is not a static structure, but an open, dynamic system that can continue to develop throughout life. This understanding is being utilized not only in school systems but in the workplace, where training programs show that even at the adult level people are able to develop their intelligence more fully. Corporations such as Motorola have implemented programs in which they are training their employees, managers, and executives to think, problem-solve and create more effectively using strategies developed by such educational innovators as Reuven Feuerstein, J.P. Guilford, and Edward de Bono. A most recent development is in the new kinds of technology that make it possible for people to take responsibility for their own learning as they access and process information through the internet, communicate with experts anywhere in the world, and use software that facilitates higher order thinking and problem-solving. Computers are in no way replacing teachers, but rather these new tools allow them to spend more time being facilitators, mentors, and guides. As a result, teachers and students are able

  9. Machine learning: Trends, perspectives, and prospects.

    PubMed

    Jordan, M I; Mitchell, T M

    2015-07-17

    Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing. PMID:26185243

  10. Tool force evaluation of lathe machined high explosives

    SciTech Connect

    Flowers, G.L.

    1980-04-01

    The purpose of this study was to develop a better understanding of the effects of machining properties upon tool forces encountered during lathe machining of high explosives, in order to optimize machining conditions for mechanical properties test specimens. Monetary considerations dictated that the tooling either already exist or be fabricated in-house using limited machine shop capability. The design chosen which fit between the tool holder and the tool post and interfaced to existing signal conditioners was easily fabricated. The study evaluated all forces on the cutter during machining of two types of high explosives at four cutter radii, four feed rates, three depths of cut and two cutting speeds. The study pointed out design problems, instrumentation drift, tool chatter and detection levels. It also showed that the type of high explosive was more significant than first thought toward influencing tool force levels.

  11. Information Model for Machine-Tool-Performance Tests

    PubMed Central

    Lee, Y. Tina; Soons, Johannes A.; Donmez, M. Alkan

    2001-01-01

    This report specifies an information model of machine-tool-performance tests in the EXPRESS [1] language. The information model provides a mechanism for describing the properties and results of machine-tool-performance tests. The objective of the information model is a standardized, computer-interpretable representation that allows for efficient archiving and exchange of performance test data throughout the life cycle of the machine. The report also demonstrates the implementation of the information model using three different implementation methods. PMID:27500031

  12. Linear positioning laser calibration setup of CNC machine tools

    NASA Astrophysics Data System (ADS)

    Sui, Xiulin; Yang, Congjing

    2002-10-01

    The linear positioning laser calibration setup of CNC machine tools is capable of executing machine tool laser calibration and backlash compensation. Using this setup, hole locations on CNC machine tools will be correct and machine tool geometry will be evaluated and adjusted. Machine tool laser calibration and backlash compensation is a simple and straightforward process. First, the setup finds the stroke limits of the axis, and the laser head is brought into correct alignment. Second, the machine axis is moved to the other extreme and the laser head is aligned using rotation and elevation adjustments. Finally, the machine is moved to the start position and final alignment is verified. The stroke of the machine and the machine compensation interval dictate the amount of data required for each axis. These factors determine the amount of time required for a thorough compensation of the linear positioning accuracy. The Laser Calibrator System monitors the material temperature and the air density; this takes into consideration machine thermal growth and laser beam frequency. This linear positioning laser calibration setup can be used on CNC machine tools, CNC lathes, horizontal centers and vertical machining centers.

  13. Introduction to machine learning for brain imaging.

    PubMed

    Lemm, Steven; Blankertz, Benjamin; Dickhaus, Thorsten; Müller, Klaus-Robert

    2011-05-15

    Machine learning and pattern recognition algorithms have in the past years developed to become a workhorse in brain imaging and the computational neurosciences, as they are instrumental for mining vast amounts of neural data of ever increasing measurement precision and detecting minuscule signals from an overwhelming noise floor. They provide the means to decode and characterize task relevant brain states and to distinguish them from non-informative brain signals. While undoubtedly this machinery has helped to gain novel biological insights, it also holds the danger of potential unintentional abuse. Ideally, machine learning techniques should be usable by any non-expert; unfortunately, they typically are not. Overfitting and other pitfalls may occur and lead to spurious and nonsensical interpretation. The goal of this review is therefore to provide an accessible and clear introduction to the strengths and also the inherent dangers of machine learning usage in the neurosciences. PMID:21172442

  14. Graphite fiber reinforced structure for supporting machine tools

    DOEpatents

    Knight, Jr., Charles E.; Kovach, Louis; Hurst, John S.

    1978-01-01

    Machine tools utilized in precision machine operations require tool support structures which exhibit minimal deflection, thermal expansion and vibration characteristics. The tool support structure of the present invention is a graphite fiber reinforced composite in which layers of the graphite fibers or yarn are disposed in a 0/90° pattern and bonded together with an epoxy resin. The finished composite possesses a low coefficient of thermal expansion and a substantially greater elastic modulus, stiffness-to-weight ratio, and damping factor than a conventional steel tool support utilized in similar machining operations.

  15. Study on machining mechanism of nanotwinned CBN cutting tool

    NASA Astrophysics Data System (ADS)

    Chen, Junyun; Jin, Tianye; Wang, Jinhu; Zhao, Qingliang; Lu, Ling

    2014-08-01

    The recently developed nanotwinned cubic boron nitride (nt-CBN) with isotropic nano-sized microstructure possesses an extremely high hardness (~100 GPa HV), very large fracture toughness (>12 MPa·m^1/2) and excellent high temperature stability. Thus nt-CBN is a promising tool material to realize ultra-precision cutting of hardened steel, which is widely used in mold inserts for optical and opto-electrical mass products. In view of its poor machinability, the machining mechanism is studied in this paper. Three feasible methods, mechanical lapping, laser machining and ion beam sputtering, are applied to process nt-CBN. The results indicate that among the three methods, mechanical lapping not only achieves the highest machining accuracy, because material is removed completely in ductile mode, but also has a satisfactorily high material removal rate. Thus the mechanical lapping method is appropriate for finish machining of nt-CBN cutting tools. Moreover, the laser machining method can only be used for contour machining or rough machining of cutting tools because of its worse machined surface quality. With regard to the ion beam sputtering method, the material removal rate is too low in spite of high machining accuracy. Additionally, no phase transition was found in any machining process of nt-CBN.

  16. Modeling of cumulative tool wear in machining metal matrix composites

    SciTech Connect

    Hung, N.P.; Tan, V.K.; Oon, B.E.

    1995-12-31

    Metal matrix composites (MMCs) are notoriously known for their low machinability because of the abrasive and brittle reinforcement. Although a near-net-shape product could be produced, finish machining is still required for the final shape and dimension. The classical Taylor's tool life equation that relates tool life and cutting conditions has been traditionally used to study machinability. The turning operation is commonly used to investigate the machinability of a material; tedious and costly milling experiments have to be performed separately; while a facing test is not applicable for the Taylor's model since the facing speed varies as the tool moves radially. Collecting intensive machining data for MMCs is often difficult because of the constraints on size, cost of the material, and the availability of sophisticated machine tools. A more flexible model and machinability testing technique are, therefore, sought. This study presents and verifies new models for turning, facing, and milling operations. Different cutting conditions were utilized to assess the machinability of MMCs reinforced with silicon carbide or alumina particles. Experimental data show that tool wear does not depend on the order of different cutting speeds since abrasion is the main wear mechanism. Correlation between data for turning, milling, and facing is presented. It is more economical to rank machinability using data for facing and then to convert the data for turning and milling, if required. Subsurface damages such as work-hardened and cracked matrix alloy, and fractured and delaminated particles are discussed.
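
    For orientation, the classical Taylor relation mentioned in this abstract is usually written V * T^n = C, with cutting speed V, tool life T, and empirical constants n and C. The sketch below only illustrates that classical equation; it is not the new turning, facing, or milling models of the study, which the abstract does not state. The function names and all numeric values are hypothetical.

```python
import math


def taylor_tool_life(cutting_speed, n, c):
    """Tool life T solved from the classical Taylor equation V * T**n = C."""
    if cutting_speed <= 0 or n <= 0 or c <= 0:
        raise ValueError("cutting_speed, n and c must be positive")
    return (c / cutting_speed) ** (1.0 / n)


def fit_taylor_constants(v1, t1, v2, t2):
    """Estimate (n, C) from two (speed, tool life) observations.

    Taking logarithms of V * T**n = C at both points gives
    n = ln(v1 / v2) / ln(t2 / t1) and C = v1 * t1**n.
    """
    n = math.log(v1 / v2) / math.log(t2 / t1)
    c = v1 * t1 ** n
    return n, c


# Hypothetical data: 16 min of tool life at speed 100, 1 min at speed 200.
n, c = fit_taylor_constants(100.0, 16.0, 200.0, 1.0)
life_at_150 = taylor_tool_life(150.0, n, c)
```

    Because the relation is a straight line in log-log coordinates, two observations determine the constants; in practice more points and a least-squares fit would normally be used.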

  17. Influence of Tool Balancing in High Speed Machining

    NASA Astrophysics Data System (ADS)

    Bašovská, Klaudia; Peterka, Jozef

    2014-12-01

    High speed machining (HSM) is now considered one of the key manufacturing technologies for higher throughput and productivity. HSM uses higher rotational speeds of the spindle (40,000 min⁻¹ and higher). Increasing the spindle speed raises a number of dynamic forces. Even a small mass unbalance in the spindle and tooling generates tool vibration. Tool vibration shortens tool life and lowers the quality of the machined surface. It is necessary to minimize this vibration by balancing the tool and tool holder. The balancing process improves the mass distribution of a cutting tool and its holder, allowing the combination of the two to rotate with the minimum amount of unbalanced centrifugal force. Machining with a balanced tool provides better surface quality and accuracy and less tool and machine wear. This study focuses on the unbalance of cutting tools: definitions, balancing techniques, sources, effects, processes and machinery. The aim of this article was to examine the relationship between unbalance and tool holders used in high speed metalworking machine tools.
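
    To give a feel for why a small unbalance matters at these speeds, the sketch below evaluates the standard centrifugal force relation F = U * omega^2, where U is the unbalance (mass times eccentricity) and omega the angular speed. The function name and the 1 g·mm unbalance value are illustrative assumptions, not figures from the article.

```python
import math


def unbalance_force(unbalance_g_mm, speed_rpm):
    """Centrifugal force in newtons caused by a static unbalance.

    unbalance_g_mm: unbalance U = mass * eccentricity, in gram-millimetres.
    speed_rpm: spindle speed in revolutions per minute (min^-1).
    """
    if unbalance_g_mm < 0 or speed_rpm < 0:
        raise ValueError("unbalance and speed must be non-negative")
    unbalance_kg_m = unbalance_g_mm * 1e-6        # g*mm -> kg*m
    omega = 2.0 * math.pi * speed_rpm / 60.0      # rev/min -> rad/s
    return unbalance_kg_m * omega ** 2


# Hypothetical residual unbalance of 1 g*mm at the 40,000 min^-1 quoted above.
force_at_40k = unbalance_force(1.0, 40_000)
force_at_20k = unbalance_force(1.0, 20_000)
```

    The force grows with the square of the speed, so halving the spindle speed cuts the force from the same unbalance to a quarter.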

  18. Diagnostic Tools for Learning Organizations.

    ERIC Educational Resources Information Center

    Moilanen, Raili

    2001-01-01

    The Learning Organization Diamond Tool was designed for holistic analysis of 10 learning organization elements at the individual and organizational levels. A test in 25 Finnish organizations established validity. Comparison with existing tools showed that differences derive from their different purposes. (Contains 33 references.) (SK)

  19. Study of on-machine error identification and compensation methods for micro machine tools

    NASA Astrophysics Data System (ADS)

    Wang, Shih-Ming; Yu, Han-Jen; Lee, Chun-Yi; Chiu, Hung-Sheng

    2016-08-01

    Micro machining plays an important role in the manufacturing of miniature products which are made of various materials with complex 3D shapes and tight machining tolerance. To further improve the accuracy of a micro machining process without increasing the manufacturing cost of a micro machine tool, an effective machining error measurement method and a software-based compensation method are essential. To avoid introducing additional errors caused by the re-installment of the workpiece, the measurement and compensation method should be on-machine conducted. In addition, because the contour of a miniature workpiece machined with a micro machining process is very tiny, the measurement method should be non-contact. By integrating the image re-constructive method, camera pixel correction, coordinate transformation, the error identification algorithm, and trajectory auto-correction method, a vision-based error measurement and compensation method that can on-machine inspect the micro machining errors and automatically generate an error-corrected numerical control (NC) program for error compensation was developed in this study. With the use of the Canny edge detection algorithm and camera pixel calibration, the edges of the contour of a machined workpiece were identified and used to re-construct the actual contour of the work piece. The actual contour was then mapped to the theoretical contour to identify the actual cutting points and compute the machining errors. With the use of a moving matching window and calculation of the similarity between the actual and theoretical contour, the errors between the actual cutting points and theoretical cutting points were calculated and used to correct the NC program. With the use of the error-corrected NC program, the accuracy of a micro machining process can be effectively improved. To prove the feasibility and effectiveness of the proposed methods, micro-milling experiments on a micro machine tool were conducted, and the results

  20. Machine Learning Toolkit for Extreme Scale

    2014-03-31

    Support Vector Machines (SVM) is a popular machine learning technique, which has been applied to a wide range of domains such as science, finance, and social networks for supervised learning. MaTEx undertakes the challenge of designing a scalable parallel SVM training algorithm for large scale systems, which includes commodity multi-core machines, tightly connected supercomputers and cloud computing systems. Several techniques are proposed for improved speed and memory space usage including adaptive and aggressive elimination of samples for faster convergence, and sparse format representation of data samples. Several heuristics for earliest possible to lazy elimination of non-contributing samples are considered in MaTEx. In many cases, where an early sample elimination might result in a false positive, low overhead mechanisms for reconstruction of key data structures are proposed. The proposed algorithm and heuristics are implemented and evaluated on various publicly available datasets.

  1. Machine Learning Toolkit for Extreme Scale

    SciTech Connect

    2014-03-31

    Support Vector Machines (SVM) is a popular machine learning technique, which has been applied to a wide range of domains such as science, finance, and social networks for supervised learning. MaTEx undertakes the challenge of designing a scalable parallel SVM training algorithm for large scale systems, which includes commodity multi-core machines, tightly connected supercomputers and cloud computing systems. Several techniques are proposed for improved speed and memory space usage including adaptive and aggressive elimination of samples for faster convergence, and sparse format representation of data samples. Several heuristics for earliest possible to lazy elimination of non-contributing samples are considered in MaTEx. In many cases, where an early sample elimination might result in a false positive, low overhead mechanisms for reconstruction of key data structures are proposed. The proposed algorithm and heuristics are implemented and evaluated on various publicly available datasets.
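
    The abstract does not describe MaTEx's data structures, so the following is only a generic sketch of what a sparse sample representation can look like: each sample keeps its nonzero features as an index-to-value mapping, and the dot product used by a linear kernel touches only those entries. The function names are hypothetical and are not part of MaTEx.

```python
def to_sparse(dense):
    """Keep only the nonzero entries of a dense feature vector as {index: value}."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: value} dicts."""
    # Iterate over the shorter vector so the cost depends on the number of
    # nonzero entries rather than on the full feature dimension.
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[index] for index, value in a.items() if index in b)


x = to_sparse([0.0, 0.0, 1.5, 0.0, 2.0])
y = to_sparse([1.0, 0.0, 4.0, 0.0, 0.0])
linear_kernel_value = sparse_dot(x, y)
```

    For data sets in which most features are zero, this reduces both memory use and the per-kernel-evaluation cost, which is the general motivation for sparse formats in SVM training.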

  2. Volumetric verification of multiaxis machine tool using laser tracker.

    PubMed

    Aguado, Sergio; Samper, David; Santolaria, Jorge; Aguilar, Juan José

    2014-01-01

    This paper aims to present a method of volumetric verification in machine tools with linear and rotary axes using a laser tracker. Beyond a method for a particular machine, it presents a methodology that can be used in any machine type. Throughout this paper, the schema and kinematic model of a machine with three axes of movement, two linear and one rotational, including the measurement system and the nominal rotation matrix of the rotational axis, are presented. Using this, the machine tool volumetric error is obtained and nonlinear optimization techniques are employed to improve the accuracy of the machine tool. The verification provides a mathematical, not physical, compensation, in less time than other methods of verification by means of the indirect measurement of geometric errors of the machine from the linear and rotary axes. This paper presents an extensive study about the appropriateness and drawbacks of the regression function employed depending on the types of movement of the axes of any machine. In the same way, strengths and weaknesses of measurement methods and optimization techniques depending on the space available to place the measurement system are presented. These studies provide the most appropriate strategies to verify each machine tool taking into consideration its configuration and its available work space.

  3. Volumetric Verification of Multiaxis Machine Tool Using Laser Tracker

    PubMed Central

    Aguilar, Juan José

    2014-01-01

    This paper aims to present a method of volumetric verification in machine tools with linear and rotary axes using a laser tracker. Beyond a method for a particular machine, it presents a methodology that can be used in any machine type. Throughout this paper, the schema and kinematic model of a machine with three axes of movement, two linear and one rotational, including the measurement system and the nominal rotation matrix of the rotational axis, are presented. Using this, the machine tool volumetric error is obtained and nonlinear optimization techniques are employed to improve the accuracy of the machine tool. The verification provides a mathematical, not physical, compensation, in less time than other methods of verification by means of the indirect measurement of geometric errors of the machine from the linear and rotary axes. This paper presents an extensive study about the appropriateness and drawbacks of the regression function employed depending on the types of movement of the axes of any machine. In the same way, strengths and weaknesses of measurement methods and optimization techniques depending on the space available to place the measurement system are presented. These studies provide the most appropriate strategies to verify each machine tool taking into consideration its configuration and its available work space. PMID:25202744

  4. Using Simple Machines to Leverage Learning

    ERIC Educational Resources Information Center

    Dotger, Sharon

    2008-01-01

    What would your students say if you told them they could lift you off the ground using a block and a board? Using a simple machine, they'll find out they can, and they'll learn about work, energy, and motion in the process! In addition, this integrated lesson gives students the opportunity to investigate variables while practicing measurement…

  5. Vitrification: Machines learn to recognize glasses

    NASA Astrophysics Data System (ADS)

    Ceriotti, Michele; Vitelli, Vincenzo

    2016-05-01

    The dynamics of a viscous liquid undergo a dramatic slowdown when it is cooled to form a solid glass. Recognizing the structural changes across such a transition remains a major challenge. Machine-learning methods, similar to those Facebook uses to recognize groups of friends, have now been applied to this problem.

  6. Digital Signal Processing and Machine Learning

    NASA Astrophysics Data System (ADS)

    Li, Yuanqing; Ang, Kai Keng; Guan, Cuntai

    Any brain-computer interface (BCI) system must translate signals from the user's brain into messages or commands (see Fig. 1). Many signal processing and machine learning techniques have been developed for this signal translation, and this chapter reviews the most common ones. Although these techniques are often illustrated using electroencephalography (EEG) signals in this chapter, they are also suitable for other brain signals.

  7. AstroML: Python-powered Machine Learning for Astronomy

    NASA Astrophysics Data System (ADS)

    Vander Plas, Jake; Connolly, A. J.; Ivezic, Z.

    2014-01-01

    As astronomical data sets grow in size and complexity, automated machine learning and data mining methods are becoming an increasingly fundamental component of research in the field. The astroML project (http://astroML.org) provides a common repository for practical examples of the data mining and machine learning tools used and developed by astronomical researchers, written in Python. The astroML module contains a host of general-purpose data analysis and machine learning routines, loaders for openly-available astronomical datasets, and fast implementations of specific computational methods often used in astronomy and astrophysics. The associated website features hundreds of examples of these routines being used for analysis of real astronomical datasets, while the associated textbook provides a curriculum resource for graduate-level courses focusing on practical statistics, machine learning, and data mining approaches within Astronomical research. This poster will highlight several of the more powerful and unique examples of analysis performed with astroML, all of which can be reproduced in their entirety on any computer with the proper packages installed.

  8. Haptics-Augmented Simple-Machine Educational Tools.

    ERIC Educational Resources Information Center

    Williams, Robert L., II; Chen, Meng-Yun; Seaton, Jeffrey M.

    2003-01-01

    Describes a unique project using commercial haptic interfaces to augment the teaching of simple machines in elementary school. Suggests that the use of haptics in virtual simple-machine simulations has the potential for deeper, more engaging learning. (Contains 13 references.) (Author/YDS)

  9. Fast, Continuous Audiogram Estimation using Machine Learning

    PubMed Central

    Song, Xinyu D.; Wallace, Brittany M.; Gardner, Jacob R.; Ledbetter, Noah M.; Weinberger, Kilian Q.; Barbour, Dennis L.

    2016-01-01

    Objectives: Pure-tone audiometry has been a staple of hearing assessments for decades. Many different procedures have been proposed for measuring thresholds with pure tones by systematically manipulating intensity one frequency at a time until a discrete threshold function is determined. The authors have developed a novel nonparametric approach for estimating a continuous threshold audiogram using Bayesian estimation and machine learning classification. The objective of this study is to assess the accuracy and reliability of this new method relative to a commonly used threshold measurement technique. Design: The authors performed air conduction pure-tone audiometry on 21 participants between the ages of 18 and 90 years with varying degrees of hearing ability. Two repetitions of automated machine learning audiogram estimation and 1 repetition of conventional modified Hughson-Westlake ascending-descending audiogram estimation were acquired by an audiologist. The estimated hearing thresholds of these two techniques were compared at standard audiogram frequencies (i.e., 0.25, 0.5, 1, 2, 4, 8 kHz). Results: The two threshold estimate methods delivered very similar estimates at standard audiogram frequencies. Specifically, the mean absolute difference between estimates was 4.16 ± 3.76 dB HL. The mean absolute difference between repeated measurements of the new machine learning procedure was 4.51 ± 4.45 dB HL. These values compare favorably to those of other threshold audiogram estimation procedures. Furthermore, the machine learning method generated threshold estimates from significantly fewer samples than the modified Hughson-Westlake procedure while returning a continuous threshold estimate as a function of frequency. Conclusions: The new machine learning audiogram estimation technique produces continuous threshold audiogram estimates accurately, reliably, and efficiently, making it a strong candidate for widespread application in clinical and research audiometry. PMID

  10. Machine learning in soil classification.

    PubMed

    Bhattacharya, B; Solomatine, D P

    2006-03-01

    In a number of engineering problems, e.g. in geotechnics and petroleum engineering, intervals of measured series data (signals) must be assigned a class while maintaining the constraint of contiguity, and standard classification methods may be inadequate. Classification in this case needs the involvement of an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. Firstly, a segmentation algorithm is developed and applied to segment the measured signals. Secondly, the salient features of these segments are extracted using the boundary energy method. Based on the measured data and extracted features, classifiers are built to assign classes to the segments; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing and satisfactory results were obtained. PMID:16530382

  11. Robustness of thermal error compensation model of CNC machine tool

    NASA Astrophysics Data System (ADS)

    Lang, Xianli; Miao, Enming; Gong, Yayun; Niu, Pengcheng; Xu, Zhishang

    2013-01-01

    Thermal error is the major factor restricting the accuracy of CNC machining. Modeling accuracy is the key to thermal error compensation, which enables precision machining on CNC machine tools. Traditional thermal error compensation models mostly focus on fitting accuracy without considering the robustness of the models, which makes it difficult to put research results into practice. In this paper, a model robustness experiment is carried out at different spindle speeds on a Leaderway V-450 machine tool. Fuzzy clustering combined with grey relevance is used to select the temperature-sensitive points for thermal error. A multiple linear regression (MLR) model and a distributed lag (DL) model are fitted to the multi-batch experimental data and then analyzed for robustness; the analysis demonstrates the difference between fitting precision and prediction precision in engineering applications and provides a reference method for choosing a thermal error compensation model for CNC machine tools in practical engineering applications.

  12. Prototype classification: insights from machine learning.

    PubMed

    Graf, Arnulf B A; Bousquet, Olivier; Rätsch, Gunnar; Schölkopf, Bernhard

    2009-01-01

    We shed light on the discrimination between patterns belonging to two different classes by casting this decoding problem into a generalized prototype framework. The discrimination process is then separated into two stages: a projection stage that reduces the dimensionality of the data by projecting it on a line and a threshold stage where the distributions of the projected patterns of both classes are separated. For this, we extend the popular mean-of-class prototype classification using algorithms from machine learning that satisfy a set of invariance properties. We report a simple yet general approach to express different types of linear classification algorithms in an identical and easy-to-visualize formal framework using generalized prototypes where these prototypes are used to express the normal vector and offset of the hyperplane. We investigate non-margin classifiers such as the classical prototype classifier, the Fisher classifier, and the relevance vector machine. We then study hard and soft margin classifiers such as the support vector machine and a boosted version of the prototype classifier. Subsequently, we relate mean-of-class prototype classification to other classification algorithms by showing that the prototype classifier is a limit of any soft margin classifier and that boosting a prototype classifier yields the support vector machine. While giving novel insights into classification per se by presenting a common and unified formalism, our generalized prototype framework also provides an efficient visualization and a principled comparison of machine learning classification.
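
    As a concrete illustration of the two stages described above, the sketch below implements the classical mean-of-class prototype classifier only: the normal vector is the difference of the two class means (projection stage) and the offset places the threshold at the midpoint between them (threshold stage). It does not cover the generalized prototypes, the Fisher classifier, or the margin-based variants discussed in the paper; the function names and toy data are hypothetical.

```python
def class_mean(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    count = len(points)
    return [sum(p[i] for p in points) / count for i in range(len(points[0]))]


def fit_prototype_classifier(positives, negatives):
    """Return (w, b) of the hyperplane w.x + b = 0 between the class means."""
    mu_pos = class_mean(positives)
    mu_neg = class_mean(negatives)
    # Projection stage: the line through both prototypes.
    w = [p - q for p, q in zip(mu_pos, mu_neg)]
    # Threshold stage: decision boundary at the midpoint of the prototypes.
    midpoint = [(p + q) / 2.0 for p, q in zip(mu_pos, mu_neg)]
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy two-class data (hypothetical).
w, b = fit_prototype_classifier([(2.0, 2.0), (3.0, 3.0)],
                                [(-2.0, -2.0), (-3.0, -3.0)])
label = predict(w, b, (1.0, 0.5))
```

    Classifying a point this way is equivalent to assigning it to the nearer of the two class means in Euclidean distance.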

  13. Job Grading Standard for Machine Tool Operator, WG-3431.

    ERIC Educational Resources Information Center

    Civil Service Commission, Washington, DC. Bureau of Policies and Standards.

    The standard covers nonsupervisory work involved in the set up, adjustment, and operation of conventional machine tools to perform machining operations in the manufacture and repair of castings, forgings, or parts from raw stock made of various metals, metal alloys, and other materials. A general description of the job at both the WG-8 and WG-9…

  14. Tool simplifies machining of pipe ends for precision welding

    NASA Technical Reports Server (NTRS)

    Matus, S. T.

    1969-01-01

    Single tool prepares a pipe end for precision welding by simultaneously performing internal machining, end facing, and bevel cutting to specification standards. The machining operation requires only one milling adjustment, can be performed quickly, and produces the high quality pipe-end configurations required to ensure precision-welded joints.

  15. Drill press in foreground is one of few machine tools ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    Drill press in foreground is one of few machine tools in operating condition which is still operated occasionally for public demonstrations. - Thomas A. Edison Laboratories, Building No. 5, Main Street & Lakeside Avenue, West Orange, Essex County, NJ

  16. 27. View within machine room showing water tank, tool chest ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    27. View within machine room showing water tank, tool chest and oil/grease cans used for maintenance. (Nov. 25, 1988) - University Heights Bridge, Spanning Harlem River at 207th Street & West Harlem Road, New York County, NY

  17. Reducing tool wear when machining austenitic stainless steels

    SciTech Connect

    Magee, J.H.; Kosa, T.

    1998-07-01

    Austenitic stainless steels are considered more difficult to machine than carbon steels due to their high work hardening rate, large spread between yield and ultimate tensile strength, high toughness and ductility, and low thermal conductivity. These characteristics can result in a built-up edge or excessive tool wear during machining, especially when the cutting speed is too high. The practical solution is to lower the cutting speed until tool life reaches an acceptable level. However, lower machining speed negatively impacts productivity. Thus, in order to overcome tool wear at relatively high machining speeds for these alloys, on-going research is being performed to improve cutting fluids, develop more wear-resistant tools, and to modify stainless steels to make them less likely to cause tool wear. This paper discusses compositional modifications to the two most commonly machined austenitic stainless steels (Type 303 and 304) which reduced their susceptibility to tool wear, and allowed these grades to be machined at higher cutting speeds.

  18. Mississippi Curriculum Framework for Machine Tool Operation/Machine Shop (Program CIP: 48.0503--Machine Shop Assistant). Secondary Programs.

    ERIC Educational Resources Information Center

    Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.

    This document, which reflects Mississippi's statutory requirement that instructional programs be based on core curricula and performance-based assessment, contains outlines of the instructional units required in local instructional management plans and daily lesson plans for machine tool operation/machine shop I and II. Presented first are a…

  19. Toward a metrology for precision-machine-tool control systems

    SciTech Connect

    Pomernacki, C.L.; McCue, H.K.; Newton, L.E.

    1982-07-20

    The difficulty of determining the source of an error in the performance of the control system of a computer numerically controlled (CNC) precision machine tool is discussed and recommendations are made for error isolation using the Machine Control System Metrology Tree. These recommendations refer to types of tests for specific errors and to a possible architecture for a CNC performance tester. It is concluded that there is a need for both a control system metrology and for establishing standards of performance and testing methods for precision machine tool control systems. (LCL)

  20. Toward a metrology for precision-machine-tool control systems

    NASA Astrophysics Data System (ADS)

    Pomernacki, C. L.; McCue, H. K.; Newton, L. E.

    1982-07-01

    The difficulty of determining the source of an error in the performance of the control system of a computer numerically controlled (CNC) precision machine tool is discussed and recommendations are made for error isolation using the Machine Control System Metrology Tree. These recommendations refer to types of tests for specific errors and to a possible architecture for a CNC performance tester. It is concluded that there is a need for both a control system metrology and for establishing standards of performance and testing methods for precision machine tool control systems.

  1. Machine Learning Assessments of Soil Drying

    NASA Astrophysics Data System (ADS)

    Coopersmith, E. J.; Minsker, B. S.; Wenzel, C.; Gilmore, B. J.

    2011-12-01

    Agricultural activities require the use of heavy equipment and vehicles on unpaved farmlands. When soil conditions are wet, equipment can cause substantial damage, leaving deep ruts. In extreme cases, implements can sink and become mired, causing considerable delays and expense to extricate the equipment. Farm managers, who are often located remotely, cannot assess sites before allocating equipment, which makes it difficult to assess the conditions of many sites reliably and frequently. For example, farmers often trace serpentine paths of over one hundred miles each day to assess the overall status of various tracts of land spanning thirty, forty, or fifty miles in each direction. One means of assessing the moisture content of a field lies in the strategic positioning of remotely-monitored in situ sensors. Unfortunately, land owners are often reluctant to place sensors across their properties due to the significant monetary cost and complexity. This work aspires to overcome these limitations by modeling the process of wetting and drying statistically, remotely assessing field readiness using only information that is publicly accessible. Such data include Nexrad radar and state climate network sensors, as well as Twitter-based reports of field conditions for validation. Three algorithms (classification trees, k-nearest-neighbors, and boosted perceptrons) are deployed to deliver statistical field readiness assessments of an agricultural site located in Urbana, IL. Two of the three algorithms performed with 92-94% accuracy, with the majority of misclassifications falling within the calculated margins of error. This demonstrates the feasibility of using a machine learning framework with only public data, knowledge of system memory from previous conditions, and statistical tools to assess "readiness" without the need for real-time, on-site physical observation.
Future efforts will produce a workflow assimilating Nexrad, climate network
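
    As a rough illustration of one of the three algorithm families named in this record, the sketch below shows a plain k-nearest-neighbors classifier. It is not the authors' model: the two features (recent rainfall and days since rain), the toy training points, and the names `knn_predict` and `euclidean` are all assumptions made up for the example.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (rainfall over the last 3 days in mm, days since rain).
train = [
    ((40.0, 0.0), "not ready"),
    ((35.0, 1.0), "not ready"),
    ((25.0, 1.0), "not ready"),
    ((5.0, 4.0), "ready"),
    ((2.0, 6.0), "ready"),
    ((0.0, 7.0), "ready"),
]

print(knn_predict(train, (30.0, 1.0)))  # a recently soaked field
print(knn_predict(train, (3.0, 5.0)))   # a field that has had time to dry
```

    In practice the features would need scaling to comparable ranges before distances are computed; the toy values above skip that step for brevity.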

  2. Lathe tool bit and holder for machining fiberglass materials

    NASA Technical Reports Server (NTRS)

    Winn, L. E. (Inventor)

    1972-01-01

    A lathe tool and holder combination for machining resin impregnated fiberglass cloth laminates is described. The tool holder and tool bit combination is designed to accommodate a conventional carbide-tipped, round shank router bit as the cutting medium, and provides an infinite number of cutting angles in order to produce a true and smooth surface in the fiberglass material workpiece with every pass of the tool bit. The technique utilizes damaged router bits which ordinarily would be discarded.

  3. Diamond tool machining of materials which react with diamond

    DOEpatents

    Lundin, Ralph L.; Stewart, Delbert D.; Evans, Christopher J.

    1992-01-01

    Apparatus for the diamond machining of materials which detrimentally react with diamond cutting tools in which the cutting tool and the workpiece are chilled to very low temperatures. This chilling halts or retards the chemical reaction between the workpiece and the diamond cutting tool so that wear rates of the diamond tool on previously detrimental materials are comparable with the diamond turning of materials which do not react with diamond.

  4. Diamond tool machining of materials which react with diamond

    DOEpatents

    Lundin, R.L.; Stewart, D.D.; Evans, C.J.

    1992-04-14

    An apparatus is described for the diamond machining of materials which detrimentally react with diamond cutting tools in which the cutting tool and the workpiece are chilled to very low temperatures. This chilling halts or retards the chemical reaction between the workpiece and the diamond cutting tool so that wear rates of the diamond tool on previously detrimental materials are comparable with the diamond turning of materials which do not react with diamond. 1 figs.

  5. Survey of Machine Learning Methods for Database Security

    NASA Astrophysics Data System (ADS)

    Kamra, Ashish; Ber, Elisa

    Application of machine learning techniques to database security is an emerging area of research. In this chapter, we present a survey of various approaches that use machine learning/data mining techniques to enhance the traditional security mechanisms of databases. There are two key database security areas in which these techniques have found applications, namely, detection of SQL Injection attacks and anomaly detection for defending against insider threats. Apart from the research prototypes and tools, various third-party commercial products are also available that provide database activity monitoring solutions by profiling database users and applications. We present a survey of such products. We end the chapter with a primer on mechanisms for responding to database anomalies.

  6. Machine Learning and Geometric Technique for SLAM

    NASA Astrophysics Data System (ADS)

    Bernal-Marin, Miguel; Bayro-Corrochano, Eduardo

    This paper describes a new approach for building 3D geometric maps using a laser rangefinder, a stereo camera system, and a mathematical framework, Conformal Geometric Algebra. The use of known visual landmarks in the map helps to localize the robot accurately. A machine learning technique is used for recognition of objects in the environment. These landmarks are found using the Viola and Jones algorithm and are represented with their position in the 3D virtual map.

  7. Prototype-based models in machine learning.

    PubMed

    Biehl, Michael; Hammer, Barbara; Villmann, Thomas

    2016-01-01

    An overview is given of prototype-based models in machine learning. In this framework, observations, i.e., data, are stored in terms of typical representatives. Together with a suitable measure of similarity, the systems can be employed in the context of unsupervised and supervised analysis of potentially high-dimensional, complex datasets. We discuss basic schemes of competitive vector quantization as well as the so-called neural gas approach and Kohonen's topology-preserving self-organizing map. Supervised learning in prototype systems is exemplified in terms of learning vector quantization. Most frequently, the familiar Euclidean distance serves as a dissimilarity measure. We present extensions of the framework to nonstandard measures and give an introduction to the use of adaptive distances in relevance learning.
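
    The learning vector quantization scheme mentioned in this record can be sketched in a few lines. The version below is the textbook LVQ1 update with Euclidean distance (attract the winning prototype when its label matches the sample, repel it otherwise); the function names, learning rate, and toy data are assumptions for illustration and are not taken from the paper.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def lvq1_train(samples, prototypes, lr=0.1, epochs=20):
    """LVQ1: move the closest prototype toward or away from each sample."""
    protos = [(list(w), label) for w, label in prototypes]
    for _ in range(epochs):
        for x, y in samples:
            # Winner = prototype closest to x under Euclidean distance.
            w, label = min(protos, key=lambda p: euclidean(p[0], x))
            sign = 1.0 if label == y else -1.0  # attract on match, else repel
            for i in range(len(w)):
                w[i] += sign * lr * (x[i] - w[i])
    return protos


def lvq_predict(protos, x):
    """Return the label of the prototype nearest to x."""
    return min(protos, key=lambda p: euclidean(p[0], x))[1]


# Hypothetical two-class toy data and initial prototypes.
samples = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"),
           ((1.0, 0.9), "b"), ((0.8, 1.0), "b"), ((0.9, 0.7), "b")]
init = [((0.4, 0.4), "a"), ((0.6, 0.6), "b")]

protos = lvq1_train(samples, init)
print(lvq_predict(protos, (0.1, 0.1)), lvq_predict(protos, (0.9, 0.9)))
```

    Swapping `euclidean` for another dissimilarity function is where the nonstandard and adaptive measures discussed in the record would enter.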

  8. Scaling up: Distributed machine learning with cooperation

    SciTech Connect

    Provost, F.J.; Hennessy, D.N.

    1996-12-31

    Machine-learning methods are becoming increasingly popular for automated data analysis. However, standard methods do not scale up to massive scientific and business data sets without expensive hardware. This paper investigates a practical alternative for scaling up: the use of distributed processing to take advantage of the often dormant PCs and workstations available on local networks. Each workstation runs a common rule-learning program on a subset of the data. We first show that for commonly used rule-evaluation criteria, a simple form of cooperation can guarantee that a rule will look good to the set of cooperating learners if and only if it would look good to a single learner operating with the entire data set. We then show how such a system can further capitalize on different perspectives by sharing learned knowledge for significant reduction in search effort. We demonstrate the power of the method by learning from a massive data set taken from the domain of cellular fraud detection. Finally, we provide an overview of other methods for scaling up machine learning.

  9. Machine learning: how to get more out of HEP data and the Higgs Boson Machine Learning Challenge

    NASA Astrophysics Data System (ADS)

    Wolter, Marcin

    2015-09-01

    Multivariate techniques using machine learning algorithms have become an integral part of many High Energy Physics (HEP) data analyses. The article shows the gain in physics reach of the physics experiments due to the adoption of machine learning techniques. Rapid development in the field of machine learning in recent years is a challenge for the HEP community. The open competition for machine learning experts, the "Higgs Boson Machine Learning Challenge", shows that modern techniques developed outside HEP can significantly improve the analysis of data from HEP experiments and improve the sensitivity of searches for new particles and processes.

  10. Wearable Learning Tools.

    ERIC Educational Resources Information Center

    Bowskill, Jerry; Dyer, Nick

    1999-01-01

    Describes wearable computers, or information and communication technology devices that are designed to be mobile. Discusses how such technologies can enhance computer-mediated communications, focusing on collaborative working for learning. Describes an experimental system, MetaPark, which explores communications, data retrieval and recording, and…

  11. Scalable Machine Learning for Massive Astronomical Datasets

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Astronomy Data Centre, Canadian

    2014-01-01

    We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors, and the local outlier factor. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex datasets that wishes to extract the full scientific value from its data.

  12. Scalable Machine Learning for Massive Astronomical Datasets

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Gray, A.

    2014-04-01

    We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors. This is likely of particular interest to the radio astronomy community given, for example, that survey projects contain groups dedicated to this topic. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex

  13. Hard turning micro-machine tool

    DOEpatents

    DeVor, Richard E; Adair, Kurt; Kapoor, Shiv G

    2013-10-22

    A micro-scale apparatus for supporting a tool for hard turning comprises a base, a pivot coupled to the base, an actuator coupled to the base, and at least one member coupled to the actuator at one end and rotatably coupled to the pivot at another end. A tool mount is disposed on the at least one member. The at least one member defines a first lever arm between the pivot and the tool mount, and a second lever arm between the pivot and the actuator. The first lever arm has a length that is less than a length of the second lever arm. The actuator moves the tool mount along an arc.
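
    The lever geometry described in this record lends itself to a short worked calculation. Assuming the member is rigid and rotates about the pivot, every point on it sweeps the same angle, so arc travel scales with distance from the pivot. The function name and the numbers below are hypothetical; the record itself gives no dimensions.

```python
def tool_mount_travel(actuator_travel, first_arm, second_arm):
    """Arc travel at the tool mount for a given arc travel at the actuator.

    first_arm:  pivot-to-tool-mount length (the shorter arm in the record)
    second_arm: pivot-to-actuator length
    All lengths must share one unit; the result is in that unit.
    """
    if not 0 < first_arm < second_arm:
        raise ValueError("expected 0 < first_arm < second_arm")
    angle = actuator_travel / second_arm  # rotation in radians (arc = r * theta)
    return first_arm * angle


# Hypothetical numbers: 0.010 mm of actuator travel, 5 mm and 50 mm arms.
travel = tool_mount_travel(0.010, 5.0, 50.0)
print(f"{travel * 1000:.3f} micrometres at the tool mount")
```

    Under these assumptions the shorter first arm scales the actuator motion down by the ratio of the arm lengths, which is one plausible reading of why the record specifies that the first arm is the shorter one.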

  14. Process Damping and Cutting Tool Geometry in Machining

    NASA Astrophysics Data System (ADS)

    Taylor, C. M.; Sims, N. D.; Turner, S.

    2011-12-01

    Regenerative vibration, or chatter, limits the performance of machining processes. Consequences of chatter include tool wear and poor machined surface finish. Process damping by tool-workpiece contact can reduce chatter effects and improve productivity. Process damping occurs when the flank (also known as the relief face) of the cutting tool makes contact with waves on the workpiece surface, created by chatter motion. Tool edge features can act to increase the damping effect. This paper examines how a tool's edge condition combines with the relief angle to affect process damping. An analytical model of cutting with chatter leads to a two-section curve describing how process damped vibration amplitude changes with surface speed for radiussed tools. The tool edge dominates the process damping effect at the lowest surface speeds, with the flank dominating at higher speeds. A similar curve is then proposed regarding tools with worn edges. Experimental data supports the notion of the two-section curve. A rule of thumb is proposed which could be useful to machine operators, regarding tool wear and process damping. The question is addressed, should a tool of a given geometry, used for a given application, be considered as sharp, radiussed or worn regarding process damping.

  15. Finding New Perovskite Halides via Machine learning

    NASA Astrophysics Data System (ADS)

    Pilania, Ghanshyam; Balachandran, Prasanna V.; Kim, Chiho; Lookman, Turab

    2016-04-01

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach towards rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning) via building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 181 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. The trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.
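
    Two of the features this record identifies as most important, the tolerance factor and the octahedral factor, are simple functions of ionic radii. The sketch below uses the standard Goldschmidt definitions; the function names are hypothetical, and the radii are approximate Shannon values supplied here only as illustrative inputs, not data from the paper.

```python
import math


def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(r_b, r_x):
    """Octahedral factor mu = r_B / r_X."""
    return r_b / r_x


# Approximate ionic radii in angstroms for CsPbI3 (illustrative inputs only).
r_cs, r_pb, r_i = 1.88, 1.19, 2.20

t = tolerance_factor(r_cs, r_pb, r_i)
mu = octahedral_factor(r_pb, r_i)
print(f"t = {t:.3f}, mu = {mu:.3f}")  # t = 0.851, mu = 0.541
```

    In a classifier of the kind the record describes, values like `t` and `mu` would be two columns of the feature vector for each ABX3 composition.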

  16. Finding new perovskite halides via machine learning

    DOE PAGESBeta

    Pilania, Ghanshyam; Balachandran, Prasanna V.; Kim, Chiho; Lookman, Turab

    2016-04-26

    Advanced materials with improved properties have the potential to fuel future technological advancements. However, identification and discovery of these optimal materials for a specific application is a non-trivial task, because of the vastness of the chemical search space with enormous compositional and configurational degrees of freedom. Materials informatics provides an efficient approach toward rational design of new materials, via learning from known data to make decisions on new and previously unexplored compounds in an accelerated manner. Here, we demonstrate the power and utility of such statistical learning (or machine learning, henceforth referred to as ML) via building a support vector machine (SVM) based classifier that uses elemental features (or descriptors) to predict the formability of a given ABX3 halide composition (where A and B represent monovalent and divalent cations, respectively, and X is F, Cl, Br, or I anion) in the perovskite crystal structure. The classification model is built by learning from a dataset of 185 experimentally known ABX3 compounds. After exploring a wide range of features, we identify ionic radii, tolerance factor, and octahedral factor to be the most important factors for the classification, suggesting that steric and geometric packing effects govern the stability of these halides. As a result, the trained and validated models then predict, with a high degree of confidence, several novel ABX3 compositions with perovskite crystal structure.

  17. Automated system for machine tool capacity and utilization

    SciTech Connect

    Bankes, W.F.

    1986-01-01

    An automated system based on Symphony spreadsheet software has been developed to monitor machine tool utilization and capacity in a small- to medium-sized machine shop. This application compiles reports on annual machine tool requirements and use from production routing data for a shop producing over 100 different small machined parts with batch sizes ranging from 100 to 1000 parts and up to 25,000 parts per year. The operational routings for approximately 30 parts are currently stored in the system. Levels of utilization are analyzed, which aids in determining the need for additional equipment or multiple workshifts, and thereby helps balance the workload and product flow. Valuable information was compiled in a special report for layout of a new shop facility. Group technology cell arrangements of equipment were analyzed for capacity and utilization. Many Symphony spreadsheet and data base management features were used to produce this program. The final system incorporated menu systems for users unfamiliar with this spreadsheet software.

  18. Sparse extreme learning machine for classification.

    PubMed

    Bai, Zuo; Huang, Guang-Bin; Wang, Danwei; Wang, Han; Westover, M Brandon

    2014-10-01

    Extreme learning machine (ELM) was initially proposed for single-hidden-layer feedforward neural networks (SLFNs). In the hidden layer (feature mapping), nodes are randomly generated independently of training data. Furthermore, a unified ELM was proposed, providing a single framework to simplify and unify different learning methods, such as SLFNs, least square support vector machines, proximal support vector machines, and so on. However, the solution of unified ELM is dense, and thus, usually plenty of storage space and testing time are required for large-scale applications. In this paper, a sparse ELM is proposed as an alternative solution for classification, reducing storage space and testing time. In addition, unified ELM obtains the solution by matrix inversion, whose computational complexity is between quadratic and cubic with respect to the training size. It still requires plenty of training time for large-scale problems, even though it is much faster than many other traditional methods. In this paper, an efficient training algorithm is specifically developed for sparse ELM. The quadratic programming problem involved in sparse ELM is divided into a series of smallest possible sub-problems, each of which is solved analytically. Compared with SVM, sparse ELM obtains better generalization performance with much faster training speed. Compared with unified ELM, sparse ELM achieves similar generalization performance for binary classification applications, and when dealing with large-scale binary classification problems, sparse ELM realizes even faster training speed than unified ELM. PMID:25222727
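
    For orientation, the basic ELM idea that this record builds on (a randomly generated hidden layer whose output weights are found by solving a linear system) can be sketched as follows. This is the dense, ridge-regularized least-squares variant, not the sparse ELM or its quadratic-programming solver from the paper; the helper names, hyperparameters, and toy data are assumptions.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden(x, weights, biases):
    """Sigmoid activations of the random hidden layer for one input."""
    return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
            for w, b in zip(weights, biases)]


def elm_train(xs, ts, n_hidden=20, ridge=1e-3, seed=0):
    """Draw hidden weights at random, then solve for output weights only."""
    rng = random.Random(seed)
    dim = len(xs[0])
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden(x, weights, biases) for x in xs]
    # Regularized normal equations: (H^T H + ridge * I) beta = H^T t
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    htt = [sum(row[i] * t for row, t in zip(h, ts)) for i in range(n_hidden)]
    return weights, biases, solve(hth, htt)


def elm_predict(model, x):
    """Real-valued output; its sign is the predicted class."""
    weights, biases, beta = model
    return sum(b * a for b, a in zip(beta, hidden(x, weights, biases)))


# Hypothetical 1-D binary task with targets -1 and +1.
xs = [[0.0], [0.1], [0.2], [0.8], [0.9], [1.0]]
ts = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
model = elm_train(xs, ts)
print([round(elm_predict(model, x), 2) for x in xs])
```

    The output weight vector `beta` here is dense, with one entry per hidden node, which is the storage and testing cost that the record says the sparse formulation is meant to reduce.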

  19. Optoelectronic neural networks and learning machines

    SciTech Connect

    Farhat, N.H.

    1989-09-01

    Optics offers advantages in realizing the parallelism, massive interconnectivity, and plasticity required in the design and construction of large-scale optoelectronic (photonic) neurocomputers that solve optimization problems at potentially very high speeds by learning to perform mappings and associations. To elucidate these advantages, a brief neural net primer based on phase-space and energy landscape considerations is presented. This provides the basis for subsequent discussion of optoelectronic architectures and implementations with self-organization and learning ability that are configured around an optical crossbar interconnect. Stochastic learning in the context of a Boltzmann machine is then described to illustrate the flexibility of optoelectronics in performing tasks that may be difficult for electronics alone. Stochastic nets are studied to gain insight into the possible role of noise in biological neural nets. The authors describe two approaches to realizing large-scale optoelectronic neurocomputers.

  20. Discriminative clustering via extreme learning machine.

    PubMed

    Huang, Gao; Liu, Tianchi; Yang, Yan; Lin, Zhiping; Song, Shiji; Wu, Cheng

    2015-10-01

    Discriminative clustering is an unsupervised learning framework which introduces the discriminative learning rule of supervised classification into clustering. The underlying assumption is that a good partition (clustering) of the data should yield high discrimination, namely, the partitioned data can be easily classified by some classification algorithms. In this paper, we propose three discriminative clustering approaches based on Extreme Learning Machine (ELM). The first algorithm iteratively trains weighted ELM (W-ELM) classifier to gradually maximize the data discrimination. The second and third methods are both built on Fisher's Linear Discriminant Analysis (LDA); but one approach adopts alternative optimization, while the other leverages kernel k-means. We show that the proposed algorithms can be easily implemented, and yield competitive clustering accuracy on real world data sets compared to state-of-the-art clustering methods. PMID:26143036

  1. 25. VIEW OF THE MACHINE TOOL LAYOUT IN ROOMS 244 ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    25. VIEW OF THE MACHINE TOOL LAYOUT IN ROOMS 244 AND 296. MACHINES WERE USED FOR STAINLESS STEEL FABRICATION (THE J-LINE). THE ORIGINAL DRAWING HAS BEEN ARCHIVED ON MICROFILM. THE DRAWING WAS REPRODUCED AT THE BEST QUALITY POSSIBLE. LETTERS AND NUMBERS IN THE CIRCLES INDICATE FOOTER AND/OR COLUMN LOCATIONS. - Rocky Flats Plant, General Manufacturing, Support, Records-Central Computing, Southern portion of Plant, Golden, Jefferson County, CO

  2. Entanglement-based machine learning on a quantum computer.

    PubMed

    Cai, X-D; Wu, D; Su, Z-E; Chen, M-C; Wang, X-L; Li, Li; Liu, N-L; Lu, C-Y; Pan, J-W

    2015-03-20

    Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv:1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning. PMID:25839250

  3. Entanglement-based machine learning on a quantum computer.

    PubMed

    Cai, X-D; Wu, D; Su, Z-E; Chen, M-C; Wang, X-L; Li, Li; Liu, N-L; Lu, C-Y; Pan, J-W

    2015-03-20

    Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv:1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.

  4. Entanglement-Based Machine Learning on a Quantum Computer

    NASA Astrophysics Data System (ADS)

    Cai, X.-D.; Wu, D.; Su, Z.-E.; Chen, M.-C.; Wang, X.-L.; Li, Li; Liu, N.-L.; Lu, C.-Y.; Pan, J.-W.

    2015-03-01

    Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv:1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.

  5. Machine learning: An artificial intelligence approach. Vol. II

    SciTech Connect

    Michalski, R.S.; Carbonell, J.G.; Mitchell, T.M.

    1986-01-01

    This book reflects the expansion of machine learning research through presentation of recent advances in the field. The book provides an account of current research directions. Major topics covered include the following: learning concepts and rules from examples; cognitive aspects of learning; learning by analogy; learning by observation and discovery; and an exploration of general aspects of learning.

  6. Method for producing hard-surfaced tools and machine components

    DOEpatents

    McHargue, Carl J.

    1985-01-01

    In one aspect, the invention comprises a method for producing tools and machine components having superhard crystalline-ceramic work surfaces. Broadly, the method comprises two steps: A tool or machine component having a ceramic near-surface region is mounted in ion-implantation apparatus. The region then is implanted with metal ions to form, in the region, a metastable alloy of the ions and said ceramic. The region containing the alloy is characterized by a significant increase in hardness properties, such as microhardness, fracture-toughness, and/or scratch-resistance. The resulting improved article has good thermal stability at temperatures characteristic of typical tool and machine-component uses. The method is relatively simple and reproducible.

  7. Method for producing hard-surfaced tools and machine components

    DOEpatents

    McHargue, C.J.

    1981-10-21

    In one aspect, the invention comprises a method for producing tools and machine components having superhard crystalline-ceramic work surfaces. Broadly, the method comprises two steps: a tool or machine component having a ceramic near-surface region is mounted in ion-implantation apparatus. The region then is implanted with metal ions to form, in the region, a metastable alloy of the ions and said ceramic. The region containing the alloy is characterized by a significant increase in hardness properties, such as microhardness, fracture-toughness, and/or scratch-resistance. The resulting improved article has good thermal stability at temperatures characteristic of typical tool and machine-component uses. The method is relatively simple and reproducible.

  8. Extreme Learning Machine for Multilayer Perceptron.

    PubMed

    Tang, Jiexiong; Deng, Chenwei; Huang, Guang-Bin

    2016-04-01

    Extreme learning machine (ELM) is an emerging learning algorithm for the generalized single hidden layer feedforward neural networks, of which the hidden node parameters are randomly generated and the output weights are analytically computed. However, due to its shallow architecture, feature learning using ELM may not be effective for natural signals (e.g., images/videos), even with a large number of hidden nodes. To address this issue, in this paper, a new ELM-based hierarchical learning framework is proposed for multilayer perceptron. The proposed architecture is divided into two main components: 1) self-taught feature extraction followed by supervised feature classification and 2) they are bridged by random initialized hidden weights. The novelties of this paper are as follows: 1) unsupervised multilayer encoding is conducted for feature extraction, and an ELM-based sparse autoencoder is developed via l1 constraint. By doing so, it achieves more compact and meaningful feature representations than the original ELM; 2) by exploiting the advantages of ELM random feature mapping, the hierarchically encoded outputs are randomly projected before final decision making, which leads to a better generalization with faster learning speed; and 3) unlike the greedy layerwise training of deep learning (DL), the hidden layers of the proposed framework are trained in a forward manner. Once the previous layer is established, the weights of the current layer are fixed without fine-tuning. Therefore, it has much better learning efficiency than the DL. Extensive experiments on various widely used classification data sets show that the proposed algorithm achieves better and faster convergence than the existing state-of-the-art hierarchical learning methods. Furthermore, multiple applications in computer vision further confirm the generality and capability of the proposed learning scheme. PMID:25966483

  9. Metagenomic taxonomic classification using extreme learning machines.

    PubMed

    Rasheed, Zeehasham; Rangwala, Huzefa

    2012-10-01

    Next-generation sequencing technologies have allowed researchers to determine the collective genomes of microbial communities co-existing within diverse ecological environments. Varying species abundance, length and complexities within different communities, coupled with discovery of new species, make the problem of taxonomic assignment to short DNA sequence reads extremely challenging. We have developed a new sequence composition-based taxonomic classifier using extreme learning machines referred to as TAC-ELM for metagenomic analysis. TAC-ELM uses the framework of extreme learning machines to quickly and accurately learn the weights for a neural network model. The input features consist of GC content and oligonucleotides. TAC-ELM is evaluated on two metagenomic benchmarks with sequence read lengths reflecting the traditional and current sequencing technologies. Our empirical results indicate the strength of the developed approach, which outperforms state-of-the-art taxonomic classifiers in terms of accuracy and implementation complexity. We also perform experiments that evaluate the pervasive case within metagenome analysis, where a species may not have been previously sequenced or discovered and will not exist in the reference genome databases. TAC-ELM was also combined with BLAST to show improved classification results. Code and Supplementary Results: http://www.cs.gmu.edu/~mlbio/TAC-ELM (BSD License). PMID:22849369
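
    The composition features named in the abstract above (GC content and oligonucleotide frequencies) can be illustrated with a minimal sketch. The function names, the choice of k = 2, and the example read are assumptions made for illustration; they are not taken from the TAC-ELM code.

```python
from collections import Counter
from itertools import product


def gc_content(read):
    """Fraction of bases in the read that are G or C (0.0 for an empty read)."""
    read = read.upper()
    if not read:
        return 0.0
    return (read.count("G") + read.count("C")) / len(read)


def kmer_frequencies(read, k=2):
    """Relative frequency of every possible k-mer over the alphabet ACGT."""
    read = read.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    # Windows containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


def read_features(read, k=2):
    """Feature vector: GC content followed by the 4**k k-mer frequencies."""
    return [gc_content(read)] + kmer_frequencies(read, k)


# Hypothetical short read, used only to exercise the functions.
features = read_features("ACGTGGCCAT", k=2)
```

    A vector like this (one per read) is the kind of input a composition-based classifier could be trained on; real pipelines typically use longer oligonucleotides than k = 2.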

  10. Applying Machine Learning to Star Cluster Classification

    NASA Astrophysics Data System (ADS)

    Fedorenko, Kristina; Grasha, Kathryn; Calzetti, Daniela; Mahadevan, Sridhar

    2016-01-01

    Catalogs describing populations of star clusters are essential in investigating a range of important issues, from star formation to galaxy evolution. Star cluster catalogs are typically created in a two-step process: in the first step, a catalog of sources is automatically produced; in the second step, each of the extracted sources is visually inspected by 3-to-5 human classifiers and assigned a category. Classification by humans is labor-intensive and time-consuming, thus it creates a bottleneck, and substantially slows down progress in star cluster research. We seek to automate the process of labeling star clusters (the second step) through applying supervised machine learning techniques. This will provide a fast, objective, and reproducible classification. Our data is HST (WFC3 and ACS) images of galaxies in the distance range of 3.5-12 Mpc, with a few thousand star clusters already classified by humans as a part of the LEGUS (Legacy ExtraGalactic UV Survey) project. The classification is based on 4 labels (Class 1 - symmetric, compact cluster; Class 2 - concentrated object with some degree of asymmetry; Class 3 - multiple peak system, diffuse; and Class 4 - spurious detection). We start by looking at basic machine learning methods such as decision trees. We then proceed to evaluate performance of more advanced techniques, focusing on convolutional neural networks and other Deep Learning methods. We analyze the results, and suggest several directions for further improvement.

  11. Ozone ensemble forecast with machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Mallet, Vivien; Stoltz, Gilles; Mauricette, Boris

    2009-03-01

    We apply machine learning algorithms to perform sequential aggregation of ozone forecasts. The latter rely on a multimodel ensemble built for ozone forecasting with the modeling system Polyphemus. The ensemble simulations are obtained by changes in the physical parameterizations, the numerical schemes, and the input data to the models. The simulations are carried out for summer 2001 over western Europe in order to forecast ozone daily peaks and ozone hourly concentrations. On the basis of past observations and past model forecasts, the learning algorithms produce a weight for each model. A convex or linear combination of the model forecasts is then formed with these weights. This process is repeated for each round of forecasting and is therefore called sequential aggregation. The aggregated forecasts demonstrate good results; for instance, they always show better performance than the best model in the ensemble and they even compete against the best constant linear combination. In addition, the machine learning algorithms come with theoretical guarantees with respect to their performance, that hold for all possible sequences of observations, even nonstochastic ones. Our study also demonstrates the robustness of the methods. We therefore conclude that these aggregation methods are very relevant for operational forecasts.
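
    The sequential aggregation loop described in the abstract above can be sketched as follows. The abstract does not say which learning algorithms were used, so this sketch assumes one standard choice, the exponentially weighted average forecaster with squared loss; the learning rate eta, the function names, and the toy numbers are all illustrative assumptions.

```python
import math


def weights_from_losses(cum_loss, eta):
    """Convex weights: models with smaller cumulative past loss weigh more."""
    best = min(cum_loss)  # subtract the minimum for numerical stability
    raw = [math.exp(-eta * (loss - best)) for loss in cum_loss]
    total = sum(raw)
    return [r / total for r in raw]


def aggregate_sequentially(forecasts, observations, eta=0.05):
    """forecasts[t][m] is the forecast of model m at round t.

    Returns the aggregated forecast of each round and the weights that
    would be used in the round after the last one.
    """
    cum_loss = [0.0] * len(forecasts[0])
    aggregated = []
    for round_forecasts, obs in zip(forecasts, observations):
        # Weights depend only on past observations and past forecasts.
        weights = weights_from_losses(cum_loss, eta)
        aggregated.append(sum(w * f for w, f in zip(weights, round_forecasts)))
        # The observation is revealed; update each model's squared loss.
        for m, f in enumerate(round_forecasts):
            cum_loss[m] += (f - obs) ** 2
    return aggregated, weights_from_losses(cum_loss, eta)


# Toy data (not from the study): three models with constant biases.
observations = [60.0, 62.0, 58.0, 65.0]
forecasts = [[o + 10.0, o + 1.0, o - 6.0] for o in observations]
aggregated, final_weights = aggregate_sequentially(forecasts, observations)
```

    In this toy run the weights start uniform and concentrate on the second model, which has the smallest bias. Because the weights are non-negative and sum to one, each aggregated forecast is a convex combination of the model forecasts.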

  12. Machine-learning-assisted materials discovery using failed experiments.

    PubMed

    Raccuglia, Paul; Elbert, Katherine C; Adler, Philip D F; Falk, Casey; Wenny, Malia B; Mollo, Aurelio; Zeller, Matthias; Friedler, Sorelle A; Schrier, Joshua; Norquist, Alexander J

    2016-05-01

    Inorganic-organic hybrid materials such as organically templated metal oxides, metal-organic frameworks (MOFs) and organohalide perovskites have been studied for decades, and hydrothermal and (non-aqueous) solvothermal syntheses have produced thousands of new materials that collectively contain nearly all the metals in the periodic table. Nevertheless, the formation of these compounds is not fully understood, and development of new compounds relies primarily on exploratory syntheses. Simulation- and data-driven approaches (promoted by efforts such as the Materials Genome Initiative) provide an alternative to experimental trial-and-error. Three major strategies are: simulation-based predictions of physical properties (for example, charge mobility, photovoltaic properties, gas adsorption capacity or lithium-ion intercalation) to identify promising target candidates for synthetic efforts; determination of the structure-property relationship from large bodies of experimental data, enabled by integration with high-throughput synthesis and measurement tools; and clustering on the basis of similar crystallographic structure (for example, zeolite structure classification or gas adsorption properties). Here we demonstrate an alternative approach that uses machine-learning algorithms trained on reaction data to predict reaction outcomes for the crystallization of templated vanadium selenites. We used information on 'dark' reactions--failed or unsuccessful hydrothermal syntheses--collected from archived laboratory notebooks from our laboratory, and added physicochemical property descriptions to the raw notebook information using cheminformatics techniques. We used the resulting data to train a machine-learning model to predict reaction success. When carrying out hydrothermal synthesis experiments using previously untested, commercially available organic building blocks, our machine-learning model outperformed traditional human strategies, and successfully predicted conditions

  14. Machine-learning-assisted materials discovery using failed experiments

    NASA Astrophysics Data System (ADS)

    Raccuglia, Paul; Elbert, Katherine C.; Adler, Philip D. F.; Falk, Casey; Wenny, Malia B.; Mollo, Aurelio; Zeller, Matthias; Friedler, Sorelle A.; Schrier, Joshua; Norquist, Alexander J.

    2016-05-01

    Inorganic–organic hybrid materials such as organically templated metal oxides, metal–organic frameworks (MOFs) and organohalide perovskites have been studied for decades, and hydrothermal and (non-aqueous) solvothermal syntheses have produced thousands of new materials that collectively contain nearly all the metals in the periodic table. Nevertheless, the formation of these compounds is not fully understood, and development of new compounds relies primarily on exploratory syntheses. Simulation- and data-driven approaches (promoted by efforts such as the Materials Genome Initiative) provide an alternative to experimental trial-and-error. Three major strategies are: simulation-based predictions of physical properties (for example, charge mobility, photovoltaic properties, gas adsorption capacity or lithium-ion intercalation) to identify promising target candidates for synthetic efforts; determination of the structure–property relationship from large bodies of experimental data, enabled by integration with high-throughput synthesis and measurement tools; and clustering on the basis of similar crystallographic structure (for example, zeolite structure classification or gas adsorption properties). Here we demonstrate an alternative approach that uses machine-learning algorithms trained on reaction data to predict reaction outcomes for the crystallization of templated vanadium selenites. We used information on ‘dark’ reactions—failed or unsuccessful hydrothermal syntheses—collected from archived laboratory notebooks from our laboratory, and added physicochemical property descriptions to the raw notebook information using cheminformatics techniques. We used the resulting data to train a machine-learning model to predict reaction success. When carrying out hydrothermal synthesis experiments using previously untested, commercially available organic building blocks, our machine-learning model outperformed traditional human strategies, and successfully

  15. Machine-learning-assisted materials discovery using failed experiments

    NASA Astrophysics Data System (ADS)

    Raccuglia, Paul; Elbert, Katherine C.; Adler, Philip D. F.; Falk, Casey; Wenny, Malia B.; Mollo, Aurelio; Zeller, Matthias; Friedler, Sorelle A.; Schrier, Joshua; Norquist, Alexander J.

    2016-05-01

    Inorganic-organic hybrid materials such as organically templated metal oxides, metal-organic frameworks (MOFs) and organohalide perovskites have been studied for decades, and hydrothermal and (non-aqueous) solvothermal syntheses have produced thousands of new materials that collectively contain nearly all the metals in the periodic table. Nevertheless, the formation of these compounds is not fully understood, and development of new compounds relies primarily on exploratory syntheses. Simulation- and data-driven approaches (promoted by efforts such as the Materials Genome Initiative) provide an alternative to experimental trial-and-error. Three major strategies are: simulation-based predictions of physical properties (for example, charge mobility, photovoltaic properties, gas adsorption capacity or lithium-ion intercalation) to identify promising target candidates for synthetic efforts; determination of the structure-property relationship from large bodies of experimental data, enabled by integration with high-throughput synthesis and measurement tools; and clustering on the basis of similar crystallographic structure (for example, zeolite structure classification or gas adsorption properties). Here we demonstrate an alternative approach that uses machine-learning algorithms trained on reaction data to predict reaction outcomes for the crystallization of templated vanadium selenites. We used information on ‘dark’ reactions—failed or unsuccessful hydrothermal syntheses—collected from archived laboratory notebooks from our laboratory, and added physicochemical property descriptions to the raw notebook information using cheminformatics techniques. We used the resulting data to train a machine-learning model to predict reaction success. When carrying out hydrothermal synthesis experiments using previously untested, commercially available organic building blocks, our machine-learning model outperformed traditional human strategies, and successfully predicted

  16. Smarter Instruments, Smarter Archives: Machine Learning for Tactical Science

    NASA Astrophysics Data System (ADS)

    Thompson, D. R.; Kiran, R.; Allwood, A.; Altinok, A.; Estlin, T.; Flannery, D.

    2014-12-01

    There has been a growing interest by Earth and Planetary Sciences in machine learning, visualization and cyberinfrastructure to interpret ever-increasing volumes of instrument data. Such tools are commonly used to analyze archival datasets, but they can also play a valuable real-time role during missions. Here we discuss ways that machine learning can benefit tactical science decisions during Earth and Planetary Exploration. Machine learning's potential begins at the instrument itself. Smart instruments endowed with pattern recognition can immediately recognize science features of interest. This allows robotic explorers to optimize their limited communications bandwidth, triaging science products and prioritizing the most relevant data. Smart instruments can also target their data collection on the fly, using principles of experimental design to reduce redundancy and generally improve sampling efficiency for time-limited operations. Moreover, smart instruments can respond immediately to transient or unexpected phenomena. Examples include detections of cometary plumes, terrestrial floods, or volcanism. We show recent examples of smart instruments from 2014 tests including: aircraft and spacecraft remote sensing instruments that recognize cloud contamination, field tests of a "smart camera" for robotic surface geology, and adaptive data collection by X-Ray fluorescence spectrometers. Machine learning can also assist human operators when tactical decision making is required. Terrestrial scenarios include airborne remote sensing, where the decision to re-fly a transect must be made immediately. Planetary scenarios include deep space encounters or planetary surface exploration, where the number of command cycles is limited and operators make rapid daily decisions about where next to collect measurements. Visualization and modeling can reveal trends, clusters, and outliers in new data. This can help operators recognize instrument artifacts or spot anomalies in real time

  17. Modelling of Tool Wear and Residual Stress during Machining of AISI H13 Tool Steel

    NASA Astrophysics Data System (ADS)

    Outeiro, José C.; Umbrello, Domenico; Pina, José C.; Rizzuti, Stefania

    2007-05-01

    Residual stresses can enhance or impair the ability of a component to withstand loading conditions in service (fatigue, creep, stress corrosion cracking, etc.), depending on their nature: compressive or tensile, respectively. This poses enormous problems in structural assembly as this affects the structural integrity of the whole part. In addition, tool wear issues are of critical importance in manufacturing since these affect component quality, tool life and machining cost. Therefore, prediction and control of both tool wear and the residual stresses in machining are absolutely necessary. In this work, a two-dimensional Finite Element model using an implicit Lagrangian formulation with automatic remeshing was applied to simulate the orthogonal cutting process of AISI H13 tool steel. To validate the model, the predicted and experimentally measured chip geometry, cutting forces, temperatures, tool wear and residual stresses in the machining-affected layers were compared. The proposed FE model allowed us to investigate the influence of tool geometry, cutting regime parameters and tool wear on residual stress distribution in the machined surface and subsurface of AISI H13 tool steel. The results indicate that, to reduce the magnitude of surface residual stresses, the cutting speed should be increased and the uncut chip thickness (or feed) should be reduced, and that honed tools with large cutting edge radii produce better results than chamfered tools. Moreover, increasing tool wear increases the magnitude of surface residual stresses.

  18. An Evolutionary Machine Learning Framework for Big Data Sequence Mining

    ERIC Educational Resources Information Center

    Kamath, Uday Krishna

    2014-01-01

    Sequence classification is an important problem in many real-world applications. Unlike other machine learning data, there are no "explicit" features or signals in sequence data that can help traditional machine learning algorithms learn and predict from the data. Sequence data exhibits inter-relationships in the elements that are…

  19. Patient-centered yes/no prognosis using learning machines

    PubMed Central

    König, I.R.; Malley, J.D.; Pajevic, S.; Weimar, C.; Diener, H-C.

    2009-01-01

    In the last 15 years several machine learning approaches have been developed for classification and regression. In an intuitive manner we introduce the main ideas of classification and regression trees, support vector machines, bagging, boosting and random forests. We discuss differences in the use of machine learning in the biomedical community and the computer sciences. We propose methods for comparing machines on a sound statistical basis. Data from the German Stroke Study Collaboration is used for illustration. We compare the results from learning machines to those obtained by a published logistic regression and discuss similarities and differences. PMID:19216340

  20. Finding Density Functionals with Machine Learning

    NASA Astrophysics Data System (ADS)

    Snyder, John C.; Rupp, Matthias; Hansen, Katja; Müller, Klaus-Robert; Burke, Kieron

    2012-06-01

    Machine learning is used to approximate density functionals. For the model problem of the kinetic energy of noninteracting fermions in 1D, mean absolute errors below 1 kcal/mol on test densities similar to the training set are reached with fewer than 100 training densities. A predictor identifies if a test density is within the interpolation region. Via principal component analysis, a projected functional derivative finds highly accurate self-consistent densities. The challenges for application of our method to real electronic structure problems are discussed.

  1. Multivariate Mapping of Environmental Data Using Extreme Learning Machines

    NASA Astrophysics Data System (ADS)

    Leuenberger, Michael; Kanevski, Mikhail

    2014-05-01

    In most real cases environmental data are multivariate, highly variable at several spatio-temporal scales, and are generated by nonlinear and complex phenomena. Mapping (spatial prediction of such data) is a challenging problem. Machine learning algorithms, being universal nonlinear tools, have demonstrated their efficiency in modelling of environmental spatial and space-time data (Kanevski et al. 2009). Recently, a new approach in machine learning, the Extreme Learning Machine (ELM), has gained great popularity. ELM is a fast and powerful machine learning algorithm. Developed by G.-B. Huang et al. (2006), it follows the structure of a multilayer perceptron (MLP) with a single hidden layer, i.e. a single-hidden-layer feedforward neural network (SLFN). The learning step of classical artificial neural networks, like MLP, deals with the optimization of weights and biases by using a gradient-based learning algorithm (e.g. the back-propagation algorithm). In contrast to this optimization phase, which can fall into local minima, ELM randomly generates the weights between the input layer and the hidden layer and also the biases in the hidden layer. After this initialization, it optimizes just the weight vector between the hidden layer and the output layer in a single step. The main advantage of this algorithm is the speed of the learning step. In a theoretical context and by growing the number of hidden nodes, the algorithm can learn any set of training data with zero error. To avoid overfitting, cross-validation or "true validation" (randomly splitting data into training, validation and testing subsets) is recommended in order to find an optimal number of neurons. With its universal property and solid theoretical basis, ELM is a good machine learning algorithm which can push the field forward. The present research deals with an extension of ELM to multivariate output modelling and application of ELM to the real data case study - pollution of the sediments in
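
    The ELM training step described in the abstract above (random input weights and hidden biases, then a single solve for the output weights) can be sketched in a few lines. This is a minimal illustration assuming NumPy, a tanh activation, and a least-squares solve for the output weights; the function names and the toy sine-fitting task are assumptions and are unrelated to the environmental case study.

```python
import numpy as np


def elm_fit(X, y, n_hidden, rng):
    """Train a single-hidden-layer ELM and return (W, b, beta)."""
    n_features = X.shape[1]
    # Input weights and hidden biases are drawn at random and never updated.
    W = rng.normal(size=(n_features, n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer activations, one row per sample
    # Only the output weights are fitted: least-squares solution of H @ beta = y.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the fitted output weights."""
    return np.tanh(X @ W + b) @ beta


# Toy regression problem: fit sin(x) on [-3, 3].
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X).ravel()
W, b, beta = elm_fit(X, y, n_hidden=30, rng=rng)
train_mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```

    Passing a two-dimensional y (one column per output variable) to the same least-squares solve would give a multivariate-output model. As the abstract recommends, n_hidden is best chosen on held-out data, since a low training error alone says little about overfitting.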

  2. Effective and efficient optics inspection approach using machine learning algorithms

    SciTech Connect

    Abdulla, G; Kegelmeyer, L; Liao, Z; Carr, W

    2010-11-02

    The Final Optics Damage Inspection (FODI) system automatically acquires and utilizes the Optics Inspection (OI) system to analyze images of the final optics at the National Ignition Facility (NIF). During each inspection cycle up to 1000 images acquired by FODI are examined by OI to identify and track damage sites on the optics. The process of tracking growing damage sites on the surface of an optic can be made more effective by identifying and removing signals associated with debris or reflections. The manual process to filter these false sites is daunting and time consuming. In this paper we discuss the use of machine learning tools and data mining techniques to help with this task. We describe the process to prepare a data set that can be used for training and identifying hardware reflections in the image data. In order to collect training data, the images are first automatically acquired and analyzed with existing software and then relevant features such as spatial, physical and luminosity measures are extracted for each site. A subset of these sites is 'truthed' or manually assigned a class to create training data. A supervised classification algorithm is used to test if the features can predict the class membership of new sites. A suite of self-configuring machine learning tools called 'Avatar Tools' is applied to classify all sites. To verify, we used 10-fold cross correlation and found the accuracy was above 99%. This substantially reduces the number of false alarms that would otherwise be sent for more extensive investigation.

  3. Towards Machine Learning of Motor Skills

    NASA Astrophysics Data System (ADS)

    Peters, Jan; Schaal, Stefan; Schölkopf, Bernhard

    Autonomous robots that can adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heyday of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. To do so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting.

  4. Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

    PubMed Central

    Subbulakshmi, C. V.; Deepa, S. N.

    2015-01-01

    Medical data classification is a prime data mining problem that has been discussed for a decade and has attracted several researchers around the world. Most classifiers are designed so as to learn from the data itself using a training process, because complete expert knowledge to determine classifier parameters is impracticable. This paper proposes a hybrid methodology based on a machine learning paradigm. This paradigm integrates the successful exploration mechanism called self-regulated learning capability of the particle swarm optimization (PSO) algorithm with the extreme learning machine (ELM) classifier. As a recent off-line learning method, ELM is a single-hidden layer feedforward neural network (FFNN), proved to be an excellent classifier with a large number of hidden layer neurons. In this research, PSO is used to determine the optimum set of parameters for the ELM, thus reducing the number of hidden layer neurons, and it further improves the network generalization performance. The proposed method is evaluated on five benchmark datasets of the UCI Machine Learning Repository for handling medical dataset classification. Simulation results show that the proposed approach is able to achieve good generalization performance, compared to the results of other classifiers. PMID:26491713

  5. Machine Learning for High-Throughput Stress Phenotyping in Plants.

    PubMed

    Singh, Arti; Ganapathysubramanian, Baskar; Singh, Asheesh Kumar; Sarkar, Soumik

    2016-02-01

    Advances in automated and high-throughput imaging technologies have resulted in a deluge of high-resolution images and sensor data of plants. However, extracting patterns and features from this large corpus of data requires the use of machine learning (ML) tools to enable data assimilation and feature identification for stress phenotyping. Four stages of the decision cycle in plant stress phenotyping and plant breeding activities where different ML approaches can be deployed are (i) identification, (ii) classification, (iii) quantification, and (iv) prediction (ICQP). We provide here a comprehensive overview and user-friendly taxonomy of ML tools to enable the plant community to correctly and easily apply the appropriate ML tools and best-practice guidelines for various biotic and abiotic stress traits.

  6. MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (ΔG).

    PubMed

    Mysara, Mohamed; Elhefnawi, Mahmoud; Garibaldi, Jonathan M

    2012-06-01

    The investigation of small interfering RNA (siRNA) and its posttranscriptional gene-regulation has become an extremely important research topic, both for fundamental reasons and for potential longer-term therapeutic benefits. Several factors affect the functionality of siRNA including positional preferences, target accessibility and other thermodynamic features. State of the art tools aim to optimize the selection of target siRNAs by identifying those that may have high experimental inhibition. Such tools implement artificial neural network models such as Biopredsi and ThermoComposition21, and linear regression models such as DSIR, i-Score and Scales, among others. However, all these models have limitations in performance. In this work, a new neural-network-trained siRNA scoring/efficacy prediction model was developed based on combining two existing scoring algorithms (ThermoComposition21 and i-Score), together with the whole stacking energy (ΔG), in a multi-layer artificial neural network. These three parameters were chosen after a comparative combinatorial study between five well known tools. Our developed model, 'MysiRNA', was trained on 2431 siRNA records and tested using three further datasets. MysiRNA was compared with 11 alternative existing scoring tools in an evaluation study to assess the predicted and experimental siRNA efficiency where it achieved the highest performance both in terms of correlation coefficient (R²=0.600) and receiver operating characteristics analysis (AUC=0.808), improving the prediction accuracy by up to 18% with respect to sensitivity and specificity of the best available tools. MysiRNA is a novel, freely accessible model capable of predicting siRNA inhibition efficiency with improved specificity and sensitivity. This multiclassifier approach could help improve the performance of prediction in several bioinformatics areas. The MysiRNA model, part of the MysiRNA-Designer package [1], is expected to play a key role in siRNA selection and evaluation.

  7. Classifying Structures in the ISM with Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Beaumont, Christopher; Goodman, A. A.; Williams, J. P.

    2011-01-01

    The processes which govern molecular cloud evolution and star formation often sculpt structures in the ISM: filaments, pillars, shells, outflows, etc. Because of their morphological complexity, these objects are often identified manually. Manual classification has several disadvantages; the process is subjective, not easily reproducible, and does not scale well to handle increasingly large datasets. We have explored to what extent machine learning algorithms can be trained to autonomously identify specific morphological features in molecular cloud datasets. We show that the Support Vector Machine algorithm can successfully locate filaments and outflows blended with other emission structures. When the objects of interest are morphologically distinct from the surrounding emission, this autonomous classification achieves >90% accuracy. We have developed a set of IDL-based tools to apply this technique to other datasets.

  8. Multimedia: Multi-Learning Tool.

    ERIC Educational Resources Information Center

    Farmer, Lesley S. J.

    1995-01-01

    Examines facets using multimedia to enhance learning. Highlights include product linkage and customization; flexibility for lesson plans; hypermedia authoring tools; student presentations; expense; incompatible and confusing systems; high memory demands; hardware standards for Windows and Macintosh programs; and CD-ROM products. (AEF)

  9. Evaluation as a Learning Tool

    ERIC Educational Resources Information Center

    Feinstein, Osvaldo Nestor

    2012-01-01

    Evaluation of programs or projects is often perceived as a threat. This is to a great extent related to the anticipated use of evaluation for accountability, which is often prioritized at the expense of using evaluation as a learning tool. Frequently it is argued that there is a trade-off between these two evaluation functions. An alternative…

  10. Geological applications of machine learning on hyperspectral remote sensing data

    NASA Astrophysics Data System (ADS)

    Tse, C. H.; Li, Yi-liang; Lam, Edmund Y.

    2015-02-01

    The CRISM imaging spectrometer orbiting Mars has been producing a vast amount of data in the visible to infrared wavelengths in the form of hyperspectral data cubes. These data, compared with those obtained from previous remote sensing techniques, yield an unprecedented level of detailed spectral resolution in addition to an ever-increasing level of spatial information. A major challenge brought about by the data is the burden of processing and interpreting these datasets and extracting the relevant information from them. This research approaches the challenge by exploring machine learning methods, especially unsupervised learning, to achieve cluster density estimation and classification, and ultimately to devise an efficient means of identifying minerals. A set of software tools has been constructed in Python to access and experiment with CRISM hyperspectral cubes selected from two specific Mars locations. A machine learning pipeline is proposed, and unsupervised learning methods were applied to pre-processed datasets. The resulting data clusters are compared with the published ASTER spectral library and browse data products from the Planetary Data System (PDS). The results demonstrate that this approach is capable of processing the huge amount of hyperspectral data and potentially providing guidance to scientists for more detailed studies.

  12. Measure Transformer Semantics for Bayesian Machine Learning

    NASA Astrophysics Data System (ADS)

    Borgström, Johannes; Gordon, Andrew D.; Greenberg, Michael; Margetson, James; van Gael, Jurgen

    The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define combinators for measure transformers, based on theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that has a straightforward semantics via factor graphs, data structures that enable many efficient inference algorithms. We use an existing inference engine for efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.

  13. Galaxy morphology - An unsupervised machine learning approach

    NASA Astrophysics Data System (ADS)

    Schutter, A.; Shamir, L.

    2015-09-01

    Structural properties possess valuable information about the formation and evolution of galaxies, and are important for understanding the past, present, and future universe. Here we use unsupervised machine learning methodology to analyze a network of similarities between galaxy morphological types, and automatically deduce a morphological sequence of galaxies. Application of the method to the EFIGI catalog shows that the morphological scheme produced by the algorithm is largely in agreement with the De Vaucouleurs system, demonstrating the ability of computer vision and machine learning methods to automatically profile galaxy morphological sequences. The unsupervised analysis method is based on comprehensive computer vision techniques that compute the visual similarities between the different morphological types. Rather than relying on human cognition, the proposed system deduces the similarities between sets of galaxy images in an automatic manner, and is therefore not limited by the number of galaxies being analyzed. The source code of the method is publicly available, and the protocol of the experiment is included in the paper so that the experiment can be replicated, and the method can be used to analyze user-defined datasets of galaxy images.

  14. A Fast Reduced Kernel Extreme Learning Machine.

    PubMed

    Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua

    2016-04-01

    In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to the work on Support Vector Machine (SVM) or Least Square SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on big datasets. RKELM is established based on the rigorous proof of universal learning involving reduced kernel-based SLFN. In particular, we prove that RKELM can approximate any nonlinear function accurately under the condition of support vector sufficiency. Experimental results on a wide variety of real-world small and large instance size applications in the context of binary classification, multi-class problems and regression are then reported to show that RKELM can perform at a competitive level of generalization performance compared with SVM/LS-SVM at only a fraction of the computational effort incurred. PMID:26829605
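
    The idea of using a random subset of samples as kernel centers and then solving a single linear system, rather than iterating, can be sketched as follows. This is a minimal illustration only, not the authors' RKELM code: the Gaussian kernel, the ridge term, and the names `rbf`, `solve`, `fit_reduced_kernel` and `predict` are all assumptions made for the example.

```python
import math
import random


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_reduced_kernel(X, y, n_centers, gamma=1.0, ridge=1e-3, seed=0):
    """Pick random centers (assumed stand-in for 'mapping samples'), then solve once."""
    rng = random.Random(seed)
    centers = rng.sample(X, n_centers)
    K = [[rbf(x, c, gamma) for c in centers] for x in X]  # n x m kernel matrix
    n, m = len(X), n_centers
    # Ridge-regularized normal equations: (K^T K + ridge * I) beta = K^T y.
    A = [[sum(K[i][a] * K[i][b] for i in range(n)) + (ridge if a == b else 0.0)
          for b in range(m)] for a in range(m)]
    rhs = [sum(K[i][a] * y[i] for i in range(n)) for a in range(m)]
    return centers, solve(A, rhs), gamma


def predict(model, x):
    """Evaluate the fitted kernel expansion at a single point x."""
    centers, beta, gamma = model
    return sum(b * rbf(x, c, gamma) for b, c in zip(beta, centers))
```

    The training cost here is dominated by one m x m solve, where m is the number of randomly chosen centers, which is the source of the savings the abstract describes when m is much smaller than the number of samples.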

  15. Mining the Kepler Data using Machine Learning

    NASA Astrophysics Data System (ADS)

    Walkowicz, Lucianne; Howe, A. R.; Nayar, R.; Turner, E. L.; Scargle, J.; Meadows, V.; Zee, A.

    2014-01-01

    Kepler's high cadence and incredible precision have provided an unprecedented view into stars and their planetary companions, revealing both expected and novel phenomena and systems. Due to the large number of Kepler lightcurves, the discovery of novel phenomena in particular has often been serendipitous in the course of searching for known forms of variability (for example, the discovery of the doubly pulsating elliptical binary KOI-54, originally identified by the transiting planet search pipeline). In this talk, we discuss progress on mining the Kepler data through both supervised and unsupervised machine learning, intended both to systematically search the Kepler lightcurves for rare or anomalous variability and to create a variability catalog for community use. Mining the dataset in this way also allows for a quantitative identification of anomalous variability, and so may also be used as a signal-agnostic form of optical SETI. As the Kepler data are exceptionally rich, they provide an interesting counterpoint to machine learning efforts typically performed on sparser and/or noisier survey data, and will inform similar characterization carried out on future survey datasets.

  16. Photometric Supernova Classification with Machine Learning

    NASA Astrophysics Data System (ADS)

    Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.

    2016-08-01

    Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
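
    The AUC metric quoted above can be computed directly from classifier scores. The sketch below uses the pairwise-ranking (Mann-Whitney) formulation; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y = [1, 1, 0, 1, 0, 0]            # toy labels (assumed for illustration)
    s = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]  # toy classifier scores
    print(roc_auc(y, s))
```

    A value of 1 means every positive example is scored above every negative one, which matches the abstract's statement that 1 represents perfect classification; 0.5 corresponds to random ranking.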

  17. 100. ARAIII. Operations with drilling tool used in machining of ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    100. ARA-III. Operations with drilling tool used in machining of ML-1 pressure vessel. Receptacle contains filings. July 12, 1963. Ineel photo no. 63-4456. Photographer: Benson. - Idaho National Engineering Laboratory, Army Reactors Experimental Area, Scoville, Butte County, ID

  18. Educational Resources for the Machine Tool Industry. Executive Summary.

    ERIC Educational Resources Information Center

    Texas State Technical Coll. System, Waco.

    This document describes the MASTER (Machine Tool Advanced Skills Educational Resources) program, a geographic partnership of seven of the nation's best 2-year technical and community colleges located in seven states. The project developed and disseminated a national training model for manufacturing processes and new technologies within the…

  19. A real-time tool positioning sensor for machine-tools.

    PubMed

    Ruiz, Antonio Ramon Jimenez; Rosas, Jorge Guevara; Granja, Fernando Seco; Honorato, Jose Carlos Prieto; Taboada, Jose Juan Esteve; Serrano, Vicente Mico; Jimenez, Teresa Molina

    2009-01-01

    In machining, natural oscillations and elastic, gravitational or temperature deformations remain an obstacle to guaranteeing the quality of fabricated parts. In this paper we present an optical measurement system designed to track and localize in 3D a reference retro-reflector close to the machine-tool's drill. The complete system and its components are described in detail. Several tests, some static (including impacts and rotations) and others dynamic (by executing linear and circular trajectories), were performed on two different machine tools. For the first time, a laser tracking system has been integrated into the position control loop of a machine-tool. Results indicate that oscillations and deformations close to the tool can be estimated with micrometric resolution and a bandwidth from 0 to more than 100 Hz. Therefore, this sensor opens the possibility for on-line compensation of oscillations and deformations.

  20. A Real-Time Tool Positioning Sensor for Machine-Tools

    PubMed Central

    Ruiz, Antonio Ramon Jimenez; Rosas, Jorge Guevara; Granja, Fernando Seco; Honorato, Jose Carlos Prieto; Taboada, Jose Juan Esteve; Serrano, Vicente Mico; Jimenez, Teresa Molina

    2009-01-01

    In machining, natural oscillations and elastic, gravitational or temperature deformations remain an obstacle to guaranteeing the quality of fabricated parts. In this paper we present an optical measurement system designed to track and localize in 3D a reference retro-reflector close to the machine-tool's drill. The complete system and its components are described in detail. Several tests, some static (including impacts and rotations) and others dynamic (by executing linear and circular trajectories), were performed on two different machine tools. For the first time, a laser tracking system has been integrated into the position control loop of a machine-tool. Results indicate that oscillations and deformations close to the tool can be estimated with micrometric resolution and a bandwidth from 0 to more than 100 Hz. Therefore, this sensor opens the possibility for on-line compensation of oscillations and deformations. PMID:22408472

  1. Laboratory directed research and development final report: Intelligent tools for on-machine acceptance of precision machined components

    SciTech Connect

    Christensen, N.G.; Harwell, L.D.; Hazelton, A.

    1997-02-01

    On-Machine Acceptance (OMA) is an agile manufacturing concept being developed for machine tools at SNL. The concept behind OMA is the integration of product design, fabrication, and qualification processes by using the machining center as a fabrication and inspection tool. This report documents the final results of a Laboratory Directed Research and Development effort to qualify OMA.

  2. Learning Activity Packets for Milling Machines. Unit I--Introduction to Milling Machines.

    ERIC Educational Resources Information Center

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) outlines the study activities and performance tasks covered in a related curriculum guide on milling machines. The course of study in this LAP is intended to help students learn to identify parts and attachments of vertical and horizontal milling machines, identify work-holding devices, state safety rules, and…

  3. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    PubMed

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-01

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with an accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems, could offer significant improvements in broiler health and welfare worldwide.

  4. Developing an Intelligent Diagnosis and Assessment E-Learning Tool for Introductory Programming

    ERIC Educational Resources Information Center

    Huang, Chenn-Jung; Chen, Chun-Hua; Luo, Yun-Cheng; Chen, Hong-Xin; Chuang, Yi-Ta

    2008-01-01

    Recently, many open source e-learning platforms have been offered for free on the Internet. We thus incorporate the intelligent diagnosis and assessment tool into an open software e-learning platform developed for programming language courses, wherein the proposed learning diagnosis assessment tools based on text mining and machine learning…

  5. Trends in extreme learning machines: a review.

    PubMed

    Huang, Gao; Huang, Guang-Bin; Song, Shiji; You, Keyou

    2015-01-01

    Extreme learning machine (ELM) has gained increasing interest from various research fields recently. In this review, we aim to report the current state of the theoretical research and practical advances on this subject. We first give an overview of ELM from the theoretical perspective, including the interpolation theory, universal approximation capability, and generalization ability. Then we focus on the various improvements made to ELM which further improve its stability, sparsity and accuracy under general or specific conditions. Apart from classification and regression, ELM has recently been extended for clustering, feature selection, representational learning and many other learning tasks. These newly emerging algorithms greatly expand the applications of ELM. From the implementation aspect, hardware implementation and parallel computation techniques have substantially sped up the training of ELM, making it feasible for big data processing and real-time reasoning. Due to its remarkable efficiency, simplicity, and impressive generalization performance, ELM has been applied in a variety of domains, such as biomedical engineering, computer vision, system identification, and control and robotics. In this review, we try to provide a comprehensive view of these advances in ELM together with its future perspectives. PMID:25462632

  6. Dropout Prediction in E-Learning Courses through the Combination of Machine Learning Techniques

    ERIC Educational Resources Information Center

    Lykourentzou, Ioanna; Giannoukos, Ioannis; Nikolopoulos, Vassilis; Mpardis, George; Loumos, Vassili

    2009-01-01

    In this paper, a dropout prediction method for e-learning courses, based on three popular machine learning techniques and detailed student data, is proposed. The machine learning techniques used are feed-forward neural networks, support vector machines and probabilistic ensemble simplified fuzzy ARTMAP. Since a single technique may fail to…

  7. Influence of machining parameters on cutting tool life while machining aluminum alloy fly ash composite

    NASA Astrophysics Data System (ADS)

    Rao, C. R. Prakash; Chandra, Poorna; Kiran, R.; Asha, P. B.

    2016-09-01

    Metal matrix composites containing fly ash as reinforcement are primarily preferred because these materials possess lower density and higher strength-to-weight ratio. The metal matrix composites possess a heterogeneous microstructure due to the presence of hard ceramic particles. While turning composites, the catastrophic failure of cutting tools is attributed to the presence of hard particles. Selection of optimal cutting conditions for a given machining process and grade of cutting tool is of utmost importance to enhance the tool life during turning operations. Thus the research work was aimed at the experimental investigation of the cutting tool life while machining aluminum alloy composite containing 0-15% fly ash. The experiments were carried out following the ISO 3685 standard. Carbide inserts of grade K10 and style CGGN120304 were the turning tools. The cutting speed ranged from 200 m/min to 500 m/min in steps of 100 m/min, with feeds of 0.08 and 0.16 mm/revolution and a constant depth of cut of 1.0 mm. The experimental results revealed that the K10 grade carbide insert performed better while machining the composite containing 5% filler, at all cutting speeds and a feed of 0.08 mm/revolution. The failures of the carbide tools are mainly due to notch wear, followed by built-up edge and edge chipping.

  8. Applying Machine Learning to Facilitate Autism Diagnostics: Pitfalls and Promises

    ERIC Educational Resources Information Center

    Bone, Daniel; Goodwin, Matthew S.; Black, Matthew P.; Lee, Chi-Chun; Audhkhasi, Kartik; Narayanan, Shrikanth

    2015-01-01

    Machine learning has immense potential to enhance diagnostic and intervention research in the behavioral sciences, and may be especially useful in investigations involving the highly prevalent and heterogeneous syndrome of autism spectrum disorder. However, use of machine learning in the absence of clinical domain expertise can be tenuous and lead…

  9. Large-Scale Machine Learning for Classification and Search

    ERIC Educational Resources Information Center

    Liu, Wei

    2012-01-01

    With the rapid development of the Internet, nowadays tremendous amounts of data including images and videos, up to millions or billions, can be collected for training machine learning models. Inspired by this trend, this thesis is dedicated to developing large-scale machine learning techniques for the purpose of making classification and nearest…

  10. Newton Methods for Large Scale Problems in Machine Learning

    ERIC Educational Resources Information Center

    Hansen, Samantha Leigh

    2014-01-01

    The focus of this thesis is on practical ways of designing optimization algorithms for minimizing large-scale nonlinear functions with applications in machine learning. Chapter 1 introduces the overarching ideas in the thesis. Chapters 2 and 3 are geared towards supervised machine learning applications that involve minimizing a sum of loss…

  11. Tilinglike learning in the parity machine

    NASA Astrophysics Data System (ADS)

    Biehl, Michael; Opper, Manfred

    1991-11-01

    An algorithm for the training of multilayered feedforward neural networks is presented. The strategy is very similar to the well-known tiling algorithm, yet the resulting architecture is completely different. New hidden units are added to one layer only in order to correct the errors of the previous ones; standard perceptron learning can be applied. The output of the network is given by the product of these k (±1) neurons (parity machine). In a special case with two hidden units, the capacity α_c and stability of the network can be derived exactly by means of a replica-symmetric calculation. Correlations between the two sets of couplings vanish exactly. For the case of arbitrary k, estimates of α_c are given. The asymptotic capacity per input neuron of a network trained according to the proposed algorithm is found to be α_c ~ k ln k for k → ∞ in the estimation. This is in agreement with recent analytic results for the algorithm-independent capacity of a parity machine.
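
    The output rule of a parity machine, the product of k hidden units that each take the value +1 or -1, can be illustrated with a short sketch. Only the forward pass is shown; the training procedure described in the abstract is not reproduced, and the function names and example weights are assumptions chosen for illustration.

```python
def sign(x):
    """Return +1 for non-negative input and -1 otherwise."""
    return 1 if x >= 0 else -1


def hidden_outputs(weights, x):
    """Outputs (+1 or -1) of k hidden perceptrons; weights holds k weight vectors."""
    return [sign(sum(w_i * x_i for w_i, x_i in zip(w, x))) for w in weights]


def parity_machine(weights, x):
    """Network output: the product of the k hidden-unit outputs."""
    out = 1
    for h in hidden_outputs(weights, x):
        out *= h
    return out


if __name__ == "__main__":
    # Two hidden units (k = 2), each reading one input component (assumed weights).
    w = [[1.0, 0.0], [0.0, 1.0]]
    for x in ([1, 1], [1, -1], [-1, 1], [-1, -1]):
        print(x, parity_machine(w, x))
```

    With these example weights the network returns +1 when the two inputs agree in sign and -1 when they differ, a function that a single perceptron cannot represent.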

  12. Machine Learning for Dynamical Mean Field Theory

    NASA Astrophysics Data System (ADS)

    Arsenault, Louis-Francois; Lopez-Bezanilla, Alejandro; von Lilienfeld, O. Anatole; Littlewood, P. B.; Millis, Andy

    2014-03-01

    Machine Learning (ML), an approach that infers new results from accumulated knowledge, is in use for a variety of tasks ranging from face and voice recognition to internet searching and has recently been gaining increasing importance in chemistry and physics. In this talk, we investigate the possibility of using ML to solve the equations of dynamical mean field theory which otherwise requires the (numerically very expensive) solution of a quantum impurity model. Our ML scheme requires the relation between two functions: the hybridization function describing the bare (local) electronic structure of a material and the self-energy describing the many body physics. We discuss the parameterization of the two functions for the exact diagonalization solver and present examples, beginning with the Anderson Impurity model with a fixed bath density of states, demonstrating the advantages and the pitfalls of the method. DOE contract DE-AC02-06CH11357.

  13. Learning-Oriented Instructional Development Tools.

    ERIC Educational Resources Information Center

    Merrill, M. David

    1997-01-01

    Discusses design requirements, and advantages and disadvantages of the following learner-centered instructional development tools: information containers; authoring systems; templates, models, or widgets; learning-oriented instructional development tools; and adaptive learning-oriented systems. (AEF)

  14. Application of Machine Learning to the Prediction of Vegetation Health

    NASA Astrophysics Data System (ADS)

    Burchfield, Emily; Nay, John J.; Gilligan, Jonathan

    2016-06-01

    This project applies machine learning techniques to remotely sensed imagery to train and validate predictive models of vegetation health in Bangladesh and Sri Lanka. For both locations, we downloaded and processed eleven years of imagery from multiple MODIS datasets which were combined and transformed into two-dimensional matrices. We applied a gradient boosted machines model to the lagged dataset values to forecast future values of the Enhanced Vegetation Index (EVI). The predictive power of raw spectral data MODIS products was compared across time periods and land use categories. Our models have significantly more predictive power on held-out datasets than a baseline. Though the tool was built to increase capacity to monitor vegetation health in data-scarce regions like South Asia, users may include ancillary spatiotemporal datasets relevant to their region of interest to increase predictive power and to facilitate interpretation of model results. The tool can automatically update predictions as new MODIS data are made available by NASA. The tool is particularly well-suited for decision makers interested in understanding and predicting vegetation health dynamics in countries in which environmental data are scarce and cloud cover is a significant concern.
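
    Forecasting from "lagged dataset values" means that each training row holds the preceding observations and the target is the next value. The sketch below builds such rows from a single series and scores a persistence baseline. The abstract does not say which baseline the authors used, so the persistence rule, the function names and the toy values are assumptions for illustration.

```python
def make_lagged(series, n_lags):
    """Turn a 1-D time series into (features, targets) pairs.

    Each feature row holds the n_lags values that precede the target value.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return features, targets


def persistence_mae(features, targets):
    """Mean absolute error of predicting 'next value = most recent value'."""
    errors = [abs(row[-1] - y) for row, y in zip(features, targets)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    evi = [0.2, 0.3, 0.5, 0.4, 0.6]  # toy vegetation-index series (assumed)
    X, y = make_lagged(evi, n_lags=2)
    print(X, y, persistence_mae(X, y))
```

    Any regression model, such as the gradient boosted machines named in the abstract, could then be fitted to these rows and compared with the baseline error on held-out data.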

  15. Mississippi Curriculum Framework for Machine Tool Operation/Machine Shop and Tool and Die Making Technology Cluster (Program CIP: 48.0507--Tool and Die Maker/Technologist) (Program CIP: 48.0503--Machine Shop Assistant). Postsecondary Programs.

    ERIC Educational Resources Information Center

    Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.

    This document, which is intended for use by community and junior colleges throughout Mississippi, contains curriculum frameworks for the course sequences in the machine tool operation/machine tool and tool and die making technology programs cluster. Presented in the introductory section are a framework of courses and programs, description of the…

  17. Tracking medical genetic literature through machine learning.

    PubMed

    Bornstein, Aaron T; McLoughlin, Matthew H; Aguilar, Jesus; Wong, Wendy S W; Solomon, Benjamin D

    2016-08-01

    There has been remarkable progress in identifying the causes of genetic conditions as well as understanding how changes in specific genes cause disease. Though difficult (and often superficial) to parse, an interesting tension involves emphasis on basic research aimed to dissect normal and abnormal biology versus more clearly clinical and therapeutic investigations. To examine one facet of this question and to better understand progress in Mendelian-related research, we developed an algorithm that classifies medical literature into three categories (Basic, Clinical, and Management) and conducted a retrospective analysis. We built a supervised machine learning classification model using the Azure Machine Learning (ML) Platform and analyzed the literature (1970-2014) from NCBI's Entrez Gene2Pubmed Database (http://www.ncbi.nlm.nih.gov/gene) using genes from the NHGRI's Clinical Genomics Database (http://research.nhgri.nih.gov/CGD/). We applied our model to 376,738 articles: 288,639 (76.6%) were classified as Basic, 54,178 (14.4%) as Clinical, and 24,569 (6.5%) as Management. The average classification accuracy was 92.2%. The rate of Clinical publication was significantly higher than Basic or Management. The rate of publication of article types differed significantly when divided into key eras: Human Genome Project (HGP) planning phase (1984-1990); HGP launch (1990) to publication (2001); following HGP completion to the "Next Generation" advent (2009); the era following 2009. In conclusion, in addition to the findings regarding the pace and focus of genetic progress, our algorithm produced a database that can be used in a variety of contexts including automating the identification of management-related literature. PMID:27268407

  18. Calibration of rotary joints in multi-axis machine tools

    NASA Astrophysics Data System (ADS)

    Khan, Abdul Wahid; Liu, Fei; Chen, Wuyi

    2009-05-01

    A novel technique is developed and implemented for error quantification in a rotary joint of a multi-axis machine tool by using a calibrated double ball bar (DBB) system as a working standard. This technique greatly simplifies the measurement setup and accelerates the calibration of rotary joints. In addition, it is highly economical because it reduces the complex optics and eliminates the use of various tooling, instrumentation and accessories. This methodology is capable of measuring five of the six degree-of-freedom (DOF) errors of a rotary joint by using the calibrated DBB system and a point locating fixture. The methodology is implemented on the rotary joints of a five-axis CNC machine tool. Equation solvers and an error modeling technique are implemented, and the validity of the methodology and the authenticity of the results obtained are tested through simulation in UG and Matlab software. The methodology is found to be feasible, pragmatic, simple, efficient and easy to use for error characterization of rotary joints of multi-axis machine tools.

  19. Real-time machine tool chatter identification and control system

    NASA Astrophysics Data System (ADS)

    Zhang, Shilong

    1997-05-01

    Chatter in machining processes is one of the most important factors limiting production rates. In order to suppress machine tool chatter during orthogonal cutting processes, a real-time active chatter controller is designed and implemented that is able to adapt to the continuously changing machining parameters. An electro-hydraulic servo system is used to control the movement of the cutting tool. The cutting force, workpiece acceleration, and tool displacement are measured in real time. The transfer function of the workpiece is estimated by using the cutting force and the acceleration of the workpiece. All the digital signal acquisition and processing tasks are performed by a digital signal processor (MicroStar DAP3200a/415). The digital controller is designed such that the servo/actuator dynamics are adjusted to match the workpiece dynamics to suppress chatter. To make the controller adaptive to the changing dynamics of the workpiece, a recursive least squares technique is used to identify the workpiece dynamics in real time. The estimated workpiece dynamics parameters are then used in the digital controller to calculate a new servo output, thus controlling the tool movement. Simulations show that chatter can be suppressed successfully by using this method. Experiments agree well with simulations.
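    The recursive least squares identification step mentioned in this abstract can be illustrated with a minimal sketch. It is not the controller from the record: the first-order model y[k] = a*y[k-1] + b*u[k-1], the forgetting factor, the test input, and the function names simulate_plant and rls_identify are all assumptions chosen for illustration.

```python
import math

def simulate_plant(a, b, n):
    """Toy stand-in for workpiece dynamics: y[k] = a*y[k-1] + b*u[k-1] (assumed model)."""
    # Two sinusoids give an input rich enough to identify two parameters.
    u = [math.sin(0.3 * k) + 0.5 * math.sin(1.1 * k) for k in range(n)]
    y = [0.0] * n
    for k in range(1, n):
        y[k] = a * y[k - 1] + b * u[k - 1]
    return u, y

def rls_identify(u, y, forgetting=0.98, delta=1000.0):
    """Estimate [a, b] sample by sample with recursive least squares."""
    theta = [0.0, 0.0]                    # current parameter estimates [a, b]
    P = [[delta, 0.0], [0.0, delta]]      # covariance-like matrix, large = low initial confidence
    for k in range(1, len(y)):
        phi = [y[k - 1], u[k - 1]]        # regressor built from past output and input
        Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                P[1][0] * phi[0] + P[1][1] * phi[1]]
        denom = forgetting + phi[0] * Pphi[0] + phi[1] * Pphi[1]
        gain = [Pphi[0] / denom, Pphi[1] / denom]
        err = y[k] - (theta[0] * phi[0] + theta[1] * phi[1])   # a priori prediction error
        theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
        # P <- (P - gain * phi^T P) / forgetting; phi^T P equals Pphi^T because P is symmetric.
        P = [[(P[i][j] - gain[i] * Pphi[j]) / forgetting for j in range(2)]
             for i in range(2)]
    return theta

u, y = simulate_plant(0.8, 0.5, 300)
a_hat, b_hat = rls_identify(u, y)
print(a_hat, b_hat)
```

    A forgetting factor below 1 discounts old samples, which is one common way to let the estimate follow dynamics that change during cutting.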

  20. Machine Tool User Cylindrical Die Rolling Performance Support System

    SciTech Connect

    Bohley, M.C.; Grothe, V.D.

    1998-08-06

    This project was initiated to provide the machine tool industry and the DOE a method for evaluating and educating potential users about various aspects of the cylindrical die rolling process, including: characteristics of the cylindrical die rolling processes, major productivity and material savings benefits, advantages for use in the fastener industry, production capabilities based on part parameters, and production capabilities based on machine specifications. AlliedSignal Federal Manufacturing and Technologies (ASFM and T) utilized data provided by Kinefac Corporation to develop an interactive performance support system. AlliedSignal developed one complete branch of the program and Kinefac will develop the remaining two branches. Macromedia Authorware version 3.5 and Microsoft Access version 7.0 were selected as development tools. These software tools maximize ease of continued program development and program management with future machine technology advancements. Using this authoring tool and the external database resulted in development of a product that has many potential uses within the manufacturing industry. Source code for the product can be used as a template for other applications, is reusable, and can provide potential solutions to non-manufacturing needs. The final product will be released on CD-ROM.

  1. Machine Learning in the Big Data Era: Are We There Yet?

    SciTech Connect

    Sukumar, Sreenivas Rangan

    2014-01-01

    In this paper, we discuss the machine learning challenges of the Big Data era. We observe that recent innovations in being able to collect, access, organize, integrate, and query massive amounts of data from a wide variety of data sources have brought statistical machine learning under more scrutiny and evaluation for gleaning insights from the data than ever before. In that context, we pose and debate the question - Are machine learning algorithms scaling with the ability to store and compute? If yes, how? If not, why not? We survey recent developments in the state-of-the-art to discuss emerging and outstanding challenges in the design and implementation of machine learning algorithms at scale. We leverage experience from real-world Big Data knowledge discovery projects across domains of national security and healthcare to suggest our efforts be focused along the following axes: (i) the data science challenge - designing scalable and flexible computational architectures for machine learning (beyond just data-retrieval); (ii) the science of data challenge - the ability to understand characteristics of data before applying machine learning algorithms and tools; and (iii) the scalable predictive functions challenge - the ability to construct, learn and infer with increasing sample size, dimensionality, and categories of labels. We conclude with a discussion of opportunities and directions for future research.

  2. Error compensation for thermally induced errors on a machine tool

    SciTech Connect

    Krulewich, D.A.

    1996-11-08

    Heat flow from internal and external sources and the environment create machine deformations, resulting in positioning errors between the tool and workpiece. There is no industrially accepted method for thermal error compensation. A simple model has been selected that linearly relates discrete temperature measurements to the deflection. The biggest problem is how to locate the temperature sensors and to determine the number of required temperature sensors. This research develops a method to determine the number and location of temperature measurements.
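    The linear model described in this abstract, relating discrete temperature measurements to deflection, can be sketched as an ordinary least-squares fit. The two-sensor setup, the use of temperature rises relative to a reference state, the absence of an intercept, and the name fit_thermal_model are assumptions for illustration; the record does not give the sensor count or the fitting procedure.

```python
def fit_thermal_model(dT1, dT2, deflection):
    """Fit deflection ~ c1*dT1 + c2*dT2 by solving the 2x2 normal equations."""
    s11 = sum(a * a for a in dT1)
    s12 = sum(a * b for a, b in zip(dT1, dT2))
    s22 = sum(b * b for b in dT2)
    r1 = sum(a * d for a, d in zip(dT1, deflection))
    r2 = sum(b * d for b, d in zip(dT2, deflection))
    det = s11 * s22 - s12 * s12
    if det == 0:
        # The two sensors carry the same information; one of them is redundant.
        raise ValueError("temperature readings are collinear")
    c1 = (r1 * s22 - r2 * s12) / det
    c2 = (s11 * r2 - s12 * r1) / det
    return c1, c2

# Synthetic readings (made-up numbers, not data from the record).
dT1 = [1.0, 2.0, 3.0, 4.0, 5.0]
dT2 = [2.0, 1.0, 4.0, 3.0, 6.0]
deflection = [1.5 * a - 0.5 * b for a, b in zip(dT1, dT2)]
print(fit_thermal_model(dT1, dT2, deflection))
```

    A vanishing determinant signals sensors whose readings move together, which is one reason the choice of sensor number and location matters for a model of this kind.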

  3. Assessing and comparison of different machine learning methods in parent-offspring trios for genotype imputation.

    PubMed

    Mikhchi, Abbas; Honarvar, Mahmood; Kashan, Nasser Emam Jomeh; Aminafshar, Mehdi

    2016-06-21

    Genotype imputation is an important tool for prediction of unknown genotypes for both unrelated individuals and parent-offspring trios. Several imputation methods are available and can either employ universal machine learning methods or deploy algorithms dedicated to inferring missing genotypes. In this research, the performance of eight machine learning methods (Support Vector Machine, K-Nearest Neighbors, Extreme Learning Machine, Radial Basis Function, Random Forest, AdaBoost, LogitBoost, and TotalBoost) was compared in terms of imputation accuracy, computation time and the factors affecting imputation accuracy. The methods were employed using real and simulated datasets to impute the un-typed SNPs in parent-offspring trios. The tested methods show that imputation of parent-offspring trios can be accurate. The Random Forest and Support Vector Machine were more accurate than the other machine learning methods. The TotalBoost performed slightly worse than the other methods. The running times differed between methods. The Extreme Learning Machine (ELM) was always the fastest algorithm. As the sample size increases, the Radial Basis Function (RBF) requires a long imputation time. The tested methods in this research can be an alternative for imputation of un-typed SNPs in data with a low missing rate. However, it is recommended that other machine learning methods be used for imputation.

  4. Prediction Of Abrasive And Diffusive Tool Wear Mechanisms In Machining

    NASA Astrophysics Data System (ADS)

    Rizzuti, S.; Umbrello, D.

    2011-01-01

    Tool wear prediction is regarded as a very important task in order to maximize tool performance, minimize cutting costs and improve the quality of the workpiece in cutting. In this research work, an experimental campaign was carried out under varying cutting conditions with the aim of measuring both crater and flank tool wear during machining of AISI 1045 steel with an uncoated P40 carbide tool. In parallel, a FEM-based analysis was developed in order to study the tool wear mechanisms, also taking into account the influence of the cutting conditions and the temperature reached on the tool surfaces. The results show that, when the temperature of the tool rake surface is lower than the activation temperature of the diffusive phenomenon, the wear rate can be estimated by applying an abrasive model. In contrast, in the tool area where the temperature is higher than the diffusive activation temperature, the wear rate can be evaluated by applying a diffusive model. Finally, for temperatures within the range between the above cited values, an adopted abrasive-diffusive wear model made it possible to correctly evaluate the tool wear phenomena.

  5. Decorating Cutting as New Approach to Machine Tool System Dynamics

    NASA Astrophysics Data System (ADS)

    Murcinkova, Zuzana; Vasilko, Karol

    2014-12-01

    The paper presents so-called decorating cutting, focused on turning. It uses self-excited vibrations that are typical for turning and other types of cutting operations. Decorating turning does not rely on setting unstable technological conditions of the cutting process; instead, it actively uses the action of the cutting force on the machine tool without generating unwanted chatter vibrations. A special tool fixture was developed to utilize self-excited vibrations invoked by the periodic variability of the cutting force caused by the cutting process itself. Thus a typical surface texture appears. The various macro/micro-textures of surfaces can be applied either for decorative purposes or for better retention of an oil film.

  6. Method and apparatus for characterizing and enhancing the dynamic performance of machine tools

    DOEpatents

    Barkman, William E; Babelay, Jr., Edwin F

    2013-12-17

    Disclosed are various systems and methods for assessing and improving the capability of a machine tool. The disclosure applies to machine tools having at least one slide configured to move along a motion axis. Various patterns of dynamic excitation commands are employed to drive the one or more slides, typically involving repetitive short distance displacements. A quantification of a measurable merit of machine tool response to the one or more patterns of dynamic excitation commands is typically derived for the machine tool. Examples of measurable merits of machine tool performance include dynamic one axis positional accuracy of the machine tool, dynamic cross-axis stability of the machine tool, and dynamic multi-axis positional accuracy of the machine tool.

  7. Automatic programming of binary morphological machines by PAC learning

    NASA Astrophysics Data System (ADS)

    Barrera, Junior; Tomita, Nina S.; Correa da Silva, Flavio S.; Terada, Routo

    1995-08-01

    Binary image analysis problems can be solved by set operators implemented as programs for a binary morphological machine (BMM). This is a very general and powerful approach to solve this type of problem. However, the design of these programs is not a task manageable by nonexperts on mathematical morphology. In order to overcome this difficulty we have worked on tools that help users describe their goals at higher levels of abstraction and to translate them into BMM programs. Some of these tools are based on the representation of the goals of the user as a collection of input-output pairs of images and the estimation of the target operator from these data. PAC learning is a well suited methodology for this task, since in this theory 'concepts' are represented as Boolean functions that are equivalent to set operators. In order to apply this technique in practice we must have efficient learning algorithms. In this paper we introduce two PAC learning algorithms, both are based on the minimal representation of Boolean functions, which has a straightforward translation to the canonical decomposition of set operators. The first algorithm is based on the classical Quine-McCluskey algorithm for the simplification of Boolean functions, and the second one is based on a new idea for the construction of Boolean functions: the incremental splitting of intervals. We also present a comparative complexity analysis of the two algorithms. Finally, we give some application examples.

  8. Geological Mapping Using Machine Learning Algorithms

    NASA Astrophysics Data System (ADS)

    Harvey, A. S.; Fotopoulos, G.

    2016-06-01

    Remotely sensed spectral imagery, geophysical (magnetic and gravity), and geodetic (elevation) data are useful in a variety of Earth science applications such as environmental monitoring and mineral exploration. Using these data with Machine Learning Algorithms (MLA), which are widely used in image analysis and statistical pattern recognition applications, may enhance preliminary geological mapping and interpretation. This approach contributes towards a rapid and objective means of geological mapping in contrast to conventional field expedition techniques. In this study, four supervised MLAs (naïve Bayes, k-nearest neighbour, random forest, and support vector machines) are compared in order to assess their performance for correctly identifying geological rocktypes in an area with complete ground validation information. Geological maps of the Sudbury region are used for calibration and validation. Percent of correct classifications was used as indicators of performance. Results show that random forest is the best approach. As expected, MLA performance improves with more calibration clusters, i.e. a more uniform distribution of calibration data over the study region. Performance is generally low, though geological trends that correspond to a ground validation map are visualized. Low performance may be the result of poor spectral images of bare rock which can be covered by vegetation or water. The distribution of calibration clusters and MLA input parameters affect the performance of the MLAs. Generally, performance improves with more uniform sampling, though this increases required computational effort and time. With the achievable performance levels in this study, the technique is useful in identifying regions of interest and identifying general rocktype trends. In particular, phase I geological site investigations will benefit from this approach and lead to the selection of sites for advanced surveys.

  9. Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Machine Learning

    NASA Astrophysics Data System (ADS)

    Ntampaka, M.; Trac, H.; Sutherland, D. J.; Fromenteau, S.; Póczos, B.; Schneider, J.

    2016-11-01

    We study dynamical mass measurements of galaxy clusters contaminated by interlopers and show that a modern machine learning algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create two mock catalogs from Multidark’s publicly available N-body MDPL1 simulation, one with perfect galaxy cluster membership information and the other where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power-law scaling relation to infer cluster mass from galaxy line-of-sight (LOS) velocity dispersion. Assuming perfect membership knowledge, this unrealistic case produces a wide fractional mass error distribution, with a width of Δε ≈ 0.87. Interlopers introduce additional scatter, significantly widening the error distribution further (Δε ≈ 2.13). We employ the support distribution machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement (Δε ≈ 0.67) for the contaminated case. Remarkably, SDM applied to contaminated clusters is better able to recover masses than even the scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.
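    The standard approach in this abstract, a power-law scaling relation between velocity dispersion and mass, can be sketched as a straight-line fit in log-log space. The parametrization sigma = A * M**alpha, the definition of fractional mass error as (M_pred - M_true) / M_true, and all function names and numbers below are assumptions for illustration; the record does not state them.

```python
import math

def fit_power_law(masses, sigmas):
    """Least-squares fit of log(sigma) = alpha*log(M) + log(A); returns (A, alpha)."""
    xs = [math.log(m) for m in masses]
    ys = [math.log(s) for s in sigmas]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    A = math.exp(mean_y - alpha * mean_x)
    return A, alpha

def mass_from_sigma(sigma, A, alpha):
    """Invert sigma = A * M**alpha to predict a mass from a velocity dispersion."""
    return (sigma / A) ** (1.0 / alpha)

def fractional_error(m_pred, m_true):
    """Assumed definition of the fractional mass error."""
    return (m_pred - m_true) / m_true

# Synthetic, noise-free clusters (made-up numbers, not MDPL1 data).
masses = [1e14, 2e14, 5e14, 1e15]
sigmas = [1000.0 * (m / 1e15) ** (1.0 / 3.0) for m in masses]
A, alpha = fit_power_law(masses, sigmas)
print(alpha, fractional_error(mass_from_sigma(1000.0, A, alpha), 1e15))
```

    Because mass scales as a high power of the dispersion in a relation like this, scatter in the measured dispersion is amplified in the predicted mass. This is consistent with interlopers widening the error distribution.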

  10. Dynamical Mass Measurements of Contaminated Galaxy Clusters Using Machine Learning

    NASA Astrophysics Data System (ADS)

    Ntampaka, Michelle; Trac, Hy; Sutherland, Dougal; Fromenteau, Sebastien; Poczos, Barnabas; Schneider, Jeff

    2016-01-01

    Galaxy clusters are a rich source of information for examining fundamental astrophysical processes and cosmological parameters, however, employing clusters as cosmological probes requires accurate mass measurements derived from cluster observables. We study dynamical mass measurements of galaxy clusters contaminated by interlopers, and show that a modern machine learning (ML) algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create a mock catalog from Multidark's publicly-available N-body MDPL1 simulation where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power law scaling relation to infer cluster mass from galaxy line of sight (LOS) velocity dispersion. The presence of interlopers in the catalog produces a wide, flat fractional mass error distribution, with width = 2.13. We employ the Support Distribution Machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement (width = 0.67). Remarkably, SDM applied to contaminated clusters is better able to recover masses than even a scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.

  11. Large-scale machine learning for metagenomics sequence classification

    PubMed Central

    Vervier, Kévin; Mahé, Pierre; Tournoud, Maud; Veyrieras, Jean-Baptiste; Vert, Jean-Philippe

    2016-01-01

    Motivation: Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions. Results: We propose a new rank-flexible machine learning-based compositional approach for taxonomic assignment of metagenomics reads and show that it benefits from increasing the number of fragments sampled from reference genome to tune its parameters, up to a coverage of about 10, and from increasing the k-mer size to about 12. Tuning the method involves training machine learning models on about 10^8 samples in 10^7 dimensions, which is out of reach of standard software but can be done efficiently with modern implementations for large-scale machine learning. The resulting method is competitive in terms of accuracy with well-established alignment and composition-based tools for problems involving a small to moderate number of candidate species and for reasonable amounts of sequencing errors. We show, however, that machine learning-based compositional approaches are still limited in their ability to deal with problems involving a greater number of species and more sensitive to sequencing errors.
We finally show that the new method outperforms the state-of-the-art in its ability to classify reads from species of lineage absent from the reference database and confirm that compositional approaches achieve faster prediction times, with a gain of 2–17 times with respect to the BWA-MEM short read mapper, depending
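    The compositional representation described in this abstract, where a read is characterized by the k-mers it contains, can be sketched as a simple counting step. This is only the feature-extraction idea, not the authors' classifier; the function name kmer_profile and the choice to skip windows containing ambiguous bases are assumptions.

```python
from collections import Counter

def kmer_profile(read, k):
    """Count every overlapping k-mer in a DNA read, skipping windows with non-ACGT symbols."""
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

profile = kmer_profile("ACGTAC", 2)
print(profile)

# Size of the feature space when every k-mer is its own dimension.
print(4 ** 12)
```

    With four nucleotides there are 4**k possible k-mers, so k = 12 gives 16,777,216 possible features. Profiles are therefore normally stored sparsely, as the Counter does here.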

  12. Distributed machine learning: Scaling up with coarse-grained parallelism

    SciTech Connect

    Provost, F.J.; Hennessy, D.N.

    1994-12-31

    Machine learning methods are becoming accepted as additions to the biologist's data-analysis tool kit. However, scaling these techniques up to large data sets, such as those in biological and medical domains, is problematic in terms of both the required computational search effort and required memory (and the detrimental effects of excessive swapping). Our approach to tackling the problem of scaling up to large datasets is to take advantage of the ubiquitous workstation networks that are generally available in scientific and engineering environments. This paper introduces the notion of the invariant-partitioning property: for certain evaluation criteria it is possible to partition a data set across multiple processors such that any rule that is satisfactory over the entire data set will also be satisfactory on at least one subset. In addition, by taking advantage of cooperation through interprocess communication, it is possible to build distributed learning algorithms such that only rules that are satisfactory over the entire data set will be learned. We describe a distributed learning system, CorPRL, that takes advantage of the invariant-partitioning property to learn from very large data sets, and present results demonstrating CorPRL's effectiveness in analyzing data from two databases.
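    The invariant-partitioning property stated in this abstract can be checked on a toy example. The evaluation criterion used here (the fraction of covered examples that are positive must reach a threshold), the round-robin partitioning, and all names are assumptions for illustration; the record does not say which criteria CorPRL uses.

```python
def precision(rule, examples):
    """Fraction of the examples covered by `rule` that are positive; None if nothing is covered."""
    covered = [label for features, label in examples if rule(features)]
    return sum(covered) / len(covered) if covered else None

def partition(examples, n_parts):
    """Split the data round-robin, as if distributing it across n_parts processors."""
    return [examples[i::n_parts] for i in range(n_parts)]

def satisfactory_somewhere(rule, examples, n_parts, threshold):
    """True if the rule meets the threshold on at least one subset."""
    scores = [precision(rule, part) for part in partition(examples, n_parts)]
    return any(s is not None and s >= threshold for s in scores)

# Toy data: the feature is an integer, the label marks multiples of three.
examples = [(x, 1 if x % 3 == 0 else 0) for x in range(40)]

def rule(x):
    return x > 10

overall = precision(rule, examples)
print(overall, satisfactory_somewhere(rule, examples, 4, overall))
```

    For this criterion the property follows from the overall ratio being a weighted average of the per-subset ratios, so at least one subset must score at or above it. The converse does not hold, which is why the abstract adds cooperation between processes to reject rules that only look good locally.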

  13. Studying depression using imaging and machine learning methods.

    PubMed

    Patel, Meenal J; Khalaf, Alexander; Aizenstein, Howard J

    2016-01-01

    Depression is a complex clinical entity that can pose challenges for clinicians regarding both accurate diagnosis and effective timely treatment. These challenges have prompted the development of multiple machine learning methods to help improve the management of this disease. These methods utilize anatomical and physiological data acquired from neuroimaging to create models that can identify depressed patients vs. non-depressed patients and predict treatment outcomes. This article (1) presents a background on depression, imaging, and machine learning methodologies; (2) reviews methodologies of past studies that have used imaging and machine learning to study depression; and (3) suggests directions for future depression-related studies.

  14. Image Segmentation for Connectomics Using Machine Learning

    SciTech Connect

    Tasdizen, Tolga; Seyedhosseini, Mojtaba; Liu, Ting; Jones, Cory; Jurrus, Elizabeth R.

    2014-12-01

    Reconstruction of neural circuits at the microscopic scale of individual neurons and synapses, also known as connectomics, is an important challenge for neuroscience. While an important motivation of connectomics is providing anatomical ground truth for neural circuit models, the ability to decipher neural wiring maps at the individual cell level is also important in studies of many neurodegenerative diseases. Reconstruction of a neural circuit at the individual neuron level requires the use of electron microscopy images due to their extremely high resolution. Computational challenges include pixel-by-pixel annotation of these images into classes such as cell membrane, mitochondria and synaptic vesicles and the segmentation of individual neurons. State-of-the-art image analysis solutions are still far from the accuracy and robustness of human vision and biologists are still limited to studying small neural circuits using mostly manual analysis. In this chapter, we describe our image analysis pipeline that makes use of novel supervised machine learning techniques to tackle this problem.

  15. Predicting Increased Blood Pressure Using Machine Learning

    PubMed Central

    Golino, Hudson Fernandes; Amaral, Liliany Souza de Brito; Duarte, Stenio Fernando Pimentel; Soares, Telma de Jesus; dos Reis, Luciana Araujo

    2014-01-01

    The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The result shows that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), the lowest misclassification (.19), and the highest pseudo R² (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction, with the lowest deviance (57.25), the lowest misclassification (.16), and the highest pseudo R² (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power. PMID:24669313
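    The sensitivity and specificity figures quoted in this abstract are computed from a binary confusion matrix. A minimal sketch follows; the function name and the example labels are made up for illustration and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false alarms
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 1 = increased blood pressure, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(sensitivity_specificity(y_true, y_pred))
```

    Computing both numbers separately on training and held-out data, as the abstract reports, shows how much of a tree's apparent accuracy survives on cases it has not seen.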

  16. Many-body physics via machine learning

    NASA Astrophysics Data System (ADS)

    Arsenault, Louis-Francois; von Lilienfeld, O. Anatole; Millis, Andrew J.

    We demonstrate a method for the use of machine learning (ML) to solve the equations of many-body physics, which are functional equations linking a bare to an interacting Green's function (or self-energy) offering transferable power of prediction for physical quantities for both the forward and the reverse engineering problem of materials. Functions are represented by coefficients in an orthogonal polynomial expansion and kernel ridge regression is used. The method is demonstrated using as an example a database built from Dynamical Mean Field theory (DMFT) calculations on the three dimensional Hubbard model. We discuss the extension to a database for real materials. We also discuss some new area of investigation concerning high throughput predictions for real materials by offering a perspective of how our scheme is general enough for applications to other problems involving the inversion of integral equations from the integrated knowledge such as the analytical continuation of the Green's function and the reconstruction of lattice structures from X-ray spectra. Office of Science of the U.S. Department of Energy under SubContract DOE No. 3F-3138 and FG-ER04169.

  17. Machine learning applications in proteomics research: how the past can boost the future.

    PubMed

    Kelchtermans, Pieter; Bittremieux, Wout; De Grave, Kurt; Degroeve, Sven; Ramon, Jan; Laukens, Kris; Valkenborg, Dirk; Barsnes, Harald; Martens, Lennart

    2014-03-01

    Machine learning is a subdiscipline within artificial intelligence that focuses on algorithms that allow computers to learn solving a (complex) problem from existing data. This ability can be used to generate a solution to a particularly intractable problem, given that enough data are available to train and subsequently evaluate an algorithm on. Since MS-based proteomics has no shortage of complex problems, and since publicly available data are becoming available in ever growing amounts, machine learning is fast becoming a very popular tool in the field. We here therefore present an overview of the different applications of machine learning in proteomics that together cover nearly the entire wet- and dry-lab workflow, and that address key bottlenecks in experiment planning and design, as well as in data processing and analysis. PMID:24323524

  18. Monitoring frog communities: An application of machine learning

    SciTech Connect

    Taylor, A.; Watson, G.; Grigg, G.; McCallum, H.

    1996-12-31

    Automatic recognition of animal vocalizations would be a valuable tool for a variety of biological research and environmental monitoring applications. We report the development of a software system which can recognize the vocalizations of 22 species of frogs which occur in an area of northern Australia. This software system will be used in unattended operation to monitor the effect on frog populations of the introduced Cane Toad. The system is based around classification of local peaks in the spectrogram of the audio signal using Quinlan's machine learning system, C4.5. Unreliable identifications of peaks are aggregated together using a hierarchical structure of segments based on the species' typical temporal vocalization patterns. This produces robust system performance.

  19. Stochastic upscaling in solid mechanics: An excercise in machine learning

    SciTech Connect

    Koutsourelakis, P.S.

    2007-09-10

    This paper presents a consistent theoretical and computational framework for upscaling in random microstructures. We adopt an information theoretic approach in order to quantify the informational content of the microstructural details and find ways to condense it while assessing quantitatively the approximation introduced. In particular, we substitute the high-dimensional microscale description by a lower-dimensional representation corresponding for example to an equivalent homogeneous medium. The probabilistic characteristics of the latter are determined by minimizing the distortion between actual macroscale predictions and the predictions made using the coarse model. A machine learning framework is essentially adopted in which a vector quantizer is trained using data generated computationally or collected experimentally. Several parallels and differences with similar problems in source coding theory are pointed out and an efficient computational tool is employed. Various applications in linear and non-linear problems in solid mechanics are examined.

  20. Learning machines applied to potential forest distribution.

    PubMed

    Ordóñez, Celestino; Taboada, Javier; Bastante, Fernando; Matías, Jose María; Felicísimo, Angel Manuel

    2005-01-01

    The clearing of forests to obtain land for pasture and agriculture and the replacement of autochthonous species by other faster-growing varieties of trees for timber have both led to the loss of vast areas of forest worldwide. At present, many developed countries are attempting to reverse these effects, establishing policies for the restoration of older woodland systems. Reforestation is a complex matter, planned and carried out by experts who need objective information regarding the type of forest that can be sustained in each area. This information is obtained by drawing up feasibility models constructed using statistical methods that make use of the information provided by morphological and environmental variables (height, gradient, rainfall, etc.) that partially condition the presence or absence of a specific kind of forestation in an area. The aim of this work is to construct a set of feasibility models for woodland located in the basin of the River Liébana (NW Spain), to serve as a support tool for the experts entrusted with carrying out the reforestation project. The techniques used are multilayer perceptron neural networks and support vector machines. Their results will be compared to the results obtained by traditional techniques (such as discriminant analysis and logistic regression) by measuring the degree of fit between each model and the existing distribution of woodlands. The interpretation and problems of the feasibility models are commented on in the Discussion section. PMID:15984068

  2. Acceleration of saddle-point searches with machine learning

    NASA Astrophysics Data System (ADS)

    Peterson, Andrew A.

    2016-08-01

    In atomistic simulations, the location of the saddle point on the potential-energy surface (PES) gives important information on transitions between local minima, for example, via transition-state theory. However, the search for saddle points often involves hundreds or thousands of ab initio force calls, which are typically all done at full accuracy. This results in the vast majority of the computational effort being spent calculating the electronic structure of states not important to the researcher, and very little time performing the calculation of the saddle point state itself. In this work, we describe how machine learning (ML) can reduce the number of intermediate ab initio calculations needed to locate saddle points. Since machine-learning models can learn from, and thus mimic, atomistic simulations, the saddle-point search can be conducted rapidly in the machine-learning representation. The saddle-point prediction can then be verified by an ab initio calculation; if it is incorrect, this strategically has identified regions of the PES where the machine-learning representation has insufficient training data. When these training data are used to improve the machine-learning model, the estimates greatly improve. This approach can be systematized, and in two simple example problems we demonstrate a dramatic reduction in the number of ab initio force calls. We expect that this approach and future refinements will greatly accelerate searches for saddle points, as well as other searches on the potential energy surface, as machine-learning methods see greater adoption by the atomistics community.

  3. Reduced multiple empirical kernel learning machine.

    PubMed

    Wang, Zhe; Lu, MingZhe; Gao, Daqi

    2015-02-01

    Multiple kernel learning (MKL) is flexible and effective at representing heterogeneous data sources because it can introduce multiple kernels, rather than a single fixed kernel, into an application. However, MKL incurs higher time and space complexity than single kernel learning, which is undesirable in real-world applications. The kernel mappings used in MKL generally take one of two forms, implicit kernel mapping and empirical kernel mapping (EKM), and the latter has attracted less attention. In this paper, we focus on MKL with EKM and propose a reduced multiple empirical kernel learning machine, RMEKLM for short. To the best of our knowledge, it is the first method to reduce both the time and space complexity of MKL with EKM. Unlike existing MKL, the proposed RMEKLM uses Gaussian elimination to extract a set of feature vectors, and we validate that doing so does not lose much information from the original feature space. RMEKLM then uses the extracted feature vectors to span a reduced orthonormal subspace of the feature space, which is visualized in terms of its geometric structure. It can be demonstrated that the spanned subspace is isomorphic to the original feature space, which means that the dot product of two vectors in the original feature space is equal to that of the two corresponding vectors in the generated orthonormal subspace. More importantly, the proposed RMEKLM requires simpler computation and less storage space, especially during testing. Finally, the experimental results show that RMEKLM is both efficient and effective in terms of complexity and classification. The contributions of this paper are as follows: (1) by mapping the input space into an orthonormal subspace, the geometry of the generated subspace is visualized; (2) this paper is the first to reduce both the time and space complexity of the EKM-based MKL; (3

  4. Learning Machine, Vietnamese Based Human-Computer Interface.

    ERIC Educational Resources Information Center

    Northwest Regional Educational Lab., Portland, OR.

    The sixth session of IT@EDU98 consisted of seven papers on the topic of the learning machine--Vietnamese based human-computer interface, and was chaired by Phan Viet Hoang (Informatics College, Singapore). "Knowledge Based Approach for English Vietnamese Machine Translation" (Hoang Kiem, Dinh Dien) presents the knowledge base approach, which…

  5. Learn about Physical Science: Simple Machines. [CD-ROM].

    ERIC Educational Resources Information Center

    2000

    This CD-ROM, designed for students in grades K-2, explores the world of simple machines. It allows students to delve into the mechanical world and learn the ways in which simple machines make work easier. Animated demonstrations are provided of the lever, pulley, wheel, screw, wedge, and inclined plane. Activities include practical matching and…

  6. Machine Translation in Foreign Language Learning: Language Learners' and Tutors' Perceptions of Its Advantages and Disadvantages

    ERIC Educational Resources Information Center

    Nino, Ana

    2009-01-01

    This paper presents a snapshot of what has been investigated in terms of the relationship between machine translation (MT) and foreign language (FL) teaching and learning. For this purpose four different roles of MT in the language class have been identified: MT as a bad model, MT as a good model, MT as a vocational training tool (especially in…

  7. Method and apparatus for characterizing and enhancing the functional performance of machine tools

    DOEpatents

    Barkman, William E; Babelay, Jr., Edwin F; Smith, Kevin Scott; Assaid, Thomas S; McFarland, Justin T; Tursky, David A; Woody, Bethany; Adams, David

    2013-04-30

    Disclosed are various systems and methods for assessing and improving the capability of a machine tool. The disclosure applies to machine tools having at least one slide configured to move along a motion axis. Various patterns of dynamic excitation commands are employed to drive the one or more slides, typically involving repetitive short distance displacements. A quantification of a measurable merit of machine tool response to the one or more patterns of dynamic excitation commands is typically derived for the machine tool. Examples of measurable merits of machine tool performance include workpiece surface finish, and the ability to generate chips of the desired length.

  8. A Machine Learning System for Recognizing Subclasses (Demo)

    SciTech Connect

    Vatsavai, Raju

    2012-01-01

    Thematic information extraction from remote sensing images is a complex task. In this demonstration, we present the *Miner machine learning system. In particular, we demonstrate an advanced subclass recognition algorithm that is specifically designed to extract finer classes from aggregate classes.

  9. Machine learning challenges in Mars rover traverse science

    NASA Technical Reports Server (NTRS)

    Castano, R.; Judd, M.; Anderson, R. C.; Estlin, T.

    2003-01-01

    The successful implementation of machine learning in autonomous rover traverse science requires addressing challenges that range from the analytical technical realm, to the fuzzy, philosophical domain of entrenched belief systems within scientists and mission managers.

  10. Shedding Light on Synergistic Chemical Genetic Connections with Machine Learning.

    PubMed

    Ekins, Sean; Siqueira-Neto, Jair Lage

    2015-12-23

    Machine learning can be used to predict compounds acting synergistically, and this could greatly expand the universe of available potential treatments for diseases that are currently hidden in the dark chemical matter. PMID:27136350

  11. Applying Machine Learning to Facilitate Autism Diagnostics: Pitfalls and promises

    PubMed Central

    Bone, Daniel; Goodwin, Matthew S.; Black, Matthew P.; Lee, Chi-Chun; Audhkhasi, Kartik; Narayanan, Shrikanth

    2014-01-01

    Machine learning has immense potential to enhance diagnostic and intervention research in the behavioral sciences, and may be especially useful in investigations involving the highly prevalent and heterogeneous syndrome of autism spectrum disorder. However, use of machine learning in the absence of clinical domain expertise can be tenuous and lead to misinformed conclusions. To illustrate this concern, the current paper critically evaluates and attempts to reproduce results from two studies (Wall et al., 2012a; Wall et al., 2012b) that claim to drastically reduce time to diagnose autism using machine learning. Our failure to generate comparable findings to those reported by Wall and colleagues using larger and more balanced data underscores several conceptual and methodological problems associated with these studies. We conclude with proposed best-practices when using machine learning in autism research, and highlight some especially promising areas for collaborative work at the intersection of computational and behavioral science. PMID:25294649

  12. Analysis of multidimensional signals as classifiers for machine learning prediction of disruptions

    NASA Astrophysics Data System (ADS)

    Parsons, Matthew; Tang, William; Feibush, Eliot

    2015-11-01

    ITER and future tokamaks beyond will require systems to predict oncoming disruptions so that damage to the machine can be avoided or mitigated. The use of supervised machine learning has proven to be successful in predicting the onset of disruptions with higher accuracy than a simple locked-mode detector, but only zero-dimensional time trace signals have been considered to examine this. We present initial results from our analysis of multidimensional signals (time + spatial dimensions) from the JET database to identify higher fidelity, physics-based classifiers that would allow the development of disruption prediction tools that are portable between machines.

  13. On Electro Discharge Machining of Inconel 718 with Hollow Tool

    NASA Astrophysics Data System (ADS)

    Rajesha, S.; Sharma, A. K.; Kumar, Pradeep

    2012-06-01

    Inconel 718 is a nickel-based alloy designed for high yield, tensile, and creep-rupture properties. This alloy has been widely used in jet engines and high-speed airframe parts in aeronautic applications. In this study, the electric discharge machining (EDM) process was used for machining commercially available Inconel 718. A copper electrode of 99.9% purity with a tubular cross section was employed to machine holes of 20 mm height and 12 mm diameter on Inconel 718 workpieces. Experiments were planned using response surface methodology (RSM). The effects of five major process parameters (pulse current, duty factor, sensitivity control, gap control, and flushing pressure) on the process responses, material removal rate (MRR) and surface roughness (SR), are discussed. Mathematical models for MRR and SR have been developed using analysis of variance. Influences of process parameters on tool wear and tool geometry are presented with the help of scanning electron microscope (SEM) micrographs. Analysis shows a significant interaction effect of pulse current and duty factor on MRR, yielding a wide range from 14.4 to 22.6 mm³/min, while pulse current remains the most contributing factor, with approximate changes in the MRR and SR of 48% and 37%, respectively, corresponding to the extreme values considered. Interactions of duty factor and flushing pressure yield a minimum surface roughness of 6.2 μm. The thickness of the sputtered layer and the crack length were found to be functions of pulse current. The hollow tool wears on both the outer and the inner edges owing to spark erosion as well as abrasion due to the flow of debris.

  14. Risk prediction with machine learning and regression methods.

    PubMed

    Steyerberg, Ewout W; van der Ploeg, Tjeerd; Van Calster, Ben

    2014-07-01

    This is a discussion of issues in risk prediction based on the following papers: "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Theory" by Jochen Kruppa, Yufeng Liu, Gérard Biau, Michael Kohler, Inke R. König, James D. Malley, and Andreas Ziegler; and "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Applications" by Jochen Kruppa, Yufeng Liu, Hans-Christian Diener, Theresa Holste, Christian Weimar, Inke R. König, and Andreas Ziegler.

  15. Reducing maintenance costs in agreement with CNC machine tools reliability

    NASA Astrophysics Data System (ADS)

    Ungureanu, A. L.; Stan, G.; Butunoi, P. A.

    2016-08-01

    Aligning maintenance strategy with reliability is a challenge due to the need to find an optimal balance between them. Because the various methods described in the relevant literature involve laborious calculations or use of software that can be costly, this paper proposes a method that is easier to implement on CNC machine tools. The new method, called the Consequence of Failure Analysis (CFA) is based on technical and economic optimization, aimed at obtaining a level of required performance with minimum investment and maintenance costs.

  16. Learning Activity Packets for Milling Machines. Unit III--Vertical Milling Machines.

    ERIC Educational Resources Information Center

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) outlines the study activities and performance tasks covered in a related curriculum guide on milling machines. The course of study in this LAP is intended to help students learn to set up and operate a vertical mill. Tasks addressed in the LAP include mounting and removing cutters and cutter holders for vertical…

  17. Learning Activity Packets for Milling Machines. Unit II--Horizontal Milling Machines.

    ERIC Educational Resources Information Center

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) outlines the study activities and performance tasks covered in a related curriculum guide on milling machines. The course of study in this LAP is intended to help students learn to set up and operate a horizontal mill. Tasks addressed in the LAP include mounting style "A" or "B" arbors and adjusting arbor…

  18. Portfolio as a learning tool: students' perspective.

    PubMed

    Elango, S; Jutti, R C; Lee, L K

    2005-09-01

    Portfolio writing is a method of encouraging reflective learning among professionals. Although portfolio-based learning is popular among educators, not many studies have been done to determine students' perceptions of portfolio as a learning tool. A questionnaire survey was conducted among 143 medical students to find out their perceptions of the portfolio as a learning tool. A majority of the students felt that the portfolio is a good learning tool. However, they also perceived that it is stressful and time-consuming to develop a proper portfolio. The study indicates that students need appropriate guidance from the academic staff for the system to succeed. PMID:16205830

  19. Machine tool accuracy characterization workshops. Final report, May 5, 1992--November 5 1993

    SciTech Connect

    1995-01-06

    The ability to assess the accuracy of machine tools is required by both tool builders and users. Builders must have this ability in order to predict the accuracy capability of a machine tool for different part geometries, to provide verifiable accuracy information for sales purposes, and to locate error sources for maintenance, troubleshooting, and design enhancement. Users require the same ability in order to make intelligent choices in selecting or procuring machine tools, to predict component manufacturing accuracy, and to perform maintenance and troubleshooting. In both instances, the ability to fully evaluate the accuracy capabilities of a machine tool and the source of its limitations is essential for using the tool to its maximum accuracy and productivity potential. This project was designed to transfer expertise in modern machine tool accuracy testing methods from LLNL to US industry, and to educate users on the use and application of emerging standards for machine tool performance testing.

  20. Learning to Learn Together with CSCL Tools

    ERIC Educational Resources Information Center

    Schwarz, Baruch B.; de Groot, Reuma; Mavrikis, Manolis; Dragon, Toby

    2015-01-01

    In this paper, we identify "Learning to Learn Together" (L2L2) as a new and important educational goal. Our view of L2L2 is a substantial extension of "Learning to Learn" (L2L): L2L2 consists of learning to collaborate to successfully face L2L challenges. It is inseparable from L2L, as it emerges when individuals face problems…

  1. Mobile Learning: A Powerful Tool for Ubiquitous Language Learning

    ERIC Educational Resources Information Center

    Gomes, Nelson; Lopes, Sérgio; Araújo, Sílvia

    2016-01-01

    Mobile devices (smartphones, tablets, e-readers, etc.) have come to be used as tools for mobile learning. Several studies support the integration of such technological devices with learning, particularly with language learning. In this paper, we wish to present an Android app designed for the teaching and learning of Portuguese as a foreign…

  2. Using machine learning techniques to automate sky survey catalog generation

    NASA Technical Reports Server (NTRS)

    Fayyad, Usama M.; Roden, J. C.; Doyle, R. J.; Weir, Nicholas; Djorgovski, S. G.

    1993-01-01

    We describe the application of machine classification techniques to the development of an automated tool for the reduction of a large scientific data set. The 2nd Palomar Observatory Sky Survey provides comprehensive photographic coverage of the northern celestial hemisphere. The photographic plates are being digitized into images containing on the order of 10^7 galaxies and 10^8 stars. Since the size of this data set precludes manual analysis and classification of objects, our approach is to develop a software system which integrates independently developed techniques for image processing and data classification. Image processing routines are applied to identify and measure features of sky objects. Selected features are used to determine the classification of each object. GID3* and O-BTree, two inductive learning techniques, are used to automatically learn classification decision trees from examples. We describe the techniques used, the details of our specific application, and the initial encouraging results which indicate that our approach is well-suited to the problem. The benefits of the approach are increased data reduction throughput, consistency of classification, and the automated derivation of classification rules that will form an objective, examinable basis for classifying sky objects. Furthermore, astronomers will be freed from the tedium of an intensely visual task to pursue more challenging analysis and interpretation problems given automatically cataloged data.

  3. Prospects for chaos control of machine tool chatter

    SciTech Connect

    Hively, L.M.; Protopopescu, V.A.; Clapp, N.E.; Daw, C.S.

    1998-06-01

    The authors analyze the nonlinear tool-part dynamics during turning of stainless steel in the nonchatter and chatter regimes, toward the ultimate objective of chatter control. Their previous work analyzed tool acceleration in three dimensions at four spindle speeds. In the present work, the authors analyze the machining power and obtain nonlinear measures of this power. They also calculate the cycle-to-cycle energy for the turning process. Return maps for power cycle times do not reveal fixed points or (un)stable manifolds. Energy return maps do display stable and unstable directions (manifolds) to and from an unstable period-1 orbit, which is the dominant periodicity. Both nonchatter and chatter dynamics have the unusual feature of arriving at the unstable period-1 fixed point and departing from that fixed point of the energy return map in a single step. This unusual feature makes chaos maintenance, based on the well-known Ott-Grebogi-Yorke scheme, a very difficult option for chatter suppression. Alternative control schemes, such as synchronization of the tool-part motion to prerecorded nonchatter dynamics or dynamically damping the period-1 motion, are briefly discussed.

  4. Web-Based Learning Design Tool

    ERIC Educational Resources Information Center

    Bruno, F. B.; Silva, T. L. K.; Silva, R. P.; Teixeira, F. G.

    2012-01-01

    Purpose: The purpose of this paper is to propose a web-based tool that enables the development and provision of learning designs and its reuse and re-contextualization as generative learning objects, aimed at developing educational materials. Design/methodology/approach: The use of learning objects can facilitate the process of production and…

  5. Machine learning in cell biology - teaching computers to recognize phenotypes.

    PubMed

    Sommer, Christoph; Gerlich, Daniel W

    2013-12-15

    Recent advances in microscope automation provide new opportunities for high-throughput cell biology, such as image-based screening. Highly complex image analysis tasks often make the implementation of static and predefined processing rules a cumbersome effort. Machine-learning methods, instead, seek to use intrinsic data structure, as well as the expert annotations of biologists, to infer models that can be used to solve versatile data analysis tasks. Here, we explain how machine-learning methods work and what needs to be considered for their successful application in cell biology. We outline how microscopy images can be converted into a data representation suitable for machine learning, and then introduce various state-of-the-art machine-learning algorithms, highlighting recent applications in image-based screening. Our Commentary aims to provide the biologist with a guide to the application of machine learning to microscopy assays and we therefore include extensive discussion on how to optimize experimental workflow as well as the data analysis pipeline. PMID:24259662

  7. Data Triage of Astronomical Transients: A Machine Learning Approach

    NASA Astrophysics Data System (ADS)

    Rebbapragada, U.

    This talk presents real-time machine learning systems for triage of big data streams generated by photometric and image-differencing pipelines. Our first system is a transient event detection system in development for the Palomar Transient Factory (PTF), a fully-automated synoptic sky survey that has demonstrated real-time discovery of optical transient events. The system is tasked with discriminating between real astronomical objects and bogus objects, which are usually artifacts of the image differencing pipeline. We performed a machine learning forensics investigation on PTF’s initial system that led to training data improvements that decreased both false positive and negative rates. The second machine learning system is a real-time classification engine of transients and variables in development for the Australian Square Kilometre Array Pathfinder (ASKAP), an upcoming wide-field radio survey with unprecedented ability to investigate the radio transient sky. The goal of our system is to classify light curves into known classes with as few observations as possible in order to trigger follow-up on costlier assets. We discuss the violation of standard machine learning assumptions incurred by this task, and propose the use of ensemble and hierarchical machine learning classifiers that make predictions most robustly.

  8. Building Artificial Vision Systems with Machine Learning

    SciTech Connect

    LeCun, Yann

    2011-02-23

    Three questions pose the next challenge for Artificial Intelligence (AI), robotics, and neuroscience. How do we learn perception (e.g. vision)? How do we learn representations of the perceptual world? How do we learn visual categories from just a few examples?

  9. 38. METAL WORKING TOOLS AND MACHINES ADJACENT TO THE CIRCA ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    38. METAL WORKING TOOLS AND MACHINES ADJACENT TO THE CIRCA 1900 MICHIGAN MACHINERY MFG. CO. PUNCH PRESS NEAR THE CENTER OF THE FACTORY BUILDING. AT THE LEFT FOREGROUND IS A MOVABLE TIRE BENDER FOR SHAPING ELI WINDMILL WHEEL RIMS. AT THE CENTER IS A FLOOR-MOUNTED CIRCA 1900 SNAG GRINDER OF THE TYPE USED FOR SMOOTHING ROUGH CASTINGS. ON THE WHEELED WORK STATION IS A SUNNEN BUSHING GRINDER, BEHIND WHICH IS A TRIPOD CHAIN VICE. IN THE CENTER BACKGROUND IS A WOODEN CHEST OF DRAWERS WHICH CONTAINS A 'RAG DRAWER' STILL FILLED WITH CLOTH RAGS PLACED IN THE FACTORY BUILDING AT THE INSISTENCE OF LOUISE (MRS. ARTHUR) KREGEL FOR THE CONVENIENCE AND CLEANLINESS OF WORKERS. IN THE LEFT BACKGROUND IS A CIRCA 1900 CROSS-CUTOFF CIRCULAR SAW. - Kregel Windmill Company Factory, 1416 Central Avenue, Nebraska City, Otoe County, NE

  10. Feasibility of Active Machine Learning for Multiclass Compound Classification.

    PubMed

    Lang, Tobias; Flachsenberg, Florian; von Luxburg, Ulrike; Rarey, Matthias

    2016-01-25

    A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
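
    The iterative selection of informative training examples described in this abstract can be sketched, very loosely, as pool-based active learning with margin (uncertainty) sampling. The code below is not the authors' method. It assumes a nearest-centroid classifier, two-dimensional toy "descriptors" drawn around three invented class centers, and an `oracle` callable standing in for the human expert or experiment that supplies a label on request.

```python
import math
import random


def centroids(labeled):
    """Mean feature vector per class, from ((x, y), label) pairs."""
    sums, counts = {}, {}
    for (x, y), label in labeled:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}


def predict(model, point):
    """Return (nearest class, margin between the two nearest class centroids)."""
    dists = sorted((math.dist(point, c), label) for label, c in model.items())
    return dists[0][1], dists[1][0] - dists[0][0]


def active_learn(pool, oracle, seed_set, budget):
    """Query the label of the most ambiguous pool item, `budget` times."""
    labeled = list(seed_set)
    known = {p for p, _ in labeled}
    unlabeled = [p for p in pool if p not in known]
    for _ in range(budget):
        if not unlabeled:
            break
        model = centroids(labeled)
        # Smallest margin = the item the current model is least sure about.
        query = min(unlabeled, key=lambda p: predict(model, p)[1])
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return centroids(labeled), labeled


# Invented toy pool: 30 points scattered around each of three class centers.
rng = random.Random(0)
centers = {0: (0.0, 0.0), 1: (10.0, 0.0), 2: (0.0, 10.0)}
truth = {}
for label, (cx, cy) in centers.items():
    for _ in range(30):
        truth[(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))] = label

# Start from one labeled example per class, then spend six label queries.
seed_set = [(next(p for p, lab in truth.items() if lab == c), c) for c in centers]
model, labeled = active_learn(list(truth), truth.__getitem__, seed_set, budget=6)
accuracy = sum(predict(model, p)[0] == lab for p, lab in truth.items()) / len(truth)
```

    In this toy setting the classes are far apart, so nine labels out of ninety are enough to classify the whole pool. Real compound data would need a stronger classifier and a held-out evaluation set.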

  11. Acceleration of saddle-point searches with machine learning.

    PubMed

    Peterson, Andrew A

    2016-08-21

    In atomistic simulations, the location of the saddle point on the potential-energy surface (PES) gives important information on transitions between local minima, for example, via transition-state theory. However, the search for saddle points often involves hundreds or thousands of ab initio force calls, which are typically all done at full accuracy. This results in the vast majority of the computational effort being spent calculating the electronic structure of states not important to the researcher, and very little time performing the calculation of the saddle point state itself. In this work, we describe how machine learning (ML) can reduce the number of intermediate ab initio calculations needed to locate saddle points. Since machine-learning models can learn from, and thus mimic, atomistic simulations, the saddle-point search can be conducted rapidly in the machine-learning representation. The saddle-point prediction can then be verified by an ab initio calculation; if it is incorrect, this strategically has identified regions of the PES where the machine-learning representation has insufficient training data. When these training data are used to improve the machine-learning model, the estimates greatly improve. This approach can be systematized, and in two simple example problems we demonstrate a dramatic reduction in the number of ab initio force calls. We expect that this approach and future refinements will greatly accelerate searches for saddle points, as well as other searches on the potential energy surface, as machine-learning methods see greater adoption by the atomistics community. PMID:27544086

  12. Recognition of printed Arabic text using machine learning

    NASA Astrophysics Data System (ADS)

    Amin, Adnan

    1998-04-01

    Many papers have been concerned with the recognition of Latin, Chinese and Japanese characters. However, although almost a third of a billion people worldwide, in several different languages, use Arabic characters for writing, little research progress has been achieved, either on-line or off-line, towards the automatic recognition of Arabic characters. This is a result of the lack of adequate support in terms of funding and of resources such as Arabic text databases and dictionaries, and of course of the cursive nature of its writing rules. The main theme of this paper is the automatic recognition of printed Arabic text using the C4.5 machine learning algorithm. Symbolic machine learning algorithms are designed to accept example descriptions in the form of feature vectors which include a label that identifies the class to which an example belongs. The output of the algorithm is a set of rules that classifies unseen examples based on generalization from the training set. This ability to generalize is the main attraction of machine learning for handwriting recognition. Samples of a character can be preprocessed into a feature vector representation for presentation to a machine learning algorithm that creates rules for recognizing characters of the same class. Symbolic machine learning has several advantages over other learning methods. It is fast in training and in recognition, generalizes well, is noise tolerant, and its symbolic representation is easy to understand. The technique can be divided into three major steps. First, in preprocessing, the original image is transformed into a binary image using a 300 dpi scanner and the connected components are formed. Second, global features of the input Arabic word are extracted, such as the number of subwords, the number of peaks within each subword, and the number and position of the complementary characters. Finally, C4.5 is used for character classification to generate a decision tree.
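
    To make the decision-tree step more concrete, the sketch below computes the gain ratio, the criterion C4.5 uses to choose which attribute to split on, for labeled feature vectors. It covers categorical attributes only and leaves out continuous thresholds and pruning. The attribute names `n_subwords` and `n_peaks` echo the global features named in the abstract, but the four training rows and the class labels are invented for illustration.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, normalized by split information."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Invented toy training set: one column per attribute, one label per example.
features = {
    "n_subwords": [1, 1, 2, 2],
    "n_peaks": [1, 2, 1, 2],
}
labels = ["class_a", "class_a", "class_b", "class_b"]

ratios = {name: gain_ratio(col, labels) for name, col in features.items()}
best_attribute = max(ratios, key=ratios.get)
```

    Here `n_subwords` separates the two classes perfectly and `n_peaks` carries no information, so a tree built on this data would split on `n_subwords` at the root.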

  13. A collaborative framework for Distributed Privacy-Preserving Support Vector Machine learning.

    PubMed

    Que, Jialan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate parameters based on a centralized repository of data. However, in the field of biomedicine, patient data are sometimes stored in local repositories or institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers to effectively learn from the distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates "privacy-insensitive" intermediary results. The globally learned model is guaranteed to be exactly the same as learned from combined data. We also provide a free web-service (http://privacy.ucsd.edu:8080/ppsvm/) for multiple participants to collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner. PMID:23304414

  14. A collaborative framework for Distributed Privacy-Preserving Support Vector Machine learning.

    PubMed

    Que, Jialan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    A Support Vector Machine (SVM) is a popular tool for decision support. The traditional way to build an SVM model is to estimate parameters based on a centralized repository of data. However, in the field of biomedicine, patient data are sometimes stored in local repositories or institutions where they were collected, and may not be easily shared due to privacy concerns. This creates a substantial barrier for researchers to effectively learn from the distributed data using machine learning tools like SVMs. To overcome this difficulty and promote efficient information exchange without sharing sensitive raw data, we developed a Distributed Privacy Preserving Support Vector Machine (DPP-SVM). The DPP-SVM enables privacy-preserving collaborative learning, in which a trusted server integrates "privacy-insensitive" intermediary results. The globally learned model is guaranteed to be exactly the same as learned from combined data. We also provide a free web-service (http://privacy.ucsd.edu:8080/ppsvm/) for multiple participants to collaborate and complete the SVM-learning task in an efficient and privacy-preserving manner.

  15. Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations.

    PubMed

    Torkzaban, Bahareh; Kayvanjoo, Amir Hossein; Ardalan, Arman; Mousavi, Soraya; Mariotti, Roberto; Baldoni, Luciana; Ebrahimie, Esmaeil; Ebrahimi, Mansour; Hosseini-Mazinani, Mehdi

    2015-01-01

    Finding efficient analytical techniques is overwhelmingly turning into a bottleneck for the effectiveness of large biological data. Machine learning offers a novel and powerful tool to advance classification and modeling solutions in molecular biology. However, these methods have been less frequently used with empirical population genetics data. In this study, we developed a new combined approach of data analysis using microsatellite marker data from our previous studies of olive populations using machine learning algorithms. Herein, 267 olive accessions of various origins including 21 reference cultivars, 132 local ecotypes, and 37 wild olive specimens from the Iranian plateau, together with 77 of the most represented Mediterranean varieties were investigated using a finely selected panel of 11 microsatellite markers. We organized data in two '4-targeted' and '16-targeted' experiments. A strategy of assaying different machine based analyses (i.e. data cleaning, feature selection, and machine learning classification) was devised to identify the most informative loci and the most diagnostic alleles to represent the population and the geography of each olive accession. These analyses revealed microsatellite markers with the highest differentiating capacity and proved efficiency for our method of clustering olive accessions to reflect upon their regions of origin. A distinguished highlight of this study was the discovery of the best combination of markers for better differentiating of populations via machine learning models, which can be exploited to distinguish among other biological populations.

  16. Can Machine Learning Methods Predict Extubation Outcome in Premature Infants as well as Clinicians?

    PubMed Central

    Mueller, Martina; Almeida, Jonas S.; Stanislaus, Romesh; Wagner, Carol L.

    2014-01-01

    Rationale: Though treatment of the prematurely born infant breathing with assistance of a mechanical ventilator has much advanced in the past decades, predicting extubation outcome at a given point in time remains challenging. Numerous studies have been conducted to identify predictors for extubation outcome; however, the rate of infants failing extubation attempts has not declined. Objective: To develop a decision-support tool for the prediction of extubation outcome in premature infants using a set of machine learning algorithms. Methods: A dataset assembled from 486 premature infants on mechanical ventilation was used to develop predictive models using machine learning algorithms such as artificial neural networks (ANN), support vector machine (SVM), naïve Bayesian classifier (NBC), boosted decision trees (BDT), and multivariable logistic regression (MLR). Performance of all models was evaluated using area under the curve (AUC). Results: For some of the models (ANN, MLR and NBC) results were satisfactory (AUC: 0.63–0.76); however, two algorithms (SVM and BDT) showed poor performance with AUCs of ~0.5. Conclusion: Clinicians' predictions still outperform machine learning due to the complexity of the data and contextual information that may not be captured in clinical data used as input for the development of the machine learning algorithms. Inclusion of preprocessing steps in future studies may improve the performance of prediction models. PMID:25419493
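
    For readers unfamiliar with the metric, the following minimal sketch shows one common way to compute the AUC: as the fraction of (positive, negative) pairs that a model ranks correctly, with ties counted as half. The function name and the toy scores are assumptions for illustration only and do not come from the study. An AUC near 0.5, as reported for SVM and BDT, corresponds to ranking no better than chance.

```python
def auc(scores, labels):
    """Area under the ROC curve via the pairwise-ranking (Mann-Whitney) formulation."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Count pairs where the positive example outranks the negative one; ties count half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: hypothetical predicted probabilities and true outcomes (1 = positive class).
print(auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))
```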

  17. Machine learning for Big Data analytics in plants.

    PubMed

    Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng

    2014-12-01

    Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences. PMID:25223304

  18. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    PubMed

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural network, Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of the US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving prediction performance.

  19. Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology.

    PubMed

    Zhang, Jieru; Ju, Ying; Lu, Huijuan; Xuan, Ping; Zou, Quan

    2016-01-01

    Cancerlectins are cancer-related proteins that function as lectins. They have been identified through computational identification techniques, but these techniques have sometimes failed to identify proteins because of sequence diversity among the cancerlectins. Advanced machine learning identification methods, such as support vector machine and basic sequence features (n-gram), have also been used to identify cancerlectins. In this study, various protein fingerprint features and advanced classifiers, including ensemble learning techniques, were utilized to identify this group of proteins. We improved the prediction accuracy of the original feature extraction methods and classification algorithms by more than 10% on average. Our work provides a basis for the computational identification of cancerlectins and reveals the power of hybrid machine learning techniques in computational proteomics. PMID:27478823

  20. Machine learning for Big Data analytics in plants.

    PubMed

    Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng

    2014-12-01

    Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences.

  1. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    PubMed

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural network, Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of the US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving prediction performance. PMID:26926235

  2. Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology

    PubMed Central

    Ju, Ying

    2016-01-01

    Cancerlectins are cancer-related proteins that function as lectins. They have been identified through computational identification techniques, but these techniques have sometimes failed to identify proteins because of sequence diversity among the cancerlectins. Advanced machine learning identification methods, such as support vector machine and basic sequence features (n-gram), have also been used to identify cancerlectins. In this study, various protein fingerprint features and advanced classifiers, including ensemble learning techniques, were utilized to identify this group of proteins. We improved the prediction accuracy of the original feature extraction methods and classification algorithms by more than 10% on average. Our work provides a basis for the computational identification of cancerlectins and reveals the power of hybrid machine learning techniques in computational proteomics. PMID:27478823

  3. The Philosophy of Science and its relation to Machine Learning

    NASA Astrophysics Data System (ADS)

    Williamson, Jon

    In this chapter I discuss connections between machine learning and the philosophy of science. First I consider the relationship between the two disciplines. There is a clear analogy between hypothesis choice in science and model selection in machine learning. While this analogy has been invoked to argue that the two disciplines are essentially doing the same thing and should merge, I maintain that the disciplines are distinct but related and that there is a dynamic interaction operating between the two: a series of mutually beneficial interactions that changes over time. I will introduce some particularly fruitful interactions, in particular the consequences of automated scientific discovery for the debate on inductivism versus falsificationism in the philosophy of science, and the importance of philosophical work on Bayesian epistemology and causality for contemporary machine learning. I will close by suggesting the locus of a possible future interaction: evidence integration.

  4. Predicting Market Impact Costs Using Nonparametric Machine Learning Models

    PubMed Central

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural network, Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of the US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving prediction performance. PMID:26926235

  5. The cerebellum: a neuronal learning machine?

    NASA Technical Reports Server (NTRS)

    Raymond, J. L.; Lisberger, S. G.; Mauk, M. D.

    1996-01-01

    Comparison of two seemingly quite different behaviors yields a surprisingly consistent picture of the role of the cerebellum in motor learning. Behavioral and physiological data about classical conditioning of the eyelid response and motor learning in the vestibulo-ocular reflex suggests that (i) plasticity is distributed between the cerebellar cortex and the deep cerebellar nuclei; (ii) the cerebellar cortex plays a special role in learning the timing of movement; and (iii) the cerebellar cortex guides learning in the deep nuclei, which may allow learning to be transferred from the cortex to the deep nuclei. Because many of the similarities in the data from the two systems typify general features of cerebellar organization, the cerebellar mechanisms of learning in these two systems may represent principles that apply to many motor systems.

  6. Energy landscapes for a machine learning application to series data

    NASA Astrophysics Data System (ADS)

    Ballard, Andrew J.; Stevenson, Jacob D.; Das, Ritankar; Wales, David J.

    2016-03-01

    Methods developed to explore and characterise potential energy landscapes are applied to the corresponding landscapes obtained from optimisation of a cost function in machine learning. We consider neural network predictions for the outcome of local geometry optimisation in a triatomic cluster, where four distinct local minima exist. The accuracy of the predictions is compared for fits using data from single and multiple points in the series of atomic configurations resulting from local geometry optimisation and for alternative neural networks. The machine learning solution landscapes are visualised using disconnectivity graphs, and signatures in the effective heat capacity are analysed in terms of distributions of local minima and their properties.

  7. RECONCILE: a machine-learning coreference resolution system

    2007-12-10

    RECONCILE is a noun phrase coreference resolution system: it identifies noun phrases in a text document and determines which subsets refer to each real world entity referenced in the text. The heart of the system is a combination of supervised and unsupervised machine learning systems. It uses a machine learning algorithm (chosen from an extensive suite, including Weka) for training noun phrase coreference classifier models and implements a variety of clustering algorithms to coordinate the pairwise classifications. A number of features have been implemented, including all of the features employed in Ng & Cardie [2002].

  8. Energy landscapes for a machine learning application to series data.

    PubMed

    Ballard, Andrew J; Stevenson, Jacob D; Das, Ritankar; Wales, David J

    2016-03-28

    Methods developed to explore and characterise potential energy landscapes are applied to the corresponding landscapes obtained from optimisation of a cost function in machine learning. We consider neural network predictions for the outcome of local geometry optimisation in a triatomic cluster, where four distinct local minima exist. The accuracy of the predictions is compared for fits using data from single and multiple points in the series of atomic configurations resulting from local geometry optimisation and for alternative neural networks. The machine learning solution landscapes are visualised using disconnectivity graphs, and signatures in the effective heat capacity are analysed in terms of distributions of local minima and their properties.

  9. 3D Visualization of Machine Learning Algorithms with Astronomical Data

    NASA Astrophysics Data System (ADS)

    Kent, Brian R.

    2016-01-01

    We present innovative machine learning (ML) methods using unsupervised clustering with minimum spanning trees (MSTs) to study 3D astronomical catalogs. Utilizing Python code to build trees based on galaxy catalogs, we can render the results with the visualization suite Blender to produce interactive 360 degree panoramic videos. The catalogs and their ML results can be explored in a 3D space using mobile devices, tablets or desktop browsers. We compare the statistics of the MST results to a number of machine learning methods relating to optimization and efficiency.
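
    The general idea of MST-based clustering can be sketched as follows: build a minimum spanning tree over the 3D positions, then drop edges longer than a chosen length so that the remaining connected components form clusters. This is a minimal sketch using Prim's algorithm; the function names, the cut-off parameter and the toy coordinates are assumptions for illustration, not the author's code. It requires Python 3.8+ for `math.dist`.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j, length) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the point that is currently closest to the growing tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def clusters_from_mst(points, max_edge):
    """Drop MST edges longer than max_edge and return the connected components."""
    label = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while label[i] != i:
            label[i] = label[label[i]]
            i = label[i]
        return i

    for i, j, length in minimum_spanning_tree(points):
        if length <= max_edge:
            label[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy 3D "catalog": two well-separated groups of points (hypothetical coordinates).
catalog = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 10, 10), (11, 10, 10)]
print(clusters_from_mst(catalog, max_edge=2.0))
```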

  10. Machine Learning Search for Gamma-Ray Burst Afterglows in Optical Images

    NASA Astrophysics Data System (ADS)

    Topinka, M.

    2016-06-01

    Thanks to the advances in robotic telescopes, time domain astronomy leads to a large number of transient events detected in images every night. Data mining and machine learning tools used for object classification are presented. The goal is to automatically classify transient events for both further follow-up by a larger telescope and for statistical studies of transient events. Special attention is given to the identification of gamma-ray burst afterglows. Machine learning techniques are used to identify GROND gamma-ray burst afterglow among the astrophysical objects present in the SDSS archival images based on the g'-r', r'-i' and i'-z' color indices. The performance of the support vector machine, random forest and neural network algorithms is compared. A joint meta-classifier, built on top of the individual classifiers, can identify GRB afterglows with the overall accuracy of ≳ 90%.

  11. Machine learning applications in cancer prognosis and prediction.

    PubMed

    Kourou, Konstantina; Exarchos, Themis P; Exarchos, Konstantinos P; Karamouzis, Michalis V; Fotiadis, Dimitrios I

    2015-01-01

    Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as it can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics field, to study the application of machine learning (ML) methods. Therefore, these techniques have been utilized as an aim to model the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs) have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in the everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend on the application of ML methods in cancer research, we present here the most recent publications that employ these techniques as an aim to model cancer risk or patient outcomes.

  12. Machine learning applications in cancer prognosis and prediction

    PubMed Central

    Kourou, Konstantina; Exarchos, Themis P.; Exarchos, Konstantinos P.; Karamouzis, Michalis V.; Fotiadis, Dimitrios I.

    2014-01-01

    Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as it can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics field, to study the application of machine learning (ML) methods. Therefore, these techniques have been utilized as an aim to model the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs) have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in the everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend on the application of ML methods in cancer research, we present here the most recent publications that employ these techniques as an aim to model cancer risk or patient outcomes. PMID:25750696

  13. A machine learning approach to quantifying noise in medical images

    NASA Astrophysics Data System (ADS)

    Chowdhury, Aritra; Sevinsky, Christopher J.; Yener, Bülent; Aggour, Kareem S.; Gustafson, Steven M.

    2016-03-01

    As advances in medical imaging technology are resulting in significant growth of biomedical image data, new techniques are needed to automate the process of identifying images of low quality. Automation is needed because it is very time consuming for a domain expert such as a medical practitioner or a biologist to manually separate good images from bad ones. While there are plenty of de-noising algorithms in the literature, their focus is on designing filters which are necessary but not sufficient for determining how useful an image is to a domain expert. Thus a computational tool is needed to assign a score to each image based on its perceived quality. In this paper, we introduce a machine learning-based score and call it the Quality of Image (QoI) score. The QoI score is computed by combining the confidence values of two popular classification techniques—support vector machines (SVMs) and Naïve Bayes classifiers. We test our technique on clinical image data obtained from cancerous tissue samples. We used 747 tissue samples that are stained by four different markers (abbreviated as CK15, pck26, E_cad and Vimentin) leading to a total of 2,988 images. The results show that images can be classified as good (high QoI), bad (low QoI) or ugly (intermediate QoI) based on their QoI scores. Our automated labeling is in agreement with the domain experts with a bi-modal classification accuracy of 94%, on average. Furthermore, ugly images can be recovered and forwarded for further post-processing.
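
    The abstract does not state how the two confidence values are combined or where the good/bad/ugly boundaries lie. The sketch below therefore assumes a simple average and two hypothetical thresholds, purely to illustrate the shape of such a score; none of the names or numeric cut-offs come from the paper.

```python
def qoi_score(svm_confidence, nb_confidence):
    """Combine two classifier confidences in [0, 1] into one score.

    Assumption: a plain average; the paper's actual combination rule is not given here.
    """
    for c in (svm_confidence, nb_confidence):
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidence values must lie in [0, 1]")
    return (svm_confidence + nb_confidence) / 2.0


def qoi_label(score, low=0.35, high=0.65):
    """Map a score to good / bad / ugly using hypothetical thresholds."""
    if score >= high:
        return "good"   # high QoI
    if score <= low:
        return "bad"    # low QoI
    return "ugly"       # intermediate QoI, a candidate for post-processing


# Hypothetical confidence values that an image is of good quality.
for svm_c, nb_c in [(0.9, 0.8), (0.5, 0.5), (0.1, 0.2)]:
    print(svm_c, nb_c, qoi_label(qoi_score(svm_c, nb_c)))
```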

  14. Molecular learning with DNA kernel machines.

    PubMed

    Noh, Yung-Kyun; Lee, Daniel D; Yang, Kyung-Ae; Kim, Cheongtag; Zhang, Byoung-Tak

    2015-11-01

    We present a computational learning method for bio-molecular classification. This method shows how to design biochemical operations both for learning and pattern classification. As opposed to prior work, our molecular algorithm learns generic classes considering the realization in vitro via a sequence of molecular biological operations on sets of DNA examples. Specifically, hybridization between DNA molecules is interpreted as computing the inner product between embedded vectors in a corresponding vector space, and our algorithm performs learning of a binary classifier in this vector space. We analyze the thermodynamic behavior of these learning algorithms, and show simulations on artificial and real datasets as well as demonstrate preliminary wet experimental results using gel electrophoresis.

  15. Learning Activity Packets for Grinding Machines. Unit I--Grinding Machines.

    ERIC Educational Resources Information Center

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) is one of three that accompany the curriculum guide on grinding machines. It outlines the study activities and performance tasks for the first unit of this curriculum guide. Its purpose is to aid the student in attaining a working knowledge of this area of training and in achieving a skilled or moderately…

  16. Refining fuzzy logic controllers with machine learning

    NASA Technical Reports Server (NTRS)

    Berenji, Hamid R.

    1994-01-01

    In this paper, we describe the GARIC (Generalized Approximate Reasoning-Based Intelligent Control) architecture, which learns from its past performance and modifies the labels in the fuzzy rules to improve performance. It uses fuzzy reinforcement learning which is a hybrid method of fuzzy logic and reinforcement learning. This technology can simplify and automate the application of fuzzy logic control to a variety of systems. GARIC has been applied in simulation studies of the Space Shuttle rendezvous and docking experiments. It has the potential of being applied in other aerospace systems as well as in consumer products such as appliances, cameras, and cars.

  17. Outsmarting neural networks: an alternative paradigm for machine learning

    SciTech Connect

    Protopopescu, V.; Rao, N.S.V.

    1996-10-01

    We address three problems in machine learning, namely: (i) function learning, (ii) regression estimation, and (iii) sensor fusion, in the Probably and Approximately Correct (PAC) framework. We show that, under certain conditions, one can reduce the three problems above to the regression estimation. The latter is usually tackled with artificial neural networks (ANNs) that satisfy the PAC criteria, but have high computational complexity. We propose several computationally efficient PAC alternatives to ANNs to solve the regression estimation. Thereby we also provide efficient PAC solutions to the function learning and sensor fusion problems. The approach is based on cross-fertilizing concepts and methods from statistical estimation, nonlinear algorithms, and the theory of computational complexity, and is designed as part of a new, coherent paradigm for machine learning.

  18. Machine Learning for Treatment Assignment: Improving Individualized Risk Attribution.

    PubMed

    Weiss, Jeremy; Kuusisto, Finn; Boyd, Kendrick; Liu, Jie; Page, David

    2015-01-01

    Clinical studies model the average treatment effect (ATE), but apply this population-level effect to future individuals. Due to recent developments of machine learning algorithms with useful statistical guarantees, we argue instead for modeling the individualized treatment effect (ITE), which has better applicability to new patients. We compare ATE-estimation using randomized and observational analysis methods against ITE-estimation using machine learning, and describe how the ITE theoretically generalizes to new population distributions, whereas the ATE may not. On a synthetic data set of statin use and myocardial infarction (MI), we show that a learned ITE model improves true ITE estimation and outperforms the ATE. We additionally argue that ITE models should be learned with a consistent, nonparametric algorithm from unweighted examples and show experiments in favor of our argument using our synthetic data model and a real data set of D-penicillamine use for primary biliary cirrhosis.
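
    The distinction between a population-level and a more individualized effect can be illustrated with a toy calculation. In the sketch below the ATE is a difference of mean outcomes between treated and untreated records, and a per-stratum difference serves as a crude stand-in for an individualized effect. The records, field names and strata are invented for illustration; the paper learns the ITE with a nonparametric machine learning model rather than by stratification.

```python
def mean(values):
    return sum(values) / len(values)


def average_treatment_effect(records):
    """Difference in mean outcome between treated and untreated records."""
    treated = [r["outcome"] for r in records if r["treated"]]
    control = [r["outcome"] for r in records if not r["treated"]]
    return mean(treated) - mean(control)


def stratified_effects(records, key):
    """Treatment effect within each stratum of a covariate (a coarse proxy for the ITE)."""
    strata = {}
    for r in records:
        strata.setdefault(r[key], []).append(r)
    return {name: average_treatment_effect(group) for name, group in strata.items()}


# Invented toy records: outcome 1 = adverse event, 0 = no event.
records = [
    {"risk": "high", "treated": True, "outcome": 0},
    {"risk": "high", "treated": True, "outcome": 0},
    {"risk": "high", "treated": False, "outcome": 1},
    {"risk": "high", "treated": False, "outcome": 1},
    {"risk": "low", "treated": True, "outcome": 0},
    {"risk": "low", "treated": True, "outcome": 0},
    {"risk": "low", "treated": False, "outcome": 0},
    {"risk": "low", "treated": False, "outcome": 0},
]
print(average_treatment_effect(records))    # one number for the whole population
print(stratified_effects(records, "risk"))  # effect differs between strata
```

    In this toy data the single population-level number hides the fact that all of the benefit is concentrated in one stratum, which is the kind of mismatch the abstract argues against.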

  19. A Machine Learning Approach for Accurate Annotation of Noncoding RNAs.

    PubMed

    Song, Yinglei; Liu, Chunmei; Wang, Zhi

    2015-01-01

    Searching genomes to locate noncoding RNA genes with known secondary structure is an important problem in bioinformatics. In general, the secondary structure of a searched noncoding RNA is defined with a structure model constructed from the structural alignment of a set of sequences from its family. Computing the optimal alignment between a sequence and a structure model is the core part of an algorithm that can search genomes for noncoding RNAs. In practice, a single structure model may not be sufficient to capture all crucial features important for a noncoding RNA family. In this paper, we develop a novel machine learning approach that can efficiently search genomes for noncoding RNAs with high accuracy. During the search procedure, a sequence segment in the searched genome sequence is processed and a feature vector is extracted to represent it. Based on the feature vector, a classifier is used to determine whether the sequence segment is the searched ncRNA or not. Our testing results show that this approach is able to efficiently capture crucial features of a noncoding RNA family. Compared with existing search tools, it significantly improves the accuracy of genome annotation. PMID:26357266

  20. Machine learning and cosmological simulations - II. Hydrodynamical simulations

    NASA Astrophysics Data System (ADS)

    Kamdar, Harshil M.; Turk, Matthew J.; Brunner, Robert J.

    2016-04-01

    We extend a machine learning (ML) framework presented previously to model galaxy formation and evolution in a hierarchical universe using N-body + hydrodynamical simulations. In this work, we show that ML is a promising technique to study galaxy formation in the backdrop of a hydrodynamical simulation. We use the Illustris simulation to train and test various sophisticated ML algorithms. By using only essential dark matter halo physical properties and no merger history, our model predicts the gas mass, stellar mass, black hole mass, star formation rate, g - r colour, and stellar metallicity fairly robustly. Our results provide a unique and powerful phenomenological framework to explore the galaxy-halo connection that is built upon a solid hydrodynamical simulation. The promising reproduction of the listed galaxy properties demonstrably places ML as a promising and significantly more computationally efficient tool to study small-scale structure formation. We find that ML mimics a full-blown hydrodynamical simulation surprisingly well in a computation time of mere minutes. The population of galaxies simulated by ML, while not numerically identical to Illustris, is statistically robust and physically consistent with Illustris galaxies and follows the same fundamental observational constraints. ML offers an intriguing and promising technique to create quick mock galaxy catalogues in the future.

  1. A Machine Learning Approach for Accurate Annotation of Noncoding RNAs

    PubMed Central

    Liu, Chunmei; Wang, Zhi

    2016-01-01

    Searching genomes to locate noncoding RNA genes with known secondary structure is an important problem in bioinformatics. In general, the secondary structure of a searched noncoding RNA is defined with a structure model constructed from the structural alignment of a set of sequences from its family. Computing the optimal alignment between a sequence and a structure model is the core part of an algorithm that can search genomes for noncoding RNAs. In practice, a single structure model may not be sufficient to capture all crucial features important for a noncoding RNA family. In this paper, we develop a novel machine learning approach that can efficiently search genomes for noncoding RNAs with high accuracy. During the search procedure, a sequence segment in the searched genome sequence is processed and a feature vector is extracted to represent it. Based on the feature vector, a classifier is used to determine whether the sequence segment is the searched ncRNA or not. Our testing results show that this approach is able to efficiently capture crucial features of a noncoding RNA family. Compared with existing search tools, it significantly improves the accuracy of genome annotation. PMID:26357266

  2. Machine learning and cosmological simulations - I. Semi-analytical models

    NASA Astrophysics Data System (ADS)

    Kamdar, Harshil M.; Turk, Matthew J.; Brunner, Robert J.

    2016-01-01

    We present a new exploratory framework to model galaxy formation and evolution in a hierarchical Universe by using machine learning (ML). Our motivations are two-fold: (1) presenting a new, promising technique to study galaxy formation, and (2) quantitatively analysing the extent of the influence of dark matter halo properties on galaxies in the backdrop of semi-analytical models (SAMs). We use the influential Millennium Simulation and the corresponding Munich SAM to train and test various sophisticated ML algorithms (k-Nearest Neighbors, decision trees, random forests, and extremely randomized trees). By using only essential dark matter halo physical properties for haloes of M > 10^12 M⊙ and a partial merger tree, our model predicts the hot gas mass, cold gas mass, bulge mass, total stellar mass, black hole mass and cooling radius at z = 0 for each central galaxy in a dark matter halo for the Millennium run. Our results provide a unique and powerful phenomenological framework to explore the galaxy-halo connection that is built upon SAMs and demonstrably place ML as a promising and a computationally efficient tool to study small-scale structure formation.

  3. Energy landscapes for a machine-learning prediction of patient discharge

    NASA Astrophysics Data System (ADS)

    Das, Ritankar; Wales, David J.

    2016-06-01

    The energy landscapes framework is applied to a configuration space generated by training the parameters of a neural network. In this study the input data consists of time series for a collection of vital signs monitored for hospital patients, and the outcomes are patient discharge or continued hospitalisation. Using machine learning as a predictive diagnostic tool to identify patterns in large quantities of electronic health record data in real time is a very attractive approach for supporting clinical decisions, which have the potential to improve patient outcomes and reduce waiting times for discharge. Here we report some preliminary analysis to show how machine learning might be applied. In particular, we visualize the fitting landscape in terms of locally optimal neural networks and the connections between them in parameter space. We anticipate that these results, and analogues of thermodynamic properties for molecular systems, may help in the future design of improved predictive tools.

  4. Improving Organizational Learning: Defining Units of Learning from Social Tools

    ERIC Educational Resources Information Center

    Menolli, André Luís Andrade; Reinehr, Sheila; Malucelli, Andreia

    2013-01-01

    New technologies, such as social networks, wikis, blogs and other social tools, enable collaborative work and are important facilitators of the social learning process. Many companies are using these types of tools as substitutes for their intranets, especially software development companies. However, the content generated by these tools in many…

  5. Application of Learning Machines and Combinatorial Algorithms in Water Resources Management and Hydrologic Sciences

    SciTech Connect

    Khalil, Abedalrazq F.; Kaheil, Yasir H.; Gill, Kashif; Mckee, Mac

    2010-01-01

    Contemporary water resources engineering and management rely increasingly on pattern recognition techniques that have the ability to capitalize on the unrelenting accumulation of data that is made possible by modern information technology and remote sensing methods. In response to the growing information needs of modern water systems, advanced computational models and tools have been devised to identify and extract relevant information from the mass of data that is now available. This chapter presents innovative applications from computational learning science within the fields of hydrology, hydrogeology, hydroclimatology, and water management. The success of machine learning is evident from the growing number of studies involving the application of Artificial Neural Networks (ANN), Support Vector Machines (SVM), Relevance Vector Machines (RVM), and Locally Weighted Projection Regression (LWPR) to address various issues in hydrologic sciences. The applications that will be discussed within the chapter employ the abovementioned machine learning techniques for intelligent modeling of reservoir operations, temporal downscaling of precipitation, spatial downscaling of soil moisture and evapotranspiration, comparisons of various techniques for groundwater quality modeling, and forecasting of chaotic time series behavior. Combinatorial algorithms to capture the intrinsic complexities in the modeled phenomena and to overcome disparate scales are developed; for example, learning machines have been coupled with geostatistical techniques, non-homogeneous hidden Markov models, wavelets, and evolutionary computing techniques. This chapter does not intend to be exhaustive; it reviews the progress that has been made over the past decade in the use of learning machines in applied hydrologic sciences and presents a summary of future needs and challenges for further advancement of these methods.

  6. Predicting single-molecule conductance through machine learning

    NASA Astrophysics Data System (ADS)

    Lanzillo, Nicholas A.; Breneman, Curt M.

    2016-10-01

    We present a robust machine learning model that is trained on the experimentally determined electrical conductance values of approximately 120 single-molecule junctions used in scanning tunnelling microscope molecular break junction (STM-MBJ) experiments. Quantum mechanical, chemical, and topological descriptors are used to correlate each molecular structure with a conductance value, and the resulting machine-learning model can predict the corresponding value of conductance with correlation coefficients of r^2 = 0.95 for the training set and r^2 = 0.78 for a blind testing set. While neglecting entirely the effects of the metal contacts, this work demonstrates that single molecule conductance can be qualitatively correlated with a number of molecular descriptors through a suitably trained machine learning model. The dominant features in the machine learning model include those based on the electronic wavefunction, the geometry/topology of the molecule as well as the surface chemistry of the molecule. This model can be used to identify promising molecular structures for use in single-molecule electronic circuits and can guide synthesis and experiments in the future.

  7. Acquiring Software Design Schemas: A Machine Learning Perspective

    NASA Technical Reports Server (NTRS)

    Harandi, Mehdi T.; Lee, Hing-Yan

    1991-01-01

    In this paper, we describe an approach based on machine learning that acquires software design schemas from design cases of existing applications. An overview of the technique, design representation, and acquisition system are presented. The paper also addresses issues associated with generalizing common features such as biases. The generalization process is illustrated using an example.

  8. Machine learning of fault characteristics from rocket engine simulation data

    NASA Technical Reports Server (NTRS)

    Ke, Min; Ali, Moonis

    1990-01-01

    Transformation of data into knowledge through conceptual induction has been the focus of our research described in this paper. We have developed a Machine Learning System (MLS) to analyze the rocket engine simulation data. MLS can provide to its users fault analysis, characteristics, and conceptual descriptions of faults, and the relationships of attributes and sensors. All the results are critically important in identifying faults.

  9. An efficient learning procedure for deep Boltzmann machines.

    PubMed

    Salakhutdinov, Ruslan; Hinton, Geoffrey

    2012-08-01

    We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer pretraining phase that initializes the weights sensibly. The pretraining also allows the variational inference to be initialized sensibly with a single bottom-up pass. We present results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects. We also show that the features discovered by deep Boltzmann machines are a very effective way to initialize the hidden layers of feedforward neural nets, which are then discriminatively fine-tuned.

  10. Testing and Validating Machine Learning Classifiers by Metamorphic Testing.

    PubMed

    Xie, Xiaoyuan; Ho, Joshua W K; Murphy, Christian; Kaiser, Gail; Xu, Baowen; Chen, Tsong Yueh

    2011-04-01

    Machine Learning algorithms have provided core functionality to many application domains - such as bioinformatics, computational linguistics, etc. However, it is difficult to detect faults in such applications because often there is no "test oracle" to verify the correctness of the computed outputs. To help address software quality, in this paper we present a technique for testing the implementations of machine learning classification algorithms which support such applications. Our approach is based on the technique "metamorphic testing", which has been shown to be effective to alleviate the oracle problem. Also presented are a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method has high effectiveness in killing mutants, and that observing expected cross-validation result alone is not sufficiently effective to detect faults in a supervised classification program. The effectiveness of metamorphic testing is further confirmed by the detection of real faults in a popular open-source classification program.
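
    As a rough illustration of the idea, metamorphic testing checks relations that must hold between the outputs of related runs, so no oracle for any single output is needed. The sketch below uses a tiny hand-written k-nearest-neighbour classifier and two relations (shuffling the training set, and scaling all features by the same factor). The classifier, the toy data, and the choice of relations are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def knn_predict(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sorting (distance, label) pairs makes the result independent of input order.
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train, labels)
    )
    votes = Counter(y for _, y in scored[:k])
    best = max(votes.values())
    # Break vote ties by the smallest label so the outcome stays deterministic.
    return min(y for y, count in votes.items() if count == best)


def permutation_relation_holds(train, labels, queries, seed=0):
    """Relation 1: shuffling the training set must not change any prediction."""
    order = list(range(len(train)))
    random.Random(seed).shuffle(order)
    shuffled_x = [train[i] for i in order]
    shuffled_y = [labels[i] for i in order]
    return all(
        knn_predict(train, labels, q) == knn_predict(shuffled_x, shuffled_y, q)
        for q in queries
    )


def scaling_relation_holds(train, labels, queries, factor=2.0):
    """Relation 2: scaling every feature by one positive factor must not change any prediction."""
    def scale(point):
        return tuple(factor * v for v in point)

    scaled_train = [scale(x) for x in train]
    return all(
        knn_predict(train, labels, q) == knn_predict(scaled_train, labels, scale(q))
        for q in queries
    )


# Toy data, invented for this example.
TRAIN = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
LABELS = [0, 0, 0, 1, 1, 1]
QUERIES = [(0.5, 0.5), (5.5, 5.5), (3.0, 3.0)]

print(permutation_relation_holds(TRAIN, LABELS, QUERIES))
print(scaling_relation_holds(TRAIN, LABELS, QUERIES))
```

    A violated relation points to a fault in the implementation under test, even though the correct label for any individual query is never supplied.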

  11. Machine learning techniques for fault isolation and sensor placement

    NASA Technical Reports Server (NTRS)

    Carnes, James R.; Fisher, Douglas H.

    1993-01-01

    Fault isolation and sensor placement are vital for monitoring and diagnosis. A sensor conveys information about a system's state that guides troubleshooting if problems arise. We are using machine learning methods to uncover behavioral patterns over snapshots of system simulations that will aid fault isolation and sensor placement, with an eye towards minimality, fault coverage, and noise tolerance.

  12. Coupling for joining a ball nut to a machine tool carriage

    DOEpatents

    Gerth, Howard L.

    1979-01-01

    The present invention relates to an improved coupling for joining a lead screw ball nut to a machine tool carriage. The ball nut is coupled to the machine tool carriage by a plurality of laterally flexible bolts which function as hinges during the rotation of the lead screw for substantially reducing lateral carriage movement due to wobble in the lead screw.

  13. Machine and Woodworking Tool Safety. Module SH-24. Safety and Health.

    ERIC Educational Resources Information Center

    Center for Occupational Research and Development, Inc., Waco, TX.

    This student module on machine and woodworking tool safety is one of 50 modules concerned with job safety and health. This module discusses specific practices and precautions concerned with the efficient operation and use of most machine and woodworking tools in use today. Following the introduction, 13 objectives (each keyed to a page in the…

  14. Effects of machining parameters on tool life and its optimization in turning mild steel with brazed carbide cutting tool

    NASA Astrophysics Data System (ADS)

    Dasgupta, S.; Mukherjee, S.

    2016-09-01

    One of the most significant factors in metal cutting is tool life. In this research work, the effects of machining parameters on tool life under wet machining environment were studied. Tool life characteristics of brazed carbide cutting tool machined against mild steel and optimization of machining parameters based on Taguchi design of experiments were examined. The experiments were conducted using three factors, spindle speed, feed rate and depth of cut, each having three levels. Nine experiments were performed on a high speed semi-automatic precision central lathe. ANOVA was used to determine the level of importance of the machining parameters on tool life. The optimum machining parameter combination was obtained by the analysis of S/N ratio. A mathematical model based on multiple regression analysis was developed to predict the tool life. Taguchi's orthogonal array analysis revealed the optimal combination of parameters at lower levels of spindle speed, feed rate and depth of cut, which are 550 rpm, 0.2 mm/rev and 0.5 mm respectively. The Main Effects plot reiterated the same. The variation of tool life with different process parameters has been plotted. Feed rate has the most significant effect on tool life followed by spindle speed and depth of cut.
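
    The S/N analysis mentioned above can be sketched as follows. The abstract does not say which S/N definition was used; this example assumes the common larger-the-better form, S/N = -10 log10((1/n) Σ 1/y_i^2), which is the usual choice when the response (tool life) should be maximized. The design is the first three columns of the standard L9 orthogonal array, and the tool-life values are invented placeholders, not the paper's measurements.

```python
import math

# First three columns of the standard L9 orthogonal array (levels 1 to 3).
L9 = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 2), (2, 2, 3), (2, 3, 1),
    (3, 1, 3), (3, 2, 1), (3, 3, 2),
]
FACTORS = ("spindle speed", "feed rate", "depth of cut")

# Invented tool-life values in minutes, one per run (placeholders only).
TOOL_LIFE_MIN = [60.0, 44.0, 28.0, 48.0, 32.0, 28.0, 36.0, 32.0, 16.0]


def sn_larger_is_better(values):
    """Taguchi S/N ratio in dB for a response that should be maximized."""
    return -10.0 * math.log10(sum(1.0 / (y * y) for y in values) / len(values))


def main_effects(design, sn):
    """Mean S/N ratio per level, for each factor column of the design."""
    effects = []
    for col in range(len(design[0])):
        by_level = {}
        for run, ratio in zip(design, sn):
            by_level.setdefault(run[col], []).append(ratio)
        effects.append({lvl: sum(v) / len(v) for lvl, v in sorted(by_level.items())})
    return effects


sn_ratios = [sn_larger_is_better([y]) for y in TOOL_LIFE_MIN]
effects = main_effects(L9, sn_ratios)

# The preferred level of each factor is the one with the highest mean S/N.
best_levels = [max(e, key=e.get) for e in effects]
for name, level in zip(FACTORS, best_levels):
    print(f"{name}: level {level}")
```

    With replicated runs, each call to `sn_larger_is_better` would receive all repeat measurements for that run instead of a single value.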

  15. Rapid Probabilistic Source Inversion Using Machine Learning Techniques

    NASA Astrophysics Data System (ADS)

    Kaeufl, P.; Valentine, A. P.; Trampert, J.

    2013-12-01

    Determination of earthquake source parameters is an important task in seismology. For many applications, it is also valuable to understand the uncertainties associated with these determinations, and this is particularly true in the context of earthquake early warning and hazard mitigation. We present a framework for probabilistic centroid moment tensor point source inversions in near real-time, applicable to a wide variety of data types. Our methodology allows us to find an approximation to p(m|d), the conditional probability of source parameters (m) given observations (d). This approximation is obtained by smoothly interpolating a set of random prior samples, using a machine learning algorithm able to learn the mapping from d to m. The approximation obtained can be evaluated within milliseconds on a standard desktop computer for a new observation (d). This makes the method well suited for use in situations such as earthquake early warning, where inversions must be performed routinely, for a fixed station geometry, and where it is important that results are obtained rapidly. This is a major advantage over traditional sampling-based techniques, such as Markov-Chain Monte-Carlo methods, where a re-sampling of the posterior is necessary every time a new observation is made. We demonstrated the method by applying it to a regional static GPS displacement data set for the 2010 Mw 7.2 El Mayor-Cucapah earthquake in Baja California and obtained estimates of logarithmic magnitude, centroid location and depth, and focal mechanism (Käufl et al., submitted). We will present an extension of this approach to the inversion of full waveforms and explore possibilities for jointly inverting seismic and geodetic data. (1) P. Käufl, A. P. Valentine, T.B. O'Toole, J. Trampert, submitted, Geophysical Journal International

  16. A 128-Channel Extreme Learning Machine-Based Neural Decoder for Brain Machine Interfaces.

    PubMed

    Chen, Yi; Yao, Enyi; Basu, Arindam

    2016-06-01

    Currently, state-of-the-art motor intention decoding algorithms in brain-machine interfaces are mostly implemented on a PC and consume a significant amount of power. A machine learning coprocessor in 0.35-μm CMOS for motor intention decoding in brain-machine interfaces is presented in this paper. Using the Extreme Learning Machine algorithm and low-power analog processing, it achieves an energy efficiency of 3.45 pJ/MAC at a classification rate of 50 Hz. The learning in the second stage and the corresponding digitally stored coefficients are used to increase the robustness of the core analog processor. The chip is verified with neural data recorded in a monkey finger movement experiment, achieving a decoding accuracy of 99.3% for movement type. The same coprocessor is also used to decode time of movement from asynchronous neural spikes. With time-delayed feature dimension enhancement, the classification accuracy can be increased by 5% with a limited number of input channels. Further, a sparsity-promoting training scheme enables reduction of the number of programmable weights by ≈ 2X.

  17. The immune system, adaptation, and machine learning

    NASA Astrophysics Data System (ADS)

    Farmer, J. Doyne; Packard, Norman H.; Perelson, Alan S.

    1986-10-01

    The immune system is capable of learning, memory, and pattern recognition. By employing genetic operators on a time scale fast enough to observe experimentally, the immune system is able to recognize novel shapes without preprogramming. Here we describe a dynamical model for the immune system that is based on the network hypothesis of Jerne, and is simple enough to simulate on a computer. This model has a strong similarity to an approach to learning and artificial intelligence introduced by Holland, called the classifier system. We demonstrate that simple versions of the classifier system can be cast as a nonlinear dynamical system, and explore the analogy between the immune and classifier systems in detail. Through this comparison we hope to gain insight into the way they perform specific tasks, and to suggest new approaches that might be of value in learning systems.

  18. Machine learning and predictive data analytics enabling metrology and process control in IC fabrication

    NASA Astrophysics Data System (ADS)

    Rana, Narender; Zhang, Yunlin; Wall, Donald; Dirahoui, Bachir; Bailey, Todd C.

    2015-03-01

    Integrated circuit (IC) technology is going through multiple changes in terms of patterning techniques (multiple patterning, EUV and DSA), device architectures (FinFET, nanowire, graphene) and patterning scale (few nanometers). These changes require tight controls on processes and measurements to achieve the required device performance, and challenge the metrology and process control in terms of capability and quality. Multivariate data with complex nonlinear trends and correlations generally cannot be described well by mathematical or parametric models but can be relatively easily learned by computing machines and used to predict or extrapolate. This paper introduces the predictive metrology approach which has been applied to three different applications. Machine learning and predictive analytics have been leveraged to accurately predict dimensions of EUV resist patterns down to 18 nm half pitch leveraging resist shrinkage patterns. These patterns could not be directly and accurately measured due to metrology tool limitations. Machine learning has also been applied to predict the electrical performance early in the process pipeline for deep trench capacitance and metal line resistance. As the wafer goes through various processes its associated cost multiplies. It may take days to weeks to get the electrical performance readout. Predicting the electrical performance early on can be very valuable in enabling timely, actionable decisions such as rework, scrap, or feeding predicted information, or information derived from the prediction, forward or back to improve or monitor processes. This paper provides a general overview of machine learning and advanced analytics application in the advanced semiconductor development and manufacturing.

  19. Advances in Climate Informatics: Accelerating Discovery in Climate Science with Machine Learning

    NASA Astrophysics Data System (ADS)

    Monteleoni, C.

    2015-12-01

    Despite the scientific consensus on climate change, drastic uncertainties remain. The climate system is characterized by complex phenomena that are imperfectly observed and even more imperfectly simulated. Climate data is Big Data, yet the magnitude of data and climate model output increasingly overwhelms the tools currently used to analyze them. Computational innovation is therefore needed. Machine learning is a cutting-edge research area at the intersection of computer science and statistics, focused on developing algorithms for big data analytics. Machine learning has revolutionized scientific discovery (e.g. Bioinformatics), and spawned new technologies (e.g. Web search). The impact of machine learning on climate science promises to be similarly profound. The goal of the novel interdisciplinary field of Climate Informatics is to accelerate discovery in climate science with machine learning, in order to shed light on urgent questions about climate change. In this talk, I will survey my research group's progress in the emerging field of climate informatics. Our work includes algorithms to improve the combined predictions of the IPCC multi-model ensemble, applications to seasonal and subseasonal prediction, and a data-driven technique to detect and define extreme events.

  1. Machine learning approach for the outcome prediction of temporal lobe epilepsy surgery.

    PubMed

    Armañanzas, Rubén; Alonso-Nanclares, Lidia; Defelipe-Oroquieta, Jesús; Kastanauskaite, Asta; de Sola, Rafael G; Defelipe, Javier; Bielza, Concha; Larrañaga, Pedro

    2013-01-01

    Epilepsy surgery is effective in reducing both the number and frequency of seizures, particularly in temporal lobe epilepsy (TLE). Nevertheless, a significant proportion of these patients continue suffering seizures after surgery. Here we used a machine learning approach to predict the outcome of epilepsy surgery based on supervised classification data mining taking into account not only the common clinical variables, but also pathological and neuropsychological evaluations. We have generated models capable of predicting whether a patient with TLE secondary to hippocampal sclerosis will fully recover from epilepsy or not. The machine learning analysis revealed that outcome could be predicted with an estimated accuracy of almost 90% using some clinical and neuropsychological features. Importantly, not all the features were needed to perform the prediction; some of them proved to be irrelevant to the prognosis. Personality style was found to be one of the key features to predict the outcome. Although we examined relatively few cases, findings were verified across all data, showing that the machine learning approach described in the present study may be a powerful method. Since neuropsychological assessment of epileptic patients is a standard protocol in the pre-surgical evaluation, we propose to include these specific psychological tests and machine learning tools to improve the selection of candidates for epilepsy surgery. PMID:23646148

  2. Machine Learning Approach for the Outcome Prediction of Temporal Lobe Epilepsy Surgery

    PubMed Central

    DeFelipe-Oroquieta, Jesús; Kastanauskaite, Asta; de Sola, Rafael G.; DeFelipe, Javier; Bielza, Concha; Larrañaga, Pedro

    2013-01-01

    Epilepsy surgery is effective in reducing both the number and frequency of seizures, particularly in temporal lobe epilepsy (TLE). Nevertheless, a significant proportion of these patients continue suffering seizures after surgery. Here we used a machine learning approach to predict the outcome of epilepsy surgery based on supervised classification data mining taking into account not only the common clinical variables, but also pathological and neuropsychological evaluations. We have generated models capable of predicting whether a patient with TLE secondary to hippocampal sclerosis will fully recover from epilepsy or not. The machine learning analysis revealed that outcome could be predicted with an estimated accuracy of almost 90% using some clinical and neuropsychological features. Importantly, not all the features were needed to perform the prediction; some of them proved to be irrelevant to the prognosis. Personality style was found to be one of the key features to predict the outcome. Although we examined relatively few cases, findings were verified across all data, showing that the machine learning approach described in the present study may be a powerful method. Since neuropsychological assessment of epileptic patients is a standard protocol in the pre-surgical evaluation, we propose to include these specific psychological tests and machine learning tools to improve the selection of candidates for epilepsy surgery. PMID:23646148

  3. Efficiently Ranking Hypotheses in Machine Learning

    NASA Technical Reports Server (NTRS)

    Chien, Steve

    1997-01-01

    This paper considers the problem of learning the ranking of a set of alternatives based upon incomplete information (e.g. a limited number of observations). At each decision cycle, the system can output a complete ordering on the hypotheses or decide to gather additional information (e.g. observation) at some cost.

  4. Protein secondary structure prediction using logic-based machine learning.

    PubMed

    Muggleton, S; King, R D; Sternberg, M J

    1992-10-01

    Many attempts have been made to solve the problem of predicting protein secondary structure from the primary sequence but the best performance results are still disappointing. In this paper, the use of a machine learning algorithm which allows relational descriptions is shown to lead to improved performance. The Inductive Logic Programming computer program, Golem, was applied to learning secondary structure prediction rules for alpha/alpha domain type proteins. The input to the program consisted of 12 non-homologous proteins (1612 residues) of known structure, together with a background knowledge describing the chemical and physical properties of the residues. Golem learned a small set of rules that predict which residues are part of the alpha-helices--based on their positional relationships and chemical and physical properties. The rules were tested on four independent non-homologous proteins (416 residues) giving an accuracy of 81% (+/- 2%). This is an improvement, on identical data, over the previously reported result of 73% by King and Sternberg (1990, J. Mol. Biol., 216, 441-457) using the machine learning program PROMIS, and of 72% using the standard Garnier-Osguthorpe-Robson method. The best previously reported result in the literature for the alpha/alpha domain type is 76%, achieved using a neural net approach. Machine learning also has the advantage over neural network and statistical methods in producing more understandable results. PMID:1480619

  5. Stacking for machine learning redshifts applied to SDSS galaxies

    NASA Astrophysics Data System (ADS)

    Zitlau, Roman; Hoyle, Ben; Paech, Kerstin; Weller, Jochen; Rau, Markus Michael; Seitz, Stella

    2016-08-01

    We present an analysis of a general machine learning technique called `stacking' for the estimation of photometric redshifts. Stacking techniques can feed the photometric redshift estimate, as output by a base algorithm, back into the same algorithm as an additional input feature in a subsequent learning round. We show how all tested base algorithms benefit from at least one additional stacking round (or layer). To demonstrate the benefit of stacking, we apply the method to both unsupervised machine learning techniques based on self-organizing maps (SOMs), and supervised machine learning methods based on decision trees. We explore a range of stacking architectures, such as the number of layers and the number of base learners per layer. Finally we explore the effectiveness of stacking even when using a successful algorithm such as AdaBoost. We observe a significant improvement of between 1.9 per cent and 21 per cent on all computed metrics when stacking is applied to weak learners (such as SOMs and decision trees). When applied to strong learning algorithms (such as AdaBoost) the ratio of improvement shrinks, but still remains positive and is between 0.4 per cent and 2.5 per cent for the explored metrics and comes at almost no additional computational cost.
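
    The feedback step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the base learner here is a toy k-nearest-neighbour regressor (the paper uses SOMs, decision trees and AdaBoost), the data are synthetic, and the leave-one-out estimate on the training set is an assumption made to keep each point from seeing its own target. The names `knn_predict`, `stack_round` and `run_demo` are hypothetical.

```python
import random


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k nearest training points (squared Euclidean)."""
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def stack_round(train_x, train_y, test_x, k=3):
    """One stacking layer: append the base estimate as an extra input feature."""
    # Assumption: leave-one-out estimates on the training set, so that no
    # training point is predicted from its own target value.
    train_est = [
        knn_predict(train_x[:i] + train_x[i + 1:], train_y[:i] + train_y[i + 1:], x, k)
        for i, x in enumerate(train_x)
    ]
    test_est = [knn_predict(train_x, train_y, x, k) for x in test_x]
    new_train = [x + [e] for x, e in zip(train_x, train_est)]
    new_test = [x + [e] for x, e in zip(test_x, test_est)]
    return new_train, new_test


def run_demo(layers=2, seed=0):
    """Build synthetic data, apply `layers` stacking rounds, return final predictions."""
    rng = random.Random(seed)
    train_x = [[rng.random(), rng.random()] for _ in range(40)]
    test_x = [[rng.random(), rng.random()] for _ in range(10)]
    train_y = [x[0] + 0.5 * x[1] for x in train_x]  # synthetic target, not redshifts
    for _ in range(layers):
        train_x, test_x = stack_round(train_x, train_y, test_x)
    preds = [knn_predict(train_x, train_y, x) for x in test_x]
    return train_x, test_x, preds
```

    Each round widens the feature vector by one column, so the number of layers is visible directly in the input dimension. Whether a round helps depends on the base learner and the data; the sketch makes no claim about accuracy gains.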

  6. Research on knowledge representation, machine learning, and knowledge acquisition

    NASA Technical Reports Server (NTRS)

    Buchanan, Bruce G.

    1987-01-01

    Research in knowledge representation, machine learning, and knowledge acquisition performed at the Knowledge Systems Lab is summarized. The major goal of the research was to develop flexible, effective methods for representing the qualitative knowledge necessary for solving large problems that require symbolic reasoning as well as numerical computation. The research focused on integrating different representation methods to describe different kinds of knowledge more effectively than any one method can alone. In particular, emphasis was placed on representing and using spatial information about three dimensional objects and constraints on the arrangement of these objects in space. Another major theme is the development of robust machine learning programs that can be integrated with a variety of intelligent systems. To achieve this goal, learning methods were designed, implemented and experimented with in several different problem solving environments.

  7. Combining data mining and machine learning for effective user profiling

    SciTech Connect

    Fawcett, T.; Provost, F.

    1996-12-31

    This paper describes the automatic design of methods for detecting fraudulent behavior. Much of the design is accomplished using a series of machine learning methods. In particular, we combine data mining and constructive induction with more standard machine learning techniques to design methods for detecting fraudulent usage of cellular telephones based on profiling customer behavior. Specifically, we use a rule-learning program to uncover indicators of fraudulent behavior from a large database of cellular calls. These indicators are used to create profilers, which then serve as features to a system that combines evidence from multiple profilers to generate high-confidence alarms. Experiments indicate that this automatic approach performs nearly as well as the best hand-tuned methods for detecting fraud.

  8. Bots as Language Learning Tools

    ERIC Educational Resources Information Center

    Fryer, Luke; Carpenter, Rollo

    2006-01-01

    Foreign Language Learning (FLL) students commonly have few opportunities to use their target language. Teachers in FLL situations do their best to create opportunities during classes through pair or group work, but a variety of factors ranging from a lack of time to shyness or limited opportunity for quality feedback hamper this. This paper…

  9. Problem-Based Learning Tools

    ERIC Educational Resources Information Center

    Chin, Christine; Chia, Li-Gek

    2008-01-01

    One way of implementing project-based science (PBS) is to use problem-based learning (PBL), in which students formulate their own problems. These problems are often ill-structured, mirroring complex real-life problems where data are often messy and inclusive. In this article, the authors describe how they used PBL in a ninth-grade biology class in…

  10. The development of a two-component force dynamometer and tool control system for dynamic machine tool research

    NASA Technical Reports Server (NTRS)

    Sutherland, I. A.

    1973-01-01

    The development is presented of a tooling system that makes a controlled sinusoidal oscillation simulating a dynamic chip removal condition. It also measures the machining forces in two mutually perpendicular directions without any cross sensitivity.

  11. Machine learning bandgaps of double perovskites

    NASA Astrophysics Data System (ADS)

    Pilania, Ghanshyam; Mannodi-Kanakkithodi, Arun; Uberuaga, Blas; Ramprasad, Rampi; Gubernatis, James; Lookman, Turab

    The ability to make rapid and accurate predictions of bandgaps for double perovskites is of much practical interest for a range of applications. While quantum mechanical computations for high-fidelity bandgaps are enormously computation-time intensive and thus impractical in high throughput studies, informatics-based statistical learning approaches can be a promising alternative. Here we demonstrate a systematic feature-engineering approach and a robust learning framework for efficient and accurate predictions of electronic bandgaps for double perovskites. After evaluating a set of nearly 1.2 million features, we identify several elemental features of the constituent atomic species as the most crucial and relevant predictors. The developed models are validated and tested using the best practices of data science (on a dataset of more than 1300 double perovskite bandgaps) and further analyzed to rationalize their prediction performance. Los Alamos National Laboratory LDRD program and the U.S. Department of Energy, Office of Science, Basic Energy Sciences.

  12. Machine learning bandgaps of double perovskites

    NASA Astrophysics Data System (ADS)

    Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B. P.; Ramprasad, R.; Gubernatis, J. E.; Lookman, T.

    2016-01-01

    The ability to make rapid and accurate predictions on bandgaps of double perovskites is of much practical interest for a range of applications. While quantum mechanical computations for high-fidelity bandgaps are enormously computation-time intensive and thus impractical in high throughput studies, informatics-based statistical learning approaches can be a promising alternative. Here we demonstrate a systematic feature-engineering approach and a robust learning framework for efficient and accurate predictions of electronic bandgaps of double perovskites. After evaluating a set of more than 1.2 million features, we identify lowest occupied Kohn-Sham levels and elemental electronegativities of the constituent atomic species as the most crucial and relevant predictors. The developed models are validated and tested using the best practices of data science and further analyzed to rationalize their prediction performance.

  13. Machine learning bandgaps of double perovskites

    DOE PAGESBeta

    Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B. P.; Ramprasad, R.; Gubernatis, J. E.; Lookman, T.

    2016-01-19

    The ability to make rapid and accurate predictions on bandgaps of double perovskites is of much practical interest for a range of applications. While quantum mechanical computations for high-fidelity bandgaps are enormously computation-time intensive and thus impractical in high throughput studies, informatics-based statistical learning approaches can be a promising alternative. Here we demonstrate a systematic feature-engineering approach and a robust learning framework for efficient and accurate predictions of electronic bandgaps of double perovskites. After evaluating a set of more than 1.2 million features, we identify lowest occupied Kohn-Sham levels and elemental electronegativities of the constituent atomic species as the most crucial and relevant predictors. As a result, the developed models are validated and tested using the best practices of data science and further analyzed to rationalize their prediction performance.

  14. Machine learning bandgaps of double perovskites

    PubMed Central

    Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B. P.; Ramprasad, R.; Gubernatis, J. E.; Lookman, T.

    2016-01-01

    The ability to make rapid and accurate predictions on bandgaps of double perovskites is of much practical interest for a range of applications. While quantum mechanical computations for high-fidelity bandgaps are enormously computation-time intensive and thus impractical in high throughput studies, informatics-based statistical learning approaches can be a promising alternative. Here we demonstrate a systematic feature-engineering approach and a robust learning framework for efficient and accurate predictions of electronic bandgaps of double perovskites. After evaluating a set of more than 1.2 million features, we identify lowest occupied Kohn-Sham levels and elemental electronegativities of the constituent atomic species as the most crucial and relevant predictors. The developed models are validated and tested using the best practices of data science and further analyzed to rationalize their prediction performance. PMID:26783247

  15. The in-situ 3D measurement system combined with CNC machine tools

    NASA Astrophysics Data System (ADS)

    Zhao, Huijie; Jiang, Hongzhi; Li, Xudong; Sui, Shaochun; Tang, Limin; Liang, Xiaoyue; Diao, Xiaochun; Dai, Jiliang

    2013-06-01

    With the development of manufacturing industry, the in-situ 3D measurement for the machining workpieces in CNC machine tools is regarded as the new trend of efficient measurement. We introduce a 3D measurement system based on the stereovision and phase-shifting method combined with CNC machine tools, which can measure 3D profile of the machining workpieces between the key machining processes. The measurement system utilizes the method of high dynamic range fringe acquisition to solve the problem of saturation induced by specular lights reflected from shiny surfaces such as aluminum alloy workpiece or titanium alloy workpiece. We measured two workpieces of aluminum alloy on the CNC machine tools to demonstrate the effectiveness of the developed measurement system.

  16. Multimode vibration reduction concept for machine tools and automotive applications

    NASA Astrophysics Data System (ADS)

    Neugebauer, Reimund; Drossel, Welf-Guntram; Kranz, Burkhard; Kunze, Holger

    2005-05-01

    This paper reports a numerical and experimental study of a new multimode vibration reduction concept for machine tool struts and automotive shafts. The example described in detail validates the concept for highly dynamic parallel kinematic struts. The structural advantages of parallel kinematic mechanisms are undisputed. However, static and dynamic bending and torsional loads must be considered during the design of the structure and thus affect the shape of the strut geometry. The new actuator concept described here for multimode vibration reduction is intended to influence these bending and torsional loads. It uses piezo patches based on the MFC technology licensed by NASA. Initial simulations and experimental tests were carried out on an aluminium beam clamped at one side, with 45° MFCs applied on both sides. Simulation results show that driving the piezos in opposite directions leads to a bending deflection of the beam, while driving them in the same phase leads to a torsional deflection. Experimental measurements confirm the simulation results. The benefit is a reduced number of actuators for multimode vibration reduction. These actuators also allow the separation or selective combination of bending and torsion. The actuation concept is not limited to beams. Further simulations for cylindrical struts result in a design of an MFC ring with eight segments with changing fiber orientation for the separation of bending and torsion on struts and shafts. Selectively controlled activation of the individual segments leads to bending in the x-direction, bending in the y-direction, or torsion.

  17. Collaborative Inquiry Learning: Models, tools, and challenges

    NASA Astrophysics Data System (ADS)

    Bell, Thorsten; Urhahne, Detlef; Schanze, Sascha; Ploetzner, Rolf

    2010-02-01

    Collaborative inquiry learning is one of the most challenging and exciting ventures for today's schools. It aims at bringing a new and promising culture of teaching and learning into the classroom where students in groups engage in self-regulated learning activities supported by the teacher. It is expected that this way of learning fosters students' motivation and interest in science, that they learn to perform steps of inquiry similar to scientists and that they gain knowledge on scientific processes. Starting from general pedagogical reflections and science standards, the article reviews some prominent models of inquiry learning. This comparison results in a set of inquiry processes being the basis for cooperation in the scientific network NetCoIL. Inquiry learning is conceived in several ways with emphasis on different processes. For an illustration of the spectrum, some main conceptions of inquiry and their focuses are described. In the next step, the article describes exemplary computer tools and environments from within and outside the NetCoIL network that were designed to support processes of collaborative inquiry learning. These tools are analysed by describing their functionalities as well as effects on student learning known from the literature. The article closes with challenges for further developments elaborated by the NetCoIL network.

  18. Machine Learning of Hierarchical Clustering to Segment 2D and 3D Images

    PubMed Central

    Nunez-Iglesias, Juan; Kennedy, Ryan; Parag, Toufiq; Shi, Jianbo; Chklovskii, Dmitri B.

    2013-01-01

    We aim to improve segmentation through the use of machine learning tools during region agglomeration. We propose an active learning approach for performing hierarchical agglomerative segmentation from superpixels. Our method combines multiple features at all scales of the agglomerative process, works for data with an arbitrary number of dimensions, and scales to very large datasets. We advocate the use of variation of information to measure segmentation accuracy, particularly in 3D electron microscopy (EM) images of neural tissue, and using this metric demonstrate an improvement over competing algorithms in EM and natural images. PMID:23977123
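
    The variation of information (VI) metric advocated here has a compact definition: for two labelings A and B of the same pixels, VI(A, B) = H(A|B) + H(B|A) = 2 H(A, B) - H(A) - H(B). The sketch below computes it in bits for two flat label sequences; the function name and the flat-list input format are illustrative assumptions, not the authors' code.

```python
from collections import Counter
from math import log2


def variation_of_information(seg_a, seg_b):
    """VI in bits between two labelings of the same elements (0 means identical partitions)."""
    n = len(seg_a)
    if n == 0 or n != len(seg_b):
        raise ValueError("segmentations must be non-empty and of equal length")

    def entropy(counts):
        # Shannon entropy of the empirical distribution given by the counts.
        return -sum((c / n) * log2(c / n) for c in counts.values())

    joint = Counter(zip(seg_a, seg_b))  # co-occurrence of label pairs
    return 2 * entropy(joint) - entropy(Counter(seg_a)) - entropy(Counter(seg_b))
```

    VI depends only on how elements are grouped, not on the label values themselves, so two segmentations that differ only by a renaming of segments score zero. Merging two true segments into one and splitting one segment into two are both penalized, which is why the metric suits the evaluation of agglomerative segmentation.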

  19. Learning by Design: Good Video Games as Learning Machines

    ERIC Educational Resources Information Center

    Gee, James Paul

    2005-01-01

    This article asks how good video and computer game designers manage to get new players to learn long, complex and difficult games. The short answer is that designers of good games have hit on excellent methods for getting people to learn and to enjoy learning. The longer answer is more complex. Integral to this answer are the good principles of…

  20. Machine learning strategies for systems with invariance properties

    DOE PAGESBeta

    Ling, Julia; Jones, Reese E.; Templeton, Jeremy Alan

    2016-05-06

    Here, in many scientific fields, empirical models are employed to facilitate computational simulations of engineering systems. For example, in fluid mechanics, empirical Reynolds stress closures enable computationally-efficient Reynolds-Averaged Navier-Stokes simulations. Likewise, in solid mechanics, constitutive relations between the stress and strain in a material are required in deformation analysis. Traditional methods for developing and tuning empirical models usually combine physical intuition with simple regression techniques on limited data sets. The rise of high-performance computing has led to a growing availability of high-fidelity simulation data, which open up the possibility of using machine learning algorithms, such as random forests or neural networks, to develop more accurate and general empirical models. A key question when using data-driven algorithms to develop these models is how domain knowledge should be incorporated into the machine learning process. This paper will specifically address physical systems that possess symmetry or invariance properties. Two different methods for teaching a machine learning model an invariance property are compared. In the first method, a basis of invariant inputs is constructed, and the machine learning model is trained upon this basis, thereby embedding the invariance into the model. In the second method, the algorithm is trained on multiple transformations of the raw input data until the model learns invariance to that transformation. Results are discussed for two case studies: one in turbulence modeling and one in crystal elasticity. It is shown that in both cases embedding the invariance property into the input features yields higher performance with significantly reduced computational training costs.

  1. Machine learning strategies for systems with invariance properties

    NASA Astrophysics Data System (ADS)

    Ling, Julia; Jones, Reese; Templeton, Jeremy

    2016-08-01

    In many scientific fields, empirical models are employed to facilitate computational simulations of engineering systems. For example, in fluid mechanics, empirical Reynolds stress closures enable computationally-efficient Reynolds Averaged Navier Stokes simulations. Likewise, in solid mechanics, constitutive relations between the stress and strain in a material are required in deformation analysis. Traditional methods for developing and tuning empirical models usually combine physical intuition with simple regression techniques on limited data sets. The rise of high performance computing has led to a growing availability of high fidelity simulation data. These data open up the possibility of using machine learning algorithms, such as random forests or neural networks, to develop more accurate and general empirical models. A key question when using data-driven algorithms to develop these empirical models is how domain knowledge should be incorporated into the machine learning process. This paper will specifically address physical systems that possess symmetry or invariance properties. Two different methods for teaching a machine learning model an invariance property are compared. In the first method, a basis of invariant inputs is constructed, and the machine learning model is trained upon this basis, thereby embedding the invariance into the model. In the second method, the algorithm is trained on multiple transformations of the raw input data until the model learns invariance to that transformation. Results are discussed for two case studies: one in turbulence modeling and one in crystal elasticity. It is shown that in both cases embedding the invariance property into the input features yields higher performance at significantly reduced computational training costs.
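
    The two strategies compared in this abstract can be illustrated on a toy problem. The sketch below assumes rotation invariance of a 2D vector input, which is a deliberately simple stand-in for the tensor invariants used in the turbulence and elasticity case studies; all function names are hypothetical.

```python
import math


def rotate(v, theta):
    """Rotate a 2D vector counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def invariant_features(v):
    """Method 1: replace the raw input by a basis of rotation-invariant quantities."""
    return (v[0] ** 2 + v[1] ** 2,)  # squared norm is unchanged by any rotation


def augment(samples, n_rotations=8):
    """Method 2: keep raw inputs, but add rotated copies that share the original label."""
    augmented = []
    for v, label in samples:
        for k in range(n_rotations):
            theta = 2.0 * math.pi * k / n_rotations
            augmented.append((rotate(v, theta), label))
    return augmented
```

    With the first method the training set keeps its original size and any model trained on `invariant_features` is invariant by construction. With the second, the training set grows by a factor of `n_rotations` and invariance is only approximated, which is consistent with the higher training cost the abstract reports for that approach.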

  2. Remotely sensed data assimilation technique to develop machine learning models for use in water management

    NASA Astrophysics Data System (ADS)

    Zaman, Bushra

    Increasing population and water conflicts are making water management one of the most important issues of the present world. It has become absolutely necessary to find ways to manage water more efficiently. Technological advancement has introduced various techniques for data acquisition and analysis, and these tools can be used to address some of the critical issues that challenge water resource management. This research used learning machine techniques and information acquired through remote sensing, to solve problems related to soil moisture estimation and crop identification on large spatial scales. In this dissertation, solutions were proposed in three problem areas that can be important in the decision making process related to water management in irrigated systems. A data assimilation technique was used to build a learning machine model that generated soil moisture estimates commensurate with the scale of the data. The research was taken further by developing a multivariate machine learning algorithm to predict root zone soil moisture both in space and time. Further, a model was developed for supervised classification of multi-spectral reflectance data using a multi-class machine learning algorithm. The procedure was designed for classifying crops but the model is data dependent and can be used with other datasets and hence can be applied to other landcover classification problems. The dissertation compared the performance of relevance vector and the support vector machines in estimating soil moisture. A multivariate relevance vector machine algorithm was tested in the spatio-temporal prediction of soil moisture, and the multi-class relevance vector machine model was used for classifying different crop types. It was concluded that the classification scheme may uncover important data patterns contributing greatly to knowledge bases, and to scientific and medical research. The results for the soil moisture models would give a rough idea to farmers

  3. Force Sensor Based Tool Condition Monitoring Using a Heterogeneous Ensemble Learning Model

    PubMed Central

    Wang, Guofeng; Yang, Yinwei; Li, Zhimeng

    2014-01-01

    Tool condition monitoring (TCM) plays an important role in improving machining efficiency and guaranteeing workpiece quality. In order to realize reliable recognition of the tool condition, a robust classifier needs to be constructed to depict the relationship between tool wear states and sensory information. However, because of the complexity of the machining process and the uncertainty of the tool wear evolution, it is hard for a single classifier to fit all the collected samples without sacrificing generalization ability. In this paper, heterogeneous ensemble learning is proposed to realize tool condition monitoring in which the support vector machine (SVM), hidden Markov model (HMM) and radial basis function (RBF) are selected as base classifiers and a stacking ensemble strategy is further used to reflect the relationship between the outputs of these base classifiers and tool wear states. Based on the heterogeneous ensemble learning classifier, an online monitoring system is constructed in which the harmonic features are extracted from force signals and a minimal redundancy and maximal relevance (mRMR) algorithm is utilized to select the most prominent features. To verify the effectiveness of the proposed method, a titanium alloy milling experiment was carried out and samples with different tool wear states were collected to build the proposed heterogeneous ensemble learning classifier. Moreover, the homogeneous ensemble learning model and majority voting strategy are also adopted to make a comparison. The analysis and comparison results show that the proposed heterogeneous ensemble learning classifier performs better in both classification accuracy and stability. PMID:25405514
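
    The difference between the two combination strategies named in this abstract, majority voting and stacking, can be sketched in a few lines. The meta-learner below is a simple lookup table from base-classifier output patterns to the most frequent true label; that choice, the wear-state labels and the function names are illustrative assumptions, not the meta-classifier used in the paper.

```python
from collections import Counter, defaultdict


def majority_vote(base_outputs):
    """Return the label predicted by the largest number of base classifiers."""
    return Counter(base_outputs).most_common(1)[0][0]


def fit_stacker(base_output_rows, true_labels):
    """Learn which true label most often accompanies each pattern of base outputs."""
    table = defaultdict(Counter)
    for row, label in zip(base_output_rows, true_labels):
        table[tuple(row)][label] += 1
    return {row: counts.most_common(1)[0][0] for row, counts in table.items()}


def stacked_predict(stacker, base_outputs):
    """Use the learned mapping; fall back to a plain vote for unseen patterns."""
    return stacker.get(tuple(base_outputs), majority_vote(base_outputs))
```

    A vote treats all base classifiers as equally trustworthy, whereas a stacked combiner can learn that a particular disagreement pattern usually means one specific classifier is right. For example, if two classifiers regularly say "worn" while the third correctly says "fresh", a stacker trained on such cases predicts "fresh" where the vote would not.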

  4. CNC machine tool's wear diagnostic and prognostic by using dynamic Bayesian networks

    NASA Astrophysics Data System (ADS)

    Tobon-Mejia, D. A.; Medjaher, K.; Zerhouni, N.

    2012-04-01

    The failure of critical components in industrial systems may have negative consequences on availability, productivity, security and the environment. To avoid such situations, the health condition of the physical system, and particularly of its critical components, can be constantly assessed by using the monitoring data to perform on-line system diagnostics and prognostics. The present paper contributes to the assessment of the health condition of a computer numerical control (CNC) machine tool and the estimation of its remaining useful life (RUL). The proposed method relies on two main phases: an off-line phase and an on-line phase. During the first phase, the raw data provided by the sensors are processed to extract reliable features. These features are used as inputs to learning algorithms in order to generate models that represent the wear behavior of the cutting tool. Then, in the second (assessment) phase, the constructed models are exploited to identify the tool's current health state and to predict its RUL and the associated confidence bounds. The proposed method is applied to a benchmark of condition monitoring data gathered during several cuts of a CNC tool. Simulation results are obtained and discussed at the end of the paper.

  5. Application of machine learning and expert systems to Statistical Process Control (SPC) chart interpretation

    NASA Technical Reports Server (NTRS)

    Shewhart, Mark

    1991-01-01

    Statistical Process Control (SPC) charts are one of several tools used in quality control. Other tools include flow charts, histograms, cause and effect diagrams, check sheets, Pareto diagrams, graphs, and scatter diagrams. A control chart is simply a graph which indicates process variation over time. The purpose of drawing a control chart is to detect any changes in the process signalled by abnormal points or patterns on the graph. The Artificial Intelligence Support Center (AISC) of the Acquisition Logistics Division has developed a hybrid machine learning expert system prototype which automates the process of constructing and interpreting control charts.
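
    The detection of "abnormal points" on a control chart can be sketched as follows. This assumes the common three-sigma rule with limits estimated from an in-control baseline using the sample standard deviation, which is a simplification (individuals charts often estimate spread from moving ranges); it is not a description of the AISC prototype, and the function names are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3.0):
    """Return (lower limit, center line, upper limit) from in-control baseline data."""
    center = mean(baseline)
    spread = stdev(baseline)  # assumption: sample standard deviation as the spread estimate
    return center - n_sigma * spread, center, center + n_sigma * spread


def out_of_control(points, limits):
    """Indices of points that fall outside the lower or upper control limit."""
    lower, _, upper = limits
    return [i for i, x in enumerate(points) if x < lower or x > upper]
```

    Pattern-based signals, such as long runs on one side of the center line, would need additional rules on top of this point-wise check.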

  7. Machine Learning Methods for Attack Detection in the Smart Grid.

    PubMed

    Ozay, Mete; Esnaola, Inaki; Yarman Vural, Fatos Tunay; Kulkarni, Sanjeev R; Poor, H Vincent

    2016-08-01

    Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and surmount constraints arising from the sparse structure of the problem in the proposed approach. Well-known batch and online learning algorithms (supervised and semisupervised) are employed with decision- and feature-level fusion to model the attack detection problem. The relationships between statistical and geometric properties of attack vectors employed in the attack scenarios and learning algorithms are analyzed to detect unobservable attacks using statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms can detect attacks with performances higher than attack detection algorithms that employ state vector estimation methods in the proposed attack detection framework. PMID:25807571

  8. Stochastic Synapses Enable Efficient Brain-Inspired Learning Machines.

    PubMed

    Neftci, Emre O; Pedroni, Bruno U; Joshi, Siddharth; Al-Shedivat, Maruan; Cauwenberghs, Gert

    2016-01-01

    Recent studies have shown that synaptic unreliability is a robust and sufficient mechanism for inducing the stochasticity observed in cortex. Here, we introduce Synaptic Sampling Machines (S2Ms), a class of neural network models that uses synaptic stochasticity as a means to Monte Carlo sampling and unsupervised learning. Similar to the original formulation of Boltzmann machines, these models can be viewed as a stochastic counterpart of Hopfield networks, but where stochasticity is induced by a random mask over the connections. Synaptic stochasticity plays the dual role of an efficient mechanism for sampling, and a regularizer during learning akin to DropConnect. A local synaptic plasticity rule implementing an event-driven form of contrastive divergence enables the learning of generative models in an on-line fashion. S2Ms perform equally well using discrete-timed artificial units (as in Hopfield networks) or continuous-timed leaky integrate and fire neurons. The learned representations are remarkably sparse and robust to reductions in bit precision and synapse pruning: removal of more than 75% of the weakest connections followed by cursory re-learning causes a negligible performance loss on benchmark classification tasks. The spiking neuron-based S2Ms outperform existing spike-based unsupervised learners, while potentially offering substantial advantages in terms of power and complexity, and are thus promising models for on-line learning in brain-inspired hardware. PMID:27445650

  9. Stochastic Synapses Enable Efficient Brain-Inspired Learning Machines

    PubMed Central

    Neftci, Emre O.; Pedroni, Bruno U.; Joshi, Siddharth; Al-Shedivat, Maruan; Cauwenberghs, Gert

    2016-01-01

    Recent studies have shown that synaptic unreliability is a robust and sufficient mechanism for inducing the stochasticity observed in cortex. Here, we introduce Synaptic Sampling Machines (S2Ms), a class of neural network models that uses synaptic stochasticity as a means to Monte Carlo sampling and unsupervised learning. Similar to the original formulation of Boltzmann machines, these models can be viewed as a stochastic counterpart of Hopfield networks, but where stochasticity is induced by a random mask over the connections. Synaptic stochasticity plays the dual role of an efficient mechanism for sampling, and a regularizer during learning akin to DropConnect. A local synaptic plasticity rule implementing an event-driven form of contrastive divergence enables the learning of generative models in an on-line fashion. S2Ms perform equally well using discrete-timed artificial units (as in Hopfield networks) or continuous-timed leaky integrate and fire neurons. The learned representations are remarkably sparse and robust to reductions in bit precision and synapse pruning: removal of more than 75% of the weakest connections followed by cursory re-learning causes a negligible performance loss on benchmark classification tasks. The spiking neuron-based S2Ms outperform existing spike-based unsupervised learners, while potentially offering substantial advantages in terms of power and complexity, and are thus promising models for on-line learning in brain-inspired hardware. PMID:27445650
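
    The "random mask over the connections" mentioned in this abstract can be illustrated with a single unit whose synapses each transmit independently with some probability. This is only a sketch of the masking idea, in the spirit of DropConnect; it does not implement the Synaptic Sampling Machine, its spiking dynamics or its plasticity rule, and the names and parameters are assumptions.

```python
import random


def masked_input(weights, inputs, keep_prob, rng):
    """Summed input to one unit when each synapse transmits with probability keep_prob."""
    total = 0.0
    for w, x in zip(weights, inputs):
        if rng.random() < keep_prob:  # independent Bernoulli mask per connection
            total += w * x
    return total


def mean_masked_input(weights, inputs, keep_prob, n_samples, seed=0):
    """Monte Carlo average of the masked input over freshly drawn masks."""
    rng = random.Random(seed)
    draws = [masked_input(weights, inputs, keep_prob, rng) for _ in range(n_samples)]
    return sum(draws) / n_samples
```

    Because a new mask is drawn on every evaluation, repeated evaluations of the same input yield a distribution of values rather than a single number; on average the masked input equals `keep_prob` times the deterministic weighted sum.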

  10. Foam-machining tool with eddy-current transducer

    NASA Technical Reports Server (NTRS)

    Copper, W. P.

    1975-01-01

    Three-cutter machining system for foam-covered tanks incorporates eddy-current sensor. Sensor feeds signal to numerical controller which programs rotational and vertical axes of sensor travel, enabling cutterhead to profile around tank protrusions.

  11. ASAP: a machine learning framework for local protein properties

    PubMed Central

    Brandes, Nadav; Ofer, Dan; Linial, Michal

    2016-01-01

    Determining residue-level protein properties, such as sites of post-translational modifications (PTMs), is vital to understanding protein function. Experimental methods are costly and time-consuming, while traditional rule-based computational methods fail to annotate sites lacking substantial similarity. Machine Learning (ML) methods are becoming fundamental in annotating unknown proteins and their heterogeneous properties. We present ASAP (Amino-acid Sequence Annotation Prediction), a universal ML framework for predicting residue-level properties. ASAP extracts numerous features from raw sequences, and supports easy integration of external features such as secondary structure, solvent accessibility, intrinsic disorder or PSSM profiles. Features are then used to train ML classifiers. ASAP can create new classifiers within minutes for a variety of tasks, including PTM prediction (e.g. cleavage sites by convertase, phosphoserine modification). We present a detailed case study for ASAP: CleavePred, an ASAP-based model to predict protein precursor cleavage sites, with state-of-the-art results. Protein cleavage is a PTM shared by a wide variety of proteins sharing minimal sequence similarity. Current rule-based methods suffer from high false positive rates, making them suboptimal. The high performance of CleavePred makes it suitable for analyzing new proteomes at a genomic scale. The tool is attractive to protein design, mass spectrometry search engines and the discovery of new bioactive peptides from precursors. ASAP functions as a baseline approach for residue-level protein sequence prediction. CleavePred is freely accessible as a web-based application. Both ASAP and CleavePred are open-source with a flexible Python API. Database URL: ASAP’s and CleavePred source code, webtool and tutorials are available at: https://github.com/ddofer/asap; http://protonet.cs.huji.ac.il/cleavepred. PMID:27694209

  13. Machine learning classification of SDSS transient survey images

    NASA Astrophysics Data System (ADS)

    du Buisson, L.; Sivanandam, N.; Bassett, Bruce A.; Smith, M.

    2015-12-01

    We show that multiple machine learning algorithms can match human performance in classifying transient imaging data from the Sloan Digital Sky Survey (SDSS) supernova survey into real objects and artefacts. This is a first step in any transient science pipeline and is currently still done by humans, but future surveys such as the Large Synoptic Survey Telescope (LSST) will necessitate fully machine-enabled solutions. Using features trained from eigenimage analysis (principal component analysis, PCA) of single-epoch g, r and i difference images, we can reach a completeness (recall) of 96 per cent, while only incorrectly classifying at most 18 per cent of artefacts as real objects, corresponding to a precision (purity) of 84 per cent. In general, random forests performed best, followed by the k-nearest neighbour and the SkyNet artificial neural net algorithms, compared to other methods such as naive Bayes and kernel support vector machine. Our results show that PCA-based machine learning can match human success levels and can naturally be extended by including multiple epochs of data, transient colours and host galaxy information which should allow for significant further improvements, especially at low signal-to-noise.
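
    The three figures quoted in this abstract (96 per cent recall, at most 18 per cent of artefacts accepted as real, 84 per cent precision) are mutually consistent if the test sample contains equal numbers of real objects and artefacts. That balance is an assumption made here, not something the abstract states; the function name and the sample size of 1000 are likewise illustrative. A short Python check under that assumption:

```python
def precision_from_rates(recall, false_positive_rate, n_real, n_artefact):
    """Precision (purity) implied by a recall, a false positive rate,
    and the number of real objects and artefacts in the sample."""
    true_pos = recall * n_real                    # real objects kept
    false_pos = false_positive_rate * n_artefact  # artefacts accepted as real
    return true_pos / (true_pos + false_pos)

# Assumed balanced sample: 1000 real objects and 1000 artefacts.
p = precision_from_rates(0.96, 0.18, n_real=1000, n_artefact=1000)
print(round(p, 2))  # 0.84
```

    With an unbalanced sample the same recall and false positive rate would give a different precision, which is why the class ratio has to be stated alongside purity figures.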

  14. Prototype Vector Machine for Large Scale Semi-Supervised Learning

    SciTech Connect

    Zhang, Kai; Kwok, James T.; Parvin, Bahram

    2009-04-29

    Practical data mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of graph-based SSL arises largely from the manifold or graph regularization, which in turn leads to large models that are difficult to handle. To alleviate this, we proposed the prototype vector machine (PVM), a highly scalable, graph-based algorithm for large-scale SSL. Our key innovation is the use of "prototype vectors" for efficient approximation on both the graph-based regularizer and model representation. The choice of prototypes is grounded upon two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.

  15. Robust Extreme Learning Machine With its Application to Indoor Positioning.

    PubMed

    Lu, Xiaoxuan; Zou, Han; Zhou, Hongming; Xie, Lihua; Huang, Guang-Bin

    2016-01-01

    The increasing demands of location-based services have spurred the rapid development of indoor positioning systems (IPSs), also referred to as indoor localization systems. However, the performance of IPSs suffers from noisy measurements. In this paper, two kinds of robust extreme learning machines (RELMs), corresponding to the close-to-mean constraint and the small-residual constraint, have been proposed to address the issue of noisy measurements in IPSs. Based on whether the feature mapping in the extreme learning machine is explicit, we provide random-hidden-nodes and kernelized formulations of RELMs, respectively, by second order cone programming. Furthermore, the computation of the covariance in feature space is discussed. Simulations and real-world indoor localization experiments are extensively carried out, and the results demonstrate that the proposed algorithms can not only improve the accuracy and repeatability, but also reduce the deviation and worst case error of IPSs compared with other baseline algorithms. PMID:26684258

  16. Prediction of Zeolite Framework Types by a Machine Learning Approach

    NASA Astrophysics Data System (ADS)

    Yang, Shujiang; Lach-Hab, Mohammed; Vaisman, Iosif; Blaisten-Barojas, Estela

    2009-03-01

    Zeolites are microporous crystalline materials with highly regular framework structures consisting of molecular-sized pores and channels. Characteristic framework types of zeolites are traditionally determined by the combined information of coordination sequences and vertex symbols. Here we present a machine learning model for classifying zeolite crystals according to their framework types. An eighteen-dimensional feature vector is defined including topological descriptors and physical/chemical properties of zeolite crystals [Microporous and Mesoporous Materials 117, 339 (2009)]. Trained with crystallographic data of known zeolites, the new model can predict the framework types of unknown zeolite crystals with up to 98% accuracy. Compared with conventional methods, the machine learning model is more robust and handles crystal disorder and/or crystal defects more effectively. This model can be adapted for classifying and clustering other crystalline species.

  17. Protein function in precision medicine: deep understanding with machine learning.

    PubMed

    Rost, Burkhard; Radivojac, Predrag; Bromberg, Yana

    2016-08-01

    Precision medicine and personalized health efforts propose leveraging complex molecular, medical and family history, along with other types of personal data toward better life. We argue that this ambitious objective will require advanced and specialized machine learning solutions. Simply skimming some low-hanging results off the data wealth might have limited potential. Instead, we need to better understand all parts of the system to define medically relevant causes and effects: how do particular sequence variants affect particular proteins and pathways? How do these effects, in turn, cause the health or disease-related phenotype? Toward this end, deeper understanding will not simply diffuse from deeper machine learning, but from more explicit focus on understanding protein function, context-specific protein interaction networks, and impact of variation on both. PMID:27423136

  18. Stochastic Local Interaction (SLI) model: Bridging machine learning and geostatistics

    NASA Astrophysics Data System (ADS)

    Hristopulos, Dionissios T.

    2015-12-01

    Machine learning and geostatistics are powerful mathematical frameworks for modeling spatial data. Both approaches, however, suffer from poor scaling of the required computational resources for large data applications. We present the Stochastic Local Interaction (SLI) model, which employs a local representation to improve computational efficiency. SLI combines geostatistics and machine learning with ideas from statistical physics and computational geometry. It is based on a joint probability density function defined by an energy functional which involves local interactions implemented by means of kernel functions with adaptive local kernel bandwidths. SLI is expressed in terms of an explicit, typically sparse, precision (inverse covariance) matrix. This representation leads to a semi-analytical expression for interpolation (prediction), which is valid in any number of dimensions and avoids the computationally costly covariance matrix inversion.
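
    The benefit of working with an explicit precision matrix can be seen in a small Python sketch. It relies on the standard result for a Gaussian joint density: the conditional mean at an unsampled point p is mu_p - (1/J_pp) * sum_j J_pj (x_j - mu_j), where J is the precision matrix and the sum runs over the sampled points. Whether this matches the SLI predictor in detail is an assumption; the function name and the three-point chain below are toy stand-ins and do not reproduce the SLI energy functional or its kernel weights.

```python
def predict_from_precision(j_pp, j_ps, x_s, mean=0.0):
    """Conditional mean at one unsampled point of a Gaussian field.

    j_pp : diagonal precision entry of the prediction point
    j_ps : precision entries coupling the prediction point to sampled points
    x_s  : observed values at those sampled points
    Only the row of the precision matrix belonging to the prediction point
    is needed, so no covariance matrix has to be inverted.
    """
    coupling = sum(j * (x - mean) for j, x in zip(j_ps, x_s))
    return mean - coupling / j_pp

# Toy precision matrix of a three-point chain (nearest-neighbour coupling):
#   [[ 2, -1,  0],
#    [-1,  2, -1],
#    [ 0, -1,  2]]
# Predict the middle point from its two neighbours, observed as 1.0 and 3.0.
estimate = predict_from_precision(j_pp=2.0, j_ps=[-1.0, -1.0], x_s=[1.0, 3.0])
print(estimate)  # 2.0, the average of the two neighbours
```

    Because the precision matrix is sparse, the sum involves only the few neighbours coupled to the prediction point, which is where the computational saving comes from.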

  19. Comparison between laser interferometric and calibrated artifacts for the geometric test of machine tools

    NASA Astrophysics Data System (ADS)

    Sousa, Andre R.; Schneider, Carlos A.

    2001-09-01

    A touch probe is used on a 3-axis vertical machine center to check against a hole plate, calibrated on a coordinate measuring machine (CMM). By comparing the results obtained from the machine tool and CMM, the main machine tool error components are measured, attesting the machine accuracy. The error values can also be used to update the error compensation table at the CNC, enhancing the machine accuracy. The method is easy to use, has a lower cost than classical test techniques, and preliminary results have shown that its uncertainty is comparable to well-established techniques. In this paper the method is compared with the laser interferometric system, regarding reliability, cost and time efficiency.

  20. Controlling misses and false alarms in a machine learning framework for predicting uniformity of printed pages

    NASA Astrophysics Data System (ADS)

    Nguyen, Minh Q.; Allebach, Jan P.

    2015-01-01

    In our previous work, we presented a block-based technique to analyze printed page uniformity both visually and metrically. The features learned from the models were then employed in a Support Vector Machine (SVM) framework to classify the pages into one of the two categories of acceptable and unacceptable quality. In this paper, we introduce a set of tools for machine learning in the assessment of printed page uniformity. This work is primarily targeted to the printing industry, specifically the ubiquitous laser, electrophotographic printer. We use features that are well-correlated with the rankings of expert observers to develop a novel machine learning framework that allows one to achieve the minimum "false alarm" rate, subject to a chosen "miss" rate. Surprisingly, most of the research that has been conducted on machine learning does not consider this framework. During the process of developing a new product, test engineers will print hundreds of test pages, which can be scanned and then analyzed by an autonomous algorithm. Among these pages, most may be of acceptable quality. The objective is to find the ones that are not. These will provide critically important information to systems designers, regarding issues that need to be addressed in improving the printer design. A "miss" is defined to be a page that is not of acceptable quality to an expert observer that the prediction algorithm declares to be a "pass". Misses are a serious problem, since they represent problems that will not be seen by the systems designers. On the other hand, "false alarms" correspond to pages that an expert observer would declare to be of acceptable quality, but which are flagged by the prediction algorithm as "fails". In a typical printer testing and development scenario, such pages would be examined by an expert, and found to be of acceptable quality after all. "False alarm" pages result in extra pages to be examined by expert observers, which increases labor cost. But "false
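
    The idea of minimizing the false alarm rate subject to a chosen miss rate can be sketched in Python as a threshold search over classifier scores. This is a generic illustration, not the authors' SVM-based framework: the function name, the scoring convention (a higher score means more likely unacceptable), and the toy data are all assumptions.

```python
def pick_threshold(scores, labels, max_miss_rate):
    """Choose the score threshold with the lowest false alarm rate among
    those whose miss rate does not exceed max_miss_rate.

    scores : higher means more likely to be an unacceptable page
    labels : 1 = unacceptable to the expert, 0 = acceptable
    A page is flagged as a "fail" when its score >= threshold.
    Returns (threshold, miss_rate, false_alarm_rate).
    """
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    best = None
    for t in sorted(set(scores)):
        # Miss: an unacceptable page that the threshold lets pass.
        misses = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        # False alarm: an acceptable page that gets flagged.
        alarms = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        miss_rate = misses / n_bad
        alarm_rate = alarms / n_good
        if miss_rate <= max_miss_rate and (best is None or alarm_rate < best[2]):
            best = (t, miss_rate, alarm_rate)
    return best

# Toy scores for eight pages (illustrative values only).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

print(pick_threshold(scores, labels, max_miss_rate=0.25))  # (0.7, 0.25, 0.0)
print(pick_threshold(scores, labels, max_miss_rate=0.0))   # (0.35, 0.0, 0.5)
```

    Tightening the allowed miss rate from 25 per cent to zero raises the false alarm rate from 0 to 50 per cent in this toy data, which is the trade-off between labor cost and undetected defects that the abstract describes.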

  1. Machine Tool Technology. Automatic Screw Machine Troubleshooting & Set-Up Training Outlines [and] Basic Operator's Skills Set List.

    ERIC Educational Resources Information Center

    Anoka-Hennepin Technical Coll., Minneapolis, MN.

    This set of two training outlines and one basic skills set list are designed for a machine tool technology program developed during a project to retrain defense industry workers at risk of job loss or dislocation because of conversion of the defense industry. The first troubleshooting training outline lists the categories of problems that develop…

  2. AstroML: Machine learning and data mining in astronomy

    NASA Astrophysics Data System (ADS)

    VanderPlas, Jacob; Fouesneau, Morgan; Taylor, Julia

    2014-07-01

    AstroML is a Python library of statistical and machine learning routines for analyzing astronomical data, with loaders for several open astronomical datasets and a large suite of examples of analyzing and visualizing astronomical datasets. An optional companion library, astroML_addons, is available; it requires a C compiler and contains faster and more efficient implementations of certain algorithms in compiled code.

  3. Galaxy Zoo: reproducing galaxy morphologies via machine learning

    NASA Astrophysics Data System (ADS)

    Banerji, Manda; Lahav, Ofer; Lintott, Chris J.; Abdalla, Filipe B.; Schawinski, Kevin; Bamford, Steven P.; Andreescu, Dan; Murray, Phil; Raddick, M. Jordan; Slosar, Anze; Szalay, Alex; Thomas, Daniel; Vandenberg, Jan

    2010-07-01

    We present morphological classifications obtained using machine learning for objects in the Sloan Digital Sky Survey DR6 that have been classified by Galaxy Zoo into three classes, namely early types, spirals and point sources/artefacts. An artificial neural network is trained on a subset of objects classified by the human eye, and we test whether the machine-learning algorithm can reproduce the human classifications for the rest of the sample. We find that the success of the neural network in matching the human classifications depends crucially on the set of input parameters chosen for the machine-learning algorithm. The colours and parameters associated with profile fitting are reasonable in separating the objects into three classes. However, these results are considerably improved when adding adaptive shape parameters as well as concentration and texture. The adaptive moments, concentration and texture parameters alone cannot distinguish between early type galaxies and the point sources/artefacts. Using a set of 12 parameters, the neural network is able to reproduce the human classifications to better than 90 per cent for all three morphological classes. We find that using a training set that is incomplete in magnitude does not degrade our results given our particular choice of the input parameters to the network. We conclude that it is promising to use machine-learning algorithms to perform morphological classification for the next generation of wide-field imaging surveys and that the Galaxy Zoo catalogue provides an invaluable training set for such purposes. This publication has been made possible by the participation of more than 100000 volunteers in the Galaxy Zoo project. Their contributions are individually acknowledged at http://www.galaxyzoo.org/Volunteers.aspx.

  4. Applying machine learning techniques to DNA sequence analysis

    SciTech Connect

    Shavlik, J.W.

    1992-01-01

    We are developing a machine learning system that modifies existing knowledge about specific types of biological sequences. It does this by considering sample members and nonmembers of the sequence motif being learned. Using this information (which we call a "domain theory"), our learning algorithm produces a more accurate representation of the knowledge needed to categorize future sequences. Specifically, the KBANN algorithm maps inference rules, such as consensus sequences, into a neural (connectionist) network. Neural network training techniques then use the training examples to refine these inference rules. We have been applying this approach to several problems in DNA sequence analysis and have also been extending the capabilities of our learning system along several dimensions.

  5. Classification of ABO3 perovskite solids: a machine learning study.

    PubMed

    Pilania, G; Balachandran, P V; Gubernatis, J E; Lookman, T

    2015-10-01

    We explored the use of machine learning methods for classifying whether a particular ABO3 chemistry forms a perovskite or non-perovskite structured solid. Starting with three sets of feature pairs (the tolerance and octahedral factors, the A and B ionic radii relative to the radius of O, and the bond valence distances between the A and B ions from the O atoms), we used machine learning to create a hyper-dimensional partial dependency structure plot using all three feature pairs or any two of them. Doing so increased the accuracy of our predictions by 2-3 percentage points over using any one pair. We also included the Mendeleev numbers of the A and B atoms to this set of feature pairs. Doing this and using the capabilities of our machine learning algorithm, the gradient tree boosting classifier, enabled us to generate a new type of structure plot that has the simplicity of one based on using just the Mendeleev numbers, but with the added advantages of having a higher accuracy and providing a measure of likelihood of the predicted structure.
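
    Two of the features named in this abstract have simple closed forms in their standard definitions: the Goldschmidt tolerance factor t = (r_A + r_O) / (sqrt(2) (r_B + r_O)) and the octahedral factor r_B / r_O. The Python sketch below evaluates both for SrTiO3. The ionic radii are approximate, illustrative values supplied here, not data from the study, and the function names are hypothetical.

```python
import math

def tolerance_factor(r_a, r_b, r_o):
    """Goldschmidt tolerance factor for an ABO3 composition."""
    return (r_a + r_o) / (math.sqrt(2) * (r_b + r_o))

def octahedral_factor(r_b, r_o):
    """Ratio of the B-site cation radius to the oxygen anion radius."""
    return r_b / r_o

# Approximate ionic radii in angstroms (assumed, for illustration only).
R_SR, R_TI, R_O = 1.44, 0.605, 1.40

t = tolerance_factor(R_SR, R_TI, R_O)
mu = octahedral_factor(R_TI, R_O)
print(round(t, 3), round(mu, 3))
```

    A tolerance factor close to 1 is the textbook signature of an ideal cubic perovskite, which is why this pair of numbers is a natural starting feature for a classifier.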

  6. Experimental Investigation of Three Machine Learning Algorithms for ITS Dataset

    NASA Astrophysics Data System (ADS)

    Yearwood, J. L.; Kang, B. H.; Kelarev, A. V.

    The present article is devoted to an experimental investigation of the performance of three machine learning algorithms on the ITS dataset, in terms of their ability to achieve agreement with classes previously published in the biological literature. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form a Minkowski metric and the sequences cannot be regarded as points in a finite dimensional space. This is why it is necessary to develop novel machine learning approaches to the analysis of datasets of this sort. This paper introduces a k-committees classifier and compares it with the discrete k-means and Nearest Neighbour classifiers. It turns out that all three machine learning algorithms are efficient and can be used to automate future biologically significant classifications for datasets of this kind. A simplified version of a synthetic dataset, where the k-committees classifier outperforms k-means and Nearest Neighbour classifiers, is also presented.

  7. Mammogram retrieval through machine learning within BI-RADS standards.

    PubMed

    Wei, Chia-Hung; Li, Yue; Huang, Pai Jung

    2011-08-01

    A content-based mammogram retrieval system can support usual comparisons made on images by physicians, answering similarity queries over images stored in the database. The importance of searching for similar mammograms lies in the fact that physicians usually try to recall similar cases by seeking images that are pathologically similar to a given image. This paper presents a content-based mammogram retrieval system, which employs a query example to search for similar mammograms in the database. In this system the mammographic lesions are interpreted based on their medical characteristics specified in the Breast Imaging Reporting and Data System (BI-RADS) standards. A hierarchical similarity measurement scheme based on a distance weighting function is proposed to model the user's perception and maximize the effectiveness of each feature in a mammographic descriptor. A machine learning approach based on support vector machines and the user's relevance feedback is also proposed to analyze the user's information need in order to retrieve target images more accurately. Experimental results demonstrate that the proposed machine learning approach with a Radial Basis Function (RBF) kernel achieves the best performance among all tested approaches. Furthermore, the results also show that the proposed learning approach can improve retrieval performance when applied to retrieve mammograms with similar mass and calcification lesions, respectively. PMID:21277387

  8. Classifying black and white spruce pollen using layered machine learning.

    PubMed

    Punyasena, Surangi W; Tcheng, David K; Wesseln, Cassandra; Mueller, Pietra G

    2012-11-01

    Pollen is among the most ubiquitous of terrestrial fossils, preserving an extended record of vegetation change. However, this temporal continuity comes with a taxonomic tradeoff. Analytical methods that improve the taxonomic precision of pollen identifications would expand the research questions that could be addressed by pollen, in fields such as paleoecology, paleoclimatology, biostratigraphy, melissopalynology, and forensics. We developed a supervised, layered, instance-based machine-learning classification system that uses leave-one-out bias optimization and discriminates among small variations in pollen shape, size, and texture. We tested our system on black and white spruce, two paleoclimatically significant taxa in the North American Quaternary. We achieved >93% grain-to-grain classification accuracies in a series of experiments with both fossil and reference material. More significantly, when applied to Quaternary samples, the learning system was able to replicate the count proportions of a human expert (R² = 0.78, P = 0.007), with one key difference: the machine achieved these ratios by including larger numbers of grains with low-confidence identifications. Our results demonstrate the capability of machine-learning systems to solve the most challenging palynological classification problem, the discrimination of congeneric species, extending the capabilities of the pollen analyst and improving the taxonomic resolution of the palynological record.

  9. Machine Learning for Flood Prediction in Google Earth Engine

    NASA Astrophysics Data System (ADS)

    Kuhn, C.; Tellman, B.; Max, S. A.; Schwarz, B.

    2015-12-01

    With the increasing availability of high-resolution satellite imagery, dynamic flood mapping in near real time is becoming a reachable goal for decision-makers. This talk describes a newly developed framework for predicting biophysical flood vulnerability using public data, cloud computing and machine learning. Our objective is to define an approach to flood inundation modeling using statistical learning methods deployed in a cloud-based computing platform. Traditionally, static flood extent maps grounded in physically based hydrologic models can require hours of human expertise to construct at significant financial cost. In addition, desktop modeling software and limited local server storage can impose restraints on the size and resolution of input datasets. Data-driven, cloud-based processing holds promise for predictive watershed modeling at a wide range of spatio-temporal scales. However, these benefits come with constraints. In particular, parallel computing limits a modeler's ability to simulate the flow of water across a landscape, rendering traditional routing algorithms unusable in this platform. Our project pushes these limits by testing the performance of two machine learning algorithms, Support Vector Machine (SVM) and Random Forests, at predicting flood extent. Constructed in Google Earth Engine, the model mines a suite of publicly available satellite imagery layers to use as algorithm inputs. Results are cross-validated using MODIS-based flood maps created using the Dartmouth Flood Observatory detection algorithm. Model uncertainty highlights the difficulty of deploying unbalanced training data sets based on rare extreme events.

  10. Beatquency domain and machine learning improve prediction of cardiovascular death after acute coronary syndrome

    PubMed Central

    Liu, Yun; Scirica, Benjamin M.; Stultz, Collin M.; Guttag, John V.

    2016-01-01

    Frequency domain measures of heart rate variability (HRV) are associated with adverse events after a myocardial infarction. However, patterns in the traditional frequency domain (measured in Hz, or cycles per second) may capture different cardiac phenomena at different heart rates. An alternative is to consider frequency with respect to heartbeats, or beatquency. We compared the use of frequency and beatquency domains to predict patient risk after an acute coronary syndrome. We then determined whether machine learning could further improve the predictive performance. We first evaluated the use of pre-defined frequency and beatquency bands in a clinical trial dataset (N = 2302) for the HRV risk measure LF/HF (the ratio of low frequency to high frequency power). Relative to frequency, beatquency improved the ability of LF/HF to predict cardiovascular death within one year (Area Under the Curve, or AUC, of 0.730 vs. 0.704, p < 0.001). Next, we used machine learning to learn frequency and beatquency bands with optimal predictive power, which further improved the AUC for beatquency to 0.753 (p < 0.001), but not for frequency. Results in additional validation datasets (N = 2255 and N = 765) were similar. Our results suggest that beatquency and machine learning provide valuable tools in physiological studies of HRV. PMID:27708350

  11. Machine Learning Techniques in Optimal Design

    NASA Technical Reports Server (NTRS)

    Cerbone, Giuseppe

    1992-01-01

    to the problem, is then obtained by solving in parallel each of the sub-problems in the set and computing the one with the minimum cost. In addition to speeding up the optimization process, our use of learning methods also relieves the expert from the burden of identifying rules that exactly pinpoint optimal candidate sub-problems. In real engineering tasks it is usually too costly to the engineers to derive such rules. Therefore, this paper also contributes to a further step towards the solution of the knowledge acquisition bottleneck [Feigenbaum, 1977] which has somewhat impaired the construction of rule-based expert systems.

  12. Machine-z: rapid machine-learned redshift indicator for Swift gamma-ray bursts

    NASA Astrophysics Data System (ADS)

    Ukwatta, T. N.; Woźniak, P. R.; Gehrels, N.

    2016-06-01

    Studies of high-redshift gamma-ray bursts (GRBs) provide important information about the early Universe such as the rates of stellar collapsars and mergers, the metallicity content, constraints on the re-ionization period, and probes of the Hubble expansion. Rapid selection of high-z candidates from GRB samples reported in real time by dedicated space missions such as Swift is the key to identifying the most distant bursts before the optical afterglow becomes too dim to warrant a good spectrum. Here, we introduce 'machine-z', a redshift prediction algorithm and a 'high-z' classifier for Swift GRBs based on machine learning. Our method relies exclusively on canonical data commonly available within the first few hours after the GRB trigger. Using a sample of 284 bursts with measured redshifts, we trained a randomized ensemble of decision trees (random forest) to perform both regression and classification. Cross-validated performance studies show that the correlation coefficient between machine-z predictions and the true redshift is nearly 0.6. At the same time, our high-z classifier can achieve 80 per cent recall of true high-redshift bursts, while incurring a false positive rate of 20 per cent. With 40 per cent false positive rate the classifier can achieve ~100 per cent recall. The most reliable selection of high-redshift GRBs is obtained by combining predictions from both the high-z classifier and the machine-z regressor.

  13. 76 FR 5832 - International Business Machines (IBM), Software Group Business Unit, Optim Data Studio Tools QA...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-02

    ... November 17, 2010 (75 FR 70296). The negative determination of the TAA petition filed on behalf of workers at International Business Machines (IBM), Software Group Business Unit, Optim Data Studio Tools QA... Employment and Training Administration International Business Machines (IBM), Software Group Business...

  14. 12. TOOL ROOM SHOWING LANDIS MACHINE CO. BOL/T THREADER (L), ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    12. TOOL ROOM SHOWING LANDIS MACHINE CO. BOL/T THREADER (L), OSTER MANUFACTURING CO. PIPE MASTER (R), AND OLDMAN KINK, A SHOP-MADE WELDING STRENGTH TESTER (L, BACKGROUND). VIEW NORTHEAST - Oldman Boiler Works, Office/Machine Shop, 32 Illinois Street, Buffalo, Erie County, NY

  15. Some Principles of Learning and Learning with the Aid of Machines.

    ERIC Educational Resources Information Center

    Dolyatovskii, V. A.; Sotnikov, E. M.

    A translated Soviet document describes some theories of learning, and the practical problems of developing a teaching machine--as taught in an Industrial Electronics course (in the automation and telemechanics curriculum). The point is stressed that the growing number of students at institutions of higher learning in the Soviet Union, up forty…

  16. Engineered Protein Machines: Emergent Tools for Synthetic Biology.

    PubMed

    Glasscock, Cameron J; Lucks, Julius B; DeLisa, Matthew P

    2016-01-21

    Nature has evolved an array of intricate protein assemblies that work together to perform the chemistry that maintains life. These protein machines function with exquisite specificity and coordination to accomplish their tasks, from DNA and RNA synthesis to protein folding and post-translational modifications. Despite their complexity, synthetic biologists have succeeded in redesigning many aspects of these molecular machines. For example, natural DNA polymerases have now been engineered to catalyze the synthesis of alternative genetic polymers called XNAs, orthogonal RNA polymerases and ribosomes have been engineered to enable the construction of genetic logic gates, and protein biogenesis machinery such as chaperonins and protein translocons have been repurposed to improve folding and expression of recombinant proteins. In this Review, we highlight the progress made in understanding, engineering, and repurposing bacterial protein machines for use in synthetic biology and biotechnology.

  17. Effect of Flexural Rigidity of Tool on Machining Accuracy during Microgrooving by Ultrasonic Vibration Cutting Method

    NASA Astrophysics Data System (ADS)

    Furusawa, Toshiaki

    2010-12-01

    It is necessary to form fine holes and grooves by machining in the manufacture of equipment in the medical or information field and the establishment of such a machining technology is required. In micromachining, the use of the ultrasonic vibration cutting method is expected and examined. In this study, I experimentally form microgrooves in stainless steel SUS304 by the ultrasonic vibration cutting method and examine the effects of the shape and material of the tool on the machining accuracy. As a result, the following are clarified. The evaluation of the machining accuracy of the straightness of the finished surface revealed that there is an optimal rake angle of the tools related to the increase in cutting resistance as a result of increases in work hardening and the cutting area. The straightness is improved by using a tool with low flexural rigidity. In particular, Young's modulus more significantly affects the cutting accuracy than the shape of the tool.

  18. Social Networking Sites as a Learning Tool

    ERIC Educational Resources Information Center

    Sanchez-Casado, Noelia; Cegarra Navarro, Juan Gabriel; Wensley, Anthony; Tomaseti-Solano, Eva

    2016-01-01

    Purpose: Over the past few years, social networking sites (SNSs) have become very useful for firms, allowing companies to manage the customer-brand relationships. In this context, SNSs can be considered as a learning tool because of the brand knowledge that customers develop from these relationships. Because of the fact that knowledge in…

  19. Tools for the Assessment of Learning

    ERIC Educational Resources Information Center

    Pappas, Marjorie L.

    2007-01-01

    Assessment tools enable both learning and assessing. They also give library media specialists snapshots of evidence that demonstrates student understanding of the Information Literacy Standards. Over time the evidence provides a more complete picture of learners' ability to gather, evaluate, and use information to solve problems, make decisions,…

  20. Learning about Tool Categories via Eavesdropping

    ERIC Educational Resources Information Center

    Phillips, Brenda; Seston, Rebecca; Kelemen, Deborah

    2012-01-01

    Prior research has found that toddlers will form enduring artifact categories after direct exposure to an adult using a novel tool. Four studies explored whether 2- (N = 48) and 3-year-olds (N = 32) demonstrate this same capacity when learning by eavesdropping. After surreptitiously observing an adult use 1 of 2 artifacts to operate a bell via a…

  1. New Accessory for Cleaning the Inside of the Machine Tool Cavity

    SciTech Connect

    Lazarus, Lloyd

    2009-04-21

    The best way to extend the life of a metalworking fluid (MWF) is to make sure the machine tool and MWF delivery system are properly cleaned at least once per year. The dilemma the MWF manager is faced with is: How does one clean the machine tool and the MWF system on a large machine tool with an enclosure in a timely manner without impacting production schedules? Remember that the walls and roof of the machine enclosure are coated with a film of dried contaminated MWF that must also be removed. If not removed, the deposits on these surfaces can recontaminate the fresh charge of MWF. I have found a product that, with this revised procedure, helps to shorten the machine tool downtime involved with machine cleaning. (1) Discuss with your MWF supplier if they have a machine cleaning product that can be used with your current water-based MWF during normal machining operations. Most MWF manufacturers have a machine cleaner that can be used at a lower concentration (1-2% vs. 5%) and can be used while still making production parts for a short period of time (usually 24-48 hours). (2) Make sure this machine cleaner is compatible with the work-piece material you are machining into product. Most cleaners are compatible with ferrous alloys. Because of the increased alkalinity of the fluid you might experience staining if you are machining copper or aluminum alloys. (3) Remove the chips from the chip pans and fluid channels. (4) During off-shift hours circulate the MWF using a new product marketed by Rego-Fix called a 'Hydroball'. This device has a 5/8 inch diameter straight shank which allows it to be installed in any collet or solid quick change tool holder. It has multiple nozzles so that the user can control the spray pattern generated when the MWF is circulated. It allows the user to utilize the high-pressure, through-spindle MWF delivery capability of your machine tool for cleaning purposes. The high-pressure MWF system can now be effectively used for cleaning purposes. This

  2. Programmable phase plate for tool modification in laser machining applications

    DOEpatents

    Thompson Jr., Charles A.; Kartz, Michael W.; Brase, James M.; Pennington, Deanna; Perry, Michael D.

    2004-04-06

    A system for laser machining includes a laser source for propagating a laser beam toward a target location, and a spatial light modulator having individual controllable elements capable of modifying a phase profile of the laser beam to produce a corresponding irradiance pattern on the target location. The system also includes a controller operably connected to the spatial light modulator for controlling the individual controllable elements. By controlling the individual controllable elements, the phase profile of the laser beam may be modified into a desired phase profile so as to produce a corresponding desired irradiance pattern on the target location capable of performing a machining operation on the target location.

  3. Mining the Galaxy Zoo Database: Machine Learning Applications

    NASA Astrophysics Data System (ADS)

    Borne, Kirk D.; Wallin, J.; Vedachalam, A.; Baehr, S.; Lintott, C.; Darg, D.; Smith, A.; Fortson, L.

    2010-01-01

    The new Zooniverse initiative is addressing the data flood in the sciences through a transformative partnership between professional scientists, volunteer citizen scientists, and machines. As part of this project, we are exploring the application of machine learning techniques to data mining problems associated with the large and growing database of volunteer science results gathered by the Galaxy Zoo citizen science project. We will describe the basic challenge, some machine learning approaches, and early results. One of the motivators for this study is the acquisition (through the Galaxy Zoo results database) of approximately 100 million classification labels for roughly one million galaxies, yielding a tremendously large and rich set of training examples for improving automated galaxy morphological classification algorithms. In our first case study, the goal is to learn which morphological and photometric features in the Sloan Digital Sky Survey (SDSS) database correlate most strongly with user-selected galaxy morphological class. As a corollary to this study, we are also aiming to identify which galaxy parameters in the SDSS database correspond to galaxies that have been the most difficult to classify (based upon large dispersion in their volunteer-provided classifications). Our second case study will focus on similar data mining analyses and machine learning algorithms applied to the Galaxy Zoo catalog of merging and interacting galaxies. The outcomes of this project will have applications in future large sky surveys, such as the LSST (Large Synoptic Survey Telescope) project, which will generate a catalog of 20 billion galaxies and will produce an additional astronomical alert database of approximately 100 thousand events each night for 10 years. The capabilities and algorithms that we are exploring will assist in the rapid characterization and classification of such massive data streams. This research has been supported in part through NSF award #0941610.

  4. Automatic pathology classification using a single feature machine learning support - vector machines

    NASA Astrophysics Data System (ADS)

    Yepes-Calderon, Fernando; Pedregosa, Fabian; Thirion, Bertrand; Wang, Yalin; Lepore, Natasha

    2014-03-01

    Magnetic Resonance Imaging (MRI) has been gaining popularity in the clinic in recent years as a safe in-vivo imaging technique. As a result, large troves of data are being gathered and stored daily that may be used as clinical training sets in hospitals. While numerous machine learning (ML) algorithms have been implemented for Alzheimer's disease classification, their outputs are usually difficult to interpret in the clinical setting. Here, we propose a simple method of rapid diagnostic classification for the clinic using Support Vector Machines (SVM) and easy-to-obtain geometrical measurements that, together with a cortical and sub-cortical brain parcellation, create a robust framework capable of automatic diagnosis with high accuracy. On a significantly large imaging dataset consisting of over 800 subjects taken from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, classification-success indexes of up to 99.2% are reached with a single measurement.

  5. Amp: A modular approach to machine learning in atomistic simulations

    NASA Astrophysics Data System (ADS)

    Khorshidi, Alireza; Peterson, Andrew A.

    2016-10-01

    Electronic structure calculations, such as those employing Kohn-Sham density functional theory or ab initio wavefunction theories, have allowed for atomistic-level understandings of a wide variety of phenomena and properties of matter at small scales. However, the computational cost of electronic structure methods drastically increases with length and time scales, which makes these methods difficult for long time-scale molecular dynamics simulations or large-sized systems. Machine-learning techniques can provide accurate potentials that can match the quality of electronic structure calculations, provided sufficient training data. These potentials can then be used to rapidly simulate large and long time-scale phenomena at similar quality to the parent electronic structure approach. Machine-learning potentials usually take a bias-free mathematical form and can be readily developed for a wide variety of systems. Electronic structure calculations have favorable properties (namely, that they are noiseless and that targeted training data can be produced on demand) that make them particularly well-suited for machine learning. This paper discusses our modular approach to atomistic machine learning through the development of the open-source Atomistic Machine-learning Package (Amp), which allows for representations of both the total and atom-centered potential energy surface, in both periodic and non-periodic systems. Potentials developed through the atom-centered approach are simultaneously applicable for systems with various sizes. Interpolation can be enhanced by introducing custom descriptors of the local environment. We demonstrate this in the current work for Gaussian-type, bispectrum, and Zernike-type descriptors. Amp has an intuitive and modular structure with an interface through the Python scripting language yet has parallelizable Fortran components for demanding tasks; it is designed to integrate closely with the widely used Atomic Simulation Environment (ASE), which

  6. High accurate interpolation of NURBS tool path for CNC machine tools

    NASA Astrophysics Data System (ADS)

    Liu, Qiang; Liu, Huan; Yuan, Songmei

    2016-06-01

    Feedrate fluctuation caused by approximation errors of interpolation methods has great effects on machining quality in NURBS interpolation, but few existing methods can eliminate it or reduce it to a satisfactory level without sacrificing computing efficiency. In order to solve this problem, a highly accurate interpolation method for NURBS tool paths is proposed. The proposed method can efficiently reduce the feedrate fluctuation by forming a quartic equation with respect to the curve parameter increment, which can be efficiently solved by analytic methods in real time. Theoretically, the proposed method can totally eliminate the feedrate fluctuation for any 2nd-degree NURBS curve and can interpolate 3rd-degree NURBS curves with minimal feedrate fluctuation. Moreover, a smooth feedrate planning algorithm is also proposed to generate smooth tool motion while considering multiple constraints and scheduling errors by an efficient planning strategy. Experiments are conducted to verify the feasibility and applicability of the proposed method. This research presents a novel NURBS interpolation method that offers both high accuracy and satisfactory computing efficiency.
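
    The feedrate fluctuation that the abstract above sets out to reduce can be illustrated with the common first-order Taylor parameter update, u_next = u + F*T / |C'(u)|. The sketch below is that baseline only, not the quartic-equation method of the paper, and it uses a plain parabola rather than a NURBS curve; the curve, feedrate, period, and function names are all assumptions made for illustration.

```python
import math


def first_order_step(derivative, u, feedrate, period):
    """Advance the curve parameter by one first-order Taylor step."""
    dx, dy = derivative(u)
    speed = math.hypot(dx, dy)            # |C'(u)|
    return u + feedrate * period / speed


def max_feedrate_fluctuation(curve, derivative, feedrate, period, u_end=1.0):
    """Largest relative feedrate error |F_actual - F| / F along the path."""
    u, worst = 0.0, 0.0
    while True:
        u_next = first_order_step(derivative, u, feedrate, period)
        if u_next > u_end:
            break
        (x0, y0), (x1, y1) = curve(u), curve(u_next)
        # Actual feedrate is the chord length travelled in one period.
        actual = math.hypot(x1 - x0, y1 - y0) / period
        worst = max(worst, abs(actual - feedrate) / feedrate)
        u = u_next
    return worst


# Hypothetical test paths: a straight line and a parabola on u in [0, 1].
line = (lambda u: (u, 2.0 * u), lambda u: (1.0, 2.0))
parabola = (lambda u: (u, u * u), lambda u: (1.0, 2.0 * u))

for name, (curve, derivative) in (("line", line), ("parabola", parabola)):
    worst = max_feedrate_fluctuation(curve, derivative, feedrate=10.0, period=0.001)
    print(f"{name}: max relative feedrate fluctuation = {worst:.6f}")
```

    On the straight line the first-order step reproduces the commanded feedrate up to rounding, while on the curved path a small error appears; this is the kind of approximation error that higher-order schemes aim to remove.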

  7. Machine Learning of Protein Interactions in Fungal Secretory Pathways.

    PubMed

    Kludas, Jana; Arvas, Mikko; Castillo, Sandra; Pakula, Tiina; Oja, Merja; Brouard, Céline; Jäntti, Jussi; Penttilä, Merja; Rousu, Juho

    2016-01-01

    In this paper we apply machine learning methods for predicting protein interactions in fungal secretion pathways. We assume an inter-species transfer setting, where training data is obtained from a single species and the objective is to predict protein interactions in other, related species. In our methodology, we combine several state-of-the-art machine learning approaches, namely, multiple kernel learning (MKL), pairwise kernels and kernelized structured output prediction in the supervised graph inference framework. For MKL, we apply recently proposed centered kernel alignment and p-norm path following approaches to integrate several feature sets describing the proteins, demonstrating improved performance. For graph inference, we apply input-output kernel regression (IOKR) in supervised and semi-supervised modes as well as output kernel trees (OK3). In our experiments simulating increasing genetic distance, Input-Output Kernel Regression proved to be the most robust prediction approach. We also show that the MKL approaches improve the predictions compared to uniform combination of the kernels. We evaluate the methods on the task of predicting protein-protein interactions in fungal secretion pathways, with S. cerevisiae (baker's yeast) as the source and T. reesei as the target of the inter-species transfer learning. We identify completely novel candidate secretion proteins conserved in filamentous fungi. These proteins could contribute to their unique secretion capabilities. PMID:27441920
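
    Centered kernel alignment, mentioned in the abstract above as one way to weight kernels in MKL, scores how well a kernel matrix agrees with a target kernel built from the labels. The following is a minimal sketch under stated assumptions: the toy kernels and labels are hypothetical, and weighting each kernel in proportion to its alignment is one simple heuristic, not necessarily the scheme used by the authors.

```python
import math


def center(kernel):
    """Center a square kernel matrix: K_c = H K H with H = I - (1/n) 1 1^T."""
    n = len(kernel)
    row_means = [sum(row) / n for row in kernel]
    col_means = [sum(kernel[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[kernel[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def frobenius(a, b):
    """Frobenius inner product of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def centered_alignment(k1, k2):
    """Centered alignment between two kernel matrices (at most 1 in magnitude)."""
    c1, c2 = center(k1), center(k2)
    return frobenius(c1, c2) / math.sqrt(frobenius(c1, c1) * frobenius(c2, c2))


# Hypothetical labels and kernels for four items (illustration only).
labels = [1, 1, -1, -1]
target = [[a * b for b in labels] for a in labels]           # ideal kernel y y^T
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
kernels = {"label_like": target, "identity": identity}

scores = {name: centered_alignment(k, target) for name, k in kernels.items()}
total = sum(max(s, 0.0) for s in scores.values())
weights = {name: max(s, 0.0) / total for name, s in scores.items()}
print(scores)
print(weights)
```

    A kernel that mirrors the label structure receives alignment 1 and therefore the larger weight, whereas a uniform combination would treat both kernels equally.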

  9. Machine Learning of Protein Interactions in Fungal Secretory Pathways

    PubMed Central

    Kludas, Jana; Arvas, Mikko; Castillo, Sandra; Pakula, Tiina; Oja, Merja; Brouard, Céline; Jäntti, Jussi; Penttilä, Merja

    2016-01-01

    In this paper we apply machine learning methods for predicting protein interactions in fungal secretion pathways. We assume an inter-species transfer setting, where training data is obtained from a single species and the objective is to predict protein interactions in other, related species. In our methodology, we combine several state-of-the-art machine learning approaches, namely, multiple kernel learning (MKL), pairwise kernels and kernelized structured output prediction in the supervised graph inference framework. For MKL, we apply recently proposed centered kernel alignment and p-norm path following approaches to integrate several feature sets describing the proteins, demonstrating improved performance. For graph inference, we apply input-output kernel regression (IOKR) in supervised and semi-supervised modes as well as output kernel trees (OK3). In our experiments simulating increasing genetic distance, Input-Output Kernel Regression proved to be the most robust prediction approach. We also show that the MKL approaches improve the predictions compared to uniform combination of the kernels. We evaluate the methods on the task of predicting protein-protein interactions in fungal secretion pathways, with S. cerevisiae (baker’s yeast) as the source and T. reesei as the target of the inter-species transfer learning. We identify completely novel candidate secretion proteins conserved in filamentous fungi. These proteins could contribute to their unique secretion capabilities. PMID:27441920

  10. Atwood's Machine as a Tool to Introduce Variable Mass Systems

    ERIC Educational Resources Information Center

    de Sousa, Celia A.

    2012-01-01

    This article discusses an instructional strategy which explores eventual similarities and/or analogies between familiar problems and more sophisticated systems. In this context, the Atwood's machine problem is used to introduce students to more complex problems involving ropes and chains. The methodology proposed helps students to develop the…

  11. A Comparative Study of Teacher Education Institutions and Machine Tool Manufacturers to Determine Course Content for a Machine Tool Maintenance Course in the Woodworking Area.

    ERIC Educational Resources Information Center

    Polette, Douglas Lee

    To determine what type of maintenance training the prospective industrial arts teacher should receive in the woodworking area and how this information should be taught, a research instrument was constructed using information obtained from a review of relevant literature. Specific data on machine tool maintenance was gathered by the use of two…

  12. Mapping of Estimations and Prediction Intervals Using Extreme Learning Machines

    NASA Astrophysics Data System (ADS)

    Leuenberger, Michael; Kanevski, Mikhail

    2015-04-01

    Due to the large amount and complexity of data available nowadays in environmental sciences, we face the need to apply more robust methodology allowing analyses and understanding of the phenomena under study. One particular but very important aspect of this understanding is the reliability of generated prediction models. From the data collection to the prediction map, several sources of error can occur and affect the final result. These sources are mainly identified as uncertainty in data (data noise), and uncertainty in the model. Their combination leads to the so-called prediction interval. Quantifying these two categories of uncertainty allows a finer understanding of phenomena under study and a better assessment of the prediction accuracy. The present research deals with a methodology combining a machine learning algorithm (ELM - Extreme Learning Machine) with a bootstrap-based procedure. Developed by G.-B. Huang et al. (2006), ELM is an artificial neural network following the structure of a multilayer perceptron (MLP) with one single hidden layer. Compared to classical MLP, ELM has the ability to learn faster without loss of accuracy, and needs only one hyper-parameter to be fitted (that is the number of nodes in the hidden layer). The key steps of the proposed method are as follows: sample from the original data a variety of subsets using bootstrapping; from these subsets, train and validate ELM models; and compute residuals. Then, the same procedure is performed a second time with only the squared training residuals. Finally, taking into account the two modeling levels allows developing the mean prediction map, the model uncertainty variance, and the data noise variance. The proposed approach is illustrated using geospatial data. References Efron B., and Tibshirani R. 1986, Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy, Statistical Science, vol. 1: 54-75. Huang G.-B., Zhu Q.-Y., and Siew C.-K. 2006
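
    The first modeling level described in the abstract above (bootstrap resampling, one model per resample, then the mean prediction and the model-uncertainty variance) can be sketched as follows. This is an illustration under stated assumptions: an ordinary least-squares line stands in for the ELM, the data are synthetic, and the second level (modeling the squared residuals to estimate the data noise variance) is not shown.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x (a simple stand-in for an ELM)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def bootstrap_prediction(xs, ys, x_new, n_models=200, seed=0):
    """Return (mean prediction, model-uncertainty variance) at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    while len(predictions) < n_models:
        # Resample the training set with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) == 1:
            continue                      # degenerate resample: slope undefined
        intercept, slope = fit_line(bx, [ys[i] for i in idx])
        predictions.append(intercept + slope * x_new)
    return statistics.fmean(predictions), statistics.variance(predictions)


# Synthetic data (illustration only): y = 1 + 2x plus Gaussian noise.
noise = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + noise.gauss(0.0, 0.5) for x in xs]

mean_pred, model_var = bootstrap_prediction(xs, ys, x_new=5.0)
print(f"mean prediction = {mean_pred:.3f}, model variance = {model_var:.5f}")
```

    The spread of the bootstrap predictions reflects uncertainty in the fitted model only; the noise in the data itself requires the second modeling level described in the abstract.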

  13. Biosimilarity Assessments of Model IgG1-Fc Glycoforms Using a Machine Learning Approach.

    PubMed

    Kim, Jae Hyun; Joshi, Sangeeta B; Tolbert, Thomas J; Middaugh, C Russell; Volkin, David B; Smalter Hall, Aaron

    2016-02-01

    Biosimilarity assessments are performed to decide whether 2 preparations of complex biomolecules can be considered "highly similar." In this work, a machine learning approach is demonstrated as a mathematical tool for such assessments using a variety of analytical data sets. As proof-of-principle, physical stability data sets from 8 samples, 4 well-defined immunoglobulin G1-Fragment crystallizable glycoforms in 2 different formulations, were examined (see More et al., companion article in this issue). The data sets included triplicate measurements from 3 analytical methods across different pH and temperature conditions (2066 data features). Established machine learning techniques were used to determine whether the data sets contain sufficient discriminative power in this application. The support vector machine classifier identified the 8 distinct samples with high accuracy. For these data sets, there exists a minimum threshold in terms of information quality and volume to grant enough discriminative power. Generally, data from multiple analytical techniques, multiple pH conditions, and at least 200 representative features were required to achieve the highest discriminative accuracy. In addition to classification accuracy tests, various methods such as sample space visualization, similarity analysis based on Euclidean distance, and feature ranking by mutual information scores are demonstrated to display their effectiveness as modeling tools for biosimilarity assessments.
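
    Feature ranking by mutual information, one of the methods named in the abstract above, scores each feature by how much it reveals about the sample label. A minimal sketch follows, assuming the features have already been discretized into bins; the feature names, bins, and labels are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_counts, label_counts = Counter(feature), Counter(labels)
    # Sum p(f, l) * log2( p(f, l) / (p(f) * p(l)) ) over observed pairs.
    return sum((count / n) * math.log2(count * n / (feature_counts[f] * label_counts[l]))
               for (f, l), count in joint.items())


# Hypothetical discretized features for four samples from two glycoforms.
labels = ["A", "A", "B", "B"]
features = {
    "melting_temp_bin": ["low", "low", "high", "high"],   # tracks the label
    "plate_position": ["left", "right", "left", "right"],  # unrelated to label
}

ranking = sorted(features, key=lambda name: mutual_information(features[name], labels),
                 reverse=True)
for name in ranking:
    print(name, mutual_information(features[name], labels))
```

    Features at the top of the ranking carry the most discriminative information; continuous measurements would need a binning step (or a different estimator) before this calculation applies.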

  14. Investigation on the Surface Integrity and Tool Wear in Cryogenic Machining

    SciTech Connect

    Dutra Xavier, Sandro E.; Delijaicov, Sergio; Farias, Adalto de; Stipkovic Filho, Marco; Ferreira Batalha, Gilmar

    2011-01-17

    This work studied the influence of cryogenic cooling, compared with dry machining, on tool wear and on the surface integrity of circular test pieces of SAE 52100 steel hardened to 62 HRC during face turning with special PcBN tools, using liquid nitrogen as the coolant. The parameters analyzed were surface roughness, white layer, and tool wear. The results indicated a reduction in tool wear, which enhances tool life.

  16. On the generalizability of resting-state fMRI machine learning classifiers.

    PubMed

    Huf, Wolfgang; Kalcher, Klaudius; Boubela, Roland N; Rath, Georg; Vecsei, Andreas; Filzmoser, Peter; Moser, Ewald

    2014-01-01

    Machine learning classifiers have become increasingly popular tools to generate single-subject inferences from fMRI data. With this transition from the traditional group level difference investigations to single-subject inference, the application of machine learning methods can be seen as a considerable step forward. Existing studies, however, have given scarce or no information on the generalizability to other subject samples, limiting the use of such published classifiers in other research projects. We conducted a simulation study using publicly available resting-state fMRI data from the 1000 Functional Connectomes and COBRE projects to examine the generalizability of classifiers based on regional homogeneity of resting-state time series. While classification accuracies of up to 0.8 (using sex as the target variable) could be achieved on test datasets drawn from the same study as the training dataset, the generalizability of classifiers to different study samples proved to be limited albeit above chance. This shows that on the one hand a certain amount of generalizability can robustly be expected, but on the other hand this generalizability should not be overestimated. Indeed, this study substantiates the need to include data from several sites in a study investigating machine learning classifiers with the aim of generalizability.

  17. Sex estimation from the tarsal bones in a Portuguese sample: a machine learning approach.

    PubMed

    Navega, David; Vicente, Ricardo; Vieira, Duarte N; Ross, Ann H; Cunha, Eugénia

    2015-05-01

    Sex estimation is extremely important in the analysis of human remains as many of the subsequent biological parameters are sex specific (e.g., age at death, stature, and ancestry). When dealing with incomplete or fragmented remains, metric analysis of the tarsal bones of the feet has proven valuable. In this study, the utility of 18 width, length, and height tarsal measurements was assessed for sex-related variation in a Portuguese sample. A total of 300 males and females from the Coimbra Identified Skeletal Collection were used to develop sex prediction models based on statistical and machine learning algorithms such as discriminant function analysis, logistic regression, classification trees, and artificial neural networks. All models were evaluated using 10-fold cross-validation and an independent test sample composed of 60 males and females from the Identified Skeletal Collection of the 21st Century. Results showed that tarsal bone sex-related variation can be easily captured with a high degree of repeatability. A simple tree-based multivariate algorithm involving measurements from the calcaneus, talus, first and third cuneiforms, and cuboid resulted in 88.3% correct sex estimation both on training and independent test sets. Traditional statistical classifiers such as the discriminant function analysis were outperformed by machine learning techniques. The results show that machine learning algorithms are an important tool that forensic practitioners should consider when developing new standards for sex estimation. PMID:25186617
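
    The 10-fold cross-validation used for model evaluation in the abstract above can be sketched as follows. The sketch assumes a single measurement per individual and a nearest-class-mean classifier as a stand-in for the models named in the study; the measurements are synthetic and the function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_nearest_mean(xs, ys):
    """Store the mean measurement of each class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_nearest_mean(means, x):
    """Assign the class whose mean is closest to the measurement."""
    return min(means, key=lambda label: abs(x - means[label]))


def cross_validated_accuracy(xs, ys, k=10, seed=0):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = train_nearest_mean([xs[i] for i in train_idx],
                                   [ys[i] for i in train_idx])
        correct += sum(predict_nearest_mean(model, xs[i]) == ys[i] for i in test_idx)
    return correct / len(xs)


# Synthetic, well-separated measurements in mm (illustration only).
xs = [70.0 + 0.1 * i for i in range(30)] + [80.0 + 0.1 * i for i in range(30)]
ys = ["F"] * 30 + ["M"] * 30
print(cross_validated_accuracy(xs, ys))
```

    Every individual is scored exactly once by a model that never saw it during training, which is why cross-validated accuracy is a fairer estimate than accuracy on the training data; an independent test sample, as in the study, is a further check.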

  18. Using a Machine Learning Approach to Predict Outcomes after Radiosurgery for Cerebral Arteriovenous Malformations

    PubMed Central

    Oermann, Eric Karl; Rubinsteyn, Alex; Ding, Dale; Mascitelli, Justin; Starke, Robert M.; Bederson, Joshua B.; Kano, Hideyuki; Lunsford, L. Dade; Sheehan, Jason P.; Hammerbacher, Jeffrey; Kondziolka, Douglas

    2016-01-01

    Predictions of patient outcomes after a given therapy are fundamental to medical practice. We employ a machine learning approach towards predicting the outcomes after stereotactic radiosurgery for cerebral arteriovenous malformations (AVMs). Using three prospective databases, a machine learning approach of feature engineering and model optimization was implemented to create the most accurate predictor of AVM outcomes. Existing prognostic systems were scored for purposes of comparison. The final predictor was secondarily validated on an independent site’s dataset not utilized for initial construction. Out of 1,810 patients, 1,674 to 1,291 patients depending upon time threshold, with 23 features were included for analysis and divided into training and validation sets. The best predictor had an average area under the curve (AUC) of 0.71 compared to existing clinical systems of 0.63 across all time points. On the held-out dataset, the predictor had an accuracy of around 0.74 across all time thresholds, with a specificity and sensitivity of 62% and 85%, respectively. This machine learning approach was able to provide the best possible predictions of AVM radiosurgery outcomes of any method to date, identify a novel radiobiological feature (3D surface dose), and demonstrate a paradigm for further development of prognostic tools in medical care. PMID:26856372

  19. Using a Machine Learning Approach to Predict Outcomes after Radiosurgery for Cerebral Arteriovenous Malformations.

    PubMed

    Oermann, Eric Karl; Rubinsteyn, Alex; Ding, Dale; Mascitelli, Justin; Starke, Robert M; Bederson, Joshua B; Kano, Hideyuki; Lunsford, L Dade; Sheehan, Jason P; Hammerbacher, Jeffrey; Kondziolka, Douglas

    2016-02-09

    Predictions of patient outcomes after a given therapy are fundamental to medical practice. We employ a machine learning approach towards predicting the outcomes after stereotactic radiosurgery for cerebral arteriovenous malformations (AVMs). Using three prospective databases, a machine learning approach of feature engineering and model optimization was implemented to create the most accurate predictor of AVM outcomes. Existing prognostic systems were scored for purposes of comparison. The final predictor was secondarily validated on an independent site's dataset not utilized for initial construction. Out of 1,810 patients, 1,674 to 1,291 patients (depending upon the time threshold), each with 23 features, were included for analysis and divided into training and validation sets. The best predictor had an average area under the curve (AUC) of 0.71 compared to existing clinical systems of 0.63 across all time points. On the heldout dataset, the predictor had an accuracy of around 0.74 across all time thresholds, with a specificity and sensitivity of 62% and 85%, respectively. This machine learning approach was able to provide the best possible predictions of AVM radiosurgery outcomes of any method to date, identify a novel radiobiological feature (3D surface dose), and demonstrate a paradigm for further development of prognostic tools in medical care.

  20. Teaching an Old Log New Tricks with Machine Learning.

    PubMed

    Schnell, Krista; Puri, Colin; Mahler, Paul; Dukatz, Carl

    2014-03-01

    To most people, the log file would not be considered an exciting area in technology today. However, these relatively benign, slowly growing data sources can drive large business transformations when combined with modern-day analytics. Accenture Technology Labs has built a new framework that helps to expand existing vendor solutions to create new methods of gaining insights from these benevolent information springs. This framework provides a systematic and effective machine-learning mechanism to understand, analyze, and visualize heterogeneous log files. These techniques enable an automated approach to analyzing log content in real time, learning relevant behaviors, and creating actionable insights applicable in traditionally reactive situations. Using this approach, companies can now tap into a wealth of knowledge residing in log file data that is currently being collected but underutilized because of its overwhelming variety and volume. By using log files as an important data input into the larger enterprise data supply chain, businesses have the opportunity to enhance their current operational log management solution and generate entirely new business insights, no longer limited to the realm of reactive IT management, but extending from proactive product improvement to defense from attacks. As we will discuss, this solution has immediate relevance in the telecommunications and security industries. However, the most forward-looking companies can take it even further. How? By thinking beyond the log file and applying the same machine-learning framework to other log file use cases (including logistics, social media, and consumer behavior) and any other transactional data source.

  1. Machine learning and data mining: strategies for hypothesis generation.

    PubMed

    Oquendo, M A; Baca-Garcia, E; Artés-Rodríguez, A; Perez-Cruz, F; Galfalvy, H C; Blasco-Fontecilla, H; Madigan, D; Duan, N

    2012-10-01

    Strategies for generating knowledge in medicine have included observation of associations in clinical or research settings and more recently, development of pathophysiological models based on molecular biology. Although critically important, they limit hypothesis generation to an incremental pace. Machine learning and data mining are alternative approaches to identifying new vistas to pursue, as is already evident in the literature. In concert with these analytic strategies, novel approaches to data collection can enhance the hypothesis pipeline as well. In data farming, data are obtained in an 'organic' way, in the sense that it is entered by patients themselves and available for harvesting. In contrast, in evidence farming (EF), it is the provider who enters medical data about individual patients. EF differs from regular electronic medical record systems because frontline providers can use it to learn from their own past experience. In addition to the possibility of generating large databases with farming approaches, it is likely that we can further harness the power of large data sets collected using either farming or more standard techniques through implementation of data-mining and machine-learning strategies. Exploiting large databases to develop new hypotheses regarding neurobiological and genetic underpinnings of psychiatric illness is useful in itself, but also affords the opportunity to identify novel mechanisms to be targeted in drug discovery and development.

  2. Machine learning approach for objective inpainting quality assessment

    NASA Astrophysics Data System (ADS)

    Frantc, V. A.; Voronin, V. V.; Marchuk, V. I.; Sherstobitov, A. I.; Agaian, S.; Egiazarian, K.

    2014-05-01

    This paper focuses on a machine learning approach for objective inpainting quality assessment. Inpainting has received a lot of attention in recent years and quality assessment is an important task to evaluate different image reconstruction approaches. Quantitative metrics for successful image inpainting currently do not exist; researchers instead are relying upon qualitative human comparisons in order to evaluate their methodologies and techniques. We present an approach for objective inpainting quality assessment based on natural image statistics and machine learning techniques. Our method is based on the observation that when images are properly normalized or transferred to a transform domain, local descriptors can be modeled by some parametric distributions. The shapes of these distributions are different for noninpainted and inpainted images. The approach yields a feature vector strongly correlated with subjective image perception by the human visual system. Next, we use support vector regression trained on human-assessed images to predict the perceived quality of inpainted images. We demonstrate how our predicted quality value repeatably correlates with a qualitative opinion in a human observer study.

  3. Effective feature selection for image steganalysis using extreme learning machine

    NASA Astrophysics Data System (ADS)

    Feng, Guorui; Zhang, Haiyan; Zhang, Xinpeng

    2014-11-01

    Image steganography delivers secret data by slight modifications of the cover. To detect these data, steganalysis tries to create some features to embody the discrepancy between the cover and steganographic images. Therefore, the urgent problem is how to design an effective classification architecture for given feature vectors extracted from the images. We propose an approach to automatically select effective features based on the well-known JPEG steganographic methods. This approach, referred to as extreme learning machine revisited feature selection (ELM-RFS), can tune input weights in terms of the importance of input features. This idea is derived from cross-validation learning and one-dimensional (1-D) search. While updating input weights, we seek the energy decreasing direction using the leave-one-out (LOO) selection. Furthermore, we optimize the 1-D energy function instead of directly discarding the least significant feature. Since recent Liu features achieve considerably lower detection errors than previous JPEG steganalysis, the experimental results demonstrate that the new approach results in less classification error than other classifiers such as SVM, Kodovsky ensemble classifier, direct ELM-LOO learning, kernel ELM, and conventional ELM on Liu features. Furthermore, ELM-RFS achieves performance similar to that of a deep Boltzmann machine while using less training time.

  4. Teaching an Old Log New Tricks with Machine Learning.

    PubMed

    Schnell, Krista; Puri, Colin; Mahler, Paul; Dukatz, Carl

    2014-03-01

    To most people, the log file would not be considered an exciting area in technology today. However, these relatively benign, slowly growing data sources can drive large business transformations when combined with modern-day analytics. Accenture Technology Labs has built a new framework that helps to expand existing vendor solutions to create new methods of gaining insights from these benevolent information springs. This framework provides a systematic and effective machine-learning mechanism to understand, analyze, and visualize heterogeneous log files. These techniques enable an automated approach to analyzing log content in real time, learning relevant behaviors, and creating actionable insights applicable in traditionally reactive situations. Using this approach, companies can now tap into a wealth of knowledge residing in log file data that is currently being collected but underutilized because of its overwhelming variety and volume. By using log files as an important data input into the larger enterprise data supply chain, businesses have the opportunity to enhance their current operational log management solution and generate entirely new business insights, no longer limited to the realm of reactive IT management, but extending from proactive product improvement to defense from attacks. As we will discuss, this solution has immediate relevance in the telecommunications and security industries. However, the most forward-looking companies can take it even further. How? By thinking beyond the log file and applying the same machine-learning framework to other log file use cases (including logistics, social media, and consumer behavior) and any other transactional data source. PMID:27447306

  5. Kernel-based machine learning techniques for infrasound signal classification

    NASA Astrophysics Data System (ADS)

    Tuma, Matthias; Igel, Christian; Mialle, Pierrick

    2014-05-01

    Infrasound monitoring is one of four remote sensing technologies continuously employed by the CTBTO Preparatory Commission. The CTBTO's infrasound network is designed to monitor the Earth for potential evidence of atmospheric or shallow underground nuclear explosions. Upon completion, it will comprise 60 infrasound array stations distributed around the globe, of which 47 were certified in January 2014. Three stages can be identified in CTBTO infrasound data processing: automated processing at the level of single array stations, automated processing at the level of the overall global network, and interactive review by human analysts. At station level, the cross correlation-based PMCC algorithm is used for initial detection of coherent wavefronts. It produces estimates for trace velocity and azimuth of incoming wavefronts, as well as other descriptive features characterizing a signal. Detected arrivals are then categorized into potentially treaty-relevant versus noise-type signals by a rule-based expert system. This corresponds to a binary classification task at the level of station processing. In addition, incoming signals may be grouped according to their travel path in the atmosphere. The present work investigates automatic classification of infrasound arrivals by kernel-based pattern recognition methods. It aims to explore the potential of state-of-the-art machine learning methods vis-a-vis the current rule-based and task-tailored expert system. To this purpose, we first address the compilation of a representative, labeled reference benchmark dataset as a prerequisite for both classifier training and evaluation. Data representation is based on features extracted by the CTBTO's PMCC algorithm. As classifiers, we employ support vector machines (SVMs) in a supervised learning setting. Different SVM kernel functions are used and adapted through different hyperparameter optimization routines. The resulting performance is compared to several baseline classifiers. 

  6. Machine learning approaches in medical image analysis: From detection to diagnosis.

    PubMed

    de Bruijne, Marleen

    2016-10-01

    Machine learning approaches are increasingly successful in image-based diagnosis, disease prognosis, and risk assessment. This paper highlights new research directions and discusses three main challenges related to machine learning in medical imaging: coping with variation in imaging protocols, learning from weak labels, and interpretation and evaluation of results.

  7. Orchestrating Learning Activities Using the CADMOS Learning Design Tool

    ERIC Educational Resources Information Center

    Katsamani, Maria; Retalis, Symeon

    2013-01-01

    This paper gives an overview of CADMOS (CoursewAre Development Methodology for Open instructional Systems), a graphical IMS-LD Level A & B compliant learning design (LD) tool, which promotes the concept of "separation of concerns" during the design process, via the creation of two models: the conceptual model, which describes the…

  8. Machine-z: Rapid machine-learned redshift indicator for Swift gamma-ray bursts

    DOE PAGESBeta

    Ukwatta, T. N.; Wozniak, P. R.; Gehrels, N.

    2016-03-08

    Studies of high-redshift gamma-ray bursts (GRBs) provide important information about the early Universe such as the rates of stellar collapsars and mergers, the metallicity content, constraints on the re-ionization period, and probes of the Hubble expansion. Rapid selection of high-z candidates from GRB samples reported in real time by dedicated space missions such as Swift is the key to identifying the most distant bursts before the optical afterglow becomes too dim to warrant a good spectrum. Here, we introduce ‘machine-z’, a redshift prediction algorithm and a ‘high-z’ classifier for Swift GRBs based on machine learning. Our method relies exclusively on canonical data commonly available within the first few hours after the GRB trigger. Using a sample of 284 bursts with measured redshifts, we trained a randomized ensemble of decision trees (random forest) to perform both regression and classification. Cross-validated performance studies show that the correlation coefficient between machine-z predictions and the true redshift is nearly 0.6. At the same time, our high-z classifier can achieve 80 per cent recall of true high-redshift bursts, while incurring a false positive rate of 20 per cent. With 40 per cent false positive rate the classifier can achieve ~100 per cent recall. As a result, the most reliable selection of high-redshift GRBs is obtained by combining predictions from both the high-z classifier and the machine-z regressor.
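
    The trade-off between recall and false positive rate described in this abstract comes from moving the decision threshold on the classifier score. The sketch below shows the calculation on made-up scores; the function name, scores, and labels are illustrative assumptions, not output from machine-z.

```python
def recall_and_fpr(scores, labels, threshold):
    """Recall and false positive rate when scores >= threshold are flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg

# Made-up classifier scores; label 1 marks a true high-redshift burst.
scores = [0.95, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

strict = recall_and_fpr(scores, labels, 0.7)  # fewer false alarms, lower recall
loose = recall_and_fpr(scores, labels, 0.5)   # full recall, more false alarms
```

    Lowering the threshold can only keep or raise both quantities, which is why near-complete recall is reported at a higher false positive rate.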

  9. Machine Learning and the Starship - A Match Made in Heaven

    NASA Astrophysics Data System (ADS)

    Galea, P.

    The computer control system of an unmanned interstellar craft must deal with a variety of complex problems. For example, upon reaching the destination star, the computer may need to make assessments of the planets and other objects to prioritize the most `interesting', and assign appropriate probes to each. These decisions would normally be regarded as intelligent if they were made by humans. This paper looks at machine learning technologies currently deployed in non-aerospace contexts, such as book recommendation systems, dating websites and social network analysis, and investigates the ways in which they can be adapted for applications in the starship. This paper is a submission of the Project Icarus Study Group.

  10. Extreme learning machine for ranking: generalization analysis and applications.

    PubMed

    Chen, Hong; Peng, Jiangtao; Zhou, Yicong; Li, Luoqing; Pan, Zhibin

    2014-05-01

    The extreme learning machine (ELM) has attracted increasing attention recently with its successful applications in classification and regression. In this paper, we investigate the generalization performance of ELM-based ranking. A new regularized ranking algorithm is proposed based on the combinations of activation functions in ELM. The generalization analysis is established for the ELM-based ranking (ELMRank) in terms of the covering numbers of hypothesis space. Empirical results on the benchmark datasets show the competitive performance of the ELMRank over the state-of-the-art ranking methods. PMID:24590011

  11. Transferable Atomic Multipole Machine Learning Models for Small Organic Molecules.

    PubMed

    Bereau, Tristan; Andrienko, Denis; von Lilienfeld, O Anatole

    2015-07-14

    Accurate representation of the molecular electrostatic potential, which is often expanded in distributed multipole moments, is crucial for an efficient evaluation of intermolecular interactions. Here we introduce a machine learning model for multipole coefficients of atom types H, C, O, N, S, F, and Cl in any molecular conformation. The model is trained on quantum-chemical results for atoms in varying chemical environments drawn from thousands of organic molecules. Multipoles in systems with neutral, cationic, and anionic molecular charge states are treated with individual models. The models' predictive accuracy and applicability are illustrated by evaluating intermolecular interaction energies of nearly 1,000 dimers and the cohesive energy of the benzene crystal.

  12. Coordinated machine learning and decision support for situation awareness.

    SciTech Connect

    Draelos, Timothy John; Zhang, Peng-Chu.; Wunsch, Donald C.; Seiffertt, John; Conrad, Gregory N.; Brannon, Nathan Gregory

    2007-09-01

    For applications such as force protection, an effective decision maker needs to maintain an unambiguous grasp of the environment. Opportunities exist to leverage computational mechanisms for the adaptive fusion of diverse information sources. The current research employs neural networks and Markov chains to process information from sources including sensors, weather data, and law enforcement. Furthermore, the system operator's input is used as a point of reference for the machine learning algorithms. More detailed features of the approach are provided, along with an example force protection scenario.

  13. Detections of Propellers in Saturn's Rings using Machine Learning: Preliminary Results

    NASA Astrophysics Data System (ADS)

    Gordon, Mitchell K.; Showalter, Mark R.; Odess, Jennifer; Del Villar, Ambi; LaMora, Andy; Paik, Jin; Lakhani, Karim; Sergeev, Rinat; Erickson, Kristen; Galica, Carol; Grayzeck, Edwin; Morgan, Thomas; Knopf, William

    2015-11-01

    We report on the initial analysis of the output of a tool designed to identify persistent, non-axisymmetric features in the rings of Saturn. This project introduces a new paradigm for scientific software development. The preliminary results include what appear to be new detections of propellers in the rings of Saturn. The Planetary Data System (PDS), working with the NASA Tournament Lab (NTL), Crowd Innovation Lab at Harvard University, and the Topcoder community at Appirio, Inc., under the umbrella “Cassini Rings Challenge”, sponsored a set of competitions employing crowd sourcing and machine learning to develop a tool which could be made available to the community at large. The Challenge was tackled by running a series of separate contests to solve individual tasks prior to the major machine learning challenge. Each contest was comprised of a set of requirements, a timeline, one or more prizes, and other incentives, and was posted by Appirio to the Topcoder Community. In the case of the machine learning challenge (a “Marathon Challenge” on the Topcoder platform), members competed against each other by submitting solutions that were scored in real time and posted to a public leader-board by a scoring algorithm developed by Appirio for this contest. The current version of the algorithm was run against ~30,000 of the highest resolution Cassini ISS images. That set included 668 images with a total of 786 features previously identified as propellers in the main rings. The tool identified 81% of those previously identified propellers. In a preliminary, close examination of 130 detections identified by the tool, we determined that of the 130 detections, 11 were previously identified propeller detections, 5 appear to be new detections of known propellers, and 4 appear to be detections of propellers which have not been seen previously. A total of 20 valid detections from 130 candidates implies a relatively high false positive rate which we hope to reduce by further

  14. Coadaptive brain-machine interface via reinforcement learning.

    PubMed

    DiGiovanna, Jack; Mahmoudi, Babak; Fortes, Jose; Principe, Jose C; Sanchez, Justin C

    2009-01-01

    This paper introduces and demonstrates a novel brain-machine interface (BMI) architecture based on the concepts of reinforcement learning (RL), coadaptation, and shaping. RL allows the BMI control algorithm to learn to complete tasks from interactions with the environment, rather than an explicit training signal. Coadaptation enables continuous, synergistic adaptation between the BMI control algorithm and BMI user working in changing environments. Shaping is designed to reduce the learning curve for BMI users attempting to control a prosthetic. Here, we present the theory and in vivo experimental paradigm to illustrate how this BMI learns to complete a reaching task using a prosthetic arm in a 3-D workspace based on the user's neuronal activity. This semisupervised learning framework does not require user movements. We quantify BMI performance in closed-loop brain control over six to ten days for three rats as a function of increasing task difficulty. All three subjects coadapted with their BMI control algorithms to control the prosthetic significantly above chance at each level of difficulty.

  15. Semi-supervised and unsupervised extreme learning machines.

    PubMed

    Huang, Gao; Song, Shiji; Gupta, Jatinder N D; Wu, Cheng

    2014-12-01

    Extreme learning machines (ELMs) have proven to be efficient and effective learning mechanisms for pattern classification and regression. However, ELMs are primarily applied to supervised learning problems. Only a few existing research papers have used ELMs to explore unlabeled data. In this paper, we extend ELMs for both semi-supervised and unsupervised tasks based on the manifold regularization, thus greatly expanding the applicability of ELMs. The key advantages of the proposed algorithms are as follows: 1) both the semi-supervised ELM (SS-ELM) and the unsupervised ELM (US-ELM) exhibit learning capability and computational efficiency of ELMs; 2) both algorithms naturally handle multiclass classification or multicluster clustering; and 3) both algorithms are inductive and can handle unseen data at test time directly. Moreover, it is shown in this paper that all the supervised, semi-supervised, and unsupervised ELMs can actually be put into a unified framework. This provides new perspectives for understanding the mechanism of random feature mapping, which is the key concept in ELM theory. Empirical study on a wide range of data sets demonstrates that the proposed algorithms are competitive with the state-of-the-art semi-supervised or unsupervised learning algorithms in terms of accuracy and efficiency. PMID:25415946

  16. Performance Evaluation of Multi-Axis CNC Machine Tools by Interferometry Principle using Laser Calibration System

    NASA Astrophysics Data System (ADS)

    Barman, S.; Sen, R.

    2012-06-01

    Advances in digital electronics and microprocessors have enabled the manufacturing sector to produce complex components within small tolerance zones, in the nanometre range, at a single machining center. All motion control systems have some form of position feedback system fitted to the machine. These systems are not perfectly accurate, however, because the positioning performance of machine tools changes over time due to wear, damage, and environmental effects. The complex structure of multi-axis CNC machine tools produces an inaccuracy at the tool tip caused by kinematic parameter deviations resulting from manufacturing errors, assembly errors, and quasi-static errors. Analysis of these errors using a laser measurement system provides the user with a way to achieve better accuracy, and hence higher quality output from these processes. In this paper, the characteristics of the positioning errors of the axes of multi-axis CNC machine tools and the technique for measuring these errors with a laser interferometer calibration system are discussed, and the positioning accuracy of each machine axis is verified.

  17. MEAT: An Authoring Tool for Generating Adaptable Learning Resources

    ERIC Educational Resources Information Center

    Kuo, Yen-Hung; Huang, Yueh-Min

    2009-01-01

    Mobile learning (m-learning) is a new trend in the e-learning field. The learning services in m-learning environments are supported by fundamental functions, especially the content and assessment services, which need an authoring tool to rapidly generate adaptable learning resources. To fulfill the imperious demand, this study proposes an…

  18. Extremely Randomized Machine Learning Methods for Compound Activity Prediction.

    PubMed

    Czarnecki, Wojciech M; Podlewska, Sabina; Bojarski, Andrzej J

    2015-11-09

    Speed, a relatively low requirement for computational resources and high effectiveness of the evaluation of the bioactivity of compounds have caused a rapid growth of interest in the application of machine learning methods to virtual screening tasks. However, due to the growth of the amount of data also in cheminformatics and related fields, the aim of research has shifted not only towards the development of algorithms of high predictive power but also towards the simplification of previously existing methods to obtain results more quickly. In the study, we tested two approaches belonging to the group of so-called 'extremely randomized methods' (Extreme Entropy Machine and Extremely Randomized Trees) for their ability to properly identify compounds that have activity towards particular protein targets. These methods were compared with their 'non-extreme' competitors, i.e., Support Vector Machine and Random Forest. The extreme approaches were found not only to improve the efficiency of the classification of bioactive compounds, but also to be less computationally complex, requiring fewer steps to perform an optimization procedure.

  19. Relevance vector machines as a tool for forecasting geomagnetic storms during years 1996-2007

    NASA Astrophysics Data System (ADS)

    Andriyas, T.; Andriyas, S.

    2015-04-01

    In this paper, we investigate the use of relevance vector machine (RVM) as a learning tool in order to generate 1-h (one hour) ahead forecasts for geomagnetic storms driven by the interaction of the solar wind with the Earth's magnetosphere during the years 1996-2007. This epoch included solar cycle 23 with storms that were both ICME (interplanetary coronal mass ejection) and CIR (corotating interaction region) driven. Merged plasma and magnetic field measurements of the solar wind from the Advanced Composition Explorer (ACE) and WIND satellites located upstream of the Earth's magnetosphere at 1-h cadence were used as inputs to the model. The magnetospheric response to the solar wind driving measured by the disturbance storm time or the Dst index (measured in nT) was used as the output to be forecasted. The model was first tested on previously reported storms in Wu and Lundstedt (1997) and it gave a linear correlation coefficient, ρ, of above 90% and prediction efficiency (PE) above 80%. During 1996-2007, several storms (within each year) were chosen as test cases to analyze the forecasting robustness of the model. The top three forecasts per year were analyzed to assess the generalization ability of the model. These included storms with varying intensities ranging from weak (-53.01 nT) to strong (-422.02 nT) and durations (119-445 h). The top RVM forecast in a given year had ρ above 85% (87.00-96.85%), PE > 73 % (73.59-93.59%), and a root mean square error (RMSE) ranging from 9.31 to 33.45 nT. A qualitative comparison is made with model forecasts previously reported by Ji et al. (2012). We found that the robustness of the model with regards to fast learning and generating forecasts within acceptable error bounds makes it a very good proposition as a prediction tool (given the solar wind parameters) for space weather monitoring.
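
    The three scores reported in this abstract can be computed from paired observed and forecast Dst values. In the sketch below, prediction efficiency is taken as one minus the ratio of the squared forecast error to the variance of the observations, a common definition that is assumed here rather than quoted from the paper; the function name and the Dst values are made up for illustration.

```python
import math

def forecast_metrics(observed, predicted):
    """Return (rho, PE, RMSE) for paired observed and forecast values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of squared forecast errors.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Sums of squared deviations from the respective means.
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    rho = cov / math.sqrt(ss_o * ss_p)  # linear correlation coefficient
    pe = 1.0 - sse / ss_o               # assumed definition of PE
    rmse = math.sqrt(sse / n)           # same units as the input (nT)
    return rho, pe, rmse

# Made-up hourly Dst values in nT, not data from the paper.
dst_obs = [-20.0, -60.0, -120.0, -90.0, -40.0]
dst_pred = [-25.0, -50.0, -110.0, -95.0, -45.0]
rho, pe, rmse = forecast_metrics(dst_obs, dst_pred)
```

    Under this definition a perfect forecast has a PE of 1, while a forecast no better than the mean of the observations has a PE of 0, so PE penalizes bias and scale errors that the correlation coefficient alone would miss.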

  20. GeneRIF indexing: sentence selection based on machine learning

    PubMed Central

    2013-01-01

    Background: A Gene Reference Into Function (GeneRIF) describes novel functionality of genes. GeneRIFs are available from the National Center for Biotechnology Information (NCBI) Gene database. GeneRIF indexing is performed manually, and the intention of our work is to provide methods to support creating the GeneRIF entries. The creation of GeneRIF entries involves the identification of the genes mentioned in MEDLINE® citations and the sentences describing a novel function. Results: We have compared several learning algorithms and several features extracted or derived from MEDLINE sentences to determine if a sentence should be selected for GeneRIF indexing. Features are derived from the sentences or using mechanisms to augment the information provided by them: assigning a discourse label using a previously trained model, for example. We show that machine learning approaches with specific feature combinations achieve results close to one of the annotators. We have evaluated different feature sets and learning algorithms. In particular, Naïve Bayes achieves better performance with a selection of features similar to one used in related work, which considers the location of the sentence, the discourse of the sentence and the functional terminology in it. Conclusions: The current performance is at a level similar to human annotation and it shows that machine learning can be used to automate the task of sentence selection for GeneRIF annotation. The current experiments are limited to the human species. We would like to see how the methodology can be extended to other species, specifically the normalization of gene mentions in other species. PMID:23725347

  1. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods (e.g., clustering) in motif discovery. We then present a study of extending a machine learning model for transcriptional regulation predictive of genetic regulatory response to Caenorhabditis elegans. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally, we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  2. Calibrating Building Energy Models Using Supercomputer Trained Machine Learning Agents

    SciTech Connect

    Sanyal, Jibonananda; New, Joshua Ryan; Edwards, Richard; Parker, Lynne Edwards

    2014-01-01

    Building Energy Modeling (BEM) is an approach to model the energy usage in buildings for design and retrofit purposes. EnergyPlus is the flagship Department of Energy software that performs BEM for different types of buildings. The input to EnergyPlus can often extend to the order of a few thousand parameters, which have to be calibrated manually by an expert for realistic energy modeling. This makes calibration challenging and expensive, thereby making building energy modeling unfeasible for smaller projects. In this paper, we describe the Autotune research, which employs machine learning algorithms to generate agents for the different kinds of standard reference buildings in the U.S. building stock. The parametric space and the variety of building locations and types make this a challenging computational problem necessitating the use of supercomputers. Millions of EnergyPlus simulations are run on supercomputers, and their results are subsequently used to train machine learning algorithms to generate agents. These agents, once created, can then run in a fraction of the time, thereby allowing cost-effective calibration of building models.

  3. Machine Learning for Quantum Metrology and Quantum Control

    NASA Astrophysics Data System (ADS)

    Sanders, Barry; Zahedinejad, Ehsan; Palittapongarnpim, Pantita

    Generating quantum metrological procedures and quantum gate designs, subject to constraints such as temporal or particle-number bounds or limits on the number of control parameters, is typically computationally hard. Although greedy machine learning algorithms are ubiquitous for tackling these problems, the severe constraints listed above limit the efficacy of such approaches. Our aim is to devise heuristic machine learning techniques to generate tractable procedures for adaptive quantum metrology and quantum gate design. In particular, we have modified differential evolution to generate adaptive interferometric-phase quantum metrology procedures for up to 100 photons including loss and noise, and we have generated policies for designing single-shot high-fidelity three-qubit gates in superconducting circuits by avoided level crossings. Although quantum metrology and quantum control are regarded as disparate, we have developed a unified framework for these two subjects, and this unification enables us to transfer insights and breakthroughs from one of the topics to the other. Thanks to NSERC, AITF and 1000 Talent Plan.

  4. Overlay improvements using a real time machine learning algorithm

    NASA Astrophysics Data System (ADS)

    Schmitt-Weaver, Emil; Kubis, Michael; Henke, Wolfgang; Slotboom, Daan; Hoogenboom, Tom; Mulkens, Jan; Coogans, Martyn; ten Berge, Peter; Verkleij, Dick; van de Mast, Frank

    2014-04-01

    While semiconductor manufacturing is moving towards the 14nm node using immersion lithography, the overlay requirements are tightened to below 5nm. Next to improvements in the immersion scanner platform, enhancements in the overlay optimization and process control are needed to enable these low overlay numbers. Whereas conventional overlay control methods address wafer and lot variation autonomously with wafer pre-exposure alignment metrology and post-exposure overlay metrology, we see a need to reduce these variations by correlating more of the TWINSCAN system's sensor data directly to the post-exposure YieldStar metrology in time. In this paper we present the results of a study on applying a real-time control algorithm based on machine learning technology. Machine learning methods use context and TWINSCAN system sensor data paired with post-exposure YieldStar metrology to recognize generic behavior and train the control system to anticipate this generic behavior. Specifically for this study, the data concerns immersion scanner context, sensor data and on-wafer measured overlay data. By making the link between the scanner data and the wafer data we are able to establish a real-time relationship. The result is an inline controller that accounts for small changes in scanner hardware performance in time while picking up subtle lot-to-lot and wafer-to-wafer deviations introduced by wafer processing.

  5. Parsimonious kernel extreme learning machine in primal via Cholesky factorization.

    PubMed

    Zhao, Yong-Ping

    2016-08-01

    Recently, the extreme learning machine (ELM) has become a popular topic in the machine learning community. By replacing the so-called ELM feature mappings with the nonlinear mappings induced by kernel functions, two kernel ELMs, i.e., P-KELM and D-KELM, are obtained from primal and dual perspectives, respectively. Unfortunately, both P-KELM and D-KELM possess dense solutions in direct proportion to the number of training data. To this end, a constructive algorithm for P-KELM (CCP-KELM) is first proposed by virtue of Cholesky factorization, in which the training data incurring the largest reductions on the objective function are recruited as significant vectors. To reduce its training cost further, PCCP-KELM is then obtained with the application of a probabilistic speedup scheme into CCP-KELM. Corresponding to CCP-KELM, a destructive P-KELM (CDP-KELM) is presented using a partial Cholesky factorization strategy, where the training data incurring the smallest reductions on the objective function after their removals are pruned from the current set of significant vectors. Finally, to verify the efficacy and feasibility of the proposed algorithms in this paper, experiments on both small and large benchmark data sets are conducted.

  6. Machine learning of molecular electronic properties in chemical compound space

    NASA Astrophysics Data System (ADS)

    Montavon, Grégoire; Rupp, Matthias; Gobre, Vivekanand; Vazquez-Mayagoitia, Alvaro; Hansen, Katja; Tkatchenko, Alexandre; Müller, Klaus-Robert; Anatole von Lilienfeld, O.

    2013-09-01

    The combination of modern scientific computing with electronic structure theory can lead to an unprecedented amount of data amenable to intelligent data analysis for the identification of meaningful, novel and predictive structure-property relationships. Such relationships enable high-throughput screening for relevant properties in an exponentially growing pool of virtual compounds that are synthetically accessible. Here, we present a machine learning model, trained on a database of ab initio calculation results for thousands of organic molecules, that simultaneously predicts multiple electronic ground- and excited-state properties. The properties include atomization energy, polarizability, frontier orbital eigenvalues, ionization potential, electron affinity and excitation energies. The machine learning model is based on a deep multi-task artificial neural network, exploiting the underlying correlations between various molecular properties. The input is identical to ab initio methods, i.e. nuclear charges and Cartesian coordinates of all atoms. For small organic molecules, the accuracy of such a ‘quantum machine’ is similar, and sometimes superior, to modern quantum-chemical methods—at negligible computational cost.

  7. Machine Learning for Knowledge Extraction from PHR Big Data.

    PubMed

    Poulymenopoulou, Michaela; Malamateniou, Flora; Vassilacopoulos, George

    2014-01-01

    Cloud computing, Internet of things (IOT) and NoSQL database technologies can support a new generation of cloud-based PHR services that contain heterogeneous (unstructured, semi-structured and structured) patient data (health, social and lifestyle) from various sources, including automatically transmitted data from Internet-connected devices in the patient's living space (e.g. medical devices connected to patients at home care). The patient data stored in such PHR systems constitute big data whose analysis with the use of appropriate machine learning algorithms is expected to improve diagnosis and treatment accuracy, to cut healthcare costs and, hence, to improve the overall quality and efficiency of healthcare provided. This paper describes a health data analytics engine which uses machine learning algorithms for analyzing cloud-based PHR big health data towards knowledge extraction to support better healthcare delivery as regards disease diagnosis and prognosis. This engine comprises data preparation, model generation and data analysis modules and runs on the cloud, taking advantage of the map/reduce paradigm provided by Apache Hadoop. PMID:25000009

  8. Estimation of Alpine Skier Posture Using Machine Learning Techniques

    PubMed Central

    Nemec, Bojan; Petrič, Tadej; Babič, Jan; Supej, Matej

    2014-01-01

    High precision Global Navigation Satellite System (GNSS) measurements are becoming more and more popular in alpine skiing due to the relatively undemanding setup and excellent performance. However, GNSS provides only single-point measurements that are defined with the antenna placed typically behind the skier's neck. A key issue is how to estimate other more relevant parameters of the skier's body, like the center of mass (COM) and ski trajectories. Previously, these parameters were estimated by modeling the skier's body with an inverted-pendulum model that oversimplified the skier's body. In this study, we propose two machine learning methods that overcome this shortcoming and estimate COM and ski trajectories based on a more faithful approximation of the skier's body with nine degrees-of-freedom. The first method utilizes a well-established approach of artificial neural networks, while the second method is based on a state-of-the-art statistical generalization method. Both methods were evaluated using the reference measurements obtained on a typical giant slalom course and compared with the inverted-pendulum method. Our results outperform the results of commonly used inverted-pendulum methods and demonstrate the applicability of machine learning techniques in biomechanical measurements of alpine skiing. PMID:25313492

  9. Analyzing angle crashes at unsignalized intersections using machine learning techniques.

    PubMed

    Abdel-Aty, Mohamed; Haleem, Kirolos

    2011-01-01

    A recently developed machine learning technique, multivariate adaptive regression splines (MARS), is introduced in this study to predict vehicles' angle crashes. MARS has promising predictive power and does not suffer from interpretation complexity. Negative Binomial (NB) and MARS models were fitted and compared using extensive data collected on unsignalized intersections in Florida. Two models were estimated for angle crash frequency at 3- and 4-legged unsignalized intersections. Treating crash frequency as a continuous response variable for fitting a MARS model was also examined by considering the natural logarithm of the crash frequency. Finally, combining MARS with another machine learning technique (random forest) was explored and discussed. The fitted NB angle crash models showed several significant factors that contribute to angle crash occurrence at unsignalized intersections, such as traffic volume on the major road, the upstream distance to the nearest signalized intersection, the distance between successive unsignalized intersections, median type on the major approach, percentage of trucks on the major approach, size of the intersection and the geographic location within the state. Based on the mean square prediction error (MSPE) assessment criterion, MARS outperformed the corresponding NB models. Also, using MARS for predicting continuous response variables yielded more favorable results than predicting discrete response variables. The generated MARS models showed the most promising results after screening the covariates using random forest. Based on the results of this study, MARS is recommended as an efficient technique for predicting crashes at unsignalized intersections (angle crashes in this study).

  10. Parsimonious kernel extreme learning machine in primal via Cholesky factorization.

    PubMed

    Zhao, Yong-Ping

    2016-08-01

    Recently, the extreme learning machine (ELM) has become a popular topic in the machine learning community. By replacing the so-called ELM feature mappings with the nonlinear mappings induced by kernel functions, two kernel ELMs, i.e., P-KELM and D-KELM, are obtained from primal and dual perspectives, respectively. Unfortunately, both P-KELM and D-KELM possess dense solutions in direct proportion to the number of training data. To this end, a constructive algorithm for P-KELM (CCP-KELM) is first proposed by virtue of Cholesky factorization, in which the training data incurring the largest reductions on the objective function are recruited as significant vectors. To reduce its training cost further, PCCP-KELM is then obtained with the application of a probabilistic speedup scheme into CCP-KELM. Corresponding to CCP-KELM, a destructive P-KELM (CDP-KELM) is presented using a partial Cholesky factorization strategy, where the training data incurring the smallest reductions on the objective function after their removals are pruned from the current set of significant vectors. Finally, to verify the efficacy and feasibility of the proposed algorithms in this paper, experiments on both small and large benchmark data sets are conducted. PMID:27203553

  11. Edge detection in grayscale imagery using machine learning

    SciTech Connect

    Glocer, K. A.; Perkins, S. J.

    2004-01-01

    Edge detection can be formulated as a binary classification problem at the pixel level with the goal of identifying individual pixels as either on-edge or off-edge. To solve this classification problem we use both fixed and adaptive feature selection in conjunction with a support vector machine. This approach provides a direct data-driven solution and does not require the intermediate step of learning a distribution to perform a likelihood-based classification. Furthermore, the approach can readily be adapted for other image processing tasks. The algorithm was tested on a data set of 50 object images, each associated with a hand-drawn 'ground truth' image. We computed ROC curves to evaluate the performance of the general feature extraction and machine learning approach, and compared that to the standard Canny edge detector and with recent work on statistical edge detection. Using a direct pixel-by-pixel error metric enabled us to compare to the statistical edge detection approach, and our algorithm compared favorably. Using a more 'natural' metric enabled comparison with work by the authors of the image data set, and our algorithm performed comparably to the suite of state-of-the-art edge detectors in that study.
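
    The pixel-level ROC evaluation mentioned above can be sketched as follows. This is a generic illustration rather than the authors' evaluation code: the function names are hypothetical, scores are assumed to be per-pixel classifier outputs where higher means more edge-like, and both classes are assumed to be present in the labels.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    scores: per-pixel classifier outputs (higher means more edge-like).
    labels: per-pixel ground truth, 1 for on-edge and 0 for off-edge.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the decision threshold step by step; each step adds one ROC point.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (fpr, tpr) points sorted by fpr."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

    Flattening an image and its hand-drawn ground truth into two equal-length lists is enough to feed these functions; a detector that ranks every on-edge pixel above every off-edge pixel yields an area of 1.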

  12. Prediction of brain tumor progression using a machine learning technique

    NASA Astrophysics Data System (ADS)

    Shen, Yuzhong; Banerjee, Debrup; Li, Jiang; Chandler, Adam; Shen, Yufei; McKenzie, Frederic D.; Wang, Jihong

    2010-03-01

    A machine learning technique is presented for assessing brain tumor progression by exploring six patients' complete MRI records scanned during their visits in the past two years. There are ten MRI series, including diffusion tensor image (DTI), for each visit. After registering all series to the corresponding DTI scan at the first visit, annotated normal and tumor regions were overlaid. The intensity values of each pixel inside the annotated regions were then extracted across all ten MRI series to compose a 10-dimensional vector. Each feature vector falls into one of three categories: normal, tumor, and normal but progressed to tumor at a later time. In this preliminary study, we focused on the trend of brain tumor progression during three consecutive visits, i.e., visit A, B, and C. A machine learning algorithm was trained using the data containing information from visit A to visit B, and the trained model was used to predict tumor progression from visit A to visit C. Preliminary results showed that prediction for brain tumor progression is feasible. An average of 80.9% pixel-wise accuracy was achieved for tumor progression prediction at visit C.
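
    The feature construction and the pixel-wise accuracy described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, images are plain nested lists that are already co-registered, and the annotated region is given as a boolean mask.

```python
def pixel_features(series, mask):
    """Build one feature vector per annotated pixel.

    series: list of co-registered 2-D images (lists of rows), one per MRI series.
    mask: 2-D list of booleans marking the annotated region.
    Returns a list of ((row, col), [intensity in series 0, series 1, ...]).
    """
    features = []
    for r, mask_row in enumerate(mask):
        for c, inside in enumerate(mask_row):
            if inside:
                # One intensity per series gives a len(series)-dimensional vector.
                features.append(((r, c), [image[r][c] for image in series]))
    return features


def pixelwise_accuracy(predicted, actual):
    """Fraction of pixels whose predicted label matches the reference label."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)
```

    With ten series per visit, each annotated pixel yields a 10-dimensional vector that can be passed to any classifier over the three categories.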

  14. Machine Learning Based Road Detection from High Resolution Imagery

    NASA Astrophysics Data System (ADS)

    Lv, Ye; Wang, Guofeng; Hu, Xiangyun

    2016-06-01

    At present, remote sensing technology is the most effective means of obtaining information about the earth's surface, and it is very useful for geo-information updating and related applications. Extracting roads from remote sensing images is one of the biggest demands of rapid city development and has therefore become a hot topic. Roads in high-resolution images are more complex and their patterns vary widely, which creates obstacles for road extraction. In this paper, a machine learning based strategy is presented. The strategy uses geometry features, radiation features, topology features and texture features. High-resolution remote sensing images cover a large area of landscape, so extracting roads is slow. Therefore, road regions of interest (ROIs) are first detected using Hough line detection and a buffering method to narrow down the search area. As roads in high-resolution images are normally ribbon-shaped, mean-shift and watershed segmentation methods are used to extract road segments. Then, the Real AdaBoost supervised machine learning algorithm is used to pick out segments that contain road patterns. Finally, geometric shape analysis and morphological methods are used to prune and restore the whole road area and to detect the road centerlines.

  15. Forecasting daily streamflow using online sequential extreme learning machines

    NASA Astrophysics Data System (ADS)

    Lima, Aranildo R.; Cannon, Alex J.; Hsieh, William W.

    2016-06-01

    While nonlinear machine learning methods have been widely used in environmental forecasting, in situations where new data arrive continually, the need to make frequent model updates can become cumbersome and computationally costly. To alleviate this problem, an online sequential learning algorithm for single hidden layer feedforward neural networks - the online sequential extreme learning machine (OSELM) - is automatically updated inexpensively as new data arrive (and the new data can then be discarded). OSELM was applied to forecast daily streamflow at two small watersheds in British Columbia, Canada, at lead times of 1-3 days. Predictors used were weather forecast data generated by the NOAA Global Ensemble Forecasting System (GEFS), and local hydro-meteorological observations. OSELM forecasts were tested with daily, monthly or yearly model updates. More frequent updating gave smaller forecast errors, including errors for data above the 90th percentile. Larger datasets used in the initial training of OSELM helped to find better parameters (number of hidden nodes) for the model, yielding better predictions. With the online sequential multiple linear regression (OSMLR) as benchmark, we concluded that OSELM is an attractive approach as it easily outperformed OSMLR in forecast accuracy.
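
    The sequential update described above can be sketched as a recursive least-squares step on the output weights of a fixed random hidden layer. This is a minimal illustration, not the authors' code: the names `rls_update` and `OnlineELM`, the scalar input, the tanh activation, and the ridge-style initialization of `P` (used here in place of an initial batch training phase) are all assumptions.

```python
import math
import random


def rls_update(P, beta, h, y):
    """One recursive least-squares step for a single sample (h, y).

    P: current symmetric inverse-covariance estimate (list of lists).
    beta: current output weights; h: hidden-layer activations; y: scalar target.
    Returns the updated (P, beta); the sample itself need not be stored.
    """
    n = len(h)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(h[i] * Ph[i] for i in range(n))
    # Sherman-Morrison rank-one update of P (relies on P being symmetric).
    P_new = [[P[i][j] - Ph[i] * Ph[j] / denom for j in range(n)] for i in range(n)]
    err = y - sum(h[i] * beta[i] for i in range(n))
    # The gain vector P_new @ h simplifies to Ph / denom.
    beta_new = [beta[i] + Ph[i] / denom * err for i in range(n)]
    return P_new, beta_new


class OnlineELM:
    """Single-hidden-layer network with fixed random hidden weights."""

    def __init__(self, n_hidden, seed=0, ridge=1e-6):
        rng = random.Random(seed)
        self.w = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.beta = [0.0] * n_hidden
        # Assumed initialization: P = I / ridge instead of an initial batch fit.
        self.P = [[1.0 / ridge if i == j else 0.0 for j in range(n_hidden)]
                  for i in range(n_hidden)]

    def hidden(self, x):
        return [math.tanh(w * x + b) for w, b in zip(self.w, self.b)]

    def partial_fit(self, x, y):
        """Update the output weights with one new observation."""
        self.P, self.beta = rls_update(self.P, self.beta, self.hidden(x), y)

    def predict(self, x):
        return sum(h * b for h, b in zip(self.hidden(x), self.beta))
```

    Only the output weights `beta` and the matrix `P` are carried forward between updates, which is why each new observation can be discarded once it has been used.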

  16. Predicting submicron air pollution indicators: a machine learning approach.

    PubMed

    Pandey, Gaurav; Zhang, Bin; Jian, Le

    2013-05-01

    The regulation of air pollutant levels is rapidly becoming one of the most important tasks for the governments of developing countries, especially China. Submicron particles, such as ultrafine particles (UFP, aerodynamic diameter ≤ 100 nm) and particulate matter ≤ 1.0 micrometers (PM1.0), are an unregulated emerging health threat to humans, but the relationships between the concentration of these particles and meteorological and traffic factors are poorly understood. To shed some light on these connections, we employed a range of machine learning techniques to predict UFP and PM1.0 levels based on a dataset consisting of observations of weather and traffic variables recorded at a busy roadside in Hangzhou, China. Based upon a thorough examination of over twenty-five classifiers used for this task, we find that it is possible to predict PM1.0 and UFP levels reasonably accurately and that tree-based classification models (Alternating Decision Tree and Random Forests) perform the best for both these particles. In addition, weather variables show a stronger relationship with PM1.0 and UFP levels, and thus cannot be ignored for predicting submicron particle levels. Overall, this study has demonstrated the potential application value of systematically collecting and analysing datasets using machine learning techniques for the prediction of submicron sized ambient air pollutants.

  17. Complex extreme learning machine applications in terahertz pulsed signals feature sets.

    PubMed

    Yin, X-X; Hadjiloucas, S; Zhang, Y

    2014-11-01

    This paper presents a novel approach to the automatic classification of very large data sets composed of terahertz pulse transient signals, highlighting their potential use in biochemical, biomedical, pharmaceutical and security applications. Two different types of THz spectra are considered in the classification process. Firstly a binary classification study of poly-A and poly-C ribonucleic acid samples is performed. This is then contrasted with a difficult multi-class classification problem of spectra from six different powder samples that, although they have fairly indistinguishable features in the optical spectrum, possess a few discernible spectral features in the terahertz part of the spectrum. Classification is performed using a complex-valued extreme learning machine algorithm that takes into account features in both the amplitude and the phase of the recorded spectra. Classification speed and accuracy are contrasted with that achieved using a support vector machine classifier. The study systematically compares the classifier performance achieved after adopting different Gaussian kernels when separating amplitude and phase signatures. The two signatures are presented as feature vectors for both training and testing purposes. The study confirms the utility of complex-valued extreme learning machine algorithms for classification of the very large data sets generated with current terahertz imaging spectrometers. The classifier can take into consideration heterogeneous layers within an object as would be required within a tomographic setting and is sufficiently robust to detect patterns hidden inside noisy terahertz data sets. The proposed study opens up the opportunity for the establishment of complex-valued extreme learning machine algorithms as new chemometric tools that will assist the wider proliferation of terahertz sensing technology for chemical sensing, quality control, security screening and clinical diagnosis. Furthermore, the proposed

  19. Genomics and Machine Learning for Taxonomy Consensus: The Mycobacterium tuberculosis Complex Paradigm.

    PubMed

    Azé, Jérôme; Sola, Christophe; Zhang, Jian; Lafosse-Marin, Florian; Yasmin, Memona; Siddiqui, Rubina; Kremer, Kristin; van Soolingen, Dick; Refrégier, Guislaine

    2015-01-01

    Infra-species taxonomy is a prerequisite to compare features such as virulence in different pathogen lineages. Mycobacterium tuberculosis complex taxonomy has rapidly evolved in the last 20 years through intensive clinical isolation, advances in sequencing and in the description of fast-evolving loci (CRISPR and MIRU-VNTR). On-line tools to describe new isolates have been set up based on known diversity either on CRISPRs (also known as spoligotypes) or on MIRU-VNTR profiles. The underlying taxonomies are largely concordant but use different names and offer different depths. The objectives of this study were 1) to make explicit the consensus that exists between the alternative taxonomies, and 2) to provide an on-line tool to ease classification of new isolates. Genotyping (24-VNTR, 43-spacers spoligotypes, IS6110-RFLP) was undertaken for 3,454 clinical isolates from the Netherlands (2004-2008). The resulting database was enlarged with African isolates to include most human tuberculosis diversity. Assignations were obtained using TB-Lineage, MIRU-VNTRPlus, SITVITWEB and an algorithm from Borile et al. By identifying the recurrent concordances between the alternative taxonomies, we proposed a consensus including 22 sublineages. Original and consensus assignations of all the isolates from the database were subsequently implemented into an ensemble learning approach based on the Machine Learning tool Weka to derive a classification scheme. All assignations were reproduced with very good sensitivities and specificities. When applied to independent datasets, it was able to suggest new sublineages such as pseudo-Beijing. This Lineage Prediction tool, efficient on 15-MIRU, 24-VNTR and spoligotype data, is available on the web interface "TBminer." Another section of this website helps summarize key molecular epidemiological data, easing tuberculosis surveillance. Altogether, we successfully used Machine Learning on a large dataset to set up and make available the first consensual

  20. Genomics and Machine Learning for Taxonomy Consensus: The Mycobacterium tuberculosis Complex Paradigm

    PubMed Central

    Azé, Jérôme; Sola, Christophe; Zhang, Jian; Lafosse-Marin, Florian; Yasmin, Memona; Siddiqui, Rubina; Kremer, Kristin; van Soolingen, Dick; Refrégier, Guislaine

    2015-01-01

    Infra-species taxonomy is a prerequisite to compare features such as virulence in different pathogen lineages. Mycobacterium tuberculosis complex taxonomy has rapidly evolved in the last 20 years through intensive clinical isolation, advances in sequencing and in the description of fast-evolving loci (CRISPR and MIRU-VNTR). On-line tools to describe new isolates have been set up based on known diversity either on CRISPRs (also known as spoligotypes) or on MIRU-VNTR profiles. The underlying taxonomies are largely concordant but use different names and offer different depths. The objectives of this study were 1) to make explicit the consensus that exists between the alternative taxonomies, and 2) to provide an on-line tool to ease classification of new isolates. Genotyping (24-VNTR, 43-spacers spoligotypes, IS6110-RFLP) was undertaken for 3,454 clinical isolates from the Netherlands (2004-2008). The resulting database was enlarged with African isolates to include most human tuberculosis diversity. Assignations were obtained using TB-Lineage, MIRU-VNTRPlus, SITVITWEB and an algorithm from Borile et al. By identifying the recurrent concordances between the alternative taxonomies, we proposed a consensus including 22 sublineages. Original and consensus assignations of all the isolates from the database were subsequently implemented into an ensemble learning approach based on the Machine Learning tool Weka to derive a classification scheme. All assignations were reproduced with very good sensitivities and specificities. When applied to independent datasets, it was able to suggest new sublineages such as pseudo-Beijing. This Lineage Prediction tool, efficient on 15-MIRU, 24-VNTR and spoligotype data, is available on the web interface “TBminer.” Another section of this website helps summarize key molecular epidemiological data, easing tuberculosis surveillance. Altogether, we successfully used Machine Learning on a large dataset to set up and make available the first

  1. INL Review of Fueling Machine Inspection Tool Development Proposal

    SciTech Connect

    Griffith, George

    2015-03-01

    A review of a technical proposal for James Fischer Nuclear. The document describes an inspection tool to examine the graphite moderator in an AGR reactor. The system is an optical system to look at the graphite blocks for cracks. INL reviews the document for technical value.

  2. A Qualitative Evaluation of Evolution of a Learning Analytics Tool

    ERIC Educational Resources Information Center

    Ali, Liaqat; Hatala, Marek; Gasevic, Dragan; Jovanovic, Jelena

    2012-01-01

    LOCO-Analyst is a learning analytics tool we developed to provide educators with feedback on students' learning activities and performance. Evaluation of the first version of the tool led to the enhancement of the tool's data visualization, user interface, and supported feedback types. The second evaluation of the improved tool allowed us to see…

  3. Application of machine learning using support vector machines for crater detection from Martian digital topography data

    NASA Astrophysics Data System (ADS)

    Salamunićcar, Goran; Lončarić, Sven

    In our previous work, in order to extend the GT-57633 catalogue [PSS, 56 (15), 1992-2008] with still uncatalogued impact-craters, the following was done [GRS, 48 (5), in press, doi:10.1109/TGRS.2009.2037750]: (1) a crater detection algorithm (CDA) based on a digital elevation model (DEM) was developed; (2) using 1/128° MOLA data, this CDA proposed 414631 crater-candidates; (3) each crater-candidate was analyzed manually; and (4) 57592 were confirmed as correct detections. The resulting GT-115225 catalogue is the significant result of this effort. However, checking such a large number of crater-candidates manually was a demanding task. This was the main motivation for improving the CDA so that it better classifies crater-candidates as true or false detections. To achieve this, we extended the CDA with a machine learning capability, using support vector machines (SVM). In the first step, the CDA (re)calculates numerous terrain morphometric attributes from the DEM. For this purpose, existing modules of the CDA from our previous work were reused to prepare these attributes. In addition, new attributes were introduced, such as ellipse eccentricity and tilt. For machine learning purposes, the CDA is additionally extended to provide a 2-D topography profile and a 3-D shape for each crater-candidate. The latter two are a performance problem because of the large number of crater-candidates in combination with the large number of attributes. As a solution, we developed a CDA architecture in which it is possible to combine an SVM with a radial basis function (RBF) or any other kernel (for the initial set of attributes) with an SVM with a linear kernel (for the cases when 2-D and 3-D data are included as well). Another challenge is that, in addition to the diversity of possible crater types, there are numerous morphological differences between the smallest (mostly very circular bowl-shaped craters) and the largest (multi-ring) impact
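
    The trade-off between the two kernels mentioned above can be sketched in a few lines. This is an illustrative sketch only, not the authors' CDA code: the support vectors, coefficients and `gamma` value are made-up toy numbers, and the function names are hypothetical. It shows why a linear kernel suits very high-dimensional inputs: the kernel expansion collapses into a single weight vector, so prediction cost no longer grows with the number of support vectors.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def linear_kernel(x: Vector, z: Vector) -> float:
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x: Vector, z: Vector, gamma: float = 0.5) -> float:
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision(x: Vector, support_vectors: List[Vector], coeffs: List[float],
             bias: float, kernel: Callable[[Vector, Vector], float]) -> float:
    """Generic SVM decision value: one kernel evaluation per support vector."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + bias


def collapse_linear(support_vectors: List[Vector], coeffs: List[float]) -> List[float]:
    """For a linear kernel only, fold all support vectors into one weight vector."""
    dim = len(support_vectors[0])
    return [sum(c * sv[i] for sv, c in zip(support_vectors, coeffs)) for i in range(dim)]


# Toy values (hypothetical, for illustration only).
SVS = [[1.0, 0.0], [0.0, 1.0]]
COEFFS = [1.0, -1.0]
X = [2.0, 1.0]

full = decision(X, SVS, COEFFS, 0.0, linear_kernel)
w = collapse_linear(SVS, COEFFS)
fast = linear_kernel(w, X)  # same value, one dot product
```

    With an RBF kernel no such collapse exists, so every prediction must visit every support vector.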

  4. Two-Stage Machine Learning model for guideline development.

    PubMed

    Mani, S; Shankle, W R; Dick, M B; Pazzani, M J

    1999-05-01

    We present a Two-Stage Machine Learning (ML) model as a data mining method to develop practice guidelines and apply it to the problem of dementia staging. Dementia staging in clinical settings is at present complex and highly subjective because of the ambiguities and the complicated nature of existing guidelines. Our model abstracts the two-stage process used by physicians to arrive at the global Clinical Dementia Rating Scale (CDRS) score. The model incorporates learning intermediate concepts (CDRS category scores) in the first stage that then become the feature space for the second stage (global CDRS score). The sample consisted of 678 patients evaluated in the Alzheimer's Disease Research Center at the University of California, Irvine. The demographic variables, functional and cognitive test results used by physicians for the task of dementia severity staging were used as input to the machine learning algorithms. Decision tree learners and rule inducers (C4.5, Cart, C4.5 rules) were selected for our study as they give expressive models, and Naive Bayes was used as a baseline algorithm for comparison purposes. We first learned the six CDRS category scores (memory, orientation, judgement and problem solving, personal care, home and hobbies, and community affairs). These learned CDRS category scores were then used to learn the global CDRS scores. The Two-Stage ML model classified as well as or better than the published inter-rater agreements for both the category and global CDRS scoring by dementia experts. Furthermore, for the most critical distinction, normal versus very mildly impaired, the Two-Stage ML model was 28.1 and 6.6% more accurate than published performances by domain experts. Our study of the CDRS examined one of the largest, most diverse samples in the literature, suggesting that our findings are robust. The Two-Stage ML model also identified a CDRS category, Judgment and Problem Solving, which has low classification accuracy similar to published
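
    The two-stage structure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the data are toy values, the learner is a trivial 1-nearest-neighbour rule standing in for the decision-tree and rule learners named in the abstract, and `fit_1nn` and `two_stage_fit` are hypothetical names. Stage 1 learns one classifier per intermediate concept; stage 2 learns the global label from the stage-1 predictions alone.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], int]


def fit_1nn(X: List[Vector], y: List[int]) -> Classifier:
    """Return a 1-nearest-neighbour classifier (stand-in for any learner)."""
    def predict(x: Vector) -> int:
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in X]
        return y[dists.index(min(dists))]
    return predict


def two_stage_fit(X: List[Vector], category_labels: List[List[int]],
                  global_labels: List[int]) -> Classifier:
    # Stage 1: one classifier per intermediate concept (category score).
    stage1 = [fit_1nn(X, labels) for labels in category_labels]
    # Stage 2: the feature space is the vector of stage-1 predictions.
    Z = [[clf(x) for clf in stage1] for x in X]
    stage2 = fit_1nn(Z, global_labels)

    def predict(x: Vector) -> int:
        return stage2([clf(x) for clf in stage1])
    return predict


# Toy data (hypothetical): two raw features, two category scores, one global score.
X_TRAIN = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
CATEGORY_LABELS = [[0, 0, 1, 1], [0, 0, 1, 1]]
GLOBAL_LABELS = [0, 0, 1, 1]

model = two_stage_fit(X_TRAIN, CATEGORY_LABELS, GLOBAL_LABELS)
```

    Calling `model([0.1, 0.1])` routes the raw features through the category classifiers first, so the global prediction never sees the raw inputs directly.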

  5. GIS learning tool for world's largest earthquakes and their causes

    NASA Astrophysics Data System (ADS)

    Chatterjee, Moumita

    The objective of this thesis is to increase awareness about earthquakes among people, especially young students, by showing the five largest and two most predictable earthquake locations in the world and their plate tectonic settings. This is a geography-based interactive tool which could be used for learning about the causes of great earthquakes in the past and the safest places on the earth in order to avoid the direct effects of earthquakes. This approach provides an effective way of learning for the students as it is very user friendly and more aligned to the interests of the younger generation. In this tool the user can click on the various points located on the world map, which will open a picture and a link to the webpage for that point, showing detailed information on the earthquake history of that place, including the magnitude of quakes, the years of past quakes and the plate tectonic settings that made this place earthquake prone. Apart from learning the earthquake-related information, students will also be able to customize the tool to suit their needs or interests. Students will be able to add/remove layers, measure the distance between any two points on the map, select any place on the map and get more information about that place, create a layer from this set to do a detailed analysis, run a query, change display settings, etc. At the end of this tool the user has to go through the earthquake safety guidelines in order to be safe during an earthquake. This tool uses Java as its programming language and uses Map Objects Java Edition (MOJO) provided by ESRI. This tool is developed for educational purposes and hence its interface has been kept simple and easy to use so that students can gain maximum knowledge through it instead of struggling to install it. There are many details to explore, which help show what a GIS-based tool is capable of. The only thing needed to run this tool is the latest Java edition installed on the user's machine. This approach makes study more fun and

  6. Clinical utility of machine-learning approaches in schizophrenia: improving diagnostic confidence for translational neuroimaging.

    PubMed

    Iwabuchi, Sarina J; Liddle, Peter F; Palaniyappan, Lena

    2013-01-01

    Machine-learning approaches are becoming commonplace in the neuroimaging literature as potential diagnostic and prognostic tools for the study of clinical populations. However, very few studies provide clinically informative measures to aid in decision-making and resource allocation. Head-to-head comparison of neuroimaging-based multivariate classifiers is an essential first step to promote translation of these tools to clinical practice. We systematically evaluated the classifier performance using back-to-back structural MRI in two field strengths (3- and 7-T) to discriminate patients with schizophrenia (n = 19) from healthy controls (n = 20). Gray matter (GM) and white matter images were used as inputs into a support vector machine to classify patients and control subjects. Seven Tesla classifiers outperformed the 3-T classifiers with accuracy reaching as high as 77% for the 7-T GM classifier compared to 66.6% for the 3-T GM classifier. Furthermore, diagnostic odds ratio (a measure that is not affected by variations in sample characteristics) and number needed to predict (a measure based on Bayesian certainty of a test result) indicated superior performance of the 7-T classifiers, whereby for each correct diagnosis made, the number of patients that need to be examined using the 7-T GM classifier was one less than the number that need to be examined if a different classifier was used. Using a hypothetical example, we highlight how these findings could have significant implications for clinical decision-making. We encourage the reporting of measures proposed here in future studies utilizing machine-learning approaches. This will not only promote the search for an optimum diagnostic tool but also aid in the translation of neuroimaging to clinical use. PMID:24009589
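
    The two summary measures named above can be computed from a 2x2 confusion matrix. The sketch below is illustrative only: the counts are invented and are not the study's results, and the number-needed-to-predict formula 1 / (PPV + NPV - 1) is an assumed definition that the abstract itself does not spell out.

```python
def diagnostic_odds_ratio(tp: int, fn: int, fp: int, tn: int) -> float:
    """DOR = (TP * TN) / (FP * FN): odds of a positive test in cases vs controls."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def number_needed_to_predict(tp: int, fn: int, fp: int, tn: int) -> float:
    """Assumed definition: 1 / (PPV + NPV - 1), built from predictive values."""
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    return 1.0 / (ppv + npv - 1.0)


# Hypothetical counts for illustration (not taken from the study).
TP, FN, FP, TN = 40, 10, 5, 45

dor = diagnostic_odds_ratio(TP, FN, FP, TN)
nnp = number_needed_to_predict(TP, FN, FP, TN)
```

    A larger odds ratio and a number needed to predict closer to 1 both indicate a more useful classifier.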

  7. Clinical Utility of Machine-Learning Approaches in Schizophrenia: Improving Diagnostic Confidence for Translational Neuroimaging

    PubMed Central

    Iwabuchi, Sarina J.; Liddle, Peter F.; Palaniyappan, Lena

    2013-01-01

    Machine-learning approaches are becoming commonplace in the neuroimaging literature as potential diagnostic and prognostic tools for the study of clinical populations. However, very few studies provide clinically informative measures to aid in decision-making and resource allocation. Head-to-head comparison of neuroimaging-based multivariate classifiers is an essential first step to promote translation of these tools to clinical practice. We systematically evaluated the classifier performance using back-to-back structural MRI in two field strengths (3- and 7-T) to discriminate patients with schizophrenia (n = 19) from healthy controls (n = 20). Gray matter (GM) and white matter images were used as inputs into a support vector machine to classify patients and control subjects. Seven Tesla classifiers outperformed the 3-T classifiers with accuracy reaching as high as 77% for the 7-T GM classifier compared to 66.6% for the 3-T GM classifier. Furthermore, diagnostic odds ratio (a measure that is not affected by variations in sample characteristics) and number needed to predict (a measure based on Bayesian certainty of a test result) indicated superior performance of the 7-T classifiers, whereby for each correct diagnosis made, the number of patients that need to be examined using the 7-T GM classifier was one less than the number that need to be examined if a different classifier was used. Using a hypothetical example, we highlight how these findings could have significant implications for clinical decision-making. We encourage the reporting of measures proposed here in future studies utilizing machine-learning approaches. This will not only promote the search for an optimum diagnostic tool but also aid in the translation of neuroimaging to clinical use. PMID:24009589

  9. Compensation of Gravity-Induced Errors on a Hexapod-Type Parallel Kinematic Machine Tool

    NASA Astrophysics Data System (ADS)

    Ibaraki, Soichi; Okuda, Toshihiro; Kakino, Yoshiaki; Nakagawa, Masao; Matsushita, Tetsuya; Ando, Tomoharu

    This paper presents a methodology to compensate for contouring errors introduced by gravity on a Hexapod-type parallel kinematic machine tool with a Stewart platform. Unlike in conventional serial kinematic feed drives, gravity has a critical effect on the positioning accuracy of a parallel kinematic feed drive, and its effect varies significantly depending on the position and the orientation of the spindle. We first present a kinematic model to predict the elastic deformation of struts caused by gravity. The positioning error at the tool tip is given as the superposition of the deformation of each strut. It is experimentally verified for a commercial parallel kinematic machine tool that the machine's contouring error is significantly reduced by compensating for gravity-induced errors on a reference trajectory.

  10. Method and apparatus for suppressing regenerative instability and related chatter in machine tools

    DOEpatents

    Segalman, Daniel J.; Redmond, James M.

    2001-01-01

    Methods of and apparatuses for mitigating chatter vibrations in machine tools or components thereof. Chatter therein is suppressed by periodically or continuously varying the stiffness of the cutting tool (or some component of the cutting tool), and hence the resonant frequency of the cutting tool (or some component thereof). The varying of resonant frequency of the cutting tool can be accomplished by modulating the stiffness of the cutting tool, the cutting tool holder, or any other component of the support for the cutting tool. By periodically altering the impedance of the cutting tool assembly, chatter is mitigated. In one embodiment, a cyclic electric (or magnetic) field is applied to the spindle quill which contains an electro-rheological (or magneto-rheological) fluid. The variable yield stress in the fluid affects the coupling of the spindle to the machine tool structure, changing the natural frequency of oscillation. Altering the modal characteristics in this fashion disrupts the modulation of current tool vibrations with previous tool vibrations recorded on the workpiece surface.
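
    The link between stiffness and resonant frequency that this method relies on can be illustrated with a single-degree-of-freedom model, f = sqrt(k/m) / (2*pi). The sketch below is an assumption-laden illustration, not the patented apparatus: the stiffness, mass, modulation depth and modulation rate are hypothetical values, and the sinusoidal modulation law is just one simple choice.

```python
import math


def natural_frequency_hz(stiffness_n_per_m: float, mass_kg: float) -> float:
    """Undamped natural frequency of a spring-mass system, in hertz."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)


def modulated_stiffness(k0: float, depth: float, f_mod_hz: float, t: float) -> float:
    """Stiffness varied sinusoidally around k0 (assumed modulation law)."""
    return k0 * (1.0 + depth * math.sin(2.0 * math.pi * f_mod_hz * t))


# Hypothetical tool-assembly parameters.
K0 = 2.0e7      # nominal stiffness, N/m
MASS = 5.0      # effective mass, kg
DEPTH = 0.2     # +/-20 % stiffness modulation
F_MOD = 10.0    # modulation rate, Hz

# Natural frequency sampled over one modulation period (0.1 s).
samples = [
    natural_frequency_hz(modulated_stiffness(K0, DEPTH, F_MOD, i / 200.0), MASS)
    for i in range(21)
]
nominal = natural_frequency_hz(K0, MASS)
```

    Because the resonant frequency keeps moving, vibration left on the workpiece by an earlier pass no longer lines up with the tool's current resonance.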

  11. Method and apparatus for suppressing regenerative instability and related chatter in machine tools

    DOEpatents

    Segalman, Daniel J.; Redmond, James M.

    1999-01-01

    Methods of and apparatuses for mitigating chatter vibrations in machine tools or components thereof. Chatter therein is suppressed by periodically or continuously varying the stiffness of the cutting tool (or some component of the cutting tool), and hence the resonant frequency of the cutting tool (or some component thereof). The varying of resonant frequency of the cutting tool can be accomplished by modulating the stiffness of the cutting tool, the cutting tool holder, or any other component of the support for the cutting tool. By periodically altering the impedance of the cutting tool assembly, chatter is mitigated. In one embodiment, a cyclic electric (or magnetic) field is applied to the spindle quill which contains an electro-rheological (or magneto-rheological) fluid. The variable yield stress in the fluid affects the coupling of the spindle to the machine tool structure, changing the natural frequency of oscillation. Altering the modal characteristics in this fashion disrupts the modulation of current tool vibrations with previous tool vibrations recorded on the workpiece surface.

  12. Ten million and one penguins, or, lessons learned from booting millions of virtual machines on HPC systems.

    SciTech Connect

    Minnich, Ronald G.; Rudish, Donald W.

    2009-01-01

    In this paper we describe Megatux, a set of tools we are developing for rapid provisioning of millions of virtual machines and controlling and monitoring them, as well as what we've learned from booting one million Linux virtual machines on the Thunderbird (4660 nodes) and 550,000 Linux virtual machines on the Hyperion (1024 nodes) clusters. As might be expected, our tools use hierarchical structures. In contrast to existing HPC systems, our tools do not require perfect hardware, that all systems be booted at the same time, or static configuration files that define the role of each node. While we believe these tools will be useful for future HPC systems, we are using them today to construct botnets. Botnets have been in the news recently, as discoveries of their scale (millions of infected machines for even a single botnet) and their reach (global) and their impact on organizations (devastating in financial costs and time lost to recovery) have become more apparent. A distinguishing feature of botnets is their emergent behavior: fairly simple operational rule sets can result in behavior that cannot be predicted. In general, there is no reducible understanding of how a large network will behave ahead of 'running it'. 'Running it' means observing the actual network in operation or simulating/emulating it. Unfortunately, this behavior is only seen at scale, i.e. when at minimum tens of thousands of machines are infected. To add to the problem, botnets typically change at least 11% of the machines they are using in any given week, and this changing population is an integral part of their behavior. The use of virtual machines to assist in the forensics of malware is not new to the cyber security world. Reverse engineering techniques often use virtual machines in combination with code debuggers. Nevertheless, this task largely remains a manual process to get past code obfuscation and is inherently slow. As part of our cyber security work at Sandia National Laboratories

  13. Digital teaching tools and global learning communities.

    PubMed

    Williams, Mary; Lockhart, Patti; Martin, Cathie

    2015-01-01

    In 2009, we started a project to support the teaching and learning of university-level plant sciences, called Teaching Tools in Plant Biology. Articles in this series are published by the plant science journal, The Plant Cell (published by the American Society of Plant Biologists). Five years on, we investigated how the published materials are being used through an analysis of the Google Analytics pageviews distribution and through a user survey. Our results suggest that this project has had a broad, global impact in supporting higher education, and also that the materials are used differently by individuals in terms of their role (instructor, independent learner, student) and geographical location. We also report on our ongoing efforts to develop a global learning community that encourages discussion and resource sharing.

  14. Digital teaching tools and global learning communities

    PubMed Central

    Williams, Mary; Lockhart, Patti; Martin, Cathie

    2015-01-01

    In 2009, we started a project to support the teaching and learning of university-level plant sciences, called Teaching Tools in Plant Biology. Articles in this series are published by the plant science journal, The Plant Cell (published by the American Society of Plant Biologists). Five years on, we investigated how the published materials are being used through an analysis of the Google Analytics pageviews distribution and through a user survey. Our results suggest that this project has had a broad, global impact in supporting higher education, and also that the materials are used differently by individuals in terms of their role (instructor, independent learner, student) and geographical location. We also report on our ongoing efforts to develop a global learning community that encourages discussion and resource sharing. PMID:25949805

  15. Machine Learning Data Imputation and Classification in a Multicohort Hypertension Clinical Study.

    PubMed

    Seffens, William; Evans, Chad; Taylor, Herman

    2015-01-01

    Health-care initiatives are pushing the development and utilization of clinical data for medical discovery and translational research studies. Machine learning tools implemented for Big Data have been applied to detect patterns in complex diseases. This study focuses on hypertension and examines phenotype data across a major clinical study called Minority Health Genomics and Translational Research Repository Database composed of self-reported African American (AA) participants combined with related cohorts. Prior genome-wide association studies for hypertension in AAs presumed that an increase of disease burden in susceptible populations is due to rare variants. But genomic analysis of hypertension, even those designed to focus on rare variants, has yielded marginal genome-wide results over many studies. Machine learning and other nonparametric statistical methods have recently been shown to uncover relationships in complex phenotypes, genotypes, and clinical data. We trained neural networks with phenotype data for missing-data imputation to increase the usable size of a clinical data set. Validity was established by showing performance effects using the expanded data set for the association of phenotype variables with case/control status of patients. Data mining classification tools were used to generate association rules.

  16. Machine learning patterns for neuroimaging-genetic studies in the cloud.

    PubMed

    Da Mota, Benoit; Tudoran, Radu; Costan, Alexandru; Varoquaux, Gaël; Brasche, Goetz; Conrod, Patricia; Lemaitre, Herve; Paus, Tomas; Rietschel, Marcella; Frouin, Vincent; Poline, Jean-Baptiste; Antoniu, Gabriel; Thirion, Bertrand

    2014-01-01

    Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or brain pathologies risk factors. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a two-week deployment on hundreds of virtual machines.
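
    The map/reduce pattern for non-parametric statistics described above can be sketched on a single machine. This is a hedged illustration only: it does not use the TomusBLOB or Scikit-learn APIs, the data are toy integers, and `map_task`, `reduce_counts` and `permutation_p_value` are hypothetical names. Each map task runs an independent batch of label permutations; the reduce step merges the counts into one permutation p-value.

```python
import random
from functools import reduce
from typing import List, Tuple


def assoc_stat(x: List[int], y: List[int]) -> int:
    """Unnormalised absolute covariance; integer arithmetic keeps it exact."""
    n = len(x)
    return abs(n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y))


def map_task(args: Tuple[int, int, List[int], List[int], int]) -> Tuple[int, int]:
    """One worker: count permutations whose statistic reaches the observed one."""
    seed, n_perm, x, y, observed = args
    rng = random.Random(seed)  # independent, reproducible stream per task
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if assoc_stat(x, y_perm) >= observed:
            exceed += 1
    return exceed, n_perm


def reduce_counts(a: Tuple[int, int], b: Tuple[int, int]) -> Tuple[int, int]:
    """Merge two partial (exceed, total) results."""
    return a[0] + b[0], a[1] + b[1]


def permutation_p_value(x: List[int], y: List[int],
                        n_tasks: int = 4, perms_per_task: int = 50) -> float:
    observed = assoc_stat(x, y)
    tasks = [(seed, perms_per_task, x, y, observed) for seed in range(n_tasks)]
    # In a real deployment the map calls would run on separate machines.
    exceed, total = reduce(reduce_counts, map(map_task, tasks), (0, 0))
    return (exceed + 1) / (total + 1)


# Toy, perfectly associated data (hypothetical).
x = list(range(20))
y = [2 * v + 1 for v in x]
p = permutation_p_value(x, y)
```

    Because each task depends only on its own seed, the same function can be tested locally and then fanned out unchanged, which is the workflow the abstract describes.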

  17. Machine learning patterns for neuroimaging-genetic studies in the cloud

    PubMed Central

    Da Mota, Benoit; Tudoran, Radu; Costan, Alexandru; Varoquaux, Gaël; Brasche, Goetz; Conrod, Patricia; Lemaitre, Herve; Paus, Tomas; Rietschel, Marcella; Frouin, Vincent; Poline, Jean-Baptiste; Antoniu, Gabriel; Thirion, Bertrand

    2014-01-01

    Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or brain pathologies risk factors. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a two-week deployment on hundreds of virtual machines. PMID:24782753

  18. Machine Learning Data Imputation and Classification in a Multicohort Hypertension Clinical Study

    PubMed Central

    Seffens, William; Evans, Chad; Taylor, Herman

    2015-01-01

    Health-care initiatives are pushing the development and utilization of clinical data for medical discovery and translational research studies. Machine learning tools implemented for Big Data have been applied to detect patterns in complex diseases. This study focuses on hypertension and examines phenotype data across a major clinical study called Minority Health Genomics and Translational Research Repository Database composed of self-reported African American (AA) participants combined with related cohorts. Prior genome-wide association studies for hypertension in AAs presumed that an increase of disease burden in susceptible populations is due to rare variants. But genomic analysis of hypertension, even those designed to focus on rare variants, has yielded marginal genome-wide results over many studies. Machine learning and other nonparametric statistical methods have recently been shown to uncover relationships in complex phenotypes, genotypes, and clinical data. We trained neural networks with phenotype data for missing-data imputation to increase the usable size of a clinical data set. Validity was established by showing performance effects using the expanded data set for the association of phenotype variables with case/control status of patients. Data mining classification tools were used to generate association rules. PMID:27199552

  20. Calibration of prismatic joints in multi-axis machine tools by a three lines measuring method

    NASA Astrophysics Data System (ADS)

    Khan, Abdul Wahid; Chen, Wuyi

    2008-10-01

    A three-line calibration method for error quantification in a prismatic joint of a machine tool was proposed and implemented using a laser interferometer as a working standard. It greatly simplified the measurement setup requirements and accelerated the calibration of prismatic joints. Moreover, it was highly economical, reducing the calibration time and eliminating the use of complex optics. The methodology was implemented on the prismatic joints of a three-axis CNC machine tool as per standard procedures and guidelines. A cubic spline technique was used for error modeling, and the results obtained were reported for further use in compensating the errors to improve the accuracy of prismatic joints.

  1. The Challenges of Blended Learning Using a Media Annotation Tool

    ERIC Educational Resources Information Center

    Douglas, Kathy A.; Lang, Josephine; Colasante, Meg

    2014-01-01

    Blended learning has been evolving as an important approach to learning and teaching in tertiary education. This approach incorporates learning in both online and face-to-face modes and promotes deep learning by incorporating the best of both approaches. An innovation in blended learning is the use of an online media annotation tool (MAT) in…

  2. Machine Learning Strategy for Accelerated Design of Polymer Dielectrics

    PubMed Central

    Mannodi-Kanakkithodi, Arun; Pilania, Ghanshyam; Huan, Tran Doan; Lookman, Turab; Ramprasad, Rampi

    2016-01-01

    The ability to efficiently design new and advanced dielectric polymers is hampered by the lack of sufficient, reliable data on wide polymer chemical spaces, and the difficulty of generating such data given time and computational/experimental constraints. Here, we address the issue of accelerating polymer dielectrics design by extracting learning models from data generated by accurate state-of-the-art first principles computations for polymers occupying an important part of the chemical subspace. The polymers are ‘fingerprinted’ as simple, easily attainable numerical representations, which are mapped to the properties of interest using a machine learning algorithm to develop an on-demand property prediction model. Further, a genetic algorithm is utilised to optimise polymer constituent blocks in an evolutionary manner, thus directly leading to the design of polymers with given target properties. While this philosophy of learning to make instant predictions and design is demonstrated here for the example of polymer dielectrics, it is equally applicable to other classes of materials as well. PMID:26876223
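
    The fingerprint, property model and genetic algorithm loop described above can be sketched as follows. Everything specific here is an assumption made for illustration: the block names, the linear surrogate weights, the target value and the GA settings are hypothetical, and the linear surrogate merely stands in for whatever learned model maps fingerprints to properties.

```python
import random
from typing import Dict, List, Tuple

BLOCKS = ["A", "B", "C", "D"]                         # hypothetical building blocks
WEIGHTS = {"A": 1.0, "B": 2.5, "C": 4.0, "D": 0.5}    # hypothetical surrogate weights


def fingerprint(polymer: List[str]) -> Dict[str, float]:
    """Simple numerical representation: fraction of each block in the chain."""
    return {b: polymer.count(b) / len(polymer) for b in BLOCKS}


def predict_property(polymer: List[str]) -> float:
    """Stand-in surrogate model mapping a fingerprint to a property value."""
    fp = fingerprint(polymer)
    return sum(WEIGHTS[b] * fp[b] for b in BLOCKS)


def error(polymer: List[str], target: float) -> float:
    return abs(predict_property(polymer) - target)


def evolve(target: float, length: int = 8, pop_size: int = 20,
           generations: int = 30, seed: int = 0) -> Tuple[List[str], List[float]]:
    rng = random.Random(seed)
    pop = [[rng.choice(BLOCKS) for _ in range(length)] for _ in range(pop_size)]
    history = []  # best error at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda p: error(p, target))
        history.append(error(pop[0], target))
        parents = pop[: pop_size // 2]
        children = [list(pop[0])]  # elitism: carry the best chain over unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)      # one-point crossover
            child = mum[:cut] + dad[cut:]
            if rng.random() < 0.2:              # occasional point mutation
                child[rng.randrange(length)] = rng.choice(BLOCKS)
            children.append(child)
        pop = children
    best = min(pop, key=lambda p: error(p, target))
    return best, history


best, history = evolve(target=3.0)
```

    Because the surrogate is cheap to evaluate, the search can score thousands of candidate chains without running a first-principles computation for each one.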

  3. Machine learning strategy for accelerated design of polymer dielectrics

    DOE PAGES Beta

    Mannodi-Kanakkithodi, Arun; Pilania, Ghanshyam; Huan, Tran Doan; Lookman, Turab; Ramprasad, Rampi

    2016-02-15

    The ability to efficiently design new and advanced dielectric polymers is hampered by the lack of sufficient, reliable data on wide polymer chemical spaces, and the difficulty of generating such data given time and computational/experimental constraints. Here, we address the issue of accelerating polymer dielectrics design by extracting learning models from data generated by accurate state-of-the-art first principles computations for polymers occupying an important part of the chemical subspace. The polymers are ‘fingerprinted’ as simple, easily attainable numerical representations, which are mapped to the properties of interest using a machine learning algorithm to develop an on-demand property prediction model. Further, a genetic algorithm is utilised to optimise polymer constituent blocks in an evolutionary manner, thus directly leading to the design of polymers with given target properties. Furthermore, while this philosophy of learning to make instant predictions and design is demonstrated here for the example of polymer dielectrics, it is equally applicable to other classes of materials as well.

  4. Visual tracking based on extreme learning machine and sparse representation.

    PubMed

    Wang, Baoxian; Tang, Linbo; Yang, Jinglin; Zhao, Baojun; Wang, Shuigen

    2015-01-01

    Existing sparse representation-based visual trackers are mostly time consuming and have poor robustness. To address these issues, a novel tracking method is presented that combines sparse representation and an emerging learning technique, namely extreme learning machine (ELM). Specifically, visual tracking can be divided into two consecutive processes. Firstly, ELM is utilized to find the optimal separating hyperplane between the target observations and background ones. Thus, the trained ELM classification function is able to remove most of the candidate samples related to background contents efficiently, thereby reducing the total computational cost of the following sparse representation. Secondly, to further combine ELM and sparse representation, the resultant confidence values (i.e., probabilities to be a target) of samples on the ELM classification function are used to construct a new manifold learning constraint term of the sparse representation framework, which tends to achieve more robust results. Moreover, the accelerated proximal gradient method is used for deriving the optimal solution (in matrix form) of the constrained sparse tracking model. Additionally, the matrix form solution allows the candidate samples to be calculated in parallel, thereby leading to a higher efficiency. Experiments demonstrate the effectiveness of the proposed tracker. PMID:26506359

  5. Machine Learning Strategy for Accelerated Design of Polymer Dielectrics

    NASA Astrophysics Data System (ADS)

    Mannodi-Kanakkithodi, Arun; Pilania, Ghanshyam; Huan, Tran Doan; Lookman, Turab; Ramprasad, Rampi

    2016-02-01

    The ability to efficiently design new and advanced dielectric polymers is hampered by the lack of sufficient, reliable data on wide polymer chemical spaces, and the difficulty of generating such data given time and computational/experimental constraints. Here, we address the issue of accelerating polymer dielectrics design by extracting learning models from data generated by accurate state-of-the-art first principles computations for polymers occupying an important part of the chemical subspace. The polymers are ‘fingerprinted’ as simple, easily attainable numerical representations, which are mapped to the properties of interest using a machine learning algorithm to develop an on-demand property prediction model. Further, a genetic algorithm is utilised to optimise polymer constituent blocks in an evolutionary manner, thus directly leading to the design of polymers with given target properties. While this philosophy of learning to make instant predictions and design is demonstrated here for the example of polymer dielectrics, it is equally applicable to other classes of materials as well.

  6. Visual Tracking Based on Extreme Learning Machine and Sparse Representation

    PubMed Central

    Wang, Baoxian; Tang, Linbo; Yang, Jinglin; Zhao, Baojun; Wang, Shuigen

    2015-01-01

    Existing sparse representation-based visual trackers are mostly time consuming and have poor robustness. To address these issues, a novel tracking method is presented that combines sparse representation and an emerging learning technique, namely extreme learning machine (ELM). Specifically, visual tracking can be divided into two consecutive processes. Firstly, ELM is utilized to find the optimal separating hyperplane between the target observations and background ones. Thus, the trained ELM classification function is able to remove most of the candidate samples related to background contents efficiently, thereby reducing the total computational cost of the following sparse representation. Secondly, to further combine ELM and sparse representation, the resultant confidence values (i.e., probabilities to be a target) of samples on the ELM classification function are used to construct a new manifold learning constraint term of the sparse representation framework, which tends to achieve more robust results. Moreover, the accelerated proximal gradient method is used for deriving the optimal solution (in matrix form) of the constrained sparse tracking model. Additionally, the matrix form solution allows the candidate samples to be calculated in parallel, thereby leading to a higher efficiency. Experiments demonstrate the effectiveness of the proposed tracker. PMID:26506359

  7. Rule extraction from support vector machines using ensemble learning approach: an application for diagnosis of diabetes.

    PubMed

    Han, Longfei; Luo, Senlin; Yu, Jianmin; Pan, Limin; Chen, Songjing

    2015-03-01

    Diabetes mellitus is a chronic disease and a worldwide public health challenge. It has been shown that 50-80% of T2DM cases are undiagnosed. In this paper, support vector machines are utilized to screen for diabetes, and an ensemble learning module is added, which turns the "black box" of SVM decisions into comprehensible and transparent rules and is also useful for solving the imbalance problem. Results on China Health and Nutrition Survey data show that the proposed ensemble learning method generates rule sets with weighted average precision 94.2% and weighted average recall 93.9% for all classes. Furthermore, the hybrid system can provide a tool for diagnosis of diabetes, and it supports a second opinion for lay users.

  8. Application of machine learning methodology for PET-based definition of lung cancer.

    PubMed

    Kerhet, A; Small, C; Quon, H; Riauka, T; Schrader, L; Greiner, R; Yee, D; McEwan, A; Roa, W

    2010-02-01

    We applied a learning methodology framework to assist in the threshold-based segmentation of non-small-cell lung cancer (NSCLC) tumours in positron-emission tomography-computed tomography (PET-CT) imaging for use in radiotherapy planning. Gated and standard free-breathing studies of two patients were independently analysed (four studies in total). Each study had a PET-CT and a treatment-planning CT image. The reference gross tumour volume (GTV) was identified by two experienced radiation oncologists who also determined reference standardized uptake value (SUV) thresholds that most closely approximated the GTV contour on each slice. A set of uptake distribution-related attributes was calculated for each PET slice. A machine learning algorithm was trained on a subset of the PET slices to cope with slice-to-slice variation in the optimal SUV threshold: that is, to predict the most appropriate SUV threshold from the calculated attributes for each slice. The algorithm's performance was evaluated using the remainder of the PET slices. A high degree of geometric similarity was achieved between the areas outlined by the predicted and the reference SUV thresholds (Jaccard index exceeding 0.82). No significant difference was found between the gated and the free-breathing results in the same patient. In this preliminary work, we demonstrated the potential applicability of a machine learning methodology as an auxiliary tool for radiation treatment planning in NSCLC.
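
    The Jaccard index quoted in the abstract above measures the overlap of two outlined areas as intersection size divided by union size. The following is a minimal sketch of that calculation, not the authors' implementation: the function names, the 4x4 grid of uptake values, and the two thresholds (2.5 and 3.0) are all hypothetical and chosen only for illustration.

```python
def threshold_mask(slice_values, threshold):
    """Binary mask of the cells whose uptake value is at or above the threshold."""
    return [[value >= threshold for value in row] for row in slice_values]


def jaccard_index(mask_a, mask_b):
    """Jaccard index |A intersect B| / |A union B| of two equally sized binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        if len(row_a) != len(row_b):
            raise ValueError("masks must have the same number of columns")
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as identical by convention.
    return intersection / union if union else 1.0


# Hypothetical uptake values for one small slice (illustrative numbers only).
pet_slice = [
    [0.5, 1.0, 1.2, 0.4],
    [0.9, 3.1, 3.4, 1.1],
    [1.0, 3.6, 4.0, 2.6],
    [0.3, 1.2, 2.4, 0.8],
]

reference = threshold_mask(pet_slice, 2.5)  # area from a reference threshold
predicted = threshold_mask(pet_slice, 3.0)  # area from a predicted threshold
score = jaccard_index(reference, predicted)
print(f"Jaccard index: {score:.2f}")
```

    In this toy case the reference area covers five cells and the predicted area covers four of them, so the index is 4/5. A value of 1.0 would mean the two thresholds outline exactly the same area.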

  9. An Integrated Approach of Fuzzy Linguistic Preference Based AHP and Fuzzy COPRAS for Machine Tool Evaluation.

    PubMed

    Nguyen, Huu-Tho; Md Dawal, Siti Zawiah; Nukman, Yusoff; Aoyama, Hideki; Case, Keith

    2015-01-01

    Globalization of business and competitiveness in manufacturing has forced companies to improve their manufacturing facilities to respond to market requirements. Machine tool evaluation involves an essential decision using imprecise and vague information, and plays a major role in improving productivity and flexibility in manufacturing. The aim of this study is to present an integrated approach for decision-making in machine tool selection. This paper is focused on the integration of a consistent fuzzy AHP (Analytic Hierarchy Process) and a fuzzy COmplex PRoportional ASsessment (COPRAS) for multi-attribute decision-making in selecting the most suitable machine tool. In this method, the fuzzy linguistic preference relation is integrated into AHP to handle the imprecise and vague information, and to simplify the data collection for the pair-wise comparison matrix of the AHP which determines the weights of attributes. The output of the fuzzy AHP is imported into the fuzzy COPRAS method for ranking alternatives through the closeness coefficient. The application of the proposed model is illustrated by a numerical example based on the collection of data by questionnaire and from the literature. The results highlight the integration of the improved fuzzy AHP and the fuzzy COPRAS as a precise tool that provides effective multi-attribute decision-making for evaluating the machine tool in the uncertain environment. PMID:26368541

  10. An Integrated Approach of Fuzzy Linguistic Preference Based AHP and Fuzzy COPRAS for Machine Tool Evaluation

    PubMed Central

    Nguyen, Huu-Tho; Md Dawal, Siti Zawiah; Nukman, Yusoff; Aoyama, Hideki; Case, Keith

    2015-01-01

    Globalization of business and competitiveness in manufacturing has forced companies to improve their manufacturing facilities to respond to market requirements. Machine tool evaluation involves an essential decision using imprecise and vague information, and plays a major role in improving productivity and flexibility in manufacturing. The aim of this study is to present an integrated approach for decision-making in machine tool selection. This paper is focused on the integration of a consistent fuzzy AHP (Analytic Hierarchy Process) and a fuzzy COmplex PRoportional ASsessment (COPRAS) for multi-attribute decision-making in selecting the most suitable machine tool. In this method, the fuzzy linguistic preference relation is integrated into AHP to handle the imprecise and vague information, and to simplify the data collection for the pair-wise comparison matrix of the AHP which determines the weights of attributes. The output of the fuzzy AHP is imported into the fuzzy COPRAS method for ranking alternatives through the closeness coefficient. The application of the proposed model is illustrated by a numerical example based on the collection of data by questionnaire and from the literature. The results highlight the integration of the improved fuzzy AHP and the fuzzy COPRAS as a precise tool that provides effective multi-attribute decision-making for evaluating the machine tool in the uncertain environment. PMID:26368541

  11. Enhancement of Plant Metabolite Fingerprinting by Machine Learning

    PubMed Central

    Scott, Ian M.; Vermeer, Cornelia P.; Liakata, Maria; Corol, Delia I.; Ward, Jane L.; Lin, Wanchang; Johnson, Helen E.; Whitehead, Lynne; Kular, Baldeep; Baker, John M.; Walsh, Sean; Dave, Anuja; Larson, Tony R.; Graham, Ian A.; Wang, Trevor L.; King, Ross D.; Draper, John; Beale, Michael H.

    2010-01-01

    Metabolite fingerprinting of Arabidopsis (Arabidopsis thaliana) mutants with known or predicted metabolic lesions was performed by 1H-nuclear magnetic resonance, Fourier transform infrared, and flow injection electrospray-mass spectrometry. Fingerprinting enabled processing of five times more plants than conventional chromatographic profiling and was competitive for discriminating mutants, other than those affected in only low-abundance metabolites. Despite their rapidity and complexity, fingerprints yielded metabolomic insights (e.g. that effects of single lesions were usually not confined to individual pathways). Among fingerprint techniques, 1H-nuclear magnetic resonance discriminated the most mutant phenotypes from the wild type and Fourier transform infrared discriminated the fewest. To maximize information from fingerprints, data analysis was crucial. One-third of distinctive phenotypes might have been overlooked had data models been confined to principal component analysis score plots. Among several methods tested, machine learning (ML) algorithms, namely support vector machine or random forest (RF) classifiers, were unsurpassed for phenotype discrimination. Support vector machines were often the best performing classifiers, but RFs yielded some particularly informative measures. First, RFs estimated margins between mutant phenotypes, whose relations could then be visualized by Sammon mapping or hierarchical clustering. Second, RFs provided importance scores for the features within fingerprints that discriminated mutants. These scores correlated with analysis of variance F values (as did Kruskal-Wallis tests, true- and false-positive measures, mutual information, and the Relief feature selection algorithm). ML classifiers, as models trained on one data set to predict another, were ideal for focused metabolomic queries, such as the distinctiveness and consistency of mutant phenotypes. Accessible software for use of ML in plant physiology is highlighted. 

  12. A Sustainable Model for Integrating Current Topics in Machine Learning Research into the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Georgiopoulos, M.; DeMara, R. F.; Gonzalez, A. J.; Wu, A. S.; Mollaghasemi, M.; Gelenbe, E.; Kysilka, M.; Secretan, J.; Sharma, C. A.; Alnsour, A. J.

    2009-01-01

    This paper presents an integrated research and teaching model that has resulted from an NSF-funded effort to introduce results of current Machine Learning research into the engineering and computer science curriculum at the University of Central Florida (UCF). While in-depth exposure to current topics in Machine Learning has traditionally occurred…

  13. Detecting falls with wearable sensors using machine learning techniques.

    PubMed

    Özdemir, Ahmet Turan; Barshan, Billur

    2014-01-01

    Falls are a serious public health problem and possibly life threatening for people in fall risk groups. We develop an automated fall detection system with wearable motion sensor units fitted to the subjects' body at six different positions. Each unit comprises three tri-axial devices (accelerometer, gyroscope, and magnetometer/compass). Fourteen volunteers perform a standardized set of movements including 20 voluntary falls and 16 activities of daily living (ADLs), resulting in a large dataset with 2520 trials. To reduce the computational complexity of training and testing the classifiers, we focus on the raw data for each sensor in a 4 s time window around the point of peak total acceleration of the waist sensor, and then perform feature extraction and reduction. Most earlier studies on fall detection employ rule-based approaches that rely on simple thresholding of the sensor outputs. We successfully distinguish falls from ADLs using six machine learning techniques (classifiers): the k-nearest neighbor (k-NN) classifier, least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), dynamic time warping (DTW), and artificial neural networks (ANNs). We compare the performance and the computational complexity of the classifiers and achieve the best results with the k-NN classifier and LSM, with sensitivity, specificity, and accuracy all above 99%. These classifiers also have acceptable computational requirements for training and testing. Our approach would be applicable in real-world scenarios where data records of indeterminate length, containing multiple activities in sequence, are recorded.
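
    The windowing step described in the abstract above (keeping a 4 s segment around the point of peak total acceleration before extracting features) can be sketched as follows. This is an illustrative sketch rather than the authors' code: the 10 Hz sampling rate, the synthetic signal, the function names, and the three summary features are assumptions made here for the example.

```python
import math


def total_acceleration(ax, ay, az):
    """Magnitude of the acceleration vector at each sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def window_around_peak(signal, sample_rate_hz, window_s=4.0):
    """Return (start, end, peak) indices of a window centred on the signal peak.

    The window is clipped at the ends of the recording; `end` is exclusive.
    """
    half = int(round(window_s * sample_rate_hz / 2))
    peak = max(range(len(signal)), key=signal.__getitem__)
    start = max(0, peak - half)
    end = min(len(signal), peak + half + 1)
    return start, end, peak


def simple_features(segment):
    """A few basic summary features of a windowed segment (illustrative choice)."""
    return {
        "min": min(segment),
        "max": max(segment),
        "mean": sum(segment) / len(segment),
    }


# Synthetic 10 s recording at an assumed 10 Hz: 1 g at rest with one spike.
sample_rate_hz = 10
ax = [0.0] * 100
ay = [0.0] * 100
az = [1.0] * 100
az[60] = 3.0  # hypothetical impact sample

magnitude = total_acceleration(ax, ay, az)
start, end, peak = window_around_peak(magnitude, sample_rate_hz)
features = simple_features(magnitude[start:end])
print(start, end, peak, features)
```

    Restricting the classifier input to a fixed window around the peak keeps every trial the same length, which is one way to reduce the cost of training and testing as the abstract describes.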

  14. New machine-learning algorithms for prediction of Parkinson's disease

    NASA Astrophysics Data System (ADS)

    Mandal, Indrajit; Sairam, N.

    2014-03-01

    This article presents an enhanced prediction accuracy of diagnosis of Parkinson's disease (PD) to prevent the delay and misdiagnosis of patients using the proposed robust inference system. New machine-learning methods are proposed and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods of treating Parkinson's disease (PD) include sparse multinomial logistic regression, rotation forest ensemble with support vector machines and principal components analysis, artificial neural networks, and boosting methods. A new ensemble method comprising the Bayesian network optimised by Tabu search algorithm as classifier and Haar wavelets as projection filter is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100%, with sensitivity and specificity of 0.983 and 0.996, respectively. All experiments are conducted at 95% and 99% confidence levels, and the results are established with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system and experimentally shows best results with supportive statistical inference.
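
    The comparison measures named in the abstract above (sensitivity, specificity, and accuracy) follow directly from the counts of a binary confusion matrix. The sketch below shows the standard definitions only; the function names and the ten example labels are hypothetical and are not data from the article.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive and guess == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif guess == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, and accuracy from predicted and true labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        # Fraction of truly positive cases that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of truly negative cases that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Fraction of all cases that were classified correctly.
        "accuracy": (tp + tn) / len(y_true) if y_true else float("nan"),
    }


# Hypothetical labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

    Reporting all three together matters because accuracy alone can look high on an imbalanced data set even when one class is detected poorly.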

  15. An application of machine learning to the organization of institutional software repositories

    NASA Technical Reports Server (NTRS)

    Bailin, Sidney; Henderson, Scott; Truszkowski, Walt

    1993-01-01

    Software reuse has become a major goal in the development of space systems, as a recent NASA-wide workshop on the subject made clear. The Data Systems Technology Division of Goddard Space Flight Center has been working on tools and techniques for promoting reuse, in particular in the development of satellite ground support software. One of these tools is the Experiment in Libraries via Incremental Schemata and Cobweb (ElvisC). ElvisC applies machine learning to the problem of organizing a reusable software component library for efficient and reliable retrieval. In this paper we describe the background factors that have motivated this work, present the design of the system, and evaluate the results of its application.

  16. A global prediction of seafloor sediment porosity using machine learning

    NASA Astrophysics Data System (ADS)

    Martin, Kylara M.; Wood, Warren T.; Becker, Joseph J.

    2015-12-01

    Porosity (void ratio) is a critical parameter in models of acoustic propagation, bearing strength, and many other seafloor phenomena. However, like many seafloor phenomena, direct measurements are expensive and sparse. We show here how porosity everywhere at the seafloor can be estimated using a machine learning technique (specifically, Random Forests). Such techniques use sparsely acquired direct samples and dense grids of other parameters to produce a statistically optimal estimate where direct measurements are lacking. Our porosity estimate is both qualitatively more consistent with geologic principles than the results produced by interpolation and quantitatively more accurate than results produced by interpolation or regression methods. We present here a seafloor porosity estimate on a 5 arc min, pixel registered grid, produced using widely available, densely sampled grids of other seafloor properties. These techniques represent the only practical means of estimating seafloor properties in inaccessible regions of the seafloor (e.g., the Arctic).

  17. Machine learning, medical diagnosis, and biomedical engineering research - commentary.

    PubMed

    Foster, Kenneth R; Koprowski, Robert; Skufca, Joseph D

    2014-07-05

    A large number of papers are appearing in the biomedical engineering literature that describe the use of machine learning techniques to develop classifiers for detection or diagnosis of disease. However, the usefulness of this approach in developing clinically validated diagnostic techniques so far has been limited and the methods are prone to overfitting and other problems which may not be immediately apparent to the investigators. This commentary is intended to help sensitize investigators as well as readers and reviewers of papers to some potential pitfalls in the development of classifiers, and suggests steps that researchers can take to help avoid these problems. Building classifiers should be viewed not simply as an add-on statistical analysis, but as part and parcel of the experimental process. Validation of classifiers for diagnostic applications should be considered as part of a much larger process of establishing the clinical validity of the diagnostic technique.

  18. Liquid intake monitoring through breathing signal using machine learning

    NASA Astrophysics Data System (ADS)

    Dong, Bo; Biswas, Subir

    2013-05-01

    This paper presents the design, system structure and performance for a wireless and wearable diet monitoring system. Food and drink intake can be detected through a person's swallow events. The system works based on the key observation that a person's otherwise continuous breathing process is interrupted by a short apnea when she or he swallows as part of a solid or liquid intake process. We detect the swallows through the difference between a normal breathing cycle and a breathing cycle with swallows using a wearable chest-belt. Three popular machine learning algorithms have been applied on both time and frequency domain features. The discrimination power of the features is then analyzed for applications where only a small number of features is allowed. It is shown that high detection performance can be achieved with only a few features.

  19. Machine learning, medical diagnosis, and biomedical engineering research - commentary

    PubMed Central

    2014-01-01

    A large number of papers are appearing in the biomedical engineering literature that describe the use of machine learning techniques to develop classifiers for detection or diagnosis of disease. However, the usefulness of this approach in developing clinically validated diagnostic techniques so far has been limited and the methods are prone to overfitting and other problems which may not be immediately apparent to the investigators. This commentary is intended to help sensitize investigators as well as readers and reviewers of papers to some potential pitfalls in the development of classifiers, and suggests steps that researchers can take to help avoid these problems. Building classifiers should be viewed not simply as an add-on statistical analysis, but as part and parcel of the experimental process. Validation of classifiers for diagnostic applications should be considered as part of a much larger process of establishing the clinical validity of the diagnostic technique. PMID:24998888

  20. Nonlinear machine learning and design of reconfigurable digital colloids.

    PubMed

    Long, Andrew W; Phillips, Carolyn L; Jankowski, Eric; Ferguson, Andrew L

    2016-09-14

    Digital colloids, a cluster of freely rotating "halo" particles tethered to the surface of a central particle, were recently proposed as ultra-high density memory elements for information storage. Rational design of these digital colloids for memory storage applications requires a quantitative understanding of the thermodynamic and kinetic stability of the configurational states within which information is stored. We apply nonlinear machine learning to Brownian dynamics simulations of these digital colloids to extract the low-dimensional intrinsic manifold governing digital colloid morphology, thermodynamics, and kinetics. By modulating the relative size ratio between halo particles and central particles, we investigate the size-dependent configurational stability and transition kinetics for the 2-state tetrahedral (N = 4) and 30-state octahedral (N = 6) digital colloids. We demonstrate the use of this framework to guide the rational design of a memory storage element to hold a block of text that trades off the competing design criteria of memory addressability and volatility. PMID:27498992

  1. Robust Machine Learning Applied to Terascale Astronomical Datasets

    NASA Astrophysics Data System (ADS)

    Ball, N. M.; Brunner, R. J.; Myers, A. D.

    2008-08-01

    We present recent results from the Laboratory for Cosmological Data Mining {http://lcdm.astro.uiuc.edu} at the National Center for Supercomputing Applications (NCSA) to provide robust classifications and photometric redshifts for objects in the terascale-class Sloan Digital Sky Survey (SDSS). Through a combination of machine learning in the form of decision trees, k-nearest neighbor, and genetic algorithms, the use of supercomputing resources at NCSA, and the cyberenvironment Data-to-Knowledge, we are able to provide improved classifications for over 100 million objects in the SDSS, improved photometric redshifts, and a full exploitation of the powerful k-nearest neighbor algorithm. This work is the first to apply the full power of these algorithms to contemporary terascale astronomical datasets, and the improvement over existing results is demonstrable. We discuss issues that we have encountered in dealing with data on the terascale, and possible solutions that can be implemented to deal with upcoming petascale datasets.

  2. Combining satellite imagery and machine learning to predict poverty.

    PubMed

    Jean, Neal; Burke, Marshall; Xie, Michael; Davis, W Matthew; Lobell, David B; Ermon, Stefano

    2016-08-19

    Reliable data on economic livelihoods remain scarce in the developing world, hampering efforts to study these outcomes and to design policies that improve them. Here we demonstrate an accurate, inexpensive, and scalable method for estimating consumption expenditure and asset wealth from high-resolution satellite imagery. Using survey and satellite data from five African countries--Nigeria, Tanzania, Uganda, Malawi, and Rwanda--we show how a convolutional neural network can be trained to identify image features that can explain up to 75% of the variation in local-level economic outcomes. Our method, which requires only publicly available data, could transform efforts to track and target poverty in developing countries. It also demonstrates how powerful machine learning techniques can be applied in a setting with limited training data, suggesting broad potential application across many scientific domains.

  3. Gene discovery for facioscapulohumeral muscular dystrophy by machine learning techniques.

    PubMed

    González-Navarro, Félix F; Belanche-Muñoz, Lluís A; Gámez-Moreno, María G; Flores-Ríos, Brenda L; Ibarra-Esquer, Jorge E; López-Morteo, Gabriel A

    2016-04-28

    Facioscapulohumeral muscular dystrophy (FSHD) is a neuromuscular disorder that shows a preference for the facial, shoulder and upper arm muscles. FSHD affects about one in 20-400,000 people, and no effective therapeutic strategies are known to halt disease progression or reverse muscle weakness or atrophy. Many genes may be incorrectly regulated in affected muscle tissue, but the mechanisms responsible for the progressive muscle weakness remain largely unknown. Although machine learning (ML) has made significant inroads in biomedical disciplines such as cancer research, no reports have yet addressed FSHD analysis using ML techniques. This study explores a specific FSHD data set from a ML perspective. We report results showing a very promising small group of genes that clearly separates FSHD samples from healthy samples. In addition to numerical prediction figures, we show data visualizations and biological evidence illustrating the potential usefulness of these results. PMID:26960968

  4. Machine-Learning Techniques Applied to Antibacterial Drug Discovery

    PubMed Central

    Durrant, Jacob D.; Amaro, Rommie E.

    2014-01-01

    The emergence of drug-resistant bacteria threatens to catapult humanity back to the pre-antibiotic era. Even now, multi-drug-resistant bacterial infections annually result in millions of hospital days, billions in healthcare costs, and, most importantly, tens of thousands of lives lost. As many pharmaceutical companies have abandoned antibiotic development in search of more lucrative therapeutics, academic researchers are uniquely positioned to fill the resulting vacuum. Traditional high-throughput screens and lead-optimization efforts are expensive and labor intensive. Computer-aided drug discovery techniques, which are cheaper and faster, can accelerate the identification of novel antibiotics in an academic setting, leading to improved hit rates and faster transitions to pre-clinical and clinical testing. The current review describes two machine-learning techniques, neural networks and decision trees, that have been used to identify experimentally validated antibiotics. We conclude by describing the future directions of this exciting field. PMID:25521642

  5. Combining satellite imagery and machine learning to predict poverty.

    PubMed

    Jean, Neal; Burke, Marshall; Xie, Michael; Davis, W Matthew; Lobell, David B; Ermon, Stefano

    2016-08-19

    Reliable data on economic livelihoods remain scarce in the developing world, hampering efforts to study these outcomes and to design policies that improve them. Here we demonstrate an accurate, inexpensive, and scalable method for estimating consumption expenditure and asset wealth from high-resolution satellite imagery. Using survey and satellite data from five African countries--Nigeria, Tanzania, Uganda, Malawi, and Rwanda--we show how a convolutional neural network can be trained to identify image features that can explain up to 75% of the variation in local-level economic outcomes. Our method, which requires only publicly available data, could transform efforts to track and target poverty in developing countries. It also demonstrates how powerful machine learning techniques can be applied in a setting with limited training data, suggesting broad potential application across many scientific domains. PMID:27540167

  6. Applying Machine Learning to GlueX Data Analysis

    NASA Astrophysics Data System (ADS)

    Boettcher, Thomas

    2014-03-01

    GlueX is a high energy physics experiment with the goal of collecting data necessary for understanding confinement in quantum chromodynamics. Beginning in 2015, GlueX will collect huge amounts of data describing billions of particle collisions. In preparation for data collection, efforts are underway to develop a methodology for analyzing these large data sets. One of the primary challenges in GlueX data analysis is isolating events of interest from a proportionally large background. GlueX has recently begun approaching this selection problem using machine learning algorithms, specifically boosted decision trees. Preliminary studies indicate that these algorithms have the potential to offer vast improvements in both signal selection efficiency and purity over more traditional techniques.

  7. Calibration transfer via an extreme learning machine auto-encoder.

    PubMed

    Chen, Wo-Ruo; Bin, Jun; Lu, Hong-Mei; Zhang, Zhi-Min; Liang, Yi-Zeng

    2016-03-21

    In order to solve the spectra standardization problem in near-infrared (NIR) spectroscopy, a Transfer via Extreme learning machine Auto-encoder Method (TEAM) is proposed in this study. A comparative study among TEAM, piecewise direct standardization (PDS), generalized least squares (GLS) and calibration transfer methods based on canonical correlation analysis (CCA) was conducted, and the performances of these algorithms were benchmarked with three spectral datasets: corn, tobacco and pharmaceutical tablet spectra. The results show that TEAM is a stable method and can significantly reduce prediction errors compared with PDS, GLS and CCA. TEAM can also achieve the best RMSEPs in most cases with a small number of calibration sets. TEAM is implemented in Python and is available as an open source package at https://github.com/zmzhang/TEAM. PMID:26846329

  8. Machine-learning techniques applied to antibacterial drug discovery.

    PubMed

    Durrant, Jacob D; Amaro, Rommie E

    2015-01-01

    The emergence of drug-resistant bacteria threatens to revert humanity back to the preantibiotic era. Even now, multidrug-resistant bacterial infections annually result in millions of hospital days, billions in healthcare costs, and, most importantly, tens of thousands of lives lost. As many pharmaceutical companies have abandoned antibiotic development in search of more lucrative therapeutics, academic researchers are uniquely positioned to fill the pipeline. Traditional high-throughput screens and lead-optimization efforts are expensive and labor intensive. Computer-aided drug-discovery techniques, which are cheaper and faster, can accelerate the identification of novel antibiotics, leading to improved hit rates and faster transitions to preclinical and clinical testing. The current review describes two machine-learning techniques, neural networks and decision trees, that have been used to identify experimentally validated antibiotics. We conclude by describing the future directions of this exciting field.

  9. Identifying Energy-Efficient Concurrency Levels using Machine Learning

    SciTech Connect

    Curtis-Maury, M; Singh, K; Blagojevic, F; Nikolopoulos, D S; de Supinski, B R; Schulz, M; McKee, S A

    2007-07-23

    Multicore microprocessors have been largely motivated by the diminishing returns in performance and the increased power consumption of single-threaded ILP microprocessors. With the industry already shifting from multicore to many-core microprocessors, software developers must extract more thread-level parallelism from applications. Unfortunately, low power-efficiency and diminishing returns in performance remain major obstacles with many cores. Poor interaction between software and hardware, and bottlenecks in shared hardware structures often prevent scaling to many cores, even in applications where a high degree of parallelism is potentially available. In some cases, throwing additional cores at a problem may actually harm performance and increase power consumption. Better use of otherwise limitedly beneficial cores by software components such as hypervisors and operating systems can improve system-wide performance and reliability, even in cases where power consumption is not a main concern. In response to these observations, we evaluate an approach to throttle concurrency in parallel programs dynamically. We throttle concurrency to levels with higher predicted efficiency from both performance and energy standpoints, and we do so via machine learning, specifically artificial neural networks (ANNs). One advantage of using ANNs over similar techniques previously explored is that the training phase is greatly simplified, thereby reducing the burden on the end user. Using machine learning in the context of concurrency throttling is novel. We show that ANNs are effective for identifying energy-efficient concurrency levels in multithreaded scientific applications, and we do so using physical experimentation on a state-of-the-art quad-core Xeon platform.

  10. Predicting phenotypes of asthma and eczema with machine learning

    PubMed Central

    2014-01-01

    Background There is increasing recognition that asthma and eczema are heterogeneous diseases. We investigated the predictive ability of a spectrum of machine learning methods to disambiguate clinical sub-groups of asthma, wheeze and eczema, using a large heterogeneous set of attributes in an unselected population. The aim was to identify to what extent such heterogeneous information can be combined to reveal specific clinical manifestations. Methods The study population comprised a cross-sectional sample of adults, and included representatives of the general population enriched by subjects with asthma. Linear and non-linear machine learning methods, from logistic regression to random forests, were fit on a large attribute set including demographic, clinical and laboratory features, genetic profiles and environmental exposures. Outcomes of interest were asthma, wheeze and eczema encoded by different operational definitions. Model validation was performed via bootstrapping. Results The study population included 554 adults, 42% male, 38% previous or current smokers. The proportions of asthma, wheeze, and eczema diagnoses were 16.7%, 12.3%, and 21.7%, respectively. Models were fit on 223 non-genetic variables plus 215 single nucleotide polymorphisms. In general, non-linear models achieved higher sensitivity and specificity than other methods, especially for asthma and wheeze, less for eczema, with areas under receiver operating characteristic curve of 84%, 76% and 64%, respectively. Our findings confirm that allergen sensitisation and lung function characterise asthma better in combination than separately. The predictive ability of genetic markers alone is limited. For eczema, new predictors such as bio-impedance were discovered. Conclusions More usefully-complex modelling is the key to a better understanding of disease mechanisms and personalised healthcare: further advances are likely with the incorporation of more factors/attributes and longitudinal measures. PMID:25077568

  11. Stellar classification from single-band imaging using machine learning

    NASA Astrophysics Data System (ADS)

    Kuntzer, T.; Tewes, M.; Courbin, F.

    2016-06-01

    Information on the spectral types of stars is of great interest in view of the exploitation of space-based imaging surveys. In this article, we investigate the classification of stars into spectral types using only the shape of their diffraction pattern in a single broad-band image. We propose a supervised machine learning approach to this endeavour, based on principal component analysis (PCA) for dimensionality reduction, followed by artificial neural networks (ANNs) estimating the spectral type. Our analysis is performed with image simulations mimicking the Hubble Space Telescope (HST) Advanced Camera for Surveys (ACS) in the F606W and F814W bands, as well as the Euclid VIS imager. We first demonstrate this classification in a simple context, assuming perfect knowledge of the point spread function (PSF) model and the possibility of accurately generating mock training data for the machine learning. We then analyse its performance in a fully data-driven situation, in which the training would be performed with a limited subset of bright stars from a survey, and an unknown PSF with spatial variations across the detector. We use simulations of main-sequence stars with flat distributions in spectral type and in signal-to-noise ratio, and classify these stars into 13 spectral subclasses, from O5 to M5. Under these conditions, the algorithm achieves a high success rate both for Euclid and HST images, with typical errors of half a spectral class. Although more detailed simulations would be needed to assess the performance of the algorithm on a specific survey, this shows that stellar classification from single-band images is well possible.

  12. Is extreme learning machine feasible? A theoretical assessment (part II).

    PubMed

    Lin, Shaobo; Liu, Xia; Fang, Jian; Xu, Zongben

    2015-01-01

    An extreme learning machine (ELM) can be regarded as a two-stage feed-forward neural network (FNN) learning system that randomly assigns the connections with and within hidden neurons in the first stage and tunes the connections with output neurons in the second stage. Therefore, ELM training is essentially a linear learning problem, which significantly reduces the computational burden. Numerous applications show that such a reduction in computational burden does not degrade the generalization capability. It has, however, remained open whether this is true in theory. The aim of this paper is to study the theoretical feasibility of ELM by analyzing the pros and cons of ELM. In the previous part of this topic, we pointed out that via appropriately selected activation functions, ELM does not degrade the generalization capability in the sense of expectation. In this paper, we launch the study in a different direction and show that the randomness of ELM also leads to certain negative consequences. On one hand, we find that the randomness causes an additional uncertainty problem of ELM, both in approximation and learning. On the other hand, we theoretically justify that there also exist activation functions such that the corresponding ELM degrades the generalization capability. In particular, we prove that the generalization capability of ELM with Gaussian kernel is essentially worse than that of FNN with Gaussian kernel. To facilitate the use of ELM, we also provide a remedy to such a degradation. We find that the well-developed coefficient regularization technique can essentially improve the generalization capability. The obtained results reveal the essential characteristic of ELM in a certain sense and give theoretical guidance concerning how to use ELM.
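    The two-stage structure described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the tanh activation, the number of hidden units, the ridge (L2) penalty standing in for coefficient regularization, and the function names elm_fit and elm_predict are all assumptions chosen for illustration.

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, ridge=1e-3, seed=0):
    """Fit a toy ELM; every hyperparameter here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    # Stage 1: hidden-layer weights and biases are drawn at random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer outputs, shape (n_samples, n_hidden)
    # Stage 2: only the output weights are learned, which is a linear problem.
    # The ridge term is one simple form of coefficient regularization.
    A = H.T @ H + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ y)
    return W, b, beta

def elm_predict(X, params):
    W, b, beta = params
    return np.tanh(X @ W + b) @ beta

# Toy regression problem: learn sin(x) on [-3, 3].
data_rng = np.random.default_rng(1)
X_train = data_rng.uniform(-3.0, 3.0, size=(200, 1))
y_train = np.sin(X_train[:, 0])
params = elm_fit(X_train, y_train)

X_test = np.linspace(-3.0, 3.0, 50).reshape(-1, 1)
errors = elm_predict(X_test, params) - np.sin(X_test[:, 0])
rmse = float(np.sqrt(np.mean(errors ** 2)))
print("test RMSE:", rmse)
```

    Changing the seed changes the random hidden layer and therefore the fitted model, which gives a feel for the uncertainty the abstract attributes to ELM's randomness.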

  13. Investigation into the accuracy of a proposed laser diode based multilateration machine tool calibration system

    NASA Astrophysics Data System (ADS)

    Fletcher, S.; Longstaff, A. P.; Myers, A.

    2005-01-01

    Geometric and thermal calibration of CNC machine tools is required in modern machine shops with volumetric accuracy assessment becoming the standard machine tool qualification in many industries. Laser interferometry is a popular method of measuring the errors but this, and other alternatives, tend to be expensive, time consuming or both. This paper investigates the feasibility of using a laser diode based system that capitalises on the low cost nature of the diode to provide multiple laser sources for fast error measurement using multilateration. Laser diode module technology enables improved wavelength stability and spectral linewidth which are important factors for laser interferometry. With more than three laser sources, the set-up process can be greatly simplified while providing flexibility in the location of the laser sources improving the accuracy of the system.
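    As a rough illustration of the multilateration idea, the NumPy sketch below recovers a 3-D point from its distances to several known source positions using Gauss-Newton iterations. The source layout, the noise-free absolute distances, and the function name multilaterate are assumptions made for the example; a real interferometric set-up may only measure changes in length and would need additional calibration steps not shown here.

```python
import numpy as np

def multilaterate(sources, distances, guess, iters=20):
    """Estimate a 3-D position from distances to known sources (Gauss-Newton)."""
    p = np.asarray(guess, dtype=float)
    for _ in range(iters):
        diff = p - sources                    # vectors from each source to the estimate
        pred = np.linalg.norm(diff, axis=1)   # distances predicted by the estimate
        J = diff / pred[:, None]              # Jacobian of predicted distance w.r.t. p
        # Least-squares update that reduces the distance residuals.
        step, *_ = np.linalg.lstsq(J, distances - pred, rcond=None)
        p = p + step
    return p

# Hypothetical layout: four non-coplanar sources (coordinates in metres).
sources = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
target = np.array([0.4, 0.3, 0.6])
distances = np.linalg.norm(target - sources, axis=1)

estimate = multilaterate(sources, distances, guess=[0.5, 0.5, 0.5])
print("estimated position:", estimate)
```

    With more than the minimum number of sources the same update becomes an overdetermined least-squares problem, which is one way redundancy can improve robustness to measurement noise.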

  14. A Tool for Assessing the Text Legibility of Digital Human Machine Interfaces

    SciTech Connect

    Roger Lew; Ronald L. Boring; Thomas A. Ulrich

    2015-08-01

    A tool intended to aid qualified professionals in the assessment of the legibility of text presented on a digital display is described. The assessment of legibility is primarily for the purposes of designing and analyzing human machine interfaces in accordance with NUREG-0700 and MIL-STD 1472G. The tool addresses shortcomings of existing guidelines by providing more accurate metrics of text legibility with greater sensitivity to design alternatives.

  15. AstroML: "better, faster, cheaper" towards state-of-the-art data mining and machine learning

    NASA Astrophysics Data System (ADS)

    Ivezic, Zeljko; Connolly, Andrew J.; Vanderplas, Jacob

    2015-01-01

    We present AstroML, a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, matplotlib, and astropy, and distributed under an open license. AstroML contains a growing library of statistical and machine learning routines for analyzing astronomical data in Python, loaders for several open astronomical datasets (such as SDSS and other recent major surveys), and a large suite of examples of analyzing and visualizing astronomical datasets. AstroML is especially suitable for introducing undergraduate students to numerical research projects and for graduate students to rapidly undertake cutting-edge research. The long-term goal of astroML is to provide a community repository for fast Python implementations of common tools and routines used for statistical data analysis in astronomy and astrophysics (see http://www.astroml.org).

  16. Open Source for Knowledge and Learning Management: Strategies beyond Tools

    ERIC Educational Resources Information Center

    Lytras, Miltiadis, Ed.; Naeve, Ambjorn, Ed.

    2007-01-01

    In recent years, knowledge and learning management have made a significant impact on the IT research community. "Open Source for Knowledge and Learning Management: Strategies Beyond Tools" presents learning and knowledge management from a point of view where the basic tools and applications are provided by open source technologies. This book…

  17. User Studies: Developing Learning Strategy Tool Software for Children.

    ERIC Educational Resources Information Center

    Fitzgerald, Gail E.; Koury, Kevin A.; Peng, Hsinyi

    This paper is a report of user studies for developing learning strategy tool software for children. The prototype software demonstrated is designed for children with learning and behavioral disabilities. The tools consist of easy-to-use templates for creating organizational, memory, and learning approach guides for use in classrooms and at home.…

  18. On Recommending Web 2.0 Tools to Personalise Learning

    ERIC Educational Resources Information Center

    Juškeviciene, Anita; Kurilovas, Eugenijus

    2014-01-01

    The paper aims to present research results on using Web 2.0 tools for learning personalisation. In this work, a personalised Web 2.0 tool selection method is presented. This method takes into account the student's learning preferences for content and communication modes tailored to the learning activities with a view to help the learner to quickly and…

  19. Determination of real machine-tool settings and minimization of real surface deviation by computerized inspection

    NASA Technical Reports Server (NTRS)

    Litvin, Faydor L.; Kuan, Chihping; Zhang, Yi

    1991-01-01

    A numerical method is developed for the minimization of deviations of real tooth surfaces from the theoretical ones. The deviations are caused by manufacturing errors, errors in the installation of machine-tool settings and distortion of surfaces by heat treatment. The deviations are determined by coordinate measurements of gear tooth surfaces. The minimization of deviations is based on the proper correction of initially applied machine-tool settings. The accomplished research project covers the following topics: (1) Descriptions of the principle of coordinate measurements of gear tooth surfaces; (2) Deviation of theoretical tooth surfaces (with examples of surfaces of hypoid gears and references for spiral bevel gears); (3) Determination of the reference point and the grid; (4) Determination of the deviations of real tooth surfaces at the points of the grid; and (5) Determination of required corrections of machine-tool settings for minimization of deviations. The procedure for minimization of deviations is based on the numerical solution of an overdetermined system of n linear equations in m unknowns (m much less than n), where n is the number of points of measurements and m is the number of parameters of applied machine-tool settings to be corrected. The developed approach is illustrated with numerical examples.
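    The overdetermined linear system mentioned in this abstract can be sketched with an ordinary least-squares solve. Everything concrete below is a hypothetical stand-in: the sensitivity matrix S (how a unit change in each setting shifts the deviation at each grid point), the sizes n = 45 and m = 4, and the synthetic noise level. The report's actual equations are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_settings = 45, 4  # n grid points, m settings, with m much less than n

# Hypothetical sensitivity matrix, shape (n, m); a real one would come from the gear model.
S = rng.normal(size=(n_points, n_settings))

# Synthetic measured deviations produced by a known setting error plus small noise.
true_setting_error = np.array([0.02, -0.01, 0.005, 0.03])
measured_dev = S @ true_setting_error + rng.normal(scale=1e-4, size=n_points)

# Least-squares solution of S x ~= measured_dev (n equations, m unknowns).
setting_error, _, rank, _ = np.linalg.lstsq(S, measured_dev, rcond=None)

# Deviations left over once the estimated setting error is accounted for.
remaining = measured_dev - S @ setting_error

print("estimated setting error:", setting_error)
print("deviation norm before:", np.linalg.norm(measured_dev))
print("deviation norm after: ", np.linalg.norm(remaining))
```

    Under this linear model, applying the negative of the estimated setting error as the correction would cancel the part of the deviation that the settings can explain; the residual reflects noise and effects outside the model.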

  20. 76 FR 27668 - ASC Machine Tools, Inc., Spokane Valley, WA; Notice of Negative Determination on Reconsideration

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-12

    ... Employment and Training Administration ASC Machine Tools, Inc., Spokane Valley, WA; Notice of Negative... Register on October 25, 2010 (75 FR 65516). The workers produce custom-order metal cutting machinery used... reconsideration of the decision. The initial investigation resulted in a negative determination based on...