Sample records for machine learning library

  1. Comparative analysis of machine learning methods in ligand-based virtual screening of large compound libraries.

    PubMed

    Ma, Xiao H; Jia, Jia; Zhu, Feng; Xue, Ying; Li, Ze R; Chen, Yu Z

    2009-05-01

    Machine learning methods have been explored as ligand-based virtual screening tools for facilitating drug lead discovery. These methods predict compounds of specific pharmacodynamic, pharmacokinetic or toxicological properties based on their structure-derived structural and physicochemical properties. Increasing attention has been directed at these methods because of their capability in predicting compounds of diverse structures and complex structure-activity relationships without requiring the knowledge of target 3D structure. This article reviews current progresses in using machine learning methods for virtual screening of pharmacodynamically active compounds from large compound libraries, and analyzes and compares the reported performances of machine learning tools with those of structure-based and other ligand-based (such as pharmacophore and clustering) virtual screening methods. The feasibility to improve the performance of machine learning methods in screening large libraries is discussed.

  2. ClearTK 2.0: Design Patterns for Machine Learning in UIMA

    PubMed Central

    Bethard, Steven; Ogren, Philip; Becker, Lee

    2014-01-01

    ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework. PMID:29104966

  3. ClearTK 2.0: Design Patterns for Machine Learning in UIMA.

    PubMed

    Bethard, Steven; Ogren, Philip; Becker, Lee

    2014-05-01

    ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework.

  4. Clinical data miner: an electronic case report form system with integrated data preprocessing and machine-learning libraries supporting clinical diagnostic model research.

    PubMed

    Installé, Arnaud Jf; Van den Bosch, Thierry; De Moor, Bart; Timmerman, Dirk

    2014-10-20

    Using machine-learning techniques, clinical diagnostic model research extracts diagnostic models from patient data. Traditionally, patient data are often collected using electronic Case Report Form (eCRF) systems, while mathematical software is used for analyzing these data using machine-learning techniques. Due to the lack of integration between eCRF systems and mathematical software, extracting diagnostic models is a complex, error-prone process. Moreover, due to the complexity of this process, it is usually only performed once, after a predetermined number of data points have been collected, without insight into the predictive performance of the resulting models. The objective of the study of Clinical Data Miner (CDM) software framework is to offer an eCRF system with integrated data preprocessing and machine-learning libraries, improving efficiency of the clinical diagnostic model research workflow, and to enable optimization of patient inclusion numbers through study performance monitoring. The CDM software framework was developed using a test-driven development (TDD) approach, to ensure high software quality. Architecturally, CDM's design is split over a number of modules, to ensure future extendability. The TDD approach has enabled us to deliver high software quality. CDM's eCRF Web interface is in active use by the studies of the International Endometrial Tumor Analysis consortium, with over 4000 enrolled patients, and more studies planned. Additionally, a derived user interface has been used in six separate interrater agreement studies. CDM's integrated data preprocessing and machine-learning libraries simplify some otherwise manual and error-prone steps in the clinical diagnostic model research workflow. Furthermore, CDM's libraries provide study coordinators with a method to monitor a study's predictive performance as patient inclusions increase. To our knowledge, CDM is the only eCRF system integrating data preprocessing and machine-learning libraries. This integration improves the efficiency of the clinical diagnostic model research workflow. Moreover, by simplifying the generation of learning curves, CDM enables study coordinators to assess more accurately when data collection can be terminated, resulting in better models or lower patient recruitment costs.

  5. Libraries Can Learn from Banks.

    ERIC Educational Resources Information Center

    Lawrence, Gail H.

    1983-01-01

    The experiences of banks introducing computerized services to the public are described to provide some idea of what libraries can expect when they introduce online systems. Volume of use of Automated Teller Machines, types of users, introduction of machines, and user acceptance are highlighted. Thirty-two references are cited. (EJS)

  6. Creating the New from the Old: Combinatorial Libraries Generation with Machine-Learning-Based Compound Structure Optimization.

    PubMed

    Podlewska, Sabina; Czarnecki, Wojciech M; Kafel, Rafał; Bojarski, Andrzej J

    2017-02-27

    The growing computational abilities of various tools that are applied in the broadly understood field of computer-aided drug design have led to the extreme popularity of virtual screening in the search for new biologically active compounds. Most often, the source of such molecules consists of commercially available compound databases, but they can also be searched for within the libraries of structures generated in silico from existing ligands. Various computational combinatorial approaches are based solely on the chemical structure of compounds, using different types of substitutions for new molecules formation. In this study, the starting point for combinatorial library generation was the fingerprint referring to the optimal substructural composition in terms of the activity toward a considered target, which was obtained using a machine learning-based optimization procedure. The systematic enumeration of all possible connections between preferred substructures resulted in the formation of target-focused libraries of new potential ligands. The compounds were initially assessed by machine learning methods using a hashed fingerprint to represent molecules; the distribution of their physicochemical properties was also investigated, as well as their synthetic accessibility. The examination of various fingerprints and machine learning algorithms indicated that the Klekota-Roth fingerprint and support vector machine were an optimal combination for such experiments. This study was performed for 8 protein targets, and the obtained compound sets and their characterization are publically available at http://skandal.if-pan.krakow.pl/comb_lib/ .

  7. Toolkits and Libraries for Deep Learning.

    PubMed

    Erickson, Bradley J; Korfiatis, Panagiotis; Akkus, Zeynettin; Kline, Timothy; Philbrick, Kenneth

    2017-08-01

    Deep learning is an important new area of machine learning which encompasses a wide range of neural network architectures designed to complete various tasks. In the medical imaging domain, example tasks include organ segmentation, lesion detection, and tumor classification. The most popular network architecture for deep learning for images is the convolutional neural network (CNN). Whereas traditional machine learning requires determination and calculation of features from which the algorithm learns, deep learning approaches learn the important features as well as the proper weighting of those features to make predictions for new data. In this paper, we will describe some of the libraries and tools that are available to aid in the construction and efficient execution of deep learning as applied to medical images.

  8. Machine Learning Approach to Deconvolution of Thermal Infrared (TIR) Spectrum of Mercury Supporting MERTIS Onboard ESA/JAXA BepiColombo

    NASA Astrophysics Data System (ADS)

    Varatharajan, I.; D'Amore, M.; Maturilli, A.; Helbert, J.; Hiesinger, H.

    2018-04-01

    Machine learning approach to spectral unmixing of emissivity spectra of Mercury is carried out using endmember spectral library measured at simulated daytime surface conditions of Mercury. Study supports MERTIS payload onboard ESA/JAXA BepiColombo.

  9. Building a Digital Library from the Ground Up: an Examination of Emergent Information Resources in the Machine Learning Community

    NASA Astrophysics Data System (ADS)

    Cunningham, Sally Jo

    The current crop of digital libraries for the computing community are strongly grounded in the conventional library paradigm: they provide indexes to support searching of collections of research papers. As such, these digital libraries are relatively impoverished; the present computing digital libraries omit many of the documents and resources that are currently available to computing researchers, and offer few browsing structures. These computing digital libraries were built 'top down': the resources and collection contents are forced to fit an existing digital library architecture. A 'bottom up' approach to digital library development would begin with an investigation of a community's information needs and available documents, and then design a library to organize those documents in such a way as to fulfill the community's needs. The 'home grown', informal information resources developed by and for the machine learning community are examined as a case study, to determine the types of information and document organizations 'native' to this group of researchers. The insights gained in this type of case study can be used to inform construction of a digital library tailored to this community.

  10. Hello World Deep Learning in Medical Imaging.

    PubMed

    Lakhani, Paras; Gray, Daniel L; Pett, Carl R; Nagy, Paul; Shih, George

    2018-05-03

    There is recent popularity in applying machine learning to medical imaging, notably deep learning, which has achieved state-of-the-art performance in image analysis and processing. The rapid adoption of deep learning may be attributed to the availability of machine learning frameworks and libraries to simplify their use. In this tutorial, we provide a high-level overview of how to build a deep neural network for medical image classification, and provide code that can help those new to the field begin their informatics projects.

  11. Cognitive Foundry v. 3.0 (OSS)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Basilico, Justin; Dixon, Kevin; McClain, Jonathan

    2009-11-18

    The Cognitive Foundry is a unified collection of tools designed for research and applications that use cognitive modeling, machine learning, or pattern recognition. The software library contains design patterns, interface definitions, and default implementations of reusable software components and algorithms designed to support a wide variety of research and development needs. The library contains three main software packages: the Common package that contains basic utilities and linear algebraic methods, the Cognitive Framework package that contains tools to assist in implementing and analyzing theories of cognition, and the Machine Learning package that provides general algorithms and methods for populating Cognitive Frameworkmore » components from domain-relevant data.« less

  12. Machine learning for neuroimaging with scikit-learn.

    PubMed

    Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël

    2014-01-01

    Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.

  13. Machine learning for neuroimaging with scikit-learn

    PubMed Central

    Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël

    2014-01-01

    Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain. PMID:24600388

  14. Rapid and Accurate Machine Learning Recognition of High Performing Metal Organic Frameworks for CO2 Capture.

    PubMed

    Fernandez, Michael; Boyd, Peter G; Daff, Thomas D; Aghaji, Mohammad Zein; Woo, Tom K

    2014-09-04

    In this work, we have developed quantitative structure-property relationship (QSPR) models using advanced machine learning algorithms that can rapidly and accurately recognize high-performing metal organic framework (MOF) materials for CO2 capture. More specifically, QSPR classifiers have been developed that can, in a fraction of a section, identify candidate MOFs with enhanced CO2 adsorption capacity (>1 mmol/g at 0.15 bar and >4 mmol/g at 1 bar). The models were tested on a large set of 292 050 MOFs that were not part of the training set. The QSPR classifier could recover 945 of the top 1000 MOFs in the test set while flagging only 10% of the whole library for compute intensive screening. Thus, using the machine learning classifiers as part of a high-throughput screening protocol would result in an order of magnitude reduction in compute time and allow intractably large structure libraries and search spaces to be screened.

  15. Machine learning phases of matter

    NASA Astrophysics Data System (ADS)

    Carrasquilla, Juan; Melko, Roger G.

    2017-02-01

    Condensed-matter physics is the study of the collective behaviour of infinitely complex assemblies of electrons, nuclei, magnetic moments, atoms or qubits. This complexity is reflected in the size of the state space, which grows exponentially with the number of particles, reminiscent of the `curse of dimensionality' commonly encountered in machine learning. Despite this curse, the machine learning community has developed techniques with remarkable abilities to recognize, classify, and characterize complex sets of data. Here, we show that modern machine learning architectures, such as fully connected and convolutional neural networks, can identify phases and phase transitions in a variety of condensed-matter Hamiltonians. Readily programmable through modern software libraries, neural networks can be trained to detect multiple types of order parameter, as well as highly non-trivial states with no conventional order, directly from raw state configurations sampled with Monte Carlo.

  16. An application of machine learning to the organization of institutional software repositories

    NASA Technical Reports Server (NTRS)

    Bailin, Sidney; Henderson, Scott; Truszkowski, Walt

    1993-01-01

    Software reuse has become a major goal in the development of space systems, as a recent NASA-wide workshop on the subject made clear. The Data Systems Technology Division of Goddard Space Flight Center has been working on tools and techniques for promoting reuse, in particular in the development of satellite ground support software. One of these tools is the Experiment in Libraries via Incremental Schemata and Cobweb (ElvisC). ElvisC applies machine learning to the problem of organizing a reusable software component library for efficient and reliable retrieval. In this paper we describe the background factors that have motivated this work, present the design of the system, and evaluate the results of its application.

  17. Learning to Predict Demand in a Transport-Resource Sharing Task

    DTIC Science & Technology

    2015-09-01

    exhaustive manner. We experimented with the scikit- learn machine- learning library for Python and a range of R packages before settling on R. We...NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA THESIS Approved for public release; distribution is unlimited LEARNING TO...COVERED Master’s thesis 4. TITLE AND SUBTITLE LEARNING TO PREDICT DEMAND IN A TRANSPORT-RESOURCE SHARING TASK 5. FUNDING NUMBERS 6. AUTHOR

  18. The Python Spectral Analysis Tool (PySAT): A Powerful, Flexible, Preprocessing and Machine Learning Library and Interface

    NASA Astrophysics Data System (ADS)

    Anderson, R. B.; Finch, N.; Clegg, S. M.; Graff, T. G.; Morris, R. V.; Laura, J.; Gaddis, L. R.

    2017-12-01

    Machine learning is a powerful but underutilized approach that can enable planetary scientists to derive meaningful results from the rapidly-growing quantity of available spectral data. For example, regression methods such as Partial Least Squares (PLS) and Least Absolute Shrinkage and Selection Operator (LASSO), can be used to determine chemical concentrations from ChemCam and SuperCam Laser-Induced Breakdown Spectroscopy (LIBS) data [1]. Many scientists are interested in testing different spectral data processing and machine learning methods, but few have the time or expertise to write their own software to do so. We are therefore developing a free open-source library of software called the Python Spectral Analysis Tool (PySAT) along with a flexible, user-friendly graphical interface to enable scientists to process and analyze point spectral data without requiring significant programming or machine-learning expertise. A related but separately-funded effort is working to develop a graphical interface for orbital data [2]. The PySAT point-spectra tool includes common preprocessing steps (e.g. interpolation, normalization, masking, continuum removal, dimensionality reduction), plotting capabilities, and capabilities to prepare data for machine learning such as creating stratified folds for cross validation, defining training and test sets, and applying calibration transfer so that data collected on different instruments or under different conditions can be used together. The tool leverages the scikit-learn library [3] to enable users to train and compare the results from a variety of multivariate regression methods. It also includes the ability to combine multiple "sub-models" into an overall model, a method that has been shown to improve results and is currently used for ChemCam data [4]. Although development of the PySAT point-spectra tool has focused primarily on the analysis of LIBS spectra, the relevant steps and methods are applicable to any spectral data. The tool is available at https://github.com/USGS-Astrogeology/PySAT_Point_Spectra_GUI. [1] Clegg, S.M., et al. (2017) Spectrochim Acta B. 129, 64-85. [2] Gaddis, L. et al. (2017) 3rd Planetary Data Workshop, #1986. [3] http://scikit-learn.org/ [4] Anderson, R.B., et al. (2017) Spectrochim. Acta B. 129, 49-57.

  19. Automatic MeSH term assignment and quality assessment.

    PubMed Central

    Kim, W.; Aronson, A. R.; Wilbur, W. J.

    2001-01-01

    For computational purposes documents or other objects are most often represented by a collection of individual attributes that may be strings or numbers. Such attributes are often called features and success in solving a given problem can depend critically on the nature of the features selected to represent documents. Feature selection has received considerable attention in the machine learning literature. In the area of document retrieval we refer to feature selection as indexing. Indexing has not traditionally been evaluated by the same methods used in machine learning feature selection. Here we show how indexing quality may be evaluated in a machine learning setting and apply this methodology to results of the Indexing Initiative at the National Library of Medicine. PMID:11825203

  20. MoleculeNet: a benchmark for molecular machine learning† †Electronic supplementary information (ESI) available. See DOI: 10.1039/c7sc02664a

    PubMed Central

    Wu, Zhenqin; Ramsundar, Bharath; Feinberg, Evan N.; Gomes, Joseph; Geniesse, Caleb; Pappu, Aneesh S.; Leswing, Karl

    2017-01-01

    Molecular machine learning has been maturing rapidly over the last few years. Improved methods and the presence of larger datasets have enabled machine learning algorithms to make increasingly accurate predictions about molecular properties. However, algorithmic progress has been limited due to the lack of a standard benchmark to compare the efficacy of proposed methods; most new algorithms are benchmarked on different datasets making it challenging to gauge the quality of proposed methods. This work introduces MoleculeNet, a large scale benchmark for molecular machine learning. MoleculeNet curates multiple public datasets, establishes metrics for evaluation, and offers high quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms (released as part of the DeepChem open source library). MoleculeNet benchmarks demonstrate that learnable representations are powerful tools for molecular machine learning and broadly offer the best performance. However, this result comes with caveats. Learnable representations still struggle to deal with complex tasks under data scarcity and highly imbalanced classification. For quantum mechanical and biophysical datasets, the use of physics-aware featurizations can be more important than choice of particular learning algorithm. PMID:29629118

  1. Learning Extended Finite State Machines

    NASA Technical Reports Server (NTRS)

    Cassel, Sofia; Howar, Falk; Jonsson, Bengt; Steffen, Bernhard

    2014-01-01

    We present an active learning algorithm for inferring extended finite state machines (EFSM)s, combining data flow and control behavior. Key to our learning technique is a novel learning model based on so-called tree queries. The learning algorithm uses the tree queries to infer symbolic data constraints on parameters, e.g., sequence numbers, time stamps, identifiers, or even simple arithmetic. We describe sufficient conditions for the properties that the symbolic constraints provided by a tree query in general must have to be usable in our learning model. We have evaluated our algorithm in a black-box scenario, where tree queries are realized through (black-box) testing. Our case studies include connection establishment in TCP and a priority queue from the Java Class Library.

  2. Designing Multi-target Compound Libraries with Gaussian Process Models.

    PubMed

    Bieler, Michael; Reutlinger, Michael; Rodrigues, Tiago; Schneider, Petra; Kriegl, Jan M; Schneider, Gisbert

    2016-05-01

    We present the application of machine learning models to selecting G protein-coupled receptor (GPCR)-focused compound libraries. The library design process was realized by ant colony optimization. A proprietary Boehringer-Ingelheim reference set consisting of 3519 compounds tested in dose-response assays at 11 GPCR targets served as training data for machine learning and activity prediction. We compared the usability of the proprietary data with a public data set from ChEMBL. Gaussian process models were trained to prioritize compounds from a virtual combinatorial library. We obtained meaningful models for three of the targets (5-HT2c , MCH, A1), which were experimentally confirmed for 12 of 15 selected and synthesized or purchased compounds. Overall, the models trained on the public data predicted the observed assay results more accurately. The results of this study motivate the use of Gaussian process regression on public data for virtual screening and target-focused compound library design. © 2016 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA. This is an open access article under the terms of the Creative Commons Attribution Non-Commercial NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.

  3. AstroML: "better, faster, cheaper" towards state-of-the-art data mining and machine learning

    NASA Astrophysics Data System (ADS)

    Ivezic, Zeljko; Connolly, Andrew J.; Vanderplas, Jacob

    2015-01-01

    We present AstroML, a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, matplotlib, and astropy, and distributed under an open license. AstroML contains a growing library of statistical and machine learning routines for analyzing astronomical data in Python, loaders for several open astronomical datasets (such as SDSS and other recent major surveys), and a large suite of examples of analyzing and visualizing astronomical datasets. AstroML is especially suitable for introducing undergraduate students to numerical research projects and for graduate students to rapidly undertake cutting-edge research. The long-term goal of astroML is to provide a community repository for fast Python implementations of common tools and routines used for statistical data analysis in astronomy and astrophysics (see http://www.astroml.org).

  4. Evaluating Machine Learning Classifiers for Hybrid Network Intrusion Detection Systems

    DTIC Science & Technology

    2015-03-26

    7 VRT Vulnerability Research Team...and the Talos (formerly the Vulnerability Research Team ( VRT )) [7] 7 ruleset libraries are the two leading rulesets in use. Both libraries offer paid...rule sets to load for the signature-based IDS. Snort is selected as the IDS engine using the “ VRT and ET No/GPL” rule set. The total rule count in the

  5. Robust Library Building for Autonomous Classification of Downhole Geophysical Logs Using Gaussian Processes

    NASA Astrophysics Data System (ADS)

    Silversides, Katherine L.; Melkumyan, Arman

    2017-03-01

    Machine learning techniques such as Gaussian Processes can be used to identify stratigraphically important features in geophysical logs. The marker shales in the banded iron formation hosted iron ore deposits of the Hamersley Ranges, Western Australia, form distinctive signatures in the natural gamma logs. The identification of these marker shales is important for stratigraphic identification of unit boundaries for the geological modelling of the deposit. Machine learning techniques each have different unique properties that will impact the results. For Gaussian Processes (GPs), the output values are inclined towards the mean value, particularly when there is not sufficient information in the library. The impact that these inclinations have on the classification can vary depending on the parameter values selected by the user. Therefore, when applying machine learning techniques, care must be taken to fit the technique to the problem correctly. This study focuses on optimising the settings and choices for training a GPs system to identify a specific marker shale. We show that the final results converge even when different, but equally valid starting libraries are used for the training. To analyse the impact on feature identification, GP models were trained so that the output was inclined towards a positive, neutral or negative output. For this type of classification, the best results were when the pull was towards a negative output. We also show that the GP output can be adjusted by using a standard deviation coefficient that changes the balance between certainty and accuracy in the results.

  6. Hyperopt: a Python library for model selection and hyperparameter optimization

    NASA Astrophysics Data System (ADS)

    Bergstra, James; Komer, Brent; Eliasmith, Chris; Yamins, Dan; Cox, David D.

    2015-01-01

    Sequential model-based optimization (also known as Bayesian optimization) is one of the most efficient methods (per function evaluation) of function minimization. This efficiency makes it appropriate for optimizing the hyperparameters of machine learning algorithms that are slow to train. The Hyperopt library provides algorithms and parallelization infrastructure for performing hyperparameter optimization (model selection) in Python. This paper presents an introductory tutorial on the usage of the Hyperopt library, including the description of search spaces, minimization (in serial and parallel), and the analysis of the results collected in the course of minimization. This paper also gives an overview of Hyperopt-Sklearn, a software project that provides automatic algorithm configuration of the Scikit-learn machine learning library. Following Auto-Weka, we take the view that the choice of classifier and even the choice of preprocessing module can be taken together to represent a single large hyperparameter optimization problem. We use Hyperopt to define a search space that encompasses many standard components (e.g. SVM, RF, KNN, PCA, TFIDF) and common patterns of composing them together. We demonstrate, using search algorithms in Hyperopt and standard benchmarking data sets (MNIST, 20-newsgroups, convex shapes), that searching this space is practical and effective. In particular, we improve on best-known scores for the model space for both MNIST and convex shapes. The paper closes with some discussion of ongoing and future work.

  7. LIS Professionals as Knowledge Engineers.

    ERIC Educational Resources Information Center

    Poulter, Alan; And Others

    1994-01-01

    Considers the role of library and information science professionals as knowledge engineers. Highlights include knowledge acquisition, including personal experience, interviews, protocol analysis, observation, multidimensional sorting, printed sources, and machine learning; knowledge representation, including production rules and semantic nets;…

  8. Muxstep: an open-source C ++ multiplex HMM library for making inferences on multiple data types.

    PubMed

    Veličković, Petar; Liò, Pietro

    2016-08-15

    With the development of experimental methods and technology, we are able to reliably gain access to data in larger quantities, dimensions and types. This has great potential for the improvement of machine learning (as the learning algorithms have access to a larger space of information). However, conventional machine learning approaches used thus far on single-dimensional data inputs are unlikely to be expressive enough to accurately model the problem in higher dimensions; in fact, it should generally be most suitable to represent our underlying models as some form of complex networksng;nsio with nontrivial topological features. As the first step in establishing such a trend, we present MUXSTEP: , an open-source library utilising multiplex networks for the purposes of binary classification on multiple data types. The library is designed to be used out-of-the-box for developing models based on the multiplex network framework, as well as easily modifiable to suit problem modelling needs that may differ significantly from the default approach described. The full source code is available on GitHub: https://github.com/PetarV-/muxstep petar.velickovic@cl.cam.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  9. VariantSpark: population scale clustering of genotype information.

    PubMed

    O'Brien, Aidan R; Saunders, Neil F W; Guo, Yi; Buske, Fabian A; Scott, Rodney J; Bauer, Denis C

    2015-12-10

    Genomic information is increasingly used in medical practice giving rise to the need for efficient analysis methodology able to cope with thousands of individuals and millions of variants. The widely used Hadoop MapReduce architecture and associated machine learning library, Mahout, provide the means for tackling computationally challenging tasks. However, many genomic analyses do not fit the Map-Reduce paradigm. We therefore utilise the recently developed SPARK engine, along with its associated machine learning library, MLlib, which offers more flexibility in the parallelisation of population-scale bioinformatics tasks. The resulting tool, VARIANTSPARK provides an interface from MLlib to the standard variant format (VCF), offers seamless genome-wide sampling of variants and provides a pipeline for visualising results. To demonstrate the capabilities of VARIANTSPARK, we clustered more than 3,000 individuals with 80 Million variants each to determine the population structure in the dataset. VARIANTSPARK is 80 % faster than the SPARK-based genome clustering approach, ADAM, the comparable implementation using Hadoop/Mahout, as well as ADMIXTURE, a commonly used tool for determining individual ancestries. It is over 90 % faster than traditional implementations using R and Python. The benefits of speed, resource consumption and scalability enables VARIANTSPARK to open up the usage of advanced, efficient machine learning algorithms to genomic data.

  10. Scalable Nearest Neighbor Algorithms for High Dimensional Data.

    PubMed

    Muja, Marius; Lowe, David G

    2014-11-01

    For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.

  11. Cheminformatics in Drug Discovery, an Industrial Perspective.

    PubMed

    Chen, Hongming; Kogej, Thierry; Engkvist, Ola

    2018-05-18

    Cheminformatics has established itself as a core discipline within large scale drug discovery operations. It would be impossible to handle the amount of data generated today in a small molecule drug discovery project without persons skilled in cheminformatics. In addition, due to increased emphasis on "Big Data", machine learning and artificial intelligence, not only in the society in general, but also in drug discovery, it is expected that the cheminformatics field will be even more important in the future. Traditional areas like virtual screening, library design and high-throughput screening analysis are highlighted in this review. Applying machine learning in drug discovery is an area that has become very important. Applications of machine learning in early drug discovery has been extended from predicting ADME properties and target activity to tasks like de novo molecular design and prediction of chemical reactions. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Carbon Nanotube Growth Rate Regression using Support Vector Machines and Artificial Neural Networks

    DTIC Science & Technology

    2014-03-27

    intensity D peak. Reprinted with permission from [38]. The SVM classifier is trained using custom written Java code leveraging the Sequential Minimal...Society Encog is a machine learning framework for Java , C++ and .Net applications that supports Bayesian Networks, Hidden Markov Models, SVMs and ANNs [13...SVM classifiers are trained using Weka libraries and leveraging custom written Java code. The data set is created as an Attribute Relationship File

  13. Managing a Special Library. Parts I and II.

    ERIC Educational Resources Information Center

    Labovitz, Judy; Swanigan, Meryl

    1985-01-01

    Various concepts from "In Search of Excellence" are described in context of authors' personal styles. Discussions address thinking in terms of options, using statistics, learning value of corporate politics, a bias toward action, productivity through people, the "lean-machine" concept, staying close to client, entrepreneurship…

  14. GAME: GAlaxy Machine learning for Emission lines

    NASA Astrophysics Data System (ADS)

    Ucci, G.; Ferrara, A.; Pallottini, A.; Gallerani, S.

    2018-06-01

    We present an updated, optimized version of GAME (GAlaxy Machine learning for Emission lines), a code designed to infer key interstellar medium physical properties from emission line intensities of ultraviolet /optical/far-infrared galaxy spectra. The improvements concern (a) an enlarged spectral library including Pop III stars, (b) the inclusion of spectral noise in the training procedure, and (c) an accurate evaluation of uncertainties. We extensively validate the optimized code and compare its performance against empirical methods and other available emission line codes (PYQZ and HII-CHI-MISTRY) on a sample of 62 SDSS stacked galaxy spectra and 75 observed HII regions. Very good agreement is found for metallicity. However, ionization parameters derived by GAME tend to be higher. We show that this is due to the use of too limited libraries in the other codes. The main advantages of GAME are the simultaneous use of all the measured spectral lines and the extremely short computational times. We finally discuss the code potential and limitations.

  15. Application of machine learning using support vector machines for crater detection from Martian digital topography data

    NASA Astrophysics Data System (ADS)

    Salamunićcar, Goran; Lončarić, Sven

    In our previous work, in order to extend the GT-57633 catalogue [PSS, 56 (15), 1992-2008] with still uncatalogued impact-craters, the following has been done [GRS, 48 (5), in press, doi:10.1109/TGRS.2009.2037750]: (1) the crater detection algorithm (CDA) based on digital elevation model (DEM) was developed; (2) using 1/128° MOLA data, this CDA proposed 414631 crater-candidates; (3) each crater-candidate was analyzed manually; and (4) 57592 were confirmed as correct detections. The resulting GT-115225 catalog is the significant result of this effort. However, to check such a large number of crater-candidates manually was a demanding task. This was the main motivation for work on improvement of the CDA in order to provide better classification of craters as true and false detections. To achieve this, we extended the CDA with the machine learning capability, using support vector machines (SVM). In the first step, the CDA (re)calculates numerous terrain morphometric attributes from DEM. For this purpose, already existing modules of the CDA from our previous work were reused in order to be capable to prepare these attributes. In addition, new attributes were introduced such as ellipse eccentricity and tilt. For machine learning purpose, the CDA is additionally extended to provide 2-D topography-profile and 3-D shape for each crater-candidate. The latter two are a performance problem because of the large number of crater-candidates in combination with the large number of attributes. As a solution, we developed a CDA architecture wherein it is possible to combine the SVM with a radial basis function (RBF) or any other kernel (for initial set of attributes), with the SVM with linear kernel (for the cases when 2-D and 3-D data are included as well). Another challenge is that, in addition to diversity of possible crater types, there are numerous morphological differences between the smallest (mostly very circular bowl-shaped craters) and the largest (multi-ring) impact craters. As a solution to this problem, the CDA classifies crater-candidates according to their diameter into 7 groups (D smaller/larger then 2km, 4km, 8km, 16km, 32km and 64km), and for each group uses separate SVMs for training and prediction. For implementation of the machine-learning part and integration with the rest of the CDA, we used C.-J. Lin's et al. [http://www.csie.ntu.edu.tw/˜cjlin/] LIBSVM (A Library for Support Vector Machines) and LIBLINEAR (A Library for Large Linear Classification) libraries. According to the initial evaluation, now the CDA provides much better classification of craters as true and false detections.

  16. Drugs and Drug-Like Compounds: Discriminating Approved Pharmaceuticals from Screening-Library Compounds

    NASA Astrophysics Data System (ADS)

    Schierz, Amanda C.; King, Ross D.

    Compounds in drug screening-libraries should resemble pharmaceuticals. To operationally test this, we analysed the compounds in terms of known drug-like filters and developed a novel machine learning method to discriminate approved pharmaceuticals from “drug-like” compounds. This method uses both structural features and molecular properties for discrimination. The method has an estimated accuracy of 91% in discriminating between the Maybridge HitFinder library and approved pharmaceuticals, and 99% between the NATDiverse collection (from Analyticon Discovery) and approved pharmaceuticals. These results show that Lipinski’s Rule of 5 for oral absorption is not sufficient to describe “drug-likeness” and be the main basis of screening-library design.

  17. Re-inventing Data Libraries: Ensuring Continuing Access To Curated (Value-added) Data

    NASA Astrophysics Data System (ADS)

    Burnhill, P.; Medyckyj-Scott, D.

    2008-12-01

    How many years of inexperience do we need in using, and in particular sharing, digital data generated by others? That history pre-dates, but must also gain leverage from, the emergence of the digital library. Much of this sharing was done within research groups but recent attention to spatial data infrastructure highlights the importance of achieving several 'right mixes': * between Internet-standards, geo-specific referencing, and domain-specific vocabulary (cf ontology); * between attention to user-focus'd services and machine-to-machine interoperability; * between the demands of current high-quality services, the practice of data curation, and the need for long term preservation. This presentation will draw upon ideas and experience data library services in research universities, a national (UK) academic data centre, and developments in digital curation. It will be argued that the 1980s term 'data library' has some polemic value in that we have yet to learn what it means to 'do library' for data: more than "a bit like inter-galactic library loan", perhaps. Illustration will be drawn from multi-faceted database of digitized boundaries (UKBORDERS), through the first Internet map delivery of national mapping agency data (Digimap), to strategic positioning to help geo-enable academic and scientific data and so enhance research (in the UK, in Europe, and beyond).

  18. Python Spectral Analysis Tool (PySAT) for Preprocessing, Multivariate Analysis, and Machine Learning with Point Spectra

    NASA Astrophysics Data System (ADS)

    Anderson, R. B.; Finch, N.; Clegg, S.; Graff, T.; Morris, R. V.; Laura, J.

    2017-06-01

    We present a Python-based library and graphical interface for the analysis of point spectra. The tool is being developed with a focus on methods used for ChemCam data, but is flexible enough to handle spectra from other instruments.

  19. Obstacle Recognition Based on Machine Learning for On-Chip LiDAR Sensors in a Cyber-Physical System

    PubMed Central

    Beruvides, Gerardo

    2017-01-01

    Collision avoidance is an important feature in advanced driver-assistance systems, aimed at providing correct, timely and reliable warnings before an imminent collision (with objects, vehicles, pedestrians, etc.). The obstacle recognition library is designed and implemented to address the design and evaluation of obstacle detection in a transportation cyber-physical system. The library is integrated into a co-simulation framework that is supported on the interaction between SCANeR software and Matlab/Simulink. From the best of the authors’ knowledge, two main contributions are reported in this paper. Firstly, the modelling and simulation of virtual on-chip light detection and ranging sensors in a cyber-physical system, for traffic scenarios, is presented. The cyber-physical system is designed and implemented in SCANeR. Secondly, three specific artificial intelligence-based methods for obstacle recognition libraries are also designed and applied using a sensory information database provided by SCANeR. The computational library has three methods for obstacle detection: a multi-layer perceptron neural network, a self-organization map and a support vector machine. Finally, a comparison among these methods under different weather conditions is presented, with very promising results in terms of accuracy. The best results are achieved using the multi-layer perceptron in sunny and foggy conditions, the support vector machine in rainy conditions and the self-organized map in snowy conditions. PMID:28906450

  20. Obstacle Recognition Based on Machine Learning for On-Chip LiDAR Sensors in a Cyber-Physical System.

    PubMed

    Castaño, Fernando; Beruvides, Gerardo; Haber, Rodolfo E; Artuñedo, Antonio

    2017-09-14

    Collision avoidance is an important feature in advanced driver-assistance systems, aimed at providing correct, timely and reliable warnings before an imminent collision (with objects, vehicles, pedestrians, etc.). The obstacle recognition library is designed and implemented to address the design and evaluation of obstacle detection in a transportation cyber-physical system. The library is integrated into a co-simulation framework that is supported on the interaction between SCANeR software and Matlab/Simulink. From the best of the authors' knowledge, two main contributions are reported in this paper. Firstly, the modelling and simulation of virtual on-chip light detection and ranging sensors in a cyber-physical system, for traffic scenarios, is presented. The cyber-physical system is designed and implemented in SCANeR. Secondly, three specific artificial intelligence-based methods for obstacle recognition libraries are also designed and applied using a sensory information database provided by SCANeR. The computational library has three methods for obstacle detection: a multi-layer perceptron neural network, a self-organization map and a support vector machine. Finally, a comparison among these methods under different weather conditions is presented, with very promising results in terms of accuracy. The best results are achieved using the multi-layer perceptron in sunny and foggy conditions, the support vector machine in rainy conditions and the self-organized map in snowy conditions.

  1. Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization.

    PubMed

    Bedbrook, Claire N; Yang, Kevin K; Rice, Austin J; Gradinaru, Viviana; Arnold, Frances H

    2017-10-01

    There is growing interest in studying and engineering integral membrane proteins (MPs) that play key roles in sensing and regulating cellular response to diverse external signals. A MP must be expressed, correctly inserted and folded in a lipid bilayer, and trafficked to the proper cellular location in order to function. The sequence and structural determinants of these processes are complex and highly constrained. Here we describe a predictive, machine-learning approach that captures this complexity to facilitate successful MP engineering and design. Machine learning on carefully-chosen training sequences made by structure-guided SCHEMA recombination has enabled us to accurately predict the rare sequences in a diverse library of channelrhodopsins (ChRs) that express and localize to the plasma membrane of mammalian cells. These light-gated channel proteins of microbial origin are of interest for neuroscience applications, where expression and localization to the plasma membrane is a prerequisite for function. We trained Gaussian process (GP) classification and regression models with expression and localization data from 218 ChR chimeras chosen from a 118,098-variant library designed by SCHEMA recombination of three parent ChRs. We use these GP models to identify ChRs that express and localize well and show that our models can elucidate sequence and structure elements important for these processes. We also used the predictive models to convert a naturally occurring ChR incapable of mammalian localization into one that localizes well.

  2. Machine learning to design integral membrane channelrhodopsins for efficient eukaryotic expression and plasma membrane localization

    PubMed Central

    Rice, Austin J.; Gradinaru, Viviana; Arnold, Frances H.

    2017-01-01

    There is growing interest in studying and engineering integral membrane proteins (MPs) that play key roles in sensing and regulating cellular response to diverse external signals. A MP must be expressed, correctly inserted and folded in a lipid bilayer, and trafficked to the proper cellular location in order to function. The sequence and structural determinants of these processes are complex and highly constrained. Here we describe a predictive, machine-learning approach that captures this complexity to facilitate successful MP engineering and design. Machine learning on carefully-chosen training sequences made by structure-guided SCHEMA recombination has enabled us to accurately predict the rare sequences in a diverse library of channelrhodopsins (ChRs) that express and localize to the plasma membrane of mammalian cells. These light-gated channel proteins of microbial origin are of interest for neuroscience applications, where expression and localization to the plasma membrane is a prerequisite for function. We trained Gaussian process (GP) classification and regression models with expression and localization data from 218 ChR chimeras chosen from a 118,098-variant library designed by SCHEMA recombination of three parent ChRs. We use these GP models to identify ChRs that express and localize well and show that our models can elucidate sequence and structure elements important for these processes. We also used the predictive models to convert a naturally occurring ChR incapable of mammalian localization into one that localizes well. PMID:29059183

  3. Geological applications of machine learning on hyperspectral remote sensing data

    NASA Astrophysics Data System (ADS)

    Tse, C. H.; Li, Yi-liang; Lam, Edmund Y.

    2015-02-01

    The CRISM imaging spectrometer orbiting Mars has been producing a vast amount of data in the visible to infrared wavelengths in the form of hyperspectral data cubes. These data, compared with those obtained from previous remote sensing techniques, yield an unprecedented level of detailed spectral resolution in additional to an ever increasing level of spatial information. A major challenge brought about by the data is the burden of processing and interpreting these datasets and extract the relevant information from it. This research aims at approaching the challenge by exploring machine learning methods especially unsupervised learning to achieve cluster density estimation and classification, and ultimately devising an efficient means leading to identification of minerals. A set of software tools have been constructed by Python to access and experiment with CRISM hyperspectral cubes selected from two specific Mars locations. A machine learning pipeline is proposed and unsupervised learning methods were implemented onto pre-processed datasets. The resulting data clusters are compared with the published ASTER spectral library and browse data products from the Planetary Data System (PDS). The result demonstrated that this approach is capable of processing the huge amount of hyperspectral data and potentially providing guidance to scientists for more detailed studies.

  4. Development and experimental test of support vector machines virtual screening method for searching Src inhibitors from large compound libraries.

    PubMed

    Han, Bucong; Ma, Xiaohua; Zhao, Ruiying; Zhang, Jingxian; Wei, Xiaona; Liu, Xianghui; Liu, Xin; Zhang, Cunlong; Tan, Chunyan; Jiang, Yuyang; Chen, Yuzong

    2012-11-23

    Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the multiple targets of multi-target kinase inhibitors in clinical uses and trials for the treatment of leukemia and other cancers. These successes and appearances of drug resistance in some patients have raised significant interest and efforts in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates. We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. SVM trained and tested by 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53%~ 95.01% inhibitors and 99.81%~ 99.90% non-inhibitors in 5-fold cross validation studies. SVM trained by 1,703 inhibitors reported before 2011 and 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem, 1,496 (0.89%) of 168 K MDDR, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors. SVM showed comparable yield and reduced false hit rates in searching large compound libraries compared to the similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors. We tested three virtual hits of the same novel scaffold from in-house chemical libraries not reported as Src inhibitor, one of which showed moderate activity. SVM may be potentially explored for searching Src inhibitors from large compound libraries at low false-hit rates.

  5. Function modeling improves the efficiency of spatial modeling using big data from remote sensing

    Treesearch

    John Hogland; Nathaniel Anderson

    2017-01-01

    Spatial modeling is an integral component of most geographic information systems (GISs). However, conventional GIS modeling techniques can require substantial processing time and storage space and have limited statistical and machine learning functionality. To address these limitations, many have parallelized spatial models using multiple coding libraries and have...

  6. The Harvard Clean Energy Project: High-throughput screening of organic photovoltaic materials using cheminformatics, machine learning, and pattern recognition

    NASA Astrophysics Data System (ADS)

    Olivares-Amaya, Roberto; Hachmann, Johannes; Amador-Bedolla, Carlos; Daly, Aidan; Jinich, Adrian; Atahan-Evrenk, Sule; Boixo, Sergio; Aspuru-Guzik, Alán

    2012-02-01

    Organic photovoltaic devices have emerged as competitors to silicon-based solar cells, currently reaching efficiencies of over 9% and offering desirable properties for manufacturing and installation. We study conjugated donor polymers for high-efficiency bulk-heterojunction photovoltaic devices with a molecular library motivated by experimental feasibility. We use quantum mechanics and a distributed computing approach to explore this vast molecular space. We will detail the screening approach starting from the generation of the molecular library, which can be easily extended to other kinds of molecular systems. We will describe the screening method for these materials which ranges from descriptor models, ubiquitous in the drug discovery community, to eventually reaching first principles quantum chemistry methods. We will present results on the statistical analysis, based principally on machine learning, specifically partial least squares and Gaussian processes. Alongside, clustering methods and the use of the hypergeometric distribution reveal moieties important for the donor materials and allow us to quantify structure-property relationships. These efforts enable us to accelerate materials discovery in organic photovoltaics through our collaboration with experimental groups.

  7. Comparison and combination of several MeSH indexing approaches

    PubMed Central

    Yepes, Antonio Jose Jimeno; Mork, James G.; Demner-Fushman, Dina; Aronson, Alan R.

    2013-01-01

    MeSH indexing of MEDLINE is becoming a more difficult task for the group of highly qualified indexing staff at the US National Library of Medicine, due to the large yearly growth of MEDLINE and the increasing size of MeSH. Since 2002, this task has been assisted by the Medical Text Indexer or MTI program. We extend previous machine learning analysis by adding a more diverse set of MeSH headings targeting examples where MTI has been shown to perform poorly. Machine learning algorithms exceed MTI’s performance on MeSH headings that are used very frequently and headings for which the indexing frequency is very low. We find that when we combine the MTI suggestions and the prediction of the learning algorithms, the performance improves compared to any single method for most of the evaluated MeSH headings. PMID:24551371

  8. Comparison and combination of several MeSH indexing approaches.

    PubMed

    Yepes, Antonio Jose Jimeno; Mork, James G; Demner-Fushman, Dina; Aronson, Alan R

    2013-01-01

    MeSH indexing of MEDLINE is becoming a more difficult task for the group of highly qualified indexing staff at the US National Library of Medicine, due to the large yearly growth of MEDLINE and the increasing size of MeSH. Since 2002, this task has been assisted by the Medical Text Indexer or MTI program. We extend previous machine learning analysis by adding a more diverse set of MeSH headings targeting examples where MTI has been shown to perform poorly. Machine learning algorithms exceed MTI's performance on MeSH headings that are used very frequently and headings for which the indexing frequency is very low. We find that when we combine the MTI suggestions and the prediction of the learning algorithms, the performance improves compared to any single method for most of the evaluated MeSH headings.

  9. Storage and retrieval of mass spectral information

    NASA Technical Reports Server (NTRS)

    Hohn, M. E.; Humberston, M. J.; Eglinton, G.

    1977-01-01

    Computer handling of mass spectra serves two main purposes: the interpretation of the occasional, problematic mass spectrum, and the identification of the large number of spectra generated in the gas-chromatographic-mass spectrometric (GC-MS) analysis of complex natural and synthetic mixtures. Methods available fall into the three categories of library search, artificial intelligence, and learning machine. Optional procedures for coding, abbreviating and filtering a library of spectra minimize time and storage requirements. Newer techniques make increasing use of probability and information theory in accessing files of mass spectral information.

  10. Audio-based deep music emotion recognition

    NASA Astrophysics Data System (ADS)

    Liu, Tong; Han, Li; Ma, Liangkai; Guo, Dongwei

    2018-05-01

    As the rapid development of multimedia networking, more and more songs are issued through the Internet and stored in large digital music libraries. However, music information retrieval on these libraries can be really hard, and the recognition of musical emotion is especially challenging. In this paper, we report a strategy to recognize the emotion contained in songs by classifying their spectrograms, which contain both the time and frequency information, with a convolutional neural network (CNN). The experiments conducted on the l000-song dataset indicate that the proposed model outperforms traditional machine learning method.

  11. Feasibility of retrofitting a university library with active workstations to reduce sedentary behavior.

    PubMed

    Maeda, Hotaka; Quartiroli, Alessandro; Vos, Paul W; Carr, Lucas J; Mahar, Matthew T

    2014-05-01

    Libraries are an inherently sedentary environment, but are an understudied setting for sedentary behavior interventions. To investigate the feasibility of incorporating portable pedal machines in a university library to reduce sedentary behaviors. The 11-week intervention targeted students at a university library. Thirteen portable pedal machines were placed in the library. Four forms of prompts (e-mail, library website, advertisement monitors, and poster) encouraging pedal machine use were employed during the first 4 weeks. Pedal machine use was measured via automatic timers on each machine and momentary time sampling. Daily library visits were measured using a gate counter. Individualized data were measured by survey. Data were collected in fall 2012 and analyzed in 2013. Mean (SD) cumulative pedal time per day was 95.5 (66.1) minutes. One or more pedal machines were observed being used 15% of the time (N=589). Pedal machines were used at least once by 7% of students (n=527). Controlled for gate count, no linear change of pedal machine use across days was found (b=-0.1 minutes, p=0.75) and the presence of the prompts did not change daily pedal time (p=0.63). Seven of eight items that assessed attitudes toward the intervention supported intervention feasibility (p<0.05). The unique non-individualized approach of retrofitting a library with pedal machines to reduce sedentary behavior seems feasible, but improvement of its effectiveness is needed. This study could inform future studies aimed at reshaping traditionally sedentary settings to improve public health. Copyright © 2014 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.

  12. Towards large-scale FAME-based bacterial species identification using machine learning techniques.

    PubMed

    Slabbinck, Bram; De Baets, Bernard; Dawyndt, Peter; De Vos, Paul

    2009-05-01

    In the last decade, bacterial taxonomy witnessed a huge expansion. The swift pace of bacterial species (re-)definitions has a serious impact on the accuracy and completeness of first-line identification methods. Consequently, back-end identification libraries need to be synchronized with the List of Prokaryotic names with Standing in Nomenclature. In this study, we focus on bacterial fatty acid methyl ester (FAME) profiling as a broadly used first-line identification method. From the BAME@LMG database, we have selected FAME profiles of individual strains belonging to the genera Bacillus, Paenibacillus and Pseudomonas. Only those profiles resulting from standard growth conditions have been retained. The corresponding data set covers 74, 44 and 95 validly published bacterial species, respectively, represented by 961, 378 and 1673 standard FAME profiles. Through the application of machine learning techniques in a supervised strategy, different computational models have been built for genus and species identification. Three techniques have been considered: artificial neural networks, random forests and support vector machines. Nearly perfect identification has been achieved at genus level. Notwithstanding the known limited discriminative power of FAME analysis for species identification, the computational models have resulted in good species identification results for the three genera. For Bacillus, Paenibacillus and Pseudomonas, random forests have resulted in sensitivity values, respectively, 0.847, 0.901 and 0.708. The random forests models outperform those of the other machine learning techniques. Moreover, our machine learning approach also outperformed the Sherlock MIS (MIDI Inc., Newark, DE, USA). These results show that machine learning proves very useful for FAME-based bacterial species identification. Besides good bacterial identification at species level, speed and ease of taxonomic synchronization are major advantages of this computational species identification strategy.

  13. The Challenges, Rewards and Pitfalls in Teaching Online Searching.

    ERIC Educational Resources Information Center

    Huq, A. M. Abdul

    Online searching is now a recognized specialization in library schools, but there are certain problems involved with offering online courses. There is a large amount of material to be covered, especially in one course. Students must learn new concepts and make them work in a machine environment. It is difficult to offer suggestions for a fruitful…

  14. Coupling machine learning with mechanistic models to study runoff production and river flow at the hillslope scale

    NASA Astrophysics Data System (ADS)

    Marçais, J.; Gupta, H. V.; De Dreuzy, J. R.; Troch, P. A. A.

    2016-12-01

    Geomorphological structure and geological heterogeneity of hillslopes are major controls on runoff responses. The diversity of hillslopes (morphological shapes and geological structures) on one hand, and the highly non linear runoff mechanism response on the other hand, make it difficult to transpose what has been learnt at one specific hillslope to another. Therefore, making reliable predictions on runoff appearance or river flow for a given hillslope is a challenge. Applying a classic model calibration (based on inverse problems technique) requires doing it for each specific hillslope and having some data available for calibration. When applied to thousands of cases it cannot always be promoted. Here we propose a novel modeling framework based on coupling process based models with data based approach. First we develop a mechanistic model, based on hillslope storage Boussinesq equations (Troch et al. 2003), able to model non linear runoff responses to rainfall at the hillslope scale. Second we set up a model database, representing thousands of non calibrated simulations. These simulations investigate different hillslope shapes (real ones obtained by analyzing 5m digital elevation model of Brittany and synthetic ones), different hillslope geological structures (i.e. different parametrizations) and different hydrologic forcing terms (i.e. different infiltration chronicles). Then, we use this model library to train a machine learning model on this physically based database. Machine learning model performance is then assessed by a classic validating phase (testing it on new hillslopes and comparing machine learning with mechanistic outputs). Finally we use this machine learning model to learn what are the hillslope properties controlling runoffs. This methodology will be further tested combining synthetic datasets with real ones.

  15. The applications of machine learning algorithms in the modeling of estrogen-like chemicals.

    PubMed

    Liu, Huanxiang; Yao, Xiaojun; Gramatica, Paola

    2009-06-01

    Increasing concern is being shown by the scientific community, government regulators, and the public about endocrine-disrupting chemicals that, in the environment, are adversely affecting human and wildlife health through a variety of mechanisms, mainly estrogen receptor-mediated mechanisms of toxicity. Because of the large number of such chemicals in the environment, there is a great need for an effective means of rapidly assessing endocrine-disrupting activity in the toxicology assessment process. When faced with the challenging task of screening large libraries of molecules for biological activity, the benefits of computational predictive models based on quantitative structure-activity relationships to identify possible estrogens become immediately obvious. Recently, in order to improve the accuracy of prediction, some machine learning techniques were introduced to build more effective predictive models. In this review we will focus our attention on some recent advances in the use of these methods in modeling estrogen-like chemicals. The advantages and disadvantages of the machine learning algorithms used in solving this problem, the importance of the validation and performance assessment of the built models as well as their applicability domains will be discussed.

  16. Analyzing microtomography data with Python and the scikit-image library.

    PubMed

    Gouillart, Emmanuelle; Nunez-Iglesias, Juan; van der Walt, Stéfan

    2017-01-01

    The exploration and processing of images is a vital aspect of the scientific workflows of many X-ray imaging modalities. Users require tools that combine interactivity, versatility, and performance. scikit-image is an open-source image processing toolkit for the Python language that supports a large variety of file formats and is compatible with 2D and 3D images. The toolkit exposes a simple programming interface, with thematic modules grouping functions according to their purpose, such as image restoration, segmentation, and measurements. scikit-image users benefit from a rich scientific Python ecosystem that contains many powerful libraries for tasks such as visualization or machine learning. scikit-image combines a gentle learning curve, versatile image processing capabilities, and the scalable performance required for the high-throughput analysis of X-ray imaging data.

  17. Learning to play Go using recursive neural networks.

    PubMed

    Wu, Lin; Baldi, Pierre

    2008-11-01

    Go is an ancient board game that poses unique opportunities and challenges for artificial intelligence. Currently, there are no computer Go programs that can play at the level of a good human player. However, the emergence of large repositories of games is opening the door for new machine learning approaches to address this challenge. Here we develop a machine learning approach to Go, and related board games, focusing primarily on the problem of learning a good evaluation function in a scalable way. Scalability is essential at multiple levels, from the library of local tactical patterns, to the integration of patterns across the board, to the size of the board itself. The system we propose is capable of automatically learning the propensity of local patterns from a library of games. Propensity and other local tactical information are fed into recursive neural networks, derived from a probabilistic Bayesian network architecture. The recursive neural networks in turn integrate local information across the board in all four cardinal directions and produce local outputs that represent local territory ownership probabilities. The aggregation of these probabilities provides an effective strategic evaluation function that is an estimate of the expected area at the end, or at various other stages, of the game. Local area targets for training can be derived from datasets of games played by human players. In this approach, while requiring a learning time proportional to N(4), skills learned on a board of size N(2) can easily be transferred to boards of other sizes. A system trained using only 9 x 9 amateur game data performs surprisingly well on a test set derived from 19 x 19 professional game data. Possible directions for further improvements are briefly discussed.

  18. In Silico Identification Software (ISIS): A Machine Learning Approach to Tandem Mass Spectral Identification of Lipids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kangas, Lars J.; Metz, Thomas O.; Isaac, Georgis

    2012-05-15

    Liquid chromatography-mass spectrometry-based metabolomics has gained importance in the life sciences, yet it is not supported by software tools for high throughput identification of metabolites based on their fragmentation spectra. An algorithm (ISIS: in silico identification software) and its implementation are presented and show great promise in generating in silico spectra of lipids for the purpose of structural identification. Instead of using chemical reaction rate equations or rules-based fragmentation libraries, the algorithm uses machine learning to find accurate bond cleavage rates in a mass spectrometer employing collision-induced dissocia-tion tandem mass spectrometry. A preliminary test of the algorithm with 45 lipidsmore » from a subset of lipid classes shows both high sensitivity and specificity.« less

  19. Application of artificial neural networks to identify equilibration in computer simulations

    NASA Astrophysics Data System (ADS)

    Leibowitz, Mitchell H.; Miller, Evan D.; Henry, Michael M.; Jankowski, Eric

    2017-11-01

    Determining which microstates generated by a thermodynamic simulation are representative of the ensemble for which sampling is desired is a ubiquitous, underspecified problem. Artificial neural networks are one type of machine learning algorithm that can provide a reproducible way to apply pattern recognition heuristics to underspecified problems. Here we use the open-source TensorFlow machine learning library and apply it to the problem of identifying which hypothetical observation sequences from a computer simulation are “equilibrated” and which are not. We generate training populations and test populations of observation sequences with embedded linear and exponential correlations. We train a two-neuron artificial network to distinguish the correlated and uncorrelated sequences. We find that this simple network is good enough for > 98% accuracy in identifying exponentially-decaying energy trajectories from molecular simulations.

  20. Proceedings of the Conference on Machine-Readable Catalog Copy (3rd, Library of Congress, February 25, 1966).

    ERIC Educational Resources Information Center

    Library of Congress, Washington, DC.

    A conference was held to permit a discussion between the libraries that will participate in the Library of Congress machine-readable cataloging (MARC) pilot project. The MARC pilot will provide an opportunity for the Library of Congress to assess the effect which data conversion places on the Library's normal processing procedures; the suitability…

  1. Development and experimental test of support vector machines virtual screening method for searching Src inhibitors from large compound libraries

    PubMed Central

    2012-01-01

    Background Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the multiple targets of multi-target kinase inhibitors in clinical uses and trials for the treatment of leukemia and other cancers. These successes and appearances of drug resistance in some patients have raised significant interest and efforts in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates. Results We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. SVM trained and tested by 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53%~ 95.01% inhibitors and 99.81%~ 99.90% non-inhibitors in 5-fold cross validation studies. SVM trained by 1,703 inhibitors reported before 2011 and 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem, 1,496 (0.89%) of 168 K MDDR, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors. Conclusions SVM showed comparable yield and reduced false hit rates in searching large compound libraries compared to the similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors. We tested three virtual hits of the same novel scaffold from in-house chemical libraries not reported as Src inhibitor, one of which showed moderate activity. SVM may be potentially explored for searching Src inhibitors from large compound libraries at low false-hit rates. PMID:23173901

  2. LEGION: Lightweight Expandable Group of Independently Operating Nodes

    NASA Technical Reports Server (NTRS)

    Burl, Michael C.

    2012-01-01

    LEGION is a lightweight C-language software library that enables distributed asynchronous data processing with a loosely coupled set of compute nodes. Loosely coupled means that a node can offer itself in service to a larger task at any time and can withdraw itself from service at any time, provided it is not actively engaged in an assignment. The main program, i.e., the one attempting to solve the larger task, does not need to know up front which nodes will be available, how many nodes will be available, or at what times the nodes will be available, which is normally the case in a "volunteer computing" framework. The LEGION software accomplishes its goals by providing message-based, inter-process communication similar to MPI (message passing interface), but without the tight coupling requirements. The software is lightweight and easy to install as it is written in standard C with no exotic library dependencies. LEGION has been demonstrated in a challenging planetary science application in which a machine learning system is used in closed-loop fashion to efficiently explore the input parameter space of a complex numerical simulation. The machine learning system decides which jobs to run through the simulator; then, through LEGION calls, the system farms those jobs out to a collection of compute nodes, retrieves the job results as they become available, and updates a predictive model of how the simulator maps inputs to outputs. The machine learning system decides which new set of jobs would be most informative to run given the results so far; this basic loop is repeated until sufficient insight into the physical system modeled by the simulator is obtained.

  3. Machine learning patterns for neuroimaging-genetic studies in the cloud.

    PubMed

    Da Mota, Benoit; Tudoran, Radu; Costan, Alexandru; Varoquaux, Gaël; Brasche, Goetz; Conrod, Patricia; Lemaitre, Herve; Paus, Tomas; Rietschel, Marcella; Frouin, Vincent; Poline, Jean-Baptiste; Antoniu, Gabriel; Thirion, Bertrand

    2014-01-01

    Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or brain pathologies risk factors. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a 2 weeks deployment on hundreds of virtual machines.

  4. Machine-Learning-Assisted Approach for Discovering Novel Inhibitors Targeting Bromodomain-Containing Protein 4.

    PubMed

    Xing, Jing; Lu, Wenchao; Liu, Rongfeng; Wang, Yulan; Xie, Yiqian; Zhang, Hao; Shi, Zhe; Jiang, Hao; Liu, Yu-Chih; Chen, Kaixian; Jiang, Hualiang; Luo, Cheng; Zheng, Mingyue

    2017-07-24

    Bromodomain-containing protein 4 (BRD4) is implicated in the pathogenesis of a number of different cancers, inflammatory diseases and heart failure. Much effort has been dedicated toward discovering novel scaffold BRD4 inhibitors (BRD4is) with different selectivity profiles and potential antiresistance properties. Structure-based drug design (SBDD) and virtual screening (VS) are the most frequently used approaches. Here, we demonstrate a novel, structure-based VS approach that uses machine-learning algorithms trained on the priori structure and activity knowledge to predict the likelihood that a compound is a BRD4i based on its binding pattern with BRD4. In addition to positive experimental data, such as X-ray structures of BRD4-ligand complexes and BRD4 inhibitory potencies, negative data such as false positives (FPs) identified from our earlier ligand screening results were incorporated into our knowledge base. We used the resulting data to train a machine-learning model named BRD4LGR to predict the BRD4i-likeness of a compound. BRD4LGR achieved a 20-30% higher AUC-ROC than that of Glide using the same test set. When conducting in vitro experiments against a library of previously untested, commercially available organic compounds, the second round of VS using BRD4LGR generated 15 new BRD4is. Moreover, inverting the machine-learning model provided easy access to structure-activity relationship (SAR) interpretation for hit-to-lead optimization.

  5. SkData: data sets and algorithm evaluation protocols in Python

    NASA Astrophysics Data System (ADS)

    Bergstra, James; Pinto, Nicolas; Cox, David D.

    2015-01-01

    Machine learning benchmark data sets come in all shapes and sizes, whereas classification algorithms assume sanitized input, such as (x, y) pairs with vector-valued input x and integer class label y. Researchers and practitioners know all too well how tedious it can be to get from the URL of a new data set to a NumPy ndarray suitable for e.g. pandas or sklearn. The SkData library handles that work for a growing number of benchmark data sets (small and large) so that one-off in-house scripts for downloading and parsing data sets can be replaced with library code that is reliable, community-tested, and documented. The SkData library also introduces an open-ended formalization of training and testing protocols that facilitates direct comparison with published research. This paper describes the usage and architecture of the SkData library.

  6. Optimizing the Performance of Radionuclide Identification Software in the Hunt for Nuclear Security Threats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fotion, Katherine A.

    2016-08-18

    The Radionuclide Analysis Kit (RNAK), my team’s most recent nuclide identification software, is entering the testing phase. A question arises: will removing rare nuclides from the software’s library improve its overall performance? An affirmative response indicates fundamental errors in the software’s framework, while a negative response confirms the effectiveness of the software’s key machine learning algorithms. After thorough testing, I found that the performance of RNAK cannot be improved with the library choice effect, thus verifying the effectiveness of RNAK’s algorithms—multiple linear regression, Bayesian network using the Viterbi algorithm, and branch and bound search.

  7. New Learning Method of a Lecture of ‘Machine Fabrication’ by Self-study with Investigation and Presentation Incorporated

    NASA Astrophysics Data System (ADS)

    Kasuga, Yukio

    A new teaching method was developed in learning ‘machine fabrication’ for the undergraduate students. This consists of a few times of lectures, grouping, decision of industrial products which each group wants to investigate, investigation work by library books and internet, arrangement of data containing characteristics of the products, employed materials and processing methods, presentation, discussions and revision followed by another presentation. This new method is derived from one of the Finland‧s way of primary school education. Their way of education is believed to have boosted up to the top ranking in PISA tests by OECD. After starting the new way of learning, students have fresh impressions on this lesson, especially for self-study, the way of investigation, collaborate work and presentation. Also, after four years of implementation, some improvements have been made including less use of internet, and determination of products and fabricating methods in advance which should be investigated. By this, students‧ lecture assessment shows further encouraging results.

  8. GeoDeepDive: Towards a Machine Reading-Ready Digital Library and Information Integration Resource

    NASA Astrophysics Data System (ADS)

    Husson, J. M.; Peters, S. E.; Livny, M.; Ross, I.

    2015-12-01

    Recent developments in machine reading and learning approaches to text and data mining hold considerable promise for accelerating the pace and quality of literature-based data synthesis, but these advances have outpaced even basic levels of access to the published literature. For many geoscience domains, particularly those based on physical samples and field-based descriptions, this limitation is significant. Here we describe a general infrastructure to support published literature-based machine reading and learning approaches to information integration and knowledge base creation. This infrastructure supports rate-controlled automated fetching of original documents, along with full bibliographic citation metadata, from remote servers, the secure storage of original documents, and the utilization of considerable high-throughput computing resources for the pre-processing of these documents by optical character recognition, natural language parsing, and other document annotation and parsing software tools. New tools and versions of existing tools can be automatically deployed against original documents when they are made available. The products of these tools (text/XML files) are managed by MongoDB and are available for use in data extraction applications. Basic search and discovery functionality is provided by ElasticSearch, which is used to identify documents of potential relevance to a given data extraction task. Relevant files derived from the original documents are then combined into basic starting points for application building; these starting points are kept up-to-date as new relevant documents are incorporated into the digital library. Currently, our digital library stores contains more than 360K documents supplied by Elsevier and the USGS and we are actively seeking additional content providers. By focusing on building a dependable infrastructure to support the retrieval, storage, and pre-processing of published content, we are establishing a foundation for complex, and continually improving, information integration and data extraction applications. We have developed one such application, which we present as an example, and invite new collaborations to develop other such applications.

  9. Constructing and Classifying Email Networks from Raw Forensic Images

    DTIC Science & Technology

    2016-09-01

    data mining for sequence and pattern mining ; in medical imaging for image segmentation; and in computer vision for object recognition” [28]. 2.3.1...machine learning and data mining suite that is written in Python. It provides a platform for experiment selection, recommendation systems, and...predictivemod- eling. The Orange library is a hierarchically-organized toolbox of data mining components. Data filtering and probability assessment are at the

  10. Current and Future Applications of Machine Learning for the US Army

    DTIC Science & Technology

    2018-04-13

    designing from the unwieldy application of the first principles of flight controls, aerodynamics, blade propulsion, and so on, the designers turned...when the number of features runs into millions can become challenging. To overcome these issues, regularization techniques have been developed which...and compiled to run efficiently on either CPU or GPU architectures. 5) Keras63 is a library that contains numerous implementations of commonly used

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Painter, J.; McCormick, P.; Krogh, M.

    This paper presents the ACL (Advanced Computing Lab) Message Passing Library. It is a high throughput, low latency communications library, based on Thinking Machines Corp.`s CMMD, upon which message passing applications can be built. The library has been implemented on the Cray T3D, Thinking Machines CM-5, SGI workstations, and on top of PVM.

  12. The Librarian Leading the Machine: A Reassessment of Library Instruction Methods

    ERIC Educational Resources Information Center

    Greer, Katie; Hess, Amanda Nichols; Kraemer, Elizabeth W.

    2016-01-01

    This article builds on the 2007 College and Research Libraries article, "The Librarian, the Machine, or a Little of Both." Since that time, Oakland University Libraries implemented changes to its instruction program that reflect larger trends in teaching and assessment throughout the profession; following these revisions, librarians…

  13. Fax for Library Services. ERIC Digest.

    ERIC Educational Resources Information Center

    Spitzer, Kathleen L.

    This digest discusses how libraries of all types are using the facsimile (or "fax") machine to meet users' information needs. A definition of facsimile technology includes the components of a fax machine, the four types of fax machines, and the recent development of the "fax board," which allows a computer to transmit…

  14. Unresolved Galaxy Classifier for ESA/Gaia mission: Support Vector Machines approach

    NASA Astrophysics Data System (ADS)

    Bellas-Velidis, Ioannis; Kontizas, Mary; Dapergolas, Anastasios; Livanou, Evdokia; Kontizas, Evangelos; Karampelas, Antonios

    A software package Unresolved Galaxy Classifier (UGC) is being developed for the ground-based pipeline of ESA's Gaia mission. It aims to provide an automated taxonomic classification and specific parameters estimation analyzing Gaia BP/RP instrument low-dispersion spectra of unresolved galaxies. The UGC algorithm is based on a supervised learning technique, the Support Vector Machines (SVM). The software is implemented in Java as two separate modules. An offline learning module provides functions for SVM-models training. Once trained, the set of models can be repeatedly applied to unknown galaxy spectra by the pipeline's application module. A library of galaxy models synthetic spectra, simulated for the BP/RP instrument, is used to train and test the modules. Science tests show a very good classification performance of UGC and relatively good regression performance, except for some of the parameters. Possible approaches to improve the performance are discussed.

  15. Compressive sensing based machine learning strategy for characterizing the flow around a cylinder with limited pressure measurements

    NASA Astrophysics Data System (ADS)

    Bright, Ido; Lin, Guang; Kutz, J. Nathan

    2013-12-01

    Compressive sensing is used to determine the flow characteristics around a cylinder (Reynolds number and pressure/flow field) from a sparse number of pressure measurements on the cylinder. Using a supervised machine learning strategy, library elements encoding the dimensionally reduced dynamics are computed for various Reynolds numbers. Convex L1 optimization is then used with a limited number of pressure measurements on the cylinder to reconstruct, or decode, the full pressure field and the resulting flow field around the cylinder. Aside from the highly turbulent regime (large Reynolds number) where only the Reynolds number can be identified, accurate reconstruction of the pressure field and Reynolds number is achieved. The proposed data-driven strategy thus achieves encoding of the fluid dynamics using the L2 norm, and robust decoding (flow field reconstruction) using the sparsity promoting L1 norm.

  16. Comparison of dissimilarity measures for cluster analysis of X-ray diffraction data from combinatorial libraries

    NASA Astrophysics Data System (ADS)

    Iwasaki, Yuma; Kusne, A. Gilad; Takeuchi, Ichiro

    2017-12-01

    Machine learning techniques have proven invaluable to manage the ever growing volume of materials research data produced as developments continue in high-throughput materials simulation, fabrication, and characterization. In particular, machine learning techniques have been demonstrated for their utility in rapidly and automatically identifying potential composition-phase maps from structural data characterization of composition spread libraries, enabling rapid materials fabrication-structure-property analysis and functional materials discovery. A key issue in development of an automated phase-diagram determination method is the choice of dissimilarity measure, or kernel function. The desired measure reduces the impact of confounding structural data issues on analysis performance. The issues include peak height changes and peak shifting due to lattice constant change as a function of composition. In this work, we investigate the choice of dissimilarity measure in X-ray diffraction-based structure analysis and the choice of measure's performance impact on automatic composition-phase map determination. Nine dissimilarity measures are investigated for their impact in analyzing X-ray diffraction patterns for a Fe-Co-Ni ternary alloy composition spread. The cosine, Pearson correlation coefficient, and Jensen-Shannon divergence measures are shown to provide the best performance in the presence of peak height change and peak shifting (due to lattice constant change) when the magnitude of peak shifting is unknown. With prior knowledge of the maximum peak shifting, dynamic time warping in a normalized constrained mode provides the best performance. This work also serves to demonstrate a strategy for rapid analysis of a large number of X-ray diffraction patterns in general beyond data from combinatorial libraries.

  17. IDEPI: rapid prediction of HIV-1 antibody epitopes and other phenotypic features from sequence data using a flexible machine learning platform.

    PubMed

    Hepler, N Lance; Scheffler, Konrad; Weaver, Steven; Murrell, Ben; Richman, Douglas D; Burton, Dennis R; Poignard, Pascal; Smith, Davey M; Kosakovsky Pond, Sergei L

    2014-09-01

    Since its identification in 1983, HIV-1 has been the focus of a research effort unprecedented in scope and difficulty, whose ultimate goals--a cure and a vaccine--remain elusive. One of the fundamental challenges in accomplishing these goals is the tremendous genetic variability of the virus, with some genes differing at as many as 40% of nucleotide positions among circulating strains. Because of this, the genetic bases of many viral phenotypes, most notably the susceptibility to neutralization by a particular antibody, are difficult to identify computationally. Drawing upon open-source general-purpose machine learning algorithms and libraries, we have developed a software package IDEPI (IDentify EPItopes) for learning genotype-to-phenotype predictive models from sequences with known phenotypes. IDEPI can apply learned models to classify sequences of unknown phenotypes, and also identify specific sequence features which contribute to a particular phenotype. We demonstrate that IDEPI achieves performance similar to or better than that of previously published approaches on four well-studied problems: finding the epitopes of broadly neutralizing antibodies (bNab), determining coreceptor tropism of the virus, identifying compartment-specific genetic signatures of the virus, and deducing drug-resistance associated mutations. The cross-platform Python source code (released under the GPL 3.0 license), documentation, issue tracking, and a pre-configured virtual machine for IDEPI can be found at https://github.com/veg/idepi.

  18. Rationalizing the chemical space of protein-protein interaction inhibitors.

    PubMed

    Sperandio, Olivier; Reynès, Christelle H; Camproux, Anne-Claude; Villoutreix, Bruno O

    2010-03-01

    Protein-protein interactions (PPIs) are one of the next major classes of therapeutic targets, although they are too intricate to tackle with standard approaches. This is due, in part, to the inadequacy of today's chemical libraries. However, the emergence of a growing number of experimentally validated inhibitors of PPIs (i-PPIs) allows drug designers to use chemoinformatics and machine learning technologies to unravel the nature of the chemical space covered by the reported compounds. Key characteristics of i-PPIs can then be revealed and highlight the importance of specific shapes and/or aromatic bonds, enabling the design of i-PPI-enriched focused libraries and, therefore, of cost-effective screening strategies. 2009 Elsevier Ltd. All rights reserved.

  19. Natural product-like virtual libraries: recursive atom-based enumeration.

    PubMed

    Yu, Melvin J

    2011-03-28

    A new molecular enumerator is described that allows chemically and architecturally diverse sets of natural product-like and drug-like structures to be generated from a core structure as simple as a single carbon atom or as complex as a polycyclic ring system. Integrated with a rudimentary machine-learning algorithm, the enumerator has the ability to assemble biased virtual libraries enriched in compounds predicted to meet target criteria. The ability to dynamically generate relatively small focused libraries in a recursive manner could reduce the computational time and infrastructure necessary to construct and manage extremely large static libraries. Depending on enumeration conditions, natural product-like structures can be produced with a wide range of heterocyclic and alicyclic ring assemblies. Because natural products represent a proven source of validated structures for identifying and designing new drug candidates, mimicking the structural and topological diversity found in nature with a dynamic set of virtual natural product-like compounds may facilitate the creation of new ideas for novel, biologically relevant lead structures in areas of uncharted chemical space.

  20. PhD7Faster: predicting clones propagating faster from the Ph.D.-7 phage display peptide library.

    PubMed

    Ru, Beibei; 't Hoen, Peter A C; Nie, Fulei; Lin, Hao; Guo, Feng-Biao; Huang, Jian

    2014-02-01

    Phage display can rapidly discover peptides binding to any given target; thus, it has been widely used in basic and applied research. Each round of panning consists of two basic processes: Selection and amplification. However, recent studies have showed that the amplification step would decrease the diversity of phage display libraries due to different propagation capacity of phage clones. This may induce phages with growth advantage rather than specific affinity to appear in the final experimental results. The peptides displayed by such phages are termed as propagation-related target-unrelated peptides (PrTUPs). They would mislead further analysis and research if not removed. In this paper, we describe PhD7Faster, an ensemble predictor based on support vector machine (SVM) for predicting clones with growth advantage from the Ph.D.-7 phage display peptide library. By using reduced dipeptide composition (ReDPC) as features, an accuracy (Acc) of 79.67% and a Matthews correlation coefficient (MCC) of 0.595 were achieved in 5-fold cross-validation. In addition, the SVM-based model was demonstrated to perform better than several representative machine learning algorithms. We anticipate that PhD7Faster can assist biologists to exclude potential PrTUPs and accelerate the finding of specific binders from the popular Ph.D.-7 library. The web server of PhD7Faster can be freely accessed at http://immunet.cn/sarotup/cgi-bin/PhD7Faster.pl.

  1. Application of structured support vector machine backpropagation to a convolutional neural network for human pose estimation.

    PubMed

    Witoonchart, Peerajak; Chongstitvatana, Prabhas

    2017-08-01

    In this study, for the first time, we show how to formulate a structured support vector machine (SSVM) as two layers in a convolutional neural network, where the top layer is a loss augmented inference layer and the bottom layer is the normal convolutional layer. We show that a deformable part model can be learned with the proposed structured SVM neural network by backpropagating the error of the deformable part model to the convolutional neural network. The forward propagation calculates the loss augmented inference and the backpropagation calculates the gradient from the loss augmented inference layer to the convolutional layer. Thus, we obtain a new type of convolutional neural network called an Structured SVM convolutional neural network, which we applied to the human pose estimation problem. This new neural network can be used as the final layers in deep learning. Our method jointly learns the structural model parameters and the appearance model parameters. We implemented our method as a new layer in the existing Caffe library. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Machine Learning Interface for Medical Image Analysis.

    PubMed

    Zhang, Yi C; Kagen, Alexander C

    2017-10-01

    TensorFlow is a second-generation open-source machine learning software library with a built-in framework for implementing neural networks in wide variety of perceptual tasks. Although TensorFlow usage is well established with computer vision datasets, the TensorFlow interface with DICOM formats for medical imaging remains to be established. Our goal is to extend the TensorFlow API to accept raw DICOM images as input; 1513 DaTscan DICOM images were obtained from the Parkinson's Progression Markers Initiative (PPMI) database. DICOM pixel intensities were extracted and shaped into tensors, or n-dimensional arrays, to populate the training, validation, and test input datasets for machine learning. A simple neural network was constructed in TensorFlow to classify images into normal or Parkinson's disease groups. Training was executed over 1000 iterations for each cross-validation set. The gradient descent optimization and Adagrad optimization algorithms were used to minimize cross-entropy between the predicted and ground-truth labels. Cross-validation was performed ten times to produce a mean accuracy of 0.938 ± 0.047 (95 % CI 0.908-0.967). The mean sensitivity was 0.974 ± 0.043 (95 % CI 0.947-1.00) and mean specificity was 0.822 ± 0.207 (95 % CI 0.694-0.950). We extended the TensorFlow API to enable DICOM compatibility in the context of DaTscan image analysis. We implemented a neural network classifier that produces diagnostic accuracies on par with excellent results from previous machine learning models. These results indicate the potential role of TensorFlow as a useful adjunct diagnostic tool in the clinical setting.

  3. Integrated machine learning, molecular docking and 3D-QSAR based approach for identification of potential inhibitors of trypanosomal N-myristoyltransferase.

    PubMed

    Singh, Nidhi; Shah, Priyanka; Dwivedi, Hemlata; Mishra, Shikha; Tripathi, Renu; Sahasrabuddhe, Amogh A; Siddiqi, Mohammad Imran

    2016-11-15

    N-Myristoyltransferase (NMT) catalyzes the transfer of myristate to the amino-terminal glycine of a subset of proteins, a co-translational modification involved in trafficking substrate proteins to membrane locations, stabilization and protein-protein interactions. It is a studied and validated pre-clinical drug target for fungal and parasitic infections. In the present study, a machine learning approach, docking studies and CoMFA analysis have been integrated with the objective of translation of knowledge into a pipelined workflow towards the identification of putative hits through the screening of large compound libraries. In the proposed pipeline, the reported parasitic NMT inhibitors have been used to develop predictive machine learning classification models. Simultaneously, a TbNMT complex model was generated to establish the relationship between the binding mode of the inhibitors for LmNMT and TbNMT through molecular dynamics simulation studies. A 3D-QSAR model was developed and used to predict the activity of the proposed hits in the subsequent step. The hits classified as active based on the machine learning model were assessed as the potential anti-trypanosomal NMT inhibitors through molecular docking studies, predicted activity using a QSAR model and visual inspection. In the final step, the proposed pipeline was validated through in vitro experiments. A total of seven hits have been proposed and tested in vitro for evaluation of dual inhibitory activity against Leishmania donovani and Trypanosoma brucei. Out of these five compounds showed significant inhibition against both of the organisms. The common topmost active compound SEW04173 belongs to a pyrazole carboxylate scaffold and is anticipated to enrich the chemical space with enhanced potency through optimization.

  4. Study of an Audio Playback Machine Storage, Distribution, and Repair System. Options for Machine Operation. Study II, Part 1, Phase 2, Final Report.

    ERIC Educational Resources Information Center

    ManTech Technical Services Corp., Fairfax, VA.

    This report presents the results of a management study of audio playback equipment operations conducted by the National Library Service, Library of Congress, its associated network of state and local machine lending agencies (MLA), and other parties that play a role in current operations. The objectives were to document current operations,…

  5. A Library approach to establish an Educational Data Curation Framework (EDCF) that supports K-12 data science sustainability

    NASA Astrophysics Data System (ADS)

    Branch, B. D.; Wegner, K.; Smith, S.; Schulze, D. G.; Merwade, V.; Jung, J.; Bessenbacher, A.

    2013-12-01

    It has been the tradition of the libraries to support literacy. Now in the realm of Executive Order, Making Open and Machine Readable the New Default for Government Information, May 9, 2013, the library has the responsibility to support geospatial data, big data, earth science data or cyber infrastructure data that may support STEM for educational pipeline stimulation. (Such information can be found at http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-.) Provided is an Educational Data Curation Framework (EDCF) that has been initiated in Purdue research, geospatial data service engagement and outreach endeavors for future consideration and application to augment such data science and climate literacy needs of future global citizens. In addition, this endorsement of this framework by the GLOBE program may facilitate further EDCF implementations, discussion points and prototypes for libraries. In addition, the ECDF will support teacher-led, placed-based and large scale climate or earth science learning systems where such knowledge transfer of climate or earth science data is effectively transferred from higher education research of cyberinfrastructure use such as, NOAA or NASA, to K-12 teachers and school systems. The purpose of this effort is to establish best practices for sustainable K-12 data science delivery system or GLOBE-provided system (http://vis.globe.gov/GLOBE/) where libraries manage the data curation and data appropriateness as data reference experts for such digital data. Here, the Purdue University Libraries' GIS department works to support soils, LIDAR and water science data experiences to support teacher training for an EDCF development effort. Lastly, it should be noted that the interdisciplinary collaboration and demonstration of library supported outreach partners and national organizations such the GLOBE program may best foster EDCF development. This trend in data science where library roles may emerge is consistent with NASA's wavelength program at http://nasawavelength.org. Mr. Steven Smith, an outreach coordinator, led this Purdue University outreach activity involving the GLOBE program with support by the Purdue University Libraries GIS department.

  6. TH-CD-206-05: Machine-Learning Based Segmentation of Organs at Risks for Head and Neck Radiotherapy Planning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ibragimov, B; Pernus, F; Strojan, P

    Purpose: Accurate and efficient delineation of tumor target and organs-at-risks is essential for the success of radiotherapy. In reality, despite of decades of intense research efforts, auto-segmentation has not yet become clinical practice. In this study, we present, for the first time, a deep learning-based classification algorithm for autonomous segmentation in head and neck (HaN) treatment planning. Methods: Fifteen HN datasets of CT, MR and PET images with manual annotation of organs-at-risk (OARs) including spinal cord, brainstem, optic nerves, chiasm, eyes, mandible, tongue, parotid glands were collected and saved in a library of plans. We also have ten super-resolution MRmore » images of the tongue area, where the genioglossus and inferior longitudinalis tongue muscles are defined as organs of interest. We applied the concepts of random forest- and deep learning-based object classification for automated image annotation with the aim of using machine learning to facilitate head and neck radiotherapy planning process. In this new paradigm of segmentation, random forests were used for landmark-assisted segmentation of super-resolution MR images. Alternatively to auto-segmentation with random forest-based landmark detection, deep convolutional neural networks were developed for voxel-wise segmentation of OARs in single and multi-modal images. The network consisted of three pairs of convolution and pooing layer, one RuLU layer and a softmax layer. Results: We present a comprehensive study on using machine learning concepts for auto-segmentation of OARs and tongue muscles for the HaN radiotherapy planning. An accuracy of 81.8% in terms of Dice coefficient was achieved for segmentation of genioglossus and inferior longitudinalis tongue muscles. Preliminary results of OARs regimentation also indicate that deep-learning afforded an unprecedented opportunities to improve the accuracy and robustness of radiotherapy planning. Conclusion: A novel machine learning framework has been developed for image annotation and structure segmentation. Our results indicate the great potential of deep learning in radiotherapy treatment planning.« less

  7. On-the-fly machine-learning for high-throughput experiments: search for rare-earth-free permanent magnets

    PubMed Central

    Kusne, Aaron Gilad; Gao, Tieren; Mehta, Apurva; Ke, Liqin; Nguyen, Manh Cuong; Ho, Kai-Ming; Antropov, Vladimir; Wang, Cai-Zhuang; Kramer, Matthew J.; Long, Christian; Takeuchi, Ichiro

    2014-01-01

    Advanced materials characterization techniques with ever-growing data acquisition speed and storage capabilities represent a challenge in modern materials science, and new procedures to quickly assess and analyze the data are needed. Machine learning approaches are effective in reducing the complexity of data and rapidly homing in on the underlying trend in multi-dimensional data. Here, we show that by employing an algorithm called the mean shift theory to a large amount of diffraction data in high-throughput experimentation, one can streamline the process of delineating the structural evolution across compositional variations mapped on combinatorial libraries with minimal computational cost. Data collected at a synchrotron beamline are analyzed on the fly, and by integrating experimental data with the inorganic crystal structure database (ICSD), we can substantially enhance the accuracy in classifying the structural phases across ternary phase spaces. We have used this approach to identify a novel magnetic phase with enhanced magnetic anisotropy which is a candidate for rare-earth free permanent magnet. PMID:25220062

  8. Predicting DPP-IV inhibitors with machine learning approaches

    NASA Astrophysics Data System (ADS)

    Cai, Jie; Li, Chanjuan; Liu, Zhihong; Du, Jiewen; Ye, Jiming; Gu, Qiong; Xu, Jun

    2017-04-01

    Dipeptidyl peptidase IV (DPP-IV) is a promising Type 2 diabetes mellitus (T2DM) drug target. DPP-IV inhibitors prolong the action of glucagon-like peptide-1 (GLP-1) and gastric inhibitory peptide (GIP), improve glucose homeostasis without weight gain, edema, and hypoglycemia. However, the marketed DPP-IV inhibitors have adverse effects such as nasopharyngitis, headache, nausea, hypersensitivity, skin reactions and pancreatitis. Therefore, it is still expected for novel DPP-IV inhibitors with minimal adverse effects. The scaffolds of existing DPP-IV inhibitors are structurally diversified. This makes it difficult to build virtual screening models based upon the known DPP-IV inhibitor libraries using conventional QSAR approaches. In this paper, we report a new strategy to predict DPP-IV inhibitors with machine learning approaches involving naïve Bayesian (NB) and recursive partitioning (RP) methods. We built 247 machine learning models based on 1307 known DPP-IV inhibitors with optimized molecular properties and topological fingerprints as descriptors. The overall predictive accuracies of the optimized models were greater than 80%. An external test set, composed of 65 recently reported compounds, was employed to validate the optimized models. The results demonstrated that both NB and RP models have a good predictive ability based on different combinations of descriptors. Twenty "good" and twenty "bad" structural fragments for DPP-IV inhibitors can also be derived from these models for inspiring the new DPP-IV inhibitor scaffold design.

  9. Instructional Videos for Unsupervised Harvesting and Learning of Action Examples

    DTIC Science & Technology

    2014-11-03

    collection of image or video anno - tations has been tackled in different ways, but most existing methods still require a human in the loop. The...the views of ARO and NSF. 7. REFERENCES [1] C.-C. Chang and C.- J . Lin. LIBSVM: A library for support vector machines. In ACM Transactions on...feature encoding methods. In BMVC, 2011. [3] J . Chen, Y. Cui, G. Ye, D. Liu, and S.-F. Chang. Event-driven semantic concept discovery by exploiting

  10. Prediction and Identification of Krüppel-Like Transcription Factors by Machine Learning Method.

    PubMed

    Liao, Zhijun; Wang, Xinrui; Chen, Xingyong; Zou, Quan

    2017-01-01

    The Krüppel-like factors (KLFs) are a family of containing Zn finger(ZF) motif transcription factors with 18 members in human genome, among them, KLF18 is predicted by bioinformatics. KLFs possess various physiological function involving in a number of cancers and other diseases. Here we perform a binary-class classification of KLFs and non-KLFs by machine learning methods. The protein sequences of KLFs and non-KLFs were searched from UniProt and randomly separate them into training dataset(containing positive and negative sequences) and test dataset(containing only negative sequences), after extracting the 188-dimensional(188D) feature vectors we carry out category with four classifiers(GBDT, libSVM, RF, and k-NN). On the human KLFs, we further dig into the evolutionary relationship and motif distribution, and finally we analyze the conserved amino acid residue of three zinc fingers. The classifier model from training dataset were well constructed, and the highest specificity(Sp) was 99.83% from a library for support vector machine(libSVM) and all the correctly classified rates were over 70% for 10-fold cross-validation on test dataset. The 18 human KLFs can be further divided into 7 groups and the zinc finger domains were located at the carboxyl terminus, and many conserved amino acid residues including Cysteine and Histidine, and the span and interval between them were consistent in the three ZF domains. Two classification models for KLFs prediction have been built by novel machine learning methods. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  11. Applications of Database Machines in Library Systems.

    ERIC Educational Resources Information Center

    Salmon, Stephen R.

    1984-01-01

    Characteristics and advantages of database machines are summarized and their applications to library functions are described. The ability to attach multiple hosts to the same database and flexibility in choosing operating and database management systems for different functions without loss of access to common database are noted. (EJS)

  12. IJ-OpenCV: Combining ImageJ and OpenCV for processing images in biomedicine.

    PubMed

    Domínguez, César; Heras, Jónathan; Pascual, Vico

    2017-05-01

    The effective processing of biomedical images usually requires the interoperability of diverse software tools that have different aims but are complementary. The goal of this work is to develop a bridge to connect two of those tools: ImageJ, a program for image analysis in life sciences, and OpenCV, a computer vision and machine learning library. Based on a thorough analysis of ImageJ and OpenCV, we detected the features of these systems that could be enhanced, and developed a library to combine both tools, taking advantage of the strengths of each system. The library was implemented on top of the SciJava converter framework. We also provide a methodology to use this library. We have developed the publicly available library IJ-OpenCV that can be employed to create applications combining features from both ImageJ and OpenCV. From the perspective of ImageJ developers, they can use IJ-OpenCV to easily create plugins that use any functionality provided by the OpenCV library and explore different alternatives. From the perspective of OpenCV developers, this library provides a link to the ImageJ graphical user interface and all its features to handle regions of interest. The IJ-OpenCV library bridges the gap between ImageJ and OpenCV, allowing the connection and the cooperation of these two systems. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Library Information-Processing System

    NASA Technical Reports Server (NTRS)

    1985-01-01

    System works with Library of Congress MARC II format. System composed of subsystems that provide wide range of library informationprocessing capabilities. Format is American National Standards Institute (ANSI) format for machine-readable bibliographic data. Adaptable to any medium-to-large library.

  14. TRS-80 at the Maine State Library.

    ERIC Educational Resources Information Center

    Wismer, Donald

    This report describes the applications and work flow of a TRS-80 microcomputer at the Maine State Library, and provides sample computer-generated records and programs used with the TRS-80. The machine was chosen for its price, availability, and compatibility with machines already in Maine's schools. It is used for mailing list management (with…

  15. COM: Decisions and Applications in a Small University Library.

    ERIC Educational Resources Information Center

    Schwarz, Philip J.

    Computer-output microfilm (COM) is used at the University of Wisconsin-Stout Library to generate reports from its major machine readable data bases. Conditions indicating the need to convert to COM include existence of a machine readable data base and high cost of report production. Advantages and disadvantages must also be considered before…

  16. Kernelized rank learning for personalized drug recommendation.

    PubMed

    He, Xiao; Folkman, Lukas; Borgwardt, Karsten

    2018-03-08

    Large-scale screenings of cancer cell lines with detailed molecular profiles against libraries of pharmacological compounds are currently being performed in order to gain a better understanding of the genetic component of drug response and to enhance our ability to recommend therapies given a patient's molecular profile. These comprehensive screens differ from the clinical setting in which (1) medical records only contain the response of a patient to very few drugs, (2) drugs are recommended by doctors based on their expert judgment, and (3) selecting the most promising therapy is often more important than accurately predicting the sensitivity to all potential drugs. Current regression models for drug sensitivity prediction fail to account for these three properties. We present a machine learning approach, named Kernelized Rank Learning (KRL), that ranks drugs based on their predicted effect per cell line (patient), circumventing the difficult problem of precisely predicting the sensitivity to the given drug. Our approach outperforms several state-of-the-art predictors in drug recommendation, particularly if the training dataset is sparse, and generalizes to patient data. Our work phrases personalized drug recommendation as a new type of machine learning problem with translational potential to the clinic. The Python implementation of KRL and scripts for running our experiments are available at https://github.com/BorgwardtLab/Kernelized-Rank-Learning. xiao.he@bsse.ethz.ch, lukas.folkman@bsse.ethz.ch. Supplementary data are available at Bioinformatics online.

  17. Predicting human liver microsomal stability with machine learning techniques.

    PubMed

    Sakiyama, Yojiro; Yuki, Hitomi; Moriya, Takashi; Hattori, Kazunari; Suzuki, Misaki; Shimada, Kaoru; Honma, Teruki

    2008-02-01

    To ensure a continuing pipeline in pharmaceutical research, lead candidates must possess appropriate metabolic stability in the drug discovery process. In vitro ADMET (absorption, distribution, metabolism, elimination, and toxicity) screening provides us with useful information regarding the metabolic stability of compounds. However, before the synthesis stage, an efficient process is required in order to deal with the vast quantity of data from large compound libraries and high-throughput screening. Here we have derived a relationship between the chemical structure and its metabolic stability for a data set of in-house compounds by means of various in silico machine learning such as random forest, support vector machine (SVM), logistic regression, and recursive partitioning. For model building, 1952 proprietary compounds comprising two classes (stable/unstable) were used with 193 descriptors calculated by Molecular Operating Environment. The results using test compounds have demonstrated that all classifiers yielded satisfactory results (accuracy > 0.8, sensitivity > 0.9, specificity > 0.6, and precision > 0.8). Above all, classification by random forest as well as SVM yielded kappa values of approximately 0.7 in an independent validation set, slightly higher than other classification tools. These results suggest that nonlinear/ensemble-based classification methods might prove useful in the area of in silico ADME modeling.

  18. Sparse modeling applied to patient identification for safety in medical physics applications

    NASA Astrophysics Data System (ADS)

    Lewkowitz, Stephanie

    Every scheduled treatment at a radiation therapy clinic involves a series of safety protocol to ensure the utmost patient care. Despite safety protocol, on a rare occasion an entirely preventable medical event, an accident, may occur. Delivering a treatment plan to the wrong patient is preventable, yet still is a clinically documented error. This research describes a computational method to identify patients with a novel machine learning technique to combat misadministration. The patient identification program stores face and fingerprint data for each patient. New, unlabeled data from those patients are categorized according to the library. The categorization of data by this face-fingerprint detector is accomplished with new machine learning algorithms based on Sparse Modeling that have already begun transforming the foundation of Computer Vision. Previous patient recognition software required special subroutines for faces and different tailored subroutines for fingerprints. In this research, the same exact model is used for both fingerprints and faces, without any additional subroutines and even without adjusting the two hyperparameters. Sparse modeling is a powerful tool, already shown utility in the areas of super-resolution, denoising, inpainting, demosaicing, and sub-nyquist sampling, i.e. compressed sensing. Sparse Modeling is possible because natural images are inherently sparse in some bases, due to their inherent structure. This research chooses datasets of face and fingerprint images to test the patient identification model. The model stores the images of each dataset as a basis (library). One image at a time is removed from the library, and is classified by a sparse code in terms of the remaining library. The Locally Competitive Algorithm, a truly neural inspired Artificial Neural Network, solves the computationally difficult task of finding the sparse code for the test image. The components of the sparse representation vector are summed by ℓ1 pooling, and correct patient identification is consistently achieved 100% over 1000 trials, when either the face data or fingerprint data are implemented as a classification basis. The algorithm gets 100% classification when faces and fingerprints are concatenated into multimodal datasets. This suggests that 100% patient identification will be achievable in the clinal setting.

  19. Machine learning and docking models for Mycobacterium tuberculosis topoisomerase I.

    PubMed

    Ekins, Sean; Godbole, Adwait Anand; Kéri, György; Orfi, Lászlo; Pato, János; Bhat, Rajeshwari Subray; Verma, Rinkee; Bradley, Erin K; Nagaraja, Valakunja

    2017-03-01

    There is a shortage of compounds that are directed towards new targets apart from those targeted by the FDA approved drugs used against Mycobacterium tuberculosis. Topoisomerase I (Mttopo I) is an essential mycobacterial enzyme and a promising target in this regard. However, it suffers from a shortage of known inhibitors. We have previously used computational approaches such as homology modeling and docking to propose 38 FDA approved drugs for testing and identified several active molecules. To follow on from this, we now describe the in vitro testing of a library of 639 compounds. These data were used to create machine learning models for Mttopo I which were further validated. The combined Mttopo I Bayesian model had a 5 fold cross validation receiver operator characteristic of 0.74 and sensitivity, specificity and concordance values above 0.76 and was used to select commercially available compounds for testing in vitro. The recently described crystal structure of Mttopo I was also compared with the previously described homology model and then used to dock the Mttopo I actives norclomipramine and imipramine. In summary, we describe our efforts to identify small molecule inhibitors of Mttopo I using a combination of machine learning modeling and docking studies in conjunction with screening of the selected molecules for enzyme inhibition. We demonstrate the experimental inhibition of Mttopo I by small molecule inhibitors and show that the enzyme can be readily targeted for lead molecule development. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. PyMVPA: A Unifying Approach to the Analysis of Neuroscientific Data

    PubMed Central

    Hanke, Michael; Halchenko, Yaroslav O.; Sederberg, Per B.; Olivetti, Emanuele; Fründ, Ingo; Rieger, Jochem W.; Herrmann, Christoph S.; Haxby, James V.; Hanson, Stephen José; Pollmann, Stefan

    2008-01-01

    The Python programming language is steadily increasing in popularity as the language of choice for scientific computing. The ability of this scripting environment to access a huge code base in various languages, combined with its syntactical simplicity, make it the ideal tool for implementing and sharing ideas among scientists from numerous fields and with heterogeneous methodological backgrounds. The recent rise of reciprocal interest between the machine learning (ML) and neuroscience communities is an example of the desire for an inter-disciplinary transfer of computational methods that can benefit from a Python-based framework. For many years, a large fraction of both research communities have addressed, almost independently, very high-dimensional problems with almost completely non-overlapping methods. However, a number of recently published studies that applied ML methods to neuroscience research questions attracted a lot of attention from researchers from both fields, as well as the general public, and showed that this approach can provide novel and fruitful insights into the functioning of the brain. In this article we show how PyMVPA, a specialized Python framework for machine learning based data analysis, can help to facilitate this inter-disciplinary technology transfer by providing a single interface to a wide array of machine learning libraries and neural data-processing methods. We demonstrate the general applicability and power of PyMVPA via analyses of a number of neural data modalities, including fMRI, EEG, MEG, and extracellular recordings. PMID:19212459

  1. Portable programming on parallel/networked computers using the Application Portable Parallel Library (APPL)

    NASA Technical Reports Server (NTRS)

    Quealy, Angela; Cole, Gary L.; Blech, Richard A.

    1993-01-01

    The Application Portable Parallel Library (APPL) is a subroutine-based library of communication primitives that is callable from applications written in FORTRAN or C. APPL provides a consistent programmer interface to a variety of distributed and shared-memory multiprocessor MIMD machines. The objective of APPL is to minimize the effort required to move parallel applications from one machine to another, or to a network of homogeneous machines. APPL encompasses many of the message-passing primitives that are currently available on commercial multiprocessor systems. This paper describes APPL (version 2.3.1) and its usage, reports the status of the APPL project, and indicates possible directions for the future. Several applications using APPL are discussed, as well as performance and overhead results.

  2. An integrated runtime and compile-time approach for parallelizing structured and block structured applications

    NASA Technical Reports Server (NTRS)

    Agrawal, Gagan; Sussman, Alan; Saltz, Joel

    1993-01-01

    Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). A combined runtime and compile-time approach for parallelizing these applications on distributed memory parallel machines in an efficient and machine-independent fashion was described. A runtime library which can be used to port these applications on distributed memory machines was designed and implemented. The library is currently implemented on several different systems. To further ease the task of application programmers, methods were developed for integrating this runtime library with compilers for HPK-like parallel programming languages. How this runtime library was integrated with the Fortran 90D compiler being developed at Syracuse University is discussed. Experimental results to demonstrate the efficacy of our approach are presented. A multiblock Navier-Stokes solver template and a multigrid code were experimented with. Our experimental results show that our primitives have low runtime communication overheads. Further, the compiler parallelized codes perform within 20 percent of the code parallelized by manually inserting calls to the runtime library.

  3. Machine Learning of Human Pluripotent Stem Cell-Derived Engineered Cardiac Tissue Contractility for Automated Drug Classification.

    PubMed

    Lee, Eugene K; Tran, David D; Keung, Wendy; Chan, Patrick; Wong, Gabriel; Chan, Camie W; Costa, Kevin D; Li, Ronald A; Khine, Michelle

    2017-11-14

    Accurately predicting cardioactive effects of new molecular entities for therapeutics remains a daunting challenge. Immense research effort has been focused toward creating new screening platforms that utilize human pluripotent stem cell (hPSC)-derived cardiomyocytes and three-dimensional engineered cardiac tissue constructs to better recapitulate human heart function and drug responses. As these new platforms become increasingly sophisticated and high throughput, the drug screens result in larger multidimensional datasets. Improved automated analysis methods must therefore be developed in parallel to fully comprehend the cellular response across a multidimensional parameter space. Here, we describe the use of machine learning to comprehensively analyze 17 functional parameters derived from force readouts of hPSC-derived ventricular cardiac tissue strips (hvCTS) electrically paced at a range of frequencies and exposed to a library of compounds. A generated metric is effective for then determining the cardioactivity of a given drug. Furthermore, we demonstrate a classification model that can automatically predict the mechanistic action of an unknown cardioactive drug. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Integrating Symbolic and Statistical Methods for Testing Intelligent Systems Applications to Machine Learning and Computer Vision

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jha, Sumit Kumar; Pullum, Laura L; Ramanathan, Arvind

    Embedded intelligent systems ranging from tiny im- plantable biomedical devices to large swarms of autonomous un- manned aerial systems are becoming pervasive in our daily lives. While we depend on the flawless functioning of such intelligent systems, and often take their behavioral correctness and safety for granted, it is notoriously difficult to generate test cases that expose subtle errors in the implementations of machine learning algorithms. Hence, the validation of intelligent systems is usually achieved by studying their behavior on representative data sets, using methods such as cross-validation and bootstrapping.In this paper, we present a new testing methodology for studyingmore » the correctness of intelligent systems. Our approach uses symbolic decision procedures coupled with statistical hypothesis testing to. We also use our algorithm to analyze the robustness of a human detection algorithm built using the OpenCV open-source computer vision library. We show that the human detection implementation can fail to detect humans in perturbed video frames even when the perturbations are so small that the corresponding frames look identical to the naked eye.« less

  5. Automated isotope identification algorithm using artificial neural networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamuda, Mark; Stinnett, Jacob; Sullivan, Clair

    There is a need to develop an algorithm that can determine the relative activities of radio-isotopes in a large dataset of low-resolution gamma-ray spectra that contains a mixture of many radio-isotopes. Low-resolution gamma-ray spectra that contain mixtures of radio-isotopes often exhibit feature over-lap, requiring algorithms that can analyze these features when overlap occurs. While machine learning and pattern recognition algorithms have shown promise for the problem of radio-isotope identification, their ability to identify and quantify mixtures of radio-isotopes has not been studied. Because machine learning algorithms use abstract features of the spectrum, such as the shape of overlapping peaks andmore » Compton continuum, they are a natural choice for analyzing radio-isotope mixtures. An artificial neural network (ANN) has be trained to calculate the relative activities of 32 radio-isotopes in a spectrum. Furthermore, the ANN is trained with simulated gamma-ray spectra, allowing easy expansion of the library of target radio-isotopes. In this paper we present our initial algorithms based on an ANN and evaluate them against a series measured and simulated spectra.« less

  6. Automated isotope identification algorithm using artificial neural networks

    DOE PAGES

    Kamuda, Mark; Stinnett, Jacob; Sullivan, Clair

    2017-04-12

    There is a need to develop an algorithm that can determine the relative activities of radio-isotopes in a large dataset of low-resolution gamma-ray spectra that contains a mixture of many radio-isotopes. Low-resolution gamma-ray spectra that contain mixtures of radio-isotopes often exhibit feature over-lap, requiring algorithms that can analyze these features when overlap occurs. While machine learning and pattern recognition algorithms have shown promise for the problem of radio-isotope identification, their ability to identify and quantify mixtures of radio-isotopes has not been studied. Because machine learning algorithms use abstract features of the spectrum, such as the shape of overlapping peaks andmore » Compton continuum, they are a natural choice for analyzing radio-isotope mixtures. An artificial neural network (ANN) has be trained to calculate the relative activities of 32 radio-isotopes in a spectrum. Furthermore, the ANN is trained with simulated gamma-ray spectra, allowing easy expansion of the library of target radio-isotopes. In this paper we present our initial algorithms based on an ANN and evaluate them against a series measured and simulated spectra.« less

  7. Designing focused chemical libraries enriched in protein-protein interaction inhibitors using machine-learning methods.

    PubMed

    Reynès, Christelle; Host, Hélène; Camproux, Anne-Claude; Laconde, Guillaume; Leroux, Florence; Mazars, Anne; Deprez, Benoit; Fahraeus, Robin; Villoutreix, Bruno O; Sperandio, Olivier

    2010-03-05

    Protein-protein interactions (PPIs) may represent one of the next major classes of therapeutic targets. So far, only a minute fraction of the estimated 650,000 PPIs that comprise the human interactome are known with a tiny number of complexes being drugged. Such intricate biological systems cannot be cost-efficiently tackled using conventional high-throughput screening methods. Rather, time has come for designing new strategies that will maximize the chance for hit identification through a rationalization of the PPI inhibitor chemical space and the design of PPI-focused compound libraries (global or target-specific). Here, we train machine-learning-based models, mainly decision trees, using a dataset of known PPI inhibitors and of regular drugs in order to determine a global physico-chemical profile for putative PPI inhibitors. This statistical analysis unravels two important molecular descriptors for PPI inhibitors characterizing specific molecular shapes and the presence of a privileged number of aromatic bonds. The best model has been transposed into a computer program, PPI-HitProfiler, that can output from any drug-like compound collection a focused chemical library enriched in putative PPI inhibitors. Our PPI inhibitor profiler is challenged on the experimental screening results of 11 different PPIs among which the p53/MDM2 interaction screened within our own CDithem platform, that in addition to the validation of our concept led to the identification of 4 novel p53/MDM2 inhibitors. Collectively, our tool shows a robust behavior on the 11 experimental datasets by correctly profiling 70% of the experimentally identified hits while removing 52% of the inactive compounds from the initial compound collections. We strongly believe that this new tool can be used as a global PPI inhibitor profiler prior to screening assays to reduce the size of the compound collections to be experimentally screened while keeping most of the true PPI inhibitors. PPI-HitProfiler is freely available on request from our CDithem platform website, www.CDithem.com.

  8. Designing Focused Chemical Libraries Enriched in Protein-Protein Interaction Inhibitors using Machine-Learning Methods

    PubMed Central

    Reynès, Christelle; Host, Hélène; Camproux, Anne-Claude; Laconde, Guillaume; Leroux, Florence; Mazars, Anne; Deprez, Benoit; Fahraeus, Robin; Villoutreix, Bruno O.; Sperandio, Olivier

    2010-01-01

    Protein-protein interactions (PPIs) may represent one of the next major classes of therapeutic targets. So far, only a minute fraction of the estimated 650,000 PPIs that comprise the human interactome are known with a tiny number of complexes being drugged. Such intricate biological systems cannot be cost-efficiently tackled using conventional high-throughput screening methods. Rather, time has come for designing new strategies that will maximize the chance for hit identification through a rationalization of the PPI inhibitor chemical space and the design of PPI-focused compound libraries (global or target-specific). Here, we train machine-learning-based models, mainly decision trees, using a dataset of known PPI inhibitors and of regular drugs in order to determine a global physico-chemical profile for putative PPI inhibitors. This statistical analysis unravels two important molecular descriptors for PPI inhibitors characterizing specific molecular shapes and the presence of a privileged number of aromatic bonds. The best model has been transposed into a computer program, PPI-HitProfiler, that can output from any drug-like compound collection a focused chemical library enriched in putative PPI inhibitors. Our PPI inhibitor profiler is challenged on the experimental screening results of 11 different PPIs among which the p53/MDM2 interaction screened within our own CDithem platform, that in addition to the validation of our concept led to the identification of 4 novel p53/MDM2 inhibitors. Collectively, our tool shows a robust behavior on the 11 experimental datasets by correctly profiling 70% of the experimentally identified hits while removing 52% of the inactive compounds from the initial compound collections. We strongly believe that this new tool can be used as a global PPI inhibitor profiler prior to screening assays to reduce the size of the compound collections to be experimentally screened while keeping most of the true PPI inhibitors. PPI-HitProfiler is freely available on request from our CDithem platform website, www.CDithem.com. PMID:20221258

  9. An Investigation into the Economics of Retrospective Conversion Using a CD-ROM System.

    ERIC Educational Resources Information Center

    Co, Francisca K.

    This study compares the cost effectiveness of using a CD-ROM (compact disk read-only memory) system known as Bibliofile and the currently used OCLC (Online Computer Library Center)-based method to convert a university library's shelflist into a machine-readable database in the MARC (Machine-Readable Cataloging) format. The cost of each method of…

  10. Development of a Machine Form Union Catalog for the New England Library Information Network (NELINET). Final Report.

    ERIC Educational Resources Information Center

    Goldstein, Samuel; And Others

    Based on a literature survey of union cataloging, and New England libraries it was determined that: (1) New England's collective union catalog needs and problems had not been specified, especially regarding the possibilities of machine application; (2) crucial data and analysis needed for such specification was unavailable; and (3) the absence of…

  11. Design and Development of a Technology Platform for DNA-Encoded Library Production and Affinity Selection.

    PubMed

    Castañón, Jesús; Román, José Pablo; Jessop, Theodore C; de Blas, Jesús; Haro, Rubén

    2018-06-01

    DNA-encoded libraries (DELs) have emerged as an efficient and cost-effective drug discovery tool for the exploration and screening of very large chemical space using small-molecule collections of unprecedented size. Herein, we report an integrated automation and informatics system designed to enhance the quality, efficiency, and throughput of the production and affinity selection of these libraries. The platform is governed by software developed according to a database-centric architecture to ensure data consistency, integrity, and availability. Through its versatile protocol management functionalities, this application captures the wide diversity of experimental processes involved with DEL technology, keeps track of working protocols in the database, and uses them to command robotic liquid handlers for the synthesis of libraries. This approach provides full traceability of building-blocks and DNA tags in each split-and-pool cycle. Affinity selection experiments and high-throughput sequencing reads are also captured in the database, and the results are automatically deconvoluted and visualized in customizable representations. Researchers can compare results of different experiments and use machine learning methods to discover patterns in data. As of this writing, the platform has been validated through the generation and affinity selection of various libraries, and it has become the cornerstone of the DEL production effort at Lilly.

  12. Heavy machine shop on the first floor of building no. ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    Heavy machine shop on the first floor of building no. 5. Many of the machines shown here are shown in photographs of the late 1880's - Thomas A. Edison Laboratories, Machine Shop & Library, Main Street & Lakeside Avenue, West Orange, Essex County, NJ

  13. Creating a Lean, Green, Library Machine: Easy Eco-Friendly Habits for Your Library

    ERIC Educational Resources Information Center

    Blaine, Amy S.

    2010-01-01

    For some library media specialists, implementing the three Rs of recycling, reducing, and reusing comes easily; they've been environmentally conscious well before the concept of going green made its way into the vernacular. Yet for some of library media specialists, the thought of greening their library, let alone the entire school, can seem…

  14. SU-G-BRB-02: An Open-Source Software Analysis Library for Linear Accelerator Quality Assurance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kerns, J; Yaldo, D

    Purpose: Routine linac quality assurance (QA) tests have become complex enough to require automation of most test analyses. A new data analysis software library was built that allows physicists to automate routine linear accelerator quality assurance tests. The package is open source, code tested, and benchmarked. Methods: Images and data were generated on a TrueBeam linac for the following routine QA tests: VMAT, starshot, CBCT, machine logs, Winston Lutz, and picket fence. The analysis library was built using the general programming language Python. Each test was analyzed with the library algorithms and compared to manual measurements taken at the timemore » of acquisition. Results: VMAT QA results agreed within 0.1% between the library and manual measurements. Machine logs (dynalogs & trajectory logs) were successfully parsed; mechanical axis positions were verified for accuracy and MLC fluence agreed well with EPID measurements. CBCT QA measurements were within 10 HU and 0.2mm where applicable. Winston Lutz isocenter size measurements were within 0.2mm of TrueBeam’s Machine Performance Check. Starshot analysis was within 0.2mm of the Winston Lutz results for the same conditions. Picket fence images with and without a known error showed that the library was capable of detecting MLC offsets within 0.02mm. Conclusion: A new routine QA software library has been benchmarked and is available for use by the community. The library is open-source and extensible for use in larger systems.« less

  15. Toward Intelligent Machine Learning Algorithms

    DTIC Science & Technology

    1988-05-01

    Machine learning is recognized as a tool for improving the performance of many kinds of systems, yet most machine learning systems themselves are not...directed systems, and with the addition of a knowledge store for organizing and maintaining knowledge to assist learning, a learning machine learning (L...ML) algorithm is possible. The necessary components of L-ML systems are presented along with several case descriptions of existing machine learning systems

  16. Web Mining: Machine Learning for Web Applications.

    ERIC Educational Resources Information Center

    Chen, Hsinchun; Chau, Michael

    2004-01-01

    Presents an overview of machine learning research and reviews methods used for evaluating machine learning systems. Ways that machine-learning algorithms were used in traditional information retrieval systems in the "pre-Web" era are described, and the field of Web mining and how machine learning has been used in different Web mining…

  17. The Library's role and challenges in implementing an e-learning strategy: a case study from northern Australia.

    PubMed

    Ritchie, Ann

    2011-03-01

      The Northern Territory Department of Health and Families' (DHF) Library supports education programs for all staff. DHF is implementing an e-learning strategy, which may be viewed as a vehicle for coordinating the education function throughout the organisation.   The objective of this study is to explore the concept of e-learning in relation to the Library's role in implementing an organisation-wide e-learning strategy.   The main findings of a literature search about the effectiveness of e-learning in health professionals' education, and the responsibility and roles of health librarians in e-learning are described. A case study approach is used to outline the current role and future opportunities and challenges for the Library.   The case study presents the organisation's strategic planning context. Four areas of operational activity which build on the Library's current educational activities are suggested: the integration of library resources 'learning objects' within a Learning Management System; developing online health information literacy training programs; establishing a physical and virtual 'e-Learning Library/Centre'; developing collaborative partnerships, taking on new responsibilities in e-learning development, and creating a new e-learning librarian role.   The study shows that the Library's role is fundamental to developing the organisation's e-learning capacity and implementing an organisation-wide e-learning strategy. © 2010 The authors. Health Information and Libraries Journal © 2010 Health Libraries Group.

  18. Cheminformatic models based on machine learning for pyruvate kinase inhibitors of Leishmania mexicana.

    PubMed

    Jamal, Salma; Scaria, Vinod

    2013-11-19

    Leishmaniasis is a neglected tropical disease which affects approx. 12 million individuals worldwide and caused by parasite Leishmania. The current drugs used in the treatment of Leishmaniasis are highly toxic and has seen widespread emergence of drug resistant strains which necessitates the need for the development of new therapeutic options. The high throughput screen data available has made it possible to generate computational predictive models which have the ability to assess the active scaffolds in a chemical library followed by its ADME/toxicity properties in the biological trials. In the present study, we have used publicly available, high-throughput screen datasets of chemical moieties which have been adjudged to target the pyruvate kinase enzyme of L. mexicana (LmPK). The machine learning approach was used to create computational models capable of predicting the biological activity of novel antileishmanial compounds. Further, we evaluated the molecules using the substructure based approach to identify the common substructures contributing to their activity. We generated computational models based on machine learning methods and evaluated the performance of these models based on various statistical figures of merit. Random forest based approach was determined to be the most sensitive, better accuracy as well as ROC. We further added a substructure based approach to analyze the molecules to identify potentially enriched substructures in the active dataset. We believe that the models developed in the present study would lead to reduction in cost and length of clinical studies and hence newer drugs would appear faster in the market providing better healthcare options to the patients.

  19. Streamlining machine learning in mobile devices for remote sensing

    NASA Astrophysics Data System (ADS)

    Coronel, Andrei D.; Estuar, Ma. Regina E.; Garcia, Kyle Kristopher P.; Dela Cruz, Bon Lemuel T.; Torrijos, Jose Emmanuel; Lim, Hadrian Paulo M.; Abu, Patricia Angela R.; Victorino, John Noel C.

    2017-09-01

    Mobile devices have been at the forefront of Intelligent Farming because of its ubiquitous nature. Applications on precision farming have been developed on smartphones to allow small farms to monitor environmental parameters surrounding crops. Mobile devices are used for most of these applications, collecting data to be sent to the cloud for storage, analysis, modeling and visualization. However, with the issue of weak and intermittent connectivity in geographically challenged areas of the Philippines, the solution is to provide analysis on the phone itself. Given this, the farmer gets a real time response after data submission. Though Machine Learning is promising, hardware constraints in mobile devices limit the computational capabilities, making model development on the phone restricted and challenging. This study discusses the development of a Machine Learning based mobile application using OpenCV libraries. The objective is to enable the detection of Fusarium oxysporum cubense (Foc) in juvenile and asymptomatic bananas using images of plant parts and microscopic samples as input. Image datasets of attached, unattached, dorsal, and ventral views of leaves were acquired through sampling protocols. Images of raw and stained specimens from soil surrounding the plant, and sap from the plant resulted to stained and unstained samples respectively. Segmentation and feature extraction techniques were applied to all images. Initial findings show no significant differences among the different feature extraction techniques. For differentiating infected from non-infected leaves, KNN yields highest average accuracy, as opposed to Naive Bayes and SVM. For microscopic images using MSER feature extraction, KNN has been tested as having a better accuracy than SVM or Naive-Bayes.

  20. Vatican Library Automates for the 21st Century.

    ERIC Educational Resources Information Center

    Carter, Thomas L.

    1994-01-01

    Because of space and staff constraints, the Vatican Library can issue only 2,000 reader cards a year. Describes IBM's Vatican Library Project: converting the library catalog records (prior to 1985) into machine readable form and digitally scanning 20,000 manuscript pages, print pages, and art works in gray scale and color, creating a database…

  1. Using Machine Learning to Advance Personality Assessment and Theory.

    PubMed

    Bleidorn, Wiebke; Hopwood, Christopher James

    2018-05-01

    Machine learning has led to important advances in society. One of the most exciting applications of machine learning in psychological science has been the development of assessment tools that can powerfully predict human behavior and personality traits. Thus far, machine learning approaches to personality assessment have focused on the associations between social media and other digital records with established personality measures. The goal of this article is to expand the potential of machine learning approaches to personality assessment by embedding it in a more comprehensive construct validation framework. We review recent applications of machine learning to personality assessment, place machine learning research in the broader context of fundamental principles of construct validation, and provide recommendations for how to use machine learning to advance our understanding of personality.

  2. Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry.

    PubMed

    Higgins, Sean A; Savage, David F

    2018-01-09

    A fundamental goal of protein biochemistry is to determine the sequence-function relationship, but the vastness of sequence space makes comprehensive evaluation of this landscape difficult. However, advances in DNA synthesis and sequencing now allow researchers to assess the functional impact of every single mutation in many proteins, but challenges remain in library construction and the development of general assays applicable to a diverse range of protein functions. This Perspective briefly outlines the technical innovations in DNA manipulation that allow massively parallel protein biochemistry and then summarizes the methods currently available for library construction and the functional assays of protein variants. Areas in need of future innovation are highlighted with a particular focus on assay development and the use of computational analysis with machine learning to effectively traverse the sequence-function landscape. Finally, applications in the fundamentals of protein biochemistry, disease prediction, and protein engineering are presented.

  3. Detecting atypical examples of known domain types by sequence similarity searching: the SBASE domain library approach.

    PubMed

    Dhir, Somdutta; Pacurar, Mircea; Franklin, Dino; Gáspári, Zoltán; Kertész-Farkas, Attila; Kocsor, András; Eisenhaber, Frank; Pongor, Sándor

    2010-11-01

    SBASE is a project initiated to detect known domain types and predicting domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992, Pongor et al, Nucl. Acids. Res. 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences - the SBASE domain library - and standard similarity search algorithms, followed by postprocessing which is based on a simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within this the project.

  4. feets: feATURE eXTRACTOR for tIME sERIES

    NASA Astrophysics Data System (ADS)

    Cabral, Juan; Sanchez, Bruno; Ramos, Felipe; Gurovich, Sebastián; Granitto, Pablo; VanderPlas, Jake

    2018-06-01

    feets characterizes and analyzes light-curves from astronomical photometric databases for modelling, classification, data cleaning, outlier detection and data analysis. It uses machine learning algorithms to determine the numerical descriptors that characterize and distinguish the different variability classes of light-curves; these range from basic statistical measures such as the mean or standard deviation to complex time-series characteristics such as the autocorrelation function. The library is not restricted to the astronomical field and could also be applied to any kind of time series. This project is a derivative work of FATS (ascl:1711.017).

  5. Multiple machine learning based descriptive and predictive workflow for the identification of potential PTP1B inhibitors.

    PubMed

    Chandra, Sharat; Pandey, Jyotsana; Tamrakar, Akhilesh Kumar; Siddiqi, Mohammad Imran

    2017-01-01

    In insulin and leptin signaling pathway, Protein-Tyrosine Phosphatase 1B (PTP1B) plays a crucial controlling role as a negative regulator, which makes it an attractive therapeutic target for both Type-2 Diabetes (T2D) and obesity. In this work, we have generated classification models by using the inhibition data set of known PTP1B inhibitors to identify new inhibitors of PTP1B utilizing multiple machine learning techniques like naïve Bayesian, random forest, support vector machine and k-nearest neighbors, along with structural fingerprints and selected molecular descriptors. Several models from each algorithm have been constructed and optimized, with the different combination of molecular descriptors and structural fingerprints. For the training and test sets, most of the predictive models showed more than 90% of overall prediction accuracies. The best model was obtained with support vector machine approach and has Matthews Correlation Coefficient of 0.82 for the external test set, which was further employed for the virtual screening of Maybridge small compound database. Five compounds were subsequently selected for experimental assay. Out of these two compounds were found to inhibit PTP1B with significant inhibitory activity in in-vitro inhibition assay. The structural fragments which are important for PTP1B inhibition were identified by naïve Bayesian method and can be further exploited to design new molecules around the identified scaffolds. The descriptive and predictive modeling strategy applied in this study is capable of identifying PTP1B inhibitors from the large compound libraries. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. FORTRAN multitasking library for use on the ELXSI 6400 and the CRAY XMP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Montry, G.R.

    1985-07-16

    A library of FORTRAN-based multitasking routines has been written for the ELXSI 6400 and the CRAY XMP. This library is designed to make multitasking codes easily transportable between machines with different hardware configurations. The library provides enhanced error checking and diagnostics over vendor-supplied multitasking intrinsics. The library also contains multitasking control structures not normally supplied by the vendor.

  7. 36 CFR Appendix A to Subpart A of... - Fees and Charges for Services Provided to Requesters of Records

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... LIBRARY OF CONGRESS DISCLOSURE OR PRODUCTION OF RECORDS OR INFORMATION Availability of Library of Congress... based on the direct cost to the Library, including labor, material, and computer time. (b) Duplication... established by the Library's Photoduplication Service, or in the case of machine media duplication, by the...

  8. 36 CFR Appendix A to Subpart A of... - Fees and Charges for Services Provided to Requesters of Records

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... LIBRARY OF CONGRESS DISCLOSURE OR PRODUCTION OF RECORDS OR INFORMATION Availability of Library of Congress... based on the direct cost to the Library, including labor, material, and computer time. (b) Duplication... established by the Library's Photoduplication Service, or in the case of machine media duplication, by the...

  9. 36 CFR Appendix A to Subpart A of... - Fees and Charges for Services Provided to Requesters of Records

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... LIBRARY OF CONGRESS DISCLOSURE OR PRODUCTION OF RECORDS OR INFORMATION Availability of Library of Congress... based on the direct cost to the Library, including labor, material, and computer time. (b) Duplication... established by the Library's Photoduplication Service, or in the case of machine media duplication, by the...

  10. 36 CFR Appendix A to Subpart A of... - Fees and Charges for Services Provided to Requesters of Records

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... LIBRARY OF CONGRESS DISCLOSURE OR PRODUCTION OF RECORDS OR INFORMATION Availability of Library of Congress... based on the direct cost to the Library, including labor, material, and computer time. (b) Duplication... established by the Library's Photoduplication Service, or in the case of machine media duplication, by the...

  11. 36 CFR Appendix A to Subpart A of... - Fees and Charges for Services Provided to Requesters of Records

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... LIBRARY OF CONGRESS DISCLOSURE OR PRODUCTION OF RECORDS OR INFORMATION Availability of Library of Congress... based on the direct cost to the Library, including labor, material, and computer time. (b) Duplication... established by the Library's Photoduplication Service, or in the case of machine media duplication, by the...

  12. Accessing PCS Remotely across a Rural County Library System

    ERIC Educational Resources Information Center

    McGeehon, Carol; Millar, Don

    2004-01-01

    The Douglas County Library System (DCLS) is a rural system in Southern Oregon. It's headquarters library is centrally located in Roseburg. DCLS serves a population of 100,000 with the largest concentration of people within 15 miles from the Pacific ocean. Because the library system supports around 150 machines scattered across 11 sites, it needs a…

  13. Evaluation of an Integrated Multi-Task Machine Learning System with Humans in the Loop

    DTIC Science & Technology

    2007-01-01

    machine learning components natural language processing, and optimization...was examined with a test explicitly developed to measure the impact of integrated machine learning when used by a human user in a real world setting...study revealed that integrated machine learning does produce a positive impact on overall performance. This paper also discusses how specific machine learning components contributed to human-system

  14. Something for Everyone: Learning and Learning Technologies in a Public Library

    ERIC Educational Resources Information Center

    Blackburn, Fiona

    2010-01-01

    The nature of learning in a public library is relevant to what place e-learning and social networking technologies might have there. That a public library aims to provide something for everyone is also important. Alice Springs Public Library (ASPL) is a place of learning; it is also a crowded and composite space. The use ASPL could make of the…

  15. The Role of the National Agricultural Library.

    ERIC Educational Resources Information Center

    Howard, Joseph H.

    1989-01-01

    Describes the role, users, collections and services of the National Agricultural Library. Some of the services discussed include a machine readable bibliographic database, an international interlibrary loan system, programs to develop library networks and cooperative cataloging, and the development and use of information technologies such as laser…

  16. Machine learning techniques for breast cancer computer aided diagnosis using different image modalities: A systematic review.

    PubMed

    Yassin, Nisreen I R; Omran, Shaimaa; El Houby, Enas M F; Allam, Hemat

    2018-03-01

    The high incidence of breast cancer in women has increased significantly in the recent years. Physician experience of diagnosing and detecting breast cancer can be assisted by using some computerized features extraction and classification algorithms. This paper presents the conduction and results of a systematic review (SR) that aims to investigate the state of the art regarding the computer aided diagnosis/detection (CAD) systems for breast cancer. The SR was conducted using a comprehensive selection of scientific databases as reference sources, allowing access to diverse publications in the field. The scientific databases used are Springer Link (SL), Science Direct (SD), IEEE Xplore Digital Library, and PubMed. Inclusion and exclusion criteria were defined and applied to each retrieved work to select those of interest. From 320 studies retrieved, 154 studies were included. However, the scope of this research is limited to scientific and academic works and excludes commercial interests. This survey provides a general analysis of the current status of CAD systems according to the used image modalities and the machine learning based classifiers. Potential research studies have been discussed to create a more objective and efficient CAD systems. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Software platform for managing the classification of error- related potentials of observers

    NASA Astrophysics Data System (ADS)

    Asvestas, P.; Ventouras, E.-C.; Kostopoulos, S.; Sidiropoulos, K.; Korfiatis, V.; Korda, A.; Uzunolglu, A.; Karanasiou, I.; Kalatzis, I.; Matsopoulos, G.

    2015-09-01

    Human learning is partly based on observation. Electroencephalographic recordings of subjects who perform acts (actors) or observe actors (observers), contain a negative waveform in the Evoked Potentials (EPs) of the actors that commit errors and of observers who observe the error-committing actors. This waveform is called the Error-Related Negativity (ERN). Its detection has applications in the context of Brain-Computer Interfaces. The present work describes a software system developed for managing EPs of observers, with the aim of classifying them into observations of either correct or incorrect actions. It consists of an integrated platform for the storage, management, processing and classification of EPs recorded during error-observation experiments. The system was developed using C# and the following development tools and frameworks: MySQL, .NET Framework, Entity Framework and Emgu CV, for interfacing with the machine learning library of OpenCV. Up to six features can be computed per EP recording per electrode. The user can select among various feature selection algorithms and then proceed to train one of three types of classifiers: Artificial Neural Networks, Support Vector Machines, k-nearest neighbour. Next the classifier can be used for classifying any EP curve that has been inputted to the database.

  18. Gleaning knowledge from data in the intensive care unit.

    PubMed

    Pinsky, Michael R; Dubrawski, Artur

    2014-09-15

    It is often difficult to accurately predict when, why, and which patients develop shock, because signs of shock often occur late, once organ injury is already present. Three levels of aggregation of information can be used to aid the bedside clinician in this task: analysis of derived parameters of existing measured physiologic variables using simple bedside calculations (functional hemodynamic monitoring); prior physiologic data of similar subjects during periods of stability and disease to define quantitative metrics of level of severity; and libraries of responses across large and comprehensive collections of records of diverse subjects whose diagnosis, therapies, and course is already known to predict not only disease severity, but also the subsequent behavior of the subject if left untreated or treated with one of the many therapeutic options. The problem is in defining the minimal monitoring data set needed to initially identify those patients across all possible processes, and then specifically monitor their responses to targeted therapies known to improve outcome. To address these issues, multivariable models using machine learning data-driven classification techniques can be used to parsimoniously predict cardiorespiratory insufficiency. We briefly describe how these machine learning approaches are presently applied to address earlier identification of cardiorespiratory insufficiency and direct focused, patient-specific management.

  19. IFLA General Conference, 1991. Division of Special Libraries Services: Section of Social Science Libraries; Section of Geography and Map Libraries; Section of Biological and Medical Sciences Libraries; Section of Art Libraries. Booklet 2.

    ERIC Educational Resources Information Center

    International Federation of Library Associations and Institutions, The Hague (Netherlands).

    The 10 papers in this booklet were presented at meetings of 4 sections within the Division of Special Libraries: (1) "Information Ensurance of a Scientist" (V. Matveyev, USSR); (2) "Linguistic Barriers and Machine Translation" (Stanley Kalkus, USA); (3) "Maps for Planning" (V. I. Zhukov and L. G. Rudenko, USSR); (4)…

  20. Quantum machine learning.

    PubMed

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-13

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  1. Quantum machine learning

    NASA Astrophysics Data System (ADS)

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-01

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  2. Automated Categorization Scheme for Digital Libraries in Distance Learning: A Pattern Recognition Approach

    ERIC Educational Resources Information Center

    Gunal, Serkan

    2008-01-01

    Digital libraries play a crucial role in distance learning. Nowadays, they are one of the fundamental information sources for the students enrolled in this learning system. These libraries contain huge amount of instructional data (text, audio and video) offered by the distance learning program. Organization of the digital libraries is…

  3. Matrix Multiplication Algorithm Selection with Support Vector Machines

    DTIC Science & Technology

    2015-05-01

    libraries that could intelligently choose the optimal algorithm for a particular set of inputs. Users would be oblivious to the underlying algorithmic...SAT.” J. Artif . Intell. Res.(JAIR), vol. 32, pp. 565–606, 2008. [9] M. G. Lagoudakis and M. L. Littman, “Algorithm selection using reinforcement...Artificial Intelligence , vol. 21, no. 05, pp. 961–976, 2007. [15] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM

  4. Dropout Prediction in E-Learning Courses through the Combination of Machine Learning Techniques

    ERIC Educational Resources Information Center

    Lykourentzou, Ioanna; Giannoukos, Ioannis; Nikolopoulos, Vassilis; Mpardis, George; Loumos, Vassili

    2009-01-01

    In this paper, a dropout prediction method for e-learning courses, based on three popular machine learning techniques and detailed student data, is proposed. The machine learning techniques used are feed-forward neural networks, support vector machines and probabilistic ensemble simplified fuzzy ARTMAP. Since a single technique may fail to…

  5. ARL Statement on Unlimited Use and Exchange of Bibliographic Records.

    ERIC Educational Resources Information Center

    Association of Research Libraries, Washington, DC.

    The Association of Research Libraries is fully committed to the principle of unrestricted access to and dissemination of ideas, i.e., member libraries must have unlimited access to the machine-readable bibliographic records which are created by member libraries and maintained in bibliographic utilities. Coordinated collection development programs…

  6. Untangling the Tangled Webs We Weave: A Team Approach to Cyberspace.

    ERIC Educational Resources Information Center

    Broidy, Ellen; And Others

    Working in a cooperative team environment across libraries and job classifications, librarians and support staff at the University of California at Irvine (UCI) have mounted several successful web projects, including two versions of the Libraries' home page, a virtual reference collection, and Science Library "ANTswer Machine." UCI's…

  7. A Resource Guide for Teachers.

    ERIC Educational Resources Information Center

    Texas State Library, Austin. Div. for the Blind and Physically Handicapped.

    The Texas State Library provides services to persons unable to read standard print due to blindness or other visual impairments, including a wide range of large print materials, braille, and recordings. The library also coordinates a reading machine program incorporating 70 sites in public and academic libraries and provides classrooms with books,…

  8. AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment

    PubMed Central

    2011-01-01

    Background Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. Results This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment. Conclusions AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements. PMID:21798025

  9. AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment.

    PubMed

    Stålring, Jonna C; Carlsson, Lars A; Almeida, Pedro; Boyer, Scott

    2011-07-28

    Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment. AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements.

  10. The Efficacy of Machine Learning Programs for Navy Manpower Analysis

    DTIC Science & Technology

    1993-03-01

    This thesis investigated the efficacy of two machine learning programs for Navy manpower analysis. Two machine learning programs, AIM and IXL, were...to generate models from the two commercial machine learning programs. Using a held out sub-set of the data the capabilities of the three models were...partial effects. The author recommended further investigation of AIM’s capabilities, and testing in an operational environment.... Machine learning , AIM, IXL.

  11. 3/29/2018: Making Data Machine-Readable Webinar | National Agricultural

    Science.gov Websites

    Library Skip to main content Home National Agricultural Library United States Department of | USDA.gov | Agricultural Research Service | Plain Language | FOIA | Accessibility Statement | Information

  12. The Security of Machine Learning

    DTIC Science & Technology

    2008-04-24

    Machine learning has become a fundamental tool for computer security, since it can rapidly evolve to changing and complex situations. That...adaptability is also a vulnerability: attackers can exploit machine learning systems. We present a taxonomy identifying and analyzing attacks against machine ...We use our framework to survey and analyze the literature of attacks against machine learning systems. We also illustrate our taxonomy by showing

  13. UAVs and Machine Learning Revolutionising Invasive Grass and Vegetation Surveys in Remote Arid Lands.

    PubMed

    Sandino, Juan; Gonzalez, Felipe; Mengersen, Kerrie; Gaston, Kevin J

    2018-02-16

    The monitoring of invasive grasses and vegetation in remote areas is challenging, costly, and on the ground sometimes dangerous. Satellite and manned aircraft surveys can assist but their use may be limited due to the ground sampling resolution or cloud cover. Straightforward and accurate surveillance methods are needed to quantify rates of grass invasion, offer appropriate vegetation tracking reports, and apply optimal control methods. This paper presents a pipeline process to detect and generate a pixel-wise segmentation of invasive grasses, using buffel grass (Cenchrus ciliaris) and spinifex (Triodia sp.) as examples. The process integrates unmanned aerial vehicles (UAVs) also commonly known as drones, high-resolution red, green, blue colour model (RGB) cameras, and a data processing approach based on machine learning algorithms. The methods are illustrated with data acquired in Cape Range National Park, Western Australia (WA), Australia, orthorectified in Agisoft Photoscan Pro, and processed in Python programming language, scikit-learn, and eXtreme Gradient Boosting (XGBoost) libraries. In total, 342,626 samples were extracted from the obtained data set and labelled into six classes. Segmentation results provided an individual detection rate of 97% for buffel grass and 96% for spinifex, with a global multiclass pixel-wise detection rate of 97%. Obtained results were robust against illumination changes, object rotation, occlusion, background cluttering, and floral density variation.

  14. UAVs and Machine Learning Revolutionising Invasive Grass and Vegetation Surveys in Remote Arid Lands

    PubMed Central

    2018-01-01

    The monitoring of invasive grasses and vegetation in remote areas is challenging, costly, and on the ground sometimes dangerous. Satellite and manned aircraft surveys can assist but their use may be limited due to the ground sampling resolution or cloud cover. Straightforward and accurate surveillance methods are needed to quantify rates of grass invasion, offer appropriate vegetation tracking reports, and apply optimal control methods. This paper presents a pipeline process to detect and generate a pixel-wise segmentation of invasive grasses, using buffel grass (Cenchrus ciliaris) and spinifex (Triodia sp.) as examples. The process integrates unmanned aerial vehicles (UAVs) also commonly known as drones, high-resolution red, green, blue colour model (RGB) cameras, and a data processing approach based on machine learning algorithms. The methods are illustrated with data acquired in Cape Range National Park, Western Australia (WA), Australia, orthorectified in Agisoft Photoscan Pro, and processed in Python programming language, scikit-learn, and eXtreme Gradient Boosting (XGBoost) libraries. In total, 342,626 samples were extracted from the obtained data set and labelled into six classes. Segmentation results provided an individual detection rate of 97% for buffel grass and 96% for spinifex, with a global multiclass pixel-wise detection rate of 97%. Obtained results were robust against illumination changes, object rotation, occlusion, background cluttering, and floral density variation. PMID:29462912

  15. Efficient identification of nationally mandated reportable cancer cases using natural language processing and machine learning.

    PubMed

    Osborne, John D; Wyatt, Matthew; Westfall, Andrew O; Willig, James; Bethard, Steven; Gordon, Geoff

    2016-11-01

    To help cancer registrars efficiently and accurately identify reportable cancer cases. The Cancer Registry Control Panel (CRCP) was developed to detect mentions of reportable cancer cases using a pipeline built on the Unstructured Information Management Architecture - Asynchronous Scaleout (UIMA-AS) architecture containing the National Library of Medicine's UIMA MetaMap annotator as well as a variety of rule-based UIMA annotators that primarily act to filter out concepts referring to nonreportable cancers. CRCP inspects pathology reports nightly to identify pathology records containing relevant cancer concepts and combines this with diagnosis codes from the Clinical Electronic Data Warehouse to identify candidate cancer patients using supervised machine learning. Cancer mentions are highlighted in all candidate clinical notes and then sorted in CRCP's web interface for faster validation by cancer registrars. CRCP achieved an accuracy of 0.872 and detected reportable cancer cases with a precision of 0.843 and a recall of 0.848. CRCP increases throughput by 22.6% over a baseline (manual review) pathology report inspection system while achieving a higher precision and recall. Depending on registrar time constraints, CRCP can increase recall to 0.939 at the expense of precision by incorporating a data source information feature. CRCP demonstrates accurate results when applying natural language processing features to the problem of detecting patients with cases of reportable cancer from clinical notes. We show that implementing only a portion of cancer reporting rules in the form of regular expressions is sufficient to increase the precision, recall, and speed of the detection of reportable cancer cases when combined with off-the-shelf information extraction software and machine learning. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  16. TEES 2.2: Biomedical Event Extraction for Diverse Corpora

    PubMed Central

    2015-01-01

    Background The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. Results The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. Conclusions The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction are presented. PMID:26551925

  17. TEES 2.2: Biomedical Event Extraction for Diverse Corpora.

    PubMed

    Björne, Jari; Salakoski, Tapio

    2015-01-01

    The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction are presented.

  18. Entanglement-Based Machine Learning on a Quantum Computer

    NASA Astrophysics Data System (ADS)

    Cai, X.-D.; Wu, D.; Su, Z.-E.; Chen, M.-C.; Wang, X.-L.; Li, Li; Liu, N.-L.; Lu, C.-Y.; Pan, J.-W.

    2015-03-01

    Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv.1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.

  19. MEDLARS and the Library Community

    PubMed Central

    Adams, Scott

    1964-01-01

    The intention of the National Library of Medicine is to share with other libraries the products and the capabilities developed by the MEDLARS system. MEDLARS will provide bibliographic services of use to other libraries from the central system. The decentralization of the central system to permit libraries with access to computers to establish local machine retrieval systems is also indicated. The implications of such decentralization for the American medical library network and its effect on library evolution are suggested, as are the implications for international development of mechanized storage and retrieval systems. PMID:14119289

  20. Documentation for the machine-readable version of A Library of Stellar Spectra (Jacoby, Hunter and Christian 1984)

    NASA Technical Reports Server (NTRS)

    Warren, W. H., Jr.

    1984-01-01

    The machine readable library as it is currently being distributed from the Astronomical Data Center is described. The library contains digital spectral for 161 stars of spectral classes O through M and luminosity classes 1, 3 and 5 in the wavelength range 3510 A to 7427 A. The resolution is approximately 4.5 A, while the typical photometric uncertainty of each resolution element is approximately 1 percent and broadband variations are 3 percent. The documentation includes a format description, a table of the indigenous characteristics of the magnetic tape file, and a sample listing of logical records exactly as they are recorded on the tape.

  1. Investigation of the Public Library as a Linking Agent to Major Scientific, Educational, Social and Environmental Data Bases. Two-Year Interim Report.

    ERIC Educational Resources Information Center

    Summit, Roger K.; Firschein, Oscar

    Eight public libraries participated in a two-year experiment to investigate the potential of the public library as a "linking agent" between the public and the many machine-readable data bases currently accessible using on line computer terminals. The investigation covered users of the service, impact on the library, conditions for…

  2. The Relationship between Learning Organization Dimensions and Library Performance

    ERIC Educational Resources Information Center

    Haley, Qing Kong

    2010-01-01

    The purpose of this research was to examine the relationship between learning organization dimensions and academic library performance. It studied whether differences existed in learning organization dimensions given the predictor variables of performance indicators, library resources, and demographics of the academic library. This research…

  3. A bottom-up approach to MEDLINE indexing recommendations.

    PubMed

    Jimeno-Yepes, Antonio; Wilkowski, Bartłomiej; Mork, James G; Van Lenten, Elizabeth; Fushman, Dina Demner; Aronson, Alan R

    2011-01-01

    MEDLINE indexing performed by the US National Library of Medicine staff describes the essence of a biomedical publication in about 14 Medical Subject Headings (MeSH). Since 2002, this task is assisted by the Medical Text Indexer (MTI) program. We present a bottom-up approach to MEDLINE indexing in which the abstract is searched for indicators for a specific MeSH recommendation in a two-step process. Supervised machine learning combined with triage rules improves sensitivity of recommendations while keeping the number of recommended terms relatively small. Improvement in recommendations observed in this work warrants further exploration of this approach to MTI recommendations on a larger set of MeSH headings.

  4. A Machine Learning and Optimization Toolkit for the Swarm

    DTIC Science & Technology

    2014-11-17

    Machine   Learning  and  Op0miza0on   Toolkit  for  the  Swarm   Ilge  Akkaya,  Shuhei  Emoto...3. DATES COVERED 00-00-2014 to 00-00-2014 4. TITLE AND SUBTITLE A Machine Learning and Optimization Toolkit for the Swarm 5a. CONTRACT NUMBER... machine   learning   methodologies  by  providing  the  right  interfaces  between   machine   learning  tools  and

  5. Library Information System Time-Sharing (LISTS) Project. Final Report.

    ERIC Educational Resources Information Center

    Black, Donald V.

    The Library Information System Time-Sharing (LISTS) experiment was based on three innovations in data processing technology: (1) the advent of computer time-sharing on third-generation machines, (2) the development of general-purpose file-management software and (3) the introduction of large, library-oriented data bases. The main body of the…

  6. Educational Resources in the ASCC Library

    ERIC Educational Resources Information Center

    Lin, Steven

    2006-01-01

    After two years of construction, American Samoa Community College opened its new library on September 2, 2003. The library is located on the east side of campus and is equipped with ten computer workstations, four online public access catalogs, three copying machines, and an elevator that is in compliance with the Americans with Disabilities Act.…

  7. InfoQUEST: An Online Catalog for Small Libraries.

    ERIC Educational Resources Information Center

    Campbell, Bonnie

    1984-01-01

    InfoQUEST is a microcomputer-based online public access catalog, designed for the small library handling file sizes up to 25,000 records. Based on the IBM-PC, or compatible machines, the system will accept downloading, in batch mode, of records from the library's file on the UTLAS Catalog Support System. (Author/EJS)

  8. Library-Based Learning in an Information Society.

    ERIC Educational Resources Information Center

    Breivik, Patricia Senn

    1986-01-01

    The average academic library has great potential for quality nonclassroom learning benefiting students, faculty, alumni, and the local business community. The major detriments are the limited perceptions about libraries and librarians among campus administrators and faculty. Library-based learning should be planned to be assimilated into overall…

  9. Simple Machines. Physical Science in Action[TM]. Schlessinger Science Library. [Videotape].

    ERIC Educational Resources Information Center

    2000

    In today's world, kids are aware that there are machines all around them. What they may not realize is that the function of all machines is to make work easier in some way. Simple Machines uses engaging visuals and colorful graphics to explain the concept of work and how humans use certain basic tools to help get work done. Students will learn…

  10. All about Simple Machines. Physical Science for Children[TM]. Schlessinger Science Library. [Videotape].

    ERIC Educational Resources Information Center

    2000

    All kids know the word "work." But they probably don't understand that work happens whenever a force is used to move something--whether it's lifting a heavy object or playing on a see-saw. All About Simple Machines introduces kids to the concepts of forces, work and how machines are used to make work easier. Six simple machines are…

  11. The JPL Library Information Retrieval System

    ERIC Educational Resources Information Center

    Walsh, Josephine

    1975-01-01

    The development, capabilities, and products of the computer-based retrieval system of the Jet Propulsion Laboratory Library are described. The system handles books and documents, produces a book catalog, and provides a machine search capability. (Author)

  12. Machine Learning Methods for Analysis of Metabolic Data and Metabolic Pathway Modeling

    PubMed Central

    Cuperlovic-Culf, Miroslava

    2018-01-01

    Machine learning uses experimental data to optimize clustering or classification of samples or features, or to develop, augment or verify models that can be used to predict behavior or properties of systems. It is expected that machine learning will help provide actionable knowledge from a variety of big data including metabolomics data, as well as results of metabolism models. A variety of machine learning methods has been applied in bioinformatics and metabolism analyses including self-organizing maps, support vector machines, the kernel machine, Bayesian networks or fuzzy logic. To a lesser extent, machine learning has also been utilized to take advantage of the increasing availability of genomics and metabolomics data for the optimization of metabolic network models and their analysis. In this context, machine learning has aided the development of metabolic networks, the calculation of parameters for stoichiometric and kinetic models, as well as the analysis of major features in the model for the optimal application of bioreactors. Examples of this very interesting, albeit highly complex, application of machine learning for metabolism modeling will be the primary focus of this review presenting several different types of applications for model optimization, parameter determination or system analysis using models, as well as the utilization of several different types of machine learning technologies. PMID:29324649

  13. Machine Learning Methods for Analysis of Metabolic Data and Metabolic Pathway Modeling.

    PubMed

    Cuperlovic-Culf, Miroslava

    2018-01-11

    Machine learning uses experimental data to optimize clustering or classification of samples or features, or to develop, augment or verify models that can be used to predict behavior or properties of systems. It is expected that machine learning will help provide actionable knowledge from a variety of big data including metabolomics data, as well as results of metabolism models. A variety of machine learning methods has been applied in bioinformatics and metabolism analyses including self-organizing maps, support vector machines, the kernel machine, Bayesian networks or fuzzy logic. To a lesser extent, machine learning has also been utilized to take advantage of the increasing availability of genomics and metabolomics data for the optimization of metabolic network models and their analysis. In this context, machine learning has aided the development of metabolic networks, the calculation of parameters for stoichiometric and kinetic models, as well as the analysis of major features in the model for the optimal application of bioreactors. Examples of this very interesting, albeit highly complex, application of machine learning for metabolism modeling will be the primary focus of this review presenting several different types of applications for model optimization, parameter determination or system analysis using models, as well as the utilization of several different types of machine learning technologies.

  14. A machine learning model with human cognitive biases capable of learning from small and biased datasets.

    PubMed

    Taniguchi, Hidetaka; Sato, Hiroshi; Shirakawa, Tomohiro

    2018-05-09

    Human learners can generalize a new concept from a small number of samples. In contrast, conventional machine learning methods require large amounts of data to address the same types of problems. Humans have cognitive biases that promote fast learning. Here, we developed a method to reduce the gap between human beings and machines in this type of inference by utilizing cognitive biases. We implemented a human cognitive model into machine learning algorithms and compared their performance with the currently most popular methods, naïve Bayes, support vector machine, neural networks, logistic regression and random forests. We focused on the task of spam classification, which has been studied for a long time in the field of machine learning and often requires a large amount of data to obtain high accuracy. Our models achieved superior performance with small and biased samples in comparison with other representative machine learning methods.

  15. Machine learning: novel bioinformatics approaches for combating antimicrobial resistance.

    PubMed

    Macesic, Nenad; Polubriaginof, Fernanda; Tatonetti, Nicholas P

    2017-12-01

    Antimicrobial resistance (AMR) is a threat to global health and new approaches to combating AMR are needed. Use of machine learning in addressing AMR is in its infancy but has made promising steps. We reviewed the current literature on the use of machine learning for studying bacterial AMR. The advent of large-scale data sets provided by next-generation sequencing and electronic health records make applying machine learning to the study and treatment of AMR possible. To date, it has been used for antimicrobial susceptibility genotype/phenotype prediction, development of AMR clinical decision rules, novel antimicrobial agent discovery and antimicrobial therapy optimization. Application of machine learning to studying AMR is feasible but remains limited. Implementation of machine learning in clinical settings faces barriers to uptake with concerns regarding model interpretability and data quality.Future applications of machine learning to AMR are likely to be laboratory-based, such as antimicrobial susceptibility phenotype prediction.

  16. Next-Generation Machine Learning for Biological Networks.

    PubMed

    Camacho, Diogo M; Collins, Katherine M; Powers, Rani K; Costello, James C; Collins, James J

    2018-06-14

    Machine learning, a collection of data-analytical techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology. Copyright © 2018 Elsevier Inc. All rights reserved.

  17. Comparison between extreme learning machine and wavelet neural networks in data classification

    NASA Astrophysics Data System (ADS)

    Yahia, Siwar; Said, Salwa; Jemai, Olfa; Zaied, Mourad; Ben Amar, Chokri

    2017-03-01

    Extreme learning Machine is a well known learning algorithm in the field of machine learning. It's about a feed forward neural network with a single-hidden layer. It is an extremely fast learning algorithm with good generalization performance. In this paper, we aim to compare the Extreme learning Machine with wavelet neural networks, which is a very used algorithm. We have used six benchmark data sets to evaluate each technique. These datasets Including Wisconsin Breast Cancer, Glass Identification, Ionosphere, Pima Indians Diabetes, Wine Recognition and Iris Plant. Experimental results have shown that both extreme learning machine and wavelet neural networks have reached good results.

  18. MLBCD: a machine learning tool for big clinical data.

    PubMed

    Luo, Gang

    2015-01-01

    Predictive modeling is fundamental for extracting value from large clinical data sets, or "big clinical data," advancing clinical research, and improving healthcare. Machine learning is a powerful approach to predictive modeling. Two factors make machine learning challenging for healthcare researchers. First, before training a machine learning model, the values of one or more model parameters called hyper-parameters must typically be specified. Due to their inexperience with machine learning, it is hard for healthcare researchers to choose an appropriate algorithm and hyper-parameter values. Second, many clinical data are stored in a special format. These data must be iteratively transformed into the relational table format before conducting predictive modeling. This transformation is time-consuming and requires computing expertise. This paper presents our vision for and design of MLBCD (Machine Learning for Big Clinical Data), a new software system aiming to address these challenges and facilitate building machine learning predictive models using big clinical data. The paper describes MLBCD's design in detail. By making machine learning accessible to healthcare researchers, MLBCD will open the use of big clinical data and increase the ability to foster biomedical discovery and improve care.

  19. Machine Learning and Radiology

    PubMed Central

    Wang, Shijun; Summers, Ronald M.

    2012-01-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077

  20. Alternative Models of Service, Centralized Machine Operations. Phase II Report. Volume II.

    ERIC Educational Resources Information Center

    Technology Management Corp., Alexandria, VA.

    A study was conducted to determine if the centralization of playback machine operations for the national free library program would be feasible, economical, and desirable. An alternative model of playback machine services was constructed and compared with existing network operations considering both cost and service. The alternative model was…

  1. Evaluating the Security of Machine Learning Algorithms

    DTIC Science & Technology

    2008-05-20

    Two far-reaching trends in computing have grown in significance in recent years. First, statistical machine learning has entered the mainstream as a...computing applications. The growing intersection of these trends compels us to investigate how well machine learning performs under adversarial conditions... machine learning has a structure that we can use to build secure learning systems. This thesis makes three high-level contributions. First, we develop a

  2. Chronology of Milestones for Libraries and Adult Lifelong Learning and Literacy.

    ERIC Educational Resources Information Center

    McCook, Kathleen de la Pena; Barber, Peggy

    This chronology highlights milestones for libraries and adult lifelong learning and literacy from 1924-2001, including the following events: William S. Learned's "The American Public Library and the Diffusion of Knowledge" is published (1924); establishment of the ALA (American Library Association) Adult Education Section (1946); the…

  3. School Libraries and Student Learning: A Guide for School Leaders

    ERIC Educational Resources Information Center

    Morris, Rebecca J.

    2015-01-01

    Innovative, well-designed school library programs can be critical resources for helping students meet high standards of college and career readiness. In "School Libraries and Student Learning", Rebecca J. Morris shows how school leaders can make the most of their school libraries to support ambitious student learning. She offers…

  4. Using human brain activity to guide machine learning.

    PubMed

    Fong, Ruth C; Scheirer, Walter J; Cox, David D

    2018-03-29

    Machine learning is a field of computer science that builds algorithms that learn. In many cases, machine learning algorithms are used to recreate a human ability like adding a caption to a photo, driving a car, or playing a game. While the human brain has long served as a source of inspiration for machine learning, little effort has been made to directly use data collected from working brains as a guide for machine learning algorithms. Here we demonstrate a new paradigm of "neurally-weighted" machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain. After training, these neurally-weighted classifiers are able to classify images without requiring any additional neural data. We show that our neural-weighting approach can lead to large performance gains when used with traditional machine vision features, as well as to significant improvements with already high-performing convolutional neural network features. The effectiveness of this approach points to a path forward for a new class of hybrid machine learning algorithms which take both inspiration and direct constraints from neuronal data.

  5. Design of an Automated Library Information Storage and Retrieval System for Library of Congress Division for the Blind and Physically Handicapped (DBPH). Final Report.

    ERIC Educational Resources Information Center

    Systems Architects, Inc., Randolph, MA.

    A practical system for producing a union catalog of titles in the collections of the Library of Congress Division for the Blind and Physically Handicapped (DBPH), its regional network, and related agencies from a machine-readable data base is presented. The DBPH organization and operations and the associated regional library network are analyzed.…

  6. Function library programming to support B89 evaluation of Sheffield Apollo RS50 DCC (Direct Computer Control) CMM (Coordinate Measuring Machine)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frank, R.N.

    1990-02-28

    The Inspection Shop at Lawrence Livermore Lab recently purchased a Sheffield Apollo RS50 Direct Computer Control Coordinate Measuring Machine. The performance of the machine was specified to conform to B89 standard which relies heavily upon using the measuring machine in its intended manner to verify its accuracy (rather than parametric tests). Although it would be possible to use the interactive measurement system to perform these tasks, a more thorough and efficient job can be done by creating Function Library programs for certain tasks which integrate Hewlett-Packard Basic 5.0 language and calls to proprietary analysis and machine control routines. This combinationmore » provides efficient use of the measuring machine with a minimum of keyboard input plus an analysis of the data with respect to the B89 Standard rather than a CMM analysis which would require subsequent interpretation. This paper discusses some characteristics of the Sheffield machine control and analysis software and my use of H-P Basic language to create automated measurement programs to support the B89 performance evaluation of the CMM. 1 ref.« less

  7. Quantum-Enhanced Machine Learning

    NASA Astrophysics Data System (ADS)

    Dunjko, Vedran; Taylor, Jacob M.; Briegel, Hans J.

    2016-09-01

    The emerging field of quantum machine learning has the potential to substantially aid in the problems and scope of artificial intelligence. This is only enhanced by recent successes in the field of classical machine learning. In this work we propose an approach for the systematic treatment of machine learning, from the perspective of quantum information. Our approach is general and covers all three main branches of machine learning: supervised, unsupervised, and reinforcement learning. While quantum improvements in supervised and unsupervised learning have been reported, reinforcement learning has received much less attention. Within our approach, we tackle the problem of quantum enhancements in reinforcement learning as well, and propose a systematic scheme for providing improvements. As an example, we show that quadratic improvements in learning efficiency, and exponential improvements in performance over limited time periods, can be obtained for a broad class of learning problems.

  8. Perl Extension to the Bproc Library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grunau, Daryl W.

    2004-06-07

    The Beowulf Distributed process Space (Bproc) software stack is comprised of UNIX/Linux kernel modifications and a support library by which a cluster of machines, each running their own private kernel, can present itself as a unified process space to the user. A Bproc cluster contains a single front-end machine and many back-end nodes which receive and run processes given to them by the front-end. Any process which is migrated to a back-end node is also visible as a ghost process on the fron-end, and may be controlled there using traditional UNIX semantics (e.g. ps(1), kill(1), etc). This software is amore » Perl extension to the Bproc library which enables the Perl programmer to make direct calls to functions within the Bproc library. See http://www.clustermatic.org, http://bproc.sourceforge.net, and http://www.perl.org« less

  9. Myths and legends in learning classification rules

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1990-01-01

    A discussion is presented of machine learning theory on empirically learning classification rules. Six myths are proposed in the machine learning community that address issues of bias, learning as search, computational learning theory, Occam's razor, universal learning algorithms, and interactive learning. Some of the problems raised are also addressed from a Bayesian perspective. Questions are suggested that machine learning researchers should be addressing both theoretically and experimentally.

  10. Machine Learning Based Malware Detection

    DTIC Science & Technology

    2015-05-18

    A TRIDENT SCHOLAR PROJECT REPORT NO. 440 Machine Learning Based Malware Detection by Midshipman 1/C Zane A. Markel, USN...COVERED (From - To) 4. TITLE AND SUBTITLE Machine Learning Based Malware Detection 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM...suitably be projected into realistic performance. This work explores several aspects of machine learning based malware detection . First, we

  11. Interpreting Medical Information Using Machine Learning and Individual Conditional Expectation.

    PubMed

    Nohara, Yasunobu; Wakata, Yoshifumi; Nakashima, Naoki

    2015-01-01

    Recently, machine-learning techniques have spread many fields. However, machine-learning is still not popular in medical research field due to difficulty of interpreting. In this paper, we introduce a method of interpreting medical information using machine learning technique. The method gave new explanation of partial dependence plot and individual conditional expectation plot from medical research field.

  12. Machine Learning Applications to Resting-State Functional MR Imaging Analysis.

    PubMed

    Billings, John M; Eder, Maxwell; Flood, William C; Dhami, Devendra Singh; Natarajan, Sriraam; Whitlow, Christopher T

    2017-11-01

    Machine learning is one of the most exciting and rapidly expanding fields within computer science. Academic and commercial research entities are investing in machine learning methods, especially in personalized medicine via patient-level classification. There is great promise that machine learning methods combined with resting state functional MR imaging will aid in diagnosis of disease and guide potential treatment for conditions thought to be impossible to identify based on imaging alone, such as psychiatric disorders. We discuss machine learning methods and explore recent advances. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Source localization in an ocean waveguide using supervised machine learning.

    PubMed

    Niu, Haiqiang; Reeves, Emma; Gerstoft, Peter

    2017-09-01

    Source localization in ocean acoustics is posed as a machine learning problem in which data-driven methods learn source ranges directly from observed acoustic data. The pressure received by a vertical linear array is preprocessed by constructing a normalized sample covariance matrix and used as the input for three machine learning methods: feed-forward neural networks (FNN), support vector machines (SVM), and random forests (RF). The range estimation problem is solved both as a classification problem and as a regression problem by these three machine learning algorithms. The results of range estimation for the Noise09 experiment are compared for FNN, SVM, RF, and conventional matched-field processing and demonstrate the potential of machine learning for underwater source localization.

  14. Public Libraries and Community-Based Education: Making the Connection for Lifelong Learning. Volume 2: Commissioned Papers. A Conference Sponsored by the National Institute on Postsecondary Education, Libraries, and Lifelong Learning, Office of Educational Research and Improvement (Washington, D.C., April 12-13, 1995).

    ERIC Educational Resources Information Center

    National Inst. on Postsecondary Education, Libraries, and Lifelong Learning (ED/OERI), Washington, DC.

    This conference explored the relationship between the public library, community-based adult education, and lifelong learning. The eight commissioned papers presented include: "Community Based Adult Jewish Learning Program: Issues and Concerns" (Paul A. Flexner); "Rural and Small Libraries: Provisions for Lifelong Learning" (Bernard Vavrek);…

  15. jCompoundMapper: An open source Java library and command-line tool for chemical fingerprints

    PubMed Central

    2011-01-01

    Background The decomposition of a chemical graph is a convenient approach to encode information of the corresponding organic compound. While several commercial toolkits exist to encode molecules as so-called fingerprints, only a few open source implementations are available. The aim of this work is to introduce a library for exactly defined molecular decompositions, with a strong focus on the application of these features in machine learning and data mining. It provides several options such as search depth, distance cut-offs, atom- and pharmacophore typing. Furthermore, it provides the functionality to combine, to compare, or to export the fingerprints into several formats. Results We provide a Java 1.6 library for the decomposition of chemical graphs based on the open source Chemistry Development Kit toolkit. We reimplemented popular fingerprinting algorithms such as depth-first search fingerprints, extended connectivity fingerprints, autocorrelation fingerprints (e.g. CATS2D), radial fingerprints (e.g. Molprint2D), geometrical Molprint, atom pairs, and pharmacophore fingerprints. We also implemented custom fingerprints such as the all-shortest path fingerprint that only includes the subset of shortest paths from the full set of paths of the depth-first search fingerprint. As an application of jCompoundMapper, we provide a command-line executable binary. We measured the conversion speed and number of features for each encoding and described the composition of the features in detail. The quality of the encodings was tested using the default parametrizations in combination with a support vector machine on the Sutherland QSAR data sets. Additionally, we benchmarked the fingerprint encodings on the large-scale Ames toxicity benchmark using a large-scale linear support vector machine. The results were promising and could often compete with literature results. On the large Ames benchmark, for example, we obtained an AUC ROC performance of 0.87 with a reimplementation of the extended connectivity fingerprint. This result is comparable to the performance achieved by a non-linear support vector machine using state-of-the-art descriptors. On the Sutherland QSAR data set, the best fingerprint encodings showed a comparable or better performance on 5 of the 8 benchmarks when compared against the results of the best descriptors published in the paper of Sutherland et al. Conclusions jCompoundMapper is a library for chemical graph fingerprints with several tweaking possibilities and exporting options for open source data mining toolkits. The quality of the data mining results, the conversion speed, the LPGL software license, the command-line interface, and the exporters should be useful for many applications in cheminformatics like benchmarks against literature methods, comparison of data mining algorithms, similarity searching, and similarity-based data mining. PMID:21219648

  16. Machine Learning for Medical Imaging

    PubMed Central

    Korfiatis, Panagiotis; Akkus, Zeynettin; Kline, Timothy L.

    2017-01-01

    Machine learning is a technique for recognizing patterns that can be applied to medical images. Although it is a powerful tool that can help in rendering medical diagnoses, it can be misapplied. Machine learning typically begins with the machine learning algorithm system computing the image features that are believed to be of importance in making the prediction or diagnosis of interest. The machine learning algorithm system then identifies the best combination of these image features for classifying the image or computing some metric for the given image region. There are several methods that can be used, each with different strengths and weaknesses. There are open-source versions of most of these machine learning methods that make them easy to try and apply to images. Several metrics for measuring the performance of an algorithm exist; however, one must be aware of the possible associated pitfalls that can result in misleading metrics. More recently, deep learning has started to be used; this method has the benefit that it does not require image feature identification and calculation as a first step; rather, features are identified as part of the learning process. Machine learning has been used in medical imaging and will have a greater influence in the future. Those working in medical imaging must be aware of how machine learning works. ©RSNA, 2017 PMID:28212054

  17. Machine Learning for Medical Imaging.

    PubMed

    Erickson, Bradley J; Korfiatis, Panagiotis; Akkus, Zeynettin; Kline, Timothy L

    2017-01-01

    Machine learning is a technique for recognizing patterns that can be applied to medical images. Although it is a powerful tool that can help in rendering medical diagnoses, it can be misapplied. Machine learning typically begins with the machine learning algorithm system computing the image features that are believed to be of importance in making the prediction or diagnosis of interest. The machine learning algorithm system then identifies the best combination of these image features for classifying the image or computing some metric for the given image region. There are several methods that can be used, each with different strengths and weaknesses. There are open-source versions of most of these machine learning methods that make them easy to try and apply to images. Several metrics for measuring the performance of an algorithm exist; however, one must be aware of the possible associated pitfalls that can result in misleading metrics. More recently, deep learning has started to be used; this method has the benefit that it does not require image feature identification and calculation as a first step; rather, features are identified as part of the learning process. Machine learning has been used in medical imaging and will have a greater influence in the future. Those working in medical imaging must be aware of how machine learning works. © RSNA, 2017.

  18. Student Learning and the College Library: An Annotated Bibliography.

    ERIC Educational Resources Information Center

    Shklanka, Olga

    The purpose of this annotated bibliography is twofold: (1) to identify which educational and library science literature deals with the learning needs of college students in libraries, and (2) to identify the extent to which library services have been integrated into the educational objectives and learning practices of Canadian community colleges.…

  19. New Essentials for the Library as Learning Commons

    ERIC Educational Resources Information Center

    Gamble, Jay

    2011-01-01

    Library teachers can no longer do business as usual--their students' deep learning is at risk! Expanding current collection of books, CDs, and DVDs is progress, but library teachers cannot allow their facilities to become warehouses for information resources. Their library learning commons should be a place buzzing with the hustle and bustle of…

  20. Academic Library Spaces: Advancing Student Success and Helping Students Thrive

    ERIC Educational Resources Information Center

    Spencer, Mary Ellen; Watstein, Sarah Barbara

    2017-01-01

    Are today's academic libraries really designed for learning? Do library spaces impact student learning? Intending to spark broader and more informed dialogue about the relationship between the quality of learning and the quality of academic library spaces in higher education, the authors consider the concept of space as service; student learning…

  1. Machine learning in heart failure: ready for prime time.

    PubMed

    Awan, Saqib Ejaz; Sohel, Ferdous; Sanfilippo, Frank Mario; Bennamoun, Mohammed; Dwivedi, Girish

    2018-03-01

    The aim of this review is to present an up-to-date overview of the application of machine learning methods in heart failure including diagnosis, classification, readmissions and medication adherence. Recent studies have shown that the application of machine learning techniques may have the potential to improve heart failure outcomes and management, including cost savings by improving existing diagnostic and treatment support systems. Recently developed deep learning methods are expected to yield even better performance than traditional machine learning techniques in performing complex tasks by learning the intricate patterns hidden in big medical data. The review summarizes the recent developments in the application of machine and deep learning methods in heart failure management.

  2. Human Machine Learning Symbiosis

    ERIC Educational Resources Information Center

    Walsh, Kenneth R.; Hoque, Md Tamjidul; Williams, Kim H.

    2017-01-01

    Human Machine Learning Symbiosis is a cooperative system where both the human learner and the machine learner learn from each other to create an effective and efficient learning environment adapted to the needs of the human learner. Such a system can be used in online learning modules so that the modules adapt to each learner's learning state both…

  3. Film Format Pandemonium

    ERIC Educational Resources Information Center

    Malczewski, Benjamin

    2010-01-01

    Every day, Americans borrow 2.1 million DVDs from public libraries. Barely beating libraries out for the top spot is Netflix, at 2.2 million daily rentals. This is according to the OCLC 2010 survey "How Libraries Stack Up." (Redbox vending-machine rentals lag behind at 1.1 million, while DVD rental chain Blockbuster is no longer in the running,…

  4. Experiences Using an Open Source Software Library to Teach Computer Vision Subjects

    ERIC Educational Resources Information Center

    Cazorla, Miguel; Viejo, Diego

    2015-01-01

    Machine vision is an important subject in computer science and engineering degrees. For laboratory experimentation, it is desirable to have a complete and easy-to-use tool. In this work we present a Java library, oriented to teaching computer vision. We have designed and built the library from the scratch with emphasis on readability and…

  5. California State Library: Processing Center Design and Specifications. Volume I, System Description and Input Processing.

    ERIC Educational Resources Information Center

    Sherman, Don; Shoffner, Ralph M.

    The scope of the California State Library-Processing Center (CSL-PC) project is to develop the design and specifications for a computerized technical processing center to provide services to a network of participating California libraries. Immediate objectives are: (1) retrospective conversion of card catalogs to a machine-form data base,…

  6. Machine learning in cardiovascular medicine: are we there yet?

    PubMed

    Shameer, Khader; Johnson, Kipp W; Glicksberg, Benjamin S; Dudley, Joel T; Sengupta, Partho P

    2018-01-19

    Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strength and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  7. The X-ray system of crystallographic programs for any computer having a PIDGIN FORTRAN compiler

    NASA Technical Reports Server (NTRS)

    Stewart, J. M.; Kruger, G. J.; Ammon, H. L.; Dickinson, C.; Hall, S. R.

    1972-01-01

    A manual is presented for the use of a library of crystallographic programs. This library, called the X-ray system, is designed to carry out the calculations required to solve the structure of crystals by diffraction techniques. It has been implemented at the University of Maryland on the Univac 1108. It has, however, been developed and run on a variety of machines under various operating systems. It is considered to be an essentially machine independent library of applications programs. The report includes definition of crystallographic computing terms, program descriptions, with some text to show their application to specific crystal problems, detailed card input descriptions, mass storage file structure and some example run streams.

  8. Genome-wide mutant profiling predicts the mechanism of a Lipid II binding antibiotic.

    PubMed

    Santiago, Marina; Lee, Wonsik; Fayad, Antoine Abou; Coe, Kathryn A; Rajagopal, Mithila; Do, Truc; Hennessen, Fabienne; Srisuknimit, Veerasak; Müller, Rolf; Meredith, Timothy C; Walker, Suzanne

    2018-06-01

    Identifying targets of antibacterial compounds remains a challenging step in the development of antibiotics. We have developed a two-pronged functional genomics approach to predict mechanism of action that uses mutant fitness data from antibiotic-treated transposon libraries containing both upregulation and inactivation mutants. We treated a Staphylococcus aureus transposon library containing 690,000 unique insertions with 32 antibiotics. Upregulation signatures identified from directional biases in insertions revealed known molecular targets and resistance mechanisms for the majority of these. Because single-gene upregulation does not always confer resistance, we used a complementary machine-learning approach to predict the mechanism from inactivation mutant fitness profiles. This approach suggested the cell wall precursor Lipid II as the molecular target of the lysocins, a mechanism we have confirmed. We conclude that docking to membrane-anchored Lipid II precedes the selective bacteriolysis that distinguishes these lytic natural products, showing the utility of our approach for nominating the antibiotic mechanism of action.

  9. Data Mining and Machine Learning Tools for Combinatorial Material Science of All-Oxide Photovoltaic Cells.

    PubMed

    Yosipof, Abraham; Nahum, Oren E; Anderson, Assaf Y; Barad, Hannah-Noa; Zaban, Arie; Senderowitz, Hanoch

    2015-06-01

    Growth in energy demands, coupled with the need for clean energy, are likely to make solar cells an important part of future energy resources. In particular, cells entirely made of metal oxides (MOs) have the potential to provide clean and affordable energy if their power conversion efficiencies are improved. Such improvements require the development of new MOs which could benefit from combining combinatorial material sciences for producing solar cells libraries with data mining tools to direct synthesis efforts. In this work we developed a data mining workflow and applied it to the analysis of two recently reported solar cell libraries based on Titanium and Copper oxides. Our results demonstrate that QSAR models with good prediction statistics for multiple solar cells properties could be developed and that these models highlight important factors affecting these properties in accord with experimental findings. The resulting models are therefore suitable for designing better solar cells. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Large-scale image-based profiling of single-cell phenotypes in arrayed CRISPR-Cas9 gene perturbation screens.

    PubMed

    de Groot, Reinoud; Lüthi, Joel; Lindsay, Helen; Holtackers, René; Pelkmans, Lucas

    2018-01-23

    High-content imaging using automated microscopy and computer vision allows multivariate profiling of single-cell phenotypes. Here, we present methods for the application of the CISPR-Cas9 system in large-scale, image-based, gene perturbation experiments. We show that CRISPR-Cas9-mediated gene perturbation can be achieved in human tissue culture cells in a timeframe that is compatible with image-based phenotyping. We developed a pipeline to construct a large-scale arrayed library of 2,281 sequence-verified CRISPR-Cas9 targeting plasmids and profiled this library for genes affecting cellular morphology and the subcellular localization of components of the nuclear pore complex (NPC). We conceived a machine-learning method that harnesses genetic heterogeneity to score gene perturbations and identify phenotypically perturbed cells for in-depth characterization of gene perturbation effects. This approach enables genome-scale image-based multivariate gene perturbation profiling using CRISPR-Cas9. © 2018 The Authors. Published under the terms of the CC BY 4.0 license.

  11. Envisioning a Literacy Partnership: The University of Nebraska at Omaha's Criss Library and Girls Inc.

    ERIC Educational Resources Information Center

    Neujahr, Joyce; Hillyer, Nora; Cast-Brede, Melissa

    2013-01-01

    The history of academic library involvement in service learning is varied. This paper provides an overview of service learning and the literature on academic libraries' participation in service-learning activities. A vision of service-learning participation is described, as well as the implementation of service-learning activities in two library…

  12. Engineering a lunar photolithoautotroph to thrive on the moon - life or simulacrum?

    NASA Astrophysics Data System (ADS)

    Ellery, A. A.

    2018-07-01

    Recent work in developing self-replicating machines has approached the problem as an engineering problem, using engineering materials and methods to implement an engineering analogue of a hitherto uniquely biological function. The question is - can anything be learned that might be relevant to an astrobiological context in which the problem is to determine the general form of biology independent of the Earth. Compared with other non-terrestrial biology disciplines, engineered life is more demanding. Engineering a self-replicating machine tackles real environments unlike artificial life which avoids the problem of physical instantiation altogether by examining software models. Engineering a self-replicating machine is also more demanding than synthetic biology as no library of functional components exists. Everything must be constructed de novo. Biological systems already have the capacity to self-replicate but no engineered machine has yet been constructed with the same ability - this is our primary goal. On the basis of the von Neumann analysis of self-replication, self-replication is a by-product of universal construction capability - a universal constructor is a machine that can construct anything (in a functional sense) given the appropriate instructions (DNA/RNA), energy (ATP) and materials (food). In the biological cell, the universal construction mechanism is the ribosome. The ribosome is a biological assembly line for constructing proteins while DNA constitutes a design specification. For a photoautotroph, the energy source is ambient and the food is inorganic. We submit that engineering a self-replicating machine opens up new areas of astrobiology to be explored in the limits of life.

  13. From machine learning to deep learning: progress in machine intelligence for rational drug discovery.

    PubMed

    Zhang, Lu; Tan, Jianjun; Han, Dan; Zhu, Hao

    2017-11-01

    Machine intelligence, which is normally presented as artificial intelligence, refers to the intelligence exhibited by computers. In the history of rational drug discovery, various machine intelligence approaches have been applied to guide traditional experiments, which are expensive and time-consuming. Over the past several decades, machine-learning tools, such as quantitative structure-activity relationship (QSAR) modeling, were developed that can identify potential biological active molecules from millions of candidate compounds quickly and cheaply. However, when drug discovery moved into the era of 'big' data, machine learning approaches evolved into deep learning approaches, which are a more powerful and efficient way to deal with the massive amounts of data generated from modern drug discovery approaches. Here, we summarize the history of machine learning and provide insight into recently developed deep learning approaches and their applications in rational drug discovery. We suggest that this evolution of machine intelligence now provides a guide for early-stage drug design and discovery in the current big data era. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Myths and legends in learning classification rules

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1990-01-01

    This paper is a discussion of machine learning theory on empirically learning classification rules. The paper proposes six myths in the machine learning community that address issues of bias, learning as search, computational learning theory, Occam's razor, 'universal' learning algorithms, and interactive learnings. Some of the problems raised are also addressed from a Bayesian perspective. The paper concludes by suggesting questions that machine learning researchers should be addressing both theoretically and experimentally.

  15. Machine learning and radiology.

    PubMed

    Wang, Shijun; Summers, Ronald M

    2012-07-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. Copyright © 2012. Published by Elsevier B.V.

  16. Monash University Library and Learning: A New Paradigm for a New Age

    ERIC Educational Resources Information Center

    Smith, Lisa

    2011-01-01

    This article describes the expansion of Monash University Library's role to incorporate learning skills services, programs, and resources, within the context of the University's evolving learning landscape. It explains the Library's now holistic approach to students' development of information research and learning skills as interconnected skills…

  17. Automation's Effect on Library Personnel.

    ERIC Educational Resources Information Center

    Dakshinamurti, Ganga

    1985-01-01

    Reports on survey studying the human-machine interface in Canadian university, public, and special libraries. Highlights include position category and educational background of 118 participants, participants' feelings toward automation, physical effects of automation, diffusion in decision making, interpersonal communication, future trends,…

  18. Creating Library Interiors: Planning and Design Considerations.

    ERIC Educational Resources Information Center

    Jones, Plummer Alston, Jr.; Barton, Phillip K.

    1997-01-01

    Examines design considerations for public library interiors: access; acoustical treatment; assignable and nonassignable space; building interiors: ceilings, clocks, color, control, drinking fountains; exhibit space: slotwall display, floor coverings, floor loading, furniture, lighting, mechanical systems, public address, copying machines,…

  19. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    NASA Astrophysics Data System (ADS)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    Since many years ago, the scientific community is concerned about how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using a SVM classifier on data sets of different sizes for different cluster configurations demonstrates the potential of the tool, as well as aspects that affect its performance.

  20. From Libraries to Learning "Libratories:" The New ABC's of 21st-Century School Libraries

    ERIC Educational Resources Information Center

    Trilling, Bernie

    2010-01-01

    Libraries are evolving into learning laboratories or "libratories"--environments where a wide variety of learning activities and projects can take place. Part project space, part design studio, part community meeting and presentation space, and part research and development lab, libraries of the future will have a new alphabet of services--the new…

  1. Microform Reader Maintenance.

    ERIC Educational Resources Information Center

    Hall, Hal W.; Michaels, George H.

    1985-01-01

    Describes experiences in organizing a program of microform reader and reader/printer maintenance at Texas A & M's Sterling C. Evans Library and offers guidelines for regular machine maintenance and repair. Guidelines discussed relate to maintenance philosophy, general machine cleaning, troubleshooting, service contracts, supplies,…

  2. Earthquake prediction in California using regression algorithms and cloud-based big data infrastructure

    NASA Astrophysics Data System (ADS)

    Asencio-Cortés, G.; Morales-Esteban, A.; Shang, X.; Martínez-Álvarez, F.

    2018-06-01

    Earthquake magnitude prediction is a challenging problem that has been widely studied during the last decades. Statistical, geophysical and machine learning approaches can be found in literature, with no particularly satisfactory results. In recent years, powerful computational techniques to analyze big data have emerged, making possible the analysis of massive datasets. These new methods make use of physical resources like cloud based architectures. California is known for being one of the regions with highest seismic activity in the world and many data are available. In this work, the use of several regression algorithms combined with ensemble learning is explored in the context of big data (1 GB catalog is used), in order to predict earthquakes magnitude within the next seven days. Apache Spark framework, H2 O library in R language and Amazon cloud infrastructure were been used, reporting very promising results.

  3. Libraries and Learning

    ERIC Educational Resources Information Center

    Rainie, Lee

    2016-01-01

    The majority of Americans think local libraries serve the educational needs of their communities and families pretty well and library users often outpace others in learning activities. But many do not know about key education services libraries provide. This report provides statistics on library usage and presents key education services provided…

  4. Applications of Machine Learning and Rule Induction,

    DTIC Science & Technology

    1995-02-15

    An important area of application for machine learning is in automating the acquisition of knowledge bases required for expert systems. In this paper...we review the major paradigms for machine learning , including neural networks, instance-based methods, genetic learning, rule induction, and analytic

  5. Instructional Learning Objects in the Digital Classroom: Effectively Measuring Impact on Student Success

    ERIC Educational Resources Information Center

    Contrino, Jacline L.

    2016-01-01

    Demonstrating library impact on student success is critical for all academic libraries today. This article discusses how the library of a large online university serving non-traditional students evaluated how customized point-of-need learning objects (LOs) embedded in the learning management system impacted student learning. Using a comprehensive…

  6. Learning Outcomes and Affective Factors of Blended Learning of English for Library Science

    ERIC Educational Resources Information Center

    Wentao, Chen; Jinyu, Zhang; Zhonggen, Yu

    2016-01-01

    English for Library Science is an essential course for students to command comprehensive scope of library knowledge. This study aims to compare the learning outcomes, gender differences and affective factors in the environments of blended and traditional learning. Around one thousand participants from one university were randomly selected to…

  7. An Intelligent Mobile Location-Aware Book Recommendation System that Enhances Problem-Based Learning in Libraries

    ERIC Educational Resources Information Center

    Chen, Chih-Ming

    2013-01-01

    Despite rapid and continued adoption of mobile devices, few learning modes integrate with mobile technologies and libraries' environments as innovative learning modes that emphasize the key roles of libraries in facilitating learning. In addition, some education experts have claimed that transmitting knowledge to learners is not the only…

  8. Systems Analysis, Machineable Circulation Data and Library Users and Non-Users.

    ERIC Educational Resources Information Center

    Lubans, John, Jr.

    A study to be made with computer-based circulation data of the non-use and use of a large academic library is discussed. A search of the literature reveals that computer-based circulation systems can be, but have not been, utilized to provide data bases for systematic analyses of library users and resources. The data gathered in the circulation…

  9. Library Resources for the Blind and Physically Handicapped: A Directory with FY 1998 Statistics on Readership, Circulation, Budget, Staff, and Collections.

    ERIC Educational Resources Information Center

    Library of Congress, Washington, DC. National Library Service for the Blind and Physically Handicapped.

    This directory lists National Library Service for the Blind and Physically Handicapped libraries and machine-lending agencies alphabetically by state. Each entry includes address, phone and fax numbers, e-mail address, World Wide Web site, area served, librarian name, hours, book collection, special collections, assistive devices, special…

  10. Espresso and Ambiance: What Public Libraries Can Learn from Bookstores.

    ERIC Educational Resources Information Center

    Sannwald, William

    1998-01-01

    Looks at what makes bookstores so successful today. Examines why exterior and interior spaces are important to libraries; how spaces support the library's mission, goals, and objectives; what libraries can learn from the retail market regarding bookstore merchandising and design; and how this information can benefit libraries. (Author/AEF)

  11. Experimental Realization of a Quantum Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Li, Zhaokai; Liu, Xiaomei; Xu, Nanyang; Du, Jiangfeng

    2015-04-01

    The fundamental principle of artificial intelligence is the ability of machines to learn from previous experience and do future work accordingly. In the age of big data, classical learning machines often require huge computational resources in many practical cases. Quantum machine learning algorithms, on the other hand, could be exponentially faster than their classical counterparts by utilizing quantum parallelism. Here, we demonstrate a quantum machine learning algorithm to implement handwriting recognition on a four-qubit NMR test bench. The quantum machine learns standard character fonts and then recognizes handwritten characters from a set with two candidates. Because of the wide spread importance of artificial intelligence and its tremendous consumption of computational resources, quantum speedup would be extremely attractive against the challenges of big data.

  12. Workshop on Fielded Applications of Machine Learning

    DTIC Science & Technology

    1994-05-11

    This report summaries the talks presented at the Workshop on Fielded Applications of Machine Learning , and draws some initial conclusions about the state of machine learning and its potential for solving real-world problems.

  13. Revisit of Machine Learning Supported Biological and Biomedical Studies.

    PubMed

    Yu, Xiang-Tian; Wang, Lu; Zeng, Tao

    2018-01-01

    Generally, machine learning includes many in silico methods to transform the principles underlying natural phenomenon to human understanding information, which aim to save human labor, to assist human judge, and to create human knowledge. It should have wide application potential in biological and biomedical studies, especially in the era of big biological data. To look through the application of machine learning along with biological development, this review provides wide cases to introduce the selection of machine learning methods in different practice scenarios involved in the whole biological and biomedical study cycle and further discusses the machine learning strategies for analyzing omics data in some cutting-edge biological studies. Finally, the notes on new challenges for machine learning due to small-sample high-dimension are summarized from the key points of sample unbalance, white box, and causality.

  14. Machine Learning. Part 1. A Historical and Methodological Analysis.

    DTIC Science & Technology

    1983-05-31

    Machine learning has always been an integral part of artificial intelligence, and its methodology has evolved in concert with the major concerns of the field. In response to the difficulties of encoding ever-increasing volumes of knowledge in modern Al systems, many researchers have recently turned their attention to machine learning as a means to overcome the knowledge acquisition bottleneck. Part 1 of this paper presents a taxonomic analysis of machine learning organized primarily by learning strategies and secondarily by

  15. Toward Harnessing User Feedback For Machine Learning

    DTIC Science & Technology

    2006-10-02

    machine learning systems. If this resource-the users themselves-could somehow work hand-in-hand with machine learning systems, the accuracy of learning systems could be improved and the users? understanding and trust of the system could improve as well. We conducted a think-aloud study to see how willing users were to provide feedback and to understand what kinds of feedback users could give. Users were shown explanations of machine learning predictions and asked to provide feedback to improve the predictions. We found that users

  16. Intelligible machine learning with malibu.

    PubMed

    Langlois, Robert E; Lu, Hui

    2008-01-01

    malibu is an open-source machine learning work-bench developed in C/C++ for high-performance real-world applications, namely bioinformatics and medical informatics. It leverages third-party machine learning implementations for more robust bug-free software. This workbench handles several well-studied supervised machine learning problems including classification, regression, importance-weighted classification and multiple-instance learning. The malibu interface was designed to create reproducible experiments ideally run in a remote and/or command line environment. The software can be found at: http://proteomics.bioengr. uic.edu/malibu/index.html.

  17. Language Acquisition and Machine Learning.

    DTIC Science & Technology

    1986-02-01

    machine learning and examine its implications for computational models of language acquisition. As a framework for understanding this research, the authors propose four component tasks involved in learning from experience-aggregation, clustering, characterization, and storage. They then consider four common problems studied by machine learning researchers-learning from examples, heuristics learning, conceptual clustering, and learning macro-operators-describing each in terms of our framework. After this, they turn to the problem of grammar

  18. Behavioral Profiling of Scada Network Traffic Using Machine Learning Algorithms

    DTIC Science & Technology

    2014-03-27

    BEHAVIORAL PROFILING OF SCADA NETWORK TRAFFIC USING MACHINE LEARNING ALGORITHMS THESIS Jessica R. Werling, Captain, USAF AFIT-ENG-14-M-81 DEPARTMENT...subject to copyright protection in the United States. AFIT-ENG-14-M-81 BEHAVIORAL PROFILING OF SCADA NETWORK TRAFFIC USING MACHINE LEARNING ...AFIT-ENG-14-M-81 BEHAVIORAL PROFILING OF SCADA NETWORK TRAFFIC USING MACHINE LEARNING ALGORITHMS Jessica R. Werling, B.S.C.S. Captain, USAF Approved

  19. Statistical Machine Learning for Structured and High Dimensional Data

    DTIC Science & Technology

    2014-09-17

    AFRL-OSR-VA-TR-2014-0234 STATISTICAL MACHINE LEARNING FOR STRUCTURED AND HIGH DIMENSIONAL DATA Larry Wasserman CARNEGIE MELLON UNIVERSITY Final...Re . 8-98) v Prescribed by ANSI Std. Z39.18 14-06-2014 Final Dec 2009 - Aug 2014 Statistical Machine Learning for Structured and High Dimensional...area of resource-constrained statistical estimation. machine learning , high-dimensional statistics U U U UU John Lafferty 773-702-3813 > Research under

  20. Machine learning in genetics and genomics

    PubMed Central

    Libbrecht, Maxwell W.; Noble, William Stafford

    2016-01-01

    The field of machine learning promises to enable computers to assist humans in making sense of large, complex data sets. In this review, we outline some of the main applications of machine learning to genetic and genomic data. In the process, we identify some recurrent challenges associated with this type of analysis and provide general guidelines to assist in the practical application of machine learning to real genetic and genomic data. PMID:25948244

  1. Using Multiple FPGA Architectures for Real-time Processing of Low-level Machine Vision Functions

    Treesearch

    Thomas H. Drayer; William E. King; Philip A. Araman; Joseph G. Tront; Richard W. Conners

    1995-01-01

    In this paper, we investigate the use of multiple Field Programmable Gate Array (FPGA) architectures for real-time machine vision processing. The use of FPGAs for low-level processing represents an excellent tradeoff between software and special purpose hardware implementations. A library of modules that implement common low-level machine vision operations is presented...

  2. Application of Deep Learning and Supervised Learning Methods to Recognize Nonlinear Hidden Pattern in Water Stress Levels from Spatiotemporal Datasets across Rural and Urban US Counties

    NASA Astrophysics Data System (ADS)

    Eisenhart, T.; Josset, L.; Rising, J. A.; Devineni, N.; Lall, U.

    2017-12-01

    In the wake of recent water crises, the need to understand and predict the risk of water stress in urban and rural areas has grown. This understanding has the potential to improve decision making in public resource management, policy making, risk management and investment decisions. Assuming an underlying relationship between urban and rural water stress and observable features, we apply Deep Learning and Supervised Learning models to uncover hidden nonlinear patterns from spatiotemporal datasets. Results of interest includes prediction accuracy on extreme categories (i.e. urban areas highly prone to water stress) and not solely the average risk for urban or rural area, which adds complexity to the tuning of model parameters. We first label urban water stressed counties using annual water quality violations and compile a comprehensive spatiotemporal dataset that captures the yearly evolution of climatic, demographic and economic factors of more than 3,000 US counties over the 1980-2010 period. As county-level data reporting is not done on a yearly basis, we test multiple imputation methods to get around the issue of missing data. Using Python libraries, TensorFlow and scikit-learn, we apply and compare the ability of, amongst other methods, Recurrent Neural Networks (testing both LSTM and GRU cells), Convolutional Neural Networks and Support Vector Machines to predict urban water stress. We evaluate the performance of those models over multiple time spans and combine methods to diminish the risk of overfitting and increase prediction power on test sets. This methodology seeks to identify hidden nonlinear patterns to assess the predominant data features that influence urban and rural water stress. Results from this application at the national scale will assess the performance of deep learning models to predict water stress risk areas across all US counties and will highlight a predominant Machine Learning method for modeling water stress risk using spatiotemporal data.

  3. Convergence Toward Common Standards in Machine-Readable Cataloging *

    PubMed Central

    Gull, C. D.

    1969-01-01

    The adoption of the MARC II format for the communication of bibliographic information by the three National Libraries of the U.S.A. makes it possible for those libraries to converge on the remaining necessary common standards for machine-readable cataloging. Three levels of standards are identified: fundamental, the character set; intermediate, MARC II; and detailed, the codes for identifying data elements. The convergence on these standards implies that the National Libraries can create and operate a Joint Bibliographic Data Bank requiring standard book numbers and universal serial numbers for identifying monographs and serials and that the system will thoroughly process contributed catalog entries before adding them to the Data Bank. There is reason to hope that the use of the MARC II format will facilitate catalogers' decision processes. PMID:5782261

  4. Learning Commons in Academic Libraries: Discussing Themes in the Literature from 2001 to the Present

    ERIC Educational Resources Information Center

    Blummer, Barbara; Kenton, Jeffrey M.

    2017-01-01

    Although the term lacks a standard definition, learning commons represent academic library spaces that provide computer and library resources as well as a range of academic services that support learners and learning. Learning commons have been equated to a laboratory for creating knowledge and staffed with librarians that serve as facilitators of…

  5. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data

    PubMed Central

    Hepworth, Philip J.; Nefedov, Alexey V.; Muchnik, Ilya B.; Morgan, Kenton L.

    2012-01-01

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide. PMID:22319115

  6. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    PubMed

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  7. Addressing uncertainty in atomistic machine learning.

    PubMed

    Peterson, Andrew A; Christensen, Rune; Khorshidi, Alireza

    2017-05-10

    Machine-learning regression has been demonstrated to precisely emulate the potential energy and forces that are output from more expensive electronic-structure calculations. However, to predict new regions of the potential energy surface, an assessment must be made of the credibility of the predictions. In this perspective, we address the types of errors that might arise in atomistic machine learning, the unique aspects of atomistic simulations that make machine-learning challenging, and highlight how uncertainty analysis can be used to assess the validity of machine-learning predictions. We suggest this will allow researchers to more fully use machine learning for the routine acceleration of large, high-accuracy, or extended-time simulations. In our demonstrations, we use a bootstrap ensemble of neural network-based calculators, and show that the width of the ensemble can provide an estimate of the uncertainty when the width is comparable to that in the training data. Intriguingly, we also show that the uncertainty can be localized to specific atoms in the simulation, which may offer hints for the generation of training data to strategically improve the machine-learned representation.

  8. On the Conditioning of Machine-Learning-Assisted Turbulence Modeling

    NASA Astrophysics Data System (ADS)

    Wu, Jinlong; Sun, Rui; Wang, Qiqi; Xiao, Heng

    2017-11-01

    Recently, several researchers have demonstrated that machine learning techniques can be used to improve the RANS modeled Reynolds stress by training on available database of high fidelity simulations. However, obtaining improved mean velocity field remains an unsolved challenge, restricting the predictive capability of current machine-learning-assisted turbulence modeling approaches. In this work we define a condition number to evaluate the model conditioning of data-driven turbulence modeling approaches, and propose a stability-oriented machine learning framework to model Reynolds stress. Two canonical flows, the flow in a square duct and the flow over periodic hills, are investigated to demonstrate the predictive capability of the proposed framework. The satisfactory prediction performance of mean velocity field for both flows demonstrates the predictive capability of the proposed framework for machine-learning-assisted turbulence modeling. With showing the capability of improving the prediction of mean flow field, the proposed stability-oriented machine learning framework bridges the gap between the existing machine-learning-assisted turbulence modeling approaches and the demand of predictive capability of turbulence models in real applications.

  9. Progressive sampling-based Bayesian optimization for efficient and automatic machine learning model selection.

    PubMed

    Zeng, Xueqiang; Luo, Gang

    2017-12-01

    Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting algorithms and hyper-parameter values requires advanced machine learning knowledge and many labor-intensive manual iterations. To lower the bar to machine learning, miscellaneous automatic selection methods for algorithms and/or hyper-parameter values have been proposed. Existing automatic selection methods are inefficient on large data sets. This poses a challenge for using machine learning in the clinical big data era. To address the challenge, this paper presents progressive sampling-based Bayesian optimization, an efficient and automatic selection method for both algorithms and hyper-parameter values. We report an implementation of the method. We show that compared to a state of the art automatic selection method, our method can significantly reduce search time, classification error rate, and standard deviation of error rate due to randomization. This is major progress towards enabling fast turnaround in identifying high-quality solutions required by many machine learning-based clinical data analysis tasks.

  10. Bypassing the Kohn-Sham equations with machine learning.

    PubMed

    Brockherde, Felix; Vogt, Leslie; Li, Li; Tuckerman, Mark E; Burke, Kieron; Müller, Klaus-Robert

    2017-10-11

    Last year, at least 30,000 scientific papers used the Kohn-Sham scheme of density functional theory to solve electronic structure problems in a wide variety of scientific fields. Machine learning holds the promise of learning the energy functional via examples, bypassing the need to solve the Kohn-Sham equations. This should yield substantial savings in computer time, allowing larger systems and/or longer time-scales to be tackled, but attempts to machine-learn this functional have been limited by the need to find its derivative. The present work overcomes this difficulty by directly learning the density-potential and energy-density maps for test systems and various molecules. We perform the first molecular dynamics simulation with a machine-learned density functional on malonaldehyde and are able to capture the intramolecular proton transfer process. Learning density models now allows the construction of accurate density functionals for realistic molecular systems.Machine learning allows electronic structure calculations to access larger system sizes and, in dynamical simulations, longer time scales. Here, the authors perform such a simulation using a machine-learned density functional that avoids direct solution of the Kohn-Sham equations.

  11. An Evolutionary Machine Learning Framework for Big Data Sequence Mining

    ERIC Educational Resources Information Center

    Kamath, Uday Krishna

    2014-01-01

    Sequence classification is an important problem in many real-world applications. Unlike other machine learning data, there are no "explicit" features or signals in sequence data that can help traditional machine learning algorithms learn and predict from the data. Sequence data exhibits inter-relationships in the elements that are…

  12. Neuromorphic Optical Signal Processing and Image Understanding for Automated Target Recognition

    DTIC Science & Technology

    1989-12-01

    34 Stochastic Learning Machine " Neuromorphic Target Identification * Cognitive Networks 3. Conclusions ..... ................ .. 12 4. Publications...16 5. References ...... ................... . 17 6. Appendices ....... .................. 18 I. Optoelectronic Neural Networks and...Learning Machines. II. Stochastic Optical Learning Machine. III. Learning Network for Extrapolation AccesFon For and Radar Target Identification

  13. An iterative learning control method with application for CNC machine tools

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, D.I.; Kim, S.

    1996-01-01

    A proportional, integral, and derivative (PID) type iterative learning controller is proposed for precise tracking control of industrial robots and computer numerical controller (CNC) machine tools performing repetitive tasks. The convergence of the output error by the proposed learning controller is guaranteed under a certain condition even when the system parameters are not known exactly and unknown external disturbances exist. As the proposed learning controller is repeatedly applied to the industrial robot or the CNC machine tool with the path-dependent repetitive task, the distance difference between the desired path and the actual tracked or machined path, which is one ofmore » the most significant factors in the evaluation of control performance, is progressively reduced. The experimental results demonstrate that the proposed learning controller can improve machining accuracy when the CNC machine tool performs repetitive machining tasks.« less

  14. From Information Center to Discovery System: Next Step for Libraries?

    ERIC Educational Resources Information Center

    Marcum, James W.

    2001-01-01

    Proposes a discovery system model to guide technology integration in academic libraries that fuses organizational learning, systems learning, and knowledge creation techniques with constructivist learning practices to suggest possible future directions for digital libraries. Topics include accessing visual and continuous media; information…

  15. Learning dominance relations in combinatorial search problems

    NASA Technical Reports Server (NTRS)

    Yu, Chee-Fen; Wah, Benjamin W.

    1988-01-01

    Dominance relations commonly are used to prune unnecessary nodes in search graphs, but they are problem-dependent and cannot be derived by a general procedure. The authors identify machine learning of dominance relations and the applicable learning mechanisms. A study of learning dominance relations using learning by experimentation is described. This system has been able to learn dominance relations for the 0/1-knapsack problem, an inventory problem, the reliability-by-replication problem, the two-machine flow shop problem, a number of single-machine scheduling problems, and a two-machine scheduling problem. It is considered that the same methodology can be extended to learn dominance relations in general.

  16. Thutmose - Investigation of Machine Learning-Based Intrusion Detection Systems

    DTIC Science & Technology

    2016-06-01

    research is being done to incorporate the field of machine learning into intrusion detection. Machine learning is a branch of artificial intelligence (AI...adversarial drift." Proceedings of the 2013 ACM workshop on Artificial intelligence and security. ACM. (2013) Kantarcioglu, M., Xi, B., and Clifton, C. "A...34 Proceedings of the 4th ACM workshop on Security and artificial intelligence . ACM. (2011) Dua, S., and Du, X. Data Mining and Machine Learning in

  17. Large-Scale Linear Optimization through Machine Learning: From Theory to Practical System Design and Implementation

    DTIC Science & Technology

    2016-08-10

    AFRL-AFOSR-JP-TR-2016-0073 Large-scale Linear Optimization through Machine Learning: From Theory to Practical System Design and Implementation ...2016 4.  TITLE AND SUBTITLE Large-scale Linear Optimization through Machine Learning: From Theory to Practical System Design and Implementation 5a...performances on various machine learning tasks and it naturally lends itself to fast parallel implementations . Despite this, very little work has been

  18. ML-o-Scope: A Diagnostic Visualization System for Deep Machine Learning Pipelines

    DTIC Science & Technology

    2014-05-16

    ML-o-scope: a diagnostic visualization system for deep machine learning pipelines Daniel Bruckner Electrical Engineering and Computer Sciences... machine learning pipelines 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d. PROJECT NUMBER 5e. TASK NUMBER 5f...the system as a support for tuning large scale object-classification pipelines. 1 Introduction A new generation of pipelined machine learning models

  19. WebWatcher: Machine Learning and Hypertext

    DTIC Science & Technology

    1995-05-29

    WebWatcher: Machine Learning and Hypertext Thorsten Joachims, Tom Mitchell, Dayne Freitag, and Robert Armstrong School of Computer Science Carnegie...HTML-page about machine learning in which we in- serted a hyperlink to WebWatcher (line 6). The user follows this hyperlink and gets to a page which...AND SUBTITLE WebWatcher: Machine Learning and Hypertext 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d. PROJECT

  20. A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors.

    PubMed

    Zhang, Jilin; Tu, Hangdi; Ren, Yongjian; Wan, Jian; Zhou, Li; Li, Mingwei; Wang, Jue; Yu, Lifeng; Zhao, Chang; Zhang, Lei

    2017-09-21

    In order to utilize the distributed characteristic of sensors, distributed machine learning has become the mainstream approach, but the different computing capability of sensors and network delays greatly influence the accuracy and the convergence rate of the machine learning model. Our paper describes a reasonable parameter communication optimization strategy to balance the training overhead and the communication overhead. We extend the fault tolerance of iterative-convergent machine learning algorithms and propose the Dynamic Finite Fault Tolerance (DFFT). Based on the DFFT, we implement a parameter communication optimization strategy for distributed machine learning, named Dynamic Synchronous Parallel Strategy (DSP), which uses the performance monitoring model to dynamically adjust the parameter synchronization strategy between worker nodes and the Parameter Server (PS). This strategy makes full use of the computing power of each sensor, ensures the accuracy of the machine learning model, and avoids the situation that the model training is disturbed by any tasks unrelated to the sensors.

  1. How to Communicate with a Machine: On Reading a Public Library's OPAC

    ERIC Educational Resources Information Center

    Saarti, Jarmo; Raivio, Jouko

    2011-01-01

    This article presents a reading of the user interface in one public library system. Its aim is to find out the frames and competences required and used in the communication between the computer and the patron. The authors see the computer as a text that is to be read by the user who wants to search for information from the library. The transition…

  2. School Libraries Empowering Learning: The Australian Landscape.

    ERIC Educational Resources Information Center

    Todd, Ross J.

    2003-01-01

    Describes school libraries in Australia. Highlights include the title of teacher librarian and their education; the history of the role of school libraries in Australian education; empowerment; information skills and benchmarks; national standards for school libraries; information literacy; learning outcomes; evidence-based practice; digital…

  3. Challenges in Delivering Library Services for Distance Learning.

    ERIC Educational Resources Information Center

    Swaine, Cynthia Wright

    The first section of this paper on library services for distance education discusses the status of distance learning in higher education. What distance learning means for libraries is addressed in the second section, including considerations related to diverse locations, agreements with participating institutions, delivery limitations, librarian…

  4. Machine learning for medical images analysis.

    PubMed

    Criminisi, A

    2016-10-01

    This article discusses the application of machine learning for the analysis of medical images. Specifically: (i) We show how a special type of learning models can be thought of as automatically optimized, hierarchically-structured, rule-based algorithms, and (ii) We discuss how the issue of collecting large labelled datasets applies to both conventional algorithms as well as machine learning techniques. The size of the training database is a function of model complexity rather than a characteristic of machine learning methods. Crown Copyright © 2016. Published by Elsevier B.V. All rights reserved.

  5. Machine Learning.

    ERIC Educational Resources Information Center

    Kirrane, Diane E.

    1990-01-01

    As scientists seek to develop machines that can "learn," that is, solve problems by imitating the human brain, a gold mine of information on the processes of human learning is being discovered, expert systems are being improved, and human-machine interactions are being enhanced. (SK)

  6. A Grammar Library for Information Structure

    ERIC Educational Resources Information Center

    Song, Sanghoun

    2014-01-01

    This dissertation makes substantial contributions to both the theoretical and computational treatment of information structure, with an eye toward creating natural language processing applications such as multilingual machine translation systems. The aim of the present dissertation is to create a grammar library of information structure for the…

  7. Online Reference Service--How to Begin: A Selected Bibliography.

    ERIC Educational Resources Information Center

    Shroder, Emelie J., Ed.

    1982-01-01

    Materials in this bibliography were selected and recommended by members of the Use of Machine-Assisted Reference in Public Libraries Committee, Reference and Adult Services Division, American Library Association. Topics include: financial aspects, equipment and communications considerations, comparing databases and database systems, advertising…

  8. Research in image management and access

    NASA Technical Reports Server (NTRS)

    Vondran, Raymond F.; Barron, Billy J.

    1993-01-01

    Presently, the problem of over-all library system design has been compounded by the accretion of both function and structure to a basic framework of requirements. While more device power has led to increased functionality, opportunities for reducing system complexity at the user interface level have not always been pursued with equal zeal. The purpose of this book is therefore to set forth and examine these opportunities, within the general framework of human factors research in man-machine interfaces. Human factors may be viewed as a series of trade-off decisions among four polarized objectives: machine resources and user specifications; functionality and user requirements. In the past, a limiting factor was the availability of systems. However, in the last two years, over one hundred libraries supported by many different software configurations have been added to the Internet. This document includes a statistical analysis of human responses to five Internet library systems by key features, development of the ideal online catalog system, and ideal online catalog systems for libraries and information centers.

  9. A real-time MPEG software decoder using a portable message-passing library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwong, Man Kam; Tang, P.T. Peter; Lin, Biquan

    1995-12-31

    We present a real-time MPEG software decoder that uses message-passing libraries such as MPL, p4 and MPI. The parallel MPEG decoder currently runs on the IBM SP system but can be easil ported to other parallel machines. This paper discusses our parallel MPEG decoding algorithm as well as the parallel programming environment under which it uses. Several technical issues are discussed, including balancing of decoding speed, memory limitation, 1/0 capacities, and optimization of MPEG decoding components. This project shows that a real-time portable software MPEG decoder is feasible in a general-purpose parallel machine.

  10. Machine learning applications in genetics and genomics.

    PubMed

    Libbrecht, Maxwell W; Noble, William Stafford

    2015-06-01

    The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets. Here, we provide an overview of machine learning applications for the analysis of genome sequencing data sets, including the annotation of sequence elements and epigenetic, proteomic or metabolomic data. We present considerations and recurrent challenges in the application of supervised, semi-supervised and unsupervised machine learning methods, as well as of generative and discriminative modelling approaches. We provide general guidelines to assist in the selection of these machine learning methods and their practical application for the analysis of genetic and genomic data sets.

  11. Quantum Machine Learning over Infinite Dimensions

    DOE PAGES

    Lau, Hoi-Kwan; Pooser, Raphael; Siopsis, George; ...

    2017-02-21

    Machine learning is a fascinating and exciting eld within computer science. Recently, this ex- citement has been transferred to the quantum information realm. Currently, all proposals for the quantum version of machine learning utilize the nite-dimensional substrate of discrete variables. Here we generalize quantum machine learning to the more complex, but still remarkably practi- cal, in nite-dimensional systems. We present the critical subroutines of quantum machine learning algorithms for an all-photonic continuous-variable quantum computer that achieve an exponential speedup compared to their equivalent classical counterparts. Finally, we also map out an experi- mental implementation which can be used as amore » blueprint for future photonic demonstrations.« less

  12. Quantum Machine Learning over Infinite Dimensions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lau, Hoi-Kwan; Pooser, Raphael; Siopsis, George

    Machine learning is a fascinating and exciting eld within computer science. Recently, this ex- citement has been transferred to the quantum information realm. Currently, all proposals for the quantum version of machine learning utilize the nite-dimensional substrate of discrete variables. Here we generalize quantum machine learning to the more complex, but still remarkably practi- cal, in nite-dimensional systems. We present the critical subroutines of quantum machine learning algorithms for an all-photonic continuous-variable quantum computer that achieve an exponential speedup compared to their equivalent classical counterparts. Finally, we also map out an experi- mental implementation which can be used as amore » blueprint for future photonic demonstrations.« less

  13. Machine learning and medicine: book review and commentary.

    PubMed

    Koprowski, Robert; Foster, Kenneth R

    2018-02-01

    This article is a review of the book "Master machine learning algorithms, discover how they work and implement them from scratch" (ISBN: not available, 37 USD, 163 pages) edited by Jason Brownlee published by the Author, edition, v1.10 http://MachineLearningMastery.com . An accompanying commentary discusses some of the issues that are involved with use of machine learning and data mining techniques to develop predictive models for diagnosis or prognosis of disease, and to call attention to additional requirements for developing diagnostic and prognostic algorithms that are generally useful in medicine. Appendix provides examples that illustrate potential problems with machine learning that are not addressed in the reviewed book.

  14. STAR Library Education Network: a hands-on learning program for libraries and their communities

    NASA Astrophysics Data System (ADS)

    Dusenbery, P.

    2010-12-01

    Science and technology are widely recognized as major drivers of innovation and industry (e.g. Rising above the Gathering Storm, 2006). While the focus for education reform is on school improvement, there is considerable research that supports the role that out-of-school experiences can play in student achievement and public understanding of STEM disciplines. Libraries provide an untapped resource for engaging underserved youth and their families in fostering an appreciation and deeper understanding of science and technology topics. Designed spaces, like libraries, allow lifelong, life-wide, and life-deep learning to take place though the research basis for learning in libraries is not as developed as other informal settings like science centers. The Space Science Institute’s National Center for Interactive Learning (NCIL) in partnership with the American Library Association (ALA), the Lunar and Planetary Institute (LPI), and the National Girls Collaborative Project (NGCP) have received funding from NSF to develop a national education project called the STAR Library Education Network: a hands-on learning program for libraries and their communities (or STAR-Net for short). STAR stands for Science-Technology, Activities and Resources. The overarching goal of the project is to reach underserved youth and their families with informal STEM learning experiences. This project will deepen our knowledge of informal/lifelong learning that takes place in libraries and establish a learning model that can be compared to the more established free-choice learning model for science centers and museums. The project includes the development of two STEM hands-on exhibits on topics that are of interest to library staff and their patrons: Discover Earth and Discover Tech. In addition, the project will produce resources and inquiry-based activities that libraries can use to enrich the exhibit experience. Additional resources will be provided through partnerships with relevant professional science and technology organizations (e.g. American Geophysical Union; National Academy of Engineering) that will provide speakers for host library events and webinars. Online and in-person workshops will be conducted for library staff with a focus on increasing content knowledge and improving facilitation expertise. This presentation will report on strategic planning activities for STAR-Net, a Community of Practice model, and the evaluation/research components of this national education program.

  15. Imagine! On the Future of Teaching and Learning and the Academic Research Library

    ERIC Educational Resources Information Center

    Miller, Kelly E.

    2014-01-01

    In the future, what role will the academic research library play in achieving the mission of higher education? This essay describes seven strategies that academic research libraries can adopt to become future-present libraries--libraries that foster what Douglas Thomas and John Seely Brown have called "a new culture of learning." Written…

  16. Derivative Free Optimization of Complex Systems with the Use of Statistical Machine Learning Models

    DTIC Science & Technology

    2015-09-12

    AFRL-AFOSR-VA-TR-2015-0278 DERIVATIVE FREE OPTIMIZATION OF COMPLEX SYSTEMS WITH THE USE OF STATISTICAL MACHINE LEARNING MODELS Katya Scheinberg...COMPLEX SYSTEMS WITH THE USE OF STATISTICAL MACHINE LEARNING MODELS 5a.  CONTRACT NUMBER 5b.  GRANT NUMBER FA9550-11-1-0239 5c.  PROGRAM ELEMENT...developed, which has been the focus of our research. 15. SUBJECT TERMS optimization, Derivative-Free Optimization, Statistical Machine Learning 16. SECURITY

  17. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View

    PubMed Central

    2016-01-01

    Background As more and more researchers are turning to big data for new opportunities of biomedical discoveries, machine learning models, as the backbone of big data analysis, are mentioned more often in biomedical journals. However, owing to the inherent complexity of machine learning methods, they are prone to misuse. Because of the flexibility in specifying machine learning models, the results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs. Objective To attain a set of guidelines on the use of machine learning predictive models within clinical settings to make sure the models are correctly applied and sufficiently reported so that true discoveries can be distinguished from random coincidence. Methods A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians were interviewed, using an iterative process in accordance with the Delphi method. Results The process produced a set of guidelines that consists of (1) a list of reporting items to be included in a research article and (2) a set of practical sequential steps for developing predictive models. Conclusions A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community. PMID:27986644

  18. Ownership of Machine-Readable Bibliographic Data. Canadian Network Papers Number 5 = Propriete des Donnees Bibliographique Lisibles par Machine. Documents sur les Resaux Canadiens Numero 5.

    ERIC Educational Resources Information Center

    Duchesne, R. M.; And Others

    Because of data ownership questions raised by the interchange and sharing of machine readable bibliographic data, this paper was prepared for the Bibliographic and Communications Network Committee of the National Library Advisory Board. Background information and definitions are followed by a review of the legal aspects relating to property and…

  19. Uncommons: Transforming Dusty Reading Rooms into Artefactual "Third Space" Library Learning Labs

    ERIC Educational Resources Information Center

    Schadl, Suzanne Michele; Nelson, Molly; Valencia, Kristen S.

    2015-01-01

    This article describes the implementation of two inexpensive social learning library laboratories for advanced students in Latin American and Chicana/o studies. Drawing on philosophical literature from these interdisciplinary areas and ethnic studies, these cases present a "third space" option for library learning called…

  20. Using Rubrics to Assess Learning in Course-Integrated Library Instruction

    ERIC Educational Resources Information Center

    Gariepy, Laura W.; Stout, Jennifer A.; Hodge, Megan L.

    2016-01-01

    Librarians face numerous challenges when designing effective, sustainable methods for assessing student learning outcomes in one-shot, course-integrated library instruction sessions. We explore the use of rubrics to programmatically assess authentic learning exercises completed in one-shot library sessions for a large, required sophomore-level…

  1. Approaches to Machine Learning.

    DTIC Science & Technology

    1984-02-16

    The field of machine learning strives to develop methods and techniques to automatic the acquisition of new information, new skills, and new ways of organizing existing information. In this article, we review the major approaches to machine learning in symbolic domains, covering the tasks of learning concepts from examples, learning search methods, conceptual clustering, and language acquisition. We illustrate each of the basic approaches with paradigmatic examples. (Author)

  2. Computer Information Project for Monographs at the Medical Research Library of Brooklyn

    PubMed Central

    Koch, Michael S.; Kovacs, Helen

    1973-01-01

    The article describes a resource library's computer-based project that provides cataloging and other bibliographic services and promotes greater use of the book collection. A few studies are cited to show the significance of monographic literature in medical libraries. The educational role of the Medical Research Library of Brooklyn is discussed, both with regard to the parent institution and to smaller medical libraries in the same geographic area. Types of aid given to smaller libraries are enumerated. Information is given on methods for providing machine-produced catalog cards, current awareness notes, and bibliographic lists. Actualities and potentialities of the computer project are discussed. PMID:4579767

  3. The JPL Library information retrieval system

    NASA Technical Reports Server (NTRS)

    Walsh, J.

    1975-01-01

    The development, capabilities, and products of the computer-based retrieval system of the Jet Propulsion Laboratory Library are described. The system handles books and documents, produces a book catalog, and provides a machine search capability. Programs and documentation are available to the public through NASA's computer software dissemination program.

  4. Machine Learning in the Big Data Era: Are We There Yet?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sukumar, Sreenivas Rangan

    In this paper, we discuss the machine learning challenges of the Big Data era. We observe that recent innovations in being able to collect, access, organize, integrate, and query massive amounts of data from a wide variety of data sources have brought statistical machine learning under more scrutiny and evaluation for gleaning insights from the data than ever before. In that context, we pose and debate the question - Are machine learning algorithms scaling with the ability to store and compute? If yes, how? If not, why not? We survey recent developments in the state-of-the-art to discuss emerging and outstandingmore » challenges in the design and implementation of machine learning algorithms at scale. We leverage experience from real-world Big Data knowledge discovery projects across domains of national security and healthcare to suggest our efforts be focused along the following axes: (i) the data science challenge - designing scalable and flexible computational architectures for machine learning (beyond just data-retrieval); (ii) the science of data challenge the ability to understand characteristics of data before applying machine learning algorithms and tools; and (iii) the scalable predictive functions challenge the ability to construct, learn and infer with increasing sample size, dimensionality, and categories of labels. We conclude with a discussion of opportunities and directions for future research.« less

  5. Machine Learning

    DTIC Science & Technology

    1990-04-01

    DTIC i.LE COPY RADC-TR-90-25 Final Technical Report April 1990 MACHINE LEARNING The MITRE Corporation Melissa P. Chase Cs) CTIC ’- CT E 71 IN 2 11990...S. FUNDING NUMBERS MACHINE LEARNING C - F19628-89-C-0001 PE - 62702F PR - MOlE S. AUTHO(S) TA - 79 Melissa P. Chase WUT - 80 S. PERFORMING...341.280.5500 pm I " Aw Sig rill Ia 2110-01 SECTION 1 INTRODUCTION 1.1 BACKGROUND Research in machine learning has taken two directions in the problem of

  6. Workshop on Fielded Applications of Machine Learning Held in Amherst, Massachusetts on 30 June-1 July 1993. Abstracts.

    DTIC Science & Technology

    1993-01-01

    engineering has led to many AI systems that are now regularly used in industry and elsewhere. The ultimate test of machine learning , the subfield of Al that...applications of machine learning suggest the time was ripe for a meeting on this topic. For this reason, Pat Langley (Siemens Corporate Research) and Yves...Kodratoff (Universite de Paris, Sud) organized an invited workshop on applications of machine learning . The goal of the gathering was to familiarize

  7. Clinical decision support of radiotherapy treatment planning: A data-driven machine learning strategy for patient-specific dosimetric decision making.

    PubMed

    Valdes, Gilmer; Simone, Charles B; Chen, Josephine; Lin, Alexander; Yom, Sue S; Pattison, Adam J; Carpenter, Colin M; Solberg, Timothy D

    2017-12-01

    Clinical decision support systems are a growing class of tools with the potential to impact healthcare. This study investigates the construction of a decision support system through which clinicians can efficiently identify which previously approved historical treatment plans are achievable for a new patient to aid in selection of therapy. Treatment data were collected for early-stage lung and postoperative oropharyngeal cancers treated using photon (lung and head and neck) and proton (head and neck) radiotherapy. Machine-learning classifiers were constructed using patient-specific feature-sets and a library of historical plans. Model accuracy was analyzed using learning curves, and historical treatment plan matching was investigated. Learning curves demonstrate that for these datasets, approximately 45, 60, and 30 patients are needed for a sufficiently accurate classification model for radiotherapy for early-stage lung, postoperative oropharyngeal photon, and postoperative oropharyngeal proton, respectively. The resulting classification model provides a database of previously approved treatment plans that are achievable for a new patient. An exemplary case, highlighting tradeoffs between the heart and chest wall dose while holding target dose constant in two historical plans is provided. We report on the first artificial-intelligence based clinical decision support system that connects patients to past discrete treatment plans in radiation oncology and demonstrate for the first time how this tool can enable clinicians to use past decisions to help inform current assessments. Clinicians can be informed of dose tradeoffs between critical structures early in the treatment process, enabling more time spent on finding the optimal course of treatment for individual patients. Copyright © 2017. Published by Elsevier B.V.

  8. Systematic Poisoning Attacks on and Defenses for Machine Learning in Healthcare.

    PubMed

    Mozaffari-Kermani, Mehran; Sur-Kolay, Susmita; Raghunathan, Anand; Jha, Niraj K

    2015-11-01

    Machine learning is being used in a wide range of application domains to discover patterns in large datasets. Increasingly, the results of machine learning drive critical decisions in applications related to healthcare and biomedicine. Such health-related applications are often sensitive, and thus, any security breach would be catastrophic. Naturally, the integrity of the results computed by machine learning is of great importance. Recent research has shown that some machine-learning algorithms can be compromised by augmenting their training datasets with malicious data, leading to a new class of attacks called poisoning attacks. Hindrance of a diagnosis may have life-threatening consequences and could cause distrust. On the other hand, not only may a false diagnosis prompt users to distrust the machine-learning algorithm and even abandon the entire system but also such a false positive classification may cause patient distress. In this paper, we present a systematic, algorithm-independent approach for mounting poisoning attacks across a wide range of machine-learning algorithms and healthcare datasets. The proposed attack procedure generates input data, which, when added to the training set, can either cause the results of machine learning to have targeted errors (e.g., increase the likelihood of classification into a specific class), or simply introduce arbitrary errors (incorrect classification). These attacks may be applied to both fixed and evolving datasets. They can be applied even when only statistics of the training dataset are available or, in some cases, even without access to the training dataset, although at a lower efficacy. We establish the effectiveness of the proposed attacks using a suite of six machine-learning algorithms and five healthcare datasets. Finally, we present countermeasures against the proposed generic attacks that are based on tracking and detecting deviations in various accuracy metrics, and benchmark their effectiveness.

  9. A Study with Computer-Based Circulation Data of the Non-Use and Use of a Large Academic Library. Final Report.

    ERIC Educational Resources Information Center

    Lubans, John, Jr.; And Others

    Computer-based circulation systems, it is widely believed, can be utilized to provide data for library use studies. The study described in this report involves using such a data base to analyze aspects of library use and non-use and types of users. Another major objective of this research was the testing of machine-readable circulation data…

  10. Machine learning in autistic spectrum disorder behavioral research: A review and ways forward.

    PubMed

    Thabtah, Fadi

    2018-02-13

    Autistic Spectrum Disorder (ASD) is a mental disorder that retards acquisition of linguistic, communication, cognitive, and social skills and abilities. Despite being diagnosed with ASD, some individuals exhibit outstanding scholastic, non-academic, and artistic capabilities, in such cases posing a challenging task for scientists to provide answers. In the last few years, ASD has been investigated by social and computational intelligence scientists utilizing advanced technologies such as machine learning to improve diagnostic timing, precision, and quality. Machine learning is a multidisciplinary research topic that employs intelligent techniques to discover useful concealed patterns, which are utilized in prediction to improve decision making. Machine learning techniques such as support vector machines, decision trees, logistic regressions, and others, have been applied to datasets related to autism in order to construct predictive models. These models claim to enhance the ability of clinicians to provide robust diagnoses and prognoses of ASD. However, studies concerning the use of machine learning in ASD diagnosis and treatment suffer from conceptual, implementation, and data issues such as the way diagnostic codes are used, the type of feature selection employed, the evaluation measures chosen, and class imbalances in data among others. A more serious claim in recent studies is the development of a new method for ASD diagnoses based on machine learning. This article critically analyses these recent investigative studies on autism, not only articulating the aforementioned issues in these studies but also recommending paths forward that enhance machine learning use in ASD with respect to conceptualization, implementation, and data. Future studies concerning machine learning in autism research are greatly benefitted by such proposals.

  11. Reference Manual for Machine-Readable Bibliographic Descriptions. Second Revised Edition.

    ERIC Educational Resources Information Center

    Dierickx, H., Ed.; Hopkinson, A., Ed.

    A product of the UNISIST International Centre for Bibliographic Descriptions (UNIBIB), this reference manual presents a standardized communication format for the exchange of machine-readable bibliographic information between bibliographic databases or other types of bibliographic information services, including libraries. The manual is produced in…

  12. Detecting Abnormal Word Utterances in Children With Autism Spectrum Disorders: Machine-Learning-Based Voice Analysis Versus Speech Therapists.

    PubMed

    Nakai, Yasushi; Takiguchi, Tetsuya; Matsui, Gakuyo; Yamaoka, Noriko; Takada, Satoshi

    2017-10-01

    Abnormal prosody is often evident in the voice intonations of individuals with autism spectrum disorders. We compared a machine-learning-based voice analysis with human hearing judgments made by 10 speech therapists for classifying children with autism spectrum disorders ( n = 30) and typical development ( n = 51). Using stimuli limited to single-word utterances, machine-learning-based voice analysis was superior to speech therapist judgments. There was a significantly higher true-positive than false-negative rate for machine-learning-based voice analysis but not for speech therapists. Results are discussed in terms of some artificiality of clinician judgments based on single-word utterances, and the objectivity machine-learning-based voice analysis adds to judging abnormal prosody.

  13. Logic Learning Machine and standard supervised methods for Hodgkin's lymphoma prognosis using gene expression data and clinical variables.

    PubMed

    Parodi, Stefano; Manneschi, Chiara; Verda, Damiano; Ferrari, Enrico; Muselli, Marco

    2018-03-01

    This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms ( k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene ( XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.

  14. Public Libraries: A Vital Space for Family Engagement

    ERIC Educational Resources Information Center

    Lopez, M. Elena; Caspe, Margaret; McWilliams, Lorette

    2016-01-01

    Children and youth learn in countless ways, anywhere, anytime. And one of the most powerful levers of children's learning--from the early childhood years through adolescence--is families. For families, libraries provide the books, media, and activities that help them open doors for children's literacy and lifelong learning. Libraries are poised to…

  15. From Commons to Classroom: The Evolution of Learning Spaces in Academic Libraries

    ERIC Educational Resources Information Center

    Karasic, Victoria

    2016-01-01

    Over the past two decades, academic library spaces have evolved to meet the changing teaching and learning needs of diverse campus communities. The Information Commons combines the physical and virtual in an informal library space, whereas the recent Active Learning Classroom creates a more formal setting for collaboration. As scholarship has…

  16. Organizational Learning for Library Enhancements: A Collaborative, Research-Driven Analysis of Academic Department Needs

    ERIC Educational Resources Information Center

    Loo, Jeffery L.; Dupuis, Elizabeth A.

    2015-01-01

    This article presents a qualitative evaluation methodology of academic departments for library organizational learning and library enhancement planning. This evaluation used campus units' academic program review reports as a data source and employed collaborative content analysis by library liaisons to extract departmental strengths, weaknesses,…

  17. Building a Learning Commons: Necessary Conditions for Success

    ERIC Educational Resources Information Center

    McKay, Richard

    2014-01-01

    Remodeling an academic library according to the Learning Commons service model will challenge the library staff. This paper gives insight into four of these challenges: Working with the design team, preserving a scholarly environment, ensuring the most efficient arrangement of the library's service centers, and moving the library's collection. It…

  18. Library Media Learning and Play Center.

    ERIC Educational Resources Information Center

    Faber, Therese; And Others

    Preschool educators developed a library media learning and play center to enable children to "experience" a library; establish positive attitudes about the library; and encourage respect for self, others, and property. The center had the following areas: check-in and check-out desk, quiet reading section, computer center, listening center, video…

  19. Probabilistic machine learning and artificial intelligence.

    PubMed

    Ghahramani, Zoubin

    2015-05-28

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  20. Probabilistic machine learning and artificial intelligence

    NASA Astrophysics Data System (ADS)

    Ghahramani, Zoubin

    2015-05-01

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  1. Stacked Denoising Autoencoders Applied to Star/Galaxy Classification

    NASA Astrophysics Data System (ADS)

    Qin, Hao-ran; Lin, Ji-ming; Wang, Jun-yi

    2017-04-01

    In recent years, the deep learning algorithm, with the characteristics of strong adaptability, high accuracy, and structural complexity, has become more and more popular, but it has not yet been used in astronomy. In order to solve the problem that the star/galaxy classification accuracy is high for the bright source set, but low for the faint source set of the Sloan Digital Sky Survey (SDSS) data, we introduced the new deep learning algorithm, namely the SDA (stacked denoising autoencoder) neural network and the dropout fine-tuning technique, which can greatly improve the robustness and antinoise performance. We randomly selected respectively the bright source sets and faint source sets from the SDSS DR12 and DR7 data with spectroscopic measurements, and made preprocessing on them. Then, we randomly selected respectively the training sets and testing sets without replacement from the bright source sets and faint source sets. At last, using these training sets we made the training to obtain the SDA models of the bright sources and faint sources in the SDSS DR7 and DR12, respectively. We compared the test result of the SDA model on the DR12 testing set with the test results of the Library for Support Vector Machines (LibSVM), J48 decision tree, Logistic Model Tree (LMT), Support Vector Machine (SVM), Logistic Regression, and Decision Stump algorithm, and compared the test result of the SDA model on the DR7 testing set with the test results of six kinds of decision trees. The experiments show that the SDA has a better classification accuracy than other machine learning algorithms for the faint source sets of DR7 and DR12. Especially, when the completeness function is used as the evaluation index, compared with the decision tree algorithms, the correctness rate of SDA has improved about 15% for the faint source set of SDSS-DR7.

  2. MO-F-CAMPUS-J-02: Automatic Recognition of Patient Treatment Site in Portal Images Using Machine Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chang, X; Yang, D

    Purpose: To investigate the method to automatically recognize the treatment site in the X-Ray portal images. It could be useful to detect potential treatment errors, and to provide guidance to sequential tasks, e.g. automatically verify the patient daily setup. Methods: The portal images were exported from MOSAIQ as DICOM files, and were 1) processed with a threshold based intensity transformation algorithm to enhance contrast, and 2) where then down-sampled (from 1024×768 to 128×96) by using bi-cubic interpolation algorithm. An appearance-based vector space model (VSM) was used to rearrange the images into vectors. A principal component analysis (PCA) method was usedmore » to reduce the vector dimensions. A multi-class support vector machine (SVM), with radial basis function kernel, was used to build the treatment site recognition models. These models were then used to recognize the treatment sites in the portal image. Portal images of 120 patients were included in the study. The images were selected to cover six treatment sites: brain, head and neck, breast, lung, abdomen and pelvis. Each site had images of the twenty patients. Cross-validation experiments were performed to evaluate the performance. Results: MATLAB image processing Toolbox and scikit-learn (a machine learning library in python) were used to implement the proposed method. The average accuracies using the AP and RT images separately were 95% and 94% respectively. The average accuracy using AP and RT images together was 98%. Computation time was ∼0.16 seconds per patient with AP or RT image, ∼0.33 seconds per patient with both of AP and RT images. Conclusion: The proposed method of treatment site recognition is efficient and accurate. It is not sensitive to the differences of image intensity, size and positions of patients in the portal images. It could be useful for the patient safety assurance. The work was partially supported by a research grant from Varian Medical System.« less

  3. Machine Learning Techniques in Clinical Vision Sciences.

    PubMed

    Caixinha, Miguel; Nunes, Sandrina

    2017-01-01

    This review presents and discusses the contribution of machine learning techniques for diagnosis and disease monitoring in the context of clinical vision science. Many ocular diseases leading to blindness can be halted or delayed when detected and treated at its earliest stages. With the recent developments in diagnostic devices, imaging and genomics, new sources of data for early disease detection and patients' management are now available. Machine learning techniques emerged in the biomedical sciences as clinical decision-support techniques to improve sensitivity and specificity of disease detection and monitoring, increasing objectively the clinical decision-making process. This manuscript presents a review in multimodal ocular disease diagnosis and monitoring based on machine learning approaches. In the first section, the technical issues related to the different machine learning approaches will be present. Machine learning techniques are used to automatically recognize complex patterns in a given dataset. These techniques allows creating homogeneous groups (unsupervised learning), or creating a classifier predicting group membership of new cases (supervised learning), when a group label is available for each case. To ensure a good performance of the machine learning techniques in a given dataset, all possible sources of bias should be removed or minimized. For that, the representativeness of the input dataset for the true population should be confirmed, the noise should be removed, the missing data should be treated and the data dimensionally (i.e., the number of parameters/features and the number of cases in the dataset) should be adjusted. The application of machine learning techniques in ocular disease diagnosis and monitoring will be presented and discussed in the second section of this manuscript. To show the clinical benefits of machine learning in clinical vision sciences, several examples will be presented in glaucoma, age-related macular degeneration, and diabetic retinopathy, these ocular pathologies being the major causes of irreversible visual impairment.

  4. Multi-Stage Convex Relaxation Methods for Machine Learning

    DTIC Science & Technology

    2013-03-01

    Many problems in machine learning can be naturally formulated as non-convex optimization problems. However, such direct nonconvex formulations have...original nonconvex formulation. We will develop theoretical properties of this method and algorithmic consequences. Related convex and nonconvex machine learning methods will also be investigated.

  5. Machine Learning for the Knowledge Plane

    DTIC Science & Technology

    2006-06-01

    this idea is to combine techniques from machine learning with new architectural concepts in networking to make the internet self-aware and self...work on the machine learning portion of the Knowledge Plane. This consisted of three components: (a) we wrote a document formulating the various

  6. Machine learning and data science in soft materials engineering

    NASA Astrophysics Data System (ADS)

    Ferguson, Andrew L.

    2018-01-01

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by ‘de-jargonizing’ data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

  7. Machine learning and data science in soft materials engineering.

    PubMed

    Ferguson, Andrew L

    2018-01-31

    In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

  8. A Parameter Communication Optimization Strategy for Distributed Machine Learning in Sensors

    PubMed Central

    Zhang, Jilin; Tu, Hangdi; Ren, Yongjian; Wan, Jian; Zhou, Li; Li, Mingwei; Wang, Jue; Yu, Lifeng; Zhao, Chang; Zhang, Lei

    2017-01-01

    In order to utilize the distributed characteristic of sensors, distributed machine learning has become the mainstream approach, but the different computing capability of sensors and network delays greatly influence the accuracy and the convergence rate of the machine learning model. Our paper describes a reasonable parameter communication optimization strategy to balance the training overhead and the communication overhead. We extend the fault tolerance of iterative-convergent machine learning algorithms and propose the Dynamic Finite Fault Tolerance (DFFT). Based on the DFFT, we implement a parameter communication optimization strategy for distributed machine learning, named Dynamic Synchronous Parallel Strategy (DSP), which uses the performance monitoring model to dynamically adjust the parameter synchronization strategy between worker nodes and the Parameter Server (PS). This strategy makes full use of the computing power of each sensor, ensures the accuracy of the machine learning model, and avoids the situation that the model training is disturbed by any tasks unrelated to the sensors. PMID:28934163

  9. Machine Learning Approaches for Clinical Psychology and Psychiatry.

    PubMed

    Dwyer, Dominic B; Falkai, Peter; Koutsouleris, Nikolaos

    2018-05-07

    Machine learning approaches for clinical psychology and psychiatry explicitly focus on learning statistical functions from multidimensional data sets to make generalizable predictions about individuals. The goal of this review is to provide an accessible understanding of why this approach is important for future practice given its potential to augment decisions associated with the diagnosis, prognosis, and treatment of people suffering from mental illness using clinical and biological data. To this end, the limitations of current statistical paradigms in mental health research are critiqued, and an introduction is provided to critical machine learning methods used in clinical studies. A selective literature review is then presented aiming to reinforce the usefulness of machine learning methods and provide evidence of their potential. In the context of promising initial results, the current limitations of machine learning approaches are addressed, and considerations for future clinical translation are outlined.

  10. Learning About Climate and Atmospheric Models Through Machine Learning

    NASA Astrophysics Data System (ADS)

    Lucas, D. D.

    2017-12-01

    From the analysis of ensemble variability to improving simulation performance, machine learning algorithms can play a powerful role in understanding the behavior of atmospheric and climate models. To learn about model behavior, we create training and testing data sets through ensemble techniques that sample different model configurations and values of input parameters, and then use supervised machine learning to map the relationships between the inputs and outputs. Following this procedure, we have used support vector machines, random forests, gradient boosting and other methods to investigate a variety of atmospheric and climate model phenomena. We have used machine learning to predict simulation crashes, estimate the probability density function of climate sensitivity, optimize simulations of the Madden Julian oscillation, assess the impacts of weather and emissions uncertainty on atmospheric dispersion, and quantify the effects of model resolution changes on precipitation. This presentation highlights recent examples of our applications of machine learning to improve the understanding of climate and atmospheric models. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

  11. Automation of energy demand forecasting

    NASA Astrophysics Data System (ADS)

    Siddique, Sanzad

    Automation of energy demand forecasting saves time and effort by searching automatically for an appropriate model in a candidate model space without manual intervention. This thesis introduces a search-based approach that improves the performance of the model searching process for econometrics models. Further improvements in the accuracy of the energy demand forecasting are achieved by integrating nonlinear transformations within the models. This thesis introduces machine learning techniques that are capable of modeling such nonlinearity. Algorithms for learning domain knowledge from time series data using the machine learning methods are also presented. The novel search based approach and the machine learning models are tested with synthetic data as well as with natural gas and electricity demand signals. Experimental results show that the model searching technique is capable of finding an appropriate forecasting model. Further experimental results demonstrate an improved forecasting accuracy achieved by using the novel machine learning techniques introduced in this thesis. This thesis presents an analysis of how the machine learning techniques learn domain knowledge. The learned domain knowledge is used to improve the forecast accuracy.

  12. Social Science Data Archives and Libraries: A View to the Future.

    ERIC Educational Resources Information Center

    Clark, Barton M.

    1982-01-01

    Discusses factors militating against integration of social science data archives and libraries in near future, noting usage of materials, access requisite skills of librarians, economic stability of archives, existing structures which manage social science data archives. Role of librarians, data access tools, and cataloging of machine-readable…

  13. Common Bibliographic Standards for Baylor University Libraries. Revised.

    ERIC Educational Resources Information Center

    Scott, Sharon; And Others

    Developed by a Baylor University (Texas) Task Force, the revised policies of bibliographic standards for the university libraries provide formats for: (1) archives and manuscript control; (2) audiovisual media; (3) books; (4) machine-readable data files; (5) maps; (6) music scores; (7) serials; and (8) sound recordings. The task force assumptions…

  14. Eating and Reading in the Library.

    ERIC Educational Resources Information Center

    Trelease, Jim; Krashen, Stephen

    1996-01-01

    This article, citing the example of large bookselling chains who offer cafes in their stores, advocates the provision of food and drink in school libraries as well. High-quality food, made available by vending machine, would increase levels of wellness, energy, and teachability. Protests involving mess, lack of money, and difficulties with parents…

  15. Planning for the Automation of School Library Media Centers.

    ERIC Educational Resources Information Center

    Caffarella, Edward P.

    1996-01-01

    Geared for school library media specialists whose centers are in the early stages of automation or conversion to a new system, this article focuses on major components of media center automation: circulation control; online public access catalogs; machine readable cataloging; retrospective conversion of print catalog cards; and computer networks…

  16. Apocalypse Soon? The Bug.

    ERIC Educational Resources Information Center

    Clyde, Anne

    1999-01-01

    Discussion of the Year 2000 (Y2K) problem, the computer-code problem that affects computer programs or computer chips, focuses on the impact on teacher-librarians. Topics include automated library systems, access to online information services, library computers and software, and other electronic equipment such as photocopiers and fax machines.…

  17. Library Dream Machines: Helping Students Master Super Online Catalogs.

    ERIC Educational Resources Information Center

    Webb, T. D.

    1992-01-01

    Describes how automation has transformed the library and how super-catalogs have affected the process of doing research. Explains how faculty and librarians can work together to help students to use the available databases effectively, by teaching them Boolean logic, standard record formats, filing rules, etc. (DMM)

  18. Active Learning Framework for Non-Intrusive Load Monitoring: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Xin

    2016-05-16

    Non-Intrusive Load Monitoring (NILM) is a set of techniques that estimate the electricity usage of individual appliances from power measurements taken at a limited number of locations in a building. One of the key challenges in NILM is having too much data without class labels yet being unable to label the data manually for cost or time constraints. This paper presents an active learning framework that helps existing NILM techniques to overcome this challenge. Active learning is an advanced machine learning method that interactively queries a user for the class label information. Unlike most existing NILM systems that heuristically requestmore » user inputs, the proposed method only needs minimally sufficient information from a user to build a compact and yet highly representative load signature library. Initial results indicate the proposed method can reduce the user inputs by up to 90% while still achieving similar disaggregation performance compared to a heuristic method. Thus, the proposed method can substantially reduce the burden on the user, improve the performance of a NILM system with limited user inputs, and overcome the key market barriers to the wide adoption of NILM technologies.« less

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bolme, David S; Mikkilineni, Aravind K; Rose, Derek C

    Analog computational circuits have been demonstrated to provide substantial improvements in power and speed relative to digital circuits, especially for applications requiring extreme parallelism but only modest precision. Deep machine learning is one such area and stands to benefit greatly from analog and mixed-signal implementations. However, even at modest precisions, offsets and non-linearity can degrade system performance. Furthermore, in all but the simplest systems, it is impossible to directly measure the intermediate outputs of all sub-circuits. The result is that circuit designers are unable to accurately evaluate the non-idealities of computational circuits in-situ and are therefore unable to fully utilizemore » measurement results to improve future designs. In this paper we present a technique to use deep learning frameworks to model physical systems. Recently developed libraries like TensorFlow make it possible to use back propagation to learn parameters in the context of modeling circuit behavior. Offsets and scaling errors can be discovered even for sub-circuits that are deeply embedded in a computational system and not directly observable. The learned parameters can be used to refine simulation methods or to identify appropriate compensation strategies. We demonstrate the framework using a mixed-signal convolution operator as an example circuit.« less

  20. Applications of machine learning in cancer prediction and prognosis.

    PubMed

    Cruz, Joseph A; Wishart, David S

    2007-02-11

    Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allows computers to "learn" from past examples and to detect hard-to-discern patterns from large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on "older" technologies such artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15-25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is also helping to improve our basic understanding of cancer development and progression.

  1. A review of supervised machine learning applied to ageing research.

    PubMed

    Fabris, Fabio; Magalhães, João Pedro de; Freitas, Alex A

    2017-04-01

    Broadly speaking, supervised machine learning is the computational task of learning correlations between variables in annotated data (the training set), and using this information to create a predictive model capable of inferring annotations for new data, whose annotations are not known. Ageing is a complex process that affects nearly all animal species. This process can be studied at several levels of abstraction, in different organisms and with different objectives in mind. Not surprisingly, the diversity of the supervised machine learning algorithms applied to answer biological questions reflects the complexities of the underlying ageing processes being studied. Many works using supervised machine learning to study the ageing process have been recently published, so it is timely to review these works, to discuss their main findings and weaknesses. In summary, the main findings of the reviewed papers are: the link between specific types of DNA repair and ageing; ageing-related proteins tend to be highly connected and seem to play a central role in molecular pathways; ageing/longevity is linked with autophagy and apoptosis, nutrient receptor genes, and copper and iron ion transport. Additionally, several biomarkers of ageing were found by machine learning. Despite some interesting machine learning results, we also identified a weakness of current works on this topic: only one of the reviewed papers has corroborated the computational results of machine learning algorithms through wet-lab experiments. In conclusion, supervised machine learning has contributed to advance our knowledge and has provided novel insights on ageing, yet future work should have a greater emphasis in validating the predictions.

  2. Rehabilitation and Prosthetic Services

    MedlinePlus

    ... VA Learning University (VALU) SimLearn Libraries (VALNET) VA Software Documentation Library (VDL) About VHA Learn about VHA Forms & ... & Sensory Aids Service (PSAS) Our Mission The mission of the Prosthetic & ...

  3. Machine learning, social learning and the governance of self-driving cars.

    PubMed

    Stilgoe, Jack

    2018-02-01

    Self-driving cars, a quintessentially 'smart' technology, are not born smart. The algorithms that control their movements are learning as the technology emerges. Self-driving cars represent a high-stakes test of the powers of machine learning, as well as a test case for social learning in technology governance. Society is learning about the technology while the technology learns about society. Understanding and governing the politics of this technology means asking 'Who is learning, what are they learning and how are they learning?' Focusing on the successes and failures of social learning around the much-publicized crash of a Tesla Model S in 2016, I argue that trajectories and rhetorics of machine learning in transport pose a substantial governance challenge. 'Self-driving' or 'autonomous' cars are misnamed. As with other technologies, they are shaped by assumptions about social needs, solvable problems, and economic opportunities. Governing these technologies in the public interest means improving social learning by constructively engaging with the contingencies of machine learning.

  4. Robust Fault Diagnosis in Electric Drives Using Machine Learning

    DTIC Science & Technology

    2004-09-08

    detection of fault conditions of the inverter. A machine learning framework is developed to systematically select torque-speed domain operation points...were used to generate various fault condition data for machine learning . The technique is viable for accurate, reliable and fast fault detection in electric drives.

  5. Agents Technology Research

    DTIC Science & Technology

    2010-02-01

    multi-agent reputation management. State abstraction is a technique used to allow machine learning technologies to cope with problems that have large...state abstrac- tion process to enable reinforcement learning in domains with large state spaces. State abstraction is vital to machine learning ...across a collective of independent platforms. These individual elements, often referred to as agents in the machine learning community, should exhibit both

  6. Machine learning approaches in medical image analysis: From detection to diagnosis.

    PubMed

    de Bruijne, Marleen

    2016-10-01

    Machine learning approaches are increasingly successful in image-based diagnosis, disease prognosis, and risk assessment. This paper highlights new research directions and discusses three main challenges related to machine learning in medical imaging: coping with variation in imaging protocols, learning from weak labels, and interpretation and evaluation of results. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Testing meta tagger

    DTIC Science & Technology

    2017-12-21

    rank , and computer vision. Machine learning is closely related to (and often overlaps with) computational statistics, which also focuses on...Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed.[1] Arthur Samuel...an American pioneer in the field of computer gaming and artificial intelligence, coined the term "Machine Learning " in 1959 while at IBM[2]. Evolved

  8. Library as a Partner in Co-Designing Learning Spaces: A Case Study at Tampere University of Technology, Finland

    ERIC Educational Resources Information Center

    Tevaniemi, Johanna; Poutanen, Jenni; Lähdemäki, Riitta

    2015-01-01

    This article presents a case of co-designed temporary learning spaces at a Finnish academic library, together with the results of a user-survey. The experimental development of the multifunctional spaces offered an opportunity for the library to collaborate with its parent organisation thus broadening the role of the library. Hence, library can be…

  9. Post hoc support vector machine learning for impedimetric biosensors based on weak protein-ligand interactions.

    PubMed

    Rong, Y; Padron, A V; Hagerty, K J; Nelson, N; Chi, S; Keyhani, N O; Katz, J; Datta, S P A; Gomes, C; McLamore, E S

    2018-04-30

    Impedimetric biosensors for measuring small molecules based on weak/transient interactions between bioreceptors and target analytes are a challenge for detection electronics, particularly in field studies or in the analysis of complex matrices. Protein-ligand binding sensors have enormous potential for biosensing, but achieving accuracy in complex solutions is a major challenge. There is a need for simple post hoc analytical tools that are not computationally expensive, yet provide near real time feedback on data derived from impedance spectra. Here, we show the use of a simple, open source support vector machine learning algorithm for analyzing impedimetric data in lieu of using equivalent circuit analysis. We demonstrate two different protein-based biosensors to show that the tool can be used for various applications. We conclude with a mobile phone-based demonstration focused on the measurement of acetone, an important biomarker related to the onset of diabetic ketoacidosis. In all conditions tested, the open source classifier was capable of performing as well as, or better, than the equivalent circuit analysis for characterizing weak/transient interactions between a model ligand (acetone) and a small chemosensory protein derived from the tsetse fly. In addition, the tool has a low computational requirement, facilitating use for mobile acquisition systems such as mobile phones. The protocol is deployed through Jupyter notebook (an open source computing environment available for mobile phone, tablet or computer use) and the code was written in Python. For each of the applications, we provide step-by-step instructions in English, Spanish, Mandarin and Portuguese to facilitate widespread use. All codes were based on scikit-learn, an open source software machine learning library in the Python language, and were processed in Jupyter notebook, an open-source web application for Python. The tool can easily be integrated with the mobile biosensor equipment for rapid detection, facilitating use by a broad range of impedimetric biosensor users. This post hoc analysis tool can serve as a launchpad for the convergence of nanobiosensors in planetary health monitoring applications based on mobile phone hardware.

  10. Cognitive learning: a machine learning approach for automatic process characterization from design

    NASA Astrophysics Data System (ADS)

    Foucher, J.; Baderot, J.; Martinez, S.; Dervilllé, A.; Bernard, G.

    2018-03-01

    Cutting edge innovation requires accurate and fast process-control to obtain fast learning rate and industry adoption. Current tools available for such task are mainly manual and user dependent. We present in this paper cognitive learning, which is a new machine learning based technique to facilitate and to speed up complex characterization by using the design as input, providing fast training and detection time. We will focus on the machine learning framework that allows object detection, defect traceability and automatic measurement tools.

  11. Assessing and comparison of different machine learning methods in parent-offspring trios for genotype imputation.

    PubMed

    Mikhchi, Abbas; Honarvar, Mahmood; Kashan, Nasser Emam Jomeh; Aminafshar, Mehdi

    2016-06-21

    Genotype imputation is an important tool for prediction of unknown genotypes for both unrelated individuals and parent-offspring trios. Several imputation methods are available and can either employ universal machine learning methods, or deploy algorithms dedicated to infer missing genotypes. In this research the performance of eight machine learning methods: Support Vector Machine, K-Nearest Neighbors, Extreme Learning Machine, Radial Basis Function, Random Forest, AdaBoost, LogitBoost, and TotalBoost compared in terms of the imputation accuracy, computation time and the factors affecting imputation accuracy. The methods employed using real and simulated datasets to impute the un-typed SNPs in parent-offspring trios. The tested methods show that imputation of parent-offspring trios can be accurate. The Random Forest and Support Vector Machine were more accurate than the other machine learning methods. The TotalBoost performed slightly worse than the other methods.The running times were different between methods. The ELM was always most fast algorithm. In case of increasing the sample size, the RBF requires long imputation time.The tested methods in this research can be an alternative for imputation of un-typed SNPs in low missing rate of data. However, it is recommended that other machine learning methods to be used for imputation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Is Library Database Searching a Language Learning Activity?

    ERIC Educational Resources Information Center

    Bordonaro, Karen

    2010-01-01

    This study explores how non-native speakers of English think of words to enter into library databases when they begin the process of searching for information in English. At issue is whether or not language learning takes place when these students use library databases. Language learning in this study refers to the use of strategies employed by…

  13. Listening to Students: Customer Journey Mapping at Birmingham City University Library and Learning Resources

    ERIC Educational Resources Information Center

    Andrews, Judith; Eade, Eleanor

    2013-01-01

    Birmingham City University's Library and Learning Resources' strategic aim is to improve student satisfaction. A key element is the achievement of the Customer Excellence Standard. An important component of the standard is the mapping of services to improve quality. Library and Learning Resources has developed a methodology to map these…

  14. Lifelong Learning in Public Libraries in 12 European Union Countries: Issues in Monitoring and Evaluation

    ERIC Educational Resources Information Center

    Stanziola, Javier

    2011-01-01

    Public libraries in Europe have supported lifelong learning for the past 500 years. Since the Lisbon Strategy emphasized the role of lifelong learning in economic policy, public libraries have been repositioning their services to respond to this new context. In some cases, these roles are undertaken with limited legislative or strategic changes…

  15. Learning Spaces in Academic Libraries--A Review of the Evolving Trends

    ERIC Educational Resources Information Center

    Turner, Arlee; Welch, Bernadette; Reynolds, Sue

    2013-01-01

    This paper presents a review of the professional discourse regarding the evolution of information and learning spaces in academic libraries, particularly in the first decade of the twenty-first century. It investigates the evolution of academic libraries and the development of learning spaces focusing on the use of the terms which have evolved…

  16. Will Undergraduate Students Play Games to Learn How to Conduct Library Research?

    ERIC Educational Resources Information Center

    Markey, Karen; Swanson, Fritz; Jenkins, Andrea; Jennings, Brian; St. Jean, Beth; Rosenberg, Victor; Yao, Xingxing; Frost, Robert

    2009-01-01

    This exploratory study examines whether undergraduate students will play games to learn how to conduct library research. Results indicate that students will play games that are an integral component of the course curriculum and enable them to accomplish overall course goals at the same time they learn about library research. (Contains 1 table.)

  17. Enhancing Your Library's Public Relations with "Lunch-and-Learn" Workshops.

    ERIC Educational Resources Information Center

    Godbey, Ruth

    The staff of the Creighton University (Omaha, Nebraska) Health Sciences Library has been able to improve not only the library's public relations but also the image of the library by presenting weekly "Lunch-and-Learn" workshops. Since 1990, approximately 15 workshops have been presented each semester with topics ranging from cancer…

  18. Public Libraries as Places for Empowering Women through Autonomous Learning Activities

    ERIC Educational Resources Information Center

    Yoshida, Yuko

    2013-01-01

    Introduction. The purpose of this research is to investigate the significance of public libraries as educational institutions. The meaning of lifelong learning in public libraries from the perspective of women's autonomous activities is re-examined. Method. The literature of the grassroots library movement and that of the empowerment of women is…

  19. The Use of Libraries by Post-Graduate Distance Learning Students: Whose Responsibility?

    ERIC Educational Resources Information Center

    Bolton, Neil; Unwin, Lorna; Stephens, Kate

    1998-01-01

    A survey of 977 postgraduate distance-learning students in the United Kingdom investigated student perceptions of library needs. This article examines how students felt they were treated, need for libraries, library training (previous experience and nature and extent of training), problems of distance and time, costs for texts and charges for…

  20. Combining Machine Learning and Natural Language Processing to Assess Literary Text Comprehension

    ERIC Educational Resources Information Center

    Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.

    2017-01-01

    This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…

  1. Implementing Machine Learning in Radiology Practice and Research.

    PubMed

    Kohli, Marc; Prevedello, Luciano M; Filice, Ross W; Geis, J Raymond

    2017-04-01

    The purposes of this article are to describe concepts that radiologists should understand to evaluate machine learning projects, including common algorithms, supervised as opposed to unsupervised techniques, statistical pitfalls, and data considerations for training and evaluation, and to briefly describe ethical dilemmas and legal risk. Machine learning includes a broad class of computer programs that improve with experience. The complexity of creating, training, and monitoring machine learning indicates that the success of the algorithms will require radiologist involvement for years to come, leading to engagement rather than replacement.

  2. Prediction of outcome in internet-delivered cognitive behaviour therapy for paediatric obsessive-compulsive disorder: A machine learning approach.

    PubMed

    Lenhard, Fabian; Sauer, Sebastian; Andersson, Erik; Månsson, Kristoffer Nt; Mataix-Cols, David; Rück, Christian; Serlachius, Eva

    2018-03-01

    There are no consistent predictors of treatment outcome in paediatric obsessive-compulsive disorder (OCD). One reason for this might be the use of suboptimal statistical methodology. Machine learning is an approach to efficiently analyse complex data. Machine learning has been widely used within other fields, but has rarely been tested in the prediction of paediatric mental health treatment outcomes. To test four different machine learning methods in the prediction of treatment response in a sample of paediatric OCD patients who had received Internet-delivered cognitive behaviour therapy (ICBT). Participants were 61 adolescents (12-17 years) who enrolled in a randomized controlled trial and received ICBT. All clinical baseline variables were used to predict strictly defined treatment response status three months after ICBT. Four machine learning algorithms were implemented. For comparison, we also employed a traditional logistic regression approach. Multivariate logistic regression could not detect any significant predictors. In contrast, all four machine learning algorithms performed well in the prediction of treatment response, with 75 to 83% accuracy. The results suggest that machine learning algorithms can successfully be applied to predict paediatric OCD treatment outcome. Validation studies and studies in other disorders are warranted. Copyright © 2017 John Wiley & Sons, Ltd.

  3. On the Safety of Machine Learning: Cyber-Physical Systems, Decision Sciences, and Data Products.

    PubMed

    Varshney, Kush R; Alemzadeh, Homa

    2017-09-01

    Machine learning algorithms increasingly influence our decisions and interact with us in all parts of our daily lives. Therefore, just as we consider the safety of power plants, highways, and a variety of other engineered socio-technical systems, we must also take into account the safety of systems involving machine learning. Heretofore, the definition of safety has not been formalized in a machine learning context. In this article, we do so by defining machine learning safety in terms of risk, epistemic uncertainty, and the harm incurred by unwanted outcomes. We then use this definition to examine safety in all sorts of applications in cyber-physical systems, decision sciences, and data products. We find that the foundational principle of modern statistical machine learning, empirical risk minimization, is not always a sufficient objective. We discuss how four different categories of strategies for achieving safety in engineering, including inherently safe design, safety reserves, safe fail, and procedural safeguards can be mapped to a machine learning context. We then discuss example techniques that can be adopted in each category, such as considering interpretability and causality of predictive models, objective functions beyond expected prediction accuracy, human involvement for labeling difficult or rare examples, and user experience design of software and open data.

  4. Model Considerations for Memory-based Automatic Music Transcription

    NASA Astrophysics Data System (ADS)

    Albrecht, Štěpán; Šmídl, Václav

    2009-12-01

    The problem of automatic music description is considered. The recorded music is modeled as a superposition of known sounds from a library weighted by unknown weights. Similar observation models are commonly used in statistics and machine learning. Many methods for estimation of the weights are available. These methods differ in the assumptions imposed on the weights. In Bayesian paradigm, these assumptions are typically expressed in the form of prior probability density function (pdf) on the weights. In this paper, commonly used assumptions about music signal are summarized and complemented by a new assumption. These assumptions are translated into pdfs and combined into a single prior density using combination of pdfs. Validity of the model is tested in simulation using synthetic data.

  5. Computational assessment of organic photovoltaic candidate compounds

    NASA Astrophysics Data System (ADS)

    Borunda, Mario; Dai, Shuo; Olivares-Amaya, Roberto; Amador-Bedolla, Carlos; Aspuru-Guzik, Alan

    2015-03-01

    Organic photovoltaic (OPV) cells are emerging as a possible renewable alternative to petroleum based resources and are needed to meet our growing demand for energy. Although not as efficient as silicon based cells, OPV cells have as an advantage that their manufacturing cost is potentially lower. The Harvard Clean Energy Project, using a cheminformatic approach of pattern recognition and machine learning strategies, has ranked a molecular library of more than 2.6 million candidate compounds based on their performance as possible OPV materials. Here, we present a ranking of the top 1000 molecules for use as photovoltaic materials based on their optical absorption properties obtained via time-dependent density functional theory. This computational search has revealed the molecular motifs shared by the set of most promising molecules.

  6. Quantitative structure-activity relationship analysis and virtual screening studies for identifying HDAC2 inhibitors from known HDAC bioactive chemical libraries.

    PubMed

    Pham-The, H; Casañola-Martin, G; Diéguez-Santana, K; Nguyen-Hai, N; Ngoc, N T; Vu-Duc, L; Le-Thi-Thu, H

    2017-03-01

    Histone deacetylases (HDAC) are emerging as promising targets in cancer, neuronal diseases and immune disorders. Computational modelling approaches have been widely applied for the virtual screening and rational design of novel HDAC inhibitors. In this study, different machine learning (ML) techniques were applied for the development of models that accurately discriminate HDAC2 inhibitors form non-inhibitors. The obtained models showed encouraging results, with the global accuracy in the external set ranging from 0.83 to 0.90. Various aspects related to the comparison of modelling techniques, applicability domain and descriptor interpretations were discussed. Finally, consensus predictions of these models were used for screening HDAC2 inhibitors from four chemical libraries whose bioactivities against HDAC1, HDAC3, HDAC6 and HDAC8 have been known. According to the results of virtual screening assays, structures of some hits with pair-isoform-selective activity (between HDAC2 and other HDACs) were revealed. This study illustrates the power of ML-based QSAR approaches for the screening and discovery of potent, isoform-selective HDACIs.

  7. Recent developments in machine learning applications in landslide susceptibility mapping

    NASA Astrophysics Data System (ADS)

    Lun, Na Kai; Liew, Mohd Shahir; Matori, Abdul Nasir; Zawawi, Noor Amila Wan Abdullah

    2017-11-01

    While the prediction of spatial distribution of potential landslide occurrences is a primary interest in landslide hazard mitigation, it remains a challenging task. To overcome the scarceness of complete, sufficiently detailed geomorphological attributes and environmental conditions, various machine-learning techniques are increasingly applied to effectively map landslide susceptibility for large regions. Nevertheless, limited review papers are devoted to this field, particularly on the various domain specific applications of machine learning techniques. Available literature often report relatively good predictive performance, however, papers discussing the limitations of each approaches are quite uncommon. The foremost aim of this paper is to narrow these gaps in literature and to review up-to-date machine learning and ensemble learning techniques applied in landslide susceptibility mapping. It provides new readers an introductory understanding on the subject matter and researchers a contemporary review of machine learning advancements alongside the future direction of these techniques in the landslide mitigation field.

  8. Prediction of in vivo hepatotoxicity effects using in vitro ...

    EPA Pesticide Factsheets

    High-throughput in vitro transcriptomics data support molecular understanding of chemical-induced toxicity. Here, we evaluated the utility of such data to predict liver toxicity. First, in vitro gene expression data for 93 genes was generated following exposure of metabolically competent HepaRG cells to 1060 environmental chemicals from the US EPA ToxCast library. The empirical relationship between these data and rat chronic liver endpoints from animal studies in the Toxicity Reference Database (ToxRefDB) was then evaluated using machine learning techniques. Chemicals were classified as positive (242) or negative (135) based on observed hepatic histopathologic effects, and divided into three categories: hypertrophy (183), injury (112) and proliferative lesions (101). Hepatotoxicants were classified on the basis of the bioactivity of 93 genes (descriptors) using six machine learning algorithms: linear discriminant analysis, naïve Bayes, support vector classification, classification and regression trees, k-nearest neighbors, and an ensemble of classifiers. Classification performance was evaluated using 10-fold cross-validation testing, and in-loop, filter-based, feature subset selection. The best balanced accuracy for prediction of hypertrophy, injury and proliferative lesions were 0.81 ± 0.07, 0.79 ± 0.08 and 0.77 ± 0.09, respectively. Gene specific perturbation of xenobiotic metabolism enzymes (CYP7A1/2E1/4A11/1A1/4A22) and transporters (ABCG2, ABCB11, SLC22

  9. OCTANET--an electronic library network: I. Design and development.

    PubMed Central

    Johnson, M F; Pride, R B

    1983-01-01

    The design and development of the OCTANET system for networking among medical libraries in the midcontinental region is described. This system's features and configuration may be attributed, at least in part, to normal evolution of technology in library networking, remote access to computers, and development of machine-readable data bases. Current functions and services of the system are outlined and implications for future developments in computer-based networking are discussed. PMID:6860825

  10. Machine vision systems using machine learning for industrial product inspection

    NASA Astrophysics Data System (ADS)

    Lu, Yi; Chen, Tie Q.; Chen, Jie; Zhang, Jian; Tisler, Anthony

    2002-02-01

    Machine vision inspection requires efficient processing time and accurate results. In this paper, we present a machine vision inspection architecture, SMV (Smart Machine Vision). SMV decomposes a machine vision inspection problem into two stages, Learning Inspection Features (LIF), and On-Line Inspection (OLI). The LIF is designed to learn visual inspection features from design data and/or from inspection products. During the OLI stage, the inspection system uses the knowledge learnt by the LIF component to inspect the visual features of products. In this paper we will present two machine vision inspection systems developed under the SMV architecture for two different types of products, Printed Circuit Board (PCB) and Vacuum Florescent Displaying (VFD) boards. In the VFD board inspection system, the LIF component learns inspection features from a VFD board and its displaying patterns. In the PCB board inspection system, the LIF learns the inspection features from the CAD file of a PCB board. In both systems, the LIF component also incorporates interactive learning to make the inspection system more powerful and efficient. The VFD system has been deployed successfully in three different manufacturing companies and the PCB inspection system is the process of being deployed in a manufacturing plant.

  11. The application of machine learning techniques in the clinical drug therapy.

    PubMed

    Meng, Huan-Yu; Jin, Wan-Lin; Yan, Cheng-Kai; Yang, Huan

    2018-05-25

    The development of a novel drug is an extremely complicated process that includes the target identification, design and manufacture, and proper therapy of the novel drug, as well as drug dose selection, drug efficacy evaluation, and adverse drug reaction control. Due to the limited resources, high costs, long duration, and low hit-to-lead ratio in the development of pharmacogenetics and computer technology, machine learning techniques have assisted novel drug development and have gradually received more attention by researchers. According to current research, machine learning techniques are widely applied in the process of the discovery of new drugs and novel drug targets, the decision surrounding proper therapy and drug dose, and the prediction of drug efficacy and adverse drug reactions. In this article, we discussed the history, workflow, and advantages and disadvantages of machine learning techniques in the processes mentioned above. Although the advantages of machine learning techniques are fairly obvious, the application of machine learning techniques is currently limited. With further research, the application of machine techniques in drug development could be much more widespread and could potentially be one of the major methods used in drug development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  12. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    PubMed

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.

  13. Machine Learning, deep learning and optimization in computer vision

    NASA Astrophysics Data System (ADS)

    Canu, Stéphane

    2017-03-01

    As quoted in the Large Scale Computer Vision Systems NIPS workshop, computer vision is a mature field with a long tradition of research, but recent advances in machine learning, deep learning, representation learning and optimization have provided models with new capabilities to better understand visual content. The presentation will go through these new developments in machine learning covering basic motivations, ideas, models and optimization in deep learning for computer vision, identifying challenges and opportunities. It will focus on issues related with large scale learning that is: high dimensional features, large variety of visual classes, and large number of examples.

  14. Machine Learning in Radiology: Applications Beyond Image Interpretation.

    PubMed

    Lakhani, Paras; Prater, Adam B; Hutson, R Kent; Andriole, Kathy P; Dreyer, Keith J; Morey, Jose; Prevedello, Luciano M; Clark, Toshi J; Geis, J Raymond; Itri, Jason N; Hawkins, C Matthew

    2018-02-01

    Much attention has been given to machine learning and its perceived impact in radiology, particularly in light of recent success with image classification in international competitions. However, machine learning is likely to impact radiology outside of image interpretation long before a fully functional "machine radiologist" is implemented in practice. Here, we describe an overview of machine learning, its application to radiology and other domains, and many cases of use that do not involve image interpretation. We hope that better understanding of these potential applications will help radiology practices prepare for the future and realize performance improvement and efficiency gains. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  15. Prostate Cancer Probability Prediction By Machine Learning Technique.

    PubMed

    Jović, Srđan; Miljković, Milica; Ivanović, Miljan; Šaranović, Milena; Arsić, Milena

    2017-11-26

    The main goal of the study was to explore possibility of prostate cancer prediction by machine learning techniques. In order to improve the survival probability of the prostate cancer patients it is essential to make suitable prediction models of the prostate cancer. If one make relevant prediction of the prostate cancer it is easy to create suitable treatment based on the prediction results. Machine learning techniques are the most common techniques for the creation of the predictive models. Therefore in this study several machine techniques were applied and compared. The obtained results were analyzed and discussed. It was concluded that the machine learning techniques could be used for the relevant prediction of prostate cancer.

  16. Simple Activities for Powerful Impact

    NASA Astrophysics Data System (ADS)

    LaConte, K.; Shupla, C. B.; Dusenbery, P.; Harold, J. B.; Holland, A.

    2016-12-01

    STEM education is having a transformational impact on libraries across the country. The STAR Library Education Network (STAR_Net) provides free Science-Technology Activities & Resources that are helping libraries to engage their communities in STEM learning experiences. Hear the results of a national 2015 survey of library and STEM professionals and learn what STEM programming is currently in place in public libraries and how libraries approach and implement STEM programs. Experience hands-on space science activities that are being used in library programs with multiple age groups. Through these hands-on activities, learners explore the nature of science and employ science and engineering practices, including developing and using models, planning and carrying out investigations, and engaging in argument from evidence (NGSS Lead States, 2013). Learn how STAR_Net can help you print (free!) mini-exhibits and educator guides. Join STAR_Net's online community and access STEM resources and webinars to work with libraries in your local community.

  17. The Next Era: Deep Learning in Pharmaceutical Research.

    PubMed

    Ekins, Sean

    2016-11-01

    Over the past decade we have witnessed the increasing sophistication of machine learning algorithms applied in daily use from internet searches, voice recognition, social network software to machine vision software in cameras, phones, robots and self-driving cars. Pharmaceutical research has also seen its fair share of machine learning developments. For example, applying such methods to mine the growing datasets that are created in drug discovery not only enables us to learn from the past but to predict a molecule's properties and behavior in future. The latest machine learning algorithm garnering significant attention is deep learning, which is an artificial neural network with multiple hidden layers. Publications over the last 3 years suggest that this algorithm may have advantages over previous machine learning methods and offer a slight but discernable edge in predictive performance. The time has come for a balanced review of this technique but also to apply machine learning methods such as deep learning across a wider array of endpoints relevant to pharmaceutical research for which the datasets are growing such as physicochemical property prediction, formulation prediction, absorption, distribution, metabolism, excretion and toxicity (ADME/Tox), target prediction and skin permeation, etc. We also show that there are many potential applications of deep learning beyond cheminformatics. It will be important to perform prospective testing (which has been carried out rarely to date) in order to convince skeptics that there will be benefits from investing in this technique.

  18. Contemporary machine learning: techniques for practitioners in the physical sciences

    NASA Astrophysics Data System (ADS)

    Spears, Brian

    2017-10-01

    Machine learning is the science of using computers to find relationships in data without explicitly knowing or programming those relationships in advance. Often without realizing it, we employ machine learning every day as we use our phones or drive our cars. Over the last few years, machine learning has found increasingly broad application in the physical sciences. This most often involves building a model relationship between a dependent, measurable output and an associated set of controllable, but complicated, independent inputs. The methods are applicable both to experimental observations and to databases of simulated output from large, detailed numerical simulations. In this tutorial, we will present an overview of current tools and techniques in machine learning - a jumping-off point for researchers interested in using machine learning to advance their work. We will discuss supervised learning techniques for modeling complicated functions, beginning with familiar regression schemes, then advancing to more sophisticated decision trees, modern neural networks, and deep learning methods. Next, we will cover unsupervised learning and techniques for reducing the dimensionality of input spaces and for clustering data. We'll show example applications from both magnetic and inertial confinement fusion. Along the way, we will describe methods for practitioners to help ensure that their models generalize from their training data to as-yet-unseen test data. We will finally point out some limitations to modern machine learning and speculate on some ways that practitioners from the physical sciences may be particularly suited to help. This work was performed by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

  19. Applications of Machine Learning in Cancer Prediction and Prognosis

    PubMed Central

    Cruz, Joseph A.; Wishart, David S.

    2006-01-01

    Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allows computers to “learn” from past examples and to detect hard-to-discern patterns from large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on “older” technologies such artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15–25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is also helping to improve our basic understanding of cancer development and progression. PMID:19458758

  20. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View.

    PubMed

    Luo, Wei; Phung, Dinh; Tran, Truyen; Gupta, Sunil; Rana, Santu; Karmakar, Chandan; Shilton, Alistair; Yearwood, John; Dimitrova, Nevenka; Ho, Tu Bao; Venkatesh, Svetha; Berk, Michael

    2016-12-16

    As more and more researchers are turning to big data for new opportunities of biomedical discoveries, machine learning models, as the backbone of big data analysis, are mentioned more often in biomedical journals. However, owing to the inherent complexity of machine learning methods, they are prone to misuse. Because of the flexibility in specifying machine learning models, the results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs. To attain a set of guidelines on the use of machine learning predictive models within clinical settings to make sure the models are correctly applied and sufficiently reported so that true discoveries can be distinguished from random coincidence. A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians were interviewed, using an iterative process in accordance with the Delphi method. The process produced a set of guidelines that consists of (1) a list of reporting items to be included in a research article and (2) a set of practical sequential steps for developing predictive models. A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community. ©Wei Luo, Dinh Phung, Truyen Tran, Sunil Gupta, Santu Rana, Chandan Karmakar, Alistair Shilton, John Yearwood, Nevenka Dimitrova, Tu Bao Ho, Svetha Venkatesh, Michael Berk. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.12.2016.

  1. The Role of Libraries in Lifelong Learning. Final Report of the IFLA Project under the Section for Public Libraries

    ERIC Educational Resources Information Center

    Haggstrom, Britt Marie, Ed.

    2004-01-01

    The fifth UNESCO/CONFINTEA meeting took place in Hamburg in 1997. The Hamburg declaration was adopted and stated that "UNESCO should strengthen libraries, museums heritage and cultural institutions as learning places and partners in the lifelong learning process and modern citizenship" It was felt that IFLA (International Federation of…

  2. Building an Online Library for Interpretation Training: Explorations into an Effective Blended-Learning Mode

    ERIC Educational Resources Information Center

    Chan, Clara Ho-yan

    2014-01-01

    This paper reports on a blended-learning project that aims to develop a web-based library of interpreting practice resources built on the course management system Blackboard for Hong Kong interpretation students to practise outside the classroom. It also evaluates the library's effectiveness for learning, based on a case study that uses it to…

  3. Development of E-Learning Materials for Machining Safety Education

    NASA Astrophysics Data System (ADS)

    Nakazawa, Tsuyoshi; Mita, Sumiyoshi; Matsubara, Masaaki; Takashima, Takeo; Tanaka, Koichi; Izawa, Satoru; Kawamura, Takashi

    We developed two e-learning materials for Manufacturing Practice safety education: movie learning materials and hazard-detection learning materials. Using these video and sound media, students can learn how to operate machines safely with movie learning materials, which raise the effectiveness of preparation and review for manufacturing practice. Using these materials, students can realize safety operation well. Students can apply knowledge learned in lectures to the detection of hazards and use study methods for hazard detection during machine operation using the hazard-detection learning materials. Particularly, the hazard-detection learning materials raise students‧ safety consciousness and increase students‧ comprehension of knowledge from lectures and comprehension of operations during Manufacturing Practice.

  4. Library 101: Why, How, and Lessons Learned

    ERIC Educational Resources Information Center

    Porter, Michael; King, David Lee

    2010-01-01

    This article describes how and why the Library 101 Project was created and the lessons that the developers learned out of this project. The Library 101 is a project that challenges librarians to revise the paradigm of "basic" library services in order to remain relevant in this technology-driven world. It was developed by Michael Porter,…

  5. Evaluation Criteria for the Places of Learning

    ERIC Educational Resources Information Center

    Callison, Daniel

    2007-01-01

    In this article, the author talks about the concept of school library as a place of learning for students. He contends that a school library must be conceived in terms of three aspects. First, it should have a creative school library media specialist to lead the principal through a meaningful vision of what the library could be in terms of…

  6. An introduction to quantum machine learning

    NASA Astrophysics Data System (ADS)

    Schuld, Maria; Sinayskiy, Ilya; Petruccione, Francesco

    2015-04-01

    Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers investigated if quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.

  7. Lifelong Learning--A Public Library Perspective.

    ERIC Educational Resources Information Center

    Kahlert, Maureen

    This paper presents a public library perspective on lifelong learning. The first section discusses the lifelong learning challenge, including the aims of the Australian National Marketing Strategy for Skills and Lifelong Learning, and findings of a national survey related to the value of and barriers to learning. The second section addresses the…

  8. Are They Learning? Are We? Learning Outcomes and the Academic Library

    ERIC Educational Resources Information Center

    Oakleaf, Megan

    2011-01-01

    Since the 1990s, the assessment of learning outcomes in academic libraries has accelerated rapidly, and librarians have come to recognize the necessity of articulating and assessing student learning outcomes. Initially, librarians developed tools and instruments to assess information literacy student learning outcomes. Now, academic librarians are…

  9. Large-Scale Machine Learning for Classification and Search

    ERIC Educational Resources Information Center

    Liu, Wei

    2012-01-01

    With the rapid development of the Internet, nowadays tremendous amounts of data including images and videos, up to millions or billions, can be collected for training machine learning models. Inspired by this trend, this thesis is dedicated to developing large-scale machine learning techniques for the purpose of making classification and nearest…

  10. Newton Methods for Large Scale Problems in Machine Learning

    ERIC Educational Resources Information Center

    Hansen, Samantha Leigh

    2014-01-01

    The focus of this thesis is on practical ways of designing optimization algorithms for minimizing large-scale nonlinear functions with applications in machine learning. Chapter 1 introduces the overarching ideas in the thesis. Chapters 2 and 3 are geared towards supervised machine learning applications that involve minimizing a sum of loss…

  11. Applying Machine Learning to Facilitate Autism Diagnostics: Pitfalls and Promises

    ERIC Educational Resources Information Center

    Bone, Daniel; Goodwin, Matthew S.; Black, Matthew P.; Lee, Chi-Chun; Audhkhasi, Kartik; Narayanan, Shrikanth

    2015-01-01

    Machine learning has immense potential to enhance diagnostic and intervention research in the behavioral sciences, and may be especially useful in investigations involving the highly prevalent and heterogeneous syndrome of autism spectrum disorder. However, use of machine learning in the absence of clinical domain expertise can be tenuous and lead…

  12. An active role for machine learning in drug development

    PubMed Central

    Murphy, Robert F.

    2014-01-01

    Due to the complexity of biological systems, cutting-edge machine-learning methods will be critical for future drug development. In particular, machine-vision methods to extract detailed information from imaging assays and active-learning methods to guide experimentation will be required to overcome the dimensionality problem in drug development. PMID:21587249

  13. Intel NX to PVM 3.2 message passing conversion library

    NASA Technical Reports Server (NTRS)

    Arthur, Trey; Nelson, Michael L.

    1993-01-01

    NASA Langley Research Center has developed a library that allows Intel NX message passing codes to be executed under the more popular and widely supported Parallel Virtual Machine (PVM) message passing library. PVM was developed at Oak Ridge National Labs and has become the defacto standard for message passing. This library will allow the many programs that were developed on the Intel iPSC/860 or Intel Paragon in a Single Program Multiple Data (SPMD) design to be ported to the numerous architectures that PVM (version 3.2) supports. Also, the library adds global operations capability to PVM. A familiarity with Intel NX and PVM message passing is assumed.

  14. Prediction and Validation of Disease Genes Using HeteSim Scores.

    PubMed

    Zeng, Xiangxiang; Liao, Yuanlu; Liu, Yuansheng; Zou, Quan

    2017-01-01

    Deciphering the gene disease association is an important goal in biomedical research. In this paper, we use a novel relevance measure, called HeteSim, to prioritize candidate disease genes. Two methods based on heterogeneous networks constructed using protein-protein interaction, gene-phenotype associations, and phenotype-phenotype similarity, are presented. In HeteSim_MultiPath (HSMP), HeteSim scores of different paths are combined with a constant that dampens the contributions of longer paths. In HeteSim_SVM (HSSVM), HeteSim scores are combined with a machine learning method. The 3-fold experiments show that our non-machine learning method HSMP performs better than the existing non-machine learning methods, our machine learning method HSSVM obtains similar accuracy with the best existing machine learning method CATAPULT. From the analysis of the top 10 predicted genes for different diseases, we found that HSSVM avoid the disadvantage of the existing machine learning based methods, which always predict similar genes for different diseases. The data sets and Matlab code for the two methods are freely available for download at http://lab.malab.cn/data/HeteSim/index.jsp.

  15. Design of Formats and Packs of Catalog Cards.

    ERIC Educational Resources Information Center

    OCLC Online Computer Library Center, Inc., Dublin, OH.

    The three major functions of the Ohio College Library Center's Shared Cataloging System are: 1) provision of union catalog location listing; 2) making available cataloging done by one library to all other users of the system; and 3) production of catalog cards. The system, based on a central machine readable data base, speeds cataloging and…

  16. Nonbibliographic Machine-Readable Data Bases in ARL Libraries. SPEC Kit 105.

    ERIC Educational Resources Information Center

    Westerman, Mel

    This document is one of ten kits distributed annually by the Systems and Procedures Exchange Center (SPEC), a clearinghouse operated by the Association of Research Libraries, Office of Management Studies (ARL/OMS) that provides a central source of timely information and materials on the management and operations of large academic and research…

  17. ELNET--The Electronic Library Database System.

    ERIC Educational Resources Information Center

    King, Shirley V.

    1991-01-01

    ELNET (Electronic Library Network), a Japanese language database, allows searching of index terms and free text terms from articles and stores the full text of the articles on an optical disc system. Users can order fax copies of the text from the optical disc. This article also explains online searching and discusses machine translation. (LRW)

  18. Demonstration of Cataloging Support Services and Marc II Conversion. Final Report.

    ERIC Educational Resources Information Center

    Buckland, Lawrence F.; And Others

    Beginning in December, 1967, the New England Library Information Network (NELINET) was demonstrated in actual operation using Machine-Readable Cataloging (MARC I) bibliographic data. Section 1 of this report is an introduction and summary of the project. Section 2 described the library processing function demonstrated which included catalog card…

  19. Assessing Information on the Internet: Toward Providing Library Services for Computer-Mediated Communication. A Final Report.

    ERIC Educational Resources Information Center

    Dillon, Martin; And Others

    The Online Computer Library Center Internet Resource project focused on the nature of electronic textual information available through remote access using the Internet and the problems associated with creating machine-readable cataloging (MARC) records for these objects using current USMARC format for computer files and "Anglo-American…

  20. Break-Even Point for a Proof Slip Operation

    ERIC Educational Resources Information Center

    Anderson, James F.

    1972-01-01

    Break-even analysis is applied to determine what magnitude of titles added per year is sufficient to utilize economically Library of Congress proof slips and a Xerox 914 copying machine in the cataloging operation of a library. A formula is derived, and an example of its use is given. (1 reference) (Author/SJ)

  1. Self-Service to the People

    ERIC Educational Resources Information Center

    Kantor-Horning, Susan

    2009-01-01

    It's called GoLibrary in the United States and Bokomaten in its native Sweden. Patrons know it as Library-a-Go-Go in Contra Costa County, California, but whatever its name, the automated lending service this materials handling machine provides has proved a tremendous aid in addressing underserved segments of this sprawling community. It's not hard…

  2. In vitro molecular machine learning algorithm via symmetric internal loops of DNA.

    PubMed

    Lee, Ji-Hoon; Lee, Seung Hwan; Baek, Christina; Chun, Hyosun; Ryu, Je-Hwan; Kim, Jin-Woo; Deaton, Russell; Zhang, Byoung-Tak

    2017-08-01

    Programmable biomolecules, such as DNA strands, deoxyribozymes, and restriction enzymes, have been used to solve computational problems, construct large-scale logic circuits, and program simple molecular games. Although studies have shown the potential of molecular computing, the capability of computational learning with DNA molecules, i.e., molecular machine learning, has yet to be experimentally verified. Here, we present a novel molecular learning in vitro model in which symmetric internal loops of double-stranded DNA are exploited to measure the differences between training instances, thus enabling the molecules to learn from small errors. The model was evaluated on a data set of twenty dialogue sentences obtained from the television shows Friends and Prison Break. The wet DNA-computing experiments confirmed that the molecular learning machine was able to generalize the dialogue patterns of each show and successfully identify the show from which the sentences originated. The molecular machine learning model described here opens the way for solving machine learning problems in computer science and biology using in vitro molecular computing with the data encoded in DNA molecules. Copyright © 2017. Published by Elsevier B.V.

  3. Comparison of machine learning and semi-quantification algorithms for (I123)FP-CIT classification: the beginning of the end for semi-quantification?

    PubMed

    Taylor, Jonathan Christopher; Fenner, John Wesley

    2017-11-29

    Semi-quantification methods are well established in the clinic for assisted reporting of (I123) Ioflupane images. Arguably, these are limited diagnostic tools. Recent research has demonstrated the potential for improved classification performance offered by machine learning algorithms. A direct comparison between methods is required to establish whether a move towards widespread clinical adoption of machine learning algorithms is justified. This study compared three machine learning algorithms with that of a range of semi-quantification methods, using the Parkinson's Progression Markers Initiative (PPMI) research database and a locally derived clinical database for validation. Machine learning algorithms were based on support vector machine classifiers with three different sets of features: Voxel intensities Principal components of image voxel intensities Striatal binding radios from the putamen and caudate. Semi-quantification methods were based on striatal binding ratios (SBRs) from both putamina, with and without consideration of the caudates. Normal limits for the SBRs were defined through four different methods: Minimum of age-matched controls Mean minus 1/1.5/2 standard deviations from age-matched controls Linear regression of normal patient data against age (minus 1/1.5/2 standard errors) Selection of the optimum operating point on the receiver operator characteristic curve from normal and abnormal training data Each machine learning and semi-quantification technique was evaluated with stratified, nested 10-fold cross-validation, repeated 10 times. The mean accuracy of the semi-quantitative methods for classification of local data into Parkinsonian and non-Parkinsonian groups varied from 0.78 to 0.87, contrasting with 0.89 to 0.95 for classifying PPMI data into healthy controls and Parkinson's disease groups. The machine learning algorithms gave mean accuracies between 0.88 to 0.92 and 0.95 to 0.97 for local and PPMI data respectively. Classification performance was lower for the local database than the research database for both semi-quantitative and machine learning algorithms. However, for both databases, the machine learning methods generated equal or higher mean accuracies (with lower variance) than any of the semi-quantification approaches. The gain in performance from using machine learning algorithms as compared to semi-quantification was relatively small and may be insufficient, when considered in isolation, to offer significant advantages in the clinical context.

  4. Technology Toolkit.

    ERIC Educational Resources Information Center

    Brooklyn Public Library, NY.

    This reference resource identifies issues concerning technology use in library literacy programs and describes approaches that work at the Brooklyn (New York) Public Library. Section 1 discusses the learning centers at the library, including its mission, philosophy, curriculum, technology, volunteer tutors, and active learning environment. Section…

  5. Candles, Corks and Contracts: Essential Relationships between Learners and Libraries.

    ERIC Educational Resources Information Center

    Burge, Elizabeth J.; Snow, Judith E.

    2000-01-01

    Current relationships between libraries and adult learners are shaped by technology adoption, learner demographics, constructivist learning, and institutional pressures. Future relationships must emphasize learner-centered action over technological efficiency, stronger learning leadership, and greater integration of libraries in educational…

  6. Regularised extreme learning machine with misclassification cost and rejection cost for gene expression data classification.

    PubMed

    Lu, Huijuan; Wei, Shasha; Zhou, Zili; Miao, Yanzi; Lu, Yi

    2015-01-01

    The main purpose of traditional classification algorithms on bioinformatics application is to acquire better classification accuracy. However, these algorithms cannot meet the requirement that minimises the average misclassification cost. In this paper, a new algorithm of cost-sensitive regularised extreme learning machine (CS-RELM) was proposed by using probability estimation and misclassification cost to reconstruct the classification results. By improving the classification accuracy of a group of small sample which higher misclassification cost, the new CS-RELM can minimise the classification cost. The 'rejection cost' was integrated into CS-RELM algorithm to further reduce the average misclassification cost. By using Colon Tumour dataset and SRBCT (Small Round Blue Cells Tumour) dataset, CS-RELM was compared with other cost-sensitive algorithms such as extreme learning machine (ELM), cost-sensitive extreme learning machine, regularised extreme learning machine, cost-sensitive support vector machine (SVM). The results of experiments show that CS-RELM with embedded rejection cost could reduce the average cost of misclassification and made more credible classification decision than others.

  7. Studying depression using imaging and machine learning methods.

    PubMed

    Patel, Meenal J; Khalaf, Alexander; Aizenstein, Howard J

    2016-01-01

    Depression is a complex clinical entity that can pose challenges for clinicians regarding both accurate diagnosis and effective timely treatment. These challenges have prompted the development of multiple machine learning methods to help improve the management of this disease. These methods utilize anatomical and physiological data acquired from neuroimaging to create models that can identify depressed patients vs. non-depressed patients and predict treatment outcomes. This article (1) presents a background on depression, imaging, and machine learning methodologies; (2) reviews methodologies of past studies that have used imaging and machine learning to study depression; and (3) suggests directions for future depression-related studies.

  8. Machine-Learning Approach for Design of Nanomagnetic-Based Antennas

    NASA Astrophysics Data System (ADS)

    Gianfagna, Carmine; Yu, Huan; Swaminathan, Madhavan; Pulugurtha, Raj; Tummala, Rao; Antonini, Giulio

    2017-08-01

    We propose a machine-learning approach for design of planar inverted-F antennas with a magneto-dielectric nanocomposite substrate. It is shown that machine-learning techniques can be efficiently used to characterize nanomagnetic-based antennas by accurately mapping the particle radius and volume fraction of the nanomagnetic material to antenna parameters such as gain, bandwidth, radiation efficiency, and resonant frequency. A modified mixing rule model is also presented. In addition, the inverse problem is addressed through machine learning as well, where given the antenna parameters, the corresponding design space of possible material parameters is identified.

  9. Machine Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.

    The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning based data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and Probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networksmore » and Hidden Markov Models are introduced as an example of a widely used data driven classification/modeling strategy.« less

  10. Model-based machine learning.

    PubMed

    Bishop, Christopher M

    2013-02-13

    Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application, a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this study, we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning do not have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this study, we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications.

  11. Acceleration of saddle-point searches with machine learning.

    PubMed

    Peterson, Andrew A

    2016-08-21

    In atomistic simulations, the location of the saddle point on the potential-energy surface (PES) gives important information on transitions between local minima, for example, via transition-state theory. However, the search for saddle points often involves hundreds or thousands of ab initio force calls, which are typically all done at full accuracy. This results in the vast majority of the computational effort being spent calculating the electronic structure of states not important to the researcher, and very little time performing the calculation of the saddle point state itself. In this work, we describe how machine learning (ML) can reduce the number of intermediate ab initio calculations needed to locate saddle points. Since machine-learning models can learn from, and thus mimic, atomistic simulations, the saddle-point search can be conducted rapidly in the machine-learning representation. The saddle-point prediction can then be verified by an ab initio calculation; if it is incorrect, this strategically has identified regions of the PES where the machine-learning representation has insufficient training data. When these training data are used to improve the machine-learning model, the estimates greatly improve. This approach can be systematized, and in two simple example problems we demonstrate a dramatic reduction in the number of ab initio force calls. We expect that this approach and future refinements will greatly accelerate searches for saddle points, as well as other searches on the potential energy surface, as machine-learning methods see greater adoption by the atomistics community.

  12. Model-based machine learning

    PubMed Central

    Bishop, Christopher M.

    2013-01-01

    Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application, a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this study, we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning do not have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this study, we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications. PMID:23277612

  13. Acceleration of saddle-point searches with machine learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peterson, Andrew A., E-mail: andrew-peterson@brown.edu

    In atomistic simulations, the location of the saddle point on the potential-energy surface (PES) gives important information on transitions between local minima, for example, via transition-state theory. However, the search for saddle points often involves hundreds or thousands of ab initio force calls, which are typically all done at full accuracy. This results in the vast majority of the computational effort being spent calculating the electronic structure of states not important to the researcher, and very little time performing the calculation of the saddle point state itself. In this work, we describe how machine learning (ML) can reduce the numbermore » of intermediate ab initio calculations needed to locate saddle points. Since machine-learning models can learn from, and thus mimic, atomistic simulations, the saddle-point search can be conducted rapidly in the machine-learning representation. The saddle-point prediction can then be verified by an ab initio calculation; if it is incorrect, this strategically has identified regions of the PES where the machine-learning representation has insufficient training data. When these training data are used to improve the machine-learning model, the estimates greatly improve. This approach can be systematized, and in two simple example problems we demonstrate a dramatic reduction in the number of ab initio force calls. We expect that this approach and future refinements will greatly accelerate searches for saddle points, as well as other searches on the potential energy surface, as machine-learning methods see greater adoption by the atomistics community.« less

  14. Development of the FITS tools package for multiple software environments

    NASA Technical Reports Server (NTRS)

    Pence, W. D.; Blackburn, J. K.

    1992-01-01

    The HEASARC is developing a package of general purpose software for analyzing data files in FITS format. This paper describes the design philosophy which makes the software both machine-independent (it runs on VAXs, Suns, and DEC-stations) and software environment-independent. Currently the software can be compiled and linked to produce IRAF tasks, or alternatively, the same source code can be used to generate stand-alone tasks using one of two implementations of a user-parameter interface library. The machine independence of the software is achieved by writing the source code in ANSI standard Fortran or C, using the machine-independent FITSIO subroutine interface for all data file I/O, and using a standard user-parameter subroutine interface for all user I/O. The latter interface is based on the Fortran IRAF Parameter File interface developed at STScI. The IRAF tasks are built by linking to the IRAF implementation of this parameter interface library. Two other implementations of this parameter interface library, which have no IRAF dependencies, are now available which can be used to generate stand-alone executable tasks. These stand-alone tasks can simply be executed from the machine operating system prompt either by supplying all the task parameters on the command line or by entering the task name after which the user will be prompted for any required parameters. A first release of this FTOOLS package is now publicly available. The currently available tasks are described, along with instructions on how to obtain a copy of the software.

  15. Embracing Change: Adapting and Evolving Your Distance Learning Library Services to Meet the New ACRL Distance Learning Library Services Standards

    ERIC Educational Resources Information Center

    Marcum, Brad

    2016-01-01

    This article examines the update and revision of the current Association of College and Research Libraries (ACRL) Distance Learning Standards that has been proposed and submitted to the ACRL Standards Committee. An in-depth analysis of the update is included, along with some comparisons between the old and new. Practical advice detailing…

  16. Planning and Designing Academic Library Learning Spaces: Expert Perspectives of Architects, Librarians, and Library Consultants. Project Information Literacy Research Report. The Practitioner Series

    ERIC Educational Resources Information Center

    Head, Alison J.

    2016-01-01

    This paper identifies approaches, challenges, and best practices related to planning and designing today's academic library learning spaces. As part of the Project Information Literacy (PIL) Practitioner Series, qualitative data is presented from 49 interviews conducted with a sample of academic librarians, architects, and library consultants.…

  17. How Academic Libraries Help Faculty Teach and Students Learn: The 2005 Colorado Academic Library Impact Study. A Closer Look

    ERIC Educational Resources Information Center

    Dickenson, Don

    2006-01-01

    This study examined academic library usage and outcomes. The objective of the study was to understand how academic libraries help students learn and assist faculty with teaching and research. From March to May 2005, nine Colorado institutions administered two online questionnaires--one to undergraduate students and another to faculty members who…

  18. A comparison of machine learning and Bayesian modelling for molecular serotyping.

    PubMed

    Newton, Richard; Wernisch, Lorenz

    2017-08-11

    Streptococcus pneumoniae is a human pathogen that is a major cause of infant mortality. Identifying the pneumococcal serotype is an important step in monitoring the impact of vaccines used to protect against disease. Genomic microarrays provide an effective method for molecular serotyping. Previously we developed an empirical Bayesian model for the classification of serotypes from a molecular serotyping array. With only few samples available, a model driven approach was the only option. In the meanwhile, several thousand samples have been made available to us, providing an opportunity to investigate serotype classification by machine learning methods, which could complement the Bayesian model. We compare the performance of the original Bayesian model with two machine learning algorithms: Gradient Boosting Machines and Random Forests. We present our results as an example of a generic strategy whereby a preliminary probabilistic model is complemented or replaced by a machine learning classifier once enough data are available. Despite the availability of thousands of serotyping arrays, a problem encountered when applying machine learning methods is the lack of training data containing mixtures of serotypes; due to the large number of possible combinations. Most of the available training data comprises samples with only a single serotype. To overcome the lack of training data we implemented an iterative analysis, creating artificial training data of serotype mixtures by combining raw data from single serotype arrays. With the enhanced training set the machine learning algorithms out perform the original Bayesian model. However, for serotypes currently lacking sufficient training data the best performing implementation was a combination of the results of the Bayesian Model and the Gradient Boosting Machine. As well as being an effective method for classifying biological data, machine learning can also be used as an efficient method for revealing subtle biological insights, which we illustrate with an example.

  19. Putting Learning into Library Planning

    ERIC Educational Resources Information Center

    Bennett, Scott

    2015-01-01

    This essay notes the emergence of learning as a key factor in academic library planning. It argues for an improved, learning-oriented planning process by noting the dangers that arise from the priority usually given to fixing dysfunctional space and from the traps of mistaking the "things" of learning for learning itself and of thinking…

  20. A Sustainable Model for Integrating Current Topics in Machine Learning Research into the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Georgiopoulos, M.; DeMara, R. F.; Gonzalez, A. J.; Wu, A. S.; Mollaghasemi, M.; Gelenbe, E.; Kysilka, M.; Secretan, J.; Sharma, C. A.; Alnsour, A. J.

    2009-01-01

    This paper presents an integrated research and teaching model that has resulted from an NSF-funded effort to introduce results of current Machine Learning research into the engineering and computer science curriculum at the University of Central Florida (UCF). While in-depth exposure to current topics in Machine Learning has traditionally occurred…

  1. Learning as a Machine: Crossovers between Humans and Machines

    ERIC Educational Resources Information Center

    Hildebrandt, Mireille

    2017-01-01

    This article is a revised version of the keynote presented at LAK '16 in Edinburgh. The article investigates some of the assumptions of learning analytics, notably those related to behaviourism. Building on the work of Ivan Pavlov, Herbert Simon, and James Gibson as ways of "learning as a machine," the article then develops two levels of…

  2. Computer Programmed Milling Machine Operations. High-Technology Training Module.

    ERIC Educational Resources Information Center

    Leonard, Dennis

    This learning module for a high school metals and manufacturing course is designed to introduce the concept of computer-assisted machining (CAM). Through it, students learn how to set up and put data into the controller to machine a part. They also become familiar with computer-aided manufacturing and learn the advantages of computer numerical…

  3. 2014 Bio-Acoustics Data Challenge for the International Community on Machine Learning and Bioacoustics

    DTIC Science & Technology

    2014-09-30

    This ONR grant promotes the development and application of advanced machine learning techniques for detection and classification of marine mammal...sounds. The objective is to engage a broad community of data scientists in the development and application of advanced machine learning techniques for detection and classification of marine mammal sounds.

  4. Prediction and early detection of delirium in the intensive care unit by using heart rate variability and machine learning.

    PubMed

    Oh, Jooyoung; Cho, Dongrae; Park, Jaesub; Na, Se Hee; Kim, Jongin; Heo, Jaeseok; Shin, Cheung Soo; Kim, Jae-Jin; Park, Jin Young; Lee, Boreom

    2018-03-27

    Delirium is an important syndrome found in patients in the intensive care unit (ICU), however, it is usually under-recognized during treatment. This study was performed to investigate whether delirious patients can be successfully distinguished from non-delirious patients by using heart rate variability (HRV) and machine learning. Electrocardiography data of 140 patients was acquired during daily ICU care, and HRV data were analyzed. Delirium, including its type, severity, and etiologies, was evaluated daily by trained psychiatrists. HRV data and various machine learning algorithms including linear support vector machine (SVM), SVM with radial basis function (RBF) kernels, linear extreme learning machine (ELM), ELM with RBF kernels, linear discriminant analysis, and quadratic discriminant analysis were utilized to distinguish delirium patients from non-delirium patients. HRV data of 4797 ECGs were included, and 39 patients had delirium at least once during their ICU stay. The maximum classification accuracy was acquired using SVM with RBF kernels. Our prediction method based on HRV with machine learning was comparable to previous delirium prediction models using massive amounts of clinical information. Our results show that autonomic alterations could be a significant feature of patients with delirium in the ICU, suggesting the potential for the automatic prediction and early detection of delirium based on HRV with machine learning.

  5. Prediction of antiepileptic drug treatment outcomes using machine learning.

    PubMed

    Colic, Sinisa; Wither, Robert G; Lang, Min; Zhang, Liang; Eubanks, James H; Bardakjian, Berj L

    2017-02-01

    Antiepileptic drug (AED) treatments produce inconsistent outcomes, often necessitating patients to go through several drug trials until a successful treatment can be found. This study proposes the use of machine learning techniques to predict epilepsy treatment outcomes of commonly used AEDs. Machine learning algorithms were trained and evaluated using features obtained from intracranial electroencephalogram (iEEG) recordings of the epileptiform discharges observed in Mecp2-deficient mouse model of the Rett Syndrome. Previous work have linked the presence of cross-frequency coupling (I CFC ) of the delta (2-5 Hz) rhythm with the fast ripple (400-600 Hz) rhythm in epileptiform discharges. Using the I CFC to label post-treatment outcomes we compared support vector machines (SVMs) and random forest (RF) machine learning classifiers for providing likelihood scores of successful treatment outcomes. (a) There was heterogeneity in AED treatment outcomes, (b) machine learning techniques could be used to rank the efficacy of AEDs by estimating likelihood scores for successful treatment outcome, (c) I CFC features yielded the most effective a priori identification of appropriate AED treatment, and (d) both classifiers performed comparably. Machine learning approaches yielded predictions of successful drug treatment outcomes which in turn could reduce the burdens of drug trials and lead to substantial improvements in patient quality of life.

  6. Prediction of antiepileptic drug treatment outcomes using machine learning

    NASA Astrophysics Data System (ADS)

    Colic, Sinisa; Wither, Robert G.; Lang, Min; Zhang, Liang; Eubanks, James H.; Bardakjian, Berj L.

    2017-02-01

    Objective. Antiepileptic drug (AED) treatments produce inconsistent outcomes, often necessitating patients to go through several drug trials until a successful treatment can be found. This study proposes the use of machine learning techniques to predict epilepsy treatment outcomes of commonly used AEDs. Approach. Machine learning algorithms were trained and evaluated using features obtained from intracranial electroencephalogram (iEEG) recordings of the epileptiform discharges observed in Mecp2-deficient mouse model of the Rett Syndrome. Previous work have linked the presence of cross-frequency coupling (I CFC) of the delta (2-5 Hz) rhythm with the fast ripple (400-600 Hz) rhythm in epileptiform discharges. Using the I CFC to label post-treatment outcomes we compared support vector machines (SVMs) and random forest (RF) machine learning classifiers for providing likelihood scores of successful treatment outcomes. Main results. (a) There was heterogeneity in AED treatment outcomes, (b) machine learning techniques could be used to rank the efficacy of AEDs by estimating likelihood scores for successful treatment outcome, (c) I CFC features yielded the most effective a priori identification of appropriate AED treatment, and (d) both classifiers performed comparably. Significance. Machine learning approaches yielded predictions of successful drug treatment outcomes which in turn could reduce the burdens of drug trials and lead to substantial improvements in patient quality of life.

  7. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach

    PubMed Central

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-01-01

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification. PMID:28629202

  8. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    PubMed

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.

  9. Using machine learning for sequence-level automated MRI protocol selection in neuroradiology.

    PubMed

    Brown, Andrew D; Marotta, Thomas R

    2018-05-01

    Incorrect imaging protocol selection can lead to important clinical findings being missed, contributing to both wasted health care resources and patient harm. We present a machine learning method for analyzing the unstructured text of clinical indications and patient demographics from magnetic resonance imaging (MRI) orders to automatically protocol MRI procedures at the sequence level. We compared 3 machine learning models - support vector machine, gradient boosting machine, and random forest - to a baseline model that predicted the most common protocol for all observations in our test set. The gradient boosting machine model significantly outperformed the baseline and demonstrated the best performance of the 3 models in terms of accuracy (95%), precision (86%), recall (80%), and Hamming loss (0.0487). This demonstrates the feasibility of automating sequence selection by applying machine learning to MRI orders. Automated sequence selection has important safety, quality, and financial implications and may facilitate improvements in the quality and safety of medical imaging service delivery.

  10. UPEML: a machine-portable CDC Update emulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mehlhorn, T.A.; Young, M.F.

    1984-12-01

    UPEML is a machine-portable CDC Update emulation program. UPEML is written in ANSI standard Fortran-77 and is relatively simple and compact. It is capable of emulating a significant subset of the standard CDC Update functions including program library creation and subsequent modification. Machine-portability is an essential attribute of UPEML. It was written primarily to facilitate the use of CDC-based scientific packages on alternate computer systems such as the VAX 11/780 and the IBM 3081.

  11. Machine learning molecular dynamics for the simulation of infrared spectra.

    PubMed

    Gastegger, Michael; Behler, Jörg; Marquetand, Philipp

    2017-10-01

    Machine learning has emerged as an invaluable tool in many research areas. In the present work, we harness this power to predict highly accurate molecular infrared spectra with unprecedented computational efficiency. To account for vibrational anharmonic and dynamical effects - typically neglected by conventional quantum chemistry approaches - we base our machine learning strategy on ab initio molecular dynamics simulations. While these simulations are usually extremely time consuming even for small molecules, we overcome these limitations by leveraging the power of a variety of machine learning techniques, not only accelerating simulations by several orders of magnitude, but also greatly extending the size of systems that can be treated. To this end, we develop a molecular dipole moment model based on environment dependent neural network charges and combine it with the neural network potential approach of Behler and Parrinello. Contrary to the prevalent big data philosophy, we are able to obtain very accurate machine learning models for the prediction of infrared spectra based on only a few hundreds of electronic structure reference points. This is made possible through the use of molecular forces during neural network potential training and the introduction of a fully automated sampling scheme. We demonstrate the power of our machine learning approach by applying it to model the infrared spectra of a methanol molecule, n -alkanes containing up to 200 atoms and the protonated alanine tripeptide, which at the same time represents the first application of machine learning techniques to simulate the dynamics of a peptide. In all of these case studies we find an excellent agreement between the infrared spectra predicted via machine learning models and the respective theoretical and experimental spectra.

  12. The Next Era: Deep Learning in Pharmaceutical Research

    PubMed Central

    Ekins, Sean

    2016-01-01

    Over the past decade we have witnessed the increasing sophistication of machine learning algorithms applied in daily use from internet searches, voice recognition, social network software to machine vision software in cameras, phones, robots and self-driving cars. Pharmaceutical research has also seen its fair share of machine learning developments. For example, applying such methods to mine the growing datasets that are created in drug discovery not only enables us to learn from the past but to predict a molecule’s properties and behavior in future. The latest machine learning algorithm garnering significant attention is deep learning, which is an artificial neural network with multiple hidden layers. Publications over the last 3 years suggest that this algorithm may have advantages over previous machine learning methods and offer a slight but discernable edge in predictive performance. The time has come for a balanced review of this technique but also to apply machine learning methods such as deep learning across a wider array of endpoints relevant to pharmaceutical research for which the datasets are growing such as physicochemical property prediction, formulation prediction, absorption, distribution, metabolism, excretion and toxicity (ADME/Tox), target prediction and skin permeation, etc. We also show that there are many potential applications of deep learning beyond cheminformatics. It will be important to perform prospective testing (which has been carried out rarely to date) in order to convince skeptics that there will be benefits from investing in this technique. PMID:27599991

  13. Component Pin Recognition Using Algorithms Based on Machine Learning

    NASA Astrophysics Data System (ADS)

    Xiao, Yang; Hu, Hong; Liu, Ze; Xu, Jiangchang

    2018-04-01

    The purpose of machine vision for a plug-in machine is to improve the machine’s stability and accuracy, and recognition of the component pin is an important part of the vision. This paper focuses on component pin recognition using three different techniques. The first technique involves traditional image processing using the core algorithm for binary large object (BLOB) analysis. The second technique uses the histogram of oriented gradients (HOG), to experimentally compare the effect of the support vector machine (SVM) and the adaptive boosting machine (AdaBoost) learning meta-algorithm classifiers. The third technique is the use of an in-depth learning method known as convolution neural network (CNN), which involves identifying the pin by comparing a sample to its training. The main purpose of the research presented in this paper is to increase the knowledge of learning methods used in the plug-in machine industry in order to achieve better results.

  14. Experimental Machine Learning of Quantum States

    NASA Astrophysics Data System (ADS)

    Gao, Jun; Qiao, Lu-Feng; Jiao, Zhi-Qiang; Ma, Yue-Chi; Hu, Cheng-Qiu; Ren, Ruo-Jing; Yang, Ai-Lin; Tang, Hao; Yung, Man-Hong; Jin, Xian-Min

    2018-06-01

    Quantum information technologies provide promising applications in communication and computation, while machine learning has become a powerful technique for extracting meaningful structures in "big data." A crossover between quantum information and machine learning represents a new interdisciplinary area stimulating progress in both fields. Traditionally, a quantum state is characterized by quantum-state tomography, which is a resource-consuming process when scaled up. Here we experimentally demonstrate a machine-learning approach to construct a quantum-state classifier for identifying the separability of quantum states. We show that it is possible to experimentally train an artificial neural network to efficiently learn and classify quantum states, without the need of obtaining the full information of the states. We also show how adding a hidden layer of neurons to the neural network can significantly boost the performance of the state classifier. These results shed new light on how classification of quantum states can be achieved with limited resources, and represent a step towards machine-learning-based applications in quantum information processing.

  15. Ensemble learning of inverse probability weights for marginal structural modeling in large observational datasets.

    PubMed

    Gruber, Susan; Logan, Roger W; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A

    2015-01-15

    Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However, a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. Copyright © 2014 John Wiley & Sons, Ltd.

  16. Ensemble learning of inverse probability weights for marginal structural modeling in large observational datasets

    PubMed Central

    Gruber, Susan; Logan, Roger W.; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A.

    2014-01-01

    Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V -fold cross validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. PMID:25316152

  17. Machine learning modelling for predicting soil liquefaction susceptibility

    NASA Astrophysics Data System (ADS)

    Samui, P.; Sitharam, T. G.

    2011-01-01

    This study describes two machine learning techniques applied to predict liquefaction susceptibility of soil based on the standard penetration test (SPT) data from the 1999 Chi-Chi, Taiwan earthquake. The first machine learning technique which uses Artificial Neural Network (ANN) based on multi-layer perceptions (MLP) that are trained with Levenberg-Marquardt backpropagation algorithm. The second machine learning technique uses the Support Vector machine (SVM) that is firmly based on the theory of statistical learning theory, uses classification technique. ANN and SVM have been developed to predict liquefaction susceptibility using corrected SPT [(N1)60] and cyclic stress ratio (CSR). Further, an attempt has been made to simplify the models, requiring only the two parameters [(N1)60 and peck ground acceleration (amax/g)], for the prediction of liquefaction susceptibility. The developed ANN and SVM models have also been applied to different case histories available globally. The paper also highlights the capability of the SVM over the ANN models.

  18. Correct machine learning on protein sequences: a peer-reviewing perspective.

    PubMed

    Walsh, Ian; Pollastri, Gianluca; Tosatto, Silvio C E

    2016-09-01

    Machine learning methods are becoming increasingly popular to predict protein features from sequences. Machine learning in bioinformatics can be powerful but carries also the risk of introducing unexpected biases, which may lead to an overestimation of the performance. This article espouses a set of guidelines to allow both peer reviewers and authors to avoid common machine learning pitfalls. Understanding biology is necessary to produce useful data sets, which have to be large and diverse. Separating the training and test process is imperative to avoid over-selling method performance, which is also dependent on several hidden parameters. A novel predictor has always to be compared with several existing methods, including simple baseline strategies. Using the presented guidelines will help nonspecialists to appreciate the critical issues in machine learning. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  19. A Critical Review for Developing Accurate and Dynamic Predictive Models Using Machine Learning Methods in Medicine and Health Care.

    PubMed

    Alanazi, Hamdan O; Abdullah, Abdul Hanan; Qureshi, Kashif Naseer

    2017-04-01

    Recently, Artificial Intelligence (AI) has been used widely in medicine and health care sector. In machine learning, the classification or prediction is a major field of AI. Today, the study of existing predictive models based on machine learning methods is extremely active. Doctors need accurate predictions for the outcomes of their patients' diseases. In addition, for accurate predictions, timing is another significant factor that influences treatment decisions. In this paper, existing predictive models in medicine and health care have critically reviewed. Furthermore, the most famous machine learning methods have explained, and the confusion between a statistical approach and machine learning has clarified. A review of related literature reveals that the predictions of existing predictive models differ even when the same dataset is used. Therefore, existing predictive models are essential, and current methods must be improved.

  20. Play It, Learn It, Make It Last: Developing an Online Game to Create Self-Sufficient Library Information Users.

    PubMed

    Boyce, Lindsay M

    2016-01-01

    Library orientation at an academic health sciences library consisted of a five-minute overview within new student orientation. Past experience indicated this brief presentation was insufficient for students to learn about library resources. In 2014, an effort was made to supplement orientation by developing an online game aimed at enabling students to become self-sufficient through hands-on learning. A gaming model was chosen with expectations that competition and rewards would motivate students. Although the pilots suffered from low participation rates, the experience merits further research into the potential of a broader model of online library instruction in the health sciences environment.

  1. Novel Breast Imaging and Machine Learning: Predicting Breast Lesion Malignancy at Cone-Beam CT Using Machine Learning Techniques.

    PubMed

    Uhlig, Johannes; Uhlig, Annemarie; Kunze, Meike; Beissbarth, Tim; Fischer, Uwe; Lotz, Joachim; Wienbeck, Susanne

    2018-05-24

    The purpose of this study is to evaluate the diagnostic performance of machine learning techniques for malignancy prediction at breast cone-beam CT (CBCT) and to compare them to human readers. Five machine learning techniques, including random forests, back propagation neural networks (BPN), extreme learning machines, support vector machines, and K-nearest neighbors, were used to train diagnostic models on a clinical breast CBCT dataset with internal validation by repeated 10-fold cross-validation. Two independent blinded human readers with profound experience in breast imaging and breast CBCT analyzed the same CBCT dataset. Diagnostic performance was compared using AUC, sensitivity, and specificity. The clinical dataset comprised 35 patients (American College of Radiology density type C and D breasts) with 81 suspicious breast lesions examined with contrast-enhanced breast CBCT. Forty-five lesions were histopathologically proven to be malignant. Among the machine learning techniques, BPNs provided the best diagnostic performance, with AUC of 0.91, sensitivity of 0.85, and specificity of 0.82. The diagnostic performance of the human readers was AUC of 0.84, sensitivity of 0.89, and specificity of 0.72 for reader 1 and AUC of 0.72, sensitivity of 0.71, and specificity of 0.67 for reader 2. AUC was significantly higher for BPN when compared with both reader 1 (p = 0.01) and reader 2 (p < 0.001). Machine learning techniques provide a high and robust diagnostic performance in the prediction of malignancy in breast lesions identified at CBCT. BPNs showed the best diagnostic performance, surpassing human readers in terms of AUC and specificity.

  2. Towards a lifelong learning society through reading promotion: Opportunities and challenges for libraries and community learning centres in Viet Nam

    NASA Astrophysics Data System (ADS)

    Hossain, Zakir

    2016-04-01

    The government of Viet Nam has made a commitment to build a Lifelong Learning Society by 2020. A range of related initiatives have been launched, including the Southeast Asian Ministers of Education Organization Centre for Lifelong Learning (SEAMEO CELLL) and "Book Day" - a day aimed at encouraging reading and raising awareness of its importance for the development of knowledge and skills. Viet Nam also aims to implement lifelong learning (LLL) activities in libraries, museums, cultural centres and clubs. The government of Viet Nam currently operates more than 11,900 Community Learning Centres (CLCs) and is in the process of both renovating and innovating public libraries and museums throughout the country. In addition to the work undertaken by the Viet Nam government, a number of enterprises have been initiated by non-governmental organisations and non-profit organisations to promote literacy and lifelong learning. This paper investigates some government initiatives focused on libraries and CLCs and their impact on reading promotion. Proposing a way forward, the paper confirms that Viet Nam's libraries and CLCs play an essential role in promoting reading and building a LLL Society.

  3. Machine learning of molecular properties: Locality and active learning

    NASA Astrophysics Data System (ADS)

    Gubaev, Konstantin; Podryabinkin, Evgeny V.; Shapeev, Alexander V.

    2018-06-01

    In recent years, the machine learning techniques have shown great potent1ial in various problems from a multitude of disciplines, including materials design and drug discovery. The high computational speed on the one hand and the accuracy comparable to that of density functional theory on another hand make machine learning algorithms efficient for high-throughput screening through chemical and configurational space. However, the machine learning algorithms available in the literature require large training datasets to reach the chemical accuracy and also show large errors for the so-called outliers—the out-of-sample molecules, not well-represented in the training set. In the present paper, we propose a new machine learning algorithm for predicting molecular properties that addresses these two issues: it is based on a local model of interatomic interactions providing high accuracy when trained on relatively small training sets and an active learning algorithm of optimally choosing the training set that significantly reduces the errors for the outliers. We compare our model to the other state-of-the-art algorithms from the literature on the widely used benchmark tests.

  4. SSL - THE SIMPLE SOCKETS LIBRARY

    NASA Technical Reports Server (NTRS)

    Campbell, C. E.

    1994-01-01

    The Simple Sockets Library (SSL) allows C programmers to develop systems of cooperating programs using Berkeley streaming Sockets running under the TCP/IP protocol over Ethernet. The SSL provides a simple way to move information between programs running on the same or different machines and does so with little overhead. The SSL can create three types of Sockets: namely a server, a client, and an accept Socket. The SSL's Sockets are designed to be used in a fashion reminiscent of the use of FILE pointers so that a C programmer who is familiar with reading and writing files will immediately feel comfortable with reading and writing with Sockets. The SSL consists of three parts: the library, PortMaster, and utilities. The user of the SSL accesses it by linking programs to the SSL library. The PortMaster initializes connections between clients and servers. The PortMaster also supports a "firewall" facility to keep out socket requests from unapproved machines. The "firewall" is a file which contains Internet addresses for all approved machines. There are three utilities provided with the SSL. SKTDBG can be used to debug programs that make use of the SSL. SPMTABLE lists the servers and port numbers on requested machine(s). SRMSRVR tells the PortMaster to forcibly remove a server name from its list. The package also includes two example programs: multiskt.c, which makes multiple accepts on one server, and sktpoll.c, which repeatedly attempts to connect a client to some server at one second intervals. SSL is a machine independent library written in the C-language for computers connected via Ethernet using the TCP/IP protocol. It has been successfully compiled and implemented on a variety of platforms, including Sun series computers running SunOS, DEC VAX series computers running VMS, SGI computers running IRIX, DECstations running ULTRIX, DEC alpha AXPs running OSF/1, IBM RS/6000 computers running AIX, IBM PC and compatibles running BSD/386 UNIX and HP Apollo 3000/4000/9000/400T computers running HP-UX. SSL requires 45K of RAM to run under SunOS and 80K of RAM to run under VMS. For use on IBM PC series computers and compatibles running DOS, SSL requires Microsoft C 6.0 and the Wollongong TCP/IP package. Source code for sample programs and debugging tools are provided. The documentation is available on the distribution medium in TeX and PostScript formats. The standard distribution medium for SSL is a .25 inch streaming magnetic tape cartridge (QIC-24) in UNIX tar format. It is also available on a 3.5 inch diskette in UNIX tar format and a 5.25 inch 360K MS-DOS format diskette. The SSL was developed in 1992 and was updated in 1993.

  5. Human and Machine Entanglement in the Digital Archive: Academic Libraries and Socio-Technical Change

    ERIC Educational Resources Information Center

    Manoff, Marlene

    2015-01-01

    This essay urges a broadening of the discourse of library and information science (LIS) to address the convergence of forces shaping the information environment. It proposes adopting a model from the field of science studies that acknowledges the interdependence and coevolution of social, cultural, and material phenomena. Digital archives and…

  6. The Florida State University's Learning District: A Case Study of an Academic Library-Run Peer Tutoring Program

    ERIC Educational Resources Information Center

    Demeter, Michelle

    2011-01-01

    In March 2010, the first floor of the main library at The Florida State University was renovated as a learning commons. With this change in design, all tutoring that existed throughout the library was moved into the commons. The crown jewel of these programs is the library's in-house, late-night peer tutoring program that has seen incredible…

  7. Machine-Learning Algorithms to Automate Morphological and Functional Assessments in 2D Echocardiography.

    PubMed

    Narula, Sukrit; Shameer, Khader; Salem Omar, Alaa Mabrouk; Dudley, Joel T; Sengupta, Partho P

    2016-11-29

    Machine-learning models may aid cardiac phenotypic recognition by using features of cardiac tissue deformation. This study investigated the diagnostic value of a machine-learning framework that incorporates speckle-tracking echocardiographic data for automated discrimination of hypertrophic cardiomyopathy (HCM) from physiological hypertrophy seen in athletes (ATH). Expert-annotated speckle-tracking echocardiographic datasets obtained from 77 ATH and 62 HCM patients were used for developing an automated system. An ensemble machine-learning model with 3 different machine-learning algorithms (support vector machines, random forests, and artificial neural networks) was developed and a majority voting method was used for conclusive predictions with further K-fold cross-validation. Feature selection using an information gain (IG) algorithm revealed that volume was the best predictor for differentiating between HCM ands. ATH (IG = 0.24) followed by mid-left ventricular segmental (IG = 0.134) and average longitudinal strain (IG = 0.131). The ensemble machine-learning model showed increased sensitivity and specificity compared with early-to-late diastolic transmitral velocity ratio (p < 0.01), average early diastolic tissue velocity (e') (p < 0.01), and strain (p = 0.04). Because ATH were younger, adjusted analysis was undertaken in younger HCM patients and compared with ATH with left ventricular wall thickness >13 mm. In this subgroup analysis, the automated model continued to show equal sensitivity, but increased specificity relative to early-to-late diastolic transmitral velocity ratio, e', and strain. Our results suggested that machine-learning algorithms can assist in the discrimination of physiological versus pathological patterns of hypertrophic remodeling. This effort represents a step toward the development of a real-time, machine-learning-based system for automated interpretation of echocardiographic images, which may help novice readers with limited experience. Copyright © 2016 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.

  8. Machine learning: Trends, perspectives, and prospects.

    PubMed

    Jordan, M I; Mitchell, T M

    2015-07-17

    Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing. Copyright © 2015, American Association for the Advancement of Science.

  9. Learning Activity Packets for Milling Machines. Unit II--Horizontal Milling Machines.

    ERIC Educational Resources Information Center

    Oklahoma State Board of Vocational and Technical Education, Stillwater. Curriculum and Instructional Materials Center.

    This learning activity packet (LAP) outlines the study activities and performance tasks covered in a related curriculum guide on milling machines. The course of study in this LAP is intended to help students learn to set up and operate a horizontal mill. Tasks addressed in the LAP include mounting style "A" or "B" arbors and adjusting arbor…

  10. Machine learning for science: state of the art and future prospects.

    PubMed

    Mjolsness, E; DeCoste, D

    2001-09-14

    Recent advances in machine learning methods, along with successful applications across a wide variety of fields such as planetary science and bioinformatics, promise powerful new tools for practicing scientists. This viewpoint highlights some useful characteristics of modern machine learning methods and their relevance to scientific applications. We conclude with some speculations on near-term progress and promising directions.

  11. Advancing Research in Second Language Writing through Computational Tools and Machine Learning Techniques: A Research Agenda

    ERIC Educational Resources Information Center

    Crossley, Scott A.

    2013-01-01

    This paper provides an agenda for replication studies focusing on second language (L2) writing and the use of natural language processing (NLP) tools and machine learning algorithms. Specifically, it introduces a range of the available NLP tools and machine learning algorithms and demonstrates how these could be used to replicate seminal studies…

  12. Machine Learning in the Presence of an Adversary: Attacking and Defending the SpamBayes Spam Filter

    DTIC Science & Technology

    2008-05-20

    Machine learning techniques are often used for decision making in security critical applications such as intrusion detection and spam filtering...filter. The defenses shown in this thesis are able to work against the attacks developed against SpamBayes and are sufficiently generic to be easily extended into other statistical machine learning algorithms.

  13. Deep Learning through Concept-Based Inquiry

    ERIC Educational Resources Information Center

    Donham, Jean

    2010-01-01

    Learning in the library should present opportunities to enrich student learning activities to address concerns of interest and cognitive complexity, but these must be tasks that call for in-depth analysis--not merely gathering facts. Library learning experiences need to demand enough of students to keep them interested and also need to be…

  14. Predicting the outcomes of organic reactions via machine learning: are current descriptors sufficient?

    PubMed

    Skoraczyński, G; Dittwald, P; Miasojedow, B; Szymkuć, S; Gajewska, E P; Grzybowski, B A; Gambin, A

    2017-06-15

    As machine learning/artificial intelligence algorithms are defeating chess masters and, most recently, GO champions, there is interest - and hope - that they will prove equally useful in assisting chemists in predicting outcomes of organic reactions. This paper demonstrates, however, that the applicability of machine learning to the problems of chemical reactivity over diverse types of chemistries remains limited - in particular, with the currently available chemical descriptors, fundamental mathematical theorems impose upper bounds on the accuracy with which raction yields and times can be predicted. Improving the performance of machine-learning methods calls for the development of fundamentally new chemical descriptors.

  15. Ten quick tips for machine learning in computational biology.

    PubMed

    Chicco, Davide

    2017-01-01

    Machine learning has become a pivotal tool for many projects in computational biology, bioinformatics, and health informatics. Nevertheless, beginners and biomedical researchers often do not have enough experience to run a data mining project effectively, and therefore can follow incorrect practices, that may lead to common mistakes or over-optimistic results. With this review, we present ten quick tips to take advantage of machine learning in any computational biology context, by avoiding some common errors that we observed hundreds of times in multiple bioinformatics projects. We believe our ten suggestions can strongly help any machine learning practitioner to carry on a successful project in computational biology and related sciences.

  16. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    PubMed

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.

  17. Game-powered machine learning

    PubMed Central

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-01-01

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the “wisdom of the crowds.” Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., “funky jazz with saxophone,” “spooky electronica,” etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data. PMID:22460786

  18. Inverse Problems in Geodynamics Using Machine Learning Algorithms

    NASA Astrophysics Data System (ADS)

    Shahnas, M. H.; Yuen, D. A.; Pysklywec, R. N.

    2018-01-01

    During the past few decades numerical studies have been widely employed to explore the style of circulation and mixing in the mantle of Earth and other planets. However, in geodynamical studies there are many properties from mineral physics, geochemistry, and petrology in these numerical models. Machine learning, as a computational statistic-related technique and a subfield of artificial intelligence, has rapidly emerged recently in many fields of sciences and engineering. We focus here on the application of supervised machine learning (SML) algorithms in predictions of mantle flow processes. Specifically, we emphasize on estimating mantle properties by employing machine learning techniques in solving an inverse problem. Using snapshots of numerical convection models as training samples, we enable machine learning models to determine the magnitude of the spin transition-induced density anomalies that can cause flow stagnation at midmantle depths. Employing support vector machine algorithms, we show that SML techniques can successfully predict the magnitude of mantle density anomalies and can also be used in characterizing mantle flow patterns. The technique can be extended to more complex geodynamic problems in mantle dynamics by employing deep learning algorithms for putting constraints on properties such as viscosity, elastic parameters, and the nature of thermal and chemical anomalies.

  19. Game-powered machine learning.

    PubMed

    Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert

    2012-04-24

    Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the "wisdom of the crowds." Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., "funky jazz with saxophone," "spooky electronica," etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.

  20. Advances in Machine Learning and Data Mining for Astronomy

    NASA Astrophysics Data System (ADS)

    Way, Michael J.; Scargle, Jeffrey D.; Ali, Kamal M.; Srivastava, Ashok N.

    2012-03-01

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications. With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.

  1. A Model for Designing Library Instruction for Distance Learning

    ERIC Educational Resources Information Center

    Rand, Angela Doucet

    2013-01-01

    Providing library instruction in distance learning environments presents a unique set of challenges for instructional librarians. Innovations in computer-mediated communication and advances in cognitive science research provide the opportunity for designing library instruction that meets a variety of student information seeking needs. Using a…

  2. Creating an Online Library To Support a Virtual Learning Community.

    ERIC Educational Resources Information Center

    Sandelands, Eric

    1998-01-01

    International Management Centres (IMC), an independent business school, and Anbar Electronic Intelligence (AEI), a database publisher, have created a virtual library for IMC's virtual business school. Topics discussed include action learning; IMC's partnership with AEI; the virtual university model; designing virtual library resources; and…

  3. Virtual Reference Services.

    ERIC Educational Resources Information Center

    Brewer, Sally

    2003-01-01

    As the need to access information increases, school librarians must create virtual libraries. Linked to reliable reference resources, the virtual library extends the physical collection and library hours and lets students learn to use Web-based resources in a protected learning environment. The growing number of virtual schools increases the need…

  4. Systems Design and Pilot Operation of a Regional Center for Technical Processing for the Libraries of the New England State Universities. NELINET, New England Library Information Network. Progress Report, July 1, 1967 - March 30, 1968, Volume II, Appendices.

    ERIC Educational Resources Information Center

    Agenbroad, James E.; And Others

    Included in this volume of appendices to LI 000 979 are acquisitions flow charts; a current operations questionnaire; an algorithm for splitting the Library of Congress call number; analysis of the Machine-Readable Cataloging (MARC II) format; production problems and decisions; operating procedures for information transmittal in the New England…

  5. Machine learning to predict the occurrence of bisphosphonate-related osteonecrosis of the jaw associated with dental extraction: A preliminary report.

    PubMed

    Kim, Dong Wook; Kim, Hwiyoung; Nam, Woong; Kim, Hyung Jun; Cha, In-Ho

    2018-04-23

    The aim of this study was to build and validate five types of machine learning models that can predict the occurrence of BRONJ associated with dental extraction in patients taking bisphosphonates for the management of osteoporosis. A retrospective review of the medical records was conducted to obtain cases and controls for the study. Total 125 patients consisting of 41 cases and 84 controls were selected for the study. Five machine learning prediction algorithms including multivariable logistic regression model, decision tree, support vector machine, artificial neural network, and random forest were implemented. The outputs of these models were compared with each other and also with conventional methods, such as serum CTX level. Area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the results. The performance of machine learning models was significantly superior to conventional statistical methods and single predictors. The random forest model yielded the best performance (AUC = 0.973), followed by artificial neural network (AUC = 0.915), support vector machine (AUC = 0.882), logistic regression (AUC = 0.844), decision tree (AUC = 0.821), drug holiday alone (AUC = 0.810), and CTX level alone (AUC = 0.630). Machine learning methods showed superior performance in predicting BRONJ associated with dental extraction compared to conventional statistical methods using drug holiday and serum CTX level. Machine learning can thus be applied in a wide range of clinical studies. Copyright © 2017. Published by Elsevier Inc.

  6. Using NLM exhibits and events to engage library users and reach the community.

    PubMed

    Auten, Beth; Norton, Hannah F; Tennant, Michele R; Edwards, Mary E; Stoyan-Rosenzweig, Nina; Daley, Matthew

    2013-01-01

    In an effort to reach out to library users and make the library a more relevant, welcoming place, the University of Florida's Health Science Center Library hosted exhibits from the National Library of Medicine's (NLM) Traveling Exhibition Program. From 2010 through 2012, the library hosted four NLM exhibits and created event series for each. Through reflection and use of a participant survey, lessons were learned concerning creating relevant programs, marketing events, and forming new partnerships. Each successive exhibit added events and activities to address different audiences. A survey of libraries that have hosted NLM exhibits highlights lessons learned at those institutions.

  7. Using NLM Exhibits and Events to Engage Library Users and Reach the Community

    PubMed Central

    Auten, Beth; Norton, Hannah F.; Tennant, Michele R.; Edwards, Mary E.; Stoyan-Rosenzweig, Nina; Daley, Matthew

    2013-01-01

    In an effort to reach out to library users and make the library a more relevant, welcoming place, the University of Florida’s Health Science Center Library hosted exhibits from the National Library of Medicine’s (NLM) Traveling Exhibition Program. From 2010 through 2012, the library hosted four NLM exhibits and created event series for each. Through reflection and use of a participant survey, lessons were learned concerning creating relevant programs, marketing events, and forming new partnerships. Each successive exhibit added events and activities to address different audiences. A survey of libraries that have hosted NLM exhibits highlights lessons learned at those institutions. PMID:23869634

  8. Real English: A Translator to Enable Natural Language Man-Machine Conversation.

    ERIC Educational Resources Information Center

    Gautin, Harvey

    This dissertation presents a pragmatic interpreter/translator called Real English to serve as a natural language man-machine communication interface in a multi-mode on-line information retrieval system. This multi-mode feature affords the user a library-like searching tool by giving him access to a dictionary, lexicon, thesaurus, synonym table,…

  9. Prediction of drug synergy in cancer using ensemble-based machine learning techniques

    NASA Astrophysics Data System (ADS)

    Singh, Harpreet; Rana, Prashant Singh; Singh, Urvinder

    2018-04-01

    Drug synergy prediction plays a significant role in the medical field for inhibiting specific cancer agents. It can be developed as a pre-processing tool for therapeutic successes. Examination of different drug-drug interaction can be done by drug synergy score. It needs efficient regression-based machine learning approaches to minimize the prediction errors. Numerous machine learning techniques such as neural networks, support vector machines, random forests, LASSO, Elastic Nets, etc., have been used in the past to realize requirement as mentioned above. However, these techniques individually do not provide significant accuracy in drug synergy score. Therefore, the primary objective of this paper is to design a neuro-fuzzy-based ensembling approach. To achieve this, nine well-known machine learning techniques have been implemented by considering the drug synergy data. Based on the accuracy of each model, four techniques with high accuracy are selected to develop ensemble-based machine learning model. These models are Random forest, Fuzzy Rules Using Genetic Cooperative-Competitive Learning method (GFS.GCCL), Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Dynamic Evolving Neural-Fuzzy Inference System method (DENFIS). Ensembling is achieved by evaluating the biased weighted aggregation (i.e. adding more weights to the model with a higher prediction score) of predicted data by selected models. The proposed and existing machine learning techniques have been evaluated on drug synergy score data. The comparative analysis reveals that the proposed method outperforms others in terms of accuracy, root mean square error and coefficient of correlation.

  10. A Comparison of a Machine Learning Model with EuroSCORE II in Predicting Mortality after Elective Cardiac Surgery: A Decision Curve Analysis.

    PubMed

    Allyn, Jérôme; Allou, Nicolas; Augustin, Pascal; Philip, Ivan; Martinet, Olivier; Belghiti, Myriem; Provenchere, Sophie; Montravers, Philippe; Ferdynus, Cyril

    2017-01-01

    The benefits of cardiac surgery are sometimes difficult to predict and the decision to operate on a given individual is complex. Machine Learning and Decision Curve Analysis (DCA) are recent methods developed to create and evaluate prediction models. We conducted a retrospective cohort study using a prospective collected database from December 2005 to December 2012, from a cardiac surgical center at University Hospital. The different models of prediction of mortality in-hospital after elective cardiac surgery, including EuroSCORE II, a logistic regression model and a machine learning model, were compared by ROC and DCA. Of the 6,520 patients having elective cardiac surgery with cardiopulmonary bypass, 6.3% died. Mean age was 63.4 years old (standard deviation 14.4), and mean EuroSCORE II was 3.7 (4.8) %. The area under ROC curve (IC95%) for the machine learning model (0.795 (0.755-0.834)) was significantly higher than EuroSCORE II or the logistic regression model (respectively, 0.737 (0.691-0.783) and 0.742 (0.698-0.785), p < 0.0001). Decision Curve Analysis showed that the machine learning model, in this monocentric study, has a greater benefit whatever the probability threshold. According to ROC and DCA, machine learning model is more accurate in predicting mortality after elective cardiac surgery than EuroSCORE II. These results confirm the use of machine learning methods in the field of medical prediction.

  11. The Role of the School Library: Reflections from Sweden

    ERIC Educational Resources Information Center

    Avery, Helen

    2014-01-01

    Libraries are critical learning spaces and may play a significant role in intercultural education initiatives, particularly in Sweden where the national curriculum ascribes central functions to libraries for learning activities. Unfortunately, the ways in which teachers and librarians may collaborate to leverage mutual resources is not fully…

  12. Distance Learning Library Services in Ugandan Universities

    ERIC Educational Resources Information Center

    Mayende, Jackline Estomihi Kiwelu; Obura, Constant Okello

    2013-01-01

    The study carried out at Makerere University and Uganda Martyrs University in 2010 aimed at providing strategies for enhanced distance learning library services in terms of convenience and adequacy. The study adopted a cross sectional descriptive survey design. The study revealed services provided in branch libraries in Ugandan universities were…

  13. School Library Media Program Connections for Learning.

    ERIC Educational Resources Information Center

    Bookmark, 1991

    1991-01-01

    The 29 articles in this theme issue of "The Bookmark" focus on various aspects of school library media programs. The articles are as follows: (1) "School Library Media Program Connections for Learning" (Betty J. Morris); (2) "Humanity and Technology in the School of the Future" (Michael V. McGill); (3) "Community…

  14. A Splendid Torch: Learning and Teaching in Today's Academic Libraries

    ERIC Educational Resources Information Center

    Eyre, Jodi Reeves, Ed.; Maclachlan, John C., Ed.; Williford, Christa, Ed.

    2017-01-01

    Six essays, written collaboratively by current and former Council on Library and Information Resources (CLIR) postdoctoral fellows, explore the contributions that today's academic libraries--as providers of resources, professional support, and space--are making to learning and teaching. Topics include the continuing evolution of the learning…

  15. Enriching Critical Thinking and Language Learning with Educational Digital Libraries

    ERIC Educational Resources Information Center

    Lu, Hsin-lin

    2012-01-01

    As the amount of information available in online digital libraries increases exponentially, questions arise concerning the most productive way to use that information to advance learning. Applying the earlier information seeking theories advocated by Kelly (1963), Taylor (1968), and Belkin (1980) to the digital libraries experience, Carol Kuhlthau…

  16. Using Social Networks to Create Powerful Learning Communities

    ERIC Educational Resources Information Center

    Lenox, Marianne; Coleman, Maurice

    2010-01-01

    Regular readers of "Computers in Libraries" are aware that social networks are forming increasingly important linkages to professional and personal development in all libraries. Live and virtual social networks have become the new learning playground for librarians and library staff. Social networks have the ability to connect those who are…

  17. CellSort: a support vector machine tool for optimizing fluorescence-activated cell sorting and reducing experimental effort.

    PubMed

    Yu, Jessica S; Pertusi, Dante A; Adeniran, Adebola V; Tyo, Keith E J

    2017-03-15

    High throughput screening by fluorescence activated cell sorting (FACS) is a common task in protein engineering and directed evolution. It can also be a rate-limiting step if high false positive or negative rates necessitate multiple rounds of enrichment. Current FACS software requires the user to define sorting gates by intuition and is practically limited to two dimensions. In cases when multiple rounds of enrichment are required, the software cannot forecast the enrichment effort required. We have developed CellSort, a support vector machine (SVM) algorithm that identifies optimal sorting gates based on machine learning using positive and negative control populations. CellSort can take advantage of more than two dimensions to enhance the ability to distinguish between populations. We also present a Bayesian approach to predict the number of sorting rounds required to enrich a population from a given library size. This Bayesian approach allowed us to determine strategies for biasing the sorting gates in order to reduce the required number of enrichment rounds. This algorithm should be generally useful for improve sorting outcomes and reducing effort when using FACS. Source code available at http://tyolab.northwestern.edu/tools/ . k-tyo@northwestern.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  18. Machine Learning

    NASA Astrophysics Data System (ADS)

    Hoffmann, Achim; Mahidadia, Ashesh

    The purpose of this chapter is to present fundamental ideas and techniques of machine learning suitable for the field of this book, i.e., for automated scientific discovery. The chapter focuses on those symbolic machine learning methods, which produce results that are suitable to be interpreted and understood by humans. This is particularly important in the context of automated scientific discovery as the scientific theories to be produced by machines are usually meant to be interpreted by humans. This chapter contains some of the most influential ideas and concepts in machine learning research to give the reader a basic insight into the field. After the introduction in Sect. 1, general ideas of how learning problems can be framed are given in Sect. 2. The section provides useful perspectives to better understand what learning algorithms actually do. Section 3 presents the Version space model which is an early learning algorithm as well as a conceptual framework, that provides important insight into the general mechanisms behind most learning algorithms. In section 4, a family of learning algorithms, the AQ family for learning classification rules is presented. The AQ family belongs to the early approaches in machine learning. The next, Sect. 5 presents the basic principles of decision tree learners. Decision tree learners belong to the most influential class of inductive learning algorithms today. Finally, a more recent group of learning systems are presented in Sect. 6, which learn relational concepts within the framework of logic programming. This is a particularly interesting group of learning systems since the framework allows also to incorporate background knowledge which may assist in generalisation. Section 7 discusses Association Rules - a technique that comes from the related field of Data mining. Section 8 presents the basic idea of the Naive Bayesian Classifier. While this is a very popular learning technique, the learning result is not well suited for human comprehension as it is essentially a large collection of probability values. In Sect. 9, we present a generic method for improving accuracy of a given learner by generatingmultiple classifiers using variations of the training data. While this works well in most cases, the resulting classifiers have significantly increased complexity and, hence, tend to destroy the human readability of the learning result that a single learner may produce. Section 10 contains a summary, mentions briefly other techniques not discussed in this chapter and presents outlook on the potential of machine learning in the future.

  19. Active Learning and Teaching: Improving Postsecondary Library Instruction.

    ERIC Educational Resources Information Center

    Allen, Eileen E.

    1995-01-01

    Discusses ways to improve postsecondary library instruction based on theories of active learning. Topics include a historical background of active learning; student achievement and attitudes; cognitive development; risks; active teaching; and instructional techniques, including modified lectures, brainstorming, small group work, cooperative…

  20. Multi-centre diagnostic classification of individual structural neuroimaging scans from patients with major depressive disorder.

    PubMed

    Mwangi, Benson; Ebmeier, Klaus P; Matthews, Keith; Steele, J Douglas

    2012-05-01

    Quantitative abnormalities of brain structure in patients with major depressive disorder have been reported at a group level for decades. However, these structural differences appear subtle in comparison with conventional radiologically defined abnormalities, with considerable inter-subject variability. Consequently, it has not been possible to readily identify scans from patients with major depressive disorder at an individual level. Recently, machine learning techniques such as relevance vector machines and support vector machines have been applied to predictive classification of individual scans with variable success. Here we describe a novel hybrid method, which combines machine learning with feature selection and characterization, with the latter aimed at maximizing the accuracy of machine learning prediction. The method was tested using a multi-centre dataset of T(1)-weighted 'structural' scans. A total of 62 patients with major depressive disorder and matched controls were recruited from referred secondary care clinical populations in Aberdeen and Edinburgh, UK. The generalization ability and predictive accuracy of the classifiers was tested using data left out of the training process. High prediction accuracy was achieved (~90%). While feature selection was important for maximizing high predictive accuracy with machine learning, feature characterization contributed only a modest improvement to relevance vector machine-based prediction (~5%). Notably, while the only information provided for training the classifiers was T(1)-weighted scans plus a categorical label (major depressive disorder versus controls), both relevance vector machine and support vector machine 'weighting factors' (used for making predictions) correlated strongly with subjective ratings of illness severity. These results indicate that machine learning techniques have the potential to inform clinical practice and research, as they can make accurate predictions about brain scan data from individual subjects. Furthermore, machine learning weighting factors may reflect an objective biomarker of major depressive disorder illness severity, based on abnormalities of brain structure.

  1. Evolving autonomous learning in cognitive networks.

    PubMed

    Sheneman, Leigh; Hintze, Arend

    2017-12-01

    There are two common approaches for optimizing the performance of a machine: genetic algorithms and machine learning. A genetic algorithm is applied over many generations whereas machine learning works by applying feedback until the system meets a performance threshold. These methods have been previously combined, particularly in artificial neural networks using an external objective feedback mechanism. We adapt this approach to Markov Brains, which are evolvable networks of probabilistic and deterministic logic gates. Prior to this work MB could only adapt from one generation to the other, so we introduce feedback gates which augment their ability to learn during their lifetime. We show that Markov Brains can incorporate these feedback gates in such a way that they do not rely on an external objective feedback signal, but instead can generate internal feedback that is then used to learn. This results in a more biologically accurate model of the evolution of learning, which will enable us to study the interplay between evolution and learning and could be another step towards autonomously learning machines.

  2. Using Machine Learning for Behavior-Based Access Control: Scalable Anomaly Detection on TCP Connections and HTTP Requests

    DTIC Science & Technology

    2013-11-01

    machine learning techniques used in BBAC to make predictions about the intent of actors establishing TCP connections and issuing HTTP requests. We discuss pragmatic challenges and solutions we encountered in implementing and evaluating BBAC, discussing (a) the general concepts underlying BBAC, (b) challenges we have encountered in identifying suitable datasets, (c) mitigation strategies to cope...and describe current plans for transitioning BBAC capabilities into the Department of Defense together with lessons learned for the machine learning

  3. Generative Modeling for Machine Learning on the D-Wave

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thulasidasan, Sunil

    These are slides on Generative Modeling for Machine Learning on the D-Wave. The following topics are detailed: generative models; Boltzmann machines: a generative model; restricted Boltzmann machines; learning parameters: RBM training; practical ways to train RBM; D-Wave as a Boltzmann sampler; mapping RBM onto the D-Wave; Chimera restricted RBM; mapping binary RBM to Ising model; experiments; data; D-Wave effective temperature, parameters noise, etc.; experiments: contrastive divergence (CD) 1 step; after 50 steps of CD; after 100 steps of CD; D-Wave (experiments 1, 2, 3); D-Wave observations.

  4. Implementing Machine Learning in the PCWG Tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clifton, Andrew; Ding, Yu; Stuart, Peter

    The Power Curve Working Group (www.pcwg.org) is an ad-hoc industry-led group to investigate the performance of wind turbines in real-world conditions. As part of ongoing experience-sharing exercises, machine learning has been proposed as a possible way to predict turbine performance. This presentation provides some background information about machine learning and how it might be implemented in the PCWG exercises.

  5. Adaptive Learning Systems: Beyond Teaching Machines

    ERIC Educational Resources Information Center

    Kara, Nuri; Sevim, Nese

    2013-01-01

    Since 1950s, teaching machines have changed a lot. Today, we have different ideas about how people learn, what instructor should do to help students during their learning process. We have adaptive learning technologies that can create much more student oriented learning environments. The purpose of this article is to present these changes and its…

  6. Learning Analytics and the Academic Library: Professional Ethics Commitments at a Crossroads

    ERIC Educational Resources Information Center

    Jones, Kyle M. L.; Salo, Dorothea

    2018-01-01

    In this paper, the authors address learning analytics and the ways academic libraries are beginning to participate in wider institutional learning analytics initiatives. Since there are moral issues associated with learning analytics, the authors consider how data mining practices run counter to ethical principles in the American Library…

  7. An introduction to deep learning on biological sequence data: examples and solutions.

    PubMed

    Jurtz, Vanessa Isabell; Johansen, Alexander Rosenberg; Nielsen, Morten; Almagro Armenteros, Jose Juan; Nielsen, Henrik; Sønderby, Casper Kaae; Winther, Ole; Sønderby, Søren Kaae

    2017-11-15

    Deep neural network architectures such as convolutional and long short-term memory networks have become increasingly popular as machine learning tools during the recent years. The availability of greater computational resources, more data, new algorithms for training deep models and easy to use libraries for implementation and training of neural networks are the drivers of this development. The use of deep learning has been especially successful in image recognition; and the development of tools, applications and code examples are in most cases centered within this field rather than within biology. Here, we aim to further the development of deep learning methods within biology by providing application examples and ready to apply and adapt code templates. Given such examples, we illustrate how architectures consisting of convolutional and long short-term memory neural networks can relatively easily be designed and trained to state-of-the-art performance on three biological sequence problems: prediction of subcellular localization, protein secondary structure and the binding of peptides to MHC Class II molecules. All implementations and datasets are available online to the scientific community at https://github.com/vanessajurtz/lasagne4bio. skaaesonderby@gmail.com. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  8. Quantum neural network based machine translator for Hindi to English.

    PubMed

    Narayan, Ravi; Singh, V P; Chakraverty, S

    2014-01-01

    This paper presents the machine learning based machine translation system for Hindi to English, which learns the semantically correct corpus. The quantum neural based pattern recognizer is used to recognize and learn the pattern of corpus, using the information of part of speech of individual word in the corpus, like a human. The system performs the machine translation using its knowledge gained during the learning by inputting the pair of sentences of Devnagri-Hindi and English. To analyze the effectiveness of the proposed approach, 2600 sentences have been evaluated during simulation and evaluation. The accuracy achieved on BLEU score is 0.7502, on NIST score is 6.5773, on ROUGE-L score is 0.9233, and on METEOR score is 0.5456, which is significantly higher in comparison with Google Translation and Bing Translation for Hindi to English Machine Translation.

  9. Machine Learning-Based Classification of 38 Years of Spine-Related Literature Into 100 Research Topics.

    PubMed

    Sing, David C; Metz, Lionel N; Dudli, Stefan

    2017-06-01

    Retrospective review. To identify the top 100 spine research topics. Recent advances in "machine learning," or computers learning without explicit instructions, have yielded broad technological advances. Topic modeling algorithms can be applied to large volumes of text to discover quantifiable themes and trends. Abstracts were extracted from the National Library of Medicine PubMed database from five prominent peer-reviewed spine journals (European Spine Journal [ESJ], The Spine Journal [SpineJ], Spine, Journal of Spinal Disorders and Techniques [JSDT], Journal of Neurosurgery: Spine [JNS]). Each abstract was entered into a latent Dirichlet allocation model specified to discover 100 topics, resulting in each abstract being assigned a probability of belonging in a topic. Topics were named using the five most frequently appearing terms within that topic. Significance of increasing ("hot") or decreasing ("cold") topic popularity over time was evaluated with simple linear regression. From 1978 to 2015, 25,805 spine-related research articles were extracted and classified into 100 topics. Top two most published topics included "clinical, surgeons, guidelines, information, care" (n = 496 articles) and "pain, back, low, treatment, chronic" (424). Top two hot trends included "disc, cervical, replacement, level, arthroplasty" (+0.05%/yr, P < 0.001), and "minimally, invasive, approach, technique" (+0.05%/yr, P < 0.001). By journal, the most published topics were ESJ-"operative, surgery, postoperative, underwent, preoperative"; SpineJ-"clinical, surgeons, guidelines, information, care"; Spine-"pain, back, low, treatment, chronic"; JNS- "tumor, lesions, rare, present, diagnosis"; JSDT-"cervical, anterior, plate, fusion, ACDF." Topics discovered through latent Dirichlet allocation modeling represent unbiased meaningful themes relevant to spine care. Topic dynamics can provide historical context and direction for future research for aspiring investigators and trainees interested in spine careers. Please explore https://singdc.shinyapps.io/spinetopics. N A.

  10. Energy landscapes for machine learning

    NASA Astrophysics Data System (ADS)

    Ballard, Andrew J.; Das, Ritankar; Martiniani, Stefano; Mehta, Dhagash; Sagun, Levent; Stevenson, Jacob D.; Wales, David J.

    Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences. Fitting functions that exhibit multiple solutions as local minima can be analysed in terms of the corresponding machine learning landscape. Methods to explore and visualise molecular potential energy landscapes can be applied to these machine learning landscapes to gain new insight into the solution space involved in training and the nature of the corresponding predictions. In particular, we can define quantities analogous to molecular structure, thermodynamics, and kinetics, and relate these emergent properties to the structure of the underlying landscape. This Perspective aims to describe these analogies with examples from recent applications, and suggest avenues for new interdisciplinary research.

  11. [The urologist of the future and new technologies.

    PubMed

    Peinado, Francois; Fernández, Atanasio; Teba, Fernando; Celada, Guillermo; Acosta, Marco Antonio

    2018-01-01

    The last 25 years have brought about revolutionary changes for medicine and in particular for urology: internet was only in its infancy, medical records were written on paper, searches for medical information were done in the hospital library, medical articles were photocopied and our relationship with patients only existed face to face. Social networks had not yet appeared and even Google did not exist. Just imagine what might happen during the next 25 years, we're going to see even more radical changes. The urologist of the future is going to see the arrival of artificial intelligence, collaborative medicine, telemedicine, machine learning, the Internet of Things and personalized robotics; in the meantime, social media will continue to transform the interaction between physician and patient. The training of urologists will also be different thanks to new learning technologies such as virtual reality or augmented reality. IBM Watson Health through its system of artificial intelligence and its learning algorithms will become our essential travel companion. The urologist of the future, as well as physician, will have to acquire the necessary technological skills in order to use all these new tools which are already on the horizon.

  12. California State Library: Processing Center Design and Specifications. Volume III, Coding Manual.

    ERIC Educational Resources Information Center

    Sherman, Don; Shoffner, Ralph M.

    As part of the report on the California State Library Processing Center design and specifications, this volume is a coding manual for the conversion of catalog card data to a machine-readable form. The form is compatible with the national MARC system, while at the same time it contains provisions for problems peculiar to the local situation. This…

  13. Real-time Scheduling for GPUS with Applications in Advanced Automotive Systems

    DTIC Science & Technology

    2015-01-01

    129 3.7 Architecture of GPU tasklet scheduling infrastructure ...throughput. This disparity is even greater when we consider mobile CPUs, such as those designed by ARM. For instance, the ARM Cortex-A15 series processor as...stub library that replaces the GPGPU runtime within each virtual machine. The stub library communicates API calls to a GPGPU backend user-space daemon

  14. Motor-response learning at a process control panel by an autonomous robot

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Spelt, P.F.; de Saussure, G.; Lyness, E.

    1988-01-01

    The Center for Engineering Systems Advanced Research (CESAR) was founded at Oak Ridge National Laboratory (ORNL) by the Department of Energy's Office of Energy Research/Division of Engineering and Geoscience (DOE-OER/DEG) to conduct basic research in the area of intelligent machines. Therefore, researchers at the CESAR Laboratory are engaged in a variety of research activities in the field of machine learning. In this paper, we describe our approach to a class of machine learning which involves motor response acquisition using feedback from trial-and-error learning. Our formulation is being experimentally validated using an autonomous robot, learning tasks of control panel monitoring andmore » manipulation for effect process control. The CLIPS Expert System and the associated knowledge base used by the robot in the learning process, which reside in a hypercube computer aboard the robot, are described in detail. Benchmark testing of the learning process on a robot/control panel simulation system consisting of two intercommunicating computers is presented, along with results of sample problems used to train and test the expert system. These data illustrate machine learning and the resulting performance improvement in the robot for problems similar to, but not identical with, those on which the robot was trained. Conclusions are drawn concerning the learning problems, and implications for future work on machine learning for autonomous robots are discussed. 16 refs., 4 figs., 1 tab.« less

  15. BLAS- BASIC LINEAR ALGEBRA SUBPROGRAMS

    NASA Technical Reports Server (NTRS)

    Krogh, F. T.

    1994-01-01

    The Basic Linear Algebra Subprogram (BLAS) library is a collection of FORTRAN callable routines for employing standard techniques in performing the basic operations of numerical linear algebra. The BLAS library was developed to provide a portable and efficient source of basic operations for designers of programs involving linear algebraic computations. The subprograms available in the library cover the operations of dot product, multiplication of a scalar and a vector, vector plus a scalar times a vector, Givens transformation, modified Givens transformation, copy, swap, Euclidean norm, sum of magnitudes, and location of the largest magnitude element. Since these subprograms are to be used in an ANSI FORTRAN context, the cases of single precision, double precision, and complex data are provided for. All of the subprograms have been thoroughly tested and produce consistent results even when transported from machine to machine. BLAS contains Assembler versions and FORTRAN test code for any of the following compilers: Lahey F77L, Microsoft FORTRAN, or IBM Professional FORTRAN. It requires the Microsoft Macro Assembler and a math co-processor. The PC implementation allows individual arrays of over 64K. The BLAS library was developed in 1979. The PC version was made available in 1986 and updated in 1988.

  16. The New Possibilities from "Big Data" to Overlooked Associations Between Diabetes, Biochemical Parameters, Glucose Control, and Osteoporosis.

    PubMed

    Kruse, Christian

    2018-06-01

    To review current practices and technologies within the scope of "Big Data" that can further our understanding of diabetes mellitus and osteoporosis from large volumes of data. "Big Data" techniques involving supervised machine learning, unsupervised machine learning, and deep learning image analysis are presented with examples of current literature. Supervised machine learning can allow us to better predict diabetes-induced osteoporosis and understand relative predictor importance of diabetes-affected bone tissue. Unsupervised machine learning can allow us to understand patterns in data between diabetic pathophysiology and altered bone metabolism. Image analysis using deep learning can allow us to be less dependent on surrogate predictors and use large volumes of images to classify diabetes-induced osteoporosis and predict future outcomes directly from images. "Big Data" techniques herald new possibilities to understand diabetes-induced osteoporosis and ascertain our current ability to classify, understand, and predict this condition.

  17. Vowel Imagery Decoding toward Silent Speech BCI Using Extreme Learning Machine with Electroencephalogram

    PubMed Central

    Kim, Jongin; Park, Hyeong-jun

    2016-01-01

    The purpose of this study is to classify EEG data on imagined speech in a single trial. We recorded EEG data while five subjects imagined different vowels, /a/, /e/, /i/, /o/, and /u/. We divided each single trial dataset into thirty segments and extracted features (mean, variance, standard deviation, and skewness) from all segments. To reduce the dimension of the feature vector, we applied a feature selection algorithm based on the sparse regression model. These features were classified using a support vector machine with a radial basis function kernel, an extreme learning machine, and two variants of an extreme learning machine with different kernels. Because each single trial consisted of thirty segments, our algorithm decided the label of the single trial by selecting the most frequent output among the outputs of the thirty segments. As a result, we observed that the extreme learning machine and its variants achieved better classification rates than the support vector machine with a radial basis function kernel and linear discrimination analysis. Thus, our results suggested that EEG responses to imagined speech could be successfully classified in a single trial using an extreme learning machine with a radial basis function and linear kernel. This study with classification of imagined speech might contribute to the development of silent speech BCI systems. PMID:28097128

  18. Assessment of Learning during Library Instruction: Practices, Prevalence, and Preparation

    ERIC Educational Resources Information Center

    Sobel, Karen; Sugimoto, Cassidy R.

    2012-01-01

    Library instruction serves a critical function in the operation of the contemporary academic library environment. Librarians are asked to provide instruction and information literacy training using a range of tools and modes of delivery. The current literature presents an array of instruments used for assessing student learning and for delivering…

  19. Closing the Gap: Public Libraries and Senior Learners

    ERIC Educational Resources Information Center

    Welliver, Hilary

    2017-01-01

    This executive position paper (EPP) examines the gap in Delaware's public library services to senior citizens engaged in lifelong learning. Re-envisioning the public library as a center that provides support of lifelong learning for the whole individual and improving the quality of life for Kent County's seniors is one approach that may keep…

  20. Technological Change in the Workplace: A Statewide Survey of Community College Library and Learning Resources Personnel.

    ERIC Educational Resources Information Center

    Poole, Carolyn E.; Denny, Emmett

    2001-01-01

    Discussion of the effects of technostress on library personnel focuses on an investigation that examined how employees in Florida community college libraries and learning resources centers are dealing with technological change in their work environment. Considers implications for planning and implementing technological change and includes…

  1. Toolkit Approach to Integrating Library Resources into the Learning Management System

    ERIC Educational Resources Information Center

    Black, Elizabeth L.

    2008-01-01

    As use of learning management systems (LMS) increases, it is essential that librarians are there. Ohio State University Libraries took a toolkit approach to integrate library content in the LMS to facilitate creative and flexible interactions between librarians, students and faculty in Ohio State University's large and decentralized academic…

  2. The Flexible Learning Lab

    ERIC Educational Resources Information Center

    Cartier, Leslie C.

    2014-01-01

    Elementary school libraries are not often thought of as suitable spaces for learning commons because most elementary school libraries operate on a fixed schedule, allowing only one class at a time to use the space. Elementary school libraries are too often the drop-off location for a specific class during the classroom teacher's planning…

  3. Library Experience and Information Literacy Learning of First Year International Students: An Australian Case Study

    ERIC Educational Resources Information Center

    Hughes, Hilary; Hall, Nerilee; Pozzi, Megan

    2017-01-01

    This qualitative case study provides fresh understandings about first year undergraduate international students' library and information use at an Australian university, and their associated information literacy learning needs. The findings provide evidence to inform the development of library spaces and information literacy responses that enhance…

  4. The Library's Contribution to Student Learning: Inspirations and Aspirations

    ERIC Educational Resources Information Center

    Oakleaf, Megan

    2015-01-01

    George Kuh and Robert Gonyea's 2003 article entitled "The Role of the Academic Library in Promoting Student Engagement in Learning" appeared in "College and Research Libraries" in July 2003, after being presented as an invited paper at the ACRL 11th National Conference that April. Herein, Megan Oakleaf describes not only the…

  5. Lifelong Learning in Public Libraries in 12 European Union Countries: Policy and Considerations

    ERIC Educational Resources Information Center

    Stanziola, Javier

    2010-01-01

    Public libraries have traditionally provided key inputs to support lifelong learning. More recently, significant social and technological changes have challenged this sector to redefine their role in this field. For most public libraries in Europe this has meant continuing their role as providers of information and advice while increasing services…

  6. A Study of Early Learning Services in Museums and Libraries

    ERIC Educational Resources Information Center

    Sirinides, P.; Fink, R.; DuBois, T.

    2017-01-01

    Museums and libraries can play a role in providing opportunities for early learning, and there is clear momentum and infrastructure already in place to help make this happen. Researchers conducted a mixed-methods descriptive study to generate new evidence about the availability of services for young children in museums and libraries, and the…

  7. 2004 National Awards for Museum & Library Service

    ERIC Educational Resources Information Center

    Institute of Museum and Library Services, 2004

    2004-01-01

    The National Awards for Museum and Library Service give national recognition to institutions that play an integral and essential part in our learning society. The awards celebrate the efforts of libraries and museums of all sizes to connect with their increasingly diverse communities and to serve as centers of lifelong learning. As the pace of…

  8. Looks like Teen Spirit: Libraries for Youth Are Changing--Thanks to Teen Input

    ERIC Educational Resources Information Center

    Bolan, Kimberly

    2006-01-01

    During the last 10 years, many libraries have transformed their young adult areas into more efficient, innovative, and inspirational spaces. Many teens have suddenly found the library warm and inviting--a place that encouraged independence, learning, socialization, and creativity. As more people learn about the positive impact of dynamic teen…

  9. A Basic Hybrid Library Support Model to Distance Learners in Sudan

    ERIC Educational Resources Information Center

    Abdelrahman, Omer Hassan

    2012-01-01

    Distance learning has flourished in Sudan during the last two decades; more and more higher education institutions offer distance learning programmes to off-campus students. Like on-campus students, distance learners should have access to appropriate library and information support services. They also have specific needs for library and…

  10. Machine Learning Approaches in Cardiovascular Imaging.

    PubMed

    Henglin, Mir; Stein, Gillian; Hushcha, Pavel V; Snoek, Jasper; Wiltschko, Alexander B; Cheng, Susan

    2017-10-01

    Cardiovascular imaging technologies continue to increase in their capacity to capture and store large quantities of data. Modern computational methods, developed in the field of machine learning, offer new approaches to leveraging the growing volume of imaging data available for analyses. Machine learning methods can now address data-related problems ranging from simple analytic queries of existing measurement data to the more complex challenges involved in analyzing raw images. To date, machine learning has been used in 2 broad and highly interconnected areas: automation of tasks that might otherwise be performed by a human and generation of clinically important new knowledge. Most cardiovascular imaging studies have focused on task-oriented problems, but more studies involving algorithms aimed at generating new clinical insights are emerging. Continued expansion in the size and dimensionality of cardiovascular imaging databases is driving strong interest in applying powerful deep learning methods, in particular, to analyze these data. Overall, the most effective approaches will require an investment in the resources needed to appropriately prepare such large data sets for analyses. Notwithstanding current technical and logistical challenges, machine learning and especially deep learning methods have much to offer and will substantially impact the future practice and science of cardiovascular imaging. © 2017 American Heart Association, Inc.

  11. Validating Machine Learning Algorithms for Twitter Data Against Established Measures of Suicidality.

    PubMed

    Braithwaite, Scott R; Giraud-Carrier, Christophe; West, Josh; Barnes, Michael D; Hanson, Carl Lee

    2016-05-16

    One of the leading causes of death in the United States (US) is suicide and new methods of assessment are needed to track its risk in real time. Our objective is to validate the use of machine learning algorithms for Twitter data against empirically validated measures of suicidality in the US population. Using a machine learning algorithm, the Twitter feeds of 135 Mechanical Turk (MTurk) participants were compared with validated, self-report measures of suicide risk. Our findings show that people who are at high suicidal risk can be easily differentiated from those who are not by machine learning algorithms, which accurately identify the clinically significant suicidal rate in 92% of cases (sensitivity: 53%, specificity: 97%, positive predictive value: 75%, negative predictive value: 93%). Machine learning algorithms are efficient in differentiating people who are at a suicidal risk from those who are not. Evidence for suicidality can be measured in nonclinical populations using social media data.

  12. Validating Machine Learning Algorithms for Twitter Data Against Established Measures of Suicidality

    PubMed Central

    2016-01-01

    Background One of the leading causes of death in the United States (US) is suicide and new methods of assessment are needed to track its risk in real time. Objective Our objective is to validate the use of machine learning algorithms for Twitter data against empirically validated measures of suicidality in the US population. Methods Using a machine learning algorithm, the Twitter feeds of 135 Mechanical Turk (MTurk) participants were compared with validated, self-report measures of suicide risk. Results Our findings show that people who are at high suicidal risk can be easily differentiated from those who are not by machine learning algorithms, which accurately identify the clinically significant suicidal rate in 92% of cases (sensitivity: 53%, specificity: 97%, positive predictive value: 75%, negative predictive value: 93%). Conclusions Machine learning algorithms are efficient in differentiating people who are at a suicidal risk from those who are not. Evidence for suicidality can be measured in nonclinical populations using social media data. PMID:27185366

  13. Radar detection with the Neyman-Pearson criterion using supervised-learning-machines trained with the cross-entropy error

    NASA Astrophysics Data System (ADS)

    Jarabo-Amores, María-Pilar; la Mata-Moya, David de; Gil-Pita, Roberto; Rosa-Zurera, Manuel

    2013-12-01

    The application of supervised learning machines trained to minimize the Cross-Entropy error to radar detection is explored in this article. The detector is implemented with a learning machine that implements a discriminant function, which output is compared to a threshold selected to fix a desired probability of false alarm. The study is based on the calculation of the function the learning machine approximates to during training, and the application of a sufficient condition for a discriminant function to be used to approximate the optimum Neyman-Pearson (NP) detector. In this article, the function a supervised learning machine approximates to after being trained to minimize the Cross-Entropy error is obtained. This discriminant function can be used to implement the NP detector, which maximizes the probability of detection, maintaining the probability of false alarm below or equal to a predefined value. Some experiments about signal detection using neural networks are also presented to test the validity of the study.

  14. AstroML: Python-powered Machine Learning for Astronomy

    NASA Astrophysics Data System (ADS)

    Vander Plas, Jake; Connolly, A. J.; Ivezic, Z.

    2014-01-01

    As astronomical data sets grow in size and complexity, automated machine learning and data mining methods are becoming an increasingly fundamental component of research in the field. The astroML project (http://astroML.org) provides a common repository for practical examples of the data mining and machine learning tools used and developed by astronomical researchers, written in Python. The astroML module contains a host of general-purpose data analysis and machine learning routines, loaders for openly-available astronomical datasets, and fast implementations of specific computational methods often used in astronomy and astrophysics. The associated website features hundreds of examples of these routines being used for analysis of real astronomical datasets, while the associated textbook provides a curriculum resource for graduate-level courses focusing on practical statistics, machine learning, and data mining approaches within Astronomical research. This poster will highlight several of the more powerful and unique examples of analysis performed with astroML, all of which can be reproduced in their entirety on any computer with the proper packages installed.

  15. The impact of machine learning techniques in the study of bipolar disorder: A systematic review.

    PubMed

    Librenza-Garcia, Diego; Kotzian, Bruno Jaskulski; Yang, Jessica; Mwangi, Benson; Cao, Bo; Pereira Lima, Luiza Nunes; Bermudez, Mariane Bagatin; Boeira, Manuela Vianna; Kapczinski, Flávio; Passos, Ives Cavalcante

    2017-09-01

    Machine learning techniques provide new methods to predict diagnosis and clinical outcomes at an individual level. We aim to review the existing literature on the use of machine learning techniques in the assessment of subjects with bipolar disorder. We systematically searched PubMed, Embase and Web of Science for articles published in any language up to January 2017. We found 757 abstracts and included 51 studies in our review. Most of the included studies used multiple levels of biological data to distinguish the diagnosis of bipolar disorder from other psychiatric disorders or healthy controls. We also found studies that assessed the prediction of clinical outcomes and studies using unsupervised machine learning to build more consistent clinical phenotypes of bipolar disorder. We concluded that given the clinical heterogeneity of samples of patients with BD, machine learning techniques may provide clinicians and researchers with important insights in fields such as diagnosis, personalized treatment and prognosis orientation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Estimates of the atmospheric parameters of M-type stars: a machine-learning perspective

    NASA Astrophysics Data System (ADS)

    Sarro, L. M.; Ordieres-Meré, J.; Bello-García, A.; González-Marcos, A.; Solano, E.

    2018-05-01

    Estimating the atmospheric parameters of M-type stars has been a difficult task due to the lack of simple diagnostics in the stellar spectra. We aim at uncovering good sets of predictive features of stellar atmospheric parameters (Teff, log (g), [M/H]) in spectra of M-type stars. We define two types of potential features (equivalent widths and integrated flux ratios) able to explain the atmospheric physical parameters. We search the space of feature sets using a genetic algorithm that evaluates solutions by their prediction performance in the framework of the BT-Settl library of stellar spectra. Thereafter, we construct eight regression models using different machine-learning techniques and compare their performances with those obtained using the classical χ2 approach and independent component analysis (ICA) coefficients. Finally, we validate the various alternatives using two sets of real spectra from the NASA Infrared Telescope Facility (IRTF) and Dwarf Archives collections. We find that the cross-validation errors are poor measures of the performance of regression models in the context of physical parameter prediction in M-type stars. For R ˜ 2000 spectra with signal-to-noise ratios typical of the IRTF and Dwarf Archives, feature selection with genetic algorithms or alternative techniques produces only marginal advantages with respect to representation spaces that are unconstrained in wavelength (full spectrum or ICA). We make available the atmospheric parameters for the two collections of observed spectra as online material.

  17. Machine learning methods for the classification of gliomas: Initial results using features extracted from MR spectroscopy.

    PubMed

    Ranjith, G; Parvathy, R; Vikas, V; Chandrasekharan, Kesavadas; Nair, Suresh

    2015-04-01

    With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification. The aim of the study is to classify gliomas into benign and malignant types using MRI data. Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant. Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning. Three of the four machine learning algorithms gave an area under ROC curve in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%). The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences. © The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.

  18. Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

    NASA Astrophysics Data System (ADS)

    Ivanovic, Pavle; Richter, Harald

    2018-01-01

    High-Performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI - the standard HPC communication library. Additionally, MPICH was transparently modified by us to get ivshmem included, resulting in a three to ten times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a single-side MPICH communication mechanism, by an own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.

  19. UPEML Version 2. 0: A machine-portable CDC Update emulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mehlhorn, T.A.; Young, M.F.

    1987-05-01

    UPEML is a machine-portable CDC Update emulation program. UPEML is written in ANSI standard Fortran-77 and is relatively simple and compact. It is capable of emulating a significant subset of the standard CDC Update functions, including program library creation and subsequent modification. Machine-portability is an essential attribute of UPEML. UPEML was written primarily to facilitate the use of CDC-based scientific packages on alternate computer systems such as the VAX 11/780 and the IBM 3081. UPEML has also been successfully used on the multiprocessor ELXSI, on CRAYs under both COS and CTSS operating systems, on APOLLO workstations, and on the HP-9000.more » Version 2.0 includes enhanced error checking, full ASCI character support, a program library audit capability, and a partial update option in which only selected or modified decks are written to the compile file. Further enhancements include checks for overlapping corrections, processing of nested calls to common decks, and reads and addfiles from alternate input files.« less

  20. Use of Advanced Machine-Learning Techniques for Non-Invasive Monitoring of Hemorrhage

    DTIC Science & Technology

    2010-04-01

    that state-of-the-art machine learning techniques when integrated with novel non-invasive monitoring technologies could detect subtle, physiological...decompensation. Continuous, non-invasively measured hemodynamic signals (e.g., ECG, blood pressures, stroke volume) were used for the development of machine ... learning algorithms. Accuracy estimates were obtained by building models using 27 subjects and testing on the 28th. This process was repeated 28 times

Top