extracting machine understandable: Topics by Science.gov

Sample records for extracting machine understandable

The dynamics of discrete-time computation, with application to recurrent neural networks and finite state machine extraction.

PubMed

Casey, M

1996-08-15

Recurrent neural networks (RNNs) can learn to perform finite state computations. It is shown that an RNN performing a finite state computation must organize its state space to mimic the states in the minimal deterministic finite state machine that can perform that computation, and a precise description of the attractor structure of such systems is given. This knowledge effectively predicts activation space dynamics, which allows one to understand RNN computation dynamics in spite of complexity in activation dynamics. This theory provides a theoretical framework for understanding finite state machine (FSM) extraction techniques and can be used to improve training methods for RNNs performing FSM computations. This provides an example of a successful approach to understanding a general class of complex systems that has not been explicitly designed, e.g., systems that have evolved or learned their internal structure.
Image understanding and the man-machine interface II; Proceedings of the Meeting, Los Angeles, CA, Jan. 17, 18, 1989

NASA Technical Reports Server (NTRS)

Barrett, Eamon B. (Editor); Pearson, James J. (Editor)

1989-01-01

Image understanding concepts and models, image understanding systems and applications, advanced digital processors and software tools, and advanced man-machine interfaces are among the topics discussed. Particular papers are presented on such topics as neural networks for computer vision, object-based segmentation and color recognition in multispectral images, the application of image algebra to image measurement and feature extraction, and the integration of modeling and graphics to create an infrared signal processing test bed.
Natural Language Processing.

ERIC Educational Resources Information Center

Chowdhury, Gobinda G.

2003-01-01

Discusses issues related to natural language processing, including theoretical developments; natural language understanding; tools and techniques; natural language text processing systems; abstracting; information extraction; information retrieval; interfaces; software; Internet, Web, and digital library applications; machine translation for…
Scorebox extraction from mobile sports videos using Support Vector Machines

NASA Astrophysics Data System (ADS)

Kim, Wonjun; Park, Jimin; Kim, Changick

2008-08-01

The scorebox plays an important role in understanding the content of sports videos. However, a tiny scorebox can make it uncomfortable for viewers on small displays to grasp the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after a short learning period. Since there are various types of scoreboxes inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed, and the three top-ranked attributes in terms of information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various videos of sports games, and experimental results show the efficiency and robustness of the proposed method.
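The attribute-ranking step in this abstract can be illustrated with a minimal sketch that computes information gain for discrete attributes and keeps the three highest-ranked ones. The attribute names, the toy candidate regions, and the assumption that attributes are already discretized are all hypothetical; the abstract does not state which attributes or binning the authors used.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def top_k_attributes(table, labels, k=3):
    """Rank attribute names by information gain and return the best k."""
    gains = {name: information_gain(column, labels) for name, column in table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]

# Hypothetical candidate regions: 1 = scorebox, 0 = logo or advertisement board.
labels = [1, 1, 1, 0, 0, 0]
table = {
    "edge_density": ["hi", "hi", "hi", "lo", "lo", "lo"],     # separates the classes fully
    "position":     ["top", "top", "mid", "mid", "bot", "bot"],
    "aspect":       ["wide", "wide", "wide", "wide", "tall", "tall"],
    "brightness":   ["a", "b", "a", "b", "a", "b"],           # nearly uninformative
}
selected = top_k_attributes(table, labels)
print(selected)
```

The three selected columns would then form the three-dimensional feature vector passed to an SVM; the classifier itself is omitted here.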
Information extraction from multi-institutional radiology reports.

PubMed

Hassanpour, Saeed; Langlotz, Curtis P

2016-01-01

The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. 
We also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value<0.05). Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval. Copyright © 2015 Elsevier B.V. All rights reserved.
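As a quick consistency check on the figures reported above, the F1 score is the harmonic mean of precision and recall. The sketch below applies that standard formula to the reported averages; note that an F1 averaged over cross-validation folds need not equal the F1 of the averaged precision and recall exactly, so this is only an approximate check.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Averages reported in the abstract: precision 87%, recall 84%.
reported_f1 = f1_score(0.87, 0.84)
print(f"{reported_f1:.3f}")  # prints 0.855, in line with the reported 85%
```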
Large-scale extraction of accurate drug-disease treatment pairs from biomedical literature for drug repurposing

PubMed Central

2013-01-01

Background A large-scale, highly accurate, machine-understandable drug-disease treatment relationship knowledge base is important for computational approaches to drug repurposing. The large body of published biomedical research articles and clinical case reports available on MEDLINE is a rich source of FDA-approved drug-disease indication as well as drug-repurposing knowledge that is crucial for applying FDA-approved drugs for new diseases. However, much of this information is buried in free text and not captured in any existing databases. The goal of this study is to extract a large number of accurate drug-disease treatment pairs from published literature. Results In this study, we developed a simple but highly accurate pattern-learning approach to extract treatment-specific drug-disease pairs from 20 million biomedical abstracts available on MEDLINE. We extracted a total of 34,305 unique drug-disease treatment pairs, the majority of which are not included in existing structured databases. Our algorithm achieved a precision of 0.904 and a recall of 0.131 in extracting all pairs, and a precision of 0.904 and a recall of 0.842 in extracting frequent pairs. In addition, we have shown that the extracted pairs strongly correlate with both drug target genes and therapeutic classes, therefore may have high potential in drug discovery. Conclusions We demonstrated that our simple pattern-learning relationship extraction algorithm is able to accurately extract many drug-disease pairs from the free text of biomedical literature that are not captured in structured databases. The large-scale, accurate, machine-understandable drug-disease treatment knowledge base that is resultant of our study, in combination with pairs from structured databases, will have high potential in computational drug repurposing tasks. PMID:23742147
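A minimal sketch of how textual patterns can be applied to pull treatment pairs out of sentences follows. It shows only pattern application, not the pattern-learning step the abstract describes; the two templates and the tiny drug and disease lexicons are hypothetical stand-ins, not the patterns or vocabularies used in the study.

```python
import re

# Hypothetical lexicons; a real system would use large curated vocabularies.
DRUGS = ["metformin", "aspirin"]
DISEASES = ["type 2 diabetes", "migraine"]

# Hypothetical treatment templates with {drug} and {disease} slots.
TEMPLATES = [
    r"{drug} in the treatment of {disease}",
    r"{disease} (?:was|were) treated with {drug}",
]

def alternation(terms):
    """Regex alternation over lexicon terms, longest first to prefer full names."""
    return "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))

def compile_templates():
    """Fill each template's slots with named groups over the lexicons."""
    drug = f"(?P<drug>{alternation(DRUGS)})"
    disease = f"(?P<disease>{alternation(DISEASES)})"
    return [re.compile(t.format(drug=drug, disease=disease), re.IGNORECASE)
            for t in TEMPLATES]

def extract_pairs(sentences):
    """Return the set of (drug, disease) pairs matched by any template."""
    pairs = set()
    for sentence in sentences:
        for pattern in compile_templates():
            for match in pattern.finditer(sentence):
                pairs.add((match.group("drug").lower(), match.group("disease").lower()))
    return pairs

sentences = [
    "Metformin in the treatment of type 2 diabetes has been widely studied.",
    "Patients whose migraine was treated with aspirin improved.",
    "Aspirin was discussed alongside migraine.",  # co-occurrence only, no pattern
]
pairs = extract_pairs(sentences)
print(sorted(pairs))
```

Requiring an explicit treatment phrase, rather than mere co-occurrence, is one plausible reason such approaches favor precision over recall, which is consistent with the precision and recall figures quoted above.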
Natural Language Processing Techniques for Extracting and Categorizing Finding Measurements in Narrative Radiology Reports.

PubMed

Sevenster, M; Buurman, J; Liu, P; Peters, J F; Chang, P J

2015-01-01

Accumulating quantitative outcome parameters may contribute to constructing a healthcare organization in which outcomes of clinical procedures are reproducible and predictable. In imaging studies, measurements are the principal category of quantitative parameters. The purpose of this work is to develop and evaluate two natural language processing engines that extract finding and organ measurements from narrative radiology reports and to categorize extracted measurements by their "temporality". The measurement extraction engine is developed as a set of regular expressions. The engine was evaluated against a manually created ground truth. Automated categorization of measurement temporality is defined as a machine learning problem. A ground truth was manually developed based on a corpus of radiology reports. A maximum entropy model was created using features that characterize the measurement itself and its narrative context. The model was evaluated in a ten-fold cross validation protocol. The measurement extraction engine has precision 0.994 and recall 0.991. Accuracy of the measurement classification engine is 0.960. The work contributes to machine understanding of radiology reports and may find application in software applications that process medical data.
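To make the idea of a regular-expression measurement extractor concrete, here is a small sketch. The single pattern below (one to three dimensions separated by "x", followed by mm or cm) and the sample sentence are assumptions for illustration; the abstract does not list the expressions in the authors' engine, which is described as a set of regular expressions and is certainly broader.

```python
import re

# Hypothetical pattern: 1 to 3 numeric dimensions joined by "x", then a unit.
MEASUREMENT = re.compile(
    r"(?P<dims>\d+(?:\.\d+)?(?:\s*x\s*\d+(?:\.\d+)?){0,2})\s*(?P<unit>mm|cm)\b",
    re.IGNORECASE,
)

def extract_measurements(text):
    """Return a list of (dimensions, unit) tuples found in the report text."""
    results = []
    for match in MEASUREMENT.finditer(text):
        dims = [float(d) for d in re.split(r"\s*x\s*", match.group("dims"), flags=re.IGNORECASE)]
        results.append((dims, match.group("unit").lower()))
    return results

report = "Nodule in the right upper lobe measures 1.2 x 0.8 cm, previously 9 mm."
found = extract_measurements(report)
print(found)  # [([1.2, 0.8], 'cm'), ([9.0], 'mm')]
```

Deciding that "previously 9 mm" refers to a prior study is the separate temporality classification task, which the abstract treats as a machine learning problem rather than a pattern-matching one.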
Toward Machine Understanding of Information Quality.

ERIC Educational Resources Information Center

Tang, Rong; Ng, K. B.; Strzalkowski, Tomek; Kantor, Paul B.

2003-01-01

Reports preliminary results of a study to develop and automate new metrics for assessment of information quality in text documents, particularly in news. Through focus group studies, quality judgment experiments, and textual feature extraction and analysis, nine quality aspects were generated and applied in human assessments. Experiments were…
Machine vision and appearance based learning

NASA Astrophysics Data System (ADS)

Bernstein, Alexander

2017-03-01

Smart algorithms are used in machine vision to organize or extract high-level information from the available data. The resulting high-level understanding of the content of images, which are received from a visual sensing system and belong to an appearance space, can be only a key first step in solving various specific tasks such as mobile robot navigation in uncertain environments, road detection in autonomous driving systems, etc. Appearance-based learning has become very popular in the field of machine vision. In general, the appearance of a scene is a function of the scene content, the lighting conditions, and the camera position. The mobile robot localization problem is considered in a machine learning framework via appearance space analysis. This problem is reduced to a regression problem on an appearance manifold, and new regression-on-manifolds methods are used for its solution.
Automatic Extraction of Destinations, Origins and Route Parts from Human Generated Route Directions

NASA Astrophysics Data System (ADS)

Zhang, Xiao; Mitra, Prasenjit; Klippel, Alexander; Maceachren, Alan

Researchers from the cognitive and spatial sciences are studying text descriptions of movement patterns in order to examine how humans communicate and understand spatial information. In particular, route directions offer a rich source of information on how cognitive systems conceptualize movement patterns by segmenting them into meaningful parts. Route directions are composed using a plethora of cognitive spatial organization principles: changing levels of granularity, hierarchical organization, incorporation of cognitively and perceptually salient elements, and so forth. Identifying such information in text documents automatically is crucial for enabling machine-understanding of human spatial language. The benefits are: a) creating opportunities for large-scale studies of human linguistic behavior; b) extracting and georeferencing salient entities (landmarks) that are used by human route direction providers; c) developing methods to translate route directions to sketches and maps; and d) enabling queries on large corpora of crawled/analyzed movement data. In this paper, we introduce our approach and implementations that bring us closer to the goal of automatically processing linguistic route directions. We report on research directed at one part of the larger problem, that is, extracting the three most critical parts of route directions and movement patterns in general: origin, destination, and route parts. We use machine-learning based algorithms to extract these parts of routes, including, for example, destination names and types. We prove the effectiveness of our approach in several experiments using hand-tagged corpora.
Collaborative human-machine analysis to disambiguate entities in unstructured text and structured datasets

NASA Astrophysics Data System (ADS)

Davenport, Jack H.

2016-05-01

Intelligence analysts demand rapid information fusion capabilities to develop and maintain accurate situational awareness and understanding of dynamic enemy threats in asymmetric military operations. The ability to extract relationships between people, groups, and locations from a variety of text datasets is critical to proactive decision making. The derived network of entities must be automatically created and presented to analysts to assist in decision making. DECISIVE ANALYTICS Corporation (DAC) provides capabilities to automatically extract entities, relationships between entities, semantic concepts about entities, and network models of entities from text and multi-source datasets. DAC's Natural Language Processing (NLP) Entity Analytics model entities as complex systems of attributes and interrelationships which are extracted from unstructured text via NLP algorithms. The extracted entities are automatically disambiguated via machine learning algorithms, and resolution recommendations are presented to the analyst for validation; the analyst's expertise is leveraged in this hybrid human/computer collaborative model. Military capability is enhanced by these NLP Entity Analytics because analysts can now create/update an entity profile with intelligence automatically extracted from unstructured text, thereby fusing entity knowledge from structured and unstructured data sources. Operational and sustainment costs are reduced since analysts do not have to manually tag and resolve entities.
Rule Extraction Based on Extreme Learning Machine and an Improved Ant-Miner Algorithm for Transient Stability Assessment.

PubMed

Li, Yang; Li, Guoqing; Wang, Zhenhao

2015-01-01

To overcome the poor understandability of pattern recognition-based transient stability assessment (PRTSA) methods, a new rule extraction method based on the extreme learning machine (ELM) and an improved Ant-miner (IAM) algorithm is presented in this paper. First, the basic principles of ELM and the Ant-miner algorithm are introduced. Then, based on the selected optimal feature subset, an example sample set is generated by the trained ELM-based PRTSA model. Finally, a set of classification rules is obtained by the IAM algorithm to replace the original ELM network. The novelty of this proposal is that transient stability rules are extracted, using the IAM algorithm, from an example sample set generated by the trained ELM-based transient stability assessment model. The effectiveness of the proposed method is shown by the application results on the New England 39-bus power system and a practical power system, the southern power system of Hebei province.
Specification of a new de-stoner machine: evaluation of machining effects on olive paste's rheology and olive oil yield and quality.

PubMed

Romaniello, Roberto; Leone, Alessandro; Tamborrino, Antonia

2017-01-01

An industrial prototype of a partial de-stoner machine was specified, built and implemented in an industrial olive oil extraction plant. The partial de-stoner machine was compared to the traditional mechanical crusher to assess its quantitative and qualitative performance. The extraction efficiency of the olive oil extraction plant, olive oil quality, sensory evaluation and rheological aspects were investigated. The results indicate that by using the partial de-stoner machine the extraction plant did not show statistical differences with respect to the traditional mechanical crushing. Moreover, the partial de-stoner machine allowed recovery of 60% of olive pits and the oils obtained were characterised by more marked green fruitiness, flavour and aroma than the oils produced using the traditional processing systems. The partial de-stoner machine removes the limitations of the traditional total de-stoner machine, opening new frontiers for the recovery of pits to be used as biomass. Moreover, the partial de-stoner machine permitted a significant reduction in the viscosity of the olive paste. © 2016 Society of Chemical Industry.
Enhancing interpretability of automatically extracted machine learning features: application to a RBM-Random Forest system on brain lesion segmentation.

PubMed

Pereira, Sérgio; Meier, Raphael; McKinley, Richard; Wiest, Roland; Alves, Victor; Silva, Carlos A; Reyes, Mauricio

2018-02-01

Machine learning systems are achieving better performances at the cost of becoming increasingly complex. However, because of that, they become less interpretable, which may cause some distrust by the end-user of the system. This is especially important as these systems are pervasively being introduced to critical domains, such as the medical field. Representation Learning techniques are general methods for automatic feature computation. Nevertheless, these techniques are regarded as uninterpretable "black boxes". In this paper, we propose a methodology to enhance the interpretability of automatically extracted machine learning features. The proposed system is composed of a Restricted Boltzmann Machine for unsupervised feature learning, and a Random Forest classifier, which are combined to jointly consider existing correlations between imaging data, features, and target variables. We define two levels of interpretation: global and local. The former is devoted to understanding if the system learned the relevant relations in the data correctly, while the latter is focused on predictions performed on a voxel- and patient-level. In addition, we propose a novel feature importance strategy that considers both imaging data and target variables, and we demonstrate the ability of the approach to leverage the interpretability of the obtained representation for the task at hand. We evaluated the proposed methodology in brain tumor segmentation and penumbra estimation in ischemic stroke lesions. We show the ability of the proposed methodology to unveil information regarding relationships between imaging modalities and extracted features and their usefulness for the task at hand. In both clinical scenarios, we demonstrate that the proposed methodology enhances the interpretability of automatically learned features, highlighting specific learning patterns that resemble how an expert extracts relevant data from medical images. Copyright © 2017 Elsevier B.V. All rights reserved.
Manifold learning in machine vision and robotics

NASA Astrophysics Data System (ADS)

Bernstein, Alexander

2017-02-01

Smart algorithms are used in Machine vision and Robotics to organize or extract high-level information from the available data. Nowadays, Machine learning is an essential and ubiquitous tool for automating the extraction of patterns or regularities from data (images in Machine vision; camera, laser, and sonar sensor data in Robotics) in order to solve various subject-oriented tasks such as understanding and classification of image content, navigation of mobile autonomous robots in uncertain environments, robot manipulation in medical robotics and computer-assisted surgery, and others. Usually such data have high dimensionality; however, due to various dependencies between their components and constraints caused by physical reasons, all "feasible and usable data" occupy only a very small part of the high-dimensional "observation space" and have a smaller intrinsic dimensionality. The generally accepted model of such data is the manifold model, according to which the data lie on or near an unknown manifold (surface) of lower dimensionality embedded in an ambient high-dimensional observation space; real-world high-dimensional data obtained from "natural" sources meet, as a rule, this model. The use of Manifold learning techniques in Machine vision and Robotics, which discover the low-dimensional structure of high-dimensional data and result in effective algorithms for solving a large number of various subject-oriented tasks, is the subject of the conference plenary speech, some topics of which are covered in the paper.
Towards a generalized energy prediction model for machine tools

PubMed Central

Bhinge, Raunak; Park, Jinkyoo; Law, Kincho H.; Dornfeld, David A.; Helu, Moneer; Rachuri, Sudarsan

2017-01-01

Energy prediction of machine tools can deliver many advantages to a manufacturing enterprise, ranging from energy-efficient process planning to machine tool monitoring. Physics-based, energy prediction models have been proposed in the past to understand the energy usage pattern of a machine tool. However, uncertainties in both the machine and the operating environment make it difficult to predict the energy consumption of the target machine reliably. Taking advantage of the opportunity to collect extensive, contextual, energy-consumption data, we discuss a data-driven approach to develop an energy prediction model of a machine tool in this paper. First, we present a methodology that can efficiently and effectively collect and process data extracted from a machine tool and its sensors. We then present a data-driven model that can be used to predict the energy consumption of the machine tool for machining a generic part. Specifically, we use Gaussian Process (GP) Regression, a non-parametric machine-learning technique, to develop the prediction model. The energy prediction model is then generalized over multiple process parameters and operations. Finally, we apply this generalized model with a method to assess uncertainty intervals to predict the energy consumed to machine any part using a Mori Seiki NVD1500 machine tool. Furthermore, the same model can be used during process planning to optimize the energy-efficiency of a machining process. PMID:28652687
Towards a generalized energy prediction model for machine tools.

PubMed

Bhinge, Raunak; Park, Jinkyoo; Law, Kincho H; Dornfeld, David A; Helu, Moneer; Rachuri, Sudarsan

2017-04-01

Energy prediction of machine tools can deliver many advantages to a manufacturing enterprise, ranging from energy-efficient process planning to machine tool monitoring. Physics-based, energy prediction models have been proposed in the past to understand the energy usage pattern of a machine tool. However, uncertainties in both the machine and the operating environment make it difficult to predict the energy consumption of the target machine reliably. Taking advantage of the opportunity to collect extensive, contextual, energy-consumption data, we discuss a data-driven approach to develop an energy prediction model of a machine tool in this paper. First, we present a methodology that can efficiently and effectively collect and process data extracted from a machine tool and its sensors. We then present a data-driven model that can be used to predict the energy consumption of the machine tool for machining a generic part. Specifically, we use Gaussian Process (GP) Regression, a non-parametric machine-learning technique, to develop the prediction model. The energy prediction model is then generalized over multiple process parameters and operations. Finally, we apply this generalized model with a method to assess uncertainty intervals to predict the energy consumed to machine any part using a Mori Seiki NVD1500 machine tool. Furthermore, the same model can be used during process planning to optimize the energy-efficiency of a machining process.
Support patient search on pathology reports with interactive online learning based data extraction.

PubMed

Zheng, Shuai; Lu, James J; Appin, Christina; Brat, Daniel; Wang, Fusheng

2015-01-01

Structural reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entries, information in pathology reports remains primarily in narrative free text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often constructed with a fixed training dataset and not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable semi-automated data extraction system, which can adjust and self-improve by learning from a user's interaction with minimal human effort. We have developed an online machine learning based information extraction system called IDEAL-X. With its graphical user interface, the system's data extraction engine automatically annotates values for users to review upon loading each report text. The system analyzes users' corrections regarding these annotations with online machine learning, and incrementally enhances and refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be applied to conveniently define queries based on extracted structured data. We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographic data, diagnosis, genetic marker, and procedure. The system achieves F-1 scores of around 95% for the majority of tests.
Extracting data from pathology reports could enable more accurate knowledge to support biomedical research and clinical diagnosis. IDEAL-X provides a bridge that takes advantage of online machine learning based data extraction and the knowledge from human feedback. By combining iterative online learning and adaptive controlled vocabularies, IDEAL-X can deliver highly adaptive and accurate data extraction to support patient search.
Extraction of actionable information from crowdsourced disaster data.

PubMed

Kiatpanont, Rungsun; Tanlamai, Uthai; Chongstitvatana, Prabhas

Natural disasters cause enormous damage to countries all over the world. To deal with these common problems, different activities are required for disaster management at each phase of the crisis. There are three groups of activities as follows: (1) make sense of the situation and determine how best to deal with it, (2) deploy the necessary resources, and (3) harmonize as many parties as possible, using the most effective communication channels. Current technological improvements and developments now enable people to act as real-time information sources. As a result, inundation with crowdsourced data poses a real challenge for a disaster manager. The problem is how to extract the valuable information from a gigantic data pool in the shortest possible time so that the information is still useful and actionable. This research proposed an actionable-data-extraction process to deal with the challenge. Twitter was selected as a test case because messages posted on Twitter are publicly available. Hashtag, an easy and very efficient technique, was also used to differentiate information. A quantitative approach to extract useful information from the tweets was supported and verified by interviews with disaster managers from many leading organizations in Thailand to understand their missions. The information classifications extracted from the collected tweets were first performed manually, and then the tweets were used to train a machine learning algorithm to classify future tweets. One particularly useful, significant, and primary section was the request for help category. The support vector machine algorithm was used to validate the results from the extraction process of 13,696 sample tweets, with over 74 percent accuracy. The results confirmed that the machine learning technique could significantly and practically assist with disaster management by dealing with crowdsourced data.
A Machine Reading System for Assembling Synthetic Paleontological Databases

PubMed Central

Peters, Shanan E.; Zhang, Ce; Livny, Miron; Ré, Christopher

2014-01-01

Many aspects of macroevolutionary theory and our understanding of biotic responses to global environmental change derive from literature-based compilations of paleontological data. Existing manually assembled databases are, however, incomplete and difficult to assess and enhance with new data types. Here, we develop and validate the quality of a machine reading system, PaleoDeepDive, that automatically locates and extracts data from heterogeneous text, tables, and figures in publications. PaleoDeepDive performs comparably to humans in several complex data extraction and inference tasks and generates congruent synthetic results that describe the geological history of taxonomic diversity and genus-level rates of origination and extinction. Unlike traditional databases, PaleoDeepDive produces a probabilistic database that systematically improves as information is added. We show that the system can readily accommodate sophisticated data types, such as morphological data in biological illustrations and associated textual descriptions. Our machine reading approach to scientific data integration and synthesis brings within reach many questions that are currently underdetermined and does so in ways that may stimulate entirely new modes of inquiry. PMID:25436610

Mining hidden data to predict patient prognosis: texture feature extraction and machine learning in mammography

NASA Astrophysics Data System (ADS)

Leighs, J. A.; Halling-Brown, M. D.; Patel, M. N.

2018-03-01

The UK currently has a national breast cancer-screening program and images are routinely collected from a number of screening sites, representing a wealth of invaluable data that is currently under-used. Radiologists evaluate screening images manually and recall suspicious cases for further analysis such as biopsy. Histological testing of biopsy samples confirms the malignancy of the tumour, along with other diagnostic and prognostic characteristics such as disease grade. Machine learning is becoming increasingly popular for clinical image classification problems, as it is capable of discovering patterns in data otherwise invisible. This is particularly true when applied to medical imaging features; however clinical datasets are often relatively small. A texture feature extraction toolkit has been developed to mine a wide range of features from medical images such as mammograms. This study analysed a dataset of 1,366 radiologist-marked, biopsy-proven malignant lesions obtained from the OPTIMAM Medical Image Database (OMI-DB). Exploratory data analysis methods were employed to better understand extracted features. Machine learning techniques including Classification and Regression Trees (CART), ensemble methods (e.g. random forests), and logistic regression were applied to the data to predict the disease grade of the analysed lesions. Prediction scores of up to 83% were achieved; sensitivity and specificity of the models trained have been discussed to put the results into a clinical context. The results show promise in the ability to predict prognostic indicators from the texture features extracted and thus enable prioritisation of care for patients at greatest risk.
Identifying well-formed biomedical phrases in MEDLINE® text.

PubMed

Kim, Won; Yeganova, Lana; Comeau, Donald C; Wilbur, W John

2012-12-01

In the modern world people frequently interact with retrieval systems to satisfy their information needs. Humanly understandable well-formed phrases represent a crucial interface between humans and the web, and the ability to index and search with such phrases is beneficial for human-web interactions. In this paper we consider the problem of identifying humanly understandable, well formed, and high quality biomedical phrases in MEDLINE documents. The main approaches used previously for detecting such phrases are syntactic, statistical, and a hybrid approach combining these two. In this paper we propose a supervised learning approach for identifying high quality phrases. First we obtain a set of known well-formed useful phrases from an existing source and label these phrases as positive. We then extract from MEDLINE a large set of multiword strings that do not contain stop words or punctuation. We believe this unlabeled set contains many well-formed phrases. Our goal is to identify these additional high quality phrases. We examine various feature combinations and several machine learning strategies designed to solve this problem. A proper choice of machine learning methods and features identifies in the large collection strings that are likely to be high quality phrases. We evaluate our approach by making human judgments on multiword strings extracted from MEDLINE using our methods. We find that over 85% of such extracted phrase candidates are humanly judged to be of high quality. Published by Elsevier Inc.
Effective Information Extraction Framework for Heterogeneous Clinical Reports Using Online Machine Learning and Controlled Vocabularies

PubMed Central

Zheng, Shuai; Ghasemzadeh, Nima; Hayek, Salim S; Quyyumi, Arshed A

2017-01-01

Background Extracting structured data from narrated medical reports is challenged by the complexity of heterogeneous structures and vocabularies and often requires significant manual effort. Traditional machine-based approaches lack the capability to take user feedback for improving the extraction algorithm in real time. Objective Our goal was to provide a generic information extraction framework that can support diverse clinical reports and enables a dynamic interaction between a human and a machine that produces highly accurate results. Methods A clinical information extraction system IDEAL-X has been built on top of online machine learning. It processes one document at a time, and user interactions are recorded as feedback to update the learning model in real time. The updated model is used to predict values for extraction in subsequent documents. Once prediction accuracy reaches a user-acceptable threshold, the remaining documents may be batch processed. A customizable controlled vocabulary may be used to support extraction. Results Three datasets were used for experiments based on report styles: 100 cardiac catheterization procedure reports, 100 coronary angiographic reports, and 100 integrated reports, each of which combines history and physical report, discharge summary, outpatient clinic notes, outpatient clinic letter, and inpatient discharge medication report. Data extraction was performed by 3 methods: online machine learning, controlled vocabularies, and a combination of these. The system delivers results with F1 scores greater than 95%. Conclusions IDEAL-X adopts a unique online machine learning–based approach combined with controlled vocabularies to support data extraction for clinical reports. The system can quickly learn and improve, thus it is highly adaptable. PMID:28487265
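The interactive loop described in this abstract (process one document, take the user's correction as feedback, update the model, and switch to batch processing once accuracy is acceptable) can be illustrated with a minimal sketch. This is not the IDEAL-X algorithm, which the abstract does not specify: the bag-of-words perceptron, the class and function names, the sliding accuracy window, and the default threshold are all assumptions made for illustration.

```python
from collections import defaultdict


class OnlinePerceptron:
    """Hypothetical binary bag-of-words perceptron, updated one document at a time."""

    def __init__(self):
        self.weights = defaultdict(float)
        self.bias = 0.0

    def predict(self, tokens):
        score = self.bias + sum(self.weights[t] for t in tokens)
        return 1 if score > 0 else 0

    def update(self, tokens, label):
        """Apply user feedback: weights change only when the prediction was wrong."""
        error = label - self.predict(tokens)
        if error != 0:
            for t in tokens:
                self.weights[t] += error
            self.bias += error


def process_stream(model, documents, threshold=0.9, window=5):
    """Review documents one by one until accuracy over the last `window`
    documents reaches `threshold`.

    `documents` is a list of (tokens, label) pairs, where the label stands in
    for the user's correction. Returns the index of the first document that
    could be handed to batch processing.
    """
    recent = []
    for i, (tokens, label) in enumerate(documents):
        correct = model.predict(tokens) == label
        model.update(tokens, label)
        recent = (recent + [correct])[-window:]
        if len(recent) == window and sum(recent) / window >= threshold:
            return i + 1
    return len(documents)
```

In this sketch the switch to batch mode depends only on recent accuracy, so a user could tighten or relax `threshold` to trade review effort against extraction quality.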
Resistance gene identification from Larimichthys crocea with machine learning techniques

NASA Astrophysics Data System (ADS)

Cai, Yinyin; Liao, Zhijun; Ju, Ying; Liu, Juan; Mao, Yong; Liu, Xiangrong

2016-12-01

Research on resistance genes (R-genes) plays a vital role in bioinformatics, because R-genes help an organism cope with adverse changes in the external environment by forming the corresponding resistance proteins through transcription and translation. It is meaningful to identify and predict the R-genes of Larimichthys crocea (L. crocea), and doing so also benefits breeding and the marine environment. Many of the immune mechanisms of L. crocea have been explored by biological methods. However, much about them is still unclear. To overcome the limited understanding of the immune mechanisms of L. crocea and to detect new R-genes and R-gene-like genes, this paper proposes a combined prediction method that extracts and classifies features of the available genomic data by machine learning. The effectiveness of the feature extraction and classification methods in identifying potential novel R-genes was evaluated, and different statistical analyses were used to explore the reliability of the prediction method, which can help us further understand the immune mechanisms of L. crocea against pathogens. A web server called LCRG-Pred is available at http://server.malab.cn/rg_lc/.
Effective Information Extraction Framework for Heterogeneous Clinical Reports Using Online Machine Learning and Controlled Vocabularies.

PubMed

Zheng, Shuai; Lu, James J; Ghasemzadeh, Nima; Hayek, Salim S; Quyyumi, Arshed A; Wang, Fusheng

2017-05-09

Extracting structured data from narrated medical reports is challenged by the complexity of heterogeneous structures and vocabularies and often requires significant manual effort. Traditional machine-based approaches lack the capability to take user feedback for improving the extraction algorithm in real time. Our goal was to provide a generic information extraction framework that can support diverse clinical reports and enables a dynamic interaction between a human and a machine that produces highly accurate results. A clinical information extraction system IDEAL-X has been built on top of online machine learning. It processes one document at a time, and user interactions are recorded as feedback to update the learning model in real time. The updated model is used to predict values for extraction in subsequent documents. Once prediction accuracy reaches a user-acceptable threshold, the remaining documents may be batch processed. A customizable controlled vocabulary may be used to support extraction. Three datasets were used for experiments based on report styles: 100 cardiac catheterization procedure reports, 100 coronary angiographic reports, and 100 integrated reports, each of which combines history and physical report, discharge summary, outpatient clinic notes, outpatient clinic letter, and inpatient discharge medication report. Data extraction was performed by 3 methods: online machine learning, controlled vocabularies, and a combination of these. The system delivers results with F1 scores greater than 95%. IDEAL-X adopts a unique online machine learning-based approach combined with controlled vocabularies to support data extraction for clinical reports. The system can quickly learn and improve, thus it is highly adaptable. ©Shuai Zheng, James J Lu, Nima Ghasemzadeh, Salim S Hayek, Arshed A Quyyumi, Fusheng Wang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 09.05.2017.
Extracting laboratory test information from biomedical text

PubMed Central

Kang, Yanna Shen; Kayaalp, Mehmet

2013-01-01

Background: No previous study reported the efficacy of current natural language processing (NLP) methods for extracting laboratory test information from narrative documents. This study investigates the pathology informatics question of how accurately such information can be extracted from text with the current tools and techniques, especially machine learning and symbolic NLP methods. The study data came from a text corpus maintained by the U.S. Food and Drug Administration, containing a rich set of information on laboratory tests and test devices. Methods: The authors developed a symbolic information extraction (SIE) system to extract device and test specific information about four types of laboratory test entities: Specimens, analytes, units of measures and detection limits. They compared the performance of SIE and three prominent machine learning based NLP systems, LingPipe, GATE and BANNER, each implementing a distinct supervised machine learning method, hidden Markov models, support vector machines and conditional random fields, respectively. Results: Machine learning systems recognized laboratory test entities with moderately high recall, but low precision rates. Their recall rates were relatively higher when the number of distinct entity values (e.g., the spectrum of specimens) was very limited or when lexical morphology of the entity was distinctive (as in units of measures), yet SIE outperformed them with statistically significant margins on extracting specimen, analyte and detection limit information in both precision and F-measure. Its high recall performance was statistically significant on analyte information extraction. Conclusions: Despite its shortcomings against machine learning methods, a well-tailored symbolic system may better discern relevancy among a pile of information of the same type and may outperform a machine learning system by tapping into lexically non-local contextual information such as the document structure. PMID:24083058
Shaft instantaneous angular speed for blade vibration in rotating machine

NASA Astrophysics Data System (ADS)

Gubran, Ahmed A.; Sinha, Jyoti K.

2014-02-01

Reliable blade health monitoring (BHM) in rotating machines such as steam turbines and gas turbines has been a topic of research for decades, with the aim of reducing machine downtime and maintenance costs and maintaining overall safety. Transverse blade vibration is often transmitted to the shaft as torsional vibration. The shaft instantaneous angular speed (IAS) is simply a representation of the shaft torsional vibration. Hence the shaft IAS has been extracted from the measured encoder data during machine run-up to understand the blade vibration and to explore the possibility of reliable assessment of blade health. A number of experiments were conducted on an experimental rig with a bladed disk, both with healthy but mistuned blades and with different simulated faults in the blades. The measured shaft torsional vibration shows a distinct difference between the healthy and the faulty blade conditions. Hence, the observations are useful for future BHM. The paper presents the experimental setup, simulation of blade faults, experiments conducted, observations and results.
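As a rough illustration of how IAS can be obtained from encoder data, the sketch below divides the fixed angular step between consecutive encoder pulses by the measured time between them. The abstract does not describe the authors' processing chain, so the function name, the pulse-timestamp input format, and the assumption of evenly spaced encoder marks are all hypothetical.

```python
import math


def instantaneous_angular_speed(pulse_times, pulses_per_rev):
    """Estimate IAS in rad/s, one value per interval between encoder pulses.

    Assumes evenly spaced encoder marks, so each interval spans
    2*pi / pulses_per_rev radians.
    """
    if pulses_per_rev <= 0:
        raise ValueError("pulses_per_rev must be positive")
    dtheta = 2.0 * math.pi / pulses_per_rev
    speeds = []
    for t0, t1 in zip(pulse_times, pulse_times[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("pulse timestamps must be strictly increasing")
        speeds.append(dtheta / dt)
    return speeds
```

Small fluctuations of this sequence around the mean shaft speed are what would carry the torsional vibration content during run-up.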
Morphological analysis of dendrites and spines by hybridization of ridge detection with twin support vector machine.

PubMed

Wang, Shuihua; Chen, Mengmeng; Li, Yang; Shao, Ying; Zhang, Yudong; Du, Sidan; Wu, Jane

2016-01-01

Dendritic spines are described as neuronal protrusions. The morphology of dendritic spines and dendrites is strongly related to their function and plays an important role in understanding brain function. Quantitative analysis of dendrites and dendritic spines is essential to an understanding of the formation and function of the nervous system. However, highly efficient tools for the quantitative analysis of dendrites and dendritic spines are currently undeveloped. In this paper we propose a novel three-step cascaded algorithm, RTSVM, which is composed of ridge detection as the curvature structure identifier for backbone extraction, boundary location based on differences in density, and the Hu moment as features with Twin Support Vector Machine (TSVM) classifiers for spine classification. Our data demonstrate that this newly developed algorithm performs better than other available techniques in terms of detection accuracy and false alarm rates. This algorithm can be used effectively in neuroscience research.
A new machine classification method applied to human peripheral blood leukocytes

NASA Technical Reports Server (NTRS)

Rorvig, Mark E.; Fitzpatrick, Steven J.; Vitthal, Sanjay; Ladoulis, Charles T.

1994-01-01

Human beings judge images by complex mental processes, whereas computing machines extract features. By reducing scaled human judgments and machine extracted features to a common metric space and fitting them by regression, the judgments of human experts rendered on a sample of images may be imposed on an image population to provide automatic classification.
Emotion Discrimination Using Spatially Compact Regions of Interest Extracted from Imaging EEG Activity

PubMed Central

Padilla-Buritica, Jorge I.; Martinez-Vargas, Juan D.; Castellanos-Dominguez, German

2016-01-01

Lately, research on computational models of emotion has been receiving much attention due to their potential for understanding the mechanisms of emotions and their promising broad range of applications that potentially bridge the gap between human and machine interactions. We propose a new method for emotion classification that relies on features extracted from those active brain areas that are most likely related to emotions. To this end, we carry out the selection of spatially compact regions of interest that are computed using the brain neural activity reconstructed from Electroencephalography data. Throughout this study, we consider three representative feature extraction methods widely applied to emotion detection tasks, including Power spectral density, Wavelet, and Hjorth parameters. Further feature selection is carried out using principal component analysis. For validation purposes, these features are used to feed a support vector machine classifier that is trained under the leave-one-out cross-validation strategy. Results obtained on real affective data show that incorporating the proposed training method in combination with the enhanced spatial resolution provided by the source estimation improves the discrimination accuracy for most of the considered emotions, namely: dominance, valence, and liking. PMID:27489541
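Of the three feature families named in this abstract, the Hjorth parameters are the simplest to write down. The sketch below uses the standard definitions (activity is the signal variance, mobility is the square root of the ratio of the variance of the first difference to the variance of the signal, and complexity is the mobility of the first difference divided by the mobility of the signal). It is a generic illustration on a one-dimensional sample list, not the authors' pipeline; the function names are assumptions.

```python
import math


def _variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def _diff(values):
    """First-order difference, used as a discrete derivative."""
    return [b - a for a, b in zip(values, values[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) for a 1-D signal.

    Requires at least three samples and a signal whose first difference
    is not constant.
    """
    d1 = _diff(signal)
    d2 = _diff(d1)
    var0, var1, var2 = _variance(signal), _variance(d1), _variance(d2)
    activity = var0
    mobility = math.sqrt(var1 / var0)
    complexity = math.sqrt(var2 / var1) / mobility
    return activity, mobility, complexity
```

For a pure sinusoid the complexity is close to 1, which makes a convenient sanity check before applying the function to EEG segments.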
Supporting the human life-raft in confronting the juggernaut of technology: Jens Rasmussen, 1961-1986.

PubMed

Kant, Vivek

2017-03-01

Jens Rasmussen's contribution to the field of human factors and ergonomics has had a lasting impact. Six prominent interrelated themes can be extracted from his research between 1961 and 1986. These themes form the basis of an engineering epistemology which is best manifested by his abstraction hierarchy. Further, Rasmussen reformulated technical reliability using systems language to enable a proper human-machine fit. To understand the concept of human-machine fit, he included the operator as a central component in the system to enhance system safety. This change resulted in the application of a qualitative and categorical approach for human-machine interaction design. Finally, Rasmussen's insistence on a working philosophy of systems design as being a joint responsibility of operators and designers provided the basis for averting errors and ensuring safe and correct system functioning. Copyright © 2016 Elsevier Ltd. All rights reserved.
Extracting Date/Time Expressions in Super-Function Based Japanese-English Machine Translation

NASA Astrophysics Data System (ADS)

Sasayama, Manabu; Kuroiwa, Shingo; Ren, Fuji

Super-Function Based Machine Translation (SFBMT), a type of Example-Based Machine Translation, can expand the coverage of examples by changing nouns into variables. However, because only nouns and numbers were changed into variables, there were problems extracting entire date/time expressions that contain parts of speech other than nouns. We describe a method for extracting date/time expressions for SFBMT. SFBMT uses noun determination rules to extract nouns and a bilingual dictionary to obtain the correspondence of the extracted nouns between the source and target languages. In this method, we add a rule to extract date/time expressions and then extract date/time expressions from a Japanese-English bilingual corpus. The evaluation results show that the precision of this method for Japanese sentences is 96.7%, with a recall of 98.2%, and the precision for English sentences is 94.7%, with a recall of 92.7%.
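The abstract reports precision and recall separately. For readers who want a single figure, the balanced F-measure can be derived from them as the harmonic mean. The values produced this way are derived here for illustration and are not reported by the authors; the function name is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, converted from percentages to fractions.
japanese_f1 = f1_score(0.967, 0.982)
english_f1 = f1_score(0.947, 0.927)
print(f"Japanese F1: {japanese_f1:.3f}, English F1: {english_f1:.3f}")
```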
Complex Approach to Conceptual Design of Machine Mechanically Extracting Oil from Jatropha curcas L. Seeds for Biomass-Based Fuel Production

PubMed Central

Mašín, Ivan

2016-01-01

One of the important sources of biomass-based fuel is Jatropha curcas L. Great attention is paid to the biofuel produced from the oil extracted from Jatropha curcas L. seeds. Mechanised extraction is the most efficient and feasible method of oil extraction for small-scale farmers, but there is a need to extract oil in a more efficient manner that would increase labour productivity, decrease production costs, and increase the benefits to small-scale farmers. On the other hand, innovators should be aware that further machine development is possible only when a systematic approach and design methodology are applied in all stages of engineering design. A systematic approach in this case means that designers and development engineers rigorously apply scientific knowledge, integrate different constraints and user priorities, carefully plan product and activities, and systematically solve technical problems. This paper therefore deals with a complex approach to determining design specifications that can bring new innovative concepts to the design of mechanical machines for oil extraction. The presented case study, the main part of the paper, focuses on a new concept for the screw of a machine that mechanically extracts oil from Jatropha curcas L. seeds. PMID:27668259
Perspectives on Machine Learning for Classification of Schizotypy Using fMRI Data.

PubMed

Madsen, Kristoffer H; Krohne, Laerke G; Cai, Xin-Lu; Wang, Yi; Chan, Raymond C K

2018-03-15

Functional magnetic resonance imaging is capable of estimating functional activation and connectivity in the human brain, and lately there has been increased interest in the use of these functional modalities combined with machine learning for identification of psychiatric traits. While these methods bear great potential for early diagnosis and better understanding of disease processes, there are wide ranges of processing choices and pitfalls that may severely hamper interpretation and generalization performance unless carefully considered. In this perspective article, we aim to motivate the use of machine learning in schizotypy research. To this end, we describe common data processing steps while commenting on best practices and procedures. First, we introduce the important role of schizotypy to motivate the importance of reliable classification, and summarize existing machine learning literature on schizotypy. Then, we describe procedures for extraction of features based on fMRI data, including statistical parametric mapping, parcellation, complex network analysis, and decomposition methods, as well as classification with a special focus on support vector classification and deep learning. We provide more detailed descriptions and software as supplementary material. Finally, we present current challenges in machine learning for classification of schizotypy and comment on future trends and perspectives.
Informatics and machine learning to define the phenotype.

PubMed

Basile, Anna Okula; Ritchie, Marylyn DeRiggi

2018-03-01

For the past decade, the focus of complex disease research has been the genotype. From technological advancements to the development of analysis methods, great progress has been made. However, advances in our definition of the phenotype have remained stagnant. Phenotype characterization has recently emerged as an exciting area of informatics and machine learning. The copious amounts of diverse biomedical data that have been collected may be leveraged with data-driven approaches to elucidate trait-related features and patterns. Areas covered: In this review, the authors discuss the phenotype in traditional genetic associations and the challenges this has imposed. Approaches for phenotype refinement that can aid in more accurate characterization of traits are also discussed. Further, the authors highlight promising machine learning approaches for establishing a phenotype and the challenges of electronic health record (EHR)-derived data. Expert commentary: The authors hypothesize that through unsupervised machine learning, data-driven approaches can be used to define phenotypes rather than relying on expert clinician knowledge. Through the use of machine learning and an unbiased set of features extracted from clinical repositories, researchers will have the potential to further understand complex traits and identify patient subgroups. This knowledge may lead to more preventative and precise clinical care.
System and method for cooling a superconducting rotary machine

DOEpatents

Ackermann, Robert Adolf [Schenectady, NY]; Laskaris, Evangelos Trifon [Schenectady, NY]; Huang, Xianrui [Clifton Park, NY]; Bray, James William [Niskayuna, NY]

2011-08-09

A system for cooling a superconducting rotary machine includes a plurality of sealed siphon tubes disposed in balanced locations around a rotor adjacent to a superconducting coil. Each of the sealed siphon tubes includes a tubular body and a heat transfer medium disposed in the tubular body that undergoes a phase change during operation of the machine to extract heat from the superconducting coil. A siphon heat exchanger is thermally coupled to the siphon tubes for extracting heat from the siphon tubes during operation of the machine.
Modelling and representation issues in automated feature extraction from aerial and satellite images

NASA Astrophysics Data System (ADS)

Sowmya, Arcot; Trinder, John

New digital systems for the processing of photogrammetric and remote sensing images have led to new approaches to information extraction for mapping and Geographic Information System (GIS) applications, with the expectation that data can become more readily available at a lower cost and with greater currency. Demands for mapping and GIS data are increasing as well for environmental assessment and monitoring. Hence, researchers from the fields of photogrammetry and remote sensing, as well as computer vision and artificial intelligence, are bringing together their particular skills for automating these tasks of information extraction. The paper will review some of the approaches used in knowledge representation and modelling for machine vision, and give examples of their applications in research for image understanding of aerial and satellite imagery.
Nuclear Weapons. National Nuclear Security Administration’s Plans for Its Uranium Processing Facility Should Better Reflect Funding Estimates and Technology Readiness

DTIC Science & Technology

2010-11-01

metal. Recovery extraction centrifugal contactors A process that uses solvent to extract uranium for purposes of purification. Agile machining A...extraction centrifugal contactors 5 6 Yes 6 No Agile machining 5 5 No 6 No Chip management 5 6 Yes 6 No Special casting 3 6 Yes 6 No Source: GAO
Language extraction from zinc sulfide

NASA Astrophysics Data System (ADS)

Varn, Dowman Parks

2001-09-01

Recent advances in the analysis of one-dimensional temporal and spatial series allow for detailed characterization of disorder and computation in physical systems. One such system that has defied theoretical understanding since its discovery in 1912 is polytypism. Polytypes are layered compounds, exhibiting crystallinity in two dimensions, yet having complicated stacking sequences in the third direction. They can show both ordered and disordered sequences, sometimes each in the same specimen. We demonstrate a method for extracting two-layer correlation information from ZnS diffraction patterns and employ a novel technique for epsilon-machine reconstruction. For this special class of disorder, we solve a long-standing problem: that of determining structural information for disordered materials from their diffraction patterns. Our solution offers the most complete possible statistical description of the disorder. Furthermore, from our reconstructed epsilon-machines we find the effective range of the interlayer interaction in these materials, as well as the configurational energy of both ordered and disordered specimens. Finally, we can determine the 'language' (in terms of the Chomsky Hierarchy) these small rocks speak, and we find that regular languages are sufficient to describe them.
Support Vector Machine-Based Endmember Extraction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Filippi, Anthony M; Archibald, Richard K

Introduced in this paper is the utilization of Support Vector Machines (SVMs) to automatically perform endmember extraction from hyperspectral data. The strengths of SVM are exploited to provide a fast and accurate calculated representation of high-dimensional data sets that may consist of multiple distributions. Once this representation is computed, the number of distributions can be determined without prior knowledge. For each distribution, an optimal transform can be determined that preserves informational content while reducing the data dimensionality, and hence, the computational cost. Finally, endmember extraction for the whole data set is accomplished. Results indicate that this Support Vector Machine-Based Endmember Extraction (SVM-BEE) algorithm has the capability of autonomously determining endmembers from multiple clusters with computational speed and accuracy, while maintaining a robust tolerance to noise.

Artificial intelligence expert systems with neural network machine learning may assist decision-making for extractions in orthodontic treatment planning.

PubMed

Takada, Kenji

2016-09-01

New approach for the diagnosis of extractions with neural network machine learning. Seok-Ki Jung and Tae-Woo Kim. Am J Orthod Dentofacial Orthop 2016;149:127-33. Not reported. Mathematical modeling. Copyright © 2016 Elsevier Inc. All rights reserved.
Applying machine learning to identify autistic adults using imitation: An exploratory study.

PubMed

Li, Baihua; Sharma, Arjun; Meng, James; Purushwalkam, Senthil; Gowen, Emma

2017-01-01

Autism spectrum condition (ASC) is primarily diagnosed by behavioural symptoms including social, sensory and motor aspects. Although stereotyped, repetitive motor movements are considered during diagnosis, quantitative measures that identify kinematic characteristics in the movement patterns of autistic individuals are poorly studied, preventing advances in understanding the aetiology of motor impairment, or whether a wider range of motor characteristics could be used for diagnosis. The aim of this study was to investigate whether data-driven machine learning based methods could be used to address some fundamental problems with regard to identifying discriminative test conditions and kinematic parameters to classify between ASC and neurotypical controls. Data was based on a previous task where 16 ASC participants and 14 age- and IQ-matched controls observed then imitated a series of hand movements. 40 kinematic parameters extracted from eight imitation conditions were analysed using machine learning based methods. Two optimal imitation conditions and nine most significant kinematic parameters were identified and compared with some standard attribute evaluators. To our knowledge, this is the first attempt to apply machine learning to kinematic movement parameters measured during imitation of hand movements to investigate the identification of ASC. Although based on a small sample, the work demonstrates the feasibility of applying machine learning methods to analyse high-dimensional data and suggests the potential of machine learning for identifying kinematic biomarkers that could contribute to the diagnostic classification of autism.
Device for Extracting Flavors and Fragrances

NASA Technical Reports Server (NTRS)

Chang, F. R.

1986-01-01

Machine for making coffee and tea in weightless environment may prove even more valuable on Earth as general extraction apparatus. Zero-gravity beverage maker uses piston instead of gravity to move hot water and beverage from one chamber to other and dispense beverage. Machine functions like conventional coffeemaker during part of operating cycle and includes additional features that enable operation not only in zero gravity but also extraction under pressure in presence or absence of gravity.
One Dimensional Turing-Like Handshake Test for Motor Intelligence

PubMed Central

Karniel, Amir; Avraham, Guy; Peles, Bat-Chen; Levy-Tzedek, Shelly; Nisky, Ilana

2010-01-01

In the Turing test, a computer model is deemed to "think intelligently" if it can generate answers that are not distinguishable from those of a human. However, this test is limited to the linguistic aspects of machine intelligence. A salient function of the brain is the control of movement, and the movement of the human hand is a sophisticated demonstration of this function. Therefore, we propose a Turing-like handshake test, for machine motor intelligence. We administer the test through a telerobotic system in which the interrogator is engaged in a task of holding a robotic stylus and interacting with another party (human or artificial). Instead of asking the interrogator whether the other party is a person or a computer program, we employ a two-alternative forced choice method and ask which of two systems is more human-like. We extract a quantitative grade for each model according to its resemblance to the human handshake motion and name it "Model Human-Likeness Grade" (MHLG). We present three methods to estimate the MHLG. (i) By calculating the proportion of subjects' answers that the model is more human-like than the human; (ii) By comparing two weighted sums of human and model handshakes we fit a psychometric curve and extract the point of subjective equality (PSE); (iii) By comparing a given model with a weighted sum of human and random signal, we fit a psychometric curve to the answers of the interrogator and extract the PSE for the weight of the human in the weighted sum. Altogether, we provide a protocol to test computational models of the human handshake. We believe that building a model is a necessary step in understanding any phenomenon and, in this case, in understanding the neural mechanisms responsible for the generation of the human handshake. PMID:21206462
The optional selection of micro-motion feature based on Support Vector Machine

NASA Astrophysics Data System (ADS)

Li, Bo; Ren, Hongmei; Xiao, Zhi-he; Sheng, Jing

2017-11-01

A target can exhibit multiple forms of micro-motion, and the different forms are apt to be modulated, which makes feature extraction and recognition difficult. To address feature extraction for cone-shaped objects with different micro-motion forms, this paper proposes a method based on support vector machines for selecting the best micro-motion features. After computing the time-frequency distribution of the radar echoes and comparing the time-frequency spectra of objects with different micro-motion forms, features are extracted based on the differences between the instantaneous frequency variations of the different micro-motions. The extracted features are then evaluated with the SVM (Support Vector Machine)-based method, and the best features are selected. Finally, the results show that the proposed method is feasible under the test condition of a certain signal-to-noise ratio (SNR).
Analysis and Prediction of Myristoylation Sites Using the mRMR Method, the IFS Method and an Extreme Learning Machine Algorithm.

PubMed

Wang, ShaoPeng; Zhang, Yu-Hang; Huang, GuoHua; Chen, Lei; Cai, Yu-Dong

2017-01-01

Myristoylation is an important hydrophobic post-translational modification that is covalently bound to the amino group of Gly residues on the N-terminus of proteins. The many diverse functions of myristoylation on proteins, such as membrane targeting, signal pathway regulation and apoptosis, are largely due to the lipid modification, whereas abnormal or irregular myristoylation on proteins can lead to several pathological changes in the cell. To better understand the function of myristoylated sites and to correctly identify them in protein sequences, this study conducted a novel computational investigation on identifying myristoylation sites in protein sequences. A training dataset with 196 positive and 84 negative peptide segments was obtained. Four types of features derived from the peptide segments following the myristoylation sites were used to specify myristoylated and non-myristoylated sites. Then, feature selection methods including maximum relevance and minimum redundancy (mRMR), incremental feature selection (IFS), and a machine learning algorithm (extreme learning machine method) were adopted to extract optimal features for the algorithm to identify myristoylation sites in protein sequences, thereby building an optimal prediction model. As a result, 41 key features were extracted and used to build an optimal prediction model. The effectiveness of the optimal prediction model was further validated by its performance on a test dataset. Furthermore, detailed analyses were also performed on the extracted 41 features to gain insight into the mechanism of myristoylation modification. This study provided a new computational method for identifying myristoylation sites in protein sequences. We believe that it can be a useful tool to predict myristoylation sites from protein sequences. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Bidirectional RNN for Medical Event Detection in Electronic Health Records.

PubMed

Jagannatha, Abhyuday N; Yu, Hong

2016-06-01

Sequence labeling for extraction of medical events and their attributes from unstructured text in Electronic Health Record (EHR) notes is a key step towards semantic understanding of EHRs. It has important applications in health informatics including pharmacovigilance and drug surveillance. The state of the art supervised machine learning models in this domain are based on Conditional Random Fields (CRFs) with features calculated from fixed context windows. In this application, we explored recurrent neural network frameworks and show that they significantly out-performed the CRF models.
Detecting epileptic seizure with different feature extracting strategies using robust machine learning classification techniques by applying advance parameter optimization approach.

PubMed

Hussain, Lal

2018-06-01

Epilepsy is a neurological disorder caused by abnormal excitability of neurons in the brain. Brain activity of patients suffering from seizures is monitored through the electroencephalogram (EEG) to detect epileptic seizures. The performance of EEG-based epilepsy detection depends on feature extraction strategies. In this research, we extracted features using varying strategies based on time and frequency domain characteristics, nonlinear measures, wavelet-based entropy and a few statistical features. A deeper study was undertaken using novel machine learning classifiers by considering multiple factors. The support vector machine kernels were evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we evaluated different distance metrics, neighbor weights and numbers of neighbors. Similarly, for decision trees we tuned the parameters based on maximum splits and split criteria, and ensemble classifiers were evaluated based on different ensemble methods and learning rates. For training/testing, tenfold cross-validation was employed and performance was evaluated in terms of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis was performed using diverse feature extraction strategies and robust machine learning classifiers with more advanced optimal options. The support vector machine with linear kernel and KNN with city block distance metric gave the overall highest accuracy of 99.5%, which was higher than that obtained using the default parameters for these classifiers. Moreover, the highest separation (AUC = 0.9991, 0.9990) was obtained at different kernel scales using SVM. Additionally, KNN with inverse squared distance weight gave higher performance at different numbers of neighbors. Moreover, in distinguishing postictal heart rate oscillations from epileptic ictal subjects, the highest performance of 100% was obtained using different machine learning classifiers.
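As an illustration of the evaluation measures named in the abstract above (TPR, PPV, accuracy), the following is a minimal sketch of how such figures are commonly derived from binary predictions. The function name `binary_metrics` and the toy labels are assumptions made for this example; they are not taken from the study, and the abstract's "NPR" is left out because its definition is not given there.

```python
def binary_metrics(y_true, y_pred):
    """Return TPR, PPV, NPV and accuracy for 0/1 labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as None.
        return num / den if den else None

    return {
        "TPR": ratio(tp, tp + fn),        # sensitivity / recall
        "PPV": ratio(tp, tp + fp),        # precision
        "NPV": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Toy example: 1 = seizure segment, 0 = non-seizure segment (made-up labels).
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
print(metrics)
```

In a tenfold cross-validation setting, such a function would typically be applied to the held-out fold of each split and the results averaged.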
Machine learning to predict the occurrence of bisphosphonate-related osteonecrosis of the jaw associated with dental extraction: A preliminary report.

PubMed

Kim, Dong Wook; Kim, Hwiyoung; Nam, Woong; Kim, Hyung Jun; Cha, In-Ho

2018-04-23

The aim of this study was to build and validate five types of machine learning models that can predict the occurrence of BRONJ associated with dental extraction in patients taking bisphosphonates for the management of osteoporosis. A retrospective review of the medical records was conducted to obtain cases and controls for the study. A total of 125 patients, consisting of 41 cases and 84 controls, were selected for the study. Five machine learning prediction algorithms including multivariable logistic regression model, decision tree, support vector machine, artificial neural network, and random forest were implemented. The outputs of these models were compared with each other and also with conventional methods, such as serum CTX level. Area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the results. The performance of machine learning models was significantly superior to conventional statistical methods and single predictors. The random forest model yielded the best performance (AUC = 0.973), followed by artificial neural network (AUC = 0.915), support vector machine (AUC = 0.882), logistic regression (AUC = 0.844), decision tree (AUC = 0.821), drug holiday alone (AUC = 0.810), and CTX level alone (AUC = 0.630). Machine learning methods showed superior performance in predicting BRONJ associated with dental extraction compared to conventional statistical methods using drug holiday and serum CTX level. Machine learning can thus be applied in a wide range of clinical studies.
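Since the abstract above compares models by area under the ROC curve, a short sketch of how an AUC can be computed from predicted scores may help. It uses the rank interpretation of AUC (the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half). The function name `roc_auc` and the toy scores are assumptions for illustration only and do not reproduce the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly."""
    cases = [s for l, s in zip(labels, scores) if l == 1]
    controls = [s for l, s in zip(labels, scores) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Made-up risk scores for two cases (1) and two controls (0).
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of the size described (125 patients); a rank-based formulation would be preferable for large datasets.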
Machine learning methods for the classification of gliomas: Initial results using features extracted from MR spectroscopy.

PubMed

Ranjith, G; Parvathy, R; Vikas, V; Chandrasekharan, Kesavadas; Nair, Suresh

2015-04-01

With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification. The aim of the study is to classify gliomas into benign and malignant types using MRI data. Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant. Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning. Three of the four machine learning algorithms gave an area under ROC curve in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%). The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences.
Service Modules for Coal Extraction

NASA Technical Reports Server (NTRS)

Gangal, M. D.; Lewis, E. V.

1985-01-01

Service train follows group of mining machines, paying out utility lines as machines progress into coal face. Service train for four mining machines removes gases and coal and provides water and electricity. Flexible, coiling armored carriers protect cables and hoses. High coal production attained by arraying row of machines across face, working side by side.
Machine Learning Approach to Extract Diagnostic and Prognostic Thresholds: Application in Prognosis of Cardiovascular Mortality

PubMed Central

Mena, Luis J.; Orozco, Eber E.; Felix, Vanessa G.; Ostos, Rodolfo; Melgarejo, Jesus; Maestre, Gladys E.

2012-01-01

Machine learning has become a powerful tool for analysing medical domains, assessing the importance of clinical parameters, and extracting medical knowledge for outcomes research. In this paper, we present a machine learning method for extracting diagnostic and prognostic thresholds, based on a symbolic classification algorithm called REMED. We evaluated the performance of our method by determining new prognostic thresholds for well-known and potential cardiovascular risk factors that are used to support medical decisions in the prognosis of fatal cardiovascular diseases. Our approach predicted 36% of cardiovascular deaths with 80% specificity and 75% general accuracy. The new method provides an innovative approach that might be useful to support decisions about medical diagnoses and prognoses. PMID:22924062
Fermilab Booster Transition Crossing Simulations and Beam Studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhat, C. M.; Tan, C. Y.

2016-01-01

The Fermilab Booster accelerates beam from 400 MeV to 8 GeV at 15 Hz. In the PIP (Proton Improvement Plan) era, it is required that Booster deliver 4.2 x $$10^{12}$$ protons per pulse to extraction. One of the obstacles for providing quality beam to the users is the longitudinal quadrupole oscillation that the beam suffers from right after transition. Although this oscillation is well taken care of with quadrupole dampers, it is important to understand the source of these oscillations in light of the PIP II requirements that require 6.5 x $$10^{12}$$ protons per pulse at extraction. This paper explores the results from machine studies, computer simulations and solutions to prevent the quadrupole oscillations after transition.
Scale-invariant feature extraction of neural network and renormalization group flow

NASA Astrophysics Data System (ADS)

Iso, Satoshi; Shiba, Shotaro; Yokoo, Sumito

2018-05-01

Theoretical understanding of how a deep neural network (DNN) extracts features from input images is still unclear, but it is widely believed that the extraction is performed hierarchically through a process of coarse graining. It reminds us of the basic renormalization group (RG) concept in statistical physics. In order to explore possible relations between DNN and RG, we use the restricted Boltzmann machine (RBM) applied to an Ising model and construct a flow of model parameters (in particular, temperature) generated by the RBM. We show that the unsupervised RBM trained by spin configurations at various temperatures from T = 0 to T = 6 generates a flow along which the temperature approaches the critical value Tc = 2.27. This behavior is the opposite of the typical RG flow of the Ising model. By analyzing various properties of the weight matrices of the trained RBM, we discuss why it flows towards Tc and how the RBM learns to extract features of spin configurations.
Ensemble methods with simple features for document zone classification

NASA Astrophysics Data System (ADS)

Obafemi-Ajayi, Tayo; Agam, Gady; Xie, Bingqing

2012-01-01

Document layout analysis is of fundamental importance for document image understanding and information retrieval. It requires the identification of blocks extracted from a document image via features extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classifications of these blocks. We present a comparative evaluation of three ensemble based classification algorithms (boosting, bagging, and combined model trees) in addition to other known learning algorithms. Experimental results are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in conjunction with the Ocropus feature set, we further improve the performance of the block classification system to obtain a classification accuracy of 99.21%.
Understanding disciplinary vocabularies using a full-text enabled domain-independent term extraction approach.

PubMed

Yan, Erjia; Williams, Jake; Chen, Zheng

2017-01-01

Publication metadata help deliver rich analyses of scholarly communication. However, research concepts and ideas are more effectively expressed through unstructured fields such as full texts. Thus, the goals of this paper are to employ a full-text enabled method to extract terms relevant to disciplinary vocabularies, and through them, to understand the relationships between disciplines. This paper uses an efficient, domain-independent term extraction method to extract disciplinary vocabularies from a large multidisciplinary corpus of PLoS ONE publications. It finds a power-law pattern in the frequency distributions of terms present in each discipline, indicating a semantic richness potentially sufficient for further study and advanced analysis. The salient relationships amongst these vocabularies become apparent in application of a principal component analysis. For example, Mathematics and Computer and Information Sciences were found to have similar vocabulary use patterns along with Engineering and Physics; while Chemistry and the Social Sciences were found to exhibit contrasting vocabulary use patterns along with the Earth Sciences and Chemistry. These results have implications to studies of scholarly communication as scholars attempt to identify the epistemological cultures of disciplines, and as a full text-based methodology could lead to machine learning applications in the automated classification of scholarly work according to disciplinary vocabularies.
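The abstract above reports a power-law pattern in term frequency distributions. As a rough illustration of what checking for such a pattern can look like, the sketch below fits f(r) ≈ C · r^(−α) to a rank-frequency list by ordinary least squares on log-log axes. This is a simple stand-in chosen for brevity, not the method used in the paper; the function name `fit_power_law` and the synthetic counts are assumptions, and maximum-likelihood estimators are generally considered more reliable for real data.

```python
import math


def fit_power_law(frequencies):
    """Least-squares fit of log(frequency) against log(rank).

    Returns (alpha, C) for the model frequency ~ C * rank ** (-alpha).
    """
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two positive frequencies")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic term counts that follow an exact 1/rank law.
alpha, scale = fit_power_law([1000.0 / r for r in range(1, 51)])
print(round(alpha, 3), round(scale, 1))
```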
Understanding disciplinary vocabularies using a full-text enabled domain-independent term extraction approach

PubMed Central

Williams, Jake; Chen, Zheng

2017-01-01

Publication metadata help deliver rich analyses of scholarly communication. However, research concepts and ideas are more effectively expressed through unstructured fields such as full texts. Thus, the goals of this paper are to employ a full-text enabled method to extract terms relevant to disciplinary vocabularies, and through them, to understand the relationships between disciplines. This paper uses an efficient, domain-independent term extraction method to extract disciplinary vocabularies from a large multidisciplinary corpus of PLoS ONE publications. It finds a power-law pattern in the frequency distributions of terms present in each discipline, indicating a semantic richness potentially sufficient for further study and advanced analysis. The salient relationships amongst these vocabularies become apparent in application of a principal component analysis. For example, Mathematics and Computer and Information Sciences were found to have similar vocabulary use patterns along with Engineering and Physics; while Chemistry and the Social Sciences were found to exhibit contrasting vocabulary use patterns along with the Earth Sciences and Chemistry. These results have implications to studies of scholarly communication as scholars attempt to identify the epistemological cultures of disciplines, and as a full text-based methodology could lead to machine learning applications in the automated classification of scholarly work according to disciplinary vocabularies. PMID:29186141
Generating description with multi-feature fusion and saliency maps of image

NASA Astrophysics Data System (ADS)

Liu, Lisha; Ding, Yuxuan; Tian, Chunna; Yuan, Bo

2018-04-01

Generating a description for an image can be regarded as visual understanding. It spans artificial intelligence, machine learning, natural language processing and many other areas. In this paper, we present a model that generates descriptions for images based on an RNN (recurrent neural network) with object attention and multiple image features. Deep recurrent neural networks have excellent performance in machine translation, so we use one to generate natural sentence descriptions for images. The proposed method uses a single CNN (convolutional neural network) trained on ImageNet to extract image features. However, we think this cannot adequately capture the content of images, since it may focus only on the object area of the image. We therefore add scene information to the image features using a CNN trained on Places205. Experiments show that the model with multiple features extracted by two CNNs performs better than the model with a single feature. In addition, we apply saliency weights to images to emphasize the salient objects. We evaluate our model on MSCOCO using public metrics, and the results show that our model performs better than several state-of-the-art methods.
Structural classification of proteins using texture descriptors extracted from the cellular automata image.

PubMed

Kavianpour, Hamidreza; Vasighi, Mahdi

2017-02-01

Knowledge of the cellular attributes of proteins plays an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of protein structural class is used by various methods for better understanding protein functionality and folding patterns. Computational methods and intelligent systems can play an important role in performing structural classification of proteins. Most protein sequences are stored in databanks as character strings, and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acid alphabets according to surrounding hydrophobicity index. Many important features which are hidden in these long binary sequences can be clearly displayed through their cellular automata images. The features extracted from these images are used to build a classification model by support vector machine. Compared with previous studies on several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help in revealing some inherent features deeply hidden in protein sequences and improve the quality of predicting protein structural class.
Precise on-machine extraction of the surface normal vector using an eddy current sensor array

NASA Astrophysics Data System (ADS)

Wang, Yongqing; Lian, Meng; Liu, Haibo; Ying, Yangwei; Sheng, Xianjun

2016-11-01

To satisfy the requirements of on-machine measurement of the surface normal during complex surface manufacturing, a highly robust normal vector extraction method using an Eddy current (EC) displacement sensor array is developed, the output of which is almost unaffected by surface brightness, machining coolant and environmental noise. A precise normal vector extraction model based on a triangular-distributed EC sensor array is first established. Calibration of the effects of object surface inclination and coupling interference on measurement results, and the relative position of EC sensors, is involved. A novel apparatus employing three EC sensors and a force transducer was designed, which can be easily integrated into the computer numerical control (CNC) machine tool spindle and/or robot terminal execution. Finally, to test the validity and practicability of the proposed method, typical experiments were conducted with specified testing pieces using the developed approach and system, such as an inclined plane and cylindrical and spherical surfaces.
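The geometric core of a three-sensor normal estimate, as described in the abstract above, can be sketched as follows: three measured surface points define a plane, and the unit normal is the normalized cross product of two edge vectors. This is only an idealized illustration. The sensor layout, the function name `surface_normal` and the sample readings are assumptions, and the calibration for surface inclination and coupling interference mentioned in the abstract is not modeled.

```python
import math


def surface_normal(p1, p2, p3):
    """Unit normal of the plane through three (x, y, z) points.

    The normal is oriented so that its z component is non-negative.
    """
    u = tuple(b - a for a, b in zip(p1, p2))  # edge p1 -> p2
    v = tuple(b - a for a, b in zip(p1, p3))  # edge p1 -> p3
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("points are collinear; the normal is undefined")
    n = tuple(c / length for c in n)
    if n[2] < 0:
        n = tuple(-c for c in n)
    return n


# Hypothetical readings: sensor (x, y) positions with measured heights z
# on a surface tilted 45 degrees about the y axis (z = x).
normal = surface_normal((0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 0.0))
tilt_deg = math.degrees(math.acos(normal[2]))
print(normal, tilt_deg)
```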

Difficulty understanding speech in noise by the hearing impaired: underlying causes and technological solutions.

PubMed

Healy, Eric W; Yoho, Sarah E

2016-08-01

A primary complaint of hearing-impaired individuals involves poor speech understanding when background noise is present. Hearing aids and cochlear implants often allow good speech understanding in quiet backgrounds. But hearing-impaired individuals are highly noise intolerant, and existing devices are not very effective at combating background noise. As a result, speech understanding in noise is often quite poor. In accord with the significance of the problem, considerable effort has been expended toward understanding and remedying this issue. Fortunately, our understanding of the underlying issues is reasonably good. In sharp contrast, effective solutions have remained elusive. One solution that seems promising involves a single-microphone machine-learning algorithm to extract speech from background noise. Data from our group indicate that the algorithm is capable of producing vast increases in speech understanding by hearing-impaired individuals. This paper will first provide an overview of the speech-in-noise problem and outline why hearing-impaired individuals are so noise intolerant. An overview of our approach to solving this problem will follow.
Integrative relational machine-learning for understanding drug side-effect profiles

PubMed Central

2013-01-01

Background: Drug side effects represent a common reason for stopping drug development during clinical trials. Improving our ability to understand drug side effects is necessary to reduce attrition rates during drug development as well as the risk of discovering novel side effects in available drugs. Today, most investigations deal with isolated side effects and overlook possible redundancy and their frequent co-occurrence. Results: In this work, drug annotations are collected from SIDER and DrugBank databases. Terms describing individual side effects reported in SIDER are clustered with a semantic similarity measure into term clusters (TCs). Maximal frequent itemsets are extracted from the resulting drug × TC binary table, leading to the identification of what we call side-effect profiles (SEPs). A SEP is defined as the longest combination of TCs which are shared by a significant number of drugs. Frequent SEPs are explored on the basis of integrated drug and target descriptors using two machine learning methods: decision trees and inductive logic programming. Although both methods yield explicit models, the inductive logic programming method performs relational learning and is able to exploit not only drug properties but also background knowledge. Learning efficiency is evaluated by cross-validation and direct testing with new molecules. Comparison of the two machine learning methods shows that the inductive logic programming method displays a greater sensitivity than decision trees and successfully exploits background knowledge such as functional annotations and pathways of drug targets, thereby producing rich and expressive rules. All models and theories are available on a dedicated web site. Conclusions: Side-effect profiles covering a significant number of drugs have been extracted from a drug × side-effect association table. Integration of background knowledge concerning both chemical and biological spaces has been combined with a relational learning method for discovering rules which explicitly characterize drug-SEP associations. These rules are successfully used for predicting SEPs associated with new drugs. PMID:23802887
Integrative relational machine-learning for understanding drug side-effect profiles.

PubMed

Bresso, Emmanuel; Grisoni, Renaud; Marchetti, Gino; Karaboga, Arnaud Sinan; Souchet, Michel; Devignes, Marie-Dominique; Smaïl-Tabbone, Malika

2013-06-26

Drug side effects represent a common reason for stopping drug development during clinical trials. Improving our ability to understand drug side effects is necessary to reduce attrition rates during drug development as well as the risk of discovering novel side effects in available drugs. Today, most investigations deal with isolated side effects and overlook possible redundancy and their frequent co-occurrence. In this work, drug annotations are collected from SIDER and DrugBank databases. Terms describing individual side effects reported in SIDER are clustered with a semantic similarity measure into term clusters (TCs). Maximal frequent itemsets are extracted from the resulting drug × TC binary table, leading to the identification of what we call side-effect profiles (SEPs). A SEP is defined as the longest combination of TCs which are shared by a significant number of drugs. Frequent SEPs are explored on the basis of integrated drug and target descriptors using two machine learning methods: decision trees and inductive logic programming. Although both methods yield explicit models, the inductive logic programming method performs relational learning and is able to exploit not only drug properties but also background knowledge. Learning efficiency is evaluated by cross-validation and direct testing with new molecules. Comparison of the two machine learning methods shows that the inductive logic programming method displays a greater sensitivity than decision trees and successfully exploits background knowledge such as functional annotations and pathways of drug targets, thereby producing rich and expressive rules. All models and theories are available on a dedicated web site. Side-effect profiles covering a significant number of drugs have been extracted from a drug × side-effect association table. Integration of background knowledge concerning both chemical and biological spaces has been combined with a relational learning method for discovering rules which explicitly characterize drug-SEP associations. These rules are successfully used for predicting SEPs associated with new drugs.
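To make the notion of maximal frequent itemsets over a drug × TC binary table more concrete, here is a small brute-force sketch. Each drug is represented as the set of term clusters annotated to it; an itemset is frequent if at least `min_support` drugs contain it, and maximal if no frequent superset exists. The level-wise search, the function names and the toy table are assumptions for illustration and are not the mining algorithm or data used by the authors.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    while current:
        level = {}
        for candidate in current:
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Join frequent k-itemsets that differ in one item to get (k+1)-candidates.
        next_candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == len(a) + 1:
                next_candidates.add(union)
        current = list(next_candidates)
    return frequent


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Toy drug x TC table: each set lists the term clusters of one hypothetical drug.
drugs = [
    {"TC1", "TC2", "TC3"},
    {"TC1", "TC2", "TC3"},
    {"TC1", "TC2"},
    {"TC4"},
    {"TC4", "TC5"},
]
profiles = maximal_itemsets(frequent_itemsets(drugs, min_support=2))
print(sorted(sorted(p) for p in profiles))
```

With this toy table and a support threshold of two drugs, the maximal itemsets correspond to the "longest combinations of TCs shared by enough drugs" in the sense of the abstract's SEP definition.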
Prostate cancer detection using machine learning techniques by employing combination of features extracting strategies.

PubMed

Hussain, Lal; Ahmed, Adeel; Saeed, Sharjil; Rathore, Saima; Awan, Imtiaz Ahmed; Shah, Saeed Arif; Majid, Abdul; Idris, Adnan; Awan, Anees Ahmed

2018-02-06

Prostate cancer is the second leading cause of cancer deaths among men. Early detection can effectively reduce the mortality rate caused by prostate cancer. Because of the high and multiple resolutions of prostate MRIs, proper diagnostic systems and tools are required. In the past, researchers developed computer-aided diagnosis (CAD) systems that help radiologists detect abnormalities. In this research paper, we employed novel machine learning techniques such as a Bayesian approach, support vector machine (SVM) kernels (polynomial, radial basis function (RBF) and Gaussian) and decision trees for detecting prostate cancer. Moreover, different feature extraction strategies are proposed to improve the detection performance. The feature extraction strategies are based on texture, morphological, scale invariant feature transform (SIFT), and elliptic Fourier descriptor (EFD) features. The performance was evaluated based on single features as well as combinations of features using machine learning classification techniques. Cross-validation (jack-knife k-fold) was performed and performance was evaluated in terms of the receiver operating characteristic (ROC) curve and specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV) and false positive rate (FPR). Based on single feature extraction strategies, the SVM Gaussian kernel gives the highest accuracy of 98.34% with an AUC of 0.999. Using combinations of feature extraction strategies, the SVM Gaussian kernel with texture + morphological and EFDs + morphological features gives the highest accuracy of 99.71% and an AUC of 1.00.
Mathematical model of simple spalling formation during coal cutting with extracting machine

NASA Astrophysics Data System (ADS)

Gabov, V. V.; Zadkov, D. A.

2018-05-01

A single-mass model of a rotor shearer is analyzed. It is shown that rotor mining machines have large moments of inertia and load dynamics. A model of an extraction module with selective movement of the cutting tool is presented. The distinguishing feature of such extracting machines is the fluid power drive of the cutter mechanism. They can operate steadily at large shear thickness, and locking modes are not an emergency for them. Compared with shearers, they have less inertial mass but a slower average cutting speed, and its momentary values depend on load. Based on the equation of hydraulic fuel consumption balance, the operation of the fluid power drive of the extracting module cutter mechanism together with a hydropneumatic accumulator is analyzed. A model of spalling formation during coal cutting with a fluid power drive cutter mechanism and potential energy stores is suggested. When the cutter speed is matched with the speed of main crack expansion and the amount of potential energy consumption, the cutter load is determined only by the ultimate stress at the crack pole and friction. Tests of an extracting module cutter in a full-size model confirmed the stated theory.
Agricultural mapping using Support Vector Machine-Based Endmember Extraction (SVM-BEE)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Archibald, Richard K; Filippi, Anthony M; Bhaduri, Budhendra L

Extracting endmembers from remotely sensed images of vegetated areas can present difficulties. In this research, we applied a recently developed endmember-extraction algorithm based on Support Vector Machines (SVMs) to the problem of semi-autonomous estimation of vegetation endmembers from a hyperspectral image. This algorithm, referred to as Support Vector Machine-Based Endmember Extraction (SVM-BEE), accurately and rapidly yields a computed representation of hyperspectral data that can accommodate multiple distributions. The number of distributions is identified without prior knowledge, based upon this representation. Prior work established that SVM-BEE is robustly noise-tolerant and can semi-automatically and effectively estimate endmembers; synthetic data and a geologic scene were previously analyzed. Here we compared the efficacies of the SVM-BEE and N-FINDR algorithms in extracting endmembers from a predominantly agricultural scene. SVM-BEE was able to estimate vegetation and other endmembers for all classes in the image, which N-FINDR failed to do. Classifications based on SVM-BEE endmembers were markedly more accurate compared with those based on N-FINDR endmembers.
Overview of machine vision methods in x-ray imaging and microtomography

NASA Astrophysics Data System (ADS)

Buzmakov, Alexey; Zolotov, Denis; Chukalina, Marina; Nikolaev, Dmitry; Gladkov, Andrey; Ingacheva, Anastasia; Yakimchuk, Ivan; Asadchikov, Victor

2018-04-01

Digital X-ray imaging has become widely used in science, medicine and non-destructive testing. This allows the use of modern digital image analysis for automatic information extraction and interpretation. We give a short review of scientific applications of machine vision in X-ray imaging and microtomography, including image processing, feature detection and extraction, image compression to increase camera throughput, microtomography reconstruction, visualization and setup adjustment.
Extracting semantics from audio-visual content: the final frontier in multimedia retrieval.

PubMed

Naphade, M R; Huang, T S

2002-01-01

Multimedia understanding is a fast emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review the state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem. We discuss how reliance on powerful pattern recognition and machine learning techniques is increasing in the field of multimedia retrieval. We review the state-of-the-art multimedia understanding systems with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context and the various mechanisms for modeling concepts and context.
Espresso coffee foam delays cooling of the liquid phase.

PubMed

Arii, Yasuhiro; Nishizawa, Kaho

2017-04-01

Espresso coffee foam, called crema, is known to be a marker of the quality of espresso coffee extraction. However, the role of foam in coffee temperature has not been quantitatively clarified. In this study, we used an automatic machine for espresso coffee extraction. We evaluated whether the foam prepared using the machine was suitable for foam analysis. After extraction, the percentage and consistency of the foam were measured using various techniques, and changes in the foam volume were tracked over time. Our extraction method, therefore, allowed consistent preparation of high-quality foam. We also quantitatively determined that the foam phase slowed cooling of the liquid phase after extraction. High-quality foam plays an important role in delaying the cooling of espresso coffee.
Extraction of inland Nypa fruticans (Nipa Palm) using Support Vector Machine

NASA Astrophysics Data System (ADS)

Alberto, R. T.; Serrano, S. C.; Damian, G. B.; Camaso, E. E.; Biagtan, A. R.; Panuyas, N. Z.; Quibuyen, J. S.

2017-09-01

Mangroves are considered one of the major habitats in coastal ecosystems, providing many economic and ecological services to human society. Nypa fruticans (Nipa palm) is one of the important mangrove species because of its versatility and uniqueness as a halophytic palm. However, nipas are not only adaptable to saline areas; they can also thrive away from the coastline, depending on the favorable soil types available in the area. Because of this, mapping of this species is not limited to near-shore areas but extends to any area where the species is present. The extraction of Nypa fruticans was carried out using the available LiDAR data. A Support Vector Machine (SVM) classification process was used to extract nipas in inland areas. The SVM classification process in mapping Nypa fruticans produced a high accuracy of 95+%. The SVM classification process for extracting inland nipas proved effective when different terrain derivatives from LiDAR data were used.
Automated anatomical labeling of bronchial branches extracted from CT datasets based on machine learning and combination optimization and its application to bronchoscope guidance.

PubMed

Mori, Kensaku; Ota, Shunsuke; Deguchi, Daisuke; Kitasaka, Takayuki; Suenaga, Yasuhito; Iwano, Shingo; Hasegawa, Yosihnori; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi

2009-01-01

This paper presents a method for the automated anatomical labeling of bronchial branches extracted from 3D CT images based on machine learning and combination optimization. We also show applications of anatomical labeling on a bronchoscopy guidance system. This paper performs automated labeling by using machine learning and combination optimization. The actual procedure consists of four steps: (a) extraction of tree structures of the bronchus regions extracted from CT images, (b) construction of AdaBoost classifiers, (c) computation of candidate names for all branches by using the classifiers, (d) selection of best combination of anatomical names. We applied the proposed method to 90 cases of 3D CT datasets. The experimental results showed that the proposed method can assign correct anatomical names to 86.9% of the bronchial branches up to the sub-segmental lobe branches. Also, we overlaid the anatomical names of bronchial branches on real bronchoscopic views to guide real bronchoscopy.
Mechatronics technology in predictive maintenance method

NASA Astrophysics Data System (ADS)

Majid, Nurul Afiqah A.; Muthalif, Asan G. A.

2017-11-01

This paper presents recent mechatronics technology that can help to implement predictive maintenance by combining intelligent and predictive maintenance instruments. The Vibration Fault Simulation System (VFSS) is an example of a mechatronics system. The study focuses on predicting the condition of critical machines by detecting vibration. Vibration measurement is often used as the key indicator of the state of the machine. This paper shows how to choose an appropriate strategy for the vibration diagnostic process of mechanical systems, especially rotating machines, in order to recognize failures during operation. In this paper, vibration signature analysis is implemented to detect faults in rotating machinery, including imbalance, mechanical looseness, bent shaft, misalignment, missing blade bearing fault, balancing mass and critical speed. In order to perform vibration signature analysis for rotating machinery faults, studies have been made on how mechatronics technology is used as a predictive maintenance method. The Vibration Faults Simulation Rig (VFSR) is designed to simulate and understand fault signatures. These techniques are based on the processing of vibration data in the frequency domain. LabVIEW-based spectrum analyzer software is developed to acquire and extract the frequency content of fault signals. The system was successfully tested using the unique vibration fault signatures that occur in rotating machinery.
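The frequency-domain step described in this abstract can be illustrated with a short sketch. It is not the authors' LabVIEW software: it is a plain discrete Fourier transform in Python, and the sample rate, the 25 Hz rotation frequency, and the function names are hypothetical values chosen for illustration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the normalized magnitude of each DFT bin from DC up to Nyquist."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


def dominant_frequency(samples, sample_rate_hz):
    """Frequency in Hz of the strongest non-DC spectral component."""
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate_hz / len(samples)


# Hypothetical vibration signal: a 25 Hz component plus a weaker 50 Hz harmonic.
SAMPLE_RATE_HZ = 200
signal = [math.sin(2 * math.pi * 25 * t / SAMPLE_RATE_HZ)
          + 0.3 * math.sin(2 * math.pi * 50 * t / SAMPLE_RATE_HZ)
          for t in range(200)]

peak_hz = dominant_frequency(signal, SAMPLE_RATE_HZ)
print(f"dominant frequency: {peak_hz:.1f} Hz")
```

In a signature-analysis setting, the location and relative height of such spectral peaks would be compared against known fault patterns; that comparison is outside this sketch.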
When Machines Think: Radiology's Next Frontier.

PubMed

Dreyer, Keith J; Geis, J Raymond

2017-12-01

Artificial intelligence (AI), machine learning, and deep learning are terms now seen frequently, all of which refer to computer algorithms that change as they are exposed to more data. Many of these algorithms are surprisingly good at recognizing objects in images. The combination of large amounts of machine-consumable digital data, increased and cheaper computing power, and increasingly sophisticated statistical models combine to enable machines to find patterns in data in ways that are not only cost-effective but also potentially beyond humans' abilities. Building an AI algorithm can be surprisingly easy. Understanding the associated data structures and statistics, on the other hand, is often difficult and obscure. Converting the algorithm into a sophisticated product that works consistently in broad, general clinical use is complex and incompletely understood. To show how these AI products reduce costs and improve outcomes will require clinical translation and industrial-grade integration into routine workflow. Radiology has the chance to leverage AI to become a center of intelligently aggregated, quantitative, diagnostic information. Centaur radiologists, formed as a synergy of human plus computer, will provide interpretations using data extracted from images by humans and image-analysis computer algorithms, as well as the electronic health record, genomics, and other disparate sources. These interpretations will form the foundation of precision health care, or care customized to an individual patient. © RSNA, 2017.
Exploring Characterizations of Learning Object Repositories Using Data Mining Techniques

NASA Astrophysics Data System (ADS)

Segura, Alejandra; Vidal, Christian; Menendez, Victor; Zapata, Alfredo; Prieto, Manuel

Learning object repositories provide a platform for the sharing of Web-based educational resources. As these repositories evolve independently, it is difficult for users to have a clear picture of the kind of contents they give access to. Metadata can be used to automatically extract a characterization of these resources by using machine learning techniques. This paper presents an exploratory study carried out on the contents of four public repositories that uses clustering and association rule mining algorithms to extract characterizations of repository contents. The results of the analysis include potential relationships between different attributes of learning objects that may be useful to gain an understanding of the kind of resources available and eventually develop search mechanisms that consider repository descriptions as a criterion in federated search.
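As a rough illustration of what association rule mining measures, the sketch below computes the support and confidence of one rule over a handful of metadata records. The records and attribute names are hypothetical and do not come from the study; a real miner would also search over candidate rules rather than score a single one.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(1 for r in records if items <= r) / len(records)


def confidence(records, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the records."""
    joint = support(records, set(antecedent) | set(consequent))
    return joint / support(records, antecedent)


# Hypothetical learning-object metadata, one set of attribute=value pairs each.
records = [
    {"format=video", "level=higher-ed", "lang=en"},
    {"format=video", "level=higher-ed", "lang=es"},
    {"format=text", "level=school", "lang=en"},
    {"format=video", "level=school", "lang=en"},
]

# Rule under test: format=video -> level=higher-ed
rule_support = support(records, {"format=video", "level=higher-ed"})
rule_confidence = confidence(records, {"format=video"}, {"level=higher-ed"})
print(rule_support, rule_confidence)
```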
Skeletonization with hollow detection on gray image by gray weighted distance transform

NASA Astrophysics Data System (ADS)

Bhattacharya, Prabir; Qian, Kai; Cao, Siqi; Qian, Yi

1998-10-01

A skeletonization algorithm that can be used to process non-uniformly distributed gray-scale images with hollows is presented. The algorithm is based on the Gray Weighted Distance Transformation. The process includes a preliminary phase that investigates the hollows in the gray-scale image and decides whether they are treated as topological constraints for the skeleton structure, depending on their statistically significant depth. We then extract the resulting skeleton, which carries meaningful information for understanding the object in the image. This improved algorithm can overcome possible misinterpretation of some complicated images in the extracted skeleton, especially in images with asymmetric hollows and asymmetric features. The algorithm can be executed on a parallel machine because all operations are performed locally. Some examples are discussed to illustrate the algorithm.
Calculation of parameters of technological equipment for deep-sea mining

NASA Astrophysics Data System (ADS)

Yungmeister, D. A.; Ivanov, S. E.; Isaev, A. I.

2018-03-01

The actual problem of extracting minerals from the bottom of the world ocean is considered. On the ocean floor, three types of minerals are of interest: iron-manganese concretions (IMC), cobalt-manganese crusts (CMC) and sulphides. The analysis of known designs of machines and complexes for the extraction of IMC is performed. These machines are based on the principle of excavating the bottom surface; however such methods do not always correspond to “gentle” methods of mining. The ecological purity of such mining methods does not meet the necessary requirements. Such machines require the transmission of high electric power through the water column, which in some cases is a significant challenge. The authors analyzed the options of transportation of the extracted mineral from the bottom. The paper describes the design of machines that collect IMC by the method of vacuum suction. In this method, the gripping plates or drums are provided with cavities in which a vacuum is created and individual IMC are attracted to the devices by a pressure drop. The work of such machines can be called “gentle” processing technology of the bottom areas. Their environmental impact is significantly lower than mechanical devices that carry out the raking of IMC. The parameters of the device for lifting the IMC collected on the bottom are calculated. With the use of Kevlar ropes of serial production up to 0.06 meters in diameter, with a cycle time of up to 2 hours and a lifting speed of up to 3 meters per second, a productivity of about 400,000 tons per year can be realized for IMC. The development of machines based on the calculated parameters and approbation of their designs will create a unique complex for the extraction of minerals at oceanic deposits.
Prediction and Factor Extraction of Drug Function by Analyzing Medical Records in Developing Countries.

PubMed

Hu, Min; Nohara, Yasunobu; Nakamura, Masafumi; Nakashima, Naoki

2017-01-01

The World Health Organization has declared Bangladesh one of 58 countries facing acute Human Resources for Health (HRH) crisis. Artificial intelligence in healthcare has been shown to be successful for diagnostics. Using machine learning to predict pharmaceutical prescriptions may solve HRH crises. In this study, we investigate a predictive model by analyzing prescription data of 4,543 subjects in Bangladesh. We predict the function of prescribed drugs, comparing three machine-learning approaches. The approaches compare whether a subject shall be prescribed medicine from the 21 most frequently prescribed drug functions. Receiver Operating Characteristics (ROC) were selected as a way to evaluate and assess prediction models. The results show the drug function with the best prediction performance was oral hypoglycemic drugs, which has an average AUC of 0.962. To understand how the variables affect prediction, we conducted factor analysis based on tree-based algorithms and natural language processing techniques.
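For readers unfamiliar with the AUC figure quoted above, the following sketch computes the area under the ROC curve using its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The labels and scores are made up for illustration and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth (1 = drug function prescribed) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every positive outranks every negative, while 0.5 corresponds to random ranking.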
Nonlinear machine learning and design of reconfigurable digital colloids.

PubMed

Long, Andrew W; Phillips, Carolyn L; Jankowksi, Eric; Ferguson, Andrew L

2016-09-14

Digital colloids, a cluster of freely rotating "halo" particles tethered to the surface of a central particle, were recently proposed as ultra-high density memory elements for information storage. Rational design of these digital colloids for memory storage applications requires a quantitative understanding of the thermodynamic and kinetic stability of the configurational states within which information is stored. We apply nonlinear machine learning to Brownian dynamics simulations of these digital colloids to extract the low-dimensional intrinsic manifold governing digital colloid morphology, thermodynamics, and kinetics. By modulating the relative size ratio between halo particles and central particles, we investigate the size-dependent configurational stability and transition kinetics for the 2-state tetrahedral (N = 4) and 30-state octahedral (N = 6) digital colloids. We demonstrate the use of this framework to guide the rational design of a memory storage element to hold a block of text that trades off the competing design criteria of memory addressability and volatility.
A Method for Extracting Important Segments from Documents Using Support Vector Machines

NASA Astrophysics Data System (ADS)

Suzuki, Daisuke; Utsumi, Akira

In this paper we propose an extraction-based method for automatic summarization. The proposed method consists of two processes: important segment extraction and sentence compaction. The process of important segment extraction classifies each segment in a document as important or not by Support Vector Machines (SVMs). The process of sentence compaction then determines grammatically appropriate portions of a sentence for a summary according to its dependency structure and the classification result by SVMs. To test the performance of our method, we conducted an evaluation experiment using the Text Summarization Challenge (TSC-1) corpus of human-prepared summaries. The result was that our method achieved better performance than a segment-extraction-only method and the Lead method, especially for sentences only a part of which was included in human summaries. Further analysis of the experimental results suggests that a hybrid method that integrates sentence extraction with segment extraction may generate better summaries.
Automatic detection of Martian dark slope streaks by machine learning using HiRISE images

NASA Astrophysics Data System (ADS)

Wang, Yexin; Di, Kaichang; Xin, Xin; Wan, Wenhui

2017-07-01

Dark slope streaks (DSSs) on the Martian surface are one of the active geologic features that can be observed on Mars nowadays. The detection of DSS is a prerequisite for studying its appearance, morphology, and distribution to reveal its underlying geological mechanisms. In addition, increasingly massive amounts of Mars high resolution data are now available. Hence, an automatic detection method for locating DSSs is highly desirable. In this research, we present an automatic DSS detection method by combining interest region extraction and machine learning techniques. The interest region extraction combines gradient and regional grayscale information. Moreover, a novel recognition strategy is proposed that takes the normalized minimum bounding rectangles (MBRs) of the extracted regions to calculate the Local Binary Pattern (LBP) feature and train a DSS classifier using the Adaboost machine learning algorithm. Comparative experiments using five different feature descriptors and three different machine learning algorithms show the superiority of the proposed method. Experimental results utilizing 888 extracted region samples from 28 HiRISE images show that the overall detection accuracy of our proposed method is 92.4%, with a true positive rate of 79.1% and false positive rate of 3.7%, which in particular indicates great performance of the method at eliminating non-DSS regions.
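The Local Binary Pattern feature mentioned in this abstract can be sketched as follows. This is the basic 3x3, 8-neighbour variant, which is an assumption: the abstract does not state which LBP variant or neighbourhood the authors used, and the normalization of the bounding rectangles is not shown here.

```python
def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): a bit is set where neighbour >= centre."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Hypothetical 3x3 grayscale patch: bright bottom row, dark elsewhere.
patch = [
    [10, 10, 10],
    [10, 50, 10],
    [90, 90, 90],
]

code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
print(code)
```

The histogram, rather than the individual codes, is what would typically be passed to a classifier such as AdaBoost.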

An active role for machine learning in drug development

PubMed Central

Murphy, Robert F.

2014-01-01

Due to the complexity of biological systems, cutting-edge machine-learning methods will be critical for future drug development. In particular, machine-vision methods to extract detailed information from imaging assays and active-learning methods to guide experimentation will be required to overcome the dimensionality problem in drug development. PMID:21587249
A combination of feature extraction methods with an ensemble of different classifiers for protein structural class prediction problem.

PubMed

Dehzangi, Abdollah; Paliwal, Kuldip; Sharma, Alok; Dehzangi, Omid; Sattar, Abdul

2013-01-01

Better understanding of structural class of a given protein reveals important information about its overall folding type and its domain. It can also be directly used to provide critical information on general tertiary structure of a protein which has a profound impact on protein function determination and drug design. Despite tremendous enhancements made by pattern recognition-based approaches to solve this problem, it still remains as an unsolved issue for bioinformatics that demands more attention and exploration. In this study, we propose a novel feature extraction model that incorporates physicochemical and evolutionary-based information simultaneously. We also propose overlapped segmented distribution and autocorrelation-based feature extraction methods to provide more local and global discriminatory information. The proposed feature extraction methods are explored for 15 most promising attributes that are selected from a wide range of physicochemical-based attributes. Finally, by applying an ensemble of different classifiers namely, Adaboost.M1, LogitBoost, naive Bayes, multilayer perceptron (MLP), and support vector machine (SVM) we show enhancement of the protein structural class prediction accuracy for four popular benchmarks.
Machine Learning and Data Mining Methods in Diabetes Research.

PubMed

Kavakiotis, Ioannis; Tsave, Olga; Salifoglou, Athanasios; Maglaveras, Nicos; Vlahavas, Ioannis; Chouvarda, Ioanna

2017-01-01

The remarkable advances in biotechnology and health sciences have led to a significant production of data, such as high throughput genetic data and clinical information, generated from large Electronic Health Records (EHRs). To this end, application of machine learning and data mining methods in biosciences is presently, more than ever before, vital and indispensable in efforts to transform intelligently all available information into valuable knowledge. Diabetes mellitus (DM) is defined as a group of metabolic disorders exerting significant pressure on human health worldwide. Extensive research in all aspects of diabetes (diagnosis, etiopathophysiology, therapy, etc.) has led to the generation of huge amounts of data. The aim of the present study is to conduct a systematic review of the applications of machine learning, data mining techniques and tools in the field of diabetes research with respect to a) Prediction and Diagnosis, b) Diabetic Complications, c) Genetic Background and Environment, and d) Health Care and Management with the first category appearing to be the most popular. A wide range of machine learning algorithms were employed. In general, 85% of those used were characterized by supervised learning approaches and 15% by unsupervised ones, and more specifically, association rules. Support vector machines (SVM) arise as the most successful and widely used algorithm. Concerning the type of data, clinical datasets were mainly used. The title applications in the selected articles project the usefulness of extracting valuable knowledge leading to new hypotheses targeting deeper understanding and further investigation in DM.
General method of pattern classification using the two-domain theory

NASA Technical Reports Server (NTRS)

Rorvig, Mark E. (Inventor)

1993-01-01

Human beings judge patterns (such as images) by complex mental processes, some of which may not be known, while computing machines extract features. By representing the human judgements with simple measurements and reducing them and the machine extracted features to a common metric space and fitting them by regression, the judgements of human experts rendered on a sample of patterns may be imposed on a pattern population to provide automatic classification.
General method of pattern classification using the two-domain theory

NASA Technical Reports Server (NTRS)

Rorvig, Mark E. (Inventor)

1990-01-01

Human beings judge patterns (such as images) by complex mental processes, some of which may not be known, while computing machines extract features. By representing the human judgements with simple measurements and reducing them and the machine extracted features to a common metric space and fitting them by regression, the judgements of human experts rendered on a sample of patterns may be imposed on a pattern population to provide automatic classification.
Graph theory for feature extraction and classification: a migraine pathology case study.

PubMed

Jorge-Hernandez, Fernando; Garcia Chimeno, Yolanda; Garcia-Zapirain, Begonya; Cabrera Zubizarreta, Alberto; Gomez Beldarrain, Maria Angeles; Fernandez-Ruanova, Begonya

2014-01-01

Graph theory is also widely used as a representational form and characterization of brain connectivity network, as is machine learning for classifying groups depending on the features extracted from images. Many of these studies use different techniques, such as preprocessing, correlations, features or algorithms. This paper proposes an automatic tool to perform a standard process using images of the Magnetic Resonance Imaging (MRI) machine. The process includes pre-processing, building the graph per subject with different correlations, atlas, relevant feature extraction according to the literature, and finally providing a set of machine learning algorithms which can produce analyzable results for physicians or specialists. In order to verify the process, a set of images from prescription drug abusers and patients with migraine have been used. In this way, the proper functioning of the tool has been proved, providing results of 87% and 92% of success depending on the classifier used.
Binary classification of items of interest in a repeatable process

DOEpatents

Abell, Jeffrey A; Spicer, John Patrick; Wincek, Michael Anthony; Wang, Hui; Chakraborty, Debejyo

2015-01-06

A system includes host and learning machines. Each machine has a processor in electrical communication with at least one sensor. Instructions for predicting a binary quality status of an item of interest during a repeatable process are recorded in memory. The binary quality status includes passing and failing binary classes. The learning machine receives signals from the at least one sensor and identifies candidate features. Features are extracted from the candidate features, each more predictive of the binary quality status. The extracted features are mapped to a dimensional space having a number of dimensions proportional to the number of extracted features. The dimensional space includes most of the passing class and excludes at least 90 percent of the failing class. Received signals are compared to the boundaries of the recorded dimensional space to predict, in real time, the binary quality status of a subsequent item of interest.
A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries.

PubMed

Jiang, Min; Chen, Yukun; Liu, Mei; Rosenbloom, S Trent; Mani, Subramani; Denny, Joshua C; Xu, Hua

2011-01-01

The authors' goal was to develop and evaluate machine-learning-based approaches to extracting clinical entities (including medical problems, tests, and treatments, as well as their asserted status) from hospital discharge summaries written in natural language. This project was part of the 2010 Center of Informatics for Integrating Biology and the Bedside/Veterans Affairs (VA) natural-language-processing challenge. The authors implemented a machine-learning-based named entity recognition system for clinical text and systematically evaluated the contributions of different types of features and ML algorithms, using a training corpus of 349 annotated notes. Based on the results from training data, the authors developed a novel hybrid clinical entity extraction system, which integrated heuristic rule-based modules with the ML-based named entity recognition module. The authors applied the hybrid system to the concept extraction and assertion classification tasks in the challenge and evaluated its performance using a test data set with 477 annotated notes. Standard measures including precision, recall, and F-measure were calculated using the evaluation script provided by the Center of Informatics for Integrating Biology and the Bedside/VA challenge organizers. The overall performance for all three types of clinical entities and all six types of assertions across 477 annotated notes was considered the primary metric in the challenge. Systematic evaluation on the training set showed that Conditional Random Fields outperformed Support Vector Machines, and semantic information from existing natural-language-processing systems largely improved performance, although contributions from different types of features varied. The authors' hybrid entity extraction system achieved a maximum overall F-score of 0.8391 for concept extraction (ranked second) and 0.9313 for assertion classification (ranked fourth, but not statistically different than the first three systems) on the test data set in the challenge.
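The precision, recall, and F-measure reported above can be illustrated with a small sketch. It assumes exact-match scoring of (start, end, type) spans, which may differ from the official challenge script, and the annotations are invented for the example.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall, and balanced F-measure over span sets."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (start_token, end_token, entity_type).
gold = {(0, 2, "problem"), (5, 6, "test"),
        (9, 11, "treatment"), (14, 15, "problem")}
predicted = {(0, 2, "problem"), (5, 6, "test"), (9, 10, "treatment")}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Here the third prediction has the wrong end boundary, so under exact matching it counts as both a false positive and a missed gold entity.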
Large-scale deep learning for robotically gathered imagery for science

NASA Astrophysics Data System (ADS)

Skinner, K.; Johnson-Roberson, M.; Li, J.; Iscar, E.

2016-12-01

With the explosion of computing power, the intelligence and capability of mobile robotics has dramatically increased over the last two decades. Today, we can deploy autonomous robots to achieve observations in a variety of environments ripe for scientific exploration. These platforms are capable of gathering a volume of data previously unimaginable. Additionally, optical cameras, driven by mobile phones and consumer photography, have rapidly improved in size, power consumption, and quality making their deployment cheaper and easier. Finally, in parallel we have seen the rise of large-scale machine learning approaches, particularly deep neural networks (DNNs), increasing the quality of the semantic understanding that can be automatically extracted from optical imagery. In concert this enables new science using a combination of machine learning and robotics. This work will discuss the application of new low-cost high-performance computing approaches and the associated software frameworks to enable scientists to rapidly extract useful science data from millions of robotically gathered images. The automated analysis of imagery on this scale opens up new avenues of inquiry unavailable using more traditional manual or semi-automated approaches. We will use a large archive of millions of benthic images gathered with an autonomous underwater vehicle to demonstrate how these tools enable new scientific questions to be posed.
Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation

PubMed Central

Ibrahim, Ali; Gastaldo, Paolo; Chible, Hussein; Valle, Maurizio

2017-01-01

Enabling touch-sensing capability would help appliances understand interaction behaviors with their surroundings. Many recent studies are focusing on the development of electronic skin because of its necessity in various application domains, namely autonomous artificial intelligence (e.g., robots), biomedical instrumentation, and replacement prosthetic devices. An essential task of the electronic skin system is to locally process the tactile data and send structured information either to mimic human skin or to respond to the application demands. The electronic skin must be fabricated together with an embedded electronic system which has the role of acquiring the tactile data, processing, and extracting structured information. On the other hand, processing tactile data requires efficient methods to extract meaningful information from raw sensor data. Machine learning represents an effective method for data analysis in many domains: it has recently demonstrated its effectiveness in processing tactile sensor data. In this framework, this paper presents the implementation of digital signal processing based on FPGAs for tactile data processing. It provides the implementation of a tensorial kernel function for a machine learning approach. Implementation results are assessed by highlighting the FPGA resource utilization and power consumption. Results demonstrate the feasibility of the proposed implementation when real-time classification of input touch modalities is targeted. PMID:28287448
Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation.

PubMed

Ibrahim, Ali; Gastaldo, Paolo; Chible, Hussein; Valle, Maurizio

2017-03-10

Enabling touch-sensing capability would help appliances understand interaction behaviors with their surroundings. Many recent studies are focusing on the development of electronic skin because of its necessity in various application domains, namely autonomous artificial intelligence (e.g., robots), biomedical instrumentation, and replacement prosthetic devices. An essential task of the electronic skin system is to locally process the tactile data and send structured information either to mimic human skin or to respond to the application demands. The electronic skin must be fabricated together with an embedded electronic system which has the role of acquiring the tactile data, processing, and extracting structured information. On the other hand, processing tactile data requires efficient methods to extract meaningful information from raw sensor data. Machine learning represents an effective method for data analysis in many domains: it has recently demonstrated its effectiveness in processing tactile sensor data. In this framework, this paper presents the implementation of digital signal processing based on FPGAs for tactile data processing. It provides the implementation of a tensorial kernel function for a machine learning approach. Implementation results are assessed by highlighting the FPGA resource utilization and power consumption. Results demonstrate the feasibility of the proposed implementation when real-time classification of input touch modalities is targeted.
Young’s modulus and Poisson’s ratio changes due to machining in porous microcracked cordierite

DOE PAGES

Cooper, R. C.; Bruno, Giovanni; Onel, Yener; ...

2016-07-25

Microstructural changes in porous cordierite caused by machining were characterized using microtensile testing, X-ray computed tomography and scanning electron microscopy. Young’s moduli and Poisson’s ratios were determined on ~215-380 µm thick machined samples by combining digital image correlation and microtensile loading. The results provide evidence for an increase in microcrack density due to machining of the thin samples extracted from diesel particulate filter honeycombs.
Noninvasive extraction of fetal electrocardiogram based on Support Vector Machine

NASA Astrophysics Data System (ADS)

Fu, Yumei; Xiang, Shihan; Chen, Tianyi; Zhou, Ping; Huang, Weiyan

2015-10-01

The fetal electrocardiogram (FECG) signal has important clinical value for doctors in diagnosing fetal heart diseases and choosing suitable therapeutic schemes. The noninvasive extraction of the FECG from electrocardiogram (ECG) signals has therefore become an active research topic. A new method, the Support Vector Machine (SVM), is used to extract the FECG from a limited amount of data. First, the theory of the SVM and the principle of SVM-based extraction are studied. Second, the transformation of the maternal electrocardiogram (MECG) component in the abdominal composite signal is verified to be nonlinear and is fitted with the SVM. The SVM is then trained, and the training results are compared with the real data to confirm the effect of the training. Meanwhile, the parameters of the SVM are optimized to achieve the best performance so that the learning machine can be used to fit unknown samples. Finally, the FECG is extracted by removing the optimal estimate of the MECG component from the abdominal composite signal. The Signal-to-Noise Ratio (SNR) and a visual test are used to evaluate the performance of SVM-based FECG extraction. The experimental results show that a good-quality FECG can be extracted: its SNR increases significantly, to as high as 9.2349 dB, and the time cost decreases significantly, to as low as 0.802 seconds. Compared with the traditional method, the SVM-based noninvasive extraction method is simpler to implement and offers shorter processing time and better extraction quality under the same conditions.
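The final extraction step and the SNR evaluation can be sketched as below. The SVM itself is not reproduced: a scaled copy of the maternal component stands in for its output, and the SNR definition (reference power over residual error power, in decibels) is a common convention that the abstract does not spell out. All signals are synthetic placeholders.

```python
import math


def subtract_estimate(abdominal, mecg_estimate):
    """Remove the estimated maternal component from the composite signal."""
    return [a - m for a, m in zip(abdominal, mecg_estimate)]


def snr_db(reference, estimate):
    """SNR in dB: power of the reference over power of the estimation error."""
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical synthetic signals (not physiologically realistic).
fecg_true = [0.0, 1.0, 0.0, -1.0] * 5
mecg_true = [4.0, 0.0, -4.0, 0.0] * 5
abdominal = [f + m for f, m in zip(fecg_true, mecg_true)]

# Stand-in for the trained model's output: an imperfect MECG estimate.
mecg_estimate = [0.975 * m for m in mecg_true]

fecg_extracted = subtract_estimate(abdominal, mecg_estimate)
quality_db = snr_db(fecg_true, fecg_extracted)
print(f"SNR = {quality_db:.2f} dB")
```

Computing SNR this way requires a known reference FECG, which is available for simulated data but generally not for clinical recordings.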
A feasibility study of automatic lung nodule detection in chest digital tomosynthesis with machine learning based on support vector machine

NASA Astrophysics Data System (ADS)

Lee, Donghoon; Kim, Ye-seul; Choi, Sunghoon; Lee, Haenghwa; Jo, Byungdu; Choi, Seungyeon; Shin, Jungwook; Kim, Hee-Joung

2017-03-01

Chest digital tomosynthesis (CDT) is a recently developed medical device that has several advantages for diagnosing lung disease. For example, CDT provides depth information at a relatively low radiation dose compared with computed tomography (CT). However, a major problem with CDT is the image artifacts associated with data incompleteness, which results from limited-angle data acquisition in the CDT geometry. For this reason, its sensitivity for lung disease has not been clear compared with CT. In this study, to improve the sensitivity of lung disease detection in CDT, we developed computer-aided diagnosis (CAD) systems based on machine learning. To design the CAD systems, we used 100 cropped lung-nodule images and 100 cropped normal-lesion images acquired with lung man phantoms and a prototype CDT system. We used machine learning techniques based on the support vector machine (SVM) and the Gabor filter. The Gabor filter was used to extract the characteristics of lung nodules, and we compared its feature-extraction performance for various scale and orientation parameters: 3, 4, and 5 scales and 4, 6, and 8 orientations. After feature extraction, the SVM was used to classify the lesion features. The linear, polynomial, and Gaussian kernels of the SVM were compared to determine the best SVM conditions for CDT reconstruction images. The results showed that the CAD system with machine learning is capable of automatic lung lesion detection. Furthermore, detection performance was best when a Gabor filter with 5 scales and 8 orientations and an SVM with a Gaussian kernel were used. In conclusion, the proposed CAD system improved the sensitivity of lung lesion detection in CDT, and we determined the Gabor filter and SVM conditions that achieve higher detection performance for our CAD system.
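The Gabor filter bank described in this abstract (several scales crossed with several orientations) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the kernel size, the half-octave wavelength spacing, the sigma-to-wavelength ratio, and the names `gabor_kernel` and `gabor_bank` are all assumptions chosen for the example. Only the best-performing 5 x 8 configuration is taken from the abstract.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel (list of rows); size should be odd."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def gabor_bank(n_scales=5, n_orientations=8, size=15, base_wavelength=4.0):
    """Build n_scales * n_orientations kernels (spacing values are assumptions)."""
    bank = []
    for s in range(n_scales):
        wavelength = base_wavelength * 2 ** (s / 2)  # assumed half-octave steps
        sigma = 0.56 * wavelength                    # assumed bandwidth setting
        for k in range(n_orientations):
            theta = k * math.pi / n_orientations     # evenly spaced in [0, pi)
            bank.append(gabor_kernel(size, wavelength, theta, sigma))
    return bank

bank = gabor_bank()  # 5 scales x 8 orientations
```

Convolving a cropped image with each kernel and summarizing the responses (for example by mean and variance) would yield a feature vector that a classifier such as an SVM could consume; that step is omitted here.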
NASA's online machine aided indexing system

NASA Technical Reports Server (NTRS)

Silvester, June P.; Genuardi, Michael T.; Klingbiel, Paul H.

1993-01-01

This report describes the NASA Lexical Dictionary, a machine aided indexing system used online at the National Aeronautics and Space Administration's Center for Aerospace Information (CASI). This system is comprised of a text processor that is based on the computational, non-syntactic analysis of input text, and an extensive 'knowledge base' that serves to recognize and translate text-extracted concepts. The structure and function of the various NLD system components are described in detail. Methods used for the development of the knowledge base are discussed. Particular attention is given to a statistically-based text analysis program that provides the knowledge base developer with a list of concept-specific phrases extracted from large textual corpora. Production and quality benefits resulting from the integration of machine aided indexing at CASI are discussed along with a number of secondary applications of NLD-derived systems including on-line spell checking and machine aided lexicography.
The Effect of Hierarchical Micro/Nanotextured Titanium Implants on Osseointegration Immediately After Tooth Extraction in Beagle Dogs.

PubMed

Fu, Qian; Bellare, Anuj; Cui, Yajun; Cheng, Bingkun; Xu, Shanshan; Kong, Liang

2017-06-01

Because it simplifies the operation and shortens the overall duration of treatment, immediate implantation has been well received by patients and dentists. Because the outcome of immediate implantation is determined by osseointegration, we fabricated micro/nanotextured titanium implants to improve osseointegration immediately after tooth extraction. The aim of this study was to investigate the effect of hierarchical micro/nanotextured titanium implants on osseointegration immediately after tooth extraction. The micro/nanotextured titanium implants were fabricated by etching with 0.5 wt% hydrofluoric (HF) acid followed by anodization in HF electrolytes. Implants with a machined surface as well as implants with a microtextured surface prepared by 0.5 wt% HF etching served as control groups. The machined, microtextured, and micro/nanotextured implants were inserted into fresh sockets immediately after tooth extraction in beagle dogs. Twelve weeks after implantation, the animals were sacrificed for micro-CT scanning, histological analysis and biomechanical testing. The micro-CT imaging revealed that the bone volume/total volume (BV/TV) and trabecular thickness (Tb.Th) in the micro/nanotextured group were significantly higher than those in the machined and microtextured groups, and the trabecular separation (Tb.Sp) in the micro/nanotextured group was significantly lower than that in the other groups. For the histological analysis, the bone-to-implant contact in the machined, micro and micro/nanotextured groups was 47.13 ± 6.2%, 54.29 ± 4.18%, and 63.38 ± 7.63%, respectively, and the differences were significant. The maximum pull-out force in the machined, micro, and micro/nanotextured groups was 216.58 ± 38.71 N, 259.42 ± 28.93 N, and 284.73 ± 47.09 N, respectively. The results indicated that implants with a hierarchical micro/nanotextured surface can promote osseointegration immediately after tooth extraction. © 2016 Wiley Periodicals, Inc.
Spoken language identification based on the enhanced self-adjusting extreme learning machine approach.

PubMed

Albadr, Musatafa Abbas Abbood; Tiun, Sabrina; Al-Dhief, Fahad Taha; Sammour, Mahmoud A M

2018-01-01

Spoken Language Identification (LID) is the process of determining and classifying natural language from a given content and dataset. Typically, data must be processed to extract useful features to perform LID. According to the literature, feature extraction for LID is a mature process in which the standard features have already been developed using Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral (SDC), the Gaussian Mixture Model (GMM) and, finally, the i-vector based framework. However, the process of learning from the extracted features remains to be improved (i.e. optimised) to capture all the knowledge embedded in those features. The Extreme Learning Machine (ELM) is an effective learning model used to perform classification and regression analysis and is extremely useful for training a single hidden layer neural network. Nevertheless, the learning process of this model is not entirely effective (i.e. optimised) due to the random selection of weights within the input hidden layer. In this study, the ELM is selected as a learning model for LID based on standard feature extraction. One of the optimisation approaches of ELM, the Self-Adjusting Extreme Learning Machine (SA-ELM), is selected as the benchmark and improved by altering the selection phase of the optimisation process. The selection process incorporates both the Split-Ratio and K-Tournament methods, and the improved SA-ELM is named the Enhanced Self-Adjusting Extreme Learning Machine (ESA-ELM). The results are generated based on LID with datasets created from eight different languages. The results showed that the Enhanced Self-Adjusting Extreme Learning Machine LID (ESA-ELM LID) outperformed the SA-ELM LID, achieving an accuracy of 96.25% compared with only 95.00% for the SA-ELM LID.
Spoken language identification based on the enhanced self-adjusting extreme learning machine approach

PubMed Central

Tiun, Sabrina; AL-Dhief, Fahad Taha; Sammour, Mahmoud A. M.

2018-01-01

Spoken Language Identification (LID) is the process of determining and classifying natural language from a given content and dataset. Typically, data must be processed to extract useful features to perform LID. According to the literature, feature extraction for LID is a mature process in which the standard features have already been developed using Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral (SDC), the Gaussian Mixture Model (GMM) and, finally, the i-vector based framework. However, the process of learning from the extracted features remains to be improved (i.e. optimised) to capture all the knowledge embedded in those features. The Extreme Learning Machine (ELM) is an effective learning model used to perform classification and regression analysis and is extremely useful for training a single hidden layer neural network. Nevertheless, the learning process of this model is not entirely effective (i.e. optimised) due to the random selection of weights within the input hidden layer. In this study, the ELM is selected as a learning model for LID based on standard feature extraction. One of the optimisation approaches of ELM, the Self-Adjusting Extreme Learning Machine (SA-ELM), is selected as the benchmark and improved by altering the selection phase of the optimisation process. The selection process incorporates both the Split-Ratio and K-Tournament methods, and the improved SA-ELM is named the Enhanced Self-Adjusting Extreme Learning Machine (ESA-ELM). The results are generated based on LID with datasets created from eight different languages. The results showed that the Enhanced Self-Adjusting Extreme Learning Machine LID (ESA-ELM LID) outperformed the SA-ELM LID, achieving an accuracy of 96.25% compared with only 95.00% for the SA-ELM LID. PMID:29672546
DOE Office of Scientific and Technical Information (OSTI.GOV)

Cooper, R. C.; Bruno, Giovanni; Onel, Yener

Microstructural changes in porous cordierite caused by machining were characterized using microtensile testing, X-ray computed tomography and scanning electron microscopy. Young's moduli and Poisson's ratios were determined on ~215-380 μm thick machined samples by combining digital image correlation and microtensile loading. The results provide evidence for an increase in microcrack density due to machining of the thin samples extracted from diesel particulate filter honeycombs.
Aggregation of Electric Current Consumption Features to Extract Maintenance KPIs

NASA Astrophysics Data System (ADS)

Simon, Victor; Johansson, Carl-Anders; Galar, Diego

2017-09-01

All electric powered machines offer the possibility of extracting information and calculating Key Performance Indicators (KPIs) from the electric current signal. Depending on the time window, sampling frequency and type of analysis, different indicators from the micro to macro level can be calculated for such aspects as maintenance, production, energy consumption etc. On the micro-level, the indicators are generally used for condition monitoring and diagnostics and are normally based on a short time window and a high sampling frequency. The macro indicators are normally based on a longer time window with a slower sampling frequency and are used as indicators for overall performance, cost or consumption. The indicators can be calculated directly from the current signal but can also be based on a combination of information from the current signal and operational data like rpm, position etc. One or several of those indicators can be used for prediction and prognostics of a machine's future behavior. This paper uses this technique to calculate indicators for maintenance and energy optimization in electric powered machines and fleets of machines, especially machine tools.
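The distinction this abstract draws between micro-level indicators (short window, high sampling frequency) and macro-level indicators (long window) can be illustrated with a windowed RMS of the current signal. This is only a sketch: the paper does not state which indicators it computes, so the choice of RMS, the function name `window_rms`, and the synthetic 50 Hz, 10 A test signal are assumptions made for the example.

```python
import math

def window_rms(samples, window):
    """RMS of consecutive, non-overlapping windows of `window` samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        math.sqrt(sum(v * v for v in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]

# Synthetic current: 1 second of a 50 Hz sine, 10 A amplitude, sampled at 1 kHz.
fs = 1000
current = [10.0 * math.sin(2 * math.pi * 50 * n / fs) for n in range(fs)]

micro = window_rms(current, 20)    # one value per mains cycle (condition monitoring)
macro = window_rms(current, 1000)  # one value per second (overall consumption)
```

The same function serves both levels; only the window length changes. Combining the result with operational data such as rpm, as the abstract mentions, would require additional inputs not modelled here.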

Automatic optical detection and classification of marine animals around MHK converters using machine vision

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brunton, Steven

Optical systems provide valuable information for evaluating interactions and associations between organisms and MHK energy converters and for capturing potentially rare encounters between marine organisms and MHK device. The deluge of optical data from cabled monitoring packages makes expert review time-consuming and expensive. We propose algorithms and a processing framework to automatically extract events of interest from underwater video. The open-source software framework consists of background subtraction, filtering, feature extraction and hierarchical classification algorithms. This principle classification pipeline was validated on real-world data collected with an experimental underwater monitoring package. An event detection rate of 100% was achieved using robust principal components analysis (RPCA), Fourier feature extraction and a support vector machine (SVM) binary classifier. The detected events were then further classified into more complex classes – algae | invertebrate | vertebrate, one species | multiple species of fish, and interest rank. Greater than 80% accuracy was achieved using a combination of machine learning techniques.
Wavelet images and Chou's pseudo amino acid composition for protein classification.

PubMed

Nanni, Loris; Brahnam, Sheryl; Lumini, Alessandra

2012-08-01

The last decade has seen an explosion in the collection of protein data. To actualize the potential offered by this wealth of data, it is important to develop machine systems capable of classifying and extracting features from proteins. Reliable machine systems for protein classification offer many benefits, including the promise of finding novel drugs and vaccines. In developing our system, we analyze and compare several feature extraction methods used in protein classification that are based on the calculation of texture descriptors starting from a wavelet representation of the protein. We then feed these texture-based representations of the protein into an Adaboost ensemble of neural network or a support vector machine classifier. In addition, we perform experiments that combine our feature extraction methods with a standard method that is based on the Chou's pseudo amino acid composition. Using several datasets, we show that our best approach outperforms standard methods. The Matlab code of the proposed protein descriptors is available at http://bias.csr.unibo.it/nanni/wave.rar .
A method for integrating and ranking the evidence for biochemical pathways by mining reactions from text

PubMed Central

Miwa, Makoto; Ohta, Tomoko; Rak, Rafal; Rowley, Andrew; Kell, Douglas B.; Pyysalo, Sampo; Ananiadou, Sophia

2013-01-01

Motivation: To create, verify and maintain pathway models, curators must discover and assess knowledge distributed over the vast body of biological literature. Methods supporting these tasks must understand both the pathway model representations and the natural language in the literature. These methods should identify and order documents by relevance to any given pathway reaction. No existing system has addressed all aspects of this challenge. Method: We present novel methods for associating pathway model reactions with relevant publications. Our approach extracts the reactions directly from the models and then turns them into queries for three text mining-based MEDLINE literature search systems. These queries are executed, and the resulting documents are combined and ranked according to their relevance to the reactions of interest. We manually annotate document-reaction pairs with the relevance of the document to the reaction and use this annotation to study several ranking methods, using various heuristic and machine-learning approaches. Results: Our evaluation shows that the annotated document-reaction pairs can be used to create a rule-based document ranking system, and that machine learning can be used to rank documents by their relevance to pathway reactions. We find that a Support Vector Machine-based system outperforms several baselines and matches the performance of the rule-based system. The success of the query extraction and ranking methods are used to update our existing pathway search system, PathText. Availability: An online demonstration of PathText 2 and the annotated corpus are available for research purposes at http://www.nactem.ac.uk/pathtext2/. Contact: makoto.miwa@manchester.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23813008
The feasibility of using natural language processing to extract clinical information from breast pathology reports.

PubMed

Buckley, Julliette M; Coopey, Suzanne B; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K; Kim, Elizabeth M H; Garber, Judy E; Smith, Barbara L; Gadd, Michele A; Specht, Michelle C; Roche, Constance A; Gudewicz, Thomas M; Hughes, Kevin S

2012-01-01

The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. APPROACH AND PROCEDURE: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task.
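Sensitivity and specificity figures such as those quoted in this abstract are computed by comparing the automated output against the expert coding, report by report. The sketch below shows that calculation for one diagnosis. The function name and the ten toy labels are invented for illustration and are not data from the study.

```python
def sensitivity_specificity(predicted, reference):
    """Compare boolean predictions with boolean reference labels.

    Returns (sensitivity, specificity), i.e. TP / (TP + FN) and TN / (TN + FP).
    """
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for p, r in zip(predicted, reference):
        if r and p:
            tp += 1
        elif r and not p:
            fn += 1
        elif not r and p:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference label")
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: True means "diagnosis present" for a given report.
expert = [True, True, True, True, False, False, False, False, False, False]
nlp    = [True, True, True, False, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(nlp, expert)
```

Here the hypothetical extractor misses one positive report and raises one false alarm, giving a sensitivity of 3/4 and a specificity of 5/6.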
Machine Detection of Enhanced Electromechanical Energy Conversion in PbZr0.2Ti0.8O3 Thin Films

DOE PAGES

Agar, Joshua C.; Cao, Ye; Naul, Brett; ...

2018-05-28

Many energy conversion, sensing, and microelectronic applications based on ferroic materials are determined by the domain structure evolution under applied stimuli. New hyperspectral, multidimensional spectroscopic techniques now probe dynamic responses at relevant length and time scales to provide an understanding of how these nanoscale domain structures impact macroscopic properties. Such approaches, however, remain limited in use because of the difficulties that exist in extracting and visualizing scientific insights from these complex datasets. Using multidimensional band-excitation scanning probe spectroscopy and adapting tools from both computer vision and machine learning, an automated workflow is developed to featurize, detect, and classify signatures of ferroelectric/ferroelastic switching processes in complex ferroelectric domain structures. This approach enables the identification and nanoscale visualization of varied modes of response and a pathway to statistically meaningful quantification of the differences between those modes. Lastly, among other things, the importance of domain geometry is spatially visualized for enhancing nanoscale electromechanical energy conversion.
Smart Point Cloud: Definition and Remaining Challenges

NASA Astrophysics Data System (ADS)

Poux, F.; Hallot, P.; Neuville, R.; Billen, R.

2016-10-01

Dealing with coloured point cloud acquired from terrestrial laser scanner, this paper identifies remaining challenges for a new data structure: the smart point cloud. This concept arises with the statement that massive and discretized spatial information from active remote sensing technology is often underused due to data mining limitations. The generalisation of point cloud data associated with the heterogeneity and temporality of such datasets is the main issue regarding structure, segmentation, classification, and interaction for an immediate understanding. We propose to use both point cloud properties and human knowledge through machine learning to rapidly extract pertinent information, using user-centered information (smart data) rather than raw data. A review of feature detection, machine learning frameworks and database systems indexed both for mining queries and data visualisation is studied. Based on existing approaches, we propose a new 3-block flexible framework around device expertise, analytic expertise and domain base reflexion. This contribution serves as the first step for the realisation of a comprehensive smart point cloud data structure.
On the impact of approximate computation in an analog DeSTIN architecture.

PubMed

Young, Steven; Lu, Junjie; Holleman, Jeremy; Arel, Itamar

2014-05-01

Deep machine learning (DML) holds the potential to revolutionize machine learning by automating rich feature extraction, which has become the primary bottleneck of human engineering in pattern recognition systems. However, the heavy computational burden renders DML systems implemented on conventional digital processors impractical for large-scale problems. The highly parallel computations required to implement large-scale deep learning systems are well suited to custom hardware. Analog computation has demonstrated power efficiency advantages of multiple orders of magnitude relative to digital systems while performing nonideal computations. In this paper, we investigate typical error sources introduced by analog computational elements and their impact on system-level performance in DeSTIN--a compositional deep learning architecture. These inaccuracies are evaluated on a pattern classification benchmark, clearly demonstrating the robustness of the underlying algorithm to the errors introduced by analog computational elements. A clear understanding of the impacts of nonideal computations is necessary to fully exploit the efficiency of analog circuits.
Machine Detection of Enhanced Electromechanical Energy Conversion in PbZr0.2Ti0.8O3 Thin Films

DOE Office of Scientific and Technical Information (OSTI.GOV)

Agar, Joshua C.; Cao, Ye; Naul, Brett

Many energy conversion, sensing, and microelectronic applications based on ferroic materials are determined by the domain structure evolution under applied stimuli. New hyperspectral, multidimensional spectroscopic techniques now probe dynamic responses at relevant length and time scales to provide an understanding of how these nanoscale domain structures impact macroscopic properties. Such approaches, however, remain limited in use because of the difficulties that exist in extracting and visualizing scientific insights from these complex datasets. Using multidimensional band-excitation scanning probe spectroscopy and adapting tools from both computer vision and machine learning, an automated workflow is developed to featurize, detect, and classify signatures of ferroelectric/ferroelastic switching processes in complex ferroelectric domain structures. This approach enables the identification and nanoscale visualization of varied modes of response and a pathway to statistically meaningful quantification of the differences between those modes. Lastly, among other things, the importance of domain geometry is spatially visualized for enhancing nanoscale electromechanical energy conversion.
A multiple kernel support vector machine scheme for feature selection and rule extraction from gene expression data of cancer tissue.

PubMed

Chen, Zhenyu; Li, Jianping; Wei, Liwei

2007-10-01

Recently, gene expression profiling using microarray techniques has been shown to be a promising tool for improving the diagnosis and treatment of cancer. Gene expression data contain a high level of noise, and the number of genes is overwhelming relative to the number of available samples. This poses a great challenge for machine learning and statistical techniques. The support vector machine (SVM) has been successfully used to classify gene expression data of cancer tissue. In the medical field, it is crucial to give the user a transparent decision process. Explaining the computed solutions and presenting the extracted knowledge is a main obstacle for the SVM. A multiple kernel support vector machine (MK-SVM) scheme, consisting of feature selection, rule extraction and prediction modeling, is proposed to improve the explanation capacity of the SVM. In this scheme, we show that the feature selection problem can be translated into an ordinary multiple-parameter learning problem. A shrinkage approach, 1-norm based linear programming, is proposed to obtain the sparse parameters and the corresponding selected features. We propose a novel rule extraction approach using the information provided by the separating hyperplane and support vectors to improve the generalization capacity and comprehensibility of rules and reduce the computational complexity. Two public gene expression datasets, a leukemia dataset and a colon tumor dataset, are used to demonstrate the performance of this approach. Using a small number of selected genes, MK-SVM achieves encouraging classification accuracy: more than 90% on both datasets. Moreover, very simple rules with linguistic labels are extracted. The rule sets have high diagnostic power because of their good classification performance.
Intelligent Vision On The SM90 Mini-Computer Basis And Applications

NASA Astrophysics Data System (ADS)

Hawryszkiw, J.

1985-02-01

A distinction has to be made between image processing and vision. Image processing finds its roots in the strong tradition of linear signal processing and promotes geometrical transform techniques, such as filtering, compression, and restoration. Its purpose is to transform an image so that a human observer can easily extract from it the information significant for him, for example edges after a gradient operator, or a specific direction after a directional filtering operation. Image processing consists in fact of a set of local or global space-time transforms. The interpretation of the final image is done by the human observer. The purpose of vision is to extract the semantic content of the image. The machine can then understand that content and run a decision process, which turns into an action. Thus, intelligent vision depends on image processing, pattern recognition, and artificial intelligence.
Mining chemical information from open patents

PubMed Central

2011-01-01

Linked Open Data presents an opportunity to vastly improve the quality of science in all fields by increasing the availability and usability of the data upon which it is based. In the chemical field, there is a huge amount of information available in the published literature, the vast majority of which is not available in machine-understandable formats. PatentEye, a prototype system for the extraction and semantification of chemical reactions from the patent literature has been implemented and is discussed. A total of 4444 reactions were extracted from 667 patent documents that comprised 10 weeks' worth of publications from the European Patent Office (EPO), with a precision of 78% and recall of 64% with regards to determining the identity and amount of reactants employed and an accuracy of 92% with regards to product identification. NMR spectra reported as product characterisation data are additionally captured. PMID:21999425
Developing Preservice Teachers' Understanding of Function Using a Vending Machine Metaphor Applet

ERIC Educational Resources Information Center

McCulloch, Allison; Lovett, Jennifer; Edgington, Cyndi

2017-01-01

The purpose of this study is to examine the use of a Vending Machine applet as a cognitive root for the development of preservice teachers' understanding of function. The applet was designed to purposefully problematize common misconceptions associated with the algebraic nature of typical function machines. Findings indicated affordances and…
Extracting features from protein sequences to improve deep extreme learning machine for protein fold recognition.

PubMed

Ibrahim, Wisam; Abadeh, Mohammad Saniee

2017-05-21

Protein fold recognition is an important problem in bioinformatics to predict the three-dimensional structure of a protein. One of the most challenging tasks in the protein fold recognition problem is the extraction of efficient features from the amino-acid sequences to obtain better classifiers. In this paper, we have proposed six descriptors to extract features from protein sequences. These descriptors are applied in the first stage of a three-stage framework, PCA-DELM-LDA, to extract feature vectors from the amino-acid sequences. Principal Component Analysis (PCA) has been implemented to reduce the number of extracted features. The extracted feature vectors have been used with the original features to improve the performance of the Deep Extreme Learning Machine (DELM) in the second stage. Four new features have been extracted from the second stage and used in the third stage by Linear Discriminant Analysis (LDA) to classify the instances into 27 folds. The proposed framework is implemented on the independent and combined feature sets in SCOP datasets. The experimental results show that the feature vectors extracted in the first stage could improve the performance of DELM in extracting new useful features in the second stage. Copyright © 2017 Elsevier Ltd. All rights reserved.
Feature extraction algorithm for space targets based on fractal theory

NASA Astrophysics Data System (ADS)

Tian, Balin; Yuan, Jianping; Yue, Xiaokui; Ning, Xin

2007-11-01

To extend the life of satellites and reduce launch and operating costs, satellite servicing, including conducting repairs, upgrading, and refueling spacecraft on orbit, is becoming much more frequent. Future space operations can be executed more economically and reliably using machine vision systems, which can meet the real-time and tracking-reliability requirements for image tracking in space surveillance systems. Machine vision has been applied to research on the relative pose of spacecraft, and the feature extraction algorithm is the basis of relative pose determination. This paper presents a fractal-geometry-based edge extraction algorithm that can be used in a machine vision system to determine and track the relative pose of an observed satellite during proximity operations. The method obtains the gray-level image distributed by fractal dimension, using the Differential Box-Counting (DBC) approach of fractal theory to suppress noise. The consecutive edges are then detected using mathematical morphology. The validity of the proposed method is examined by processing and analyzing images of space targets. The edge extraction method not only extracts the outline of the target but also keeps the inner details. Meanwhile, edge extraction is performed only in the moving area, which greatly reduces computation. Simulations compare edge detection using the proposed method with other detection methods. The results indicate that the presented algorithm is a valid method for solving the relative pose problem for spacecraft.
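The Differential Box-Counting (DBC) estimate named in this abstract can be sketched as follows. This is a generic textbook-style version under stated assumptions, not the authors' algorithm: it returns a single fractal dimension for one square gray-level patch, it uses one common floor-based box-indexing convention, and the function name, the default grid sizes, and the 256 gray levels are chosen for illustration. Producing the per-pixel fractal-dimension image the abstract describes would mean applying such an estimate over sliding windows, which is not shown.

```python
import math

def dbc_fractal_dimension(image, gray_levels=256, sizes=(2, 4, 8)):
    """Estimate the fractal dimension of a square gray-level patch by DBC.

    image: list of M rows, each with M integer gray values.
    sizes: grid cell sizes s; each must divide M, and at least two are needed.
    """
    m = len(image)
    if any(len(row) != m for row in image):
        raise ValueError("image must be square")
    if len(sizes) < 2:
        raise ValueError("need at least two grid sizes")
    log_inv_r, log_n = [], []
    for s in sizes:
        if m % s:
            raise ValueError("each grid size must divide the image size")
        box_height = s * gray_levels / m  # box height in gray-level units
        total = 0
        for by in range(0, m, s):
            for bx in range(0, m, s):
                block = [image[y][x]
                         for y in range(by, by + s)
                         for x in range(bx, bx + s)]
                # Number of boxes spanning the min..max gray range of the cell.
                top = int(max(block) // box_height)
                bottom = int(min(block) // box_height)
                total += top - bottom + 1
        log_inv_r.append(math.log(m / s))
        log_n.append(math.log(total))
    # Least-squares slope of log N_r against log(1/r).
    n = len(sizes)
    mx = sum(log_inv_r) / n
    my = sum(log_n) / n
    num = sum((x - mx) * (y - my) for x, y in zip(log_inv_r, log_n))
    den = sum((x - mx) ** 2 for x in log_inv_r)
    return num / den

flat = [[100] * 16 for _ in range(16)]
flat_dimension = dbc_fractal_dimension(flat)
```

A perfectly flat patch yields a dimension of 2, the value for a smooth surface; rougher or noisier patches give larger values, which is what lets a fractal-dimension map separate textured regions from noise.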
Using deep belief network modelling to characterize differences in brain morphometry in schizophrenia

PubMed Central

Pinaya, Walter H. L.; Gadelha, Ary; Doyle, Orla M.; Noto, Cristiano; Zugman, André; Cordeiro, Quirino; Jackowski, Andrea P.; Bressan, Rodrigo A.; Sato, João R.

2016-01-01

Neuroimaging-based models contribute to increasing our understanding of schizophrenia pathophysiology and can reveal the underlying characteristics of this and other clinical conditions. However, the considerable variability in reported neuroimaging results mirrors the heterogeneity of the disorder. Machine learning methods capable of representing invariant features could circumvent this problem. In this structural MRI study, we trained a deep learning model known as deep belief network (DBN) to extract features from brain morphometry data and investigated its performance in discriminating between healthy controls (N = 83) and patients with schizophrenia (N = 143). We further analysed performance in classifying patients with a first-episode psychosis (N = 32). The DBN highlighted differences between classes, especially in the frontal, temporal, parietal, and insular cortices, and in some subcortical regions, including the corpus callosum, putamen, and cerebellum. The DBN was slightly more accurate as a classifier (accuracy = 73.6%) than the support vector machine (accuracy = 68.1%). Finally, the error rate of the DBN in classifying first-episode patients was 56.3%, indicating that the representations learned from patients with schizophrenia and healthy controls were not suitable to define these patients. Our data suggest that deep learning could improve our understanding of psychiatric disorders such as schizophrenia by improving neuromorphometric analyses. PMID:27941946
Using deep belief network modelling to characterize differences in brain morphometry in schizophrenia

NASA Astrophysics Data System (ADS)

Pinaya, Walter H. L.; Gadelha, Ary; Doyle, Orla M.; Noto, Cristiano; Zugman, André; Cordeiro, Quirino; Jackowski, Andrea P.; Bressan, Rodrigo A.; Sato, João R.

2016-12-01

Neuroimaging-based models contribute to increasing our understanding of schizophrenia pathophysiology and can reveal the underlying characteristics of this and other clinical conditions. However, the considerable variability in reported neuroimaging results mirrors the heterogeneity of the disorder. Machine learning methods capable of representing invariant features could circumvent this problem. In this structural MRI study, we trained a deep learning model known as deep belief network (DBN) to extract features from brain morphometry data and investigated its performance in discriminating between healthy controls (N = 83) and patients with schizophrenia (N = 143). We further analysed performance in classifying patients with a first-episode psychosis (N = 32). The DBN highlighted differences between classes, especially in the frontal, temporal, parietal, and insular cortices, and in some subcortical regions, including the corpus callosum, putamen, and cerebellum. The DBN was slightly more accurate as a classifier (accuracy = 73.6%) than the support vector machine (accuracy = 68.1%). Finally, the error rate of the DBN in classifying first-episode patients was 56.3%, indicating that the representations learned from patients with schizophrenia and healthy controls were not suitable to define these patients. Our data suggest that deep learning could improve our understanding of psychiatric disorders such as schizophrenia by improving neuromorphometric analyses.
Discriminative and informative features for biomolecular text mining with ensemble feature selection.

PubMed

Van Landeghem, Sofie; Abeel, Thomas; Saeys, Yvan; Van de Peer, Yves

2010-09-15

In the field of biomolecular text mining, black box behavior of machine learning systems currently limits understanding of the true nature of the predictions. However, feature selection (FS) is capable of identifying the most relevant features in any supervised learning setting, providing insight into the specific properties of the classification algorithm. This allows us to build more accurate classifiers while at the same time bridging the gap between the black box behavior and the end-user who has to interpret the results. We show that our FS methodology successfully discards a large fraction of machine-generated features, improving classification performance of state-of-the-art text mining algorithms. Furthermore, we illustrate how FS can be applied to gain understanding in the predictions of a framework for biomolecular event extraction from text. We include numerous examples of highly discriminative features that model either biological reality or common linguistic constructs. Finally, we discuss a number of insights from our FS analyses that will provide the opportunity to considerably improve upon current text mining tools. The FS algorithms and classifiers are available in Java-ML (http://java-ml.sf.net). The datasets are publicly available from the BioNLP'09 Shared Task web site (http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/SharedTask/).
Regolith-geology mapping with support vector machine: A case study over weathered Ni-bearing peridotites, New Caledonia

NASA Astrophysics Data System (ADS)

De Boissieu, Florian; Sevin, Brice; Cudahy, Thomas; Mangeas, Morgan; Chevrel, Stéphane; Ong, Cindy; Rodger, Andrew; Maurizot, Pierre; Laukamp, Carsten; Lau, Ian; Touraivane, Touraivane; Cluzel, Dominique; Despinoy, Marc

2018-02-01

Accurate maps of Earth's geology, especially its regolith, are required for managing the sustainable exploration and development of mineral resources. This paper shows how airborne imaging hyperspectral data collected over weathered peridotite rocks in vegetated, mountainous terrane in New Caledonia were processed using a combination of methods to generate a regolith-geology map that could be used for more efficiently targeting Ni exploration. The image processing combined two common methods: spectral feature extraction and support vector machine (SVM). The rationale is that spectral feature extraction can rapidly reduce data complexity by both targeting only the diagnostic mineral absorptions and masking those pixels complicated by vegetation, cloud and deep shade. SVM is a supervised classification method able to generate an optimal non-linear classifier with these features that generalises well even with limited training data. Key minerals targeted are serpentine, which is considered an indicator of hydrolysed peridotitic rock, and iron oxy-hydroxides (hematite and goethite), which are considered diagnostic of laterite development. The final classified regolith map was assessed against interpreted regolith field sites, which yielded approximately 70% similarity for all unit types, as well as against a regolith-geology map interpreted using traditional datasets (not hyperspectral imagery). Importantly, the hyperspectral derived mineral map provided much greater detail, enabling a more precise understanding of the regolith-geological architecture where there are exposed soils and rocks.
The New Possibilities from "Big Data" to Overlooked Associations Between Diabetes, Biochemical Parameters, Glucose Control, and Osteoporosis.

PubMed

Kruse, Christian

2018-06-01

To review current practices and technologies within the scope of "Big Data" that can further our understanding of diabetes mellitus and osteoporosis from large volumes of data. "Big Data" techniques involving supervised machine learning, unsupervised machine learning, and deep learning image analysis are presented with examples of current literature. Supervised machine learning can allow us to better predict diabetes-induced osteoporosis and understand relative predictor importance of diabetes-affected bone tissue. Unsupervised machine learning can allow us to understand patterns in data between diabetic pathophysiology and altered bone metabolism. Image analysis using deep learning can allow us to be less dependent on surrogate predictors and use large volumes of images to classify diabetes-induced osteoporosis and predict future outcomes directly from images. "Big Data" techniques herald new possibilities to understand diabetes-induced osteoporosis and ascertain our current ability to classify, understand, and predict this condition.
Attention-Based Recurrent Temporal Restricted Boltzmann Machine for Radar High Resolution Range Profile Sequence Recognition.

PubMed

Zhang, Yifan; Gao, Xunzhang; Peng, Xuan; Ye, Jiaqi; Li, Xiang

2018-05-16

High Resolution Range Profile (HRRP) recognition has attracted great attention in the field of Radar Automatic Target Recognition (RATR). However, traditional HRRP recognition methods fail to model high dimensional sequential data efficiently and have poor robustness to noise. To deal with these problems, a novel stochastic neural network model named Attention-based Recurrent Temporal Restricted Boltzmann Machine (ARTRBM) is proposed in this paper. RTRBM is utilized to extract discriminative features and the attention mechanism is adopted to select major features. RTRBM models high dimensional HRRP sequences efficiently because it can extract the temporal and spatial correlation information between adjacent HRRPs. The attention mechanism is used in sequential data recognition tasks including machine translation and relation classification, where it makes the model pay more attention to the major features for recognition. Therefore, the combination of RTRBM and the attention mechanism makes our model effective at extracting more internally related features and choosing the important parts of the extracted features. Additionally, the model performs well with noise-corrupted HRRP data. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that our proposed model outperforms other traditional methods, which indicates that ARTRBM extracts, selects, and utilizes the correlation information between adjacent HRRPs effectively and is suitable for high dimensional data or noise-corrupted data.

DREAM: Classification scheme for dialog acts in clinical research query mediation.

PubMed

Hoxha, Julia; Chandar, Praveen; He, Zhe; Cimino, James; Hanauer, David; Weng, Chunhua

2016-02-01

Clinical data access involves complex but opaque communication between medical researchers and query analysts. Understanding such communication is indispensable for designing intelligent human-machine dialog systems that automate query formulation. This study investigates email communication and proposes a novel scheme for classifying dialog acts in clinical research query mediation. We analyzed 315 email messages exchanged in the communication for 20 data requests obtained from three institutions. The messages were segmented into 1333 utterance units. Through a rigorous process, we developed a classification scheme and applied it for dialog act annotation of the extracted utterances. Evaluation results with high inter-annotator agreement demonstrate the reliability of this scheme. This dataset is used to contribute preliminary understanding of dialog acts distribution and conversation flow in this dialog space. Copyright © 2015 Elsevier Inc. All rights reserved.
Identification of compound-protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds.

PubMed

Chen, Lei; Zhang, Yu-Hang; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong

2016-12-01

Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
Toward Harnessing User Feedback For Machine Learning

DTIC Science & Technology

2006-10-02

machine learning systems. If this resource-the users themselves-could somehow work hand-in-hand with machine learning systems, the accuracy of learning systems could be improved and the users' understanding and trust of the system could improve as well. We conducted a think-aloud study to see how willing users were to provide feedback and to understand what kinds of feedback users could give. Users were shown explanations of machine learning predictions and asked to provide feedback to improve the predictions. We found that users
Towards building a disease-phenotype knowledge base: extracting disease-manifestation relationship from literature

PubMed Central

Xu, Rong; Li, Li; Wang, QuanQiu

2013-01-01

Motivation: Systems approaches to studying phenotypic relationships among diseases are emerging as an active area of research for both novel disease gene discovery and drug repurposing. Currently, systematic study of disease phenotypic relationships on a phenome-wide scale is limited because large-scale machine-understandable disease–phenotype relationship knowledge bases are often unavailable. Here, we present an automatic approach to extract disease–manifestation (D-M) pairs (one specific type of disease–phenotype relationship) from the wide body of published biomedical literature. Data and Methods: Our method leverages external knowledge and limits the amount of human effort required. For the text corpus, we used 119 085 682 MEDLINE sentences (21 354 075 citations). First, we used D-M pairs from existing biomedical ontologies as prior knowledge to automatically discover D-M–specific syntactic patterns. We then extracted additional pairs from MEDLINE using the learned patterns. Finally, we analysed correlations between disease manifestations and disease-associated genes and drugs to demonstrate the potential of this newly created knowledge base in disease gene discovery and drug repurposing. Results: In total, we extracted 121 359 unique D-M pairs with a high precision of 0.924. Among the extracted pairs, 120 419 (99.2%) have not been captured in existing structured knowledge sources. We have shown that disease manifestations correlate positively with both disease-associated genes and drug treatments. Conclusions: The main contribution of our study is the creation of a large-scale and accurate D-M phenotype relationship knowledge base. This unique knowledge base, when combined with existing phenotypic, genetic and proteomic datasets, can have profound implications in our deeper understanding of disease etiology and in rapid drug repurposing. Availability: http://nlp.case.edu/public/data/DMPatternUMLS/ Contact: rxx@case.edu PMID:23828786
Landcover Classification Using Deep Fully Convolutional Neural Networks

NASA Astrophysics Data System (ADS)

Wang, J.; Li, X.; Zhou, S.; Tang, J.

2017-12-01

Land cover classification has always been an essential application in remote sensing. Certain image features are needed for land cover classification whether it is based on pixel or object-based methods. Unlike other machine learning methods, a deep learning model not only extracts useful information from multiple bands/attributes, but also learns spatial characteristics. In recent years, deep learning methods have developed rapidly and have been widely applied in image recognition, semantic understanding, and other application domains. However, there are limited studies applying deep learning methods to land cover classification. In this research, we used fully convolutional networks (FCN) as the deep learning model to classify land covers. The National Land Cover Database (NLCD) within the state of Kansas was used as the training dataset and Landsat images were classified using the trained FCN model. We also applied an image segmentation method to improve the original results from the FCN model. In addition, the pros and cons of deep learning and several machine learning methods were compared and explored. Our research indicates: (1) FCN is an effective classification model with an overall accuracy of 75%; (2) image segmentation improves the classification results with a better match of spatial patterns; (3) FCN has an excellent learning ability and can attain higher accuracy and better spatial patterns than several machine learning methods.
Large-scale machine learning of media outlets for understanding public reactions to nation-wide viral infection outbreaks.

PubMed

Choi, Sungwoon; Lee, Jangho; Kang, Min-Gyu; Min, Hyeyoung; Chang, Yoon-Seok; Yoon, Sungroh

2017-10-01

From May to July 2015, there was a nation-wide outbreak of Middle East respiratory syndrome (MERS) in Korea. MERS is caused by MERS-CoV, an enveloped, positive-sense, single-stranded RNA virus belonging to the family Coronaviridae. Despite expert opinions that the danger of MERS might be exaggerated, there was an overreaction by the public according to the Korean mass media, which led to a noticeable reduction in social and economic activities during the outbreak. To explain this phenomenon, we presumed that machine learning-based analysis of media outlets would be helpful and collected a number of Korean mass media articles and short-text comments produced during the 10-week outbreak. To process and analyze the collected data (over 86 million words in total) effectively, we created a methodology composed of machine-learning and information-theoretic approaches. Our proposal included techniques for extracting emotions from emoticons and Internet slang, which allowed us to significantly (approximately 73%) increase the number of emotion-bearing texts needed for robust sentiment analysis of social media. As a result, we discovered a plausible explanation for the public overreaction to MERS in terms of the interplay between the disease, mass media, and public emotions. Copyright © 2017 Elsevier Inc. All rights reserved.
Understanding dental CAD/CAM for restorations - dental milling machines from a mechanical engineering viewpoint. Part A: chairside milling machines.

PubMed

Lebon, Nicolas; Tapie, Laurent; Duret, Francois; Attal, Jean-Pierre

2016-01-01

The dental milling machine is an important device in the dental CAD/CAM chain. Nowadays, dental numerical controlled (NC) milling machines are available for dental surgeries (chairside solution). This article provides a mechanical engineering approach to NC milling machines to help dentists understand the involvement of technology in digital dentistry practice. First, some technical concepts and definitions associated with NC milling machines are described from a mechanical engineering viewpoint. The technical and economic criteria of four chairside dental NC milling machines that are available on the market are then described. The technical criteria are focused on the capacities of the embedded technologies of these milling machines to mill both prosthetic materials and types of shape restorations. The economic criteria are focused on investment costs and interoperability with third-party software. The clinical relevance of the technology is assessed in terms of the accuracy and integrity of the restoration.
Characterization of electroencephalography signals for estimating saliency features in videos.

PubMed

Liang, Zhen; Hamada, Yasuyuki; Oba, Shigeyuki; Ishii, Shin

2018-05-12

Understanding the functions of the visual system has been one of the major targets in neuroscience for many years. However, the relation between spontaneous brain activities and visual saliency in natural stimuli has yet to be elucidated. In this study, we developed an optimized machine learning-based decoding model to explore the possible relationships between the electroencephalography (EEG) characteristics and visual saliency. The optimal features were extracted from the EEG signals and saliency map which was computed according to an unsupervised saliency model (Tavakoli and Laaksonen, 2017). Subsequently, various unsupervised feature selection/extraction techniques were examined using different supervised regression models. The robustness of the presented model was fully verified by means of ten-fold or nested cross validation procedure, and promising results were achieved in the reconstruction of saliency features based on the selected EEG characteristics. Through the successful demonstration of using EEG characteristics to predict the real-time saliency distribution in natural videos, we suggest the feasibility of quantifying visual content through measuring brain activities (EEG signals) in real environments, which would facilitate the understanding of cortical involvement in the processing of natural visual stimuli and application developments motivated by human visual processing. Copyright © 2018 Elsevier Ltd. All rights reserved.
Integrating semantic information into multiple kernels for protein-protein interaction extraction from biomedical literatures.

PubMed

Li, Lishuang; Zhang, Panpan; Zheng, Tianfu; Zhang, Hongying; Jiang, Zhenchao; Huang, Degen

2014-01-01

Protein-Protein Interaction (PPI) extraction is an important task in the biomedical information extraction. Presently, many machine learning methods for PPI extraction have achieved promising results. However, the performance is still not satisfactory. One reason is that the semantic resources were basically ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs, combining the feature-based kernel, tree kernel and semantic kernel. Particularly, we extend the shortest path-enclosed tree kernel (SPT) by a dynamic extended strategy to retrieve the richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Heading (MeSH). We evaluate our method with Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, which show that our method outperforms most of the state-of-the-art systems by integrating semantic information.
A Framework for Final Drive Simultaneous Failure Diagnosis Based on Fuzzy Entropy and Sparse Bayesian Extreme Learning Machine

PubMed Central

Ye, Qing; Pan, Hao; Liu, Changhua

2015-01-01

This research proposes a novel framework for final drive simultaneous failure diagnosis comprising feature extraction, training of paired diagnostic models, generation of a decision threshold, and recognition of simultaneous failure modes. In the feature extraction module, wavelet package transform and fuzzy entropy are adopted to reduce noise interference and extract representative features of each failure mode. Single-failure samples are used to construct probability classifiers based on paired sparse Bayesian extreme learning machines, which are trained only on single failure modes and retain the high generalization and sparsity of the sparse Bayesian learning approach. To generate an optimal decision threshold that converts the probability outputs of the classifiers into final simultaneous failure modes, this research proposes using samples containing both single and simultaneous failure modes together with a grid search method, which is superior to traditional techniques in global optimization. Compared with other frequently used diagnostic approaches based on support vector machines and probabilistic neural networks, experimental results based on the F1-measure verify that the diagnostic accuracy and efficiency of the proposed framework, which are crucial for simultaneous failure diagnosis, are superior to those of the existing approaches. PMID:25722717
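The decision-threshold step described above can be sketched as a small grid search that turns per-mode probabilities into a set of simultaneous failure modes and scores the result with the F1-measure. This is an illustrative sketch only: the failure-mode names, the probability values, the single shared threshold, and the grid are all hypothetical, and the classifiers themselves (sparse Bayesian extreme learning machines in the paper) are replaced here by hard-coded outputs.

```python
def decode(probs, threshold):
    """Every mode whose probability reaches the threshold is reported."""
    return {mode for mode, p in probs.items() if p >= threshold}

def f1_measure(true_sets, pred_sets):
    """Micro-averaged F1 over sets of failure-mode labels."""
    tp = fp = fn = 0
    for t, p in zip(true_sets, pred_sets):
        tp += len(t & p)
        fp += len(p - t)
        fn += len(t - p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def grid_search_threshold(prob_rows, true_sets, grid):
    """Return the first threshold on the grid with the highest F1."""
    best_t, best_f1 = None, -1.0
    for t in grid:
        score = f1_measure(true_sets, [decode(row, t) for row in prob_rows])
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1

# Hypothetical classifier outputs for four samples (toy data).
prob_rows = [
    {"wear": 0.9, "crack": 0.2, "misalign": 0.1},
    {"wear": 0.7, "crack": 0.6, "misalign": 0.1},
    {"wear": 0.3, "crack": 0.8, "misalign": 0.55},
    {"wear": 0.4, "crack": 0.1, "misalign": 0.9},
]
true_sets = [{"wear"}, {"wear", "crack"}, {"crack", "misalign"}, {"misalign"}]
grid = [i / 10 for i in range(1, 10)]

best_t, best_f1 = grid_search_threshold(prob_rows, true_sets, grid)
```

The validation samples mix single and simultaneous failures, as the abstract proposes, so the threshold is tuned on the kind of output it must eventually decode.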
Skipping the real world: Classification of PolSAR images without explicit feature extraction

NASA Astrophysics Data System (ADS)

Hänsch, Ronny; Hellwich, Olaf

2018-06-01

The typical processing chain for pixel-wise classification from PolSAR images starts with an optional preprocessing step (e.g. speckle reduction), continues with extracting features projecting the complex-valued data into the real domain (e.g. by polarimetric decompositions) which are then used as input for a machine-learning based classifier, and ends in an optional postprocessing (e.g. label smoothing). The extracted features are usually hand-crafted as well as preselected and represent (a somewhat arbitrary) projection from the complex to the real domain in order to fit the requirements of standard machine-learning approaches such as Support Vector Machines or Artificial Neural Networks. This paper proposes to adapt the internal node tests of Random Forests to work directly on the complex-valued PolSAR data, which makes any explicit feature extraction obsolete. This approach leads to a classification framework with a significantly decreased computation time and memory footprint since no image features have to be computed and stored beforehand. The experimental results on one fully-polarimetric and one dual-polarimetric dataset show that, despite the simpler approach, accuracy can be maintained (decreased by only less than 2 % for the fully-polarimetric dataset) or even improved (increased by roughly 9 % for the dual-polarimetric dataset).
Spectral feature extraction of EEG signals and pattern recognition during mental tasks of 2-D cursor movements for BCI using SVM and ANN.

PubMed

Bascil, M Serdar; Tesneli, Ahmet Y; Temurtas, Feyzullah

2016-09-01

Brain computer interface (BCI) is a new way of communication between man and machine. It identifies mental task patterns stored in the electroencephalogram (EEG): it extracts brain electrical activities recorded by EEG and transforms them into machine control commands. The main goal of BCI is to make assistive environmental devices, such as computers, available to paralyzed people and to make their lives easier. This study deals with feature extraction and mental task pattern recognition for 2-D cursor control from EEG as an offline analysis approach. The hemispherical power density changes are computed and compared on alpha-beta frequency bands with only mental imagination of cursor movements. First, power spectral density (PSD) features of EEG signals are extracted, and the high dimensional data are reduced by principal component analysis (PCA) and independent component analysis (ICA), which are statistical algorithms. In the last stage, all features are classified with two types of support vector machine (SVM), linear and least squares (LS-SVM), and three different artificial neural network (ANN) structures, learning vector quantization (LVQ), multilayer neural network (MLNN) and probabilistic neural network (PNN), and mental task patterns are successfully identified via the k-fold cross validation technique.
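To make the idea of band-limited spectral power features concrete, the following sketch computes unnormalized periodogram power in the alpha and beta bands of a synthetic signal. It is only a sketch under assumptions: the abstract does not say how the PSD was estimated, so the plain discrete Fourier transform, the 128 Hz sampling rate, the band edges (8-13 Hz and 13-30 Hz), and the function name are all choices made here for illustration.

```python
import cmath
import math

def band_power(signal, fs, f_lo, f_hi):
    """Sum of periodogram power |X[k]|^2 / N over bins with
    f_lo <= frequency < f_hi (naive DFT, illustrative only)."""
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq < f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(signal))
            power += abs(coeff) ** 2 / n
    return power

fs = 128  # assumed sampling rate in Hz
# One second of a synthetic 10 Hz oscillation standing in for an EEG channel.
signal = [math.sin(2 * math.pi * 10 * i / fs) for i in range(fs)]

alpha = band_power(signal, fs, 8, 13)   # assumed alpha band
beta = band_power(signal, fs, 13, 30)   # assumed beta band
```

For the 10 Hz test signal nearly all power falls in the alpha band. Per-channel, per-band values like these would form the high dimensional feature vector that PCA or ICA then reduces.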
Real-time machine vision system using FPGA and soft-core processor

NASA Astrophysics Data System (ADS)

Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad

2012-06-01

This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27 MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel having a total latency of 13 ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of camera from the reference points was measured to be 2 ms on the MicroBlaze, running at 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.
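The timing figures quoted in this abstract can be related to each other with a few lines of arithmetic. The snippet below is a back-of-the-envelope check, not part of the paper: the variable names are invented, and the end-to-end figure assumes the hardware and software stages run one after the other for a given frame, which the abstract does not state.

```python
frame_rate_fps = 75        # video stream rate from the abstract
sensor_clock_hz = 27e6     # CMOS sensor clock
hw_latency_ms = 13         # labeling + feature extraction (FPGA)
sw_latency_ms = 2          # distance/angle computation (MicroBlaze)
cpu_clock_hz = 100e6       # MicroBlaze clock

# Time available per frame at 75 frames per second.
frame_period_ms = 1000 / frame_rate_fps

# Sensor clock cycles that elapse during one frame.
sensor_cycles_per_frame = sensor_clock_hz / frame_rate_fps

# Processor cycles spent on the distance/angle computation.
cpu_cycles = cpu_clock_hz * sw_latency_ms / 1000

# Assumption: stages are sequential for a single frame.
end_to_end_ms = hw_latency_ms + sw_latency_ms

print(f"frame period: {frame_period_ms:.2f} ms")
print(f"sensor cycles per frame: {sensor_cycles_per_frame:.0f}")
print(f"MicroBlaze cycles for pose computation: {cpu_cycles:.0f}")
print(f"end-to-end latency (sequential assumption): {end_to_end_ms} ms")
```

The 13 ms hardware latency is just under the roughly 13.3 ms frame period, which is consistent with the hardware stage keeping pace with the 75 frames per second stream.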
Introduction to the JASIST Special Topic Issue on Web Retrieval and Mining: A Machine Learning Perspective.

ERIC Educational Resources Information Center

Chen, Hsinchun

2003-01-01

Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)
Binary classification of items of interest in a repeatable process

DOEpatents

Abell, Jeffrey A.; Spicer, John Patrick; Wincek, Michael Anthony; Wang, Hui; Chakraborty, Debejyo

2014-06-24

A system includes host and learning machines in electrical communication with sensors positioned with respect to an item of interest, e.g., a weld, and memory. The host executes instructions from memory to predict a binary quality status of the item. The learning machine receives signals from the sensor(s), identifies candidate features, and extracts features from the candidates that are more predictive of the binary quality status relative to other candidate features. The learning machine maps the extracted features to a dimensional space that includes most of the items from a passing binary class and excludes all or most of the items from a failing binary class. The host also compares the received signals for a subsequent item of interest to the dimensional space to thereby predict, in real time, the binary quality status of the subsequent item of interest.
Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features.

PubMed

Nikfarjam, Azadeh; Sarker, Abeed; O'Connor, Karen; Ginn, Rachel; Gonzalez, Graciela

2015-05-01

Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
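The word-cluster feature described in this abstract can be pictured as a lookup from each token to the identifier of its embedding cluster, used alongside ordinary lexical features in a sequence labeler. The sketch below is a hypothetical illustration and not ADRMine's actual feature set: the cluster identifiers, the toy vocabulary, the feature names, and the one-token context window are all assumptions, and the clustering of embeddings is taken as already done.

```python
def token_features(tokens, i, word_to_cluster):
    """Feature dictionary for token i, as might be fed to a CRF.
    Cluster ids generalize over words with similar embeddings."""
    def cluster(j):
        if 0 <= j < len(tokens):
            return word_to_cluster.get(tokens[j].lower(), "UNK")
        return "PAD"  # position outside the sentence
    return {
        "word": tokens[i].lower(),
        "cluster": cluster(i),
        "prev_cluster": cluster(i - 1),
        "next_cluster": cluster(i + 1),
    }

# Hypothetical cluster assignments (in practice derived by clustering
# word embeddings trained on unlabeled user posts).
word_to_cluster = {
    "sleepy": "c17", "drowsy": "c17", "tired": "c17",
    "drug": "c03",
}

tokens = ["This", "drug", "made", "me", "sleepy"]
features = token_features(tokens, 4, word_to_cluster)
```

Because "sleepy", "drowsy", and "tired" share a cluster id, a model that has seen only one of them labeled as an adverse reaction can still recognize the others. This is why such features reduce the need for large annotated training sets.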
Epileptic seizure detection in EEG signal using machine learning techniques.

PubMed

Jaiswal, Abeg Kumar; Banka, Haider

2018-03-01

Epilepsy is a well-known nervous system disorder characterized by seizures. Electroencephalograms (EEGs), which capture brain neural activity, can detect epilepsy. Traditional methods for analyzing an EEG signal for epileptic seizure detection are time-consuming. Recently, several automated seizure detection frameworks using machine learning techniques have been proposed to replace these traditional methods. The two basic steps involved in machine learning are feature extraction and classification. Feature extraction reduces the input pattern space by keeping informative features and the classifier assigns the appropriate class label. In this paper, we propose two effective approaches involving subpattern based PCA (SpPCA) and cross-subpattern correlation-based PCA (SubXPCA) with Support Vector Machine (SVM) for automated seizure detection in EEG signals. Feature extraction was performed using SpPCA and SubXPCA. Both techniques explore the subpattern correlation of EEG signals, which helps in the decision-making process. SVM is used for classification of seizure and non-seizure EEG signals. The SVM was trained with a radial basis kernel. All the experiments have been carried out on the benchmark epilepsy EEG dataset. The entire dataset consists of 500 EEG signals recorded under different scenarios. Seven different experimental cases for classification have been conducted. The classification accuracy was evaluated using tenfold cross validation. The classification results of the proposed approaches have been compared with the results of some existing techniques proposed in the literature to establish the claim.
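The tenfold cross validation used for evaluation can be sketched as a partition of the 500 signal indices into ten disjoint test folds. This is a generic illustration rather than the authors' protocol: the shuffling, the seed, and the function name are assumptions, and feature extraction and SVM training are left out.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)           # assumed: shuffle once
    folds = [idx[i::k] for i in range(k)]      # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test

# 500 signals, as in the benchmark dataset described above.
splits = list(k_fold_indices(500, k=10))
```

Each signal appears in exactly one test fold, so the accuracy averaged over the ten folds uses every signal for testing once while training on the remaining 450.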
ClearTK 2.0: Design Patterns for Machine Learning in UIMA

PubMed Central

Bethard, Steven; Ogren, Philip; Becker, Lee

2014-01-01

ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework. PMID:29104966
Machine vision extracted plant movement for early detection of plant water stress.

PubMed

Kacira, M; Ling, P P; Short, T H

2002-01-01

A methodology was established for early, non-contact, and quantitative detection of plant water stress with machine vision extracted plant features. Top-projected canopy area (TPCA) of the plants was extracted from plant images using image-processing techniques. Water-stress-induced plant movement was decoupled from plant diurnal movement and plant growth using the coefficient of relative variation of TPCA (CRV[TPCA]), which was found to be an effective marker for water stress detection. The threshold value of CRV[TPCA] as an indicator of water stress was determined by a parametric approach. The effectiveness of the sensing technique was evaluated against the timing of stress detection by an operator. Results of this study suggested that plant water stress detection using projected canopy area based features of the plants was feasible.

Computational consciousness: building a self-preserving organism.

PubMed

Barros, Allan Kardec

2010-01-01

Consciousness has been a subject of growing interest among the neuroscience community. However, building machine models of it is quite challenging, as it involves many characteristics and properties of the human brain which are poorly defined or are very abstract. Here I propose to use information theory (IT) to give a mathematical framework to understand consciousness. For this reason, I used the term "computational". This work is grounded on some recent results on the use of IT to understand how the cortex codes information, where redundancy reduction plays a fundamental role. Basically, I propose a system, here called "organism", whose strategy is to extract the maximal amount of information from the environment in order to survive. To highlight the proposed framework, I show a simple organism composed of a single neuron which adapts itself to the outside dynamics by taking into account its internal state, whose perception is understood here to be related to "feelings".
An Interactive Simulation System for Modeling Stands, Harvests, and Machines

Treesearch

Jingxin Wang; W. Dale Greene

1999-01-01

An interactive computer simulation program models stands, harvest, and machine factors and evaluates their interactions while performing felling, skidding, or forwarding activities. A stand generator allows the user to generate either natural or planted stands. Fellings with chainsaw, drive-to-tree feller-bunchers, or harvesters and extraction with grapple skidders or...
Detection of distorted frames in retinal video-sequences via machine learning

NASA Astrophysics Data System (ADS)

Kolar, Radim; Liberdova, Ivana; Odstrcilik, Jan; Hracho, Michal; Tornow, Ralf P.

2017-07-01

This paper describes the detection of distorted frames in retinal sequences based on a set of global features extracted from each frame. The feature vector is subsequently used in the classification step, in which three types of classifiers are tested. The best classification accuracy, 96%, was achieved with the support vector machine approach.
Energy Survey of Machine Tools: Separating Power Information of the Main Transmission System During Machining Process

NASA Astrophysics Data System (ADS)

Liu, Shuang; Liu, Fei; Hu, Shaohua; Yin, Zhenbiao

The major power information of the main transmission system in machine tools (MTSMT) during the machining process includes effective output power (i.e. cutting power), input power and power loss from the mechanical transmission system, and the main motor power loss. This information is easy to obtain in the lab but difficult to evaluate in a manufacturing process. To solve this problem, a separation method is proposed here to extract the MTSMT power information during the machining process. In this method, the energy flow and the mathematical models of the major power information of MTSMT during the machining process are set up first. Based on the mathematical models and the basic data tables obtained from experiments, the above-mentioned power information during the machining process can be separated just by measuring the real-time total input power of the spindle motor. The operation program of this method is also given.
Human Systems Integration (HSI) Associated Development Activities in Japan

DTIC Science & Technology

2008-06-12

machine learning and data mining methods. The continuous effort ( KAIZEN ) to improve the analysis phases are illustrated in Figure 14. Although there...model Extraction of a workflow Extraction of a control rule Variation analysis and improvement Plant operation KAIZEN Fig. 14
Automated measurement of mouse social behaviors using depth sensing, video tracking, and machine learning.

PubMed

Hong, Weizhe; Kennedy, Ann; Burgos-Artizzu, Xavier P; Zelikowsky, Moriel; Navonne, Santiago G; Perona, Pietro; Anderson, David J

2015-09-22

A lack of automated, quantitative, and accurate assessment of social behaviors in mammalian animal models has limited progress toward understanding mechanisms underlying social interactions and their disorders such as autism. Here we present a new integrated hardware and software system that combines video tracking, depth sensing, and machine learning for automatic detection and quantification of social behaviors involving close and dynamic interactions between two mice of different coat colors in their home cage. We designed a hardware setup that integrates traditional video cameras with a depth camera, developed computer vision tools to extract the body "pose" of individual animals in a social context, and used a supervised learning algorithm to classify several well-described social behaviors. We validated the robustness of the automated classifiers in various experimental settings and used them to examine how genetic background, such as that of Black and Tan Brachyury (BTBR) mice (a previously reported autism model), influences social behavior. Our integrated approach allows for rapid, automated measurement of social behaviors across diverse experimental designs and also affords the ability to develop new, objective behavioral metrics.
Microstructure, Morphology, and Nanomechanical Properties Near Fine Holes Produced by Electro-Discharge Machining

NASA Astrophysics Data System (ADS)

Blau, P. J.; Howe, J. Y.; Coffey, D. W.; Trejo, R. M.; Kenik, E. D.; Jolly, B. C.; Yang, N.

2012-08-01

Fine holes in metal alloys are employed for many important technological purposes, including cooling and the precise atomization of liquids. For example, they play an important role in the metering and delivery of fuel to the combustion chambers in energy-efficient, low-emission diesel engines. Electro-discharge machining (EDM) is one process employed to produce such holes. Since the hole shape and bore morphology can affect fluid flow, and holes also represent structural discontinuities in the tips of the spray nozzles, it is important to understand the microstructures adjacent to these holes, the features of the hole walls, and the nanomechanical properties of the material that was in some manner altered by the EDM hole-making process. Several techniques were used to characterize the structure and properties of spray-holes in a commercial injector nozzle. These include scanning electron microscopy, cross sectioning and metallographic etching, bore surface roughness measurements by optical interferometry, scanning electron microscopy, and transmission electron microscopy of recast EDM layers extracted with the help of a focused ion beam.
A Markovian engine for a biological energy transducer: the catalytic wheel.

PubMed

Tsong, Tian Yow; Chang, Cheng-Hung

2007-04-01

The molecular machines in biological cells are made of proteins, DNAs and other classes of molecules. The structures of these molecules are characteristically "soft", highly flexible, and yet their interactions with other molecules or ions are specific and selective. This chapter discusses a prevalent form, the catalytic wheel, or the energy transducer of cells, examines its mechanism of action, and extracts from it a set of simple but general rules for understanding the energetics of the biomolecular devices. These rules should also benefit the design of manmade nanometer-scale machines such as rotary motors or track-guided linear transporters. We will focus on an electric work that, by matching system dynamics and then enhancing the conformational fluctuation of one or several driver proteins, converts stochastic input of energy into rotation or locomotion of a receptor protein. The spatial (or barrier) and temporal symmetry breakings required for selected driver/receptor combinations are examined. This electric ratchet consists of a core engine that follows Markovian dynamics, alleviates difficulties encountered in rigid mechanical models, and is tailored to the soft-matter characteristics of the biomolecules.
Automated measurement of mouse social behaviors using depth sensing, video tracking, and machine learning

PubMed Central

Hong, Weizhe; Kennedy, Ann; Burgos-Artizzu, Xavier P.; Zelikowsky, Moriel; Navonne, Santiago G.; Perona, Pietro; Anderson, David J.

2015-01-01

A lack of automated, quantitative, and accurate assessment of social behaviors in mammalian animal models has limited progress toward understanding mechanisms underlying social interactions and their disorders such as autism. Here we present a new integrated hardware and software system that combines video tracking, depth sensing, and machine learning for automatic detection and quantification of social behaviors involving close and dynamic interactions between two mice of different coat colors in their home cage. We designed a hardware setup that integrates traditional video cameras with a depth camera, developed computer vision tools to extract the body “pose” of individual animals in a social context, and used a supervised learning algorithm to classify several well-described social behaviors. We validated the robustness of the automated classifiers in various experimental settings and used them to examine how genetic background, such as that of Black and Tan Brachyury (BTBR) mice (a previously reported autism model), influences social behavior. Our integrated approach allows for rapid, automated measurement of social behaviors across diverse experimental designs and also affords the ability to develop new, objective behavioral metrics. PMID:26354123
Predicting the Valence of a Scene from Observers’ Eye Movements

PubMed Central

R.-Tavakoli, Hamed; Atyabi, Adham; Rantanen, Antti; Laukka, Seppo J.; Nefti-Meziani, Samia; Heikkilä, Janne

2015-01-01

Multimedia analysis benefits from understanding the emotional content of a scene in a variety of tasks such as video genre classification and content-based image retrieval. Recently, there has been an increasing interest in applying human bio-signals, particularly eye movements, to recognize the emotional gist of a scene such as its valence. In order to determine the emotional category of images using eye movements, the existing methods often learn a classifier using several features that are extracted from eye movements. Although it has been shown that eye movement is potentially useful for recognition of scene valence, the contribution of each feature is not well studied. To address the issue, we study the contribution of features extracted from eye movements in the classification of images into pleasant, neutral, and unpleasant categories. We assess ten features and their fusion. The features are histogram of saccade orientation, histogram of saccade slope, histogram of saccade length, histogram of saccade duration, histogram of saccade velocity, histogram of fixation duration, fixation histogram, top-ten salient coordinates, and saliency map. We use a machine learning approach to analyze the performance of the features by learning a support vector machine and exploiting various feature fusion schemes. The experiments reveal that ‘saliency map’, ‘fixation histogram’, ‘histogram of fixation duration’, and ‘histogram of saccade slope’ are the most contributing features. The selected features signify the influence of fixation information and angular behavior of eye movements in the recognition of the valence of images. PMID:26407322
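Two of the listed features, the histograms of saccade orientation and saccade length, can be sketched from a sequence of fixation coordinates. The bin count, the `max_len` cap, and the function name are assumptions made for illustration; the abstract does not give the authors' binning.

```python
import math
from typing import List, Tuple

def saccade_histograms(fixations: List[Tuple[float, float]],
                       n_bins: int = 4, max_len: float = 400.0):
    """Return (orientation_hist, length_hist) for saccades between consecutive fixations."""
    orient = [0] * n_bins
    length = [0] * n_bins
    for (x0, y0), (x1, y1) in zip(fixations, fixations[1:]):
        dx, dy = x1 - x0, y1 - y0
        angle = math.atan2(dy, dx) % (2 * math.pi)  # orientation in [0, 2*pi)
        dist = math.hypot(dx, dy)                   # saccade amplitude
        # Clamp to the last bin so values at or beyond the upper edge stay in range.
        orient[min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)] += 1
        length[min(int(dist / max_len * n_bins), n_bins - 1)] += 1
    return orient, length

orient_hist, length_hist = saccade_histograms([(0, 0), (100, 10), (90, 110), (-10, 100)])
```

Concatenating such histograms gives a fixed-length feature vector per image, which is the form a support vector machine expects.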
A machine learning approach to galaxy-LSS classification - I. Imprints on halo merger trees

NASA Astrophysics Data System (ADS)

Hui, Jianan; Aragon, Miguel; Cui, Xinping; Flegal, James M.

2018-04-01

The cosmic web plays a major role in the formation and evolution of galaxies and defines, to a large extent, their properties. However, the relation between galaxies and environment is still not well understood. Here, we present a machine learning approach to study imprints of environmental effects on the mass assembly of haloes. We present a galaxy-LSS machine learning classifier based on galaxy properties sensitive to the environment. We then use the classifier to assess the relevance of each property. Correlations between galaxy properties and their cosmic environment can be used to predict galaxy membership to void/wall or filament/cluster with an accuracy of 93 per cent. Our study unveils environmental information encoded in properties of haloes not normally considered directly dependent on the cosmic environment such as merger history and complexity. Understanding the physical mechanism by which the cosmic web is imprinted in a halo can lead to significant improvements in galaxy formation models. This is accomplished by extracting features from galaxy properties and merger trees, computing a score for each feature, and then applying a support vector machine (SVM) to different feature sets. To this end, we have discovered that the shape and depth of the merger tree, formation time, and density of the galaxy are strongly associated with the cosmic environment. We describe a significant improvement in the original classification algorithm by performing LU decomposition of the distance matrix computed from the feature vectors and then using the output of the decomposition as input vectors for the SVM.
Extraction of espresso coffee by using gradient of temperature. Effect on physicochemical and sensorial characteristics of espresso.

PubMed

Salamanca, C Alejandra; Fiol, Núria; González, Carlos; Saez, Marc; Villaescusa, Isabel

2017-01-01

Espresso extraction is generally carried out at a fixed temperature within the range 85-95°C. In this work the extraction of the espressos was made in a new generation coffee machine that enables temperature profiling of the brewing water. The effect of using gradient of temperature to brew espressos on physicochemical and sensorial characteristics of the beverage has been investigated. Three different extraction temperature profiles were tested: updrawn gradient (88-93°C), downdrawn gradient (93-88°C) and fixed temperature (90°C). The coffee species investigated were Robusta, Arabica natural and Washed Arabica. Results proved that the use of gradient temperature for brewing espressos allows increasing or decreasing the extraction of some chemical compounds from coffee grounds. Moreover an appropriate gradient of temperature can highlight or hide some sensorial attributes. In conclusion, the possibility of programming gradient of temperature in the coffee machines recently introduced in the market opens new expectations in the field of espresso brewing. Copyright © 2016 Elsevier Ltd. All rights reserved.
Using decision-tree classifier systems to extract knowledge from databases

NASA Technical Reports Server (NTRS)

St.clair, D. C.; Sabharwal, C. L.; Hacke, Keith; Bond, W. E.

1990-01-01

One difficulty in applying artificial intelligence techniques to the solution of real world problems is that the development and maintenance of many AI systems, such as those used in diagnostics, require large amounts of human resources. At the same time, databases frequently exist which contain information about the process(es) of interest. Recently, efforts to reduce development and maintenance costs of AI systems have focused on using machine learning techniques to extract knowledge from existing databases. Research is described in the area of knowledge extraction using a class of machine learning techniques called decision-tree classifier systems. Results of this research suggest ways of performing knowledge extraction which may be applied in numerous situations. In addition, a measurement called the concept strength metric (CSM) is described which can be used to determine how well the resulting decision tree can differentiate between the concepts it has learned. The CSM can be used to determine whether or not additional knowledge needs to be extracted from the database. An experiment involving real world data is presented to illustrate the concepts described.
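As a rough illustration of how a decision-tree learner pulls knowledge out of database records, the sketch below scores attributes by information gain, the ID3-style splitting criterion. This is a swapped-in textbook technique, not the classifier system or the concept strength metric from the report, and the records and attribute names are invented.

```python
import math
from collections import Counter
from typing import Dict, List

def entropy(labels: List[str]) -> float:
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows: List[Dict[str, str]], attribute: str, target: str) -> float:
    """Entropy reduction in `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Invented diagnostic records for illustration only.
records = [
    {"vibration": "high", "temp": "hot", "fault": "yes"},
    {"vibration": "high", "temp": "cold", "fault": "yes"},
    {"vibration": "low", "temp": "hot", "fault": "no"},
    {"vibration": "low", "temp": "cold", "fault": "no"},
]
best = max(["vibration", "temp"], key=lambda a: information_gain(records, a, "fault"))
```

The attribute with the highest gain becomes the root test of the tree, and the same scoring is repeated on each branch's subset of records.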
Understanding and Writing G & M Code for CNC Machines

ERIC Educational Resources Information Center

Loveland, Thomas

2012-01-01

In modern CAD and CAM manufacturing companies, engineers design parts for machines and consumable goods. Many of these parts are cut on CNC machines. Whether using a CNC lathe, milling machine, or router, the ideas and designs of engineers must be translated into a machine-readable form called G & M Code that can be used to cut parts to precise…
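To make the idea of a machine-readable cutting program concrete, here is a small sketch that emits G & M code for a square outline. It uses only widely supported words (G21, G90, G0, G1, M3, M5, M30), but controller dialects differ, so the safe height, number formatting, and the function name `square_gcode` are assumptions and the output is not meant to be run on a machine as-is.

```python
from typing import List

def square_gcode(side_mm: float, depth_mm: float,
                 feed_mm_min: float, spindle_rpm: int) -> List[str]:
    """Return G & M code lines that cut a square outline starting at X0 Y0."""
    corners = [(side_mm, 0.0), (side_mm, side_mm), (0.0, side_mm), (0.0, 0.0)]
    lines = [
        "G21",                      # units: millimetres
        "G90",                      # absolute positioning
        f"M3 S{spindle_rpm}",       # spindle on, clockwise
        "G0 X0.000 Y0.000 Z5.000",  # rapid move to an assumed safe height
        f"G1 Z{-depth_mm:.3f} F{feed_mm_min:.0f}",  # plunge to cutting depth
    ]
    lines += [f"G1 X{x:.3f} Y{y:.3f}" for x, y in corners]  # cut the four sides
    lines += ["G0 Z5.000", "M5", "M30"]  # retract, spindle off, end of program
    return lines

program = square_gcode(side_mm=20, depth_mm=1.5, feed_mm_min=300, spindle_rpm=8000)
```

G words describe motion and geometry, while M words switch machine functions such as the spindle, which is why the two families appear together in one program.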
Extraction of angle deterministic signals in the presence of stationary speed fluctuations with cyclostationary blind source separation

NASA Astrophysics Data System (ADS)

Delvecchio, S.; Antoni, J.

2012-02-01

This paper addresses the use of a cyclostationary blind source separation algorithm (namely RRCR) to extract angle deterministic signals from mechanical rotating machines in the presence of stationary speed fluctuations. This means that only phase fluctuations while the machine is running in steady-state conditions are considered, while run-up or run-down speed variations are not taken into account. The machine is also supposed to run in idle conditions, so non-stationary phenomena due to the load are not considered. It is theoretically assessed that in such operating conditions the deterministic (periodic) signal in the angle domain becomes cyclostationary at first and second orders in the time domain. This fact justifies the use of the RRCR algorithm, which is able to directly extract the angle deterministic signal from the time domain without performing any kind of interpolation. This is particularly valuable when angular resampling fails because of uncontrolled speed fluctuations. The capability of the proposed approach is verified by means of simulated and actual vibration signals captured on a pneumatic screwdriver handle. In this particular case, not only can the angle deterministic part be extracted, but the main sources of excitation affecting the user's hand during operation (i.e. motor shaft imbalance, epicycloidal gear meshing and air pressure forces) can also be separated.
Complex temporal topic evolution modelling using the Kullback-Leibler divergence and the Bhattacharyya distance.

PubMed

Andrei, Victor; Arandjelović, Ognjen

2016-12-01

The rapidly expanding corpus of medical research literature presents major challenges in the understanding of previous work, the extraction of maximum information from collected data, and the identification of promising research directions. We present a case for the use of advanced machine learning techniques as an aid in this task and introduce a novel methodology that is shown to be capable of extracting meaningful information from large longitudinal corpora and of tracking complex temporal changes within it. Our framework is based on (i) the discretization of time into epochs, (ii) epoch-wise topic discovery using a hierarchical Dirichlet process-based model, and (iii) a temporal similarity graph which allows for the modelling of complex topic changes. More specifically, this is the first work that discusses and distinguishes between two groups of particularly challenging topic evolution phenomena: topic splitting and speciation and topic convergence and merging, in addition to the more widely recognized emergence and disappearance and gradual evolution. The proposed framework is evaluated on a public medical literature corpus.
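The two measures named in the title can be written down directly for discrete topic-word distributions. The sketch below uses the standard definitions; the example distributions and variable names are invented, and how the authors turn these values into edges of the temporal similarity graph is not stated in the abstract.

```python
import math
from typing import Sequence

def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def bhattacharyya_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Negative log of the Bhattacharyya coefficient sum(sqrt(p_i * q_i))."""
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(coefficient)

# Invented word distributions for one topic in two consecutive epochs.
topic_epoch_a = [0.5, 0.3, 0.2]
topic_epoch_b = [0.4, 0.4, 0.2]

forward = kl_divergence(topic_epoch_a, topic_epoch_b)
backward = kl_divergence(topic_epoch_b, topic_epoch_a)
symmetric = bhattacharyya_distance(topic_epoch_a, topic_epoch_b)
```

The Kullback-Leibler divergence is asymmetric, so `forward` and `backward` differ, whereas the Bhattacharyya distance gives the same value in both directions.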
Bayesian decoding using unsorted spikes in the rat hippocampus

PubMed Central

Layton, Stuart P.; Chen, Zhe; Wilson, Matthew A.

2013-01-01

A fundamental task in neuroscience is to understand how neural ensembles represent information. Population decoding is a useful tool to extract information from neuronal populations based on the ensemble spiking activity. We propose a novel Bayesian decoding paradigm to decode unsorted spikes in the rat hippocampus. Our approach uses a direct mapping between spike waveform features and covariates of interest and avoids accumulation of spike sorting errors. Our decoding paradigm is nonparametric, encoding model-free for representing stimuli, and extracts information from all available spikes and their waveform features. We apply the proposed Bayesian decoding algorithm to a position reconstruction task for freely behaving rats based on tetrode recordings of rat hippocampal neuronal activity. Our detailed decoding analyses demonstrate that our approach is efficient and better utilizes the available information in the nonsortable hash than the standard sorting-based decoding algorithm. Our approach can be adapted to an online encoding/decoding framework for applications that require real-time decoding, such as brain-machine interfaces. PMID:24089403
Non-stationary signal analysis based on general parameterized time-frequency transform and its application in the feature extraction of a rotary machine

NASA Astrophysics Data System (ADS)

Zhou, Peng; Peng, Zhike; Chen, Shiqian; Yang, Yang; Zhang, Wenming

2018-06-01

With the development of large rotary machines for faster and more integrated performance, their condition monitoring and fault diagnosis are becoming more challenging. Since the time-frequency (TF) pattern of the vibration signal from a rotary machine often contains condition information and fault features, methods based on TF analysis have been widely used to solve these two problems in the industrial community. This article introduces an effective non-stationary signal analysis method based on the general parameterized time-frequency transform (GPTFT). The GPTFT is achieved by inserting a rotation operator and a shift operator into the short-time Fourier transform. This method can produce a highly concentrated TF pattern with a general kernel. A multi-component instantaneous frequency (IF) extraction method is proposed based on it. The IF of every component is estimated by defining a spectrum concentration index (SCI), and this IF estimation process is iterated until all the components are extracted. Tests on three simulation examples and a real vibration signal demonstrate the effectiveness and superiority of our method.
Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image

PubMed Central

Zhong, Xiaomei; Li, Jianping; Dou, Huacheng; Deng, Shijun; Wang, Guofei; Jiang, Yu; Wang, Yongjie; Zhou, Zebing; Wang, Li; Yan, Fei

2013-01-01

Remote sensing technologies are now widely employed in the dynamic monitoring of land. This paper presents an algorithm named fuzzy nonlinear proximal support vector machine (FNPSVM) based on ETM+ remote sensing images. The algorithm is applied to extract various land types in the city of Da’an in northern China. Two multi-category strategies for this algorithm, namely “one-against-one” and “one-against-rest”, are described in detail and then compared. A fuzzy membership function is presented to reduce the effects of noise or outliers on the data samples. The approaches to feature extraction, feature selection, and several key parameter settings are also given. Numerous experiments were carried out to evaluate its performance, including accuracy (overall accuracy and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared to three other classifiers, namely the maximum likelihood classifier (MLC), back propagation neural network (BPN), and the proximal support vector machine (PSVM), under different training conditions. The impacts of the selection of training samples, testing samples and features on the four classifiers were also evaluated in these experiments. PMID:23936016
Machine fault feature extraction based on intrinsic mode functions

NASA Astrophysics Data System (ADS)

Fan, Xianfeng; Zuo, Ming J.

2008-04-01

This work employs empirical mode decomposition (EMD) to decompose raw vibration signals into intrinsic mode functions (IMFs) that represent the oscillatory modes generated by the components that make up the mechanical systems generating the vibration signals. The motivation here is to develop vibration signal analysis programs that are self-adaptive and that can detect machine faults at the earliest onset of deterioration. The change in velocity of the amplitude of some IMFs over a particular unit time will increase when the vibration is stimulated by a component fault. Therefore, the amplitude acceleration energy in the intrinsic mode functions is proposed as an indicator of the impulsive features that are often associated with mechanical component faults. The periodicity of the amplitude acceleration energy for each IMF is extracted by spectrum analysis. A spectrum amplitude index is introduced as a method to select the optimal result. A comparison study of the method proposed here and some well-established techniques for detecting machinery faults is conducted through the analysis of both gear and bearing vibration signals. The results indicate that the proposed method has superior capability to extract machine fault features from vibration signals.

Triadic split-merge sampler

NASA Astrophysics Data System (ADS)

van Rossum, Anne C.; Lin, Hai Xiang; Dubbeldam, Johan; van der Herik, H. Jaap

2018-04-01

In machine vision, typical heuristic methods to extract parameterized objects out of raw data points are the Hough transform and RANSAC. Bayesian models carry the promise to optimally extract such parameterized objects given a correct definition of the model and the type of noise at hand. A category of solvers for Bayesian models are Markov chain Monte Carlo (MCMC) methods. Naive implementations of MCMC methods suffer from slow convergence in machine vision due to the complexity of the parameter space. To address this, blocked Gibbs and split-merge samplers have been developed that assign multiple data points to clusters at once. In this paper we introduce a new split-merge sampler, the triadic split-merge sampler, that performs steps between two and three randomly chosen clusters. This has two advantages. First, it reduces the asymmetry between the split and merge steps. Second, it is able to propose a new cluster that is composed of data points from two different clusters. Both advantages speed up convergence, which we demonstrate on a line extraction problem. We show that the triadic split-merge sampler outperforms the conventional split-merge sampler. Although this new MCMC sampler is demonstrated in this machine vision context, its application extends to the very general domain of statistical inference.
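For context, the RANSAC heuristic that the abstract names as a baseline for line extraction can be sketched in a few lines. This is not the triadic split-merge sampler itself; the tolerance, iteration count, seed, and test points are assumptions chosen for illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]

def ransac_line(points: List[Point], n_iter: int = 200, tol: float = 0.5, seed: int = 0):
    """Return ((a, b, c), inliers) for the line a*x + b*y + c = 0 with the most inliers."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)  # minimal sample: two points
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # identical points define no line
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # With (a, b) normalized, |a*x + b*y + c| is the point-to-line distance.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= tol]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers

line_points = [(float(x), 2.0 * x + 1.0) for x in range(10)]  # points on y = 2x + 1
outliers = [(1.0, 9.0), (4.0, -3.0), (8.0, 2.0)]
line, inliers = ransac_line(line_points + outliers)
```

RANSAC returns a single best line and no measure of uncertainty, which is the kind of limitation that motivates the Bayesian treatment described in the abstract.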
Research on intelligent machine self-perception method based on LSTM

NASA Astrophysics Data System (ADS)

Wang, Qiang; Cheng, Tao

2018-05-01

In this paper, we exploit the advantages of LSTM in feature extraction and in processing high-dimensional, complex nonlinear data, and apply it to the autonomous perception of intelligent machines. Compared with the traditional multi-layer neural network, this model has memory and can handle time series information of any length. Since the multi-physical domain signals of processing machines have a certain timing relationship, and there is a contextual relationship between successive states, using this deep learning method to realize the self-perception of intelligent processing machines has strong versatility and adaptability. The experimental results show that the method proposed in this paper can clearly improve the sensing accuracy under various working conditions of the intelligent machine, and also show that the algorithm can support the intelligent processing machine well in realizing self-perception.
Research on bearing fault diagnosis of large machinery based on mathematical morphology

NASA Astrophysics Data System (ADS)

Wang, Yu

2018-04-01

To study the automatic diagnosis of large machinery faults based on the support vector machine, the four common faults of large machinery are considered, and the support vector machine is used to classify and identify them. The extracted feature vectors are taken as input and are trained and identified by a multi-classification method. The optimal parameters of the support vector machine are searched by trial and error and by cross validation. Then, the support vector machine is compared with a BP neural network. The results show that the support vector machine requires less time and achieves high classification accuracy, making it more suitable for fault diagnosis research in large machinery. Therefore, it can be concluded that the training speed of support vector machines (SVM) is fast and the performance is good.
A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.

PubMed

S K, Somasundaram; P, Alli

2017-11-09

Diabetic retinopathy (DR), a retinal vascular disease, is a major complication of diabetes and can lead to blindness. Regular screening for early DR detection is labor- and resource-intensive, so automatic detection using computational techniques is an attractive solution. An automatic method is more reliable for determining the presence of an abnormality in fundus images (FI), but the classification step often performs poorly. Recently, a few research works have analyzed texture discrimination capacity in FI to distinguish healthy images. However, feature extraction (FE) has not performed well because of the high dimensionality. Therefore, a machine learning and ensemble classification method, called the Machine Learning Bagging Ensemble Classifier (ML-BEC), is designed to identify retinal features for DR diagnosis and early detection. The ML-BEC method comprises two stages. The first stage extracts candidate objects from retinal images (RI). The candidate objects, or features, for DR diagnosis include blood vessels, optic nerve, neural tissue, neuroretinal rim, optic disc size, thickness and variance. These features are initially extracted by applying a machine learning technique called t-distributed Stochastic Neighbor Embedding (t-SNE). t-SNE generates a probability distribution across high-dimensional images in which the images are separated into similar and dissimilar pairs. Then, t-SNE describes a similar probability distribution across the points in the low-dimensional map. This minimizes the Kullback-Leibler divergence between the two distributions with respect to the locations of the points on the map. The second stage applies ensemble classifiers to the extracted features to provide accurate analysis of digital FI using machine learning. In this stage, an automatic DR screening system using a Bagging Ensemble Classifier (BEC) is investigated. Through the voting process in ML-BEC, bagging minimizes the error due to the variance of the base classifier. Using publicly available retinal image databases, our classifier is trained with 25% of RI. Results show that the ensemble classifier can achieve better classification accuracy (CA) than single classification models. Empirical experiments suggest that the machine learning-based ensemble classifier is efficient for further reducing DR classification time (CT).
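The bagging-and-voting idea in the abstract above can be illustrated with a minimal sketch. It is not the ML-BEC implementation: the base learner (a one-feature threshold "stump"), the function names, and the toy data are all assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a threshold classifier on (x, label) pairs: predict 1 when x >= threshold.

    Hypothetical base learner; the abstract does not name the one used in ML-BEC.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        # Count training samples this candidate threshold would misclassify.
        errors = sum((x >= threshold) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(samples, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def bagging_predict(models, x):
    """Majority vote over the stumps; an odd n_models avoids ties."""
    votes = Counter(int(x >= threshold) for threshold in models)
    return votes.most_common(1)[0][0]


# Toy one-dimensional feature: label 0 below 0.5, label 1 above.
toy_data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
            (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
ensemble = bagging_fit(toy_data)
```

Each stump sees a different resample, so individual thresholds vary; averaging their votes is what reduces the variance of the combined prediction.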
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

PubMed

Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

2013-01-01

The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.
Nonlinear features for classification and pose estimation of machined parts from single views

NASA Astrophysics Data System (ADS)

Talukder, Ashit; Casasent, David P.

1998-10-01

A new nonlinear feature extraction method is presented for classification and pose estimation of objects from single views. The feature extraction method is called the maximum representation and discrimination feature (MRDF) method. The nonlinear MRDF transformations to use are obtained in closed form, and offer significant advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We consider MRDFs on image data, provide a new 2-stage nonlinear MRDF solution, and show it specializes to well-known linear and nonlinear image processing transforms under certain conditions. We show the use of MRDF in estimating the class and pose of images of rendered solid CAD models of machine parts from single views using a feature-space trajectory neural network classifier. We show new results with better classification and pose estimation accuracy than are achieved by standard principal component analysis and Fukunaga-Koontz feature extraction methods.
Predicting pork loin intramuscular fat using computer vision system.

PubMed

Liu, J-H; Sun, X; Young, J M; Bachmeier, L A; Newman, D J

2018-09-01

The objective of this study was to investigate the ability of a computer vision system to predict pork intramuscular fat percentage (IMF%). Center-cut loin samples (n = 85) were trimmed of subcutaneous fat and connective tissue. Images were acquired and pixels were segregated to estimate image IMF% and 18 image color features for each image. Subjective IMF% was determined by a trained grader. Ether extract IMF% was calculated using the ether extract method. Image color features and image IMF% were used as predictors for stepwise regression and support vector machine models. Results showed that subjective IMF% had a correlation of 0.81 with ether extract IMF% while the image IMF% had a 0.66 correlation with ether extract IMF%. Accuracy rates for regression models were 0.63 for stepwise and 0.75 for support vector machine. Although subjective IMF% was shown to have better prediction, results from the computer vision system demonstrate its potential as a tool for predicting pork IMF% in the future. Copyright © 2018 Elsevier Ltd. All rights reserved.
Analysis in Motion Initiative – Human Machine Intelligence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blaha, Leslie

As computers and machines become more pervasive in our everyday lives, we are looking for ways for humans and machines to work more intelligently together. How can we help machines understand their users so the team can do smarter things together? The Analysis in Motion Initiative is advancing the science of human machine intelligence — creating human-machine teams that work better together to make correct, useful, and timely interpretations of data.
Signature Verification Using N-tuple Learning Machine.

PubMed

Maneechot, Thanin; Kitjaidure, Yuttana

2005-01-01

This research presents a new algorithm for signature verification using an N-tuple learning machine. The features are taken from handwritten signatures on a digital tablet (on-line). This research develops a recognition algorithm using four extracted features, namely horizontal and vertical pen tip position (x-y position), pen tip pressure, and pen altitude angles. Verification uses the N-tuple technique with Gaussian thresholding.
Kernel machines for epilepsy diagnosis via EEG signal classification: a comparative study.

PubMed

Lima, Clodoaldo A M; Coelho, André L V

2011-10-01

We carry out a systematic assessment on a suite of kernel-based learning machines while coping with the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of the criteria of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely, Gaussian and exponential radial basis functions) were considered as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted. Four wavelet basis functions were considered in this study. Then, we provide the average accuracy (i.e., cross-validation error) values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated models of the standard and least squares SVMs reached 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations whereby one can visually inspect their levels of sensitiveness to the type of feature and to the kernel function/parameter value. 
Overall, the results evidence that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs prevailing more consistently. Moreover, the choice of the kernel function and parameter value as well as the choice of the feature extractor are critical decisions to be taken, albeit the choice of the wavelet family seems not to be so relevant. Also, the statistical values calculated over the Lyapunov exponents were good sources of signal representation, but not as informative as their wavelet counterparts. Finally, a typical sensitivity profile has emerged among all types of machines, involving some regions of stability separated by zones of sharp variation, with some kernel parameter values clearly associated with better accuracy rates (zones of optimality). Copyright © 2011 Elsevier B.V. All rights reserved.
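The two kernel functions named in the abstract above, Gaussian and exponential radial basis functions, can be sketched as follows. The parameterization (dividing by twice the squared radius in both cases) is one common convention and is an assumption here; the abstract does not state the exact form used in the study.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def gaussian_rbf(x, y, radius):
    """Gaussian RBF: exp(-||x - y||^2 / (2 * radius^2)). Assumed parameterization."""
    return math.exp(-squared_distance(x, y) / (2.0 * radius ** 2))


def exponential_rbf(x, y, radius):
    """Exponential RBF: exp(-||x - y|| / (2 * radius^2)). Assumed parameterization."""
    return math.exp(-math.sqrt(squared_distance(x, y)) / (2.0 * radius ** 2))
```

Sweeping `radius` over a grid of values, as the study does with 26 settings, changes how quickly similarity decays with distance, which is why accuracy is sensitive to it.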
Background Knowledge in Learning-Based Relation Extraction

ERIC Educational Resources Information Center

Do, Quang Xuan

2012-01-01

In this thesis, we study the importance of background knowledge in relation extraction systems. We not only demonstrate the benefits of leveraging background knowledge to improve the systems' performance but also propose a principled framework that allows one to effectively incorporate knowledge into statistical machine learning models for…
Characterization of Adrenal Lesions on Unenhanced MRI Using Texture Analysis: A Machine-Learning Approach.

PubMed

Romeo, Valeria; Maurea, Simone; Cuocolo, Renato; Petretta, Mario; Mainenti, Pier Paolo; Verde, Francesco; Coppola, Milena; Dell'Aversana, Serena; Brunetti, Arturo

2018-01-17

Adrenal adenomas (AA) are the most common benign adrenal lesions, often characterized based on intralesional fat content as either lipid-rich (LRA) or lipid-poor (LPA). The differentiation of AA, particularly LPA, from nonadenoma adrenal lesions (NAL) may be challenging. Texture analysis (TA) can extract quantitative parameters from MR images. Machine learning is a technique for recognizing patterns that can be applied to medical images by identifying the best combination of TA features to create a predictive model for the diagnosis of interest. To assess the diagnostic efficacy of TA-derived parameters extracted from MR images in characterizing LRA, LPA, and NAL using a machine-learning approach. Retrospective, observational study. Sixty MR examinations, including 20 LRA, 20 LPA, and 20 NAL. Unenhanced T1-weighted in-phase (IP) and out-of-phase (OP) as well as T2-weighted (T2-w) MR images acquired at 3T. Adrenal lesions were manually segmented, placing a spherical volume of interest on IP, OP, and T2-w images. Different selection methods were trained and tested using the J48 machine-learning classifiers. The feature selection method that obtained the highest diagnostic performance using the J48 classifier was identified; the diagnostic performance was also compared with that of a senior radiologist by means of McNemar's test. A total of 138 TA-derived features were extracted; among these, four features were selected, extracted from the IP (Short_Run_High_Gray_Level_Emphasis), OP (Mean_Intensity and Maximum_3D_Diameter), and T2-w (Standard_Deviation) images; the J48 classifier obtained a diagnostic accuracy of 80%. The expert radiologist obtained a diagnostic accuracy of 73%. McNemar's test did not show significant differences in terms of diagnostic performance between the J48 classifier and the expert radiologist. Machine learning conducted on MR TA-derived features is a potential tool to characterize adrenal lesions. Level of Evidence: 4. Technical Efficacy: Stage 2. J. Magn. Reson. Imaging 2018. © 2018 International Society for Magnetic Resonance in Medicine.
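McNemar's test, used above to compare the classifier with the radiologist on the same cases, depends only on the discordant pairs. The sketch below uses the exact (binomial) form of the test; the abstract does not say which variant the authors applied, and the function name and any counts passed to it are hypothetical.

```python
import math


def mcnemar_exact_p(only_first_correct, only_second_correct):
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    only_first_correct: cases where reader A was right and reader B was wrong.
    only_second_correct: cases where reader B was right and reader A was wrong.
    Under the null hypothesis each discordant pair is a fair coin flip.
    """
    n = only_first_correct + only_second_correct
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(only_first_correct, only_second_correct)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

Cases that both readers classify correctly, or both incorrectly, do not enter the statistic, which is why two accuracies such as 80% and 73% on 60 examinations can still fail to differ significantly.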
Predictive Modeling and Optimization of Vibration-assisted AFM Tip-based Nanomachining

NASA Astrophysics Data System (ADS)

Kong, Xiangcheng

The tip-based vibration-assisted nanomachining process offers a low-cost, low-effort technique in fabricating nanometer scale 2D/3D structures in sub-100 nm regime. To understand its mechanism, as well as provide the guidelines for process planning and optimization, we have systematically studied this nanomachining technique in this work. To understand the mechanism of this nanomachining technique, we firstly analyzed the interaction between the AFM tip and the workpiece surface during the machining process. A 3D voxel-based numerical algorithm has been developed to calculate the material removal rate as well as the contact area between the AFM tip and the workpiece surface. As a critical factor to understand the mechanism of this nanomachining process, the cutting force has been analyzed and modeled. A semi-empirical model has been proposed by correlating the cutting force with the material removal rate, which was validated using experimental data from different machining conditions. With the understanding of its mechanism, we have developed guidelines for process planning of this nanomachining technique. To provide the guideline for parameter selection, the effect of machining parameters on the feature dimensions (depth and width) has been analyzed. Based on ANOVA test results, the feature width is only controlled by the XY vibration amplitude, while the feature depth is affected by several machining parameters such as setpoint force and feed rate. A semi-empirical model was first proposed to predict the machined feature depth under given machining condition. Then, to reduce the computation intensity, linear and nonlinear regression models were also proposed and validated using experimental data. Given the desired feature dimensions, feasible machining parameters could be provided using these predictive feature dimension models. As the tip wear is unavoidable during the machining process, the machining precision will gradually decrease. 
To maintain the machining quality, the guideline for when to change the tip should be provided. In this study, we have developed several metrics to detect tip wear, such as tip radius and the pull-off force. The effect of machining parameters on the tip wear rate has been studied using these metrics, and the machining distance before a tip must be changed has been modeled using these machining parameters. Finally, the optimization functions have been built for unit production time and unit production cost subject to realistic constraints, and the optimal machining parameters can be found by solving these functions.
Thermal Error Test and Intelligent Modeling Research on the Spindle of High Speed CNC Machine Tools

NASA Astrophysics Data System (ADS)

Luo, Zhonghui; Peng, Bin; Xiao, Qijun; Bai, Lu

2018-03-01

Thermal error is the main factor affecting the accuracy of precision machining. In line with the current research focus on machine tool thermal error, this paper experimentally studies thermal error testing and intelligent modeling for the spindle of vertical high-speed CNC machine tools. Several thermal error testing devices are designed, in which 7 temperature sensors measure the temperature of the machine tool spindle system and 2 displacement sensors detect the thermal error displacement. A thermal error compensation model with good inversion prediction ability is established by applying principal component analysis, optimizing the temperature measuring points, extracting the characteristic values closely associated with the thermal error displacement, and using an artificial neural network.
Classification and machine recognition of severe weather patterns

NASA Technical Reports Server (NTRS)

Wang, P. P.; Burns, R. C.

1976-01-01

Forecasting and warning of severe weather conditions are treated from the vantage point of pattern recognition by machine. Pictorial patterns and waveform patterns are distinguished. Time series data on sferics are dealt with by considering waveform patterns. A severe storm patterns recognition machine is described, along with schemes for detection via cross-correlation of time series (same channel or different channels). Syntactic and decision-theoretic approaches to feature extraction are discussed. Active and decayed tornados and thunderstorms, lightning discharges, and funnels and their related time series data are studied.
Machine learning and computer vision approaches for phenotypic profiling.

PubMed

Grys, Ben T; Lo, Dara S; Sahin, Nil; Kraus, Oren Z; Morris, Quaid; Boone, Charles; Andrews, Brenda J

2017-01-02

With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. © 2017 Grys et al.
Machine learning and computer vision approaches for phenotypic profiling

PubMed Central

Morris, Quaid

2017-01-01

With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. PMID:27940887
Virtual Machine Language Controls Remote Devices

NASA Technical Reports Server (NTRS)

2014-01-01

Kennedy Space Center worked with Blue Sun Enterprises, based in Boulder, Colorado, to enhance the company's virtual machine language (VML) to control the instruments on the Regolith and Environment Science and Oxygen and Lunar Volatiles Extraction mission. Now the NASA-improved VML is available for crewed and uncrewed spacecraft, and has potential applications on remote systems such as weather balloons, unmanned aerial vehicles, and submarines.
Automatic event detection in low SNR microseismic signals based on multi-scale permutation entropy and a support vector machine

NASA Astrophysics Data System (ADS)

Jia, Rui-Sheng; Sun, Hong-Mei; Peng, Yan-Jun; Liang, Yong-Quan; Lu, Xin-Ming

2017-07-01

Microseismic monitoring is an effective means for providing early warning of rock or coal dynamical disasters, and its first step is microseismic event detection, although low SNR microseismic signals often cannot effectively be detected by routine methods. To solve this problem, this paper presents permutation entropy and a support vector machine to detect low SNR microseismic events. First, an extraction method of signal features based on multi-scale permutation entropy is proposed by studying the influence of the scale factor on the signal permutation entropy. Second, the detection model of low SNR microseismic events based on the least squares support vector machine is built by performing a multi-scale permutation entropy calculation for the collected vibration signals, constructing a feature vector set of signals. Finally, a comparative analysis of the microseismic events and noise signals in the experiment proves that the different characteristics of the two can be fully expressed by using multi-scale permutation entropy. The detection model of microseismic events combined with the support vector machine, which has the features of high classification accuracy and fast real-time algorithms, can meet the requirements of online, real-time extractions of microseismic events.
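A minimal sketch of the feature extraction step described above: the signal is coarse-grained at several scale factors and the permutation entropy of each coarse-grained series forms one entry of the feature vector. The embedding order, delay, normalization, and function names are assumptions; the abstract does not give the authors' settings.

```python
import math
from collections import Counter


def coarse_grain(signal, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def permutation_entropy(signal, order=3, delay=1):
    """Shannon entropy of ordinal patterns, normalized to [0, 1] by log(order!)."""
    n_patterns = len(signal) - (order - 1) * delay
    if n_patterns <= 0:
        raise ValueError("signal too short for this order and delay")
    counts = Counter()
    for i in range(n_patterns):
        window = [signal[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    entropy = -sum((c / n_patterns) * math.log(c / n_patterns)
                   for c in counts.values())
    return entropy / math.log(math.factorial(order))


def multiscale_permutation_entropy(signal, scales, order=3, delay=1):
    """One permutation entropy value per scale factor, as a feature vector."""
    return [permutation_entropy(coarse_grain(signal, s), order, delay)
            for s in scales]
```

A regular signal yields few distinct ordinal patterns and an entropy near 0, while noise spreads over many patterns and approaches 1; a vector of such values over several scales is the kind of input a support vector machine could then classify.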
Thermal machines beyond the weak coupling regime

NASA Astrophysics Data System (ADS)

Gallego, R.; Riera, A.; Eisert, J.

2014-12-01

How much work can be extracted from a heat bath using a thermal machine? The study of this question has a very long history in statistical physics in the weak-coupling limit, when applied to macroscopic systems. However, the assumption that thermal heat baths remain uncorrelated with associated physical systems is less reasonable on the nano-scale and in the quantum setting. In this work, we establish a framework of work extraction in the presence of quantum correlations. We show in a mathematically rigorous and quantitative fashion that quantum correlations and entanglement emerge as limitations to work extraction compared to what would be allowed by the second law of thermodynamics. At the heart of the approach are operations that capture the naturally non-equilibrium dynamics encountered when putting physical systems into contact with each other. We discuss various limits that relate to known results and put our work into the context of approaches to finite-time quantum thermodynamics.

Protein function in precision medicine: deep understanding with machine learning.

PubMed

Rost, Burkhard; Radivojac, Predrag; Bromberg, Yana

2016-08-01

Precision medicine and personalized health efforts propose leveraging complex molecular, medical and family history, along with other types of personal data toward better life. We argue that this ambitious objective will require advanced and specialized machine learning solutions. Simply skimming some low-hanging results off the data wealth might have limited potential. Instead, we need to better understand all parts of the system to define medically relevant causes and effects: how do particular sequence variants affect particular proteins and pathways? How do these effects, in turn, cause the health or disease-related phenotype? Toward this end, deeper understanding will not simply diffuse from deeper machine learning, but from more explicit focus on understanding protein function, context-specific protein interaction networks, and impact of variation on both. © 2016 Federation of European Biochemical Societies.
Speech sound classification and detection of articulation disorders with support vector machines and wavelets.

PubMed

Georgoulas, George; Georgopoulos, Voula C; Stylios, Chrysostomos D

2006-01-01

This paper proposes a novel integrated methodology to extract features and classify speech sounds with intent to detect the possible existence of a speech articulation disorder in a speaker. Articulation, in effect, is the specific and characteristic way that an individual produces the speech sounds. A methodology to process the speech signal, extract features and finally classify the signal and detect articulation problems in a speaker is presented. The use of support vector machines (SVMs), for the classification of speech sounds and detection of articulation disorders is introduced. The proposed method is implemented on a data set where different sets of features and different schemes of SVMs are tested leading to satisfactory performance.
High recall document content extraction

NASA Astrophysics Data System (ADS)

An, Chang; Baird, Henry S.

2011-01-01

We report methodologies for computing high-recall masks for document image content extraction, that is, the location and segmentation of regions containing handwriting, machine-printed text, photographs, blank space, etc. The resulting segmentation is pixel-accurate, which accommodates arbitrary zone shapes (not merely rectangles). We describe experiments showing that iterated classifiers can increase recall of all content types, with little loss of precision. We also introduce two methodological enhancements: (1) a multi-stage voting rule; and (2) a scoring policy that views blank pixels as a "don't care" class with other content classes. These enhancements improve both recall and precision, achieving at least 89% recall and at least 87% precision among three content types: machine-print, handwriting, and photo.
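The scoring policy mentioned above, which treats blank pixels as a "don't care" class, can be sketched as a per-class recall and precision computation. The sketch assumes that pixels whose ground truth is blank are simply skipped; the abstract does not spell out the exact rule, and the label strings and function name are hypothetical.

```python
def score_class(truth, predicted, target, dont_care="blank"):
    """Pixel-level recall and precision for one content class.

    Pixels whose ground-truth label equals `dont_care` are ignored entirely
    (an assumed reading of the scoring policy).
    """
    tp = fp = fn = 0
    for t, p in zip(truth, predicted):
        if t == dont_care:
            continue
        if p == target and t == target:
            tp += 1
        elif p == target:
            fp += 1
        elif t == target:
            fn += 1
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


# Flattened toy label maps, one entry per pixel.
truth = ["text", "text", "photo", "blank", "text"]
predicted = ["text", "photo", "photo", "text", "text"]
recall, precision = score_class(truth, predicted, "text")
```

Under this policy, labeling a blank pixel as text costs nothing, which favors recall-oriented classifiers without penalizing their precision.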
Novel grid-based optical Braille conversion: from scanning to wording

NASA Astrophysics Data System (ADS)

Yoosefi Babadi, Majid; Jafari, Shahram

2011-12-01

Grid-based optical Braille conversion (GOBCO) is explained in this article. The grid-fitting technique involves processing scanned images taken from old hard-copy Braille manuscripts, recognising and converting them into English ASCII text documents inside a computer. The resulting words are verified using the relevant dictionary to provide the final output. The algorithms employed in this article can be easily modified to be implemented on other visual pattern recognition systems and text extraction applications. This technique has several advantages, including simplicity of the algorithm, high speed of execution, the ability to help visually impaired and blind people work with fax machines and the like, and the ability to help sighted people with no prior knowledge of Braille understand hard-copy Braille manuscripts.
Prominent feature extraction for review analysis: an empirical study

NASA Astrophysics Data System (ADS)

Agarwal, Basant; Mittal, Namita

2016-05-01

Sentiment analysis (SA) research has increased tremendously in recent times. SA aims to classify the sentiment orientation of a given text as positive or negative. The motivation for SA research is the need for industry to know users' opinions about its products from online portals, blogs, discussion boards, reviews and so on. Efficient features need to be extracted for machine-learning algorithms to achieve better sentiment classification. In this paper, various features such as unigrams, bi-grams and dependency features are first extracted from the text. In addition, new bi-tagged features are extracted that conform to predefined part-of-speech patterns. Furthermore, various composite features are created using these features. Information gain (IG) and minimum redundancy maximum relevancy (mRMR) feature selection methods are used to eliminate noisy and irrelevant features from the feature vector. Finally, machine-learning algorithms are used for classifying the review document into the positive or negative class. Effects of different categories of features are investigated on four standard data-sets, namely, movie review and product (book, DVD and electronics) review data-sets. Experimental results show that composite features created from prominent features of unigram and bi-tagged features perform better than other features for sentiment classification. mRMR is a better feature selection method than IG for sentiment classification. The Boolean Multinomial Naïve Bayes algorithm performs better than the support vector machine classifier for SA in terms of accuracy and execution time.
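As an illustration of the pipeline above, the following sketch extracts unigram and bi-gram features and scores a feature by information gain. It covers only the IG criterion (not mRMR or the bi-tagged features), and the tokenization, function names, and toy reviews are assumptions rather than the authors' setup.

```python
import math
from collections import Counter


def extract_features(text):
    """Boolean unigram and bi-gram features from whitespace-tokenized text."""
    tokens = text.lower().split()
    return set(tokens) | set(zip(tokens, tokens[1:]))


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(documents, labels, feature):
    """Reduction in label entropy from splitting on presence of `feature`."""
    present = [y for doc, y in zip(documents, labels) if feature in doc]
    absent = [y for doc, y in zip(documents, labels) if feature not in doc]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


# Toy review snippets with hypothetical labels.
reviews = ["good movie", "good plot", "bad movie", "bad plot"]
labels = ["pos", "pos", "neg", "neg"]
docs = [extract_features(r) for r in reviews]
```

Here `information_gain(docs, labels, "good")` is high because the word separates the classes, while `"movie"` appears equally in both and carries no information; ranking features this way and keeping the top ones is the filtering step the abstract describes.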
Deep Learning Methods for Underwater Target Feature Extraction and Recognition

PubMed Central

Peng, Yuan; Qiu, Mengran; Shi, Jianfei; Liu, Liangliang

2018-01-01

Classification and recognition of underwater acoustic signals has always been an important research topic in the field of underwater acoustic signal processing. Currently, the wavelet transform, the Hilbert-Huang transform, and Mel frequency cepstral coefficients are used as methods of underwater acoustic signal feature extraction. In this paper, a method for feature extraction and identification of underwater noise data based on a convolutional neural network (CNN) and an extreme learning machine (ELM) is proposed. An automatic feature extraction method for underwater acoustic signals is proposed using a deep convolutional network, and the underwater target recognition classifier is based on an ELM. Although convolutional neural networks can perform both feature extraction and classification, their classification function relies mainly on a fully connected layer trained by gradient descent, whose generalization ability is limited and suboptimal, so an ELM is used in the classification stage. First, the CNN learns deep and robust features, after which the fully connected layers are removed. Then an ELM fed with the CNN features is used as the classifier. Experiments on an actual data set of civil ships obtained a 93.04% recognition rate; compared with the traditional Mel frequency cepstral coefficient and Hilbert-Huang features, the recognition rate is greatly improved. PMID:29780407
A solid-state controller for a wind-driven slip-ring induction generator

NASA Astrophysics Data System (ADS)

Velayudhan, C.; Bundell, J. H.; Leary, B. G.

1984-08-01

The three-phase induction generator appears to become the preferred choice for wind-powered systems operated in parallel with existing power systems. A problem arises in connection with the useful operating speed range of the squirrel-cage machine, which is relatively narrow, as, for instance, in the range from 1 to 1.15. Efficient extraction of energy from a wind turbine, on the other hand, requires a speed range, perhaps as large as 1 to 3. One approach for 'matching' the generator to the turbine for the extraction of maximum power at any usable wind speed involves the use of a slip-ring induction machine. The power demand of the slip-ring machine can be matched to the available output from the wind turbine by modifying the speed-torque characteristics of the generator. A description is presented of a simple electronic rotor resistance controller which can optimize the power taken from a wind turbine over the full speed range.
Modeling Music Emotion Judgments Using Machine Learning Methods

PubMed Central

Vempala, Naresh N.; Russo, Frank A.

2018-01-01

Emotion judgments and five channels of physiological data were obtained from 60 participants listening to 60 music excerpts. Various machine learning (ML) methods were used to model the emotion judgments inclusive of neural networks, linear regression, and random forests. Input for models of perceived emotion consisted of audio features extracted from the music recordings. Input for models of felt emotion consisted of physiological features extracted from the physiological recordings. Models were trained and interpreted with consideration of the classic debate in music emotion between cognitivists and emotivists. Our models supported a hybrid position wherein emotion judgments were influenced by a combination of perceived and felt emotions. In comparing the different ML approaches that were used for modeling, we conclude that neural networks were optimal, yielding models that were flexible as well as interpretable. Inspection of a committee machine, encompassing an ensemble of networks, revealed that arousal judgments were predominantly influenced by felt emotion, whereas valence judgments were predominantly influenced by perceived emotion. PMID:29354080
An integrated condition-monitoring method for a milling process using reduced decomposition features

NASA Astrophysics Data System (ADS)

Liu, Jie; Wu, Bo; Wang, Yan; Hu, Youmin

2017-08-01

Complex and non-stationary cutting chatter affects productivity and quality in the milling process. Developing an effective condition-monitoring approach is critical to accurately identify cutting chatter. In this paper, an integrated condition-monitoring method is proposed, where reduced features are used to efficiently recognize and classify machine states in the milling process. In the proposed method, vibration signals are decomposed into multiple modes with variational mode decomposition, and Shannon power spectral entropy is calculated to extract features from the decomposed signals. Principal component analysis is adopted to reduce feature size and computational cost. With the extracted feature information, the probabilistic neural network model is used to recognize and classify the machine states, including stable, transition, and chatter states. Experimental studies are conducted, and results show that the proposed method can effectively detect cutting chatter during different milling operation conditions. This monitoring method is also efficient enough to satisfy fast machine state recognition and classification.
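The entropy feature described above can be sketched for a single decomposed mode: compute the power spectrum, normalize it to a probability distribution, and take its Shannon entropy. The variational mode decomposition step is omitted, a naive discrete Fourier transform stands in for an FFT library, and normalizing by the log of the number of bins is an assumption.

```python
import cmath
import math


def power_spectrum(signal):
    """One-sided power spectrum via a naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(signal):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    power = power_spectrum(signal)
    total = sum(power)
    if total == 0:
        return 0.0
    probabilities = [p / total for p in power if p > 0]
    entropy = -sum(p * math.log(p) for p in probabilities)
    return entropy / math.log(len(power))


# A pure tone concentrates power in one bin; an impulse spreads it evenly.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
```

A narrow-band mode such as `tone` gives an entropy near 0 and a broadband one such as `impulse` gives an entropy near 1, so one value per mode yields a compact feature vector that can then be reduced further and classified.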
Imaging nanoscale lattice variations by machine learning of x-ray diffraction microscopy data

DOE PAGES

Laanait, Nouamane; Zhang, Zhan; Schlepütz, Christian M.

2016-08-09

In this paper, we present a novel methodology based on machine learning to extract lattice variations in crystalline materials, at the nanoscale, from an x-ray Bragg diffraction-based imaging technique. By employing a full-field microscopy setup, we capture real space images of materials, with imaging contrast determined solely by the x-ray diffracted signal. The data sets that emanate from this imaging technique are a hybrid of real space information (image spatial support) and reciprocal lattice space information (image contrast), and are intrinsically multidimensional (5D). By a judicious application of established unsupervised machine learning techniques and multivariate analysis to this multidimensional data cube, we show how to extract features that can be ascribed physical interpretations in terms of common structural distortions, such as lattice tilts and dislocation arrays. Finally, we demonstrate this 'big data' approach to x-ray diffraction microscopy by identifying structural defects present in an epitaxial ferroelectric thin-film of lead zirconate titanate.
Machine Learning for Knowledge Extraction from PHR Big Data.

PubMed

Poulymenopoulou, Michaela; Malamateniou, Flora; Vassilacopoulos, George

2014-01-01

Cloud computing, Internet of things (IOT) and NoSQL database technologies can support a new generation of cloud-based PHR services that contain heterogeneous (unstructured, semi-structured and structured) patient data (health, social and lifestyle) from various sources, including automatically transmitted data from Internet-connected devices in the patient's living space (e.g. medical devices connected to patients in home care). The patient data stored in such PHR systems constitute big data whose analysis with the use of appropriate machine learning algorithms is expected to improve diagnosis and treatment accuracy, to cut healthcare costs and, hence, to improve the overall quality and efficiency of healthcare provided. This paper describes a health data analytics engine which uses machine learning algorithms for analyzing cloud-based PHR big health data towards knowledge extraction to support better healthcare delivery as regards disease diagnosis and prognosis. This engine comprises the data preparation, model generation and data analysis modules and runs on the cloud, taking advantage of the map/reduce paradigm provided by Apache Hadoop.
Modeling Music Emotion Judgments Using Machine Learning Methods.

PubMed

Vempala, Naresh N; Russo, Frank A

2017-01-01

Emotion judgments and five channels of physiological data were obtained from 60 participants listening to 60 music excerpts. Various machine learning (ML) methods were used to model the emotion judgments inclusive of neural networks, linear regression, and random forests. Input for models of perceived emotion consisted of audio features extracted from the music recordings. Input for models of felt emotion consisted of physiological features extracted from the physiological recordings. Models were trained and interpreted with consideration of the classic debate in music emotion between cognitivists and emotivists. Our models supported a hybrid position wherein emotion judgments were influenced by a combination of perceived and felt emotions. In comparing the different ML approaches that were used for modeling, we conclude that neural networks were optimal, yielding models that were flexible as well as interpretable. Inspection of a committee machine, encompassing an ensemble of networks, revealed that arousal judgments were predominantly influenced by felt emotion, whereas valence judgments were predominantly influenced by perceived emotion.
Imaging nanoscale lattice variations by machine learning of x-ray diffraction microscopy data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laanait, Nouamane; Zhang, Zhan; Schlepütz, Christian M.

In this paper, we present a novel methodology based on machine learning to extract lattice variations in crystalline materials, at the nanoscale, from an x-ray Bragg diffraction-based imaging technique. By employing a full-field microscopy setup, we capture real space images of materials, with imaging contrast determined solely by the x-ray diffracted signal. The data sets that emanate from this imaging technique are a hybrid of real space information (image spatial support) and reciprocal lattice space information (image contrast), and are intrinsically multidimensional (5D). By a judicious application of established unsupervised machine learning techniques and multivariate analysis to this multidimensional data cube, we show how to extract features that can be ascribed physical interpretations in terms of common structural distortions, such as lattice tilts and dislocation arrays. Finally, we demonstrate this 'big data' approach to x-ray diffraction microscopy by identifying structural defects present in an epitaxial ferroelectric thin-film of lead zirconate titanate.
Machine Translation in Post-Contemporary Era

ERIC Educational Resources Information Center

Lin, Grace Hui Chin

2010-01-01

This article, focusing on translation techniques via personal computer or laptop, reports on progress in artificial intelligence up to 2010. Based on interpretations of and information on the field of MT [Machine Translation] from Yorick Wilks' book, "Machine Translation, Its scope and limits," this paper presents understandable theoretical frameworks…
The Machine Intelligence Hex Project

ERIC Educational Resources Information Center

Chalup, Stephan K.; Mellor, Drew; Rosamond, Fran

2005-01-01

Hex is a challenging strategy board game for two players. To enhance students' progress in acquiring understanding and practical experience with complex machine intelligence and programming concepts we developed the Machine Intelligence Hex (MIHex) project. The associated undergraduate student assignment is about designing and implementing Hex…
MLBCD: a machine learning tool for big clinical data.

PubMed

Luo, Gang

2015-01-01

Predictive modeling is fundamental for extracting value from large clinical data sets, or "big clinical data," advancing clinical research, and improving healthcare. Machine learning is a powerful approach to predictive modeling. Two factors make machine learning challenging for healthcare researchers. First, before training a machine learning model, the values of one or more model parameters called hyper-parameters must typically be specified. Due to their inexperience with machine learning, it is hard for healthcare researchers to choose an appropriate algorithm and hyper-parameter values. Second, many clinical data are stored in a special format. These data must be iteratively transformed into the relational table format before conducting predictive modeling. This transformation is time-consuming and requires computing expertise. This paper presents our vision for and design of MLBCD (Machine Learning for Big Clinical Data), a new software system aiming to address these challenges and facilitate building machine learning predictive models using big clinical data. The paper describes MLBCD's design in detail. By making machine learning accessible to healthcare researchers, MLBCD will open the use of big clinical data and increase the ability to foster biomedical discovery and improve care.
Pediatric peripheral blood progenitor cell collection: haemonetics MCS 3P versus COBE Spectra versus Fresenius AS104.

PubMed

Bambi, F; Faulkner, L B; Azzari, C; Gelli, A M; Tamburini, A; Tintori, V; Lippi, A A; Tucci, F; Bernini, G; Genovese, F

1998-01-01

An increasing number of apheresis machines are becoming available for peripheral blood progenitor cell (PBPC) collection in children. At the Children's Hospital of Florence (Italy), three apheresis machines were evaluated: MCS 3P (Haemonetics) (10 procedures in 4 patients, aged 10-12 years, weight 23.5-64 kg), Spectra, (COBE) (8 procedures in 3 patients, aged 4-17 years, weight 19-59 kg), and AS104 (Fresenius) (24 procedures in 9 patients, aged 2-16 years, weight 13.6-60 kg). For PBPC quantitative analysis, CD34 cytofluorimetry was employed. Relevant variables analyzed included efficiency of CD34+ cell extraction and enrichment, mononuclear cell purity and red cell contamination of the apheresis components, and platelet count decreases after leukapheresis. No significant differences in CD34+ cell-extraction abilities were found. However, the AS104 provided consistently purer leukapheresis components in terms of mononuclear cell and CD34+ cell enrichment (441 +/- 59%, vs. 240 +/- 35% and 290 +/- 42% for MCS 3P and Spectra, respectively). Postapheresis platelet counts dropped the least with the AS104. The smallest patient who underwent apheresis with MCS 3P (the only machine working on discontinuous flow and hence with greater volume shifts) weighed 23.5 kg and tolerated the procedure well, with no signs of hemodynamic instability. No significant complications were observed. All machines seem to have comparable PBPC extraction efficiency, but the AS104 seems to give the component with the greatest PBPC enrichment. This feature might be relevant for further ex vivo cell processing (CD34+ cell selection, expansion, and so on).
A Comparison of Supervised Machine Learning Algorithms and Feature Vectors for MS Lesion Segmentation Using Multimodal Structural MRI

PubMed Central

Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.

2014-01-01

Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953
A comparison of supervised machine learning algorithms and feature vectors for MS lesion segmentation using multimodal structural MRI.

PubMed

Sweeney, Elizabeth M; Vogelstein, Joshua T; Cuzzocreo, Jennifer L; Calabresi, Peter A; Reich, Daniel S; Crainiceanu, Ciprian M; Shinohara, Russell T

2014-01-01

Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance.
Putting Humpty-Dumpty Together: Clustering the Functional Dynamics of Single Biomolecular Machines Such as the Spliceosome.

PubMed

Rohlman, C E; Blanco, M R; Walter, N G

2016-01-01

The spliceosome is a biomolecular machine that, in all eukaryotes, accomplishes site-specific splicing of introns from precursor messenger RNAs (pre-mRNAs) with high fidelity. Operating at the nanometer scale, where inertia and friction have lost the dominant role they play in the macroscopic realm, the spliceosome is highly dynamic and assembles its active site around each pre-mRNA anew. To understand the structural dynamics underlying the molecular motors, clocks, and ratchets that achieve functional accuracy in the yeast spliceosome (a long-standing model system), we have developed single-molecule fluorescence resonance energy transfer (smFRET) approaches that report changes in intra- and intermolecular interactions in real time. Building on our work using hidden Markov models (HMMs) to extract kinetic and conformational state information from smFRET time trajectories, we recognized that HMM analysis of individual state transitions as independent stochastic events is insufficient for a biomolecular machine as complex as the spliceosome. In this chapter, we elaborate on the recently developed smFRET-based Single-Molecule Cluster Analysis (SiMCAn) that dissects the intricate conformational dynamics of a pre-mRNA through the splicing cycle in a model-free fashion. By leveraging hierarchical clustering techniques developed for Bioinformatics, SiMCAn efficiently analyzes large datasets to first identify common molecular behaviors. Through a second level of clustering based on the abundance of dynamic behaviors exhibited by defined functional intermediates that have been stalled by biochemical or genetic tools, SiMCAn then efficiently assigns pre-mRNA FRET states and transitions to specific splicing complexes, with the potential to find heretofore undescribed conformations. SiMCAn thus arises as a general tool to analyze dynamic cellular machines more broadly. © 2016 Elsevier Inc. All rights reserved.

Sensitivity analysis of machine-learning models of hydrologic time series

NASA Astrophysics Data System (ADS)

O'Reilly, A. M.

2017-12-01

Sensitivity analysis traditionally has been applied to assessing model response to perturbations in model parameters, where the parameters are those model input variables adjusted during calibration. Unlike physics-based models where parameters represent real phenomena, the equivalent of parameters for machine-learning models are simply mathematical "knobs" that are automatically adjusted during training/testing/verification procedures. Thus the challenge of extracting knowledge of hydrologic system functionality from machine-learning models lies in their very nature, leading to the label "black box." Sensitivity analysis of the forcing-response behavior of machine-learning models, however, can provide understanding of how the physical phenomena represented by model inputs affect the physical phenomena represented by model outputs. As part of a previous study, hybrid spectral-decomposition artificial neural network (ANN) models were developed to simulate the observed behavior of hydrologic response contained in multidecadal datasets of lake water level, groundwater level, and spring flow. Model inputs used moving window averages (MWA) to represent various frequencies and frequency-band components of time series of rainfall and groundwater use. Using these forcing time series, the MWA-ANN models were trained to predict time series of lake water level, groundwater level, and spring flow at 51 sites in central Florida, USA. A time series of sensitivities for each MWA-ANN model was produced by perturbing forcing time-series and computing the change in response time-series per unit change in perturbation. Variations in forcing-response sensitivities are evident between types (lake, groundwater level, or spring), spatially (among sites of the same type), and temporally. Two generally common characteristics among sites are more uniform sensitivities to rainfall over time and notable increases in sensitivities to groundwater usage during significant drought periods.
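The perturbation procedure described in this abstract (perturb a forcing series, then divide the change in the response series by the size of the perturbation) can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the trained MWA-ANN is replaced by a hypothetical linear stand-in `toy_model`, and the function names, the trailing-window convention and the uniform perturbation `delta` are all choices made here.

```python
import numpy as np

def moving_window_average(series, window):
    """Trailing moving-window average; early steps average what is available."""
    x = np.asarray(series, dtype=float)
    csum = np.cumsum(np.insert(x, 0, 0.0))      # csum[k] = sum of x[:k]
    end = np.arange(1, x.size + 1)
    start = np.maximum(end - window, 0)
    return (csum[end] - csum[start]) / (end - start)

def forcing_sensitivity(model, forcing, delta=1.0):
    """Change in the response series per unit change in the forcing series."""
    forcing = np.asarray(forcing, dtype=float)
    base = model(forcing)                       # unperturbed response
    perturbed = model(forcing + delta)          # response to perturbed forcing
    return (perturbed - base) / delta

def toy_model(rainfall):
    """Hypothetical stand-in for a trained model: scaled 3-step MWA of rainfall."""
    return 0.5 * moving_window_average(rainfall, window=3)

rainfall = np.array([0.0, 2.0, 5.0, 1.0, 0.0, 3.0])
sensitivity = forcing_sensitivity(toy_model, rainfall, delta=0.1)
```

For this linear stand-in the sensitivity is constant in time; with a nonlinear model such as an ANN the same calculation yields a sensitivity that varies with time step, which is the kind of temporal variation the abstract reports.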
Biologically based machine vision: signal analysis of monopolar cells in the visual system of Musca domestica.

PubMed

Newton, Jenny; Barrett, Steven F; Wilcox, Michael J; Popp, Stephanie

2002-01-01

Machine vision for navigational purposes is a rapidly growing field. Many abilities such as object recognition and target tracking rely on vision. Autonomous vehicles must be able to navigate in dynamic environments and simultaneously locate a target position. Traditional machine vision often fails to react in real time because of large computational requirements whereas the fly achieves complex orientation and navigation with a relatively small and simple brain. Understanding how the fly extracts visual information and how neurons encode and process information could lead us to a new approach for machine vision applications. Photoreceptors in the Musca domestica eye that share the same spatial information converge into a structure called the cartridge. The cartridge consists of the photoreceptor axon terminals and monopolar cells L1, L2, and L4. It is thought that L1 and L2 cells encode edge related information relative to a single cartridge. These cells are thought to be equivalent to vertebrate bipolar cells, producing contrast enhancement and reduction of information sent to L4. Monopolar cell L4 is thought to perform image segmentation on the information input from L1 and L2 and also enhance edge detection. A mesh of interconnected L4's would correlate the output from L1 and L2 cells of adjacent cartridges and provide a parallel network for segmenting an object's edges. The focus of this research is to excite photoreceptors of the common housefly, Musca domestica, with different visual patterns. The electrical response of monopolar cells L1, L2, and L4 will be recorded using intracellular recording techniques. Signal analysis will determine the neurocircuitry to detect and segment images.
Relations between work and entropy production for general information-driven, finite-state engines

NASA Astrophysics Data System (ADS)

Merhav, Neri

2017-02-01

We consider a system model of a general finite-state machine (ratchet) that simultaneously interacts with three kinds of reservoirs: a heat reservoir, a work reservoir, and an information reservoir, the latter being taken to be a running digital tape whose symbols interact sequentially with the machine. As has been shown in earlier work, this finite-state machine can act as a demon (with memory), which creates a net flow of energy from the heat reservoir into the work reservoir (thus extracting useful work) at the price of increasing the entropy of the information reservoir. Under very few assumptions, we propose a simple derivation of a family of inequalities that relate the work extraction with the entropy production. These inequalities can be seen as either upper bounds on the extractable work or as lower bounds on the entropy production, depending on the point of view. Many of these bounds are relatively easy to calculate and they are tight in the sense that equality can be approached arbitrarily closely. In their basic forms, these inequalities are applicable to any finite number of cycles (and not only asymptotically), and for a general input information sequence (possibly correlated), which is not necessarily assumed even stationary. Several known results are obtained as special cases.
New Trends in E-Science: Machine Learning and Knowledge Discovery in Databases

NASA Astrophysics Data System (ADS)

Brescia, Massimo

2012-11-01

Data mining, or Knowledge Discovery in Databases (KDD), while being the main methodology to extract the scientific information contained in Massive Data Sets (MDS), needs to tackle crucial problems since it has to orchestrate complex challenges posed by transparent access to different computing environments, scalability of algorithms, and reusability of resources. To achieve a leap forward for the progress of e-science in the data avalanche era, the community needs to implement an infrastructure capable of performing data access, processing and mining in a distributed but integrated context. The increasing complexity of modern technologies has brought about a huge production of data, whose related warehouse management and the need to optimize analysis and mining procedures lead to a change in concept in modern science. Classical data exploration, based on the user's own local data storage and limited computing infrastructures, is no longer efficient in the case of MDS, spread worldwide over inhomogeneous data centres and requiring teraflop processing power. In this context modern experimental and observational science requires a good understanding of computer science, network infrastructures, Data Mining, etc., i.e. of all those techniques which fall into the domain of the so-called e-science (recently assessed also by the Fourth Paradigm of Science). Such understanding is almost completely absent in the older generations of scientists and this is reflected in the inadequacy of most academic and research programs. A paradigm shift is needed: statistical pattern recognition, object oriented programming, distributed computing, parallel programming need to become an essential part of scientific background. A possible practical solution is to provide the research community with easy-to-understand, easy-to-use tools, based on the Web 2.0 technologies and Machine Learning methodology: tools where almost all the complexity is hidden from the final user, but which are still flexible and able to produce efficient and reliable scientific results. All these considerations will be described in detail in the chapter. Moreover, examples of modern applications offering to a wide variety of e-science communities a large spectrum of computational facilities to exploit the wealth of available massive data sets and powerful machine learning and statistical algorithms will also be introduced.
Deep Logic Networks: Inserting and Extracting Knowledge From Deep Belief Networks.

PubMed

Tran, Son N; d'Avila Garcez, Artur S

2018-02-01

Developments in deep learning have seen the use of layerwise unsupervised learning combined with supervised learning for fine-tuning. With this layerwise approach, a deep network can be seen as a more modular system that lends itself well to learning representations. In this paper, we investigate whether such modularity can be useful to the insertion of background knowledge into deep networks, whether it can improve learning performance when it is available, and to the extraction of knowledge from trained deep networks, and whether it can offer a better understanding of the representations learned by such networks. To this end, we use a simple symbolic language (a set of logical rules that we call confidence rules) and show that it is suitable for the representation of quantitative reasoning in deep networks. We show by knowledge extraction that confidence rules can offer a low-cost representation for layerwise networks (or restricted Boltzmann machines). We also show that layerwise extraction can produce an improvement in the accuracy of deep belief networks. Furthermore, the proposed symbolic characterization of deep networks provides a novel method for the insertion of prior knowledge and training of deep networks. With the use of this method, a deep neural-symbolic system is proposed and evaluated, with the experimental results indicating that modularity through the use of confidence rules and knowledge insertion can be beneficial to network performance.
A miniature instrumented sphere to understand impacts created by mechanical blueberry harvesting

USDA-ARS's Scientific Manuscript database

The majority of US highbush blueberries for fresh market are hand harvested due to the high bruising damage to fruit caused by the machine harvesters. To reduce the bruising damage, we need to first understand how the machine parts interact with the berry fruit. To address this need, we developed ...
Multimodal Teaching Analytics: Automated Extraction of Orchestration Graphs from Wearable Sensor Data

ERIC Educational Resources Information Center

Prieto, L. P.; Sharma, K.; Kidzinski, L.; Rodríguez-Triana, M. J.; Dillenbourg, P.

2018-01-01

The pedagogical modelling of everyday classroom practice is an interesting kind of evidence, both for educational research and teachers' own professional development. This paper explores the usage of wearable sensors and machine learning techniques to automatically extract orchestration graphs (teaching activities and their social plane over time)…
Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature

PubMed Central

Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

2017-01-01

Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches such as linear kernel, tree kernel, graph kernel and combinations of multiple kernels have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction, known as the Distributed Smoothed Tree kernel (DSTK), which exploits both syntactic (structural) and semantic vector information. DSTK comprises distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate a robust machine learning model, a feature-based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves a better F-score on the five corpora than other state-of-the-art systems. PMID:29099838
Distributed smoothed tree kernel for protein-protein interaction extraction from the biomedical literature.

PubMed

Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar

2017-01-01

Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel-based approaches such as linear kernel, tree kernel, graph kernel and combinations of multiple kernels have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction, known as the Distributed Smoothed Tree kernel (DSTK), which exploits both syntactic (structural) and semantic vector information. DSTK comprises distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate a robust machine learning model, a feature-based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves a better F-score on the five corpora than other state-of-the-art systems.
The phaco machine: analysing new technology.

PubMed

Fishkind, William J

2013-01-01

The phaco machine is frequently overlooked as the crucial surgical instrument it is. Understanding how to set parameters begins with understanding fundamental concepts of machine function. This study analyses the critical concepts of partial occlusion phaco, occlusion phaco and pump technology. In addition, phaco energy categories as well as variations of phaco energy production are explored. Contemporary power modulations and pump controls allow for the enhancement of partial occlusion phacoemulsification. These significant changes in the anterior chamber dynamics produce a balanced environment for phaco, fewer complications, and improved patient outcomes.
Chemical-induced disease relation extraction with various linguistic features.

PubMed

Gu, Jinghang; Qian, Longhua; Zhou, Guodong

2016-01-01

Understanding the relations between chemicals and diseases is crucial in various biomedical tasks such as new drug discoveries and new therapy developments. Manually mining these relations from the biomedical literature is costly and time-consuming, and such a procedure is often difficult to keep up-to-date. To address these issues, the BioCreative-V community proposed a challenging task of automatic extraction of chemical-induced disease (CID) relations in order to benefit biocuration. This article describes our work on the CID relation extraction task of BioCreative-V. We built a machine learning based system that utilized simple yet effective linguistic features to extract relations with maximum entropy models. In addition to leveraging various features, the hypernym relations between entity concepts derived from the Medical Subject Headings (MeSH)-controlled vocabulary were also employed during both training and testing stages to obtain more accurate classification models and better extraction performance, respectively. We demoted relation extraction between entities in documents to relation extraction between entity mentions. In our system, pairs of chemical and disease mentions at both intra- and inter-sentence levels were first constructed as relation instances for training and testing, then two classification models at both levels were trained from the training examples and applied to the testing examples. Finally, we merged the classification results from mention level to document level to acquire final relations between chemicals and diseases. Our system achieved promising F-scores of 60.4% on the development dataset and 58.3% on the test dataset using gold-standard entity annotations. Database URL: https://github.com/JHnlp/BC5CIDTask. © The Author(s) 2016. Published by Oxford University Press.
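The final two steps mentioned in this abstract, merging mention-level classifications into document-level relations and scoring them with an F-score, can be sketched as below. The abstract does not state the merging rule, so the rule used here (a chemical-disease pair counts as a relation for a document if any of its mention pairs is classified positive) is an assumption, and the function names and tuple layout are hypothetical.

```python
def merge_to_document_level(mention_predictions):
    """Collapse mention-pair predictions into document-level relations.

    Each item is (doc_id, chemical_id, disease_id, is_positive).
    Assumed rule: one positive mention pair is enough for the document.
    """
    relations = set()
    for doc_id, chemical_id, disease_id, is_positive in mention_predictions:
        if is_positive:
            relations.add((doc_id, chemical_id, disease_id))
    return relations

def f_score(predicted, gold):
    """Balanced F-score (harmonic mean of precision and recall) over sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Made-up identifiers, for illustration only.
mentions = [
    ("doc1", "C1", "D1", True),    # intra-sentence mention pair
    ("doc1", "C1", "D1", False),   # another mention pair of the same entities
    ("doc1", "C2", "D1", True),
    ("doc2", "C3", "D2", False),
]
gold = {("doc1", "C1", "D1"), ("doc2", "C3", "D2")}
predicted = merge_to_document_level(mentions)
score = f_score(predicted, gold)
```

Working with sets of (document, chemical, disease) triples makes duplicate mention pairs collapse automatically, so the score is computed once per document-level relation rather than once per mention.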
Aural mapping of STEM concepts using literature mining

NASA Astrophysics Data System (ADS)

Bharadwaj, Venkatesh

Recent technological applications have made people's lives heavily dependent on Science, Technology, Engineering, and Mathematics (STEM) and its applications. Understanding basic science is a must in order to use and contribute to this technological revolution. Science education at the middle and high school levels, however, depends heavily on visual representations such as models, diagrams, figures, animations, and presentations. This leaves visually impaired students with very few options to learn science and secure a career in STEM-related areas. Recent experiments have shown that small aural clues called Audemes are helpful in understanding and memorization of science concepts among visually impaired students. Audemes are non-verbal sound translations of a science concept. In order to provide science concepts as Audemes for visually impaired students, this thesis presents an automatic system for audeme generation from STEM textbooks. This thesis describes the systematic application of multiple Natural Language Processing tools and techniques, such as a dependency parser, POS tagger, Information Retrieval algorithm, semantic mapping of aural words, and machine learning, to transform a science concept into a combination of atomic sounds, thus forming an audeme. We present a rule-based classification method for all STEM-related concepts. This work also presents a novel way of mapping and extracting the most related sounds for the words used in a textbook. Additionally, machine learning methods are used in the system to guarantee customization of the output according to a user's perception. The system presented is robust, scalable, fully automatic, and dynamically adaptable for audeme generation.
Fuzzy Logic-Based Audio Pattern Recognition

NASA Astrophysics Data System (ADS)

Malcangi, M.

2008-11-01

Audio and audio-pattern recognition is becoming one of the most important technologies to automatically control embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to rapidly and economically model such application. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost and deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules manually tuned or automatically tuned by a self-learning process.
Machinery Management. FMO: Fundamentals of Machine Operation. Third Edition.

ERIC Educational Resources Information Center

Bowers, Wendell

This text is intended to provide a basic understanding of selecting, maintaining, and managing farm machinery. The following topics are covered in the individual chapters: dealing with typical problems in farm machinery management; measuring machine capacity; improving field efficiency; matching machine size and capacity; estimating power…
Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology.

PubMed

Zhang, Jieru; Ju, Ying; Lu, Huijuan; Xuan, Ping; Zou, Quan

2016-01-01

Cancerlectins are cancer-related proteins that function as lectins. They have been identified through computational identification techniques, but these techniques have sometimes failed to identify proteins because of sequence diversity among the cancerlectins. Advanced machine learning identification methods, such as support vector machine and basic sequence features (n-gram), have also been used to identify cancerlectins. In this study, various protein fingerprint features and advanced classifiers, including ensemble learning techniques, were utilized to identify this group of proteins. We improved the prediction accuracy of the original feature extraction methods and classification algorithms by more than 10% on average. Our work provides a basis for the computational identification of cancerlectins and reveals the power of hybrid machine learning techniques in computational proteomics.
Machine Learning: A Crucial Tool for Sensor Design

PubMed Central

Zhao, Weixiang; Bhushan, Abhinav; Santamaria, Anthony D.; Simon, Melinda G.; Davis, Cristina E.

2009-01-01

Sensors have been widely used for disease diagnosis, environmental quality monitoring, food quality control, industrial process analysis and control, and other related fields. As a key tool for sensor data analysis, machine learning is becoming a core part of novel sensor design. Dividing a complete machine learning process into three steps: data pre-treatment, feature extraction and dimension reduction, and system modeling, this paper provides a review of the methods that are widely used for each step. For each method, the principles and the key issues that affect modeling results are discussed. After reviewing the potential problems in machine learning processes, this paper gives a summary of current algorithms in this field and provides some feasible directions for future studies. PMID:20191110
Toward Usable Interactive Analytics: Coupling Cognition and Computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Endert, Alexander; North, Chris; Chang, Remco

Interactive analytics provide users a myriad of computational means to aid in extracting meaningful information from large and complex datasets. Much prior work focuses either on advancing the capabilities of machine-centric approaches by the data mining and machine learning communities, or human-driven methods by the visualization and CHI communities. However, these methods do not yet support a true human-machine symbiotic relationship where users and machines work together collaboratively and adapt to each other to advance an interactive analytic process. In this paper we discuss some of the inherent issues, outlining what we believe are the steps toward usable interactive analytics that will ultimately increase the effectiveness for both humans and computers to produce insights.
Machine learning for Big Data analytics in plants.

PubMed

Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng

2014-12-01

Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences. Copyright © 2014 Elsevier Ltd. All rights reserved.
All about Simple Machines. Physical Science for Children[TM]. Schlessinger Science Library. [Videotape].

ERIC Educational Resources Information Center

2000

All kids know the word "work." But they probably don't understand that work happens whenever a force is used to move something--whether it's lifting a heavy object or playing on a see-saw. All About Simple Machines introduces kids to the concepts of forces, work and how machines are used to make work easier. Six simple machines are…
Identification of four class emotion from Indonesian spoken language using acoustic and lexical features

NASA Astrophysics Data System (ADS)

Kasyidi, Fatan; Puji Lestari, Dessi

2018-03-01

One of the important aspects of human-to-human communication is understanding the emotion of each party. Recently, interaction between humans and computers has continued to develop, especially affective interaction, where emotion recognition is one of the important components. This paper presents our extended work on emotion recognition in spoken Indonesian to identify four main classes of emotion: Happy, Sad, Angry, and Contentment, using a combination of acoustic/prosodic features and lexical features. We construct an emotion speech corpus from Indonesian television talk shows, where the situations are as close as possible to natural situations. After constructing the emotion speech corpus, the acoustic/prosodic and lexical features are extracted to train the emotion model. We employ several machine learning algorithms, such as Support Vector Machine (SVM), Naive Bayes, and Random Forest, to find the best model. The experimental results on the testing data show that the best model has an F-measure of 0.447 using only the acoustic/prosodic features and an F-measure of 0.488 using both acoustic/prosodic and lexical features to recognize the four emotion classes using the SVM RBF kernel.

Crowdsourcing the Measurement of Interstate Conflict

PubMed Central

2016-01-01

Much of the data used to measure conflict is extracted from news reports. This is typically accomplished using either expert coders to quantify the relevant information or machine coders to automatically extract data from documents. Although expert coding is costly, it produces quality data. Machine coding is fast and inexpensive, but the data are noisy. To diminish the severity of this tradeoff, we introduce a method for analyzing news documents that uses crowdsourcing, supplemented with computational approaches. The new method is tested on documents about Militarized Interstate Disputes, and its accuracy ranges between about 68 and 76 percent. This is shown to be a considerable improvement over automated coding, and to cost less and be much faster than expert coding. PMID:27310427
Facial Expression Recognition using Multiclass Ensemble Least-Square Support Vector Machine

NASA Astrophysics Data System (ADS)

Lawi, Armin; Sya'Rani Machrizzandi, M.

2018-03-01

Facial expression is one of the behavioral characteristics of human beings. A biometric system that uses facial expression characteristics makes it possible to recognize a person’s mood or emotion. The basic components of a facial expression analysis system are face detection, face image extraction, facial classification, and facial expression recognition. This paper uses the Principal Component Analysis (PCA) algorithm to extract facial features with expression parameters, i.e., happy, sad, neutral, angry, fear, and disgusted. Then a Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM) is used to classify the facial expressions. The MELS-SVM model, applied to our 185 different expression images of 10 persons, showed a high accuracy level of 99.998% using the RBF kernel.
Data Mining and Knowledge Discovery tools for exploiting big Earth-Observation data

NASA Astrophysics Data System (ADS)

Espinoza Molina, D.; Datcu, M.

2015-04-01

The continuous increase in the size of the archives and in the variety and complexity of Earth-Observation (EO) sensors requires new methodologies and tools that allow the end-user to access a large image repository, to extract and to infer knowledge about the patterns hidden in the images, to retrieve dynamically a collection of relevant images, and to support the creation of emerging applications (e.g., change detection, global monitoring, disaster and risk management, image time series, etc.). In this context, we are concerned with providing a platform for data mining and knowledge discovery content from EO archives. The platform's goal is to implement a communication channel between Payload Ground Segments and the end-user who receives the content of the data coded in an understandable format associated with semantics that is ready for immediate exploitation. It will provide the user with automated tools to explore and understand the content of highly complex image archives. The challenge lies in the extraction of meaningful information and understanding observations of large extended areas, over long periods of time, with a broad variety of EO imaging sensors in synergy with other related measurements and data. The platform is composed of several components such as 1.) ingestion of EO images and related data providing basic features for image analysis, 2.) query engine based on metadata, semantics and image content, 3.) data mining and knowledge discovery tools for supporting the interpretation and understanding of image content, 4.) semantic definition of the image content via machine learning methods. All these components are integrated and supported by a relational database management system, ensuring the integrity and consistency of Terabytes of Earth Observation data.
Neuromorphic Optical Signal Processing and Image Understanding for Automated Target Recognition

DTIC Science & Technology

1989-12-01

Table-of-contents excerpt: Stochastic Learning Machine; Neuromorphic Target Identification; Cognitive Networks. 3. Conclusions. 4. Publications. 5. References. 6. Appendices: I. Optoelectronic Neural Networks and...Learning Machines. II. Stochastic Optical Learning Machine. III. Learning Network for Extrapolation and Radar Target Identification.
Pressure-Letdown Machine for a Coal Reactor

NASA Technical Reports Server (NTRS)

Perkins, G. S.; Mabe, W. B.

1986-01-01

Pumps operating in reverse generate power. Conceptual pressure-letdown machine for coal-liquefaction system extracts energy from expansion of product fluid. Mud pumps, originally intended for use in oil drilling, operated in reverse so their motors act as generators. Several pumps operated in alternating phase to obtain multiple stages of letdown from inlet pressure to outlet pressure. About 75 percent of work generates inlet pressure recoverable as electrical energy.
Machinery Bearing Fault Diagnosis Using Variational Mode Decomposition and Support Vector Machine as a Classifier

NASA Astrophysics Data System (ADS)

Rama Krishna, K.; Ramachandran, K. I.

2018-02-01

Crack propagation is a major cause of failure in rotating machines. It adversely affects the productivity, safety, and the machining quality. Hence, detecting the crack’s severity accurately is imperative for the predictive maintenance of such machines. Fault diagnosis is an established concept in identifying the faults, for observing the non-linear behaviour of the vibration signals at various operating conditions. In this work, we find the classification efficiencies for both original and the reconstructed vibrational signals. The reconstructed signals are obtained using Variational Mode Decomposition (VMD), by splitting the original signal into three intrinsic mode functional components and framing them accordingly. Feature extraction, feature selection and feature classification are the three phases in obtaining the classification efficiencies. All the statistical features from the original signals and reconstructed signals are found out in feature extraction process individually. A few statistical parameters are selected in feature selection process and are classified using the SVM classifier. The obtained results show the best parameters and appropriate kernel in SVM classifier for detecting the faults in bearings. Hence, we conclude that better results were obtained by VMD and SVM process over normal process using SVM. This is owing to denoising and filtering the raw vibrational signals.
Machine Learning methods for Quantitative Radiomic Biomarkers.

PubMed

Parmar, Chintan; Grossmann, Patrick; Bussink, Johan; Lambin, Philippe; Aerts, Hugo J W L

2015-08-17

Radiomics extracts and mines large number of medical imaging features quantifying tumor phenotypic characteristics. Highly accurate and reliable machine-learning approaches can drive the success of radiomic applications in clinical care. In this radiomic study, fourteen feature selection methods and twelve classification methods were examined in terms of their performance and stability for predicting overall survival. A total of 440 radiomic features were extracted from pre-treatment computed tomography (CT) images of 464 lung cancer patients. To ensure the unbiased evaluation of different machine-learning methods, publicly available implementations along with reported parameter configurations were used. Furthermore, we used two independent radiomic cohorts for training (n = 310 patients) and validation (n = 154 patients). We identified that Wilcoxon test based feature selection method WLCX (stability = 0.84 ± 0.05, AUC = 0.65 ± 0.02) and a classification method random forest RF (RSD = 3.52%, AUC = 0.66 ± 0.03) had highest prognostic performance with high stability against data perturbation. Our variability analysis indicated that the choice of classification method is the most dominant source of performance variation (34.21% of total variance). Identification of optimal machine-learning methods for radiomic applications is a crucial step towards stable and clinically relevant radiomic biomarkers, providing a non-invasive way of quantifying and monitoring tumor-phenotypic characteristics in clinical practice.
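The abstract above names a Wilcoxon test based feature selection method (WLCX) without giving details. As a rough illustration of the idea, the sketch below ranks features by the absolute z-score of the Wilcoxon rank-sum statistic between two outcome classes. It assumes binary labels, uses the normal approximation without a tie correction, and the feature names and values are hypothetical toy data, not radiomic features from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_z(feature, labels):
    """z-score of the rank sum of class 1 (normal approximation, no tie correction)."""
    ranks = average_ranks(feature)
    n1 = sum(1 for y in labels if y == 1)
    n0 = len(labels) - n1
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    expected = n1 * (n1 + n0 + 1) / 2
    spread = math.sqrt(n1 * n0 * (n1 + n0 + 1) / 12)
    return (rank_sum - expected) / spread


def rank_features(columns, labels):
    """Order feature names from most to least class-separating."""
    scores = {name: abs(rank_sum_z(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: three features measured on eight patients.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],   # separates the classes fully
    "texture": [2.0, 1.0, 4.0, 6.0, 3.0, 5.0, 8.0, 7.0],  # partial separation
    "noise": [1.0, 8.0, 3.0, 6.0, 2.0, 7.0, 4.0, 5.0],    # no separation
}
ranking = rank_features(columns, labels)
```

In practice the top-ranked features would then be passed to a classifier such as the random forest mentioned in the abstract; that step is omitted here.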
Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition.

PubMed

Caggiano, Alessandra

2018-03-09

Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim to monitor the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA allowed to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values.
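The linear projection of d features onto k = 2 principal component scores described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses power iteration with deflation on the sample covariance matrix rather than whatever PCA routine the authors used, and the three-column toy data set stands in for real sensor features.

```python
import math


def center(rows):
    """Subtract each column's mean so every feature has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def mat_vec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


def pca(rows, k=2):
    """Return the first k principal directions and the projected scores."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        eigenvalue, v = top_eigenpair(cov)
        components.append(v)
        # Deflation: remove the direction just found so the next pass finds the next one.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return components, scores


# Hypothetical toy data: 6 samples, d = 3 features (two strongly correlated).
rows = [
    [2.0, 1.9, 0.5],
    [4.0, 4.2, 0.1],
    [6.0, 5.8, 0.9],
    [8.0, 8.1, 0.3],
    [10.0, 9.9, 0.7],
    [12.0, 12.1, 0.2],
]
components, scores = pca(rows, k=2)
```

Each row of `scores` is the k = 2 feature vector that would be fed to a downstream model such as a neural network. Power iteration converges slowly when eigenvalues are close together, so a library eigen-solver is usually preferable for real data.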
Tool Wear Prediction in Ti-6Al-4V Machining through Multiple Sensor Monitoring and PCA Features Pattern Recognition

PubMed Central

2018-01-01

Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim to monitor the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA allowed to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values. PMID:29522443
e-IQ and IQ knowledge mining for generalized LDA

NASA Astrophysics Data System (ADS)

Jenkins, Jeffrey; van Bergem, Rutger; Sweet, Charles; Vietsch, Eveline; Szu, Harold

2015-05-01

How can the human brain uncover patterns, associations and features in real-time, real-world data? There must be a general strategy used to transform raw signals into useful features, but representing this generalization in the context of our information extraction tool set is lacking. In contrast to Big Data (BD), Large Data Analysis (LDA) has become a reachable multi-disciplinary goal in recent years due in part to high performance computers and algorithm development, as well as the availability of large data sets. However, the experience of Machine Learning (ML) and information communities has not been generalized into an intuitive framework that is useful to researchers across disciplines. The data exploration phase of data mining is a prime example of this unspoken, ad-hoc nature of ML - the Computer Scientist works with a Subject Matter Expert (SME) to understand the data, and then build tools (i.e. classifiers, etc.) which can benefit the SME and the rest of the researchers in that field. We ask, why is there not a tool to represent information in a meaningful way to the researcher asking the question? Meaning is subjective and contextual across disciplines, so to ensure robustness, we draw examples from several disciplines and propose a generalized LDA framework for independent data understanding of heterogeneous sources which contribute to Knowledge Discovery in Databases (KDD). Then, we explore the concept of adaptive Information resolution through a 6W unsupervised learning methodology feedback system. In this paper, we will describe the general process of man-machine interaction in terms of an asymmetric directed graph theory (digging for embedded knowledge), and model the inverse machine-man feedback (digging for tacit knowledge) as an ANN unsupervised learning methodology. Finally, we propose a collective learning framework which utilizes a 6W semantic topology to organize heterogeneous knowledge and diffuse information to entities within a society in a personalized way.
Understanding the underlying mechanism of HA-subtyping in the level of physic-chemical characteristics of protein.

PubMed

Ebrahimi, Mansour; Aghagolzadeh, Parisa; Shamabadi, Narges; Tahmasebi, Ahmad; Alsharifi, Mohammed; Adelson, David L; Hemmatzadeh, Farhid; Ebrahimie, Esmaeil

2014-01-01

The evolution of the influenza A virus to increase its host range is a major concern worldwide. Molecular mechanisms of increasing host range are largely unknown. Influenza surface proteins play determining roles in reorganization of host-sialic acid receptors and host range. In an attempt to uncover the physic-chemical attributes which govern HA subtyping, we performed a large scale functional analysis of over 7000 sequences of 16 different HA subtypes. Large number (896) of physic-chemical protein characteristics were calculated for each HA sequence. Then, 10 different attribute weighting algorithms were used to find the key characteristics distinguishing HA subtypes. Furthermore, to discover machine learning models which can predict HA subtypes, various Decision Tree, Support Vector Machine, Naïve Bayes, and Neural Network models were trained on calculated protein characteristics dataset as well as 10 trimmed datasets generated by attribute weighting algorithms. The prediction accuracies of the machine learning methods were evaluated by 10-fold cross validation. The results highlighted the frequency of Gln (selected by 80% of attribute weighting algorithms), percentage/frequency of Tyr, percentage of Cys, and frequencies of Try and Glu (selected by 70% of attribute weighting algorithms) as the key features that are associated with HA subtyping. Random Forest tree induction algorithm and RBF kernel function of SVM (scaled by grid search) showed high accuracy of 98% in clustering and predicting HA subtypes based on protein attributes. Decision tree models were successful in monitoring the short mutation/reassortment paths by which influenza virus can gain the key protein structure of another HA subtype and increase its host range in a short period of time with less energy consumption. Extracting and mining a large number of amino acid attributes of HA subtypes of influenza A virus through supervised algorithms represent a new avenue for understanding and predicting possible future structure of influenza pandemics.
Understanding the Underlying Mechanism of HA-Subtyping in the Level of Physic-Chemical Characteristics of Protein

PubMed Central

Ebrahimi, Mansour; Aghagolzadeh, Parisa; Shamabadi, Narges; Tahmasebi, Ahmad; Alsharifi, Mohammed; Adelson, David L.

2014-01-01

The evolution of the influenza A virus to increase its host range is a major concern worldwide. Molecular mechanisms of increasing host range are largely unknown. Influenza surface proteins play determining roles in reorganization of host-sialic acid receptors and host range. In an attempt to uncover the physic-chemical attributes which govern HA subtyping, we performed a large scale functional analysis of over 7000 sequences of 16 different HA subtypes. Large number (896) of physic-chemical protein characteristics were calculated for each HA sequence. Then, 10 different attribute weighting algorithms were used to find the key characteristics distinguishing HA subtypes. Furthermore, to discover machine learning models which can predict HA subtypes, various Decision Tree, Support Vector Machine, Naïve Bayes, and Neural Network models were trained on calculated protein characteristics dataset as well as 10 trimmed datasets generated by attribute weighting algorithms. The prediction accuracies of the machine learning methods were evaluated by 10-fold cross validation. The results highlighted the frequency of Gln (selected by 80% of attribute weighting algorithms), percentage/frequency of Tyr, percentage of Cys, and frequencies of Try and Glu (selected by 70% of attribute weighting algorithms) as the key features that are associated with HA subtyping. Random Forest tree induction algorithm and RBF kernel function of SVM (scaled by grid search) showed high accuracy of 98% in clustering and predicting HA subtypes based on protein attributes. Decision tree models were successful in monitoring the short mutation/reassortment paths by which influenza virus can gain the key protein structure of another HA subtype and increase its host range in a short period of time with less energy consumption. Extracting and mining a large number of amino acid attributes of HA subtypes of influenza A virus through supervised algorithms represent a new avenue for understanding and predicting possible future structure of influenza pandemics. PMID:24809455
Machine learning and data science in soft materials engineering

NASA Astrophysics Data System (ADS)

Ferguson, Andrew L.

2018-01-01

In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by ‘de-jargonizing’ data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.
Machine learning and data science in soft materials engineering.

PubMed

Ferguson, Andrew L

2018-01-31

In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.
Quantitative approaches to energy and glucose homeostasis: machine learning and modelling for precision understanding and prediction

PubMed Central

Murphy, Kevin G.; Jones, Nick S.

2018-01-01

Obesity is a major global public health problem. Understanding how energy homeostasis is regulated, and can become dysregulated, is crucial for developing new treatments for obesity. Detailed recording of individual behaviour and new imaging modalities offer the prospect of medically relevant models of energy homeostasis that are both understandable and individually predictive. The profusion of data from these sources has led to an interest in applying machine learning techniques to gain insight from these large, relatively unstructured datasets. We review both physiological models and machine learning results across a diverse range of applications in energy homeostasis, and highlight how modelling and machine learning can work together to improve predictive ability. We collect quantitative details in a comprehensive mathematical supplement. We also discuss the prospects of forecasting homeostatic behaviour and stress the importance of characterizing stochasticity within and between individuals in order to provide practical, tailored forecasts and guidance to combat the spread of obesity. PMID:29367240
A new method for the prediction of chatter stability lobes based on dynamic cutting force simulation model and support vector machine

NASA Astrophysics Data System (ADS)

Peng, Chong; Wang, Lun; Liao, T. Warren

2015-10-01

Currently, chatter has become the critical factor in hindering machining quality and productivity in machining processes. To avoid cutting chatter, a new method based on dynamic cutting force simulation model and support vector machine (SVM) is presented for the prediction of chatter stability lobes. The cutting force is selected as the monitoring signal, and the wavelet energy entropy theory is used to extract the feature vectors. A support vector machine is constructed using the MATLAB LIBSVM toolbox for pattern classification based on the feature vectors derived from the experimental cutting data. Then combining with the dynamic cutting force simulation model, the stability lobes diagram (SLD) can be estimated. Finally, the predicted results are compared with existing methods such as zero-order analytical (ZOA) and semi-discretization (SD) method as well as actual cutting experimental results to confirm the validity of this new method.
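The wavelet energy entropy feature mentioned above can be illustrated as the Shannon entropy of the relative energies in each wavelet band. The abstract does not say which wavelet or decomposition depth was used, and the authors worked in MATLAB, so the Haar wavelet, the three-level depth, the function names, and the toy signals below are all assumptions made for a self-contained Python sketch.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    scale = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_entropy(signal, levels=3):
    """Shannon entropy (natural log) of the energy share of each wavelet band."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(c * c for c in detail))  # energy of this detail band
    energies.append(sum(c * c for c in approx))      # energy of the final approximation
    total = sum(energies)
    if total == 0.0:
        return 0.0
    shares = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in shares)


# Hypothetical toy signals of length 8.
constant_signal = [1.0] * 8       # all energy in the approximation band
mixed_signal = [2.0, 0.0] * 4     # energy split evenly between two bands

entropy_constant = wavelet_energy_entropy(constant_signal)
entropy_mixed = wavelet_energy_entropy(mixed_signal)
```

The entropy is near zero when the energy sits in a single band and grows as the energy spreads across bands. In a pipeline like the one described, such values computed from cutting force signals would form part of the feature vector passed to the SVM.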
Vibration Sensor Monitoring of Nickel-Titanium Alloy Turning for Machinability Evaluation.

PubMed

Segreto, Tiziana; Caggiano, Alessandra; Karam, Sara; Teti, Roberto

2017-12-12

Nickel-Titanium (Ni-Ti) alloys are very difficult-to-machine materials causing notable manufacturing problems due to their unique mechanical properties, including superelasticity, high ductility, and severe strain-hardening. In this framework, the aim of this paper is to assess the machinability of Ni-Ti alloys with reference to turning processes in order to realize a reliable and robust in-process identification of machinability conditions. An on-line sensor monitoring procedure based on the acquisition of vibration signals was implemented during the experimental turning tests. The detected vibration sensorial data were processed through an advanced signal processing method in time-frequency domain based on wavelet packet transform (WPT). The extracted sensorial features were used to construct WPT pattern feature vectors to send as input to suitably configured neural networks (NNs) for cognitive pattern recognition in order to evaluate the correlation between input sensorial information and output machinability conditions.
Vibration Sensor Monitoring of Nickel-Titanium Alloy Turning for Machinability Evaluation

PubMed Central

Segreto, Tiziana; Karam, Sara; Teti, Roberto

2017-01-01

Nickel-Titanium (Ni-Ti) alloys are very difficult-to-machine materials causing notable manufacturing problems due to their unique mechanical properties, including superelasticity, high ductility, and severe strain-hardening. In this framework, the aim of this paper is to assess the machinability of Ni-Ti alloys with reference to turning processes in order to realize a reliable and robust in-process identification of machinability conditions. An on-line sensor monitoring procedure based on the acquisition of vibration signals was implemented during the experimental turning tests. The detected vibration sensorial data were processed through an advanced signal processing method in time-frequency domain based on wavelet packet transform (WPT). The extracted sensorial features were used to construct WPT pattern feature vectors to send as input to suitably configured neural networks (NNs) for cognitive pattern recognition in order to evaluate the correlation between input sensorial information and output machinability conditions. PMID:29231864
Automatic Extraction of Metadata from Scientific Publications for CRIS Systems

ERIC Educational Resources Information Center

Kovacevic, Aleksandar; Ivanovic, Dragan; Milosavljevic, Branko; Konjovic, Zora; Surla, Dusan

2011-01-01

Purpose: The aim of this paper is to develop a system for automatic extraction of metadata from scientific papers in PDF format for the information system for monitoring the scientific research activity of the University of Novi Sad (CRIS UNS). Design/methodology/approach: The system is based on machine learning and performs automatic extraction…
Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

ERIC Educational Resources Information Center

Jarman, Jay

2011-01-01

This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

Maximum entropy methods for extracting the learned features of deep neural networks.

PubMed

Finnegan, Alex; Song, Jun S

2017-10-01

New architectures of multilayer artificial neural networks and new methods for training them are rapidly revolutionizing the application of machine learning in diverse fields, including business, social science, physical sciences, and biology. Interpreting deep neural networks, however, currently remains elusive, and a critical challenge lies in understanding which meaningful features a network is actually learning. We present a general method for interpreting deep neural networks and extracting network-learned features from input data. We describe our algorithm in the context of biological sequence analysis. Our approach, based on ideas from statistical physics, samples from the maximum entropy distribution over possible sequences, anchored at an input sequence and subject to constraints implied by the empirical function learned by a network. Using our framework, we demonstrate that local transcription factor binding motifs can be identified from a network trained on ChIP-seq data and that nucleosome positioning signals are indeed learned by a network trained on chemical cleavage nucleosome maps. Imposing a further constraint on the maximum entropy distribution also allows us to probe whether a network is learning global sequence features, such as the high GC content in nucleosome-rich regions. This work thus provides valuable mathematical tools for interpreting and extracting learned features from feed-forward neural networks.
A la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge.

PubMed

Cherry, Colin; Zhu, Xiaodan; Martin, Joel; de Bruijn, Berry

2013-01-01

An analysis of the timing of events is critical for a deeper understanding of the course of events within a patient record. The 2012 i2b2 NLP challenge focused on the extraction of temporal relationships between concepts within textual hospital discharge summaries. The team from the National Research Council Canada (NRC) submitted three system runs to the second track of the challenge: typifying the time-relationship between pre-annotated entities. The NRC system was designed around four specialist modules containing statistical machine learning classifiers. Each specialist targeted distinct sets of relationships: local relationships, 'sectime'-type relationships, non-local overlap-type relationships, and non-local causal relationships. The best NRC submission achieved a precision of 0.7499, a recall of 0.6431, and an F1 score of 0.6924, resulting in a statistical tie for first place. Post hoc improvements led to a precision of 0.7537, a recall of 0.6455, and an F1 score of 0.6954, giving the highest scores reported on this task to date. Methods for general relation extraction extended well to temporal relations, and gave top-ranked state-of-the-art results. Careful ordering of predictions within result sets proved critical to this success.
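The F1 scores quoted in this abstract follow from the reported precision and recall through the harmonic mean, F1 = 2PR / (P + R). A short check, with a hypothetical helper name:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

submitted = f1_score(0.7499, 0.6431)   # best submitted run
post_hoc = f1_score(0.7537, 0.6455)    # after post hoc improvements
print(round(submitted, 4), round(post_hoc, 4))  # 0.6924 0.6954
```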
Identification of Alzheimer's disease and mild cognitive impairment using multimodal sparse hierarchical extreme learning machine.

PubMed

Kim, Jongin; Lee, Boreom

2018-05-07

Different modalities such as structural MRI, FDG-PET, and CSF have complementary information, which is likely to be very useful for diagnosis of AD and MCI. Therefore, it is possible to develop a more effective and accurate AD/MCI automatic diagnosis method by integrating complementary information of different modalities. In this paper, we propose a multi-modal sparse hierarchical extreme learning machine (MSH-ELM). We used volume and mean intensity extracted from 93 regions of interest (ROIs) as features of MRI and FDG-PET, respectively, and used p-tau, t-tau, and Aβ42 as CSF features. In detail, a high-level representation was individually extracted from each of MRI, FDG-PET, and CSF using a stacked sparse extreme learning machine auto-encoder (sELM-AE). Then, another stacked sELM-AE was devised to acquire a joint hierarchical feature representation by fusing the high-level representations obtained from each modality. Finally, we classified the joint hierarchical feature representation using a kernel-based extreme learning machine (KELM). The results of MSH-ELM were compared with those of conventional ELM, single kernel support vector machine (SK-SVM), multiple kernel support vector machine (MK-SVM) and stacked auto-encoder (SAE). Performance was evaluated through 10-fold cross-validation. In the AD vs. HC and MCI vs. HC classification problems, the proposed MSH-ELM method showed mean balanced accuracies of 96.10% and 86.46%, respectively, which are much better than those of competing methods. In summary, the proposed algorithm exhibits consistently better performance than SK-SVM, ELM, MK-SVM and SAE in the two binary classification problems (AD vs. HC and MCI vs. HC). © 2018 Wiley Periodicals, Inc.
Multi-temporal Land Use Mapping of Coastal Wetlands Area using Machine Learning in Google Earth Engine

NASA Astrophysics Data System (ADS)

Farda, N. M.

2017-12-01

Coastal wetlands provide ecosystem services essential to people and the environment. Changes in coastal wetlands, especially in land use, are important to monitor using multi-temporal imagery. The Google Earth Engine (GEE) provides many machine learning algorithms (10 algorithms) that are very useful for extracting land use from imagery. The research objective is to explore machine learning in Google Earth Engine and its accuracy for multi-temporal land use mapping of a coastal wetland area. Landsat 3 MSS (1978), Landsat 5 TM (1991), Landsat 7 ETM+ (2001), and Landsat 8 OLI (2014) images of the Segara Anakan lagoon were selected to represent multi-temporal images. The inputs for machine learning are visible and near-infrared bands, PCA band, inverse PCA bands, bare soil index, vegetation index, wetness index, elevation from ASTER GDEM, and GLCM (Haralick) texture, together with polygon samples at 140 locations. Ten machine learning algorithms were applied to extract coastal wetland land use from Landsat imagery: Fast Naive Bayes, CART (Classification and Regression Tree), Random Forests, GMO Max Entropy, Perceptron (Multi Class Perceptron), Winnow, Voting SVM, Margin SVM, Pegasos (Primal Estimated sub-GrAdient SOlver for Svm), and IKPamir (Intersection Kernel Passive Aggressive Method for Information Retrieval, SVM). Machine learning in Google Earth Engine is very helpful for multi-temporal land use mapping; the highest accuracy for land use mapping of the coastal wetland was achieved by CART, with 96.98% overall accuracy using K-fold cross-validation (K = 10). GEE is particularly useful for multi-temporal land use mapping with ready-to-use imagery and classification algorithms, and also very challenging for other applications.
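The overall accuracy under K-fold cross-validation reported in this abstract can be sketched in plain Python. This is not the Google Earth Engine API, and the abstract does not say how the folds were formed. The shuffled, equally sized folds and the names `k_fold_indices`, `cross_validated_accuracy`, and `train_and_predict` are assumptions made for illustration.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k shuffled, disjoint, nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_accuracy(samples, labels, train_and_predict, k=10):
    """Overall accuracy: correct held-out predictions divided by all samples.

    train_and_predict(train_samples, train_labels) must return a function
    that maps one sample to a predicted label (hypothetical interface).
    """
    correct = 0
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        predict = train_and_predict([samples[i] for i in train_idx],
                                    [labels[i] for i in train_idx])
        correct += sum(predict(samples[i]) == labels[i] for i in held_out)
    return correct / len(samples)
```

With 140 samples and K = 10, each fold holds out 14 samples, so every sample is predicted exactly once by a model that never saw it during training.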
Use of Machine Learning Classifiers and Sensor Data to Detect Neurological Deficit in Stroke Patients.

PubMed

Park, Eunjeong; Chang, Hyuk-Jae; Nam, Hyo Suk

2017-04-18

The pronator drift test (PDT), a neurological examination, is widely used in clinics to measure motor weakness of stroke patients. The aim of this study was to develop a PDT tool with machine learning classifiers to detect stroke symptoms based on quantification of proximal arm weakness using inertial sensors and signal processing. We extracted features of drift and pronation from accelerometer signals of wearable devices on the inner wrists of 16 stroke patients and 10 healthy controls. Signal processing and feature selection were applied to discriminate the PDT features used to classify stroke patients. A series of machine learning techniques, namely support vector machine (SVM), radial basis function network (RBFN), and random forest (RF), were implemented to discriminate stroke patients from controls with leave-one-out cross-validation. Signal processing by the PDT tool extracted a total of 12 PDT features from sensors. Feature selection abstracted the major attributes from the 12 PDT features to elucidate the dominant characteristics of proximal weakness of stroke patients using machine learning classification. Our proposed PDT classifiers had an area under the receiver operating characteristic curve (AUC) of .806 (SVM), .769 (RBFN), and .900 (RF) without feature selection, and feature selection improved the AUCs to .913 (SVM), .956 (RBFN), and .975 (RF), representing an average performance enhancement of 15.3%. Sensors and machine learning methods can reliably detect stroke signs and quantify proximal arm weakness. Our proposed solution will facilitate pervasive monitoring of stroke patients. ©Eunjeong Park, Hyuk-Jae Chang, Hyo Suk Nam. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 18.04.2017.
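The abstract does not say how the 15.3% average enhancement was computed. One reading that reproduces the figure from the reported AUCs is the mean of the per-classifier relative gains. That reading is an assumption, shown here as a quick check:

```python
# AUCs reported in the abstract, without and with feature selection.
auc_before = {"SVM": 0.806, "RBFN": 0.769, "RF": 0.900}
auc_after = {"SVM": 0.913, "RBFN": 0.956, "RF": 0.975}

# Relative gain per classifier, then the unweighted mean (assumed definition).
relative_gain = {name: auc_after[name] / auc_before[name] - 1 for name in auc_before}
mean_gain_pct = 100 * sum(relative_gain.values()) / len(relative_gain)
print(round(mean_gain_pct, 1))  # 15.3
```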
Machine learning-based quantitative texture analysis of CT images of small renal masses: Differentiation of angiomyolipoma without visible fat from renal cell carcinoma.

PubMed

Feng, Zhichao; Rong, Pengfei; Cao, Peng; Zhou, Qingyu; Zhu, Wenwei; Yan, Zhimin; Liu, Qianyun; Wang, Wei

2018-04-01

To evaluate the diagnostic performance of machine-learning based quantitative texture analysis of CT images to differentiate small (≤ 4 cm) angiomyolipoma without visible fat (AMLwvf) from renal cell carcinoma (RCC). This single-institutional retrospective study included 58 patients with pathologically proven small renal mass (17 in the AMLwvf group and 41 in the RCC group). Texture features were extracted from the largest possible tumorous regions of interest (ROIs) by manual segmentation in preoperative three-phase CT images. Interobserver reliability and the Mann-Whitney U test were applied to select features preliminarily. Then support vector machine with recursive feature elimination (SVM-RFE) and synthetic minority oversampling technique (SMOTE) were adopted to establish discriminative classifiers, and the performance of classifiers was assessed. Of the 42 extracted features, 16 candidate features showed significant intergroup differences (P < 0.05) and had good interobserver agreement. An optimal feature subset including 11 features was further selected by the SVM-RFE method. The SVM-RFE+SMOTE classifier achieved the best performance in discriminating between small AMLwvf and RCC, with the highest accuracy, sensitivity, specificity and AUC of 93.9 %, 87.8 %, 100 % and 0.955, respectively. Machine learning analysis of CT texture features can facilitate the accurate differentiation of small AMLwvf from RCC. • Although conventional CT is useful for diagnosis of SRMs, it has limitations. • Machine-learning based CT texture analysis facilitates differentiation of small AMLwvf from RCC. • The highest accuracy of SVM-RFE+SMOTE classifier reached 93.9 %. • Texture analysis combined with machine-learning methods might spare unnecessary surgery for AMLwvf.
Using input feature information to improve ultraviolet retrieval in neural networks

NASA Astrophysics Data System (ADS)

Sun, Zhibin; Chang, Ni-Bin; Gao, Wei; Chen, Maosi; Zempila, Melina

2017-09-01

In neural networks, the training/predicting accuracy and algorithm efficiency can be improved significantly via accurate input feature extraction. In this study, some spatial features of several important factors in retrieving surface ultraviolet (UV) are extracted. An extreme learning machine (ELM) is used to retrieve the surface UV of 2014 in the continental United States, using the extracted features. The results conclude that more input weights can improve the learning capacities of neural networks.
CD-REST: a system for extracting chemical-induced disease relation in literature.

PubMed

Xu, Jun; Wu, Yonghui; Zhang, Yaoyun; Wang, Jingqi; Lee, Hee-Jin; Xu, Hua

2016-01-01

Mining chemical-induced disease relations embedded in the vast biomedical literature could facilitate a wide range of computational biomedical applications, such as pharmacovigilance. The BioCreative V organized a Chemical Disease Relation (CDR) Track regarding chemical-induced disease relation extraction from biomedical literature in 2015. We participated in all subtasks of this challenge. In this article, we present our participation system Chemical Disease Relation Extraction SysTem (CD-REST), an end-to-end system for extracting chemical-induced disease relations in biomedical literature. CD-REST consists of two main components: (1) a chemical and disease named entity recognition and normalization module, which employs the Conditional Random Fields algorithm for entity recognition and a Vector Space Model-based approach for normalization; and (2) a relation extraction module that classifies both sentence-level and document-level candidate drug-disease pairs by support vector machines. Our system achieved the best performance on the chemical-induced disease relation extraction subtask in the BioCreative V CDR Track, demonstrating the effectiveness of our proposed machine learning-based approaches for automatic extraction of chemical-induced disease relations in biomedical literature. The CD-REST system provides web services using HTTP POST request. The web services can be accessed from http://clinicalnlptool.com/cdr. The online CD-REST demonstration system is available at http://clinicalnlptool.com/cdr/cdr.html. Database URL: http://clinicalnlptool.com/cdr; http://clinicalnlptool.com/cdr/cdr.html. © The Author(s) 2016. Published by Oxford University Press.
Building machines that adapt and compute like brains.

PubMed

Kriegeskorte, Nikolaus; Mok, Robert M

2017-01-01

Building machines that learn and think like humans is essential not only for cognitive science, but also for computational neuroscience, whose ultimate goal is to understand how cognition is implemented in biological brains. A new cognitive computational neuroscience should build cognitive-level and neural-level models, understand their relationships, and test both types of models with both brain and behavioral data.
Mining EEG with SVM for Understanding Cognitive Underpinnings of Math Problem Solving Strategies

PubMed Central

López, Julio

2018-01-01

We have developed a new methodology for examining and extracting patterns from brain electric activity by using data mining and machine learning techniques. Data was collected from experiments focused on the study of cognitive processes that might evoke different specific strategies in the resolution of math problems. A binary classification problem was constructed using correlations and phase synchronization between different electroencephalographic channels as characteristics and, as labels or classes, the math performances of individuals participating in specially designed experiments. The proposed methodology is based on using well-established procedures of feature selection, which were used to determine a suitable brain functional network size related to math problem solving strategies and also to discover the most relevant links in this network without including noisy connections or excluding significant connections. PMID:29670667
Mining EEG with SVM for Understanding Cognitive Underpinnings of Math Problem Solving Strategies.

PubMed

Bosch, Paul; Herrera, Mauricio; López, Julio; Maldonado, Sebastián

2018-01-01

We have developed a new methodology for examining and extracting patterns from brain electric activity by using data mining and machine learning techniques. Data was collected from experiments focused on the study of cognitive processes that might evoke different specific strategies in the resolution of math problems. A binary classification problem was constructed using correlations and phase synchronization between different electroencephalographic channels as characteristics and, as labels or classes, the math performances of individuals participating in specially designed experiments. The proposed methodology is based on using well-established procedures of feature selection, which were used to determine a suitable brain functional network size related to math problem solving strategies and also to discover the most relevant links in this network without including noisy connections or excluding significant connections.
Classification of Aerial Photogrammetric 3d Point Clouds

NASA Astrophysics Data System (ADS)

Becker, C.; Häni, N.; Rosinskaya, E.; d'Angelo, E.; Strecha, C.

2017-05-01

We present a powerful method to extract per-point semantic class labels from aerial photogrammetry data. Labelling this kind of data is important for tasks such as environmental modelling, object classification and scene understanding. Unlike previous point cloud classification methods that rely exclusively on geometric features, we show that incorporating color information yields a significant increase in accuracy in detecting semantic classes. We test our classification method on three real-world photogrammetry datasets that were generated with Pix4Dmapper Pro, and with varying point densities. We show that off-the-shelf machine learning techniques coupled with our new features allow us to train highly accurate classifiers that generalize well to unseen data, processing point clouds containing 10 million points in less than 3 minutes on a desktop computer.
Feature Selection in Order to Extract Multiple Sclerosis Lesions Automatically in 3D Brain Magnetic Resonance Images Using Combination of Support Vector Machine and Genetic Algorithm.

PubMed

Khotanlou, Hassan; Afrasiabi, Mahlagha

2012-10-01

This paper presents a new feature selection approach for automatically extracting multiple sclerosis (MS) lesions in three-dimensional (3D) magnetic resonance (MR) images. The presented method is applicable to different types of MS lesions. In this method, T1, T2, and fluid attenuated inversion recovery (FLAIR) images are first preprocessed. In the next phase, effective features to extract MS lesions are selected by using a genetic algorithm (GA). The fitness function of the GA is the Similarity Index (SI) of a support vector machine (SVM) classifier. The results obtained on different types of lesions have been evaluated by comparison with manual segmentations. This algorithm is evaluated on 15 real 3D MR images using several measures. As a result, the SI between MS regions determined by the proposed method and radiologists was 87% on average. Experiments and comparisons with other methods show the effectiveness and the efficiency of the proposed approach.
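The abstract does not define the Similarity Index. In lesion segmentation work the term commonly refers to the Dice overlap, 2|A ∩ B| / (|A| + |B|), and the sketch below assumes that definition. Segmentations are represented as sets of voxel coordinates, and the function name is hypothetical.

```python
def similarity_index(predicted, reference):
    """Dice overlap between two segmentations given as voxel-coordinate collections."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # two empty segmentations agree trivially
    return 2 * len(predicted & reference) / (len(predicted) + len(reference))

# Toy example: two of three voxels agree, so SI = 2 * 2 / (3 + 3).
auto = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
manual = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
si = similarity_index(auto, manual)
```

Under this assumed definition, a GA fitness function would score each candidate feature subset by the SI of the resulting SVM segmentation against the manual one.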
Learning About Climate and Atmospheric Models Through Machine Learning

NASA Astrophysics Data System (ADS)

Lucas, D. D.

2017-12-01

From the analysis of ensemble variability to improving simulation performance, machine learning algorithms can play a powerful role in understanding the behavior of atmospheric and climate models. To learn about model behavior, we create training and testing data sets through ensemble techniques that sample different model configurations and values of input parameters, and then use supervised machine learning to map the relationships between the inputs and outputs. Following this procedure, we have used support vector machines, random forests, gradient boosting and other methods to investigate a variety of atmospheric and climate model phenomena. We have used machine learning to predict simulation crashes, estimate the probability density function of climate sensitivity, optimize simulations of the Madden Julian oscillation, assess the impacts of weather and emissions uncertainty on atmospheric dispersion, and quantify the effects of model resolution changes on precipitation. This presentation highlights recent examples of our applications of machine learning to improve the understanding of climate and atmospheric models. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Intelligent Gearbox Diagnosis Methods Based on SVM, Wavelet Lifting and RBR

PubMed Central

Gao, Lixin; Ren, Zhiqiang; Tang, Wenliang; Wang, Huaqing; Chen, Peng

2010-01-01

Given the problems in intelligent gearbox diagnosis methods, it is difficult to obtain the desired information and a large enough sample size to study; therefore, we propose the application of various methods for gearbox fault diagnosis, including wavelet lifting, a support vector machine (SVM) and rule-based reasoning (RBR). In a complex field environment, it is less likely for machines to have the same fault; moreover, the fault features can also vary. Therefore, a SVM could be used for the initial diagnosis. First, gearbox vibration signals were processed with wavelet packet decomposition, and the signal energy coefficients of each frequency band were extracted and used as input feature vectors in SVM for normal and faulty pattern recognition. Second, precision analysis using wavelet lifting could successfully filter out the noisy signals while maintaining the impulse characteristics of the fault; thus effectively extracting the fault frequency of the machine. Lastly, the knowledge base was built based on the field rules summarized by experts to identify the detailed fault type. Results have shown that SVM is a powerful tool to accomplish gearbox fault pattern recognition when the sample size is small, whereas the wavelet lifting scheme can effectively extract fault features, and rule-based reasoning can be used to identify the detailed fault type. Therefore, a method that combines SVM, wavelet lifting and rule-based reasoning ensures effective gearbox fault diagnosis. PMID:22399894
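The first step in this abstract, wavelet packet decomposition followed by per-band signal energy coefficients, can be sketched as follows. The abstract names neither the wavelet nor the depth, so the Haar filter and three levels (eight bands) are assumptions. The function names are hypothetical, and the bands come out in natural tree order rather than strict frequency order.

```python
import math

def haar_split(x):
    """Split a sequence into low-pass and high-pass halves (orthonormal Haar)."""
    s = 1 / math.sqrt(2)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def packet_band_energies(signal, levels=3):
    """Normalized energy of each of the 2 ** levels wavelet packet bands.

    Unlike the plain wavelet transform, both halves are split again at
    every level. Assumes len(signal) >= 2 ** levels.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies] if total else energies
```

The resulting list is the kind of fixed-length feature vector that could be passed to an SVM for normal versus faulty pattern recognition.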
Identifying seizure onset zone from electrocorticographic recordings: A machine learning approach based on phase locking value.

PubMed

Elahian, Bahareh; Yeasin, Mohammed; Mudigoudar, Basanagoud; Wheless, James W; Babajani-Feremi, Abbas

2017-10-01

Using a novel technique based on phase locking value (PLV), we investigated the potential for features extracted from electrocorticographic (ECoG) recordings to serve as biomarkers to identify the seizure onset zone (SOZ). We computed the PLV between the phase of the amplitude of high gamma activity (80-150Hz) and the phase of lower frequency rhythms (4-30Hz) from ECoG recordings obtained from 10 patients with epilepsy (21 seizures). We extracted five features from the PLV and used a machine learning approach based on logistic regression to build a model that classifies electrodes as SOZ or non-SOZ. More than 96% of electrodes identified as the SOZ by our algorithm were within the resected area in six seizure-free patients. In four non-seizure-free patients, more than 31% of the identified SOZ electrodes by our algorithm were outside the resected area. In addition, we observed that the seizure outcome in non-seizure-free patients correlated with the number of non-resected SOZ electrodes identified by our algorithm. This machine learning approach, based on features extracted from the PLV, effectively identified electrodes within the SOZ. The approach has the potential to assist clinicians in surgical decision-making when pre-surgical intracranial recordings are utilized. Copyright © 2017 British Epilepsy Association. Published by Elsevier Ltd. All rights reserved.
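The phase locking value used in this abstract has a standard definition: the magnitude of the mean unit phasor of the phase difference between two series. The sketch below covers only that final step. It assumes the two phase series (here, the phase of the high gamma amplitude envelope and the phase of the lower frequency rhythm) have already been extracted, for example with a band-pass filter and Hilbert transform, which is outside this example.

```python
import cmath
import math

def phase_locking_value(phases_a, phases_b):
    """PLV = |mean(exp(i * (phase_a - phase_b)))|, a value between 0 and 1."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase series must be non-empty and equally long")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)

# A constant lag gives PLV near 1; phase differences spread evenly
# around the circle give PLV near 0.
t = [2 * math.pi * k / 8 for k in range(8)]
locked = phase_locking_value(t, [p - 0.3 for p in t])
unlocked = phase_locking_value(t, [0.0] * 8)
```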
Deep convolutional neural networks for classifying GPR B-scans

NASA Astrophysics Data System (ADS)

Besaw, Lance E.; Stimac, Philip J.

2015-05-01

Symmetric and asymmetric buried explosive hazards (BEHs) present real, persistent, deadly threats on the modern battlefield. Current approaches to mitigate these threats rely on highly trained operatives to reliably detect BEHs with reasonable false alarm rates using handheld Ground Penetrating Radar (GPR) and metal detectors. As computers become smaller, faster and more efficient, there exists greater potential for automated threat detection based on state-of-the-art machine learning approaches, reducing the burden on the field operatives. Recent advancements in machine learning, specifically deep learning artificial neural networks, have led to significantly improved performance in pattern recognition tasks, such as object classification in digital images. Deep convolutional neural networks (CNNs) are used in this work to extract meaningful signatures from 2-dimensional (2-D) GPR B-scans and classify threats. The CNNs skip the traditional "feature engineering" step often associated with machine learning, and instead learn the feature representations directly from the 2-D data. A multi-antennae, handheld GPR with centimeter-accurate positioning data was used to collect shallow subsurface data over prepared lanes containing a wide range of BEHs. Several heuristics were used to prevent over-training, including cross validation, network weight regularization, and "dropout." Our results show that CNNs can extract meaningful features and accurately classify complex signatures contained in GPR B-scans, complementing existing GPR feature extraction and classification techniques.
Intelligent gearbox diagnosis methods based on SVM, wavelet lifting and RBR.

PubMed

Gao, Lixin; Ren, Zhiqiang; Tang, Wenliang; Wang, Huaqing; Chen, Peng

2010-01-01

Given the problems in intelligent gearbox diagnosis methods, it is difficult to obtain the desired information and a large enough sample size to study; therefore, we propose the application of various methods for gearbox fault diagnosis, including wavelet lifting, a support vector machine (SVM) and rule-based reasoning (RBR). In a complex field environment, it is less likely for machines to have the same fault; moreover, the fault features can also vary. Therefore, a SVM could be used for the initial diagnosis. First, gearbox vibration signals were processed with wavelet packet decomposition, and the signal energy coefficients of each frequency band were extracted and used as input feature vectors in SVM for normal and faulty pattern recognition. Second, precision analysis using wavelet lifting could successfully filter out the noisy signals while maintaining the impulse characteristics of the fault; thus effectively extracting the fault frequency of the machine. Lastly, the knowledge base was built based on the field rules summarized by experts to identify the detailed fault type. Results have shown that SVM is a powerful tool to accomplish gearbox fault pattern recognition when the sample size is small, whereas the wavelet lifting scheme can effectively extract fault features, and rule-based reasoning can be used to identify the detailed fault type. Therefore, a method that combines SVM, wavelet lifting and rule-based reasoning ensures effective gearbox fault diagnosis.
An illustration of new methods in machine condition monitoring, Part I: stochastic resonance

NASA Astrophysics Data System (ADS)

Worden, K.; Antoniadou, I.; Marchesiello, S.; Mba, C.; Garibaldi, L.

2017-05-01

There have been many recent developments in the application of data-based methods to machine condition monitoring. A powerful methodology based on machine learning has emerged, where diagnostics are based on a two-step procedure: extraction of damage-sensitive features, followed by unsupervised learning (novelty detection) or supervised learning (classification). The objective of the current pair of papers is simply to illustrate one state-of-the-art procedure for each step, using synthetic data representative of reality in terms of size and complexity. The first paper in the pair will deal with feature extraction. Although some papers have appeared in the recent past considering stochastic resonance as a means of amplifying damage information in signals, they have largely relied on ad hoc specifications of the resonator used. In contrast, the current paper will adopt a principled optimisation-based approach to the resonator design. The paper will also show that a discrete dynamical system can provide all the benefits of a continuous system, but also provide a considerable speed-up in terms of simulation time in order to facilitate the optimisation approach.
Machine-aided indexing at NASA

NASA Technical Reports Server (NTRS)

Silvester, June P.; Genuardi, Michael T.; Klingbiel, Paul H.

1994-01-01

This report describes the NASA Lexical Dictionary (NLD), a machine-aided indexing system used online at the National Aeronautics and Space Administration's Center for AeroSpace Information (CASI). This system automatically suggests a set of candidate terms from NASA's controlled vocabulary for any designated natural language text input. The system is comprised of a text processor that is based on the computational, nonsyntactic analysis of input text and an extensive knowledge base that serves to recognize and translate text-extracted concepts. The functions of the various NLD system components are described in detail, and production and quality benefits resulting from the implementation of machine-aided indexing at CASI are discussed.

Data mining in bioinformatics using Weka.

PubMed

Frank, Eibe; Hall, Mark; Trigg, Len; Holmes, Geoffrey; Witten, Ian H

2004-10-12

The Weka machine learning workbench provides a general-purpose environment for automatic classification, regression, clustering and feature selection, which are common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data pre-processing methods complemented by graphical user interfaces for data exploration and the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it. http://www.cs.waikato.ac.nz/ml/weka.
A new generation of medical cyclotrons for the 90's

DOE Office of Scientific and Technical Information (OSTI.GOV)

Milton, B.F.

1995-08-01

Cyclotrons continue to be efficient accelerators for use in radio-isotope production. In recent years, developments in accelerator technology have greatly increased the practical beam current in these machines while also improving the overall system reliability. These developments, combined with the development of new isotopes for medicine and industry and the retirement of older machines, indicate a strong future for commercial cyclotrons. In this paper the authors will survey recent developments in the areas of cyclotron technology as they relate to the new generation of commercial cyclotrons. Existing and potential markets for these cyclotrons will be presented. They will also discuss the possibility of systems capable of extracted energies up to 150 MeV and extracted beam currents of up to 2.0 mA.
In vivo classification of human skin burns using machine learning and quantitative features captured by optical coherence tomography

NASA Astrophysics Data System (ADS)

Singla, Neeru; Srivastava, Vishal; Singh Mehta, Dalip

2018-02-01

We report the first fully automated detection of human skin burn injuries in vivo, with the goal of automatic surgical margin assessment based on optical coherence tomography (OCT) images. Our proposed automated procedure entails building a machine-learning-based classifier by extracting quantitative features from normal and burn tissue images recorded by OCT. In this study, 56 samples (28 normal, 28 burned) were imaged by OCT and eight features were extracted. A linear model classifier was trained using 34 samples and 22 samples were used to test the model. Sensitivity of 91.6% and specificity of 90% were obtained. Our results demonstrate the capability of a computer-aided technique for accurately and automatically identifying burn tissue resection margins during surgical treatment.
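As a reading aid for the reported figures, the following minimal Python sketch shows how sensitivity and specificity follow from confusion-matrix counts. The split of the 22 test samples into 12 burn and 10 normal samples, and the individual counts, are assumptions chosen only because they roughly reproduce the reported values; the abstract does not state them.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of burn samples that were detected
    specificity = tn / (tn + fp)  # fraction of normal samples correctly cleared
    return sensitivity, specificity


# Hypothetical counts (assumption): 12 burn and 10 normal test samples.
sens, spec = sensitivity_specificity(tp=11, fn=1, tn=9, fp=1)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

With a test set this small, a single misclassified sample moves either figure by roughly ten percentage points, which is worth keeping in mind when comparing such results.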
Image Analysis and Modeling

DTIC Science & Technology

1975-08-01

image analysis and processing tasks such as information extraction, image enhancement and restoration, coding, etc. The ultimate objective of this research is to form a basis for the development of technology relevant to military applications of machine extraction of information from aircraft and satellite imagery of the earth’s surface. This report discusses research activities during the three month period February 1 - April 30,
Prediction of residue-residue contact matrix for protein-protein interaction with Fisher score features and deep learning.

PubMed

Du, Tianchuan; Liao, Li; Wu, Cathy H; Sun, Bilin

2016-11-01

Protein-protein interactions play essential roles in many biological processes. Acquiring knowledge of the residue-residue contact information of two interacting proteins is not only helpful in annotating functions for proteins, but also critical for structure-based drug design. The prediction of the protein residue-residue contact matrix of the interfacial regions is challenging. In this work, we introduced deep learning techniques (specifically, stacked autoencoders) to build deep neural network models to tackle the residue-residue contact prediction problem. In tandem with interaction profile Hidden Markov Models, which were used first to extract Fisher score features from protein sequences, stacked autoencoders were deployed to extract and learn hidden abstract features. The deep learning model showed significant improvement over the traditional machine learning model, Support Vector Machines (SVM), with the overall accuracy increased by about 15 percentage points, from 65.40% to 80.82%. We showed that the stacked autoencoders could extract novel features out of the Fisher score features, which can be utilized by deep neural networks and other classifiers to enhance learning. It is further shown that deep neural networks have significant advantages over SVM in making use of the newly extracted features. Copyright © 2016. Published by Elsevier Inc.
Combining semi-automated image analysis techniques with machine learning algorithms to accelerate large-scale genetic studies.

PubMed

Atkinson, Jonathan A; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E; Griffiths, Marcus; Wells, Darren M

2017-10-01

Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. © The Authors 2017. Published by Oxford University Press.
Combining semi-automated image analysis techniques with machine learning algorithms to accelerate large-scale genetic studies

PubMed Central

Atkinson, Jonathan A.; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E.; Griffiths, Marcus

2017-01-01

Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. PMID:29020748
Machine Learning Classification Combining Multiple Features of A Hyper-Network of fMRI Data in Alzheimer's Disease

PubMed Central

Guo, Hao; Zhang, Fan; Chen, Junjie; Xu, Yong; Xiang, Jie

2017-01-01

Exploring functional interactions among various brain regions is helpful for understanding the pathological underpinnings of neurological disorders. Brain networks provide an important representation of those functional interactions, and thus are widely applied in the diagnosis and classification of neurodegenerative diseases. Many mental disorders involve a sharp decline in cognitive ability as a major symptom, which can be caused by abnormal connectivity patterns among several brain regions. However, conventional functional connectivity networks are usually constructed based on pairwise correlations among different brain regions. This approach ignores higher-order relationships, and cannot effectively characterize the high-order interactions of many brain regions working together. Recent neuroscience research suggests that higher-order relationships between brain regions are important for brain network analysis. Hyper-networks have been proposed that can effectively represent the interactions among brain regions. However, this method extracts the local properties of brain regions as features, but ignores the global topology information, which affects the evaluation of network topology and reduces the performance of the classifier. This problem can be compensated by a subgraph feature-based method, but it is not sensitive to change in a single brain region. Considering that both of these feature extraction methods result in the loss of information, we propose a novel machine learning classification method that combines multiple features of a hyper-network based on functional magnetic resonance imaging in Alzheimer's disease. The method combines the brain region features and subgraph features, and then uses a multi-kernel SVM for classification. This retains not only the global topological information, but also the sensitivity to change in a single brain region. 
To validate the proposed method, 28 normal control subjects and 38 Alzheimer's disease patients were selected to participate in an experiment. The proposed method achieved satisfactory classification accuracy, with an average of 91.60%. The abnormal brain regions included the bilateral precuneus, right parahippocampal gyrus/hippocampus, right posterior cingulate gyrus, and other regions that are known to be important in Alzheimer's disease. Machine learning classification combining multiple features of a hyper-network of functional magnetic resonance imaging data in Alzheimer's disease obtains better classification performance. PMID:29209156
Preliminary results concerning the simulation of beam profiles from extracted ion current distributions for mini-STRIKE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Agostinetti, P., E-mail: piero.agostinetti@igi.cnr.it; Serianni, G.; Veltri, P.

The Radio Frequency (RF) negative hydrogen ion source prototype has been chosen for the ITER neutral beam injectors due to its optimal performance and easier maintenance, demonstrated at Max-Planck-Institut für Plasmaphysik, Garching, in hydrogen and deuterium. One key piece of information for better understanding the operating behavior of the RF ion sources is the extracted negative ion current density distribution. This distribution (influenced by several factors such as source geometry, particle drifts inside the source, cesium distribution, and layout of cesium ovens) is not straightforward to evaluate. The main outcome of the present contribution is the development of a minimization method to estimate the extracted current distribution using the footprint of the beam recorded with mini-STRIKE (Short-Time Retractable Instrumented Kalorimeter). To accomplish this, a series of four computational models has been set up, where the output of each model is the input of the following one. These models compute the optics of the ion beam, evaluate the distribution of the heat deposited on the mini-STRIKE diagnostic calorimeter, and finally give an estimate of the temperature distribution on the back of mini-STRIKE. Several iterations with different extracted current profiles are necessary to give an estimate of the profile most compatible with the experimental data. A first test of the application of the method to the BAvarian Test Machine for Negative ions beam is given.
Effectiveness of Podcasts as Laboratory Instructional Support: Learner Perceptions of Machine Shop and Welding Students

ERIC Educational Resources Information Center

Lauritzen, Louis Dee

2014-01-01

Machine shop students face the daunting task of learning the operation of complex three-dimensional machine tools, and welding students must develop specific motor skills in addition to understanding the complexity of material types and characteristics. The use of consumer technology by the Millennial generation of vocational students, the…
CHARACTERIZATION OF Pro-Beam LOW VOLTAGE ELECTRON BEAM WELDING MACHINE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burgardt, Paul; Pierce, Stanley W.

The purpose of this paper is to present and discuss data related to the performance of a newly acquired low voltage electron beam welding machine. The machine was made by Pro-Beam AG & Co. KGaA of Germany. This machine was recently installed at LANL in building SM-39; a companion machine was installed in the production facility. The PB machine is substantially different from the EBW machines typically used at LANL and therefore, it is important to understand its characteristics as well as possible. Our basic purpose in this paper is to present basic machine performance data and to compare those with similar results from the existing EBW machines. It is hoped that this data will provide a historical record of this machine's characteristics as well as possibly being helpful for transferring welding processes from the old EBW machines to the PB machine or comparable machines that may be purchased in the future.
Surface mapping via unsupervised classification of remote sensing: application to MESSENGER/MASCS and DAWN/VIRS data.

NASA Astrophysics Data System (ADS)

D'Amore, M.; Le Scaon, R.; Helbert, J.; Maturilli, A.

2017-12-01

Machine learning (ML) has achieved unprecedented results in high-dimensional data processing tasks with wide applications in various fields. Given the growing number of complex nonlinear systems that have to be investigated in science and the sheer size of the data now available, ML offers the unique ability to extract knowledge regardless of the specific application field. Examples are image segmentation, supervised/unsupervised/semi-supervised classification, feature extraction, and data dimensionality analysis/reduction. The MASCS instrument has mapped Mercury's surface in the 400-1145 nm wavelength range during orbital observations by the MESSENGER spacecraft. We have conducted k-means unsupervised hierarchical clustering to identify and characterize spectral units from MASCS observations. The results display a dichotomy: polar and equatorial units, possibly linked to compositional differences or weathering due to irradiation. To explore possible relations between composition and spectral behavior, we have compared the spectral provinces with elemental abundance maps derived from MESSENGER's X-Ray Spectrometer (XRS). For the Vesta application on DAWN Visible and infrared spectrometer (VIR) data, we explored several machine learning techniques: an image segmentation method, a stream algorithm and hierarchical clustering. The algorithm successfully separates the olivine outcrops around two craters on Vesta's surface [1]. New maps summarizing the spectral and chemical signature of the surface could be automatically produced. We conclude that instead of hand digging in data, scientists could choose a subset of algorithms with well-known features (i.e. efficacy on the particular problem, speed, accuracy) and focus their effort on understanding what the important characteristics of the groups found in the data mean. [1] E Ammannito et al. "Olivine in an unexpected location on Vesta's surface". In: Nature 504.7478 (2013), pp. 122-125.
Comparison of classification methods for voxel-based prediction of acute ischemic stroke outcome following intra-arterial intervention

NASA Astrophysics Data System (ADS)

Winder, Anthony J.; Siemonsen, Susanne; Flottmann, Fabian; Fiehler, Jens; Forkert, Nils D.

2017-03-01

Voxel-based tissue outcome prediction in acute ischemic stroke patients is highly relevant for both clinical routine and research. Previous research has shown that features extracted from baseline multi-parametric MRI datasets have a high predictive value and can be used for the training of classifiers, which can generate tissue outcome predictions for both intravenous and conservative treatments. However, with the recent advent and popularization of intra-arterial thrombectomy treatment, novel research specifically addressing the utility of predictive classifiers for thrombectomy intervention is necessary for a holistic understanding of current stroke treatment options. The aim of this work was to develop three clinically viable tissue outcome prediction models using approximate nearest-neighbor, generalized linear model, and random decision forest approaches and to evaluate the accuracy of predicting tissue outcome after intra-arterial treatment. Therefore, the three machine learning models were trained, evaluated, and compared using datasets of 42 acute ischemic stroke patients treated with intra-arterial thrombectomy. Classifier training utilized eight voxel-based features extracted from baseline MRI datasets and five global features. Evaluation of classifier-based predictions was performed via comparison to the known tissue outcome, which was determined in follow-up imaging, using the Dice coefficient and leave-one-patient-out cross validation. The random decision forest prediction model led to the best tissue outcome predictions with a mean Dice coefficient of 0.37. The approximate nearest-neighbor and generalized linear model performed equally suboptimally with average Dice coefficients of 0.28 and 0.27 respectively, suggesting that both non-linearity and machine learning are desirable properties of a classifier well-suited to the intra-arterial tissue outcome prediction problem.
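The evaluation described in this abstract rests on two generic building blocks, the Dice coefficient and per-patient hold-out splitting, which can be sketched in plain Python as follows. The function names, the toy masks and the patient identifiers are illustrative assumptions and are not taken from the study.

```python
def dice_coefficient(predicted, actual):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two equal-length binary voxel masks."""
    if len(predicted) != len(actual):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, a in zip(predicted, actual) if p and a)
    total = sum(1 for p in predicted if p) + sum(1 for a in actual if a)
    # Convention (assumption): two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def leave_one_patient_out(patient_ids):
    """Yield (training_ids, held_out_id) pairs, holding out each patient once."""
    for held_out in patient_ids:
        yield [p for p in patient_ids if p != held_out], held_out


# Toy flattened masks (hypothetical): predicted lesion vs. follow-up lesion.
predicted_mask = [1, 1, 0, 0, 1, 0]
followup_mask = [1, 0, 0, 1, 1, 0]
score = dice_coefficient(predicted_mask, followup_mask)  # 2*2 / (3+3)

folds = list(leave_one_patient_out(["p01", "p02", "p03"]))
```

Splitting by patient rather than by voxel keeps voxels from the same brain out of both the training and the test set, which would otherwise inflate the reported overlap.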
Using Machine Learning to Advance Personality Assessment and Theory.

PubMed

Bleidorn, Wiebke; Hopwood, Christopher James

2018-05-01

Machine learning has led to important advances in society. One of the most exciting applications of machine learning in psychological science has been the development of assessment tools that can powerfully predict human behavior and personality traits. Thus far, machine learning approaches to personality assessment have focused on the associations between social media and other digital records with established personality measures. The goal of this article is to expand the potential of machine learning approaches to personality assessment by embedding it in a more comprehensive construct validation framework. We review recent applications of machine learning to personality assessment, place machine learning research in the broader context of fundamental principles of construct validation, and provide recommendations for how to use machine learning to advance our understanding of personality.
Visualization and Analysis of Geology Word Vectors for Efficient Information Extraction

NASA Astrophysics Data System (ADS)

Floyd, J. S.

2016-12-01

When a scientist begins studying a new geographic region of the Earth, they frequently begin by gathering relevant scientific literature in order to understand what is known, for example, about the region's geologic setting, structure, stratigraphy, and tectonic and environmental history. Experienced scientists typically know what keywords to seek and understand that if a document contains one important keyword, then other words in the document may be important as well. Word relationships in a document give rise to what is known in linguistics as the context-dependent nature of meaning. For example, the meaning of the word `strike' in geology, as in the strike of a fault, is quite different from its popular meaning in baseball. In addition, word order, such as in the phrase `Cretaceous-Tertiary boundary,' often corresponds to the order of sequences in time or space. The context of words and the relevance of words to each other can be derived quantitatively by machine learning vector representations of words. Here we show the results of training a neural network to create word vectors from scientific research papers from selected rift basins and mid-ocean ridges: the Woodlark Basin of Papua New Guinea, the Hess Deep rift, and the Gulf of Mexico basin. The word vectors are statistically defined by surrounding words within a given window, limited by the length of each sentence. The word vectors are analyzed by their cosine distance to related words (e.g., `axial' and `magma'), classified by high dimensional clustering, and visualized by reducing the vector dimensions and plotting the vectors on a two- or three-dimensional graph. Similarity analysis of `Triassic' and `Cretaceous' returns `Jurassic' as the nearest word vector, suggesting that the model is capable of learning the geologic time scale. Similarity analysis of `basalt' and `minerals' automatically returns mineral names such as `chlorite', `plagioclase,' and `olivine.' 
Word vector analysis and visualization allow one to extract information from hundreds of papers or more and find relationships in less time than it would take to read all of the papers. As machine learning tools become more commonly available, more and more scientists will be able to use and refine these tools for their individual needs.
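The similarity analysis described in this abstract can be illustrated with a small Python sketch of a cosine-based nearest-word lookup. The three-dimensional vectors below are invented purely for illustration (trained word vectors typically have many more dimensions), and `nearest_word` is a hypothetical helper, not a function from the author's tooling.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_word(query_words, vectors):
    """Return the word whose vector is closest to the mean of the query vectors."""
    dims = len(next(iter(vectors.values())))
    mean = [sum(vectors[w][i] for w in query_words) / len(query_words)
            for i in range(dims)]
    candidates = [w for w in vectors if w not in query_words]
    return max(candidates, key=lambda w: cosine_similarity(mean, vectors[w]))


# Toy vectors (assumption): made up so that the periods lie along one direction.
toy_vectors = {
    "Triassic": [0.9, 0.1, 0.0],
    "Jurassic": [0.8, 0.2, 0.0],
    "Cretaceous": [0.7, 0.3, 0.0],
    "basalt": [0.0, 0.2, 0.9],
    "olivine": [0.1, 0.1, 0.9],
}
result = nearest_word(["Triassic", "Cretaceous"], toy_vectors)
print(result)
```

Averaging the query vectors before ranking is one common way to combine several query words; the abstract does not say which combination rule was used.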
Surgical robotics beyond enhanced dexterity instrumentation: a survey of machine learning techniques and their role in intelligent and autonomous surgical actions.

PubMed

Kassahun, Yohannes; Yu, Bingbin; Tibebu, Abraham Temesgen; Stoyanov, Danail; Giannarou, Stamatia; Metzen, Jan Hendrik; Vander Poorten, Emmanuel

2016-04-01

Advances in technology and computing play an increasingly important role in the evolution of modern surgical techniques and paradigms. This article reviews the current role of machine learning (ML) techniques in the context of surgery with a focus on surgical robotics (SR). Also, we provide a perspective on the future possibilities for enhancing the effectiveness of procedures by integrating ML in the operating room. The review is focused on ML techniques directly applied to surgery, surgical robotics, surgical training and assessment. The widespread use of ML methods in diagnosis and medical image computing is beyond the scope of the review. Searches were performed on PubMed and IEEE Xplore using combinations of keywords: ML, surgery, robotics, surgical and medical robotics, skill learning, skill analysis and learning to perceive. Studies making use of ML methods in the context of surgery are increasingly being reported. In particular, there is an increasing interest in using ML for developing tools to understand and model surgical skill and competence or to extract surgical workflow. Many researchers have begun to integrate this understanding into the control of recent surgical robots and devices. ML is an expanding field. It is popular as it allows efficient processing of vast amounts of data for interpretation and real-time decision making. Already widely used in imaging and diagnosis, it is believed that ML will also play an important role in surgery and interventional treatments. In particular, ML could become a game changer in the conception of cognitive surgical robots. Such robots, endowed with cognitive skills, would also assist the surgical team on a cognitive level, for instance by lowering the mental load of the team. For example, ML could help extract surgical skill, learned through demonstration by human experts, and could transfer this to robotic skills. 
Such intelligent surgical assistance would significantly surpass the state of the art in surgical robotics. Current devices possess no intelligence whatsoever and are merely advanced and expensive instruments.
Rapid and simultaneous analysis of ten aromatic amines in mainstream cigarette smoke by liquid chromatography/electrospray ionization tandem mass spectrometry under ISO and "Health Canada intensive" machine smoking regimens.

PubMed

Xie, Fuwei; Yu, Jingjing; Wang, Sheng; Zhao, Ge; Xia, Qiaoling; Zhang, Xiaobing; Zhang, Shusheng

2013-10-15

Ten primary aromatic amines (AAs), which are suspected carcinogens, were determined in mainstream cigarette smoke under both ISO and "Health Canada intensive" (HCI) machine smoking regimens. The measured AAs included aniline, ortho-toluidine, meta-toluidine, para-toluidine, 1-naphthylamine, 2-naphthylamine, 3-aminobiphenyl, 4-aminobiphenyl, meta-phenylenediamine and meta-anisidine. For rapid and sensitive analysis of these AAs, a liquid chromatography-electrospray ionization tandem mass spectrometric (LC-MS/MS) method coupled with solid phase extraction (SPE) was developed. The particulate phase of mainstream cigarette smoke was collected on a Cambridge filter pad, while the gas phase was trapped by 25 mL of 5% HCl solution. Then, the pad was extracted in an ultrasonic bath with the impinger HCl solution. After being neutralized with NaOH, the extract was purified with an HLB solid phase extraction column, and then was analyzed with LC-MS/MS using an isotope-labeled internal standard. The overall sample pretreatment and analysis time was less than 1.5 h. The limits of detection for all targets ranged from 0.05 ng cig⁻¹ to 0.96 ng cig⁻¹, with recoveries in the range of 75.0-131.8%. The intra-day and inter-day precisions were less than 10% and 16%, respectively. Under the HCI machine smoking regimen, the AA yields in mainstream cigarette smoke were much higher, with average increases greater than 100% compared with those under the ISO smoking condition. Copyright © 2013 Elsevier B.V. All rights reserved.
ALE: automated label extraction from GEO metadata.

PubMed

Giles, Cory B; Brown, Chase A; Ripperger, Michael; Dennis, Zane; Roopnarinesingh, Xiavan; Porter, Hunter; Perz, Aleksandra; Wren, Jonathan D

2017-12-28

NCBI's Gene Expression Omnibus (GEO) is a rich community resource containing millions of gene expression experiments from human, mouse, rat, and other model organisms. However, information about each experiment (metadata) is in the format of an open-ended, non-standardized textual description provided by the depositor. Thus, classification of experiments for meta-analysis by factors such as gender, age of the sample donor, and tissue of origin is not feasible without assigning labels to the experiments. Automated approaches are preferable for this, primarily because of the size and volume of the data to be processed, but also because they ensure standardization and consistency. While some of these labels can be extracted directly from the textual metadata, many of the data available do not contain explicit text informing the researcher about the age and gender of the subjects in the study. To bridge this gap, machine-learning methods can be trained to use the gene expression patterns associated with the text-derived labels to refine label-prediction confidence. Our analysis shows only 26% of metadata text contains information about gender and 21% about age. In order to ameliorate the lack of available labels for these data sets, we first extract labels from the textual metadata for each GEO RNA dataset and evaluate the performance against a gold standard of manually curated labels. We then use machine-learning methods to predict labels, based upon gene expression of the samples, and compare this to the text-based method. Here we present an automated method to extract labels for age, gender, and tissue from textual metadata and GEO data using both a heuristic approach and machine learning. We show the two methods together improve accuracy of label assignment to GEO samples.
A simple crunching of the AGS 'bare' machine ORM data - February 2007 - to extract some aspects of AGS transverse coupling at injection and extraction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahrens, L.

2010-11-01

The objective of this note is to (once again) explore the AGS 'ORM' (orbit response matrix) data taken (by Operations) early during the 2007 run with an AGS bare machine and gold beam. Indeed the present motivation is to extract as much information about the AGS inherent transverse coupling as possible from general arguments and the copious ORM data. And taking this one step further (though not accomplished yet), the goal really should be to tell the model how to describe this coupling. 'Bare' as used here means the AGS with no quadrupole, sextupole or octupole magnets powered. Only the main (combined-function) magnet string and dipole bumps necessary to optimize beam survival are powered. 'ORM data' means the systematic recording of the equilibrium orbit beam position monitor response to powering individual dipole corrector magnets. The 'matrix' results from looking at the effect of each of the (12 superperiods × 4 dipoles per superperiod) 'kicks' on each of the (12 × 6) pick-up electrodes (pues) in each transverse plane. So then we have two (48 × 72) matrices of numbers from the ORM data. (Though 'pue' usually refers to the hardware in the vacuum chamber and 'bpm' to the beam position monitoring system, the two labels will be used casually here.) The exercise is carried out at two magnet rigidities, injection (AGS field ≈434 Gauss) and extraction to RHIC (≈9730 Gauss), a ratio of rigidities of about 22.4. Since we stick with a bare machine, we are also stuck with the bare tunes, which means the tunes are rather close together and near 8.75. Injection: (h,v) ≈ (8.73, 8.76).
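The bookkeeping behind the two response matrices can be checked with a few lines of Python. This is only a sketch of the dimensions and of the field ratio quoted in the abstract; the variable names and the zero-filled placeholder matrices are assumptions, and no measured data are reproduced here.

```python
SUPERPERIODS = 12
CORRECTORS_PER_SUPERPERIOD = 4  # dipole corrector 'kicks'
PICKUPS_PER_SUPERPERIOD = 6     # pick-up electrodes (pues)

n_kicks = SUPERPERIODS * CORRECTORS_PER_SUPERPERIOD
n_pickups = SUPERPERIODS * PICKUPS_PER_SUPERPERIOD

# One placeholder matrix per transverse plane: rows are kicks, columns are pick-ups.
orm = {
    plane: [[0.0] * n_pickups for _ in range(n_kicks)]
    for plane in ("horizontal", "vertical")
}

# Ratio of the quoted main-magnet fields (Gauss), extraction over injection.
rigidity_ratio = 9730 / 434

print(n_kicks, n_pickups, round(rigidity_ratio, 1))
```

Using the field ratio as the rigidity ratio follows the abstract's own wording and assumes the same bending radius at both energies.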
An efficient scheme for automatic web pages categorization using the support vector machine

NASA Astrophysics Data System (ADS)

Bhalla, Vinod Kumar; Kumar, Neeraj

2016-07-01

In the past few years, with the evolution of the Internet and related technologies, the number of Internet users has grown exponentially. These users demand access to relevant web pages from the Internet within a fraction of a second. To achieve this goal, efficient categorization of web page contents is required. Manual categorization of these billions of web pages to achieve high accuracy is a challenging task. Most of the existing techniques reported in the literature are semi-automatic. With these techniques, a higher level of accuracy cannot be achieved. To achieve these goals, this paper proposes automatic categorization of web pages into domain categories. The proposed scheme is based on the identification of specific and relevant features of the web pages. In the proposed scheme, features are first extracted and evaluated, and the feature set is then filtered for categorization of domain web pages. A feature extraction tool based on the HTML document object model of the web page is developed in the proposed scheme. Feature extraction and weight assignment are based on a domain-specific keyword list developed by considering various domain pages. Moreover, the keyword list is reduced on the basis of the ids of keywords in the keyword list. Also, stemming of keywords and tag text is done to achieve a higher accuracy. An extensive feature set is generated to develop a robust classification technique. The proposed scheme was evaluated using a machine learning method in combination with feature extraction and statistical analysis, using a support vector machine kernel as the classification tool. The results obtained confirm the effectiveness of the proposed scheme in terms of its accuracy in different categories of web pages.
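The idea of tag-aware keyword features can be sketched with Python's standard `html.parser` module. Everything specific below is an assumption made for illustration: the tag weights, the keyword list and the scoring rule are hypothetical, stemming is omitted, and the paper's actual tool and weighting scheme are not described in the abstract.

```python
from html.parser import HTMLParser

# Hypothetical tag weights and domain keyword list (not from the paper).
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "p": 1.0}
DOMAIN_KEYWORDS = {"course", "lecture", "exam"}


class KeywordFeatureParser(HTMLParser):
    """Accumulate a weighted count per keyword, weighted by the enclosing tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.scores = {k: 0.0 for k in DOMAIN_KEYWORDS}

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Pop back to (and including) the matching open tag.
            while self.open_tags and self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        weight = TAG_WEIGHTS.get(self.open_tags[-1], 0.0) if self.open_tags else 0.0
        for token in data.lower().split():
            word = token.strip(".,;:!?")
            if word in self.scores:
                self.scores[word] += weight


def extract_features(html):
    """Return a keyword -> weighted score mapping for one HTML page."""
    parser = KeywordFeatureParser()
    parser.feed(html)
    parser.close()
    return parser.scores


page = ("<html><head><title>Exam schedule</title></head><body>"
        "<h1>Course exam</h1><p>The lecture covers exam topics.</p></body></html>")
features = extract_features(page)
print(features)
```

A feature dictionary of this kind could then be turned into a fixed-order vector and passed to a support vector machine classifier, which is the role the SVM plays in the scheme described above.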

Particle size alterations of feedstuffs during in situ neutral detergent fiber incubation.

PubMed

Krämer, M; Nørgaard, P; Lund, P; Weisbjerg, M R

2013-07-01

Particle size alterations during neutral detergent fiber (NDF) determination and in situ rumen incubation were analyzed by dry sieving and image analysis to evaluate the in situ procedure for estimation of NDF degradation parameters and indigestible NDF concentration in terms of particle size. Early-cut and late-cut grass silages, corn silage, alfalfa silage, rapeseed meal, and dried distillers grains were examined. Treatments were (1) drying and grinding of forage samples and grinding of concentrates; (2) neutral detergent-soluble (NDS) extraction; (3) machine washing and NDS extraction; (4) 24-h rumen incubation, machine washing, and NDS extraction; and (5) 288-h rumen incubation, machine washing, and NDS extraction. Degradation profiles for potentially degradable NDF were determined and image analysis was used to estimate particle size profiles and thereby the risk for particle loss. Particle dimensions changed during NDF determination and in situ rumen incubation and variations depended on feedstuff and treatment. Corn silage and late-cut grass silage varied most in particle area among feedstuffs, with an increase of 139% between 0 and 24h and a decrease of 77% between 24 and 288 h for corn silage and a decrease of 74% for late-cut grass silage between 24- and 288-h in situ rumen incubation. Especially for late-cut grass silage residues after 288 h in situ rumen incubation, a high mass proportion in the critical zone for escape was found. Particle area decreased linearly with increasing incubation time. Particle loss during in situ rumen incubation cannot be excluded and is likely to vary among feedstuffs. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Localized thin-section CT with radiomics feature extraction and machine learning to classify early-detected pulmonary nodules from lung cancer screening

NASA Astrophysics Data System (ADS)

Tu, Shu-Ju; Wang, Chih-Wei; Pan, Kuang-Tse; Wu, Yi-Cheng; Wu, Chen-Te

2018-03-01

Lung cancer screening aims to detect small pulmonary nodules and decrease the mortality rate of those affected. However, studies from large-scale clinical trials of lung cancer screening have shown that the false-positive rate is high and positive predictive value is low. To address these problems, a technical approach is greatly needed for accurate malignancy differentiation among these early-detected nodules. We studied the clinical feasibility of an additional protocol of localized thin-section CT for further assessment on recalled patients from lung cancer screening tests. Our approach of localized thin-section CT was integrated with radiomics features extraction and machine learning classification which was supervised by pathological diagnosis. Localized thin-section CT images of 122 nodules were retrospectively reviewed and 374 radiomics features were extracted. In this study, 48 nodules were benign and 74 malignant. There were nine patients with multiple nodules and four with synchronous multiple malignant nodules. Different machine learning classifiers with a stratified ten-fold cross-validation were used and repeated 100 times to evaluate classification accuracy. Of the image features extracted from the thin-section CT images, 238 (64%) were useful in differentiating between benign and malignant nodules. These useful features include CT density (p = 0.002518), sigma (p = 0.002781), uniformity (p = 0.03241), and entropy (p = 0.006685). The highest classification accuracy was 79% by the logistic classifier. The performance metrics of this logistic classification model were 0.80 for the positive predictive value, 0.36 for the false-positive rate, and 0.80 for the area under the receiver operating characteristic curve. Our approach of direct risk classification supervised by the pathological diagnosis with localized thin-section CT and radiomics feature extraction may support clinical physicians in determining truly malignant nodules and therefore reduce problems in lung cancer screening.
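The stratified ten-fold cross-validation mentioned in this abstract keeps the benign/malignant proportion roughly equal in every fold. A minimal sketch of such a split follows, assuming a simple round-robin assignment within each class; the function name `stratified_folds` and the seed are hypothetical, and the study's actual software is not stated in the abstract. Only the class counts (48 benign, 74 malignant) come from the abstract.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]

    # Group sample indices by class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    # Deal each class out round-robin; the offset keeps fold sizes balanced
    # when class sizes are not multiples of k.
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[(position + offset) % k].append(index)
        offset += len(indices)
    return folds


# Class counts taken from the abstract: 48 benign and 74 malignant nodules.
labels = ["benign"] * 48 + ["malignant"] * 74
folds = stratified_folds(labels)
```

Each fold serves once as the test set while the other nine are used for training; repeating the procedure with different seeds corresponds to the repeated cross-validation described above.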
Simultaneous analysis of nine aromatic amines in mainstream cigarette smoke using online solid-phase extraction combined with liquid chromatography-tandem mass spectrometry.

PubMed

Zhang, Jie; Bai, Ruoshi; Zhou, Zhaojuan; Liu, Xingyu; Zhou, Jun

2017-04-01

A fully automated analytical method was developed and validated in the present study. The method was based on two-dimensional (2D) online solid-phase extraction liquid chromatography-tandem mass spectrometry (SPE-LC-MS/MS) to determine nine aromatic amines (AAs) in mainstream smoke (MSS) simultaneously. As a part of the validation process, AA yields for 16 top-selling commercial cigarettes from the China market were evaluated by the developed method under both Health Canada Intensive (HCI) and ISO machine smoking regimes. The gas phase of MSS was trapped by 25 mL of 0.6 M hydrochloric acid solution, while the particulate phase was collected on a glass fiber filter. Then, the glass fiber pad was extracted with hydrochloric acid solution in an ultrasonic bath. The extract was analyzed with 2D online SPE-LC-MS/MS. In order to minimize the matrix effects of the sample on each analyte, two cartridges with different extraction mechanisms were utilized in the 2D SPE to clean up interferences of different polarity. A phenyl-hexyl analytical column was used to achieve chromatographic separation. Under the optimized conditions, the isomers of p-toluidine, m-toluidine and o-toluidine, 3-aminobiphenyl and 4-aminobiphenyl, and 1-naphthylamine and 2-naphthylamine were baseline separated with good peak shapes for the first time. The limits of detection for the nine AAs ranged from 0.03 to 0.24 ng cig⁻¹. The recovery of the measurement of the nine AAs was from 84.82 to 118.47%. The intra-day and inter-day precisions of the nine AAs were less than 10 and 16%, respectively. Compared with the ISO machine smoking regime, the AA yields in MSS were 1.17 to 3.41 times higher under the HCI machine smoking regime. Graphical abstract: New method using online SPE-LC/MS/MS for analysis of aromatic amines in mainstream cigarette smoke.
Localized thin-section CT with radiomics feature extraction and machine learning to classify early-detected pulmonary nodules from lung cancer screening.

PubMed

Tu, Shu-Ju; Wang, Chih-Wei; Pan, Kuang-Tse; Wu, Yi-Cheng; Wu, Chen-Te

2018-03-14

Lung cancer screening aims to detect small pulmonary nodules and decrease the mortality rate of those affected. However, studies from large-scale clinical trials of lung cancer screening have shown that the false-positive rate is high and positive predictive value is low. To address these problems, a technical approach is greatly needed for accurate malignancy differentiation among these early-detected nodules. We studied the clinical feasibility of an additional protocol of localized thin-section CT for further assessment on recalled patients from lung cancer screening tests. Our approach of localized thin-section CT was integrated with radiomics features extraction and machine learning classification which was supervised by pathological diagnosis. Localized thin-section CT images of 122 nodules were retrospectively reviewed and 374 radiomics features were extracted. In this study, 48 nodules were benign and 74 malignant. There were nine patients with multiple nodules and four with synchronous multiple malignant nodules. Different machine learning classifiers with a stratified ten-fold cross-validation were used and repeated 100 times to evaluate classification accuracy. Of the image features extracted from the thin-section CT images, 238 (64%) were useful in differentiating between benign and malignant nodules. These useful features include CT density (p = 0.002518), sigma (p = 0.002781), uniformity (p = 0.03241), and entropy (p = 0.006685). The highest classification accuracy was 79% by the logistic classifier. The performance metrics of this logistic classification model were 0.80 for the positive predictive value, 0.36 for the false-positive rate, and 0.80 for the area under the receiver operating characteristic curve. Our approach of direct risk classification supervised by the pathological diagnosis with localized thin-section CT and radiomics feature extraction may support clinical physicians in determining truly malignant nodules and therefore reduce problems in lung cancer screening.
Integrating multisensor satellite data merging and image reconstruction in support of machine learning for better water quality management.

PubMed

Chang, Ni-Bin; Bai, Kaixu; Chen, Chi-Farn

2017-10-01

Monitoring water quality changes in lakes, reservoirs, estuaries, and coastal waters is critical in response to the needs for sustainable development. This study develops a remote sensing-based multiscale modeling system by integrating multi-sensor satellite data merging and image reconstruction algorithms in support of feature extraction with machine learning leading to automate continuous water quality monitoring in environmentally sensitive regions. This new Earth observation platform, termed "cross-mission data merging and image reconstruction with machine learning" (CDMIM), is capable of merging multiple satellite imageries to provide daily water quality monitoring through a series of image processing, enhancement, reconstruction, and data mining/machine learning techniques. Two existing key algorithms, including Spectral Information Adaptation and Synthesis Scheme (SIASS) and SMart Information Reconstruction (SMIR), are highlighted to support feature extraction and content-based mapping. Whereas SIASS can support various data merging efforts to merge images collected from cross-mission satellite sensors, SMIR can overcome data gaps by reconstructing the information of value-missing pixels due to impacts such as cloud obstruction. Practical implementation of CDMIM was assessed by predicting the water quality over seasons in terms of the concentrations of nutrients and chlorophyll-a, as well as water clarity in Lake Nicaragua, providing synergistic efforts to better monitor the aquatic environment and offer insightful lake watershed management strategies. Copyright © 2017 Elsevier Ltd. All rights reserved.
The technique of entropy optimization in motor current signature analysis and its application in the fault diagnosis of gear transmission

NASA Astrophysics Data System (ADS)

Chen, Xiaoguang; Liang, Lin; Liu, Fei; Xu, Guanghua; Luo, Ailing; Zhang, Sicong

2012-05-01

Nowadays, Motor Current Signature Analysis (MCSA) is widely used in the fault diagnosis and condition monitoring of machine tools. However, because the current signal has a low SNR (signal-to-noise ratio), it is difficult for traditional signal processing methods such as the FFT to identify the feature frequencies of machine tools in a complex current spectrum, where the feature frequencies are often dense and overlapping. Studies of MCSA have found that entropy, which is associated with the probability distribution of a random variable, is important for frequency identification and therefore plays an important role in signal processing. To address the difficulty of identifying the feature frequencies, this paper presents an entropy optimization technique based on the motor current signal that extracts the typical feature frequencies of machine tools and effectively suppresses disturbances. Simulated current signals were generated in MATLAB, and a current signal was obtained from a complex gearbox of an iron works made in Luxembourg. In the diagnosis, MCSA is combined with entropy optimization. Both simulated and experimental results show that this technique is efficient, accurate and reliable enough to extract the feature frequencies of the current signal, which provides a new strategy for the fault diagnosis and condition monitoring of machine tools.
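The abstract does not give the entropy formulation it optimizes. One common way to attach an entropy to a spectrum, shown here only as an assumed illustration and not as the authors' method, is to normalize the power spectrum into a probability distribution and take its Shannon entropy. The names `power_spectrum` and `spectral_entropy` are hypothetical, and a direct discrete Fourier transform is used to keep the sketch dependency-free.

```python
import cmath
import math


def power_spectrum(signal):
    """One-sided power spectrum from a direct O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_entropy(signal):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    power = power_spectrum(signal)
    total = sum(power)
    if total == 0:
        raise ValueError("signal has no spectral power")
    probabilities = [p / total for p in power]
    return -sum(q * math.log2(q) for q in probabilities if q > 0)


# Synthetic test signals: one sinusoid, and two sinusoids of equal amplitude.
n = 64
one_tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
two_tones = [
    math.sin(2 * math.pi * 8 * t / n) + math.sin(2 * math.pi * 20 * t / n)
    for t in range(n)
]
```

A single tone concentrates all power in one bin and gives an entropy near 0 bits, while two equal tones give about 1 bit; a noisy, spread-out spectrum gives larger values, which is why low entropy can serve as an indicator of well-isolated feature frequencies.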
"What is relevant in a text document?": An interpretable machine learning approach

PubMed Central

Arras, Leila; Horn, Franziska; Montavon, Grégoire; Müller, Klaus-Robert

2017-01-01

Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, allowing to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text’s category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications. PMID:28800619
Prosthetic EMG control enhancement through the application of man-machine principles

NASA Technical Reports Server (NTRS)

Simcox, W. A.

1977-01-01

An area in medicine that appears suitable to man-machine principles is rehabilitation research, particularly when the motor aspects of the body are involved. If one considers the limb, whether functional or not, as the machine, the brain as the controller and the neuromuscular system as the man-machine interface, the human body is reduced to a man-machine system that can benefit from the principles behind such systems. The area of rehabilitation that this paper deals with is that of an arm amputee and his prosthetic device. Reducing this area to its man-machine basics, the problem becomes one of attaining natural multiaxis prosthetic control using electromyographic (EMG) activity as the means of communication between man and prosthesis. In order to use EMG as the communication channel it must be amplified and processed to yield a high-information signal suitable for control. The most common processing scheme employed is termed Mean Value Processing. This technique for extracting the useful EMG signal consists of a differential-to-single-ended conversion of the surface activity followed by rectification and smoothing.
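The rectification and smoothing stages of Mean Value Processing can be sketched digitally as follows. This is an assumed illustration only: the abstract does not specify the smoothing filter, so a causal moving average is used here, the function name `mean_value_process` and the window length are hypothetical, and the differential-to-single-ended conversion (an analog front-end stage) is not modeled.

```python
def mean_value_process(samples, window=5):
    """Full-wave rectify the signal, then smooth it with a causal moving average."""
    rectified = [abs(x) for x in samples]
    smoothed = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            # Drop the sample that just left the averaging window.
            running -= rectified[i - window]
        # Average over fewer samples until the window has filled.
        smoothed.append(running / min(i + 1, window))
    return smoothed


# A made-up alternating signal: rectification removes the sign changes,
# and smoothing yields a steady envelope suitable as a control signal.
envelope = mean_value_process([1, -1, 1, -1, 1, -1], window=2)
```

A longer window gives a steadier control signal at the cost of a slower response to changes in muscle activity.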
Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics

PubMed Central

Belo, David; Gamboa, Hugo

2017-01-01

The paper presents an accuracy analysis of machine learning approaches applied to the analysis of cardiac activity. The study evaluates the possibility of diagnosing arterial hypertension by means of short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from arterial hypertension of II-III degree. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with radial basis, decision trees, and naive Bayes classifier. Moreover, in the study, different methods of feature extraction are analyzed: statistical, spectral, wavelet, and multifractal. All in all, 53 features were investigated. Investigation results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of noncorrelated feature set search achieved higher results than a data set based on the principal components. PMID:28831239
Single molecule detection, thermal fluctuation and life

PubMed Central

YANAGIDA, Toshio; ISHII, Yoshiharu

2017-01-01

Single molecule detection has contributed to our understanding of the unique mechanisms of life. Unlike artificial man-made machines, biological molecular machines integrate thermal noises rather than avoid them. For example, single molecule detection has demonstrated that myosin motors undergo biased Brownian motion for stepwise movement and that single protein molecules spontaneously change their conformation, for switching to interactions with other proteins, in response to thermal fluctuation. Thus, molecular machines have flexibility and efficiency not seen in artificial machines. PMID:28190869
Unit-Record Machine Operation: A Suggested Adult Business Education Course Outline.

ERIC Educational Resources Information Center

New York State Education Dept., Albany. Bureau of Continuing Education Curriculum Development.

The course contained in this book was written for training data processing machine operators; it is intended to prepare adults to qualify for an entry-level job. It is not aimed at developing high proficiency on any one machine, but rather at introducing the student to a variety of equipment, and developing an understanding of how the data flow…
Catalysis of heat-to-work conversion in quantum machines

PubMed Central

Ghosh, A.; Latune, C. L.; Davidovich, L.; Kurizki, G.

2017-01-01

We propose a hitherto-unexplored concept in quantum thermodynamics: catalysis of heat-to-work conversion by quantum nonlinear pumping of the piston mode which extracts work from the machine. This concept is analogous to chemical reaction catalysis: Small energy investment by the catalyst (pump) may yield a large increase in heat-to-work conversion. Since it is powered by thermal baths, the catalyzed machine adheres to the Carnot bound, but may strongly enhance its efficiency and power compared with its noncatalyzed counterparts. This enhancement stems from the increased ability of the squeezed piston to store work. Remarkably, the fraction of piston energy that is convertible into work may then approach unity. The present machine and its counterparts powered by squeezed baths share a common feature: Neither is a genuine heat engine. However, a squeezed pump that catalyzes heat-to-work conversion by small investment of work is much more advantageous than a squeezed bath that simply transduces part of the work invested in its squeezing into work performed by the machine. PMID:29087326
Catalysis of heat-to-work conversion in quantum machines

NASA Astrophysics Data System (ADS)

Ghosh, A.; Latune, C. L.; Davidovich, L.; Kurizki, G.

2017-11-01

We propose a hitherto-unexplored concept in quantum thermodynamics: catalysis of heat-to-work conversion by quantum nonlinear pumping of the piston mode which extracts work from the machine. This concept is analogous to chemical reaction catalysis: Small energy investment by the catalyst (pump) may yield a large increase in heat-to-work conversion. Since it is powered by thermal baths, the catalyzed machine adheres to the Carnot bound, but may strongly enhance its efficiency and power compared with its noncatalyzed counterparts. This enhancement stems from the increased ability of the squeezed piston to store work. Remarkably, the fraction of piston energy that is convertible into work may then approach unity. The present machine and its counterparts powered by squeezed baths share a common feature: Neither is a genuine heat engine. However, a squeezed pump that catalyzes heat-to-work conversion by small investment of work is much more advantageous than a squeezed bath that simply transduces part of the work invested in its squeezing into work performed by the machine.
Experimental Machine Learning of Quantum States

NASA Astrophysics Data System (ADS)

Gao, Jun; Qiao, Lu-Feng; Jiao, Zhi-Qiang; Ma, Yue-Chi; Hu, Cheng-Qiu; Ren, Ruo-Jing; Yang, Ai-Lin; Tang, Hao; Yung, Man-Hong; Jin, Xian-Min

2018-06-01

Quantum information technologies provide promising applications in communication and computation, while machine learning has become a powerful technique for extracting meaningful structures in "big data." A crossover between quantum information and machine learning represents a new interdisciplinary area stimulating progress in both fields. Traditionally, a quantum state is characterized by quantum-state tomography, which is a resource-consuming process when scaled up. Here we experimentally demonstrate a machine-learning approach to construct a quantum-state classifier for identifying the separability of quantum states. We show that it is possible to experimentally train an artificial neural network to efficiently learn and classify quantum states, without the need of obtaining the full information of the states. We also show how adding a hidden layer of neurons to the neural network can significantly boost the performance of the state classifier. These results shed new light on how classification of quantum states can be achieved with limited resources, and represent a step towards machine-learning-based applications in quantum information processing.
Classification of older adults with/without a fall history using machine learning methods.

PubMed

Lin Zhang; Ou Ma; Fabre, Jennifer M; Wood, Robert H; Garcia, Stephanie U; Ivey, Kayla M; McCann, Evan D

2015-01-01

Falling is a serious problem in an aged society such that assessment of the risk of falls for individuals is imperative for the research and practice of falls prevention. This paper introduces an application of several machine learning methods for training a classifier which is capable of classifying individual older adults into a high risk group and a low risk group (distinguished by whether or not the members of the group have a recent history of falls). Using a 3D motion capture system, significant gait features related to falls risk are extracted. By training these features, classification hypotheses are obtained based on machine learning techniques (K Nearest-neighbour, Naive Bayes, Logistic Regression, Neural Network, and Support Vector Machine). Training and test accuracies with sensitivity and specificity of each of these techniques are assessed. The feature adjustment and tuning of the machine learning algorithms are discussed. The outcome of the study will benefit the prediction and prevention of falls.
Catalysis of heat-to-work conversion in quantum machines.

PubMed

Ghosh, A; Latune, C L; Davidovich, L; Kurizki, G

2017-11-14

We propose a hitherto-unexplored concept in quantum thermodynamics: catalysis of heat-to-work conversion by quantum nonlinear pumping of the piston mode which extracts work from the machine. This concept is analogous to chemical reaction catalysis: Small energy investment by the catalyst (pump) may yield a large increase in heat-to-work conversion. Since it is powered by thermal baths, the catalyzed machine adheres to the Carnot bound, but may strongly enhance its efficiency and power compared with its noncatalyzed counterparts. This enhancement stems from the increased ability of the squeezed piston to store work. Remarkably, the fraction of piston energy that is convertible into work may then approach unity. The present machine and its counterparts powered by squeezed baths share a common feature: Neither is a genuine heat engine. However, a squeezed pump that catalyzes heat-to-work conversion by small investment of work is much more advantageous than a squeezed bath that simply transduces part of the work invested in its squeezing into work performed by the machine.
A System for Automated Extraction of Metadata from Scanned Documents using Layout Recognition and String Pattern Search Models.

PubMed

Misra, Dharitri; Chen, Siyuan; Thoma, George R

2009-01-01

One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U.S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system.
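A rule-based string pattern search of the kind described here can be illustrated with regular expressions. The sketch below is not the NLM system: the field names, the patterns in `FIELD_PATTERNS`, and the sample text are all invented for illustration, and it assumes the layout has already been recognized so that one set of patterns applies.

```python
import re

# Hypothetical field patterns for one recognized layout (not from the paper).
FIELD_PATTERNS = {
    "case_number": re.compile(r"Case\s+No\.?\s*(\d+)", re.IGNORECASE),
    "date": re.compile(r"Dated?:?\s*([A-Z][a-z]+ \d{1,2}, \d{4})"),
}


def extract_metadata(text):
    """Return the first match for each field pattern found in the text."""
    found = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[field] = match.group(1)
    return found


# Made-up sample text standing in for OCR output of a scanned page.
sample = "NOTICE\nCase No. 1234\nDated: March 5, 1940\nBody text follows."
metadata = extract_metadata(sample)
```

Fields whose patterns do not match are simply absent from the result, which lets a caller flag the document for manual review.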
A novel automated device for rapid nucleic acid extraction utilizing a zigzag motion of magnetic silica beads.

PubMed

Yamaguchi, Akemi; Matsuda, Kazuyuki; Uehara, Masayuki; Honda, Takayuki; Saito, Yasunori

2016-02-04

We report a novel automated device for nucleic acid extraction, which consists of a mechanical control system and a disposable cassette. The cassette is composed of a bottle, a capillary tube, and a chamber. After sample injection in the bottle, the sample is lysed, and nucleic acids are adsorbed on the surface of magnetic silica beads. These magnetic beads are transported and are vibrated through the washing reagents in the capillary tube under the control of the mechanical control system, and thus, the nucleic acid is purified without centrifugation. The purified nucleic acid is automatically extracted in 3 min for the polymerase chain reaction (PCR). The nucleic acid extraction is dependent on the transport speed and the vibration frequency of the magnetic beads, and optimizing these two parameters provided better PCR efficiency than the conventional manual procedure. There was no difference between the detection limits of our novel device and that of the conventional manual procedure. We have already developed the droplet-PCR machine, which can amplify and detect specific nucleic acids rapidly and automatically. Connecting the droplet-PCR machine to our novel automated extraction device enables PCR analysis within 15 min, and this system can be made available as a point-of-care testing in clinics as well as general hospitals. Copyright © 2015 Elsevier B.V. All rights reserved.
Extracting microRNA-gene relations from biomedical literature using distant supervision

PubMed Central

Clarke, Luka A.; Couto, Francisco M.

2017-01-01

Many biomedical relation extraction approaches are based on supervised machine learning, requiring an annotated corpus. Distant supervision aims at training a classifier by combining a knowledge base with a corpus, reducing the amount of manual effort necessary. This is particularly useful for biomedicine because many databases and ontologies have been made available for many biological processes, while the availability of annotated corpora is still limited. We studied the extraction of microRNA-gene relations from text. MicroRNA regulation is an important biological process due to its close association with human diseases. The proposed method, IBRel, is based on distantly supervised multi-instance learning. We evaluated IBRel on three datasets, and the results were compared with a co-occurrence approach as well as a supervised machine learning algorithm. While supervised learning outperformed on two of those datasets, IBRel obtained an F-score 28.3 percentage points higher on the dataset for which there was no training set developed specifically. To demonstrate the applicability of IBRel, we used it to extract 27 miRNA-gene relations from recently published papers about cystic fibrosis. Our results demonstrate that our method can be successfully used to extract relations from literature about a biological process without an annotated corpus. The source code and data used in this study are available at https://github.com/AndreLamurias/IBRel. PMID:28263989
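The idea of distant supervision with multi-instance learning can be sketched as follows: sentences that mention the same entity pair are grouped into a bag, and the bag is labeled positive when the pair appears in a knowledge base, so no sentence-level annotation is needed. This is an assumed illustration, not the IBRel implementation; the function names, the placeholder entity names (`miR-A`, `GENE1`, and so on), and the simple substring matching are hypothetical.

```python
from collections import defaultdict


def build_bags(sentences, mirnas, genes):
    """Group sentences by the (miRNA, gene) pair they mention together."""
    bags = defaultdict(list)
    for sentence in sentences:
        for mirna in mirnas:
            if mirna not in sentence:
                continue
            for gene in genes:
                if gene in sentence:
                    bags[(mirna, gene)].append(sentence)
    return dict(bags)


def label_bags(bags, known_pairs):
    """Label a bag positive when its pair is listed in the knowledge base."""
    return {pair: pair in known_pairs for pair in bags}


# Placeholder entities and sentences; none of these are real findings.
sentences = [
    "miR-A represses GENE1 in cells.",
    "GENE1 and miR-A were both measured.",
    "miR-B was detected alongside GENE2.",
]
known_pairs = {("miR-A", "GENE1")}  # stands in for a curated database

bags = build_bags(sentences, ["miR-A", "miR-B"], ["GENE1", "GENE2"])
bag_labels = label_bags(bags, known_pairs)
```

A classifier trained on such bags only needs the assumption that at least one sentence in a positive bag expresses the relation, which is what makes the weak, database-derived labels usable.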
Extracting microRNA-gene relations from biomedical literature using distant supervision.

PubMed

Lamurias, Andre; Clarke, Luka A; Couto, Francisco M

2017-01-01

Many biomedical relation extraction approaches are based on supervised machine learning, requiring an annotated corpus. Distant supervision aims at training a classifier by combining a knowledge base with a corpus, reducing the amount of manual effort necessary. This is particularly useful for biomedicine because many databases and ontologies have been made available for many biological processes, while the availability of annotated corpora is still limited. We studied the extraction of microRNA-gene relations from text. MicroRNA regulation is an important biological process due to its close association with human diseases. The proposed method, IBRel, is based on distantly supervised multi-instance learning. We evaluated IBRel on three datasets, and the results were compared with a co-occurrence approach as well as a supervised machine learning algorithm. While supervised learning outperformed on two of those datasets, IBRel obtained an F-score 28.3 percentage points higher on the dataset for which there was no training set developed specifically. To demonstrate the applicability of IBRel, we used it to extract 27 miRNA-gene relations from recently published papers about cystic fibrosis. Our results demonstrate that our method can be successfully used to extract relations from literature about a biological process without an annotated corpus. The source code and data used in this study are available at https://github.com/AndreLamurias/IBRel.

Environmental metabolomics with data science for investigating ecosystem homeostasis.

PubMed

Kikuchi, Jun; Ito, Kengo; Date, Yasuhiro

2018-02-01

A natural ecosystem can be viewed as the interconnections between complex metabolic reactions and environments. Humans, a part of these ecosystems, and their activities strongly affect the environments. To account for human effects within ecosystems, understanding what benefits humans receive by facilitating the maintenance of environmental homeostasis is important. This review describes recent applications of several NMR approaches to the evaluation of environmental homeostasis by metabolic profiling and data science. The basic NMR strategy used to evaluate homeostasis using big data collection is similar to that used in human health studies. Sophisticated metabolomic approaches (metabolic profiling) are widely reported in the literature. Further challenges include the analysis of complex macromolecular structures, and of the compositions and interactions of plant biomass, soil humic substances, and aqueous particulate organic matter. To support the study of these topics, we also discuss sample preparation techniques and solid-state NMR approaches. Because NMR approaches can produce a number of data with high reproducibility and inter-institution compatibility, further analysis of such data using machine learning approaches is often worthwhile. We also describe methods for data pretreatment in solid-state NMR and for environmental feature extraction from heterogeneously-measured spectroscopic data by machine learning approaches. Copyright © 2017. Published by Elsevier B.V.
SparkText: Biomedical Text Mining on Big Data Framework.

PubMed

Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.
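One of the classifiers named above, Naïve Bayes, can be sketched in a few lines. The following is a minimal single-machine multinomial Naïve Bayes with add-one smoothing, not SparkText's Spark-based implementation; the function names and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Count documents per class and words per class. docs: list of (label, text)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration only.
training = [
    ("breast", "mammography screening brca1 mutation"),
    ("breast", "brca1 carriers and mammography"),
    ("lung", "smoking and lung nodule screening"),
    ("lung", "lung adenocarcinoma smoking history"),
]
model = train_nb(training)
prediction = predict_nb(model, "brca1 mammography results")
print(prediction)
```

Working in log space avoids numerical underflow when documents are long, which matters for full-text articles.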
Association between abnormal brain functional connectivity in children and psychopathology: A study based on graph theory and machine learning.

PubMed

Sato, João Ricardo; Biazoli, Claudinei Eduardo; Salum, Giovanni Abrahão; Gadelha, Ary; Crossley, Nicolas; Vieira, Gilson; Zugman, André; Picon, Felipe Almeida; Pan, Pedro Mario; Hoexter, Marcelo Queiroz; Amaro, Edson; Anés, Mauricio; Moura, Luciana Monteiro; Del'Aquilla, Marco Antonio Gomes; Mcguire, Philip; Rohde, Luis Augusto; Miguel, Euripedes Constantino; Jackowski, Andrea Parolin; Bressan, Rodrigo Affonseca

2018-03-01

One of the major challenges facing psychiatry is how to incorporate biological measures in the classification of mental health disorders. Many of these disorders affect brain development and its connectivity. In this study, we propose a novel method for assessing brain networks based on the combination of a graph theory measure (eigenvector centrality) and a one-class support vector machine (OC-SVM). We applied this approach to resting-state fMRI data from 622 children and adolescents. Eigenvector centrality (EVC) values of nodes from positive- and negative-task networks were extracted from each subject and used as input to an OC-SVM to label individual brain networks as typical or atypical. We hypothesised that classification of these subjects regarding the pattern of brain connectivity would predict the level of psychopathology. Subjects with atypical brain network organisation had higher levels of psychopathology (p < 0.001). There was a greater EVC in the typical group at the bilateral posterior cingulate and bilateral posterior temporal cortices, and significant decreases in EVC at the left temporal pole. The combination of graph theory methods and an OC-SVM is a promising method to characterise neurodevelopment, and may be useful to understand the deviations leading to mental disorders.
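The graph measure used above, eigenvector centrality, can be approximated by power iteration. The sketch below is a generic pure-Python version, not the authors' pipeline; it iterates with A + I instead of A (an assumption of this example) so that the iteration also converges on bipartite graphs. The toy adjacency matrix is invented.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Approximate the leading eigenvector of a symmetric adjacency matrix.

    adj: square list-of-lists with non-negative weights.
    Iterates x <- (A + I) x and renormalises to unit length each step.
    """
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # (A + I) x: the "+ x[i]" term is the identity shift.
        y = [sum(adj[i][j] * x[j] for j in range(n)) + x[i] for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Toy graph: nodes 0, 1, 2 form a triangle and node 3 hangs off node 0.
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(adj)
print([round(c, 3) for c in centrality])
```

In a setting like the one described, the per-node centralities of each subject would form that subject's feature vector, which a one-class classifier could then score as typical or atypical.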
Computational techniques for ECG analysis and interpretation in light of their contribution to medical advances

PubMed Central

Mincholé, Ana; Martínez, Juan Pablo; Laguna, Pablo; Rodriguez, Blanca

2018-01-01

Widely developed for clinical screening, electrocardiogram (ECG) recordings capture the cardiac electrical activity from the body surface. ECG analysis can therefore be a crucial first step to help diagnose, understand and predict cardiovascular disorders responsible for 30% of deaths worldwide. Computational techniques, and more specifically machine learning techniques and computational modelling are powerful tools for classification, clustering and simulation, and they have recently been applied to address the analysis of medical data, especially ECG data. This review describes the computational methods in use for ECG analysis, with a focus on machine learning and 3D computer simulations, as well as their accuracy, clinical implications and contributions to medical advances. The first section focuses on heartbeat classification and the techniques developed to extract and classify abnormal from regular beats. The second section focuses on patient diagnosis from whole recordings, applied to different diseases. The third section presents real-time diagnosis and applications to wearable devices. The fourth section highlights the recent field of personalized ECG computer simulations and their interpretation. Finally, the discussion section outlines the challenges of ECG analysis and provides a critical assessment of the methods presented. The computational methods reported in this review are a strong asset for medical discoveries and their translation to the clinical world may lead to promising advances. PMID:29321268
SparkText: Biomedical Text Mining on Big Data Framework

PubMed Central

He, Karen Y.; Wang, Kai

2016-01-01

Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions: This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID:27685652
Successful attack on permutation-parity-machine-based neural cryptography.

PubMed

Seoane, Luís F; Ruttor, Andreas

2012-02-01

An algorithm is presented which implements a probabilistic attack on the key-exchange protocol based on permutation parity machines. Instead of imitating the synchronization of the communicating partners, the strategy consists of a Monte Carlo method to sample the space of possible weights during inner rounds and an analytic approach to convey the extracted information from one outer round to the next one. The results show that the protocol under attack fails to synchronize faster than an eavesdropper using this algorithm.
Gender classification from face images by using local binary pattern and gray-level co-occurrence matrix

NASA Astrophysics Data System (ADS)

Uzbaş, Betül; Arslan, Ahmet

2018-04-01

Gender recognition is an important step in human-computer interaction and identification. The human face image is one of the important sources for determining gender. In the present study, gender classification is performed automatically from facial images. In order to classify gender, we propose a combination of features that have been extracted from face, eye and lip regions by using a hybrid method of Local Binary Pattern and Gray-Level Co-Occurrence Matrix. The features have been extracted from automatically obtained face, eye and lip regions. All of the extracted features have been combined and given as input parameters to classification methods (Support Vector Machine, Artificial Neural Networks, Naive Bayes and k-Nearest Neighbor methods) for gender classification. The Nottingham Scan face database that consists of the frontal face images of 100 people (50 male and 50 female) is used for this purpose. In the experimental studies, the highest success rate, 98%, was achieved by using Support Vector Machine. The experimental results illustrate the efficacy of our proposed method.
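To make the Local Binary Pattern feature mentioned above concrete, here is a minimal sketch of the basic 8-neighbour LBP operator and its histogram. The bit ordering (clockwise from the top-left neighbour) and the `>=` threshold are conventions chosen for this example, and the abstract does not state which LBP variant the authors used.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 patch.

    Each neighbour contributes a 1 bit when it is >= the centre value.
    Bits are assigned clockwise, starting at the top-left neighbour.
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Build a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


example_patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
example_code = lbp_code(example_patch)  # bits 0, 4, 5, 6, 7 set
flat_hist = lbp_histogram([[5] * 4 for _ in range(4)])
print(example_code, flat_hist[255])
```

The histogram, computed per region (for example face, eyes, lips), is what would typically be passed to a classifier as the texture feature vector.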
High-performance Chinese multiclass traffic sign detection via coarse-to-fine cascade and parallel support vector machine detectors

NASA Astrophysics Data System (ADS)

Chang, Faliang; Liu, Chunsheng

2017-09-01

The high variability of sign colors and shapes in uncontrolled environments has made the detection of traffic signs a challenging problem in computer vision. We propose a traffic sign detection (TSD) method based on coarse-to-fine cascade and parallel support vector machine (SVM) detectors to detect Chinese warning and danger traffic signs. First, a region of interest (ROI) extraction method is proposed to extract ROIs using color contrast features in local regions. The ROI extraction can reduce scanning regions and save detection time. For multiclass TSD, we propose a structure that combines a coarse-to-fine cascaded tree with a parallel structure of histogram of oriented gradients (HOG) + SVM detectors. The cascaded tree is designed to detect different types of traffic signs in a coarse-to-fine process. The parallel HOG + SVM detectors are designed to do fine detection of different types of traffic signs. The experiments demonstrate that the proposed TSD method can rapidly detect multiclass traffic signs of different colors and shapes with high accuracy.
Quantification of caffeine, trigonelline and nicotinic acid in espresso coffee: the influence of espresso machines and coffee cultivars.

PubMed

Caprioli, Giovanni; Cortese, Manuela; Maggi, Filippo; Minnetti, Caterina; Odello, Luigi; Sagratini, Gianni; Vittori, Sauro

2014-06-01

Caffeine, trigonelline and nicotinic acid are important bioactive constituents of coffee. In this work, the combination of different water temperatures and pressures in the settings of the espresso coffee (EC) machine was evaluated, to assess how these factors influence how effectively caffeine, trigonelline and nicotinic acid are extracted from both Arabica and Robusta samples. The proposed analytical method, based on a high performance liquid chromatography (HPLC) system coupled to a variable wavelength detector (VWD), showed good linearity (R² > 0.9985) and good recoveries (71-92%); after validation for three monitored compounds, the method was used to analyze 20 commercial samples. The combination of a temperature of 92 °C and pressure at 7 or 9 bar seems to be the ideal setting for the most efficient extraction of these compounds and consequently for their intake; the compound extracted in the greatest quantity was caffeine, which was in the range of 116.87-199.68 mg in a 25 ml cup of coffee.
Classification of pulmonary pathology from breath sounds using the wavelet packet transform and an extreme learning machine.

PubMed

Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian; Huliraj, N; Revadi, S S

2017-06-08

Auscultation is a medical procedure used for the initial diagnosis and assessment of lung and heart diseases. From this perspective, we propose assessing the performance of the extreme learning machine (ELM) classifiers for the diagnosis of pulmonary pathology using breath sounds. Energy and entropy features were extracted from the breath sound using the wavelet packet transform. The statistical significance of the extracted features was evaluated by one-way analysis of variance (ANOVA). The extracted features were inputted into the ELM classifier. The maximum classification accuracies obtained for the conventional validation (CV) of the energy and entropy features were 97.36% and 98.37%, respectively, whereas the accuracies obtained for the cross validation (CRV) of the energy and entropy features were 96.80% and 97.91%, respectively. In addition, maximum classification accuracies of 98.25% and 99.25% were obtained for the CV and CRV of the ensemble features, respectively. The results indicate that the classification accuracy obtained with the ensemble features was higher than those obtained with the energy and entropy features.
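The energy and entropy features described above can be illustrated with a small wavelet packet decomposition. This sketch assumes the Haar wavelet and defines entropy as the Shannon entropy of the normalised sub-band energies; the abstract specifies neither, so both are assumptions, as are the function names.

```python
import math


def haar_split(signal):
    """Split an even-length signal into Haar approximation and detail halves."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet(signal, level):
    """Return the 2**level leaf nodes of a full Haar wavelet packet tree."""
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_split(node))  # split both approx and detail
        nodes = next_nodes
    return nodes


def energy_and_entropy(nodes):
    """Return per-node energies and the Shannon entropy of their distribution."""
    energies = [sum(v * v for v in node) for node in nodes]
    total = sum(energies)
    if total == 0:
        return energies, 0.0
    probs = [e / total for e in energies if e > 0]
    entropy = sum(-p * math.log(p) for p in probs)
    return energies, entropy


# A constant signal puts all of its energy in the lowest-frequency node.
nodes = wavelet_packet([1.0] * 8, level=2)
energies, entropy = energy_and_entropy(nodes)
print(energies, entropy)
```

Unlike the ordinary wavelet transform, the packet transform also splits the detail branches, which gives a uniform frequency tiling; that can be useful when the discriminating information sits in higher frequency bands.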
Automated in vivo identification of fungal infection on human scalp using optical coherence tomography and machine learning

NASA Astrophysics Data System (ADS)

Dubey, Kavita; Srivastava, Vishal; Singh Mehta, Dalip

2018-04-01

Early identification of fungal infection on the human scalp is crucial for avoiding hair loss. The diagnosis of fungal infection on the human scalp is based on a visual assessment by trained experts or doctors. Optical coherence tomography (OCT) has the ability to capture fungal infection information from the human scalp with a high resolution. In this study, we present a fully automated, non-contact, non-invasive optical method for rapid detection of fungal infections based on the extracted features from A-line and B-scan images of OCT. A multilevel ensemble machine model is designed to perform automated classification, which shows the superiority of our classifier to the best classifier based on the features extracted from OCT images. In this study, 60 samples (30 fungal, 30 normal) were imaged by OCT and eight features were extracted. The classification algorithm had an average sensitivity, specificity and accuracy of 92.30, 90.90 and 91.66%, respectively, for identifying fungal and normal human scalps. This remarkable classifying ability makes the proposed model readily applicable to classifying the human scalp.
Multivariate data analysis and machine learning in Alzheimer's disease with a focus on structural magnetic resonance imaging.

PubMed

Falahati, Farshad; Westman, Eric; Simmons, Andrew

2014-01-01

Machine learning algorithms and multivariate data analysis methods have been widely utilized in the field of Alzheimer's disease (AD) research in recent years. Advances in medical imaging and medical image analysis have provided a means to generate and extract valuable neuroimaging information. Automatic classification techniques provide tools to analyze this information and observe inherent disease-related patterns in the data. In particular, these classifiers have been used to discriminate AD patients from healthy control subjects and to predict conversion from mild cognitive impairment to AD. In this paper, recent studies are reviewed that have used machine learning and multivariate analysis in the field of AD research. The main focus is on studies that used structural magnetic resonance imaging (MRI), but studies that included positron emission tomography and cerebrospinal fluid biomarkers in addition to MRI are also considered. A wide variety of materials and methods has been employed in different studies, resulting in a range of different outcomes. Influential factors such as classifiers, feature extraction algorithms, feature selection methods, validation approaches, and cohort properties are reviewed, as well as key MRI-based and multi-modal based studies. Current and future trends are discussed.
Automatic Detection of Acromegaly From Facial Photographs Using Machine Learning Methods.

PubMed

Kong, Xiangyi; Gong, Shun; Su, Lijuan; Howard, Newton; Kong, Yanguo

2018-01-01

Automatic early detection of acromegaly is theoretically possible from facial photographs, which can lessen the prevalence and increase the cure probability. In this study, several popular machine learning algorithms were used to train a retrospective development dataset consisting of 527 acromegaly patients and 596 normal subjects. We first used OpenCV to detect the face bounding rectangle box, and then cropped and resized it to the same pixel dimensions. From the detected faces, locations of facial landmarks which were the potential clinical indicators were extracted. Frontalization was then adopted to synthesize frontal facing views to improve the performance. Several popular machine learning methods including LM, KNN, SVM, RT, CNN, and EM were used to automatically identify acromegaly from the detected facial photographs, extracted facial landmarks, and synthesized frontal faces. The trained models were evaluated using a separate dataset, of which half were diagnosed as acromegaly by growth hormone suppression test. The best result of our proposed methods showed a PPV of 96%, a NPV of 95%, a sensitivity of 96% and a specificity of 96%. Artificial intelligence can automatically detect acromegaly early with a high sensitivity and specificity. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
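For readers less familiar with the four figures reported above, the following sketch shows how PPV, NPV, sensitivity and specificity follow from a binary confusion matrix. The counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are detected
        "specificity": tn / (tn + fp),  # share of healthy subjects cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are right
        "npv": tn / (tn + fn),          # share of negative calls that are right
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical balanced test set: 50 patients and 50 controls.
metrics = diagnostic_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

PPV and NPV depend on the share of true cases in the evaluation set, so values measured on a balanced test set do not carry over directly to a population in which the condition is rare.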
A novel approach for detection and classification of mammographic microcalcifications using wavelet analysis and extreme learning machine.

PubMed

Malar, E; Kandaswamy, A; Chakravarthy, D; Giri Dharan, A

2012-09-01

The objective of this paper is to reveal the effectiveness of wavelet based tissue texture analysis for microcalcification detection in digitized mammograms using Extreme Learning Machine (ELM). Microcalcifications are tiny deposits of calcium in the breast tissue which are potential indicators for early detection of breast cancer. The dense nature of the breast tissue and the poor contrast of the mammogram image make microcalcifications difficult to identify. Hence, a new approach to discriminate the microcalcifications from the normal tissue is done using wavelet features and is compared with different feature vectors extracted using Gray Level Spatial Dependence Matrix (GLSDM) and Gabor filter based techniques. A total of 120 Regions of Interest (ROIs) extracted from 55 mammogram images of mini-Mias database, including normal and microcalcification images are used in the current research. The network is trained with the above mentioned features and the results denote that ELM produces relatively better classification accuracy (94%) with a significant reduction in training time than the other artificial neural networks like Bayesnet classifier, Naivebayes classifier, and Support Vector Machine. ELM also avoids problems like local minima, improper learning rate, and overfitting. Copyright © 2012 Elsevier Ltd. All rights reserved.
Machine Learning Methods to Extract Documentation of Breast Cancer Symptoms From Electronic Health Records.

PubMed

Forsyth, Alexander W; Barzilay, Regina; Hughes, Kevin S; Lui, Dickson; Lorenz, Karl A; Enzinger, Andrea; Tulsky, James A; Lindvall, Charlotta

2018-06-01

Clinicians document cancer patients' symptoms in free-text format within electronic health record visit notes. Although symptoms are critically important to quality of life and often herald clinical status changes, computational methods to assess the trajectory of symptoms over time are woefully underdeveloped. The objective was to create machine learning algorithms capable of extracting patient-reported symptoms from free-text electronic health record notes. The data set included 103,564 sentences obtained from the electronic clinical notes of 2695 breast cancer patients receiving paclitaxel-containing chemotherapy at two academic cancer centers between May 1996 and May 2015. We manually annotated 10,000 sentences and trained a conditional random field model to predict words indicating an active symptom (positive label), absence of a symptom (negative label), or no symptom at all (neutral label). Sentences labeled by a human coder were divided into training, validation, and test data sets. Final model performance was determined on 20% test data unused in model development or tuning. The final model achieved precision of 0.82, 0.86, and 0.99 and recall of 0.56, 0.69, and 1.00 for positive, negative, and neutral symptom labels, respectively. The most common positive symptoms were pain, fatigue, and nausea. Machine-based labeling of 103,564 sentences took two minutes. We demonstrate the potential of machine learning to gather, track, and analyze symptoms experienced by cancer patients during chemotherapy. Although our initial model requires further optimization to improve the performance, further model building may yield machine learning methods suitable to be deployed in routine clinical care, quality improvement, and research applications. Copyright © 2018 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Understanding dental CAD/CAM for restorations--dental milling machines from a mechanical engineering viewpoint. Part B: labside milling machines.

PubMed

Lebon, Nicolas; Tapie, Laurent; Duret, Francois; Attal, Jean-Pierre

2016-01-01

Nowadays, dental numerical controlled (NC) milling machines are available for dental laboratories (labside solution) and dental production centers. This article provides a mechanical engineering approach to NC milling machines to help dental technicians understand the involvement of technology in digital dentistry practice. The technical and economic criteria are described for four labside and two production center dental NC milling machines available on the market. The technical criteria are focused on the capacities of the embedded technologies of milling machines to mill prosthetic materials and various restoration shapes. The economic criteria are focused on investment cost and interoperability with third-party software. The clinical relevance of the technology is discussed through the accuracy and integrity of the restoration. It can be asserted that dental production center milling machines offer a wider range of materials and types of restoration shapes than labside solutions, while labside solutions offer a wider range than chairside solutions. The accuracy and integrity of restorations may be improved as a function of the embedded technologies provided. However, the more complex the technical solutions available, the more skilled the user must be. Investment cost and interoperability with third-party software increase according to the quality of the embedded technologies implemented. Each private dental practice may decide which fabrication option to use depending on the scope of the practice.
Machine perfusion in liver transplantation as a tool to prevent non-anastomotic biliary strictures: Rationale, current evidence and future directions.

PubMed

Weeder, Pepijn D; van Rijn, Rianne; Porte, Robert J

2015-07-01

The high incidence of non-anastomotic biliary strictures (NAS) after transplantation of livers from extended criteria donors is currently a major barrier to widespread use of these organs. This review provides an update on the most recent advances in the understanding of the etiology of NAS. These new insights give reason to believe that machine perfusion can reduce the incidence of NAS after transplantation by providing more protective effects on the biliary tree during preservation of the donor liver. An overview is presented regarding the different endpoints that have been used for assessment of biliary injury and function before and after transplantation, with emphasis on methods used during machine perfusion. The wide spectrum of different approaches to machine perfusion is discussed, including the many different combinations of techniques, temperatures and perfusates at varying time points. In addition, the current understanding of the effect of machine perfusion in relation to biliary injury is reviewed. Finally, we explore directions for future research such as the application of (pharmacological) strategies during machine perfusion to further improve preservation. We stress the great potential of machine perfusion to possibly expand the donor pool by reducing the incidence of NAS in extended criteria organs. Copyright © 2015 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
A general-purpose machine learning framework for predicting properties of inorganic materials

DOE PAGES

Ward, Logan; Agrawal, Ankit; Choudhary, Alok; ...

2016-08-26

A very active area of materials research is to devise methods that use machine learning to automatically extract predictive models from existing materials data. While prior examples have demonstrated successful models for some applications, many more applications exist where machine learning can make a strong impact. To enable faster development of machine-learning-based models for such applications, we have created a framework capable of being applied to a broad range of materials data. Our method works by using a chemically diverse list of attributes, which we demonstrate are suitable for describing a wide variety of properties, and a novel method for partitioning the data set into groups of similar materials to boost the predictive accuracy. In this manuscript, we demonstrate how this new method can be used to predict diverse properties of crystalline and amorphous materials, such as band gap energy and glass-forming ability.
Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.

PubMed

Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang

2010-05-07

Lysine acetylation is an essentially reversible and highly regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this work. The experimentally determined acetylation lysine sites are extracted from the Swiss-Prot database and the scientific literature. Experimental results show that an ensemble of support vector machine classifiers outperforms a single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting acetylation lysine sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation sites prediction available at http://www.aporc.org/EnsemblePail/. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
A general-purpose machine learning framework for predicting properties of inorganic materials

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ward, Logan; Agrawal, Ankit; Choudhary, Alok

A very active area of materials research is to devise methods that use machine learning to automatically extract predictive models from existing materials data. While prior examples have demonstrated successful models for some applications, many more applications exist where machine learning can make a strong impact. To enable faster development of machine-learning-based models for such applications, we have created a framework capable of being applied to a broad range of materials data. Our method works by using a chemically diverse list of attributes, which we demonstrate are suitable for describing a wide variety of properties, and a novel method for partitioning the data set into groups of similar materials to boost the predictive accuracy. In this manuscript, we demonstrate how this new method can be used to predict diverse properties of crystalline and amorphous materials, such as band gap energy and glass-forming ability.

Discovering Fine-grained Sentiment in Suicide Notes

PubMed Central

Wang, Wenbo; Chen, Lu; Tan, Ming; Wang, Shaojun; Sheth, Amit P.

2012-01-01

This paper presents our solution for the i2b2 sentiment classification challenge. Our hybrid system consists of machine learning and rule-based classifiers. For the machine learning classifier, we investigate a variety of lexical, syntactic and knowledge-based features, and show how much these features contribute to the performance of the classifier through experiments. For the rule-based classifier, we propose an algorithm to automatically extract effective syntactic and lexical patterns from training examples. The experimental results show that the rule-based classifier outperforms the baseline machine learning classifier using unigram features. By combining the machine learning classifier and the rule-based classifier, the hybrid system gains a better trade-off between precision and recall, and yields the highest micro-averaged F-measure (0.5038), which is better than the mean (0.4875) and median (0.5027) micro-average F-measures among all participating teams. PMID:22879770
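The micro-averaged F-measure quoted above pools true positives, false positives and false negatives across all classes before computing precision and recall. A minimal sketch follows; the per-class counts and class names are hypothetical and are not the challenge data.

```python
def micro_f_measure(per_class_counts):
    """Compute the micro-averaged F1 from per-class tp/fp/fn counts.

    per_class_counts: dict mapping class name -> dict with keys tp, fp, fn.
    """
    tp = sum(c["tp"] for c in per_class_counts.values())
    fp = sum(c["fp"] for c in per_class_counts.values())
    fn = sum(c["fn"] for c in per_class_counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two sentiment classes.
counts = {
    "class_a": {"tp": 3, "fp": 1, "fn": 2},
    "class_b": {"tp": 1, "fp": 2, "fn": 1},
}
score = micro_f_measure(counts)
print(round(score, 4))
```

Because the counts are pooled, frequent classes dominate the micro-average, whereas a macro-average would weight every class equally.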
Holographic Labeling And Reading Machine For Authentication And Security Applications

DOEpatents

Weber, David C.; Trolinger, James D.

1999-07-06

A holographic security label and automated reading machine for marking and subsequently authenticating any object such as an identification badge, a pass, a ticket, a manufactured part, or a package is described. The security label is extremely difficult to copy or even to read by unauthorized persons. The system comprises a holographic security label that has been created with a coded reference wave, whose specification can be kept secret. The label contains information that can be extracted only with the coded reference wave, which is derived from a holographic key, which restricts access of the information to only the possessor of the key. A reading machine accesses the information contained in the label and compares it with data stored in the machine through the application of a joint transform correlator, which is also equipped with a reference hologram that adds additional security to the procedure.
Machine learning of network metrics in ATLAS Distributed Data Management

NASA Astrophysics Data System (ADS)

Lassnig, Mario; Toler, Wesley; Vamosi, Ralf; Bogado, Joaquin; ATLAS Collaboration

2017-10-01

The increasing volume of physics data poses a critical challenge to the ATLAS experiment. In anticipation of high luminosity physics, automation of everyday data management tasks has become necessary. Previously many of these tasks required human decision-making and operation. Recent advances in hardware and software have made it possible to entrust more complicated duties to automated systems using models trained by machine learning algorithms. In this contribution we show results from one of our ongoing automation efforts that focuses on network metrics. First, we describe our machine learning framework built atop the ATLAS Analytics Platform. This framework can automatically extract and aggregate data, train models with various machine learning algorithms, and eventually score the resulting models and parameters. Second, we use these models to forecast metrics relevant for network-aware job scheduling and data brokering. We show the characteristics of the data and evaluate the forecasting accuracy of our models.
Scientific bases of human-machine communication by voice.

PubMed Central

Schafer, R W

1995-01-01

The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines. PMID:7479802
Progress in machine consciousness.

PubMed

Gamez, David

2008-09-01

This paper is a review of the work that has been carried out on machine consciousness. A clear overview of this diverse field is achieved by breaking machine consciousness down into four different areas, which are used to understand its aims, discuss its relationship with other subjects and outline the work that has been carried out so far. The criticisms that have been made against machine consciousness are also covered, along with its potential benefits, and the work that has been done on analysing systems for signs of consciousness. Some of the social and ethical issues raised by machine consciousness are examined at the end of the paper.
New method for the rapid extraction of natural products: efficient isolation of shikimic acid from star anise.

PubMed

Just, Jeremy; Deans, Bianca J; Olivier, Wesley J; Paull, Brett; Bissember, Alex C; Smith, Jason A

2015-05-15

A new, practical, rapid, and high-yielding process for the pressurized hot water extraction (PHWE) of multigram quantities of shikimic acid from star anise (Illicium verum) using an unmodified household espresso machine has been developed. This operationally simple and inexpensive method enables the efficient and straightforward isolation of shikimic acid and the facile preparation of a range of its synthetic derivatives.
The extraction of accurate coordinates of images on photographic plates by means of a scanning type measuring machine

NASA Technical Reports Server (NTRS)

Ross, B. E.

1971-01-01

The Moire method of experimental stress analysis poses a problem similar to one encountered in astrometry: accurate coordinates must be extracted from images on photographic plates. The solution to this shared problem, as found applicable to experimental stress analysis, is presented in order to outline the measurement problem. A discussion of the photo-reading device developed to make the measurements follows.
Comparison of machine learned approaches for thyroid nodule characterization from shear wave elastography images

NASA Astrophysics Data System (ADS)

Pereira, Carina; Dighe, Manjiri; Alessio, Adam M.

2018-02-01

Various Computer Aided Diagnosis (CAD) systems have been developed that characterize thyroid nodules using the features extracted from the B-mode ultrasound images and Shear Wave Elastography images (SWE). These features, however, are not perfect predictors of malignancy. In other domains, deep learning techniques such as Convolutional Neural Networks (CNNs) have outperformed conventional feature extraction based machine learning approaches. In general, fully trained CNNs require substantial volumes of data, motivating several efforts to use transfer learning with pre-trained CNNs. In this context, we sought to compare the performance of conventional feature extraction, fully trained CNNs, and transfer learning based, pre-trained CNNs for the detection of thyroid malignancy from ultrasound images. We compared these approaches applied to a data set of 964 B-mode and SWE images from 165 patients. The data were divided into 80% training/validation and 20% testing data. The highest accuracies achieved on the testing data for the conventional feature extraction, fully trained CNN, and pre-trained CNN were 0.80, 0.75, and 0.83 respectively. In this application, classification using a pre-trained network yielded the best performance, potentially due to the relatively limited sample size and sub-optimal architecture for the fully trained CNN.
Machine-assisted verification of latent fingerprints: first results for nondestructive contact-less optical acquisition techniques with a CWL sensor

NASA Astrophysics Data System (ADS)

Hildebrandt, Mario; Kiltz, Stefan; Krapyvskyy, Dmytro; Dittmann, Jana; Vielhauer, Claus; Leich, Marcus

2011-11-01

A machine-assisted analysis of traces from crime scenes might be possible with the advent of new high-resolution non-destructive contact-less acquisition techniques for latent fingerprints. This requires reliable techniques for the automatic extraction of fingerprint features from latent and exemplar fingerprints for matching purposes using pattern recognition approaches. Therefore, we evaluate the NIST Biometric Image Software for the feature extraction and verification of contact-lessly acquired latent fingerprints to determine potential error rates. Our exemplary test setup includes 30 latent fingerprints from 5 people in two test sets that are acquired from different surfaces using a chromatic white light sensor. The first test set includes 20 fingerprints on two different surfaces. It is used to determine the feature extraction performance. The second test set includes one latent fingerprint on 10 different surfaces and an exemplar fingerprint to determine the verification performance. The sensing technique used does not require a physical or chemical visibility enhancement of the fingerprint residue, so the original trace remains unaltered for further investigations. No particular feature extraction and verification techniques have yet been applied to such data. Hence, we see the need for appropriate algorithms that are suitable to support forensic investigations.
A Fault Alarm and Diagnosis Method Based on Sensitive Parameters and Support Vector Machine

NASA Astrophysics Data System (ADS)

Zhang, Jinjie; Yao, Ziyun; Lv, Zhiquan; Zhu, Qunxiong; Xu, Fengtian; Jiang, Zhinong

2015-08-01

Fault feature extraction and diagnostic techniques for reciprocating compressors are among the most active research topics in reciprocating machinery fault diagnosis. Many feature extraction and classification methods have been applied in related research, but practical fault alarms and diagnostic accuracy have not been effectively improved. Developing feature extraction and classification methods that meet the requirements of typical fault alarm and automatic diagnosis in practical engineering is an urgent task. This paper presents the typical mechanical faults of reciprocating compressors and uses existing data from an online monitoring system to extract fault feature parameters of 15 types in total. The sensitive connection between faults and feature parameters is made clear using the distance evaluation technique, and the sensitive characteristic parameters of different faults are obtained. On this basis, a method based on fault feature parameters and a support vector machine (SVM) is developed for application to practical fault diagnosis. Experiments and practical fault cases demonstrate a better early fault warning capability, and automatic classification of fault alarm data with the SVM achieves better diagnostic accuracy.
Self-Supervised Chinese Ontology Learning from Online Encyclopedias

PubMed Central

Shao, Zhiqing; Ruan, Tong

2014-01-01

Constructing ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised learning based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. In order to avoid the errors in encyclopedias and enrich the learnt ontology, we also apply some machine learning based methods. First, we show statistically and experimentally that the self-supervised machine learning method is practicable in Chinese relation extraction (at least for synonymy and hyponymy), and we train some self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantage of our methods is that all training examples are automatically generated from the structural information of encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and high coverage is concluded by comparing SSCO with other famous ontologies and knowledge bases; the experimental results also indicate that the self-supervised models obviously enrich SSCO. PMID:24715819
Self-supervised Chinese ontology learning from online encyclopedias.

PubMed

Hu, Fanghuai; Shao, Zhiqing; Ruan, Tong

2014-01-01

Constructing ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised learning based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. In order to avoid the errors in encyclopedias and enrich the learnt ontology, we also apply some machine learning based methods. First, we show statistically and experimentally that the self-supervised machine learning method is practicable in Chinese relation extraction (at least for synonymy and hyponymy), and we train some self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantage of our methods is that all training examples are automatically generated from the structural information of encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and high coverage is concluded by comparing SSCO with other famous ontologies and knowledge bases; the experimental results also indicate that the self-supervised models obviously enrich SSCO.
Implementing Machine Learning in Radiology Practice and Research.

PubMed

Kohli, Marc; Prevedello, Luciano M; Filice, Ross W; Geis, J Raymond

2017-04-01

The purposes of this article are to describe concepts that radiologists should understand to evaluate machine learning projects, including common algorithms, supervised as opposed to unsupervised techniques, statistical pitfalls, and data considerations for training and evaluation, and to briefly describe ethical dilemmas and legal risk. Machine learning includes a broad class of computer programs that improve with experience. The complexity of creating, training, and monitoring machine learning indicates that the success of the algorithms will require radiologist involvement for years to come, leading to engagement rather than replacement.
Deep Learning in Label-free Cell Classification

PubMed Central

Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia; Blaby, Ian K.; Huang, Allen; Niazi, Kayvan Reza; Jalali, Bahram

2016-01-01

Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individual cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. This system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells. PMID:26975219
Electrical Motor Current Signal Analysis using a Modulation Signal Bispectrum for the Fault Diagnosis of a Gearbox Downstream

NASA Astrophysics Data System (ADS)

Haram, M.; Wang, T.; Gu, F.; Ball, A. D.

2012-05-01

Motor current signal analysis has for many years been an effective way of monitoring electrical machines themselves. However, little work has been carried out on using this technique to monitor their downstream equipment because of difficulties in extracting small fault components from the measured current signals. This paper investigates the characteristics of electrical current signals for monitoring faults in a downstream gearbox using a modulation signal bispectrum (MSB), including phase effects in extracting small modulating components from a noisy measurement. An analytical study is first performed to understand the amplitude, frequency and phase characteristics of current signals due to faults. The paper then explores the performance of MSB analysis in detecting weak modulating components in current signals. An experimental study based on a 10 kW two-stage gearbox, driven by a three-phase induction motor, shows that MSB peaks at different rotational frequencies can be used to quantify the severity of gear tooth breakage and the degree of shaft misalignment. In addition, the type and location of a fault can be recognized based on the frequency at which the change in MSB peak is the highest among the different frequencies.
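As background for the abstract above, the modulation signal bispectrum is commonly written in the MSB literature in the form below. This is a reference sketch, not a formula quoted from this record: the symbols are assumptions introduced here, with X(f) the discrete Fourier transform of the measured current, f_c the carrier (supply) frequency, f_x the modulating (fault-related) frequency, and E the average over signal segments.

```latex
% Modulation signal bispectrum: the two sidebands at f_c +/- f_x are
% combined with the conjugated carrier component, so only components
% that are phase-coupled to the carrier survive the averaging.
B_{MS}(f_c, f_x) = E\left[ X(f_c + f_x)\, X(f_c - f_x)\, X^{*}(f_c)\, X^{*}(f_c) \right]
```

Under this form, random noise that is not phase-coupled to the carrier tends to average toward zero, which is consistent with the abstract's emphasis on phase effects when extracting weak modulating components.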
Autonomous characterization of plastic-bonded explosives

NASA Astrophysics Data System (ADS)

Linder, Kim Dalton; DeRego, Paul; Gomez, Antonio; Baumgart, Chris

2006-08-01

Plastic-Bonded Explosives (PBXs) are a newer generation of explosive compositions developed at Los Alamos National Laboratory (LANL). Understanding the micromechanical behavior of these materials is critical. The size of the crystal particles and porosity within the PBX influences their shock sensitivity. Current methods to characterize the prominent structural characteristics include manual examination by scientists and attempts to use commercially available image processing packages. Both methods are time consuming and tedious. LANL personnel, recognizing this as a manually intensive process, have worked with the Kansas City Plant / Kirtland Operations to develop a system which utilizes image processing and pattern recognition techniques to characterize PBX material. System hardware consists of a CCD camera, zoom lens, two-dimensional, motorized stage, and coaxial, cross-polarized light. System integration of this hardware with the custom software is at the core of the machine vision system. Fundamental processing steps involve capturing images from the PBX specimen, and extraction of void, crystal, and binder regions. For crystal extraction, a Quadtree decomposition segmentation technique is employed. Benefits of this system include: (1) reduction of the overall characterization time; (2) a process which is quantifiable and repeatable; (3) utilization of personnel for intelligent review rather than manual processing; and (4) significantly enhanced characterization accuracy.
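The abstract names quadtree decomposition as the segmentation technique for crystal extraction but gives no details. The following is a minimal, generic sketch of that technique, not the authors' implementation: the homogeneity test (intensity range within a threshold), the function name `quadtree`, and the toy image are all assumptions made for illustration, and the image side is assumed to be a power of two.

```python
def quadtree(image, x, y, size, threshold, min_size=1):
    """Split a square region into homogeneous blocks, returned as (x, y, size)."""
    block = [image[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    # Assumed homogeneity criterion: intensity range within the threshold.
    if size <= min_size or max(block) - min(block) <= threshold:
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += quadtree(image, x + dx, y + dy, half, threshold, min_size)
    return blocks

# Hypothetical 4x4 grayscale image with one non-uniform quadrant.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 0],
]
blocks = quadtree(image, 0, 0, 4, threshold=1)
```

Uniform quadrants stay as single 2x2 blocks, while the mixed quadrant is split down to single pixels, so block size itself becomes a rough indicator of local texture.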
Deep Learning in Label-free Cell Classification

NASA Astrophysics Data System (ADS)

Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia; Blaby, Ian K.; Huang, Allen; Niazi, Kayvan Reza; Jalali, Bahram

2016-03-01

Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individual cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. This system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells.
Automatic lip reading by using multimodal visual features

NASA Astrophysics Data System (ADS)

Takahashi, Shohei; Ohya, Jun

2013-12-01

Speech recognition has long been researched, but it does not work well in noisy places such as cars or trains. In addition, people with hearing impairments or difficulties in hearing cannot benefit from speech recognition. To recognize speech automatically, visual information is also important. People understand speech not only from audio information but also from visual information such as temporal changes in lip shape. A vision-based speech recognition method could work well in noisy places and could also be useful for people with hearing disabilities. In this paper, we propose an automatic lip-reading method that recognizes speech using multimodal visual information, without using any audio information such as that used in speech recognition. First, the ASM (Active Shape Model) is used to detect and track the face and lips in a video sequence. Second, the shape, optical flow, and spatial frequencies of the lip features are extracted from the lips detected by the ASM. Next, the extracted multimodal features are ordered chronologically so that a Support Vector Machine can be trained to learn and classify the spoken words. Experiments on classifying several words show promising results for the proposed method.
Predicting lysine glycation sites using bi-profile bayes feature extraction.

PubMed

Ju, Zhe; Sun, Juhe; Li, Yanjie; Wang, Li

2017-12-01

Glycation is a nonenzymatic post-translational modification which has been found to be involved in various biological processes and closely associated with many metabolic diseases. The accurate identification of glycation sites is important to understand the underlying molecular mechanisms of glycation. As the traditional experimental methods are often labor-intensive and time-consuming, it is desirable to develop computational methods to predict glycation sites. In this study, a novel predictor named BPB_GlySite is proposed to predict lysine glycation sites by using bi-profile Bayes feature extraction and a support vector machine algorithm. As illustrated by 10-fold cross-validation, BPB_GlySite achieves a satisfactory performance with a Sensitivity of 63.68%, a Specificity of 72.60%, an Accuracy of 69.63% and a Matthews correlation coefficient of 0.3499. Experimental results also indicate that BPB_GlySite significantly outperforms three existing glycation site predictors: NetGlycate, PreGly and Gly-PseAAC. Therefore, BPB_GlySite can be a useful bioinformatics tool for the prediction of glycation sites. A user-friendly web-server for BPB_GlySite is established at 123.206.31.171/BPB_GlySite/. Copyright © 2017 Elsevier Ltd. All rights reserved.
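The four figures of merit quoted above all derive from a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts are hypothetical, chosen only to exercise the formulas, and are not the paper's data.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion counts."""
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc

# Hypothetical counts, not taken from the study.
sn, sp, acc, mcc = binary_metrics(tp=60, fn=40, tn=150, fp=50)
```

Unlike accuracy, the Matthews correlation coefficient stays informative when positive and negative sites are imbalanced, which is presumably why abstracts of this kind report it alongside sensitivity and specificity.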
Automatic Picking of Foraminifera: Design of the Foraminifera Image Recognition and Sorting Tool (FIRST) Prototype and Results of the Image Classification Scheme

NASA Astrophysics Data System (ADS)

de Garidel-Thoron, T.; Marchant, R.; Soto, E.; Gally, Y.; Beaufort, L.; Bolton, C. T.; Bouslama, M.; Licari, L.; Mazur, J. C.; Brutti, J. M.; Norsa, F.

2017-12-01

Foraminifera tests are the main proxy carriers for paleoceanographic reconstructions. Both geochemical and taxonomical studies require large numbers of tests to achieve statistical relevance. To date, the extraction of foraminifera from the sediment coarse fraction is still done by hand and is thus time-consuming. Moreover, the recognition of ecologically relevant morphotypes requires taxonomical skills that are not easily taught. The automatic recognition and extraction of foraminifera would greatly help paleoceanographers to overcome these issues. Recent advances in automatic image classification using machine learning open the way to automatic extraction of foraminifera. Here we detail progress on the design of an automatic picking machine as part of the FIRST project. The machine handles 30 pre-sieved samples (100-1000 µm), separating them into individual particles (including foraminifera) and imaging each in pseudo-3D. The particles are classified and specimens of interest are sorted either for Individual Foraminifera Analyses (44 per slide) and/or for classical multiple analyses (8 morphological classes per slide, up to 1000 individuals per hole). The classification is based on machine learning using Convolutional Neural Networks (CNNs), similar to the approach used in the coccolithophorid imaging system SYRACO. To prove its feasibility, we built two training image datasets of modern planktonic foraminifera containing approximately 2000 and 5000 images, corresponding to 15 and 25 morphological classes respectively. Using a CNN with a residual topology (ResNet) we achieve over 95% correct classification for each dataset. We tested the network on 160,000 images from 45 depths of a sediment core from the Pacific Ocean, for which we have human counts. The current algorithm is able to reproduce the downcore variability in both Globigerinoides ruber and the fragmentation index (r² = 0.58 and 0.88 respectively). The FIRST prototype yields some promising results for high-resolution paleoceanographic studies and evolutionary studies.
Monitoring Temperature and Fan Speed Using Ganglia and Winbond Chips

DOE Office of Scientific and Technical Information (OSTI.GOV)

McCaffrey, Cattie; /SLAC

2006-09-27

Effective monitoring is essential to keep a large group of machines, like the ones at Stanford Linear Accelerator Center (SLAC), up and running. SLAC currently uses the Ganglia Monitoring System to observe about 2000 machines, analyzing metrics like CPU usage and I/O rate. However, metrics essential to machine hardware health, such as temperature and fan speed, are not being monitored. Many machines have a Winbond w83782d chip which monitors three temperatures, two of which come from dual CPUs, and returns the information when the sensors command is invoked. Ganglia also provides a feature, gmetric, that allows users to monitor their own metrics and incorporate them into the monitoring system. The programming language Perl is chosen to implement a script that invokes the sensors command, extracts the temperature and fan speed information, and calls gmetric with the appropriate arguments. Two machines were used to test the script; the two CPUs on each machine run at about 65 Celsius, which is well within the operating temperature range (the maximum safe temperature range is 77-82 Celsius for the Pentium III processors being used). Installing the script on all machines with a Winbond w83782d chip allows the SLAC Scientific Computing and Computing Services group (SCCS) to better evaluate current cooling methods.
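The workflow described above (run sensors, extract readings, hand each one to gmetric) can be sketched as follows. The original script was written in Perl; this Python version is only an illustration. The sample text, the regular expression, and the metric names are assumptions, since real sensors output varies by chip and driver, and the gmetric options shown should be checked against the installed Ganglia version before use.

```python
import re

# Illustrative text in the style of lm-sensors output (an assumption;
# the real layout depends on the chip, driver, and configuration).
SAMPLE = """\
temp1:       +35.0°C  (limit = +60°C, hysteresis = +50°C)
temp2:       +65.5°C  (limit = +80°C, hysteresis = +75°C)
fan1:        4500 RPM  (min = 3000 RPM, div = 2)
"""

# Capture the label, the first numeric reading, and its unit on each line.
LINE = re.compile(r"^(temp\d+|fan\d+):\s*([+-]?\d+(?:\.\d+)?)\s*°?(C|RPM)\b",
                  re.MULTILINE)

def parse_sensors(text):
    """Return a list of (name, value, units) tuples found in sensors-style text."""
    readings = []
    for name, value, unit in LINE.findall(text):
        units = "Celsius" if unit == "C" else "RPM"
        readings.append((name, float(value), units))
    return readings

def gmetric_command(name, value, units):
    """Build (but do not run) a gmetric invocation for one reading."""
    return ["gmetric", f"--name={name}", f"--value={value}",
            "--type=float", f"--units={units}"]

commands = [gmetric_command(*reading) for reading in parse_sensors(SAMPLE)]
```

On a monitored host, the sample text would be replaced by the captured output of the sensors command, and each command list could be passed to `subprocess.run`; building the argument lists separately keeps the parsing step testable without Ganglia installed.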
Vowel Imagery Decoding toward Silent Speech BCI Using Extreme Learning Machine with Electroencephalogram

PubMed Central

Kim, Jongin; Park, Hyeong-jun

2016-01-01

The purpose of this study is to classify EEG data on imagined speech in a single trial. We recorded EEG data while five subjects imagined different vowels, /a/, /e/, /i/, /o/, and /u/. We divided each single trial dataset into thirty segments and extracted features (mean, variance, standard deviation, and skewness) from all segments. To reduce the dimension of the feature vector, we applied a feature selection algorithm based on the sparse regression model. These features were classified using a support vector machine with a radial basis function kernel, an extreme learning machine, and two variants of an extreme learning machine with different kernels. Because each single trial consisted of thirty segments, our algorithm decided the label of the single trial by selecting the most frequent output among the outputs of the thirty segments. As a result, we observed that the extreme learning machine and its variants achieved better classification rates than the support vector machine with a radial basis function kernel and linear discriminant analysis. Thus, our results suggested that EEG responses to imagined speech could be successfully classified in a single trial using an extreme learning machine with a radial basis function and linear kernel. This study with classification of imagined speech might contribute to the development of silent speech BCI systems. PMID:28097128
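The per-segment features and the trial-level vote described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's code: it uses population statistics, a single channel, equal-length segments, and a hypothetical list of per-segment predictions in place of a trained classifier.

```python
from collections import Counter
from statistics import fmean, pvariance

def segment_features(segment):
    """Mean, variance, standard deviation, and skewness of one segment."""
    mu = fmean(segment)
    var = pvariance(segment, mu)   # population variance (an assumption)
    std = var ** 0.5
    skew = fmean(((x - mu) / std) ** 3 for x in segment) if std else 0.0
    return [mu, var, std, skew]

def split_into_segments(trial, n_segments):
    """Cut a trial into n equal-length segments, dropping any remainder."""
    size = len(trial) // n_segments
    return [trial[i * size:(i + 1) * size] for i in range(n_segments)]

def majority_vote(labels):
    """Label the trial with the most frequent per-segment prediction."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical single-channel trial of 60 samples, cut into thirty segments.
trial = [float(i % 7) for i in range(60)]
features = [segment_features(s) for s in split_into_segments(trial, 30)]

# Hypothetical per-segment classifier outputs for one trial.
trial_label = majority_vote(["a"] * 14 + ["e"] * 9 + ["o"] * 7)
```

Voting over segments means a trial can be labeled correctly even when a sizable share of its individual segments are misclassified.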
Photometric Supernova Classification with Machine Learning

NASA Astrophysics Data System (ADS)

Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.

2016-08-01

Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
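The AUC metric used above has a simple rank interpretation: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way; the function name `roc_auc`, the scores, and the labels are hypothetical and unrelated to the paper's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5, a reversed pair 0.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores; label 1 marks the class of interest.
perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

The pairwise form costs time proportional to the number of positive-negative pairs, which is fine for illustration; sorting-based implementations are the usual choice for large samples.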
Automatic classification of written descriptions by healthy adults: An overview of the application of natural language processing and machine learning techniques to clinical discourse analysis.

PubMed

Toledo, Cíntia Matsuda; Cunha, Andre; Scarton, Carolina; Aluísio, Sandra

2014-01-01

Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with compatible education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. The aims were to describe how to: (i) develop machine learning classifiers using features generated by natural language processing tools to distinguish descriptions produced by healthy individuals into classes based on their years of education; and (ii) automatically identify the features that best distinguish the groups. The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features (three new ones, two extracted manually, besides description time; type of scene described - simple or complex; presentation order - which type of picture was described first; and age). In this study, the descriptions by 144 of the subjects studied in Toledo 18 were used, which included 200 healthy Brazilians of both genders. A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods.
A Machine Learning-based Method for Question Type Classification in Biomedical Question Answering.

PubMed

Sarrouti, Mourad; Ouatik El Alaoui, Said

2017-05-18

Biomedical question type classification is one of the important components of an automatic biomedical question answering system. The performance of the latter depends directly on the performance of its biomedical question type classification system, which consists of assigning a category to each question in order to determine the appropriate answer extraction algorithm. This study aims to automatically classify biomedical questions into one of the four categories: (1) yes/no, (2) factoid, (3) list, and (4) summary. In this paper, we propose a biomedical question type classification method based on machine learning approaches to automatically assign a category to a biomedical question. First, we extract features from biomedical questions using the proposed handcrafted lexico-syntactic patterns. Then, we feed these features to machine-learning algorithms. Finally, the class label is predicted using the trained classifiers. Experimental evaluations performed on large standard annotated datasets of biomedical questions, provided by the BioASQ challenge, demonstrated that our method exhibits significantly improved performance when compared to four baseline systems. The proposed method achieves a roughly 10-point increase over the best baseline in terms of accuracy. Moreover, the obtained results show that using handcrafted lexico-syntactic patterns as the feature provider for a support vector machine (SVM) leads to the highest accuracy of 89.40%. The proposed method can automatically classify BioASQ questions into one of the four categories: yes/no, factoid, list, and summary. Furthermore, the results demonstrated that our method produced the best classification performance compared to four baseline systems.
Integrating machine learning techniques and high-resolution imagery to generate GIS-ready information for urban water consumption studies

NASA Astrophysics Data System (ADS)

Wolf, Nils; Hof, Angela

2012-10-01

Urban sprawl driven by shifts in tourism development produces new suburban landscapes of water consumption on Mediterranean coasts. Golf courses, ornamental, 'Atlantic' gardens and swimming pools are the most striking artefacts of this transformation, threatening the local water supply systems and exacerbating water scarcity. In the face of climate change, urban landscape irrigation is becoming increasingly important from a resource management point of view. This paper adopts urban remote sensing towards a targeted mapping approach using machine learning techniques and high-resolution satellite imagery (WorldView-2) to generate GIS-ready information for urban water consumption studies. Swimming pools, vegetation and - as a subgroup of vegetation - turf grass are extracted as important determinants of water consumption. For image analysis, the complex nature of urban environments suggests spatial-spectral classification, i.e. the complementary use of the spectral signature and spatial descriptors. Multiscale image segmentation provides means to extract the spatial descriptors - namely object feature layers - which can be concatenated at pixel level to the spectral signature. This study assesses the value of object features using different machine learning techniques and amounts of labeled information for learning. The results indicate the benefit of the spatial-spectral approach if combined with appropriate classifiers like tree-based ensembles or support vector machines, which can handle high dimensionality. Finally, a Random Forest classifier was chosen to deliver the classified input data for the estimation of evaporative water loss and net landscape irrigation requirements.
Language Acquisition and Machine Learning.

DTIC Science & Technology

1986-02-01

machine learning and examine its implications for computational models of language acquisition. As a framework for understanding this research, the authors propose four component tasks involved in learning from experience-aggregation, clustering, characterization, and storage. They then consider four common problems studied by machine learning researchers-learning from examples, heuristics learning, conceptual clustering, and learning macro-operators-describing each in terms of our framework. After this, they turn to the problem of grammar
Clocks to Computers: A Machine-Based “Big Picture” of the History of Modern Science.

PubMed

van Lunteren, Frans

2016-12-01

Over the last few decades there have been several calls for a “big picture” of the history of science. There is a general need for a concise overview of the rise of modern science, with a clear structure allowing for a rough division into periods. This essay proposes such a scheme, one that is both elementary and comprehensive. It focuses on four machines, which can be seen to have mediated between science and society during successive periods of time: the clock, the balance, the steam engine, and the computer. Following an extended developmental phase, each of these machines came to play a highly visible role in Western societies, both socially and economically. Each of these machines, moreover, was used as a powerful resource for the understanding of both inorganic and organic nature. More specifically, their metaphorical use helped to construe and refine some key concepts that would play a prominent role in such understanding. In each case the key concept would at some point be considered to represent the ultimate building block of reality. Finally, in a refined form, each of these machines would eventually make its entry in scientific research, thereby strengthening the ties between these machines and nature.
Object recognition of ladar with support vector machine

NASA Astrophysics Data System (ADS)

Sun, Jian-Feng; Li, Qi; Wang, Qi

2005-01-01

Intensity, range and Doppler images can be obtained by using laser radar. Laser radar can detect much more object information than other sensors, such as passive infrared imaging and synthetic aperture radar (SAR), so it is well suited as a sensor for object recognition. The traditional method of laser radar object recognition is to extract target features, which can be influenced by noise. In this paper, a laser radar recognition method based on the Support Vector Machine is introduced. The Support Vector Machine (SVM) is a new hotspot of recognition research after neural networks. It performs well on handwritten digit and face recognition. Two series of SVM experiments, designed for preprocessed and non-preprocessed samples, are performed on real laser radar images, and the experimental results are compared.
Power line identification of millimeter wave radar based on PCA-GS-SVM

NASA Astrophysics Data System (ADS)

Fang, Fang; Zhang, Guifeng; Cheng, Yansheng

2017-12-01

To address the problem that existing detection methods cannot effectively ensure the safety of UAVs flying at ultra-low altitude near power lines, a power line recognition method based on grid search (GS), principal component analysis, and support vector machine (PCA-SVM) is proposed. Firstly, the candidate lines from the Hough transform are reduced by PCA, and the main features of the candidate lines are extracted. Then, the support vector machine (SVM) is optimized by the grid search method (GS). Finally, the support vector machine classifier with the optimized parameters is used to classify the candidate lines. MATLAB simulation results show that this method can effectively distinguish power lines from noise, and has high recognition accuracy and algorithm efficiency.
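The grid search step mentioned above can be sketched generically: every combination of candidate hyperparameters is scored and the best one is kept. This is a minimal Python sketch rather than the authors' MATLAB code. The parameter names `C` and `gamma`, the candidate values, and `toy_score` are assumptions for illustration; in practice the score would be something like cross-validated SVM accuracy on the PCA-reduced features.

```python
import math
from itertools import product


def grid_search(score_fn, grid):
    """Exhaustively evaluate every parameter combination and keep the best."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, gamma):
    # Hypothetical stand-in for cross-validated accuracy; it peaks at
    # C=10, gamma=0.01 so the search has a well-defined optimum.
    return -((math.log10(C) - 1) ** 2 + (math.log10(gamma) + 2) ** 2)


# Hypothetical candidate values on a logarithmic grid.
GRID = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1]}
best_params, best_score = grid_search(toy_score, GRID)
```

The cost grows with the product of the grid sizes (here 4 x 3 = 12 evaluations), which is why coarse logarithmic grids are a common starting point.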
TU-G-303-03: Machine Learning to Improve Human Learning From Longitudinal Image Sets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Veeraraghavan, H.

‘Radiomics’ refers to studies that extract a large amount of quantitative information from medical imaging studies as a basis for characterizing a specific aspect of patient health. Radiomics models can be built to address a wide range of outcome predictions, clinical decisions, basic cancer biology, etc. For example, radiomics models can be built to predict the aggressiveness of an imaged cancer, cancer gene expression characteristics (radiogenomics), radiation therapy treatment response, etc. Technically, radiomics brings together quantitative imaging, computer vision/image processing, and machine learning. In this symposium, speakers will discuss approaches to radiomics investigations, including: longitudinal radiomics, radiomics combined with other biomarkers (‘pan-omics’), radiomics for various imaging modalities (CT, MRI, and PET), and the use of registered multi-modality imaging datasets as a basis for radiomics. There are many challenges to the eventual use of radiomics-derived methods in clinical practice, including: standardization and robustness of selected metrics, accruing the data required, building and validating the resulting models, registering longitudinal data that often involve significant patient changes, reliable automated cancer segmentation tools, etc. Despite the hurdles, results achieved so far indicate the tremendous potential of this general approach to quantifying and using data from medical images. Specific applications of radiomics to be presented in this symposium will include: the longitudinal analysis of patients with low-grade gliomas; automatic detection and assessment of patients with metastatic bone lesions; image-based monitoring of patients with growing lymph nodes; predicting radiotherapy outcomes using multi-modality radiomics; and studies relating radiomics with genomics in lung cancer and glioblastoma. Learning Objectives: Understanding the basic image features that are often used in radiomic models. Understanding requirements for reliable radiomic models, including robustness of metrics, adequate predictive accuracy, and generalizability. Understanding the methodology behind radiomic-genomic (’radiogenomics’) correlations. Research supported by NIH (US), CIHR (Canada), and NSERC (Canada).
Analysis of Flatness Deviations for Austenitic Stainless Steel Workpieces after Efficient Surface Machining

NASA Astrophysics Data System (ADS)

Nadolny, K.; Kapłonek, W.

2014-08-01

The following work is an analysis of flatness deviations of a workpiece made of X2CrNiMo17-12-2 austenitic stainless steel. The workpiece surface was shaped using efficient machining techniques (milling, grinding, and smoothing). After the machining was completed, all surfaces underwent stylus measurements in order to obtain surface flatness and roughness parameters. For this purpose the stylus profilometer Hommel-Tester T8000 by Hommelwerke with HommelMap software was used. The research results are presented in the form of 2D surface maps, 3D surface topographies with extracted single profiles, Abbott-Firestone curves, and graphical studies of the Sk parameters. The results of these experimental tests proved the possibility of a correlation between flatness and roughness parameters, as well as enabled an analysis of changes in these parameters from shaping and rough grinding to finished machining. The main novelty of this paper is a comprehensive analysis of measurement results obtained during a three-step machining process of austenitic stainless steel. Simultaneous analysis of individual machining steps (milling, grinding, and smoothing) enabled a complementary assessment of the process of shaping the workpiece surface macro- and micro-geometry, giving special consideration to minimizing the flatness deviations.
78 FR 77670 - Information Collection Request Submitted to OMB for Review and Approval; Comment Request; NESHAP...

Federal Register 2010, 2011, 2012, 2013, 2014

2013-12-24

...: http://www.epa.gov/dockets . Abstract: The sources subject to this rule (i.e., extraction plants, ceramic plants, foundries, incinerators, propellant plants, and machine shops which process beryllium and...
Fragrant pear sexuality recognition with machine vision

NASA Astrophysics Data System (ADS)

Ma, Benxue; Ying, Yibin

2006-10-01

In this research, a method to identify Kuler fragrant pear's sexuality with machine vision was developed. Kuler fragrant pears are either male or female, and the two have an obvious difference in flavor. To detect the sexuality of Kuler fragrant pear, images of fragrant pears were acquired by a CCD color camera. Before feature extraction, some preprocessing is conducted on the acquired images to remove noise and unnecessary contents. Color, perimeter and area features of the fragrant pear bottom image were extracted by digital image processing techniques. The fragrant pear sexuality was then determined by a complexity measure obtained from perimeter and area. Using 128 Kurle fragrant pears as samples, a good recognition rate between male and female pears was obtained (82.8%). The results show that this method can distinguish male and female pears with good accuracy.
SABRE, a 10-MV linear induction accelerator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Corely, J.P.; Alexander, J.A.; Pankuch, P.J.

SABRE (Sandia Accelerator and Beam Research Experiment) is a 10-MV, 250-kA, 40-ns linear induction accelerator. It was designed to be used in positive polarity output. Positive polarity accelerators are important for application to Sandia's ICF (Inertial Confinement Fusion) and LMF (Laboratory Microfusion Facility) program efforts. SABRE was built to allow a more detailed study of pulsed power issues associated with positive polarity output machines: MITL (Magnetically Insulated Transmission Line) voltage adder efficiency, extraction ion diode development, and ion beam transport and focusing. The SABRE design allows the system to operate in either positive polarity output for ion extraction applications or negative polarity output for more conventional electron beam loads. Details of the design of SABRE and the results of initial machine performance in negative polarity operation are presented in this paper. 13 refs., 12 figs., 1 tab.
Fall Detection Using Smartphone Audio Features.

PubMed

Cheffena, Michael

2016-07-01

An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers, namely k-nearest neighbor (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN), are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with an ANN classifier, with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirements for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
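For readers unfamiliar with the three evaluation measures named above, the following minimal sketch computes sensitivity, specificity, and accuracy from true and predicted labels. The label strings `"fall"` and `"no-fall"` and the function name are assumptions made for this example, not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive="fall"):
    """Return (sensitivity, specificity, accuracy) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: fraction of real falls that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of no-fall events correctly left alone.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(pairs) if pairs else float("nan")
    return sensitivity, specificity, accuracy
```

Reporting all three matters for fall detection because a detector can reach high accuracy on mostly no-fall data while still missing real falls (low sensitivity).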
A review of intelligent systems for heart sound signal analysis.

PubMed

Nabih-Ali, Mohammed; El-Dahshan, El-Sayed A; Yahia, Ashraf S

2017-10-01

Intelligent computer-aided diagnosis (CAD) systems can enhance the diagnostic capabilities of physicians and reduce the time required for accurate diagnosis. CAD systems could provide physicians with a suggestion about the diagnosis of heart diseases. The objective of this paper is to review recently published preprocessing, feature extraction and classification techniques and the state of the art in phonocardiogram (PCG) signal analysis. Published literature reviewed in this paper shows the potential of machine learning techniques as a design tool in PCG CAD systems and reveals that CAD systems for PCG signal analysis are still an open problem. Related studies are compared with respect to their datasets, feature extraction techniques and the classifiers they used. Current achievements and limitations in developing CAD systems for PCG signal analysis using machine learning techniques are presented and discussed. In the light of this review, a number of future research directions for PCG signal analysis are provided.
Breast Cancer Recognition Using a Novel Hybrid Intelligent Method

PubMed Central

Addeh, Jalil; Ebrahimzadeh, Ata

2012-01-01

Breast cancer is the second largest cause of cancer deaths among women. At the same time, it is also among the most curable cancer types if it can be diagnosed early. This paper presents a novel hybrid intelligent method for recognition of breast cancer tumors. The proposed method includes three main modules: the feature extraction module, the classifier module, and the optimization module. In the feature extraction module, fuzzy features are proposed as the efficient characteristic of the patterns. In the classifier module, because of the promising generalization capability of support vector machines (SVM), an SVM-based classifier is proposed. In support vector machine training, the hyperparameters play a very important role in recognition accuracy. Therefore, in the optimization module, the bees algorithm (BA) is proposed for selecting appropriate parameters of the classifier. The proposed system is tested on the Wisconsin Breast Cancer database and simulation results show that the recommended system has a high accuracy. PMID:23626945
Machine learning in soil classification.

PubMed

Bhattacharya, B; Solomatine, D P

2006-03-01

In a number of engineering problems, e.g. in geotechnics and petroleum engineering, intervals of measured series data (signals) must be assigned a class while maintaining the constraint of contiguity, and standard classification methods can be inadequate. Classification in this case requires the involvement of an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. Firstly, a segmentation algorithm is developed and applied to segment the measured signals. Secondly, the salient features of these segments are extracted using the boundary energy method. Classifiers are then built to assign classes to the segments based on the measured data and extracted features; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing and satisfactory results were obtained.
Mutual information, neural networks and the renormalization group

NASA Astrophysics Data System (ADS)

Koch-Janusz, Maciej; Ringel, Zohar

2018-06-01

Physical systems differing in their microscopic details often display strikingly similar behaviour when probed at macroscopic scales. Those universal properties, largely determining their physical characteristics, are revealed by the powerful renormalization group (RG) procedure, which systematically retains 'slow' degrees of freedom and integrates out the rest. However, the important degrees of freedom may be difficult to identify. Here we demonstrate a machine-learning algorithm capable of identifying the relevant degrees of freedom and executing RG steps iteratively without any prior knowledge about the system. We introduce an artificial neural network based on a model-independent, information-theoretic characterization of a real-space RG procedure, which performs this task. We apply the algorithm to classical statistical physics problems in one and two dimensions. We demonstrate RG flow and extract the Ising critical exponent. Our results demonstrate that machine-learning techniques can extract abstract physical concepts and consequently become an integral part of theory- and model-building.

Hunting for Hydrothermal Vents at the Local-Scale Using AUV's and Machine-Learning Classification in the Earth's Oceans

NASA Astrophysics Data System (ADS)

White, S. M.

2018-05-01

New AUV-based mapping technology coupled with machine-learning methods for detecting individual vents and vent fields at the local-scale raise the possibility of understanding the geologic controls on hydrothermal venting.
[Effect of the harvest season on the composition of raw and fermented cotyledons of 2 varieties of cacao and shell fractions].

PubMed

de Dios Alvarado, J; Villacís, F E; Zamora, G F

1983-06-01

A study was carried out wherein during the period August 1979 to January 1980, samples of raw and fermented cacao were analyzed monthly. These included two varieties: Arriba, taken from a farm in "Quevedo", and the EET-19, grown in "Pichilingüe" by the Instituto Nacional de Investigaciones Agropecuarias (INIAP). Taking the ear of cacao as a basis, the weight of its main parts was determined. The proximal composition was established in the cotyledons, with significant statistical differences in regard to moisture, protein, and ether extract content according to the month of harvest. As to the fermentation process, differences in moisture, ether extract and ash content were detected; differences in the ether extract and ash content were found between the two varieties. The fat extracted from the cotyledons presented different iodine, saponification and acidity index values between the raw and fermented samples, but none were determined between the varieties; as far as the month of harvest is concerned, differences in the acidity index were observed. The percentage composition of the main fatty acids is reported (palmitic, stearic, oleic, and linoleic acids). In order to suggest possible industrial ways of utilizing the cacao shell by-product which is discarded by the shelling machine, the chemical characteristics of five fractions were determined based on the functioning of the shelling machine. The moisture, protein, ether extract, ash, crude fiber, theobromine, and caffeine contents varied among the fractions, and it was dependent on the broken "nibs" content. Differences in the protein, ether extract, and ash content, according to the months of production, were found. Obviously, the high fat content in fractions A (fine dust) and B (fine ground), which varied from 30 to 11 g/100 g, merits its extraction; the remainder meal has a valuable protein and alkaloid content. 
The chemical characteristics of the fat extracted from the shell of two fractions were similar to the fat extracted from the cotyledons.
[Human machines--mechanical humans? The industrial arrangement of the relation between human being and machine on the basis of psychotechnik and Georg Schlesingers work with disabled soldiers].

PubMed

Patzel-Mattern, Katja

2005-01-01

The 20th century is the century of technical artefacts. With their existence and use they create an artificial reality, within which humans have to position themselves. Psychotechnik is an attempt to enable humans to achieve this positioning. It gained importance in Germany after World War I and had its heyday between 1919 and 1926. On the basis of the activity of the engineer and supporter of Psychotechnik Georg Schlesinger, whose particular interest was disabled soldiers, this essay investigates how Psychotechnik, as an applied science, understood the body and the human being. It turns out that the biggest achievement of Psychotechnik was to establish a new view of the relation between human being and machine. Thus it helped to show that the human-machine interface is a shapable unit. Psychotechnik sees the human body and its physique as the last instance for the design of machines. Its main concern is to optimize the relation between human being and machine rather than to standardize human beings according to the construction of machines. After its splendid rise during the Weimar Republic and its rapid decline since the late 1920s, Psychotechnik nowadays gains scientific attention as a historical phenomenon. The main attention in the current discourse lies on the aspects concerning philosophy of science: the unity of body and soul, the understanding of the human-machine interface as a shapable unit and the human being as a last instance of this unit.
A System for Automated Extraction of Metadata from Scanned Documents using Layout Recognition and String Pattern Search Models

PubMed Central

Misra, Dharitri; Chen, Siyuan; Thoma, George R.

2010-01-01

One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system. PMID:21179386
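The rule-based string pattern search described above can be pictured with a small sketch: each metadata field has a pattern that is matched against the text of a recognized layout zone. The field names, the regular expressions, and the sample text below are all hypothetical and chosen only for illustration; they are not the NLM system's actual rules or data.

```python
import re

MONTHS = (r"January|February|March|April|May|June|July|"
          r"August|September|October|November|December")

# Hypothetical field patterns, for illustration only.
FIELD_PATTERNS = {
    "case_number": re.compile(r"\bCase\s+No\.?\s*(\d+)", re.I),
    "date": re.compile(r"\b((?:" + MONTHS + r")\s+\d{1,2},\s+\d{4})"),
}


def extract_metadata(text):
    """Return the first match for every field whose pattern is found."""
    found = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[field] = match.group(1)
    return found
```

A field with no match is simply omitted from the result, which lets a caller flag such records for manual review.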
JPRS Report, Soviet Union KOMMUNIST No 4, March 1988

DTIC Science & Technology

1988-05-09

comrades V.I. Boldin, head of the CPSU Central Committee General Department; N.V. Geliert, mechanizer, Sovkhoz imeni Amangeldy, Kazakh SSR; and V.l...of enterprises with materials and raw materials, "extracting" freight cars, machines and cement, etc. In frequent cases the sectorial department of...does not fear change, who is oriented toward change and who has development possibilities. The new ideals and ideas shared by such people, extracted
Machine Learning in Radiology: Applications Beyond Image Interpretation.

PubMed

Lakhani, Paras; Prater, Adam B; Hutson, R Kent; Andriole, Kathy P; Dreyer, Keith J; Morey, Jose; Prevedello, Luciano M; Clark, Toshi J; Geis, J Raymond; Itri, Jason N; Hawkins, C Matthew

2018-02-01

Much attention has been given to machine learning and its perceived impact in radiology, particularly in light of recent success with image classification in international competitions. However, machine learning is likely to impact radiology outside of image interpretation long before a fully functional "machine radiologist" is implemented in practice. Here, we describe an overview of machine learning, its application to radiology and other domains, and many cases of use that do not involve image interpretation. We hope that better understanding of these potential applications will help radiology practices prepare for the future and realize performance improvement and efficiency gains. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Start-up and control method and apparatus for resonant free piston Stirling engine

DOEpatents

Walsh, Michael M.

1984-01-01

A resonant free-piston Stirling engine having a new and improved start-up and control method and system. A displacer linear electrodynamic machine is provided having an armature secured to and movable with the displacer and having a stator supported by the Stirling engine housing in juxtaposition to the armature. A control excitation circuit is provided for electrically exciting the displacer linear electrodynamic machine with electrical excitation signals having substantially the same frequency as the desired frequency of operation of the Stirling engine. The excitation control circuit is designed so that it selectively and controllably causes the displacer electrodynamic machine to function either as a generator load to extract power from the displacer or the control circuit selectively can be operated to cause the displacer electrodynamic machine to operate as an electric drive motor to apply additional input power to the displacer in addition to the thermodynamic power feedback to the displacer whereby the displacer linear electrodynamic machine also is used in the electric drive motor mode as a means for initially starting the resonant free-piston Stirling engine.
Antepartum fetal heart rate feature extraction and classification using empirical mode decomposition and support vector machine

PubMed Central

2011-01-01

Background: Cardiotocography (CTG) is the most widely used tool for fetal surveillance. The visual analysis of fetal heart rate (FHR) traces largely depends on the expertise and experience of the clinician involved. Several approaches have been proposed for the effective interpretation of FHR. In this paper, a new approach for FHR feature extraction based on empirical mode decomposition (EMD) is proposed, which was used along with support vector machine (SVM) for the classification of FHR recordings as 'normal' or 'at risk'. Methods: The FHR were recorded from 15 subjects at a sampling rate of 4 Hz and a dataset consisting of 90 randomly selected records of 20 minutes duration was formed from these. All records were labelled as 'normal' or 'at risk' by two experienced obstetricians. A training set was formed by 60 records, the remaining 30 left as the testing set. The standard deviations of the EMD components are input as features to a support vector machine (SVM) to classify FHR samples. Results: For the training set, a five-fold cross validation test resulted in an accuracy of 86% whereas the overall geometric mean of sensitivity and specificity was 94.8%. The Kappa value for the training set was 0.923. Application of the proposed method to the testing set (30 records) resulted in a geometric mean of 81.5%. The Kappa value for the testing set was 0.684. Conclusions: Based on the overall performance of the system it can be stated that the proposed methodology is a promising new approach for the feature extraction and classification of FHR signals. PMID:21244712
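The two summary statistics reported above, the geometric mean of sensitivity and specificity and Cohen's kappa, can be computed from a 2x2 confusion matrix as in the following sketch. The function names are assumptions, and the example counts used in any check are invented; they are not the paper's data. The abstract also does not state exactly which pair of labelings its kappa compares, so the formula below is the standard two-rater definition only.

```python
import math

# Samples per record implied by the abstract: 20 minutes at 4 Hz.
SAMPLES_PER_RECORD = 20 * 60 * 4


def geometric_mean(sensitivity, specificity):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity * specificity)


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the 2x2 table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)
```

The geometric mean penalizes a classifier that does well on one class only, which is useful when 'at risk' records are the minority.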
Streamlining machine learning in mobile devices for remote sensing

NASA Astrophysics Data System (ADS)

Coronel, Andrei D.; Estuar, Ma. Regina E.; Garcia, Kyle Kristopher P.; Dela Cruz, Bon Lemuel T.; Torrijos, Jose Emmanuel; Lim, Hadrian Paulo M.; Abu, Patricia Angela R.; Victorino, John Noel C.

2017-09-01

Mobile devices have been at the forefront of Intelligent Farming because of their ubiquitous nature. Applications on precision farming have been developed on smartphones to allow small farms to monitor environmental parameters surrounding crops. Mobile devices are used for most of these applications, collecting data to be sent to the cloud for storage, analysis, modeling and visualization. However, with the issue of weak and intermittent connectivity in geographically challenged areas of the Philippines, the solution is to provide analysis on the phone itself. Given this, the farmer gets a real-time response after data submission. Though Machine Learning is promising, hardware constraints in mobile devices limit the computational capabilities, making model development on the phone restricted and challenging. This study discusses the development of a Machine Learning based mobile application using OpenCV libraries. The objective is to enable the detection of Fusarium oxysporum cubense (Foc) in juvenile and asymptomatic bananas using images of plant parts and microscopic samples as input. Image datasets of attached, unattached, dorsal, and ventral views of leaves were acquired through sampling protocols. Images of raw and stained specimens from soil surrounding the plant, and sap from the plant resulted in stained and unstained samples respectively. Segmentation and feature extraction techniques were applied to all images. Initial findings show no significant differences among the different feature extraction techniques. For differentiating infected from non-infected leaves, KNN yields the highest average accuracy, as opposed to Naive Bayes and SVM. For microscopic images using MSER feature extraction, KNN has been tested as having a better accuracy than SVM or Naive Bayes.
A multi-label, semi-supervised classification approach applied to personality prediction in social media.

PubMed

Lima, Ana Carolina E S; de Castro, Leandro Nunes

2014-10-01

Social media allow web users to create and share content pertaining to different subjects, exposing their activities, opinions, feelings and thoughts. In this context, online social media has attracted the interest of data scientists seeking to understand behaviours and trends, whilst collecting statistics for social sites. One potential application for these data is personality prediction, which aims to understand a user's behaviour within social media. Traditional personality prediction relies on users' profiles, their status updates, the messages they post, etc. Here, a personality prediction system for social media data is introduced that differs from most approaches in the literature, in that it works with groups of texts, instead of single texts, and does not take users' profiles into account. Also, the proposed approach extracts meta-attributes from texts and does not work directly with the content of the messages. The set of possible personality traits is taken from the Big Five model and allows the problem to be characterised as a multi-label classification task. The problem is then transformed into a set of five binary classification problems and solved by means of a semi-supervised learning approach, due to the difficulty in annotating the massive amounts of data generated in social media. In our implementation, the proposed system was trained with three well-known machine-learning algorithms, namely a Naïve Bayes classifier, a Support Vector Machine, and a Multilayer Perceptron neural network. The system was applied to predict the personality of Tweets taken from three datasets available in the literature, and resulted in an approximately 83% accurate prediction, with some of the personality traits presenting better individual classification rates than others. Copyright © 2014 Elsevier Ltd. All rights reserved.
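The transformation mentioned above, from one multi-label task into five binary tasks, is commonly called binary relevance. It can be sketched as follows. The trait names are the standard Big Five labels; the feature vectors and the function name are assumptions for illustration, and the semi-supervised training step itself is not shown.

```python
TRAITS = ("openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism")


def to_binary_problems(samples):
    """Split multi-label samples into one binary dataset per trait.

    `samples` is a list of (features, set_of_traits) pairs; the result maps
    each trait to a list of (features, 0 or 1) pairs.
    """
    return {
        trait: [(features, int(trait in labels)) for features, labels in samples]
        for trait in TRAITS
    }
```

Each of the five resulting datasets can then be handed to any binary classifier, at the cost of ignoring correlations between traits.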
PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework.

PubMed

Song, Jiangning; Li, Fuyi; Takemoto, Kazuhiro; Haffari, Gholamreza; Akutsu, Tatsuya; Chou, Kuo-Chen; Webb, Geoffrey I

2018-04-14

Determining the catalytic residues in an enzyme is critical to our understanding the relationship between protein sequence, structure, function, and enhancing our ability to design novel enzymes and their inhibitors. Although many enzymes have been sequenced, and their primary and tertiary structures determined, experimental methods for enzyme functional characterization lag behind. Because experimental methods used for identifying catalytic residues are resource- and labor-intensive, computational approaches have considerable value and are highly desirable for their ability to complement experimental studies in identifying catalytic residues and helping to bridge the sequence-structure-function gap. In this study, we describe a new computational method called PREvaIL for predicting enzyme catalytic residues. This method was developed by leveraging a comprehensive set of informative features extracted from multiple levels, including sequence, structure, and residue-contact network, in a random forest machine-learning framework. Extensive benchmarking experiments on eight different datasets based on 10-fold cross-validation and independent tests, as well as side-by-side performance comparisons with seven modern sequence- and structure-based methods, showed that PREvaIL achieved competitive predictive performance, with an area under the receiver operating characteristic curve and area under the precision-recall curve ranging from 0.896 to 0.973 and from 0.294 to 0.523, respectively. We demonstrated that this method was able to capture useful signals arising from different levels, leveraging such differential but useful types of features and allowing us to significantly improve the performance of catalytic residue prediction. We believe that this new method can be utilized as a valuable tool for both understanding the complex sequence-structure-function relationships of proteins and facilitating the characterization of novel enzymes lacking functional annotations. 
Copyright © 2018 Elsevier Ltd. All rights reserved.
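The two metrics quoted in the abstract above can be computed from ranked prediction scores. The following is a minimal sketch, not the PREvaIL evaluation code: the function names and the ten residue scores are hypothetical, the ROC area uses the rank-sum identity, and the precision-recall area is approximated by average precision.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at each recalled positive
    return ap / total_pos


# Hypothetical residue-level scores: 2 catalytic residues among 10.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.35, 0.25, 0.15]
auroc = roc_auc(labels, scores)
auprc = average_precision(labels, scores)
```

On this toy input a single high-scoring false positive lowers the precision-recall area more than the ROC area, which is one reason the two ranges can differ widely when catalytic residues are rare.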
Machine Learning-based Classification of Diffuse Large B-cell Lymphoma Patients by Their Protein Expression Profiles.

PubMed

Deeb, Sally J; Tyanova, Stefka; Hummel, Michael; Schmidt-Supprian, Marc; Cox, Juergen; Mann, Matthias

2015-11-01

Characterization of tumors at the molecular level has improved our knowledge of cancer causation and progression. Proteomic analysis of their signaling pathways promises to enhance our understanding of cancer aberrations at the functional level, but this requires accurate and robust tools. Here, we develop a state of the art quantitative mass spectrometric pipeline to characterize formalin-fixed paraffin-embedded tissues of patients with closely related subtypes of diffuse large B-cell lymphoma. We combined a super-SILAC approach with label-free quantification (hybrid LFQ) to address situations where the protein is absent in the super-SILAC standard but present in the patient samples. Shotgun proteomic analysis on a quadrupole Orbitrap quantified almost 9,000 tumor proteins in 20 patients. The quantitative accuracy of our approach allowed the segregation of diffuse large B-cell lymphoma patients according to their cell of origin using both their global protein expression patterns and the 55-protein signature obtained previously from patient-derived cell lines (Deeb, S. J., D'Souza, R. C., Cox, J., Schmidt-Supprian, M., and Mann, M. (2012) Mol. Cell. Proteomics 11, 77-89). Expression levels of individual segregation-driving proteins as well as categories such as extracellular matrix proteins behaved consistently with known trends between the subtypes. We used machine learning (support vector machines) to extract candidate proteins with the highest segregating power. A panel of four proteins (PALD1, MME, TNFAIP8, and TBC1D4) is predicted to classify patients with low error rates. Highly ranked proteins from the support vector analysis revealed differential expression of core signaling molecules between the subtypes, elucidating aspects of their pathobiology. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Why so GLUMM? Detecting depression clusters through graphing lifestyle-environs using machine-learning methods (GLUMM).

PubMed

Dipnall, J F; Pasco, J A; Berk, M; Williams, L J; Dodd, S; Jacka, F N; Meyer, D

2017-01-01

Key lifestyle-environ risk factors are operative for depression, but it is unclear how risk factors cluster. Machine-learning (ML) algorithms exist that learn, extract, identify and map underlying patterns to identify groupings of depressed individuals without constraints. The aim of this research was to use a large epidemiological study to identify and characterise depression clusters through "Graphing lifestyle-environs using machine-learning methods" (GLUMM). Two ML algorithms were implemented: unsupervised Self-organised mapping (SOM) to create GLUMM clusters and a supervised boosted regression algorithm to describe clusters. Ninety-six "lifestyle-environ" variables were used from the National health and nutrition examination study (2009-2010). Multivariate logistic regression validated clusters and controlled for possible sociodemographic confounders. The SOM identified two GLUMM cluster solutions. These solutions contained one dominant depressed cluster (GLUMM5-1, GLUMM7-1). Equal proportions of members in each cluster rated as highly depressed (17%). Alcohol consumption and demographics validated clusters. Boosted regression identified GLUMM5-1 as more informative than GLUMM7-1. Members were more likely to: have problems sleeping; unhealthy eating; ≤2 years in their home; an old home; perceive themselves underweight; exposed to work fumes; experienced sex at ≤14 years; not perform moderate recreational activities. A positive relationship between GLUMM5-1 (OR: 7.50, P<0.001) and GLUMM7-1 (OR: 7.88, P<0.001) with depression was found, with significant interactions with those married/living with partner (P=0.001). Using ML based GLUMM to form ordered depressive clusters from multitudinous lifestyle-environ variables enabled a deeper exploration of the heterogeneous data to uncover better understandings into relationships between the complex mental health factors. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Feature extraction in MFL signals of machined defects in steel tubes

NASA Astrophysics Data System (ADS)

Perazzo, R.; Pignotti, A.; Reich, S.; Stickar, P.

2001-04-01

Thirty defects of various shapes were machined on the external and internal wall surfaces of a 177 mm diameter ferromagnetic steel pipe. MFL signals were digitized and recorded at a frequency of 4 kHz. Various magnetizing currents and relative tube-probe velocities of the order of 2 m/s were used. The identification of the location of the defect by a principal component/neural network analysis of the signal is shown to be more effective than the standard procedure of classification based on the average signal frequency.
Receptive fields and the theory of discriminant operators

NASA Astrophysics Data System (ADS)

Gupta, Madan M.; Hungenahally, Suresh K.

1991-02-01

The biological basis for machine vision is a notion that is used extensively in the development of machine vision systems for various applications. In this paper we attempt to emulate the receptive fields that exist in the biological visual channels. In particular, we exploit the notion of receptive fields to develop mathematical functions, named discriminant functions, for the extraction of transition information from signals, multi-dimensional signals, and images. These functions are found to be useful for the development of artificial receptive fields for neuro-vision systems.
Ejected Particle Size Distributions from Shocked Metal Surfaces

DOE PAGES

Schauer, M. M.; Buttler, W. T.; Frayer, D. K.; ...

2017-04-12

Here, we present size distributions for particles ejected from features machined onto the surface of shocked Sn targets. The functional form of the size distributions is assumed to be log-normal, and the characteristic parameters of the distribution are extracted from the measured angular distribution of light scattered from a laser beam incident on the ejected particles. We also found strong evidence for a bimodal distribution of particle sizes with smaller particles evolved from features machined into the target surface and larger particles being produced at the edges of these features.
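As an illustration of the functional form named in the abstract above, the sketch below evaluates a log-normal density and a two-component mixture of the kind a bimodal size distribution suggests. The parameter values, the mixture weight, and the function names are hypothetical and are not taken from the measurements.

```python
import math


def lognormal_pdf(d, mu, sigma):
    """Log-normal density for a diameter d > 0; mu and sigma describe ln(d)."""
    z = (math.log(d) - mu) / sigma
    return math.exp(-0.5 * z * z) / (d * sigma * math.sqrt(2.0 * math.pi))


def bimodal_pdf(d, weight, small, large):
    """Two-component log-normal mixture; weight is the small-mode fraction."""
    return (weight * lognormal_pdf(d, *small)
            + (1.0 - weight) * lognormal_pdf(d, *large))


# Hypothetical (mu, sigma) pairs, diameters in micrometres; illustrative only.
small_mode = (math.log(1.0), 0.4)
large_mode = (math.log(8.0), 0.3)

# Sanity check: trapezoidal integration on a log-spaced grid should give ~1.
grid = [math.exp(-5.0 + 10.0 * i / 4000) for i in range(4001)]
density = [bimodal_pdf(d, 0.7, small_mode, large_mode) for d in grid]
area = sum(0.5 * (density[i] + density[i + 1]) * (grid[i + 1] - grid[i])
           for i in range(4000))
```

With these assumed values the density has one peak near 1 and another near 8, with a dip between them, which is the signature a bimodal fit would look for.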
Ejected Particle Size Distributions from Shocked Metal Surfaces

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schauer, M. M.; Buttler, W. T.; Frayer, D. K.

Here, we present size distributions for particles ejected from features machined onto the surface of shocked Sn targets. The functional form of the size distributions is assumed to be log-normal, and the characteristic parameters of the distribution are extracted from the measured angular distribution of light scattered from a laser beam incident on the ejected particles. We also found strong evidence for a bimodal distribution of particle sizes with smaller particles evolved from features machined into the target surface and larger particles being produced at the edges of these features.
New approach for cognitive analysis and understanding of medical patterns and visualizations

NASA Astrophysics Data System (ADS)

Ogiela, Marek R.; Tadeusiewicz, Ryszard

2003-11-01

This paper presents new opportunities for applying linguistic description of the picture merit content and AI methods to the automatic understanding of image semantics in intelligent medical information systems. Successfully obtaining the crucial semantic content of a medical image may contribute considerably to the creation of new intelligent multimedia cognitive medical systems. Thanks to the new idea of cognitive resonance between the stream of data extracted from the image using linguistic methods and expectations taken from the representation of medical knowledge, it is possible to understand the merit content of the image even if the form of the image is very different from any known pattern. This article proves that structural techniques of artificial intelligence may be applied to tasks related to automatic classification and machine perception based on semantic pattern content in order to determine the semantic meaning of the patterns. The paper describes some examples of applying such techniques in the creation of cognitive vision systems for selected classes of medical images. On the basis of the research described in the paper, we try to build new systems for collecting, storing, retrieving, and intelligently interpreting selected medical images, especially those obtained in radiological and MRI examinations.
Work extraction from quantum systems with bounded fluctuations in work.

PubMed

Richens, Jonathan G; Masanes, Lluis

2016-11-25

In the standard framework of thermodynamics, work is a random variable whose average is bounded by the change in free energy of the system. This average work is calculated without regard for the size of its fluctuations. Here we show that for some processes, such as reversible cooling, the fluctuations in work diverge. Realistic thermal machines may be unable to cope with arbitrarily large fluctuations. Hence, it is important to understand how thermodynamic efficiency rates are modified by bounding fluctuations. We quantify the work content and work of formation of arbitrary finite dimensional quantum states when the fluctuations in work are bounded by a given amount c. By varying c we interpolate between the standard and minimum free energies. We derive fundamental trade-offs between the magnitude of work and its fluctuations. As one application of these results, we derive the corrected Carnot efficiency of a qubit heat engine with bounded fluctuations.
Work extraction from quantum systems with bounded fluctuations in work

PubMed Central

Richens, Jonathan G.; Masanes, Lluis

2016-01-01

In the standard framework of thermodynamics, work is a random variable whose average is bounded by the change in free energy of the system. This average work is calculated without regard for the size of its fluctuations. Here we show that for some processes, such as reversible cooling, the fluctuations in work diverge. Realistic thermal machines may be unable to cope with arbitrarily large fluctuations. Hence, it is important to understand how thermodynamic efficiency rates are modified by bounding fluctuations. We quantify the work content and work of formation of arbitrary finite dimensional quantum states when the fluctuations in work are bounded by a given amount c. By varying c we interpolate between the standard and minimum free energies. We derive fundamental trade-offs between the magnitude of work and its fluctuations. As one application of these results, we derive the corrected Carnot efficiency of a qubit heat engine with bounded fluctuations. PMID:27886177

Work extraction from quantum systems with bounded fluctuations in work

NASA Astrophysics Data System (ADS)

Richens, Jonathan G.; Masanes, Lluis

2016-11-01

In the standard framework of thermodynamics, work is a random variable whose average is bounded by the change in free energy of the system. This average work is calculated without regard for the size of its fluctuations. Here we show that for some processes, such as reversible cooling, the fluctuations in work diverge. Realistic thermal machines may be unable to cope with arbitrarily large fluctuations. Hence, it is important to understand how thermodynamic efficiency rates are modified by bounding fluctuations. We quantify the work content and work of formation of arbitrary finite dimensional quantum states when the fluctuations in work are bounded by a given amount c. By varying c we interpolate between the standard and minimum free energies. We derive fundamental trade-offs between the magnitude of work and its fluctuations. As one application of these results, we derive the corrected Carnot efficiency of a qubit heat engine with bounded fluctuations.
The Adam and Eve Robot Scientists for the Automated Discovery of Scientific Knowledge

NASA Astrophysics Data System (ADS)

King, Ross

A Robot Scientist is a physically implemented robotic system that applies techniques from artificial intelligence to execute cycles of automated scientific experimentation. A Robot Scientist can automatically execute cycles of hypothesis formation, selection of efficient experiments to discriminate between hypotheses, execution of experiments using laboratory automation equipment, and analysis of results. The motivation for developing Robot Scientists is to better understand science, and to make scientific research more efficient. The Robot Scientist 'Adam' was the first machine to autonomously discover scientific knowledge: it both formed and experimentally confirmed novel hypotheses. Adam worked in the domain of yeast functional genomics. The Robot Scientist 'Eve' was originally developed to automate early-stage drug development, with specific application to neglected tropical diseases such as malaria and African sleeping sickness. We are now adapting Eve to work on cancer. We are also teaching Eve to autonomously extract information from the scientific literature.
Consequences of Part Temperature Variability in Electron Beam Melting of Ti-6Al-4V

NASA Astrophysics Data System (ADS)

Fisher, Brian A.; Mireles, Jorge; Ridwan, Shakerur; Wicker, Ryan B.; Beuth, Jack

2017-12-01

To facilitate adoption of Ti-6Al-4V (Ti64) parts produced via additive manufacturing (AM), the ability to ensure part quality is critical. Measuring temperatures is an important component of part quality monitoring in all direct metal AM processes. In this work, surface temperatures were monitored using a custom infrared camera system attached to an Arcam electron beam melting (EBM®) machine. These temperatures were analyzed to understand their possible effect on solidification microstructure based on solidification cooling rates extracted from finite element simulations. Complicated thermal histories were seen during part builds, and temperature changes occurring during typical Ti64 builds may be large enough to affect solidification microstructure. There is, however, enough time between fusion of individual layers for spatial temperature variations (i.e., hot spots) to dissipate. This means that an effective thermal control strategy for EBM® can be based on average measured surface temperatures, ignoring temperature variability.
Learning to File: Reconfiguring Information and Information Work in the Early Twentieth Century.

PubMed

Robertson, Craig

2017-01-01

This article uses textbooks and advertisements to explore the formal and informal ways in which people were introduced to vertical filing in the early twentieth century. Through the privileging of "system" an ideal mode of paperwork emerged in which a clerk could "grasp" information simply by hand without having to understand or comprehend its content. A file clerk's hands and fingers became central to the representation and teaching of filing. In this way, filing offered an example of a distinctly modern form of information work. Filing textbooks sought to enhance dexterity as the rapid handling of paper came to represent information as something that existed in discrete units, in bits that could be easily extracted. Advertisements represented this mode of information work in its ideal form when they frequently erased the worker or reduced him or her to hands, as "instant" filing became "automatic" filing, with the filing cabinet presented as a machine.
SU-D-BRA-07: A Phantom Study to Assess the Variability in Radiomics Features Extracted From Cone-Beam CT Images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fave, X; Fried, D; UT Health Science Center Graduate School of Biomedical Sciences, Houston, TX

2015-06-15

Purpose: Several studies have demonstrated the prognostic potential for texture features extracted from CT images of non-small cell lung cancer (NSCLC) patients. The purpose of this study was to determine if these features could be extracted with high reproducibility from cone-beam CT (CBCT) images in order for features to be easily tracked throughout a patient’s treatment. Methods: Two materials in a radiomics phantom, designed to approximate NSCLC tumor texture, were used to assess the reproducibility of 26 features. This phantom was imaged on 9 CBCT scanners, including Elekta and Varian machines. Thoracic and head imaging protocols were acquired on each machine. CBCT images from 27 NSCLC patients imaged using the thoracic protocol on Varian machines were obtained for comparison. The variance for each texture measured from these patients was compared to the variance in phantom values for different manufacturer/protocol subsets. Levene’s test was used to identify features which had a significantly smaller variance in the phantom scans versus the patient data. Results: Approximately half of the features (13/26 for material1 and 15/26 for material2) had a significantly smaller variance (p<0.05) between Varian thoracic scans of the phantom compared to patient scans. Many of these same features remained significant for the head scans on Varian (12/26 and 8/26). However, when thoracic scans from Elekta and Varian were combined, only a few features were still significant (4/26 and 5/26). Three features (skewness, coarsely filtered mean and standard deviation) were significant in almost all manufacturer/protocol subsets. Conclusion: Texture features extracted from CBCT images of a radiomics phantom are reproducible and show significantly less variation than the same features measured from patient images when images from the same manufacturer or with similar parameters are used. Reproducibility between CBCT scanners may be high enough to allow the extraction of meaningful texture values for patients. This project was funded in part by the Cancer Prevention Research Institute of Texas (CPRIT). Xenia Fave is a recipient of the American Association of Physicists in Medicine Graduate Fellowship.
Probabilistic machine learning and artificial intelligence.

PubMed

Ghahramani, Zoubin

2015-05-28

How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.
Probabilistic machine learning and artificial intelligence

NASA Astrophysics Data System (ADS)

Ghahramani, Zoubin

2015-05-01

How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.
Literature classification for semi-automated updating of biological knowledgebases

PubMed Central

2013-01-01

Background: As the output of biological assays increases in resolution and volume, the body of specialized biological data, such as functional annotations of gene and protein sequences, enables extraction of higher-level knowledge needed for practical application in bioinformatics. Whereas common types of biological data, such as sequence data, are extensively stored in biological databases, functional annotations, such as immunological epitopes, are found primarily in semi-structured formats or free text embedded in primary scientific literature. Results: We defined and applied a machine learning approach for literature classification to support updating of TANTIGEN, a knowledgebase of tumor T-cell antigens. Abstracts from PubMed were downloaded and classified as either "relevant" or "irrelevant" for database update. Training and five-fold cross-validation of a k-NN classifier on 310 abstracts yielded a classification accuracy of 0.95, thus showing significant value in support of data extraction from the literature. Conclusion: We here propose a conceptual framework for semi-automated extraction of epitope data embedded in scientific literature using principles from text mining and machine learning. The addition of such data will aid in the transition of biological databases to knowledgebases. PMID:24564403
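The classification step described above can be sketched with a small k-NN text classifier and five-fold cross-validation. This is an illustrative sketch only: the bag-of-words features, cosine similarity, k = 3, interleaved fold assignment, and the ten toy abstracts are all assumptions, not details of the TANTIGEN pipeline.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one abstract."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def knn_predict(train, query, k=3):
    """Majority label among the k training abstracts most similar to query."""
    q = vectorize(query)
    ranked = sorted(train, key=lambda item: cosine(vectorize(item[0]), q),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validate(data, folds=5, k=3):
    """Accuracy from folds-fold cross-validation with interleaved folds."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, text, k) == label
                       for text, label in test)
    return correct / len(data)


# Hypothetical toy corpus standing in for labelled PubMed abstracts.
data = [
    ("tumor antigen epitope recognized by t cells", "relevant"),
    ("melanoma antigen peptide epitope t cell response", "relevant"),
    ("tumor peptide epitope presented to cytotoxic t cells", "relevant"),
    ("novel tumor antigen epitope identified in melanoma", "relevant"),
    ("t cell epitope from tumor antigen", "relevant"),
    ("soil bacteria nitrogen fixation in crops", "irrelevant"),
    ("crop yield under drought and nitrogen stress", "irrelevant"),
    ("bacteria diversity in agricultural soil", "irrelevant"),
    ("drought tolerance of wheat crops", "irrelevant"),
    ("nitrogen uptake by soil bacteria in wheat", "irrelevant"),
]
accuracy = cross_validate(data)
```

The toy corpus is deliberately easy to separate, so the resulting accuracy says nothing about the 0.95 reported for the real abstracts; the sketch only shows how the train/test split and the majority vote fit together.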
Extraction and classification of 3D objects from volumetric CT data

NASA Astrophysics Data System (ADS)

Song, Samuel M.; Kwon, Junghyun; Ely, Austin; Enyeart, John; Johnson, Chad; Lee, Jongkyu; Kim, Namho; Boyd, Douglas P.

2016-05-01

We propose an Automatic Threat Detection (ATD) algorithm for an Explosive Detection System (EDS) using our multi-stage Segmentation and Carving (SC) followed by a Support Vector Machine (SVM) classifier. The multi-stage SC step extracts all suspect 3-D objects. A feature vector is then constructed for each extracted object and classified by the SVM, previously trained on a set of ground-truth threat and benign objects. The learned SVM classifier has been shown to be effective in classifying different types of threat materials. The proposed ATD algorithm robustly deals with CT data that are prone to artifacts due to scatter, beam hardening, and other systematic idiosyncrasies of the CT data. Furthermore, the proposed ATD algorithm is amenable to including newly emerging threat materials as well as to accommodating data from newly developing sensor technologies. Efficacy of the proposed ATD algorithm with the SVM classifier is demonstrated by the Receiver Operating Characteristic (ROC) curve, which gives the Probability of Detection (PD) as a function of the Probability of False Alarm (PFA). The tests performed using CT data of passenger bags show excellent performance characteristics.
ANN based Performance Evaluation of BDI for Condition Monitoring of Induction Motor Bearings

NASA Astrophysics Data System (ADS)

Patel, Raj Kumar; Giri, V. K.

2017-06-01

Bearings are among the critical parts of rotating machines, and most failures arise from defective bearings. Bearing failure leads to machine failure and unpredicted productivity loss. Therefore, bearing fault detection and prognosis is an integral part of preventive maintenance procedures. In this paper, vibration signals for four conditions of a deep groove ball bearing, normal (N), inner race defect (IRD), ball defect (BD) and outer race defect (ORD), were acquired from a customized bearing test rig under four different conditions and three different fault sizes. Two approaches were adopted for statistical feature extraction from the vibration signal. In the first approach, the raw signal is used for statistical feature extraction; in the second, the statistical features are extracted based on the bearing damage index (BDI). The proposed BDI technique uses the wavelet packet node energy coefficient analysis method. Both feature sets are used as inputs to an ANN classifier to evaluate its performance. ANN performance is compared between raw vibration data and data chosen using BDI. The ANN performance was found to be higher when BDI-based signals were used as inputs to the classifier.
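The first approach above, statistical features taken from the raw vibration signal, can be sketched as follows. The abstract does not list which statistics were used, so the set below (RMS, crest factor, skewness, kurtosis and so on) is an assumption based on common practice, and the signals are synthetic. The BDI and wavelet packet steps are not reproduced here.

```python
import math


def statistical_features(signal):
    """Common time-domain statistics of one vibration record."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance = sum(c * c for c in centered) / n
    std = math.sqrt(variance)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "skewness": sum(c ** 3 for c in centered) / (n * std ** 3),
        "kurtosis": sum(c ** 4 for c in centered) / (n * variance ** 2),
    }


# Synthetic stand-ins: a clean sine, and the same sine with periodic impacts.
healthy = [math.sin(2.0 * math.pi * 5 * i / 1000) for i in range(1000)]
faulty = [x + (3.0 if i % 100 == 0 else 0.0) for i, x in enumerate(healthy)]
healthy_stats = statistical_features(healthy)
faulty_stats = statistical_features(faulty)
```

A pure sine has a kurtosis of 1.5, while the added impacts raise it sharply, which is why impulsive statistics are popular inputs for bearing fault classifiers.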
Classifying Physical Morphology of Cocoa Beans Digital Images using Multiclass Ensemble Least-Squares Support Vector Machine

NASA Astrophysics Data System (ADS)

Lawi, Armin; Adhitya, Yudhi

2018-03-01

The objective of this research is to determine the quality of cocoa beans through the morphology of their digital images. Samples of cocoa beans were scattered on bright white paper under controlled lighting conditions. A compact digital camera was used to capture the images. The images were then processed to extract their morphological parameters. The classification process begins with an analysis of the cocoa bean images based on morphological feature extraction. The morphological (physical) feature parameters extracted are Area, Perimeter, Major Axis Length, Minor Axis Length, Aspect Ratio, Circularity, Roundness, and Feret Diameter. The cocoa beans are classified into 4 groups: Normal Beans, Broken Beans, Fractured Beans, and Skin Damaged Beans. The classification model used in this paper is the Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM), a proposed improvement of the SVM using an ensemble method in which the separating hyperplanes are obtained by a least-squares approach and the multiclass procedure uses the One-Against-All method. Results show that the proposed model, using the morphological features as input parameters, reached 99.705% accuracy for the four classes.
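Three of the listed parameters are ratios derived from the basic measurements. The sketch below uses ImageJ-style definitions, which is an assumption since the abstract does not say which definitions the authors applied. The function name and the test shapes are hypothetical.

```python
import math


def shape_descriptors(area, perimeter, major_axis, minor_axis):
    """Derived shape ratios from basic region measurements (pixel units)."""
    return {
        "aspect_ratio": major_axis / minor_axis,
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "roundness": 4.0 * area / (math.pi * major_axis ** 2),
    }


# A perfect circle of radius 10 should score 1.0 on all three ratios.
r = 10.0
circle = shape_descriptors(math.pi * r * r, 2.0 * math.pi * r, 2.0 * r, 2.0 * r)

# An ellipse with semi-axes 20 and 10; perimeter from Ramanujan's approximation.
a, b = 20.0, 10.0
perimeter = math.pi * (3.0 * (a + b) - math.sqrt((3.0 * a + b) * (a + 3.0 * b)))
ellipse = shape_descriptors(math.pi * a * b, perimeter, 2.0 * a, 2.0 * b)
```

Elongated or irregular outlines push circularity and roundness below 1, so these ratios give a classifier scale-independent cues alongside the raw area and perimeter.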
Epileptic Seizures Prediction Using Machine Learning Methods

PubMed Central

Usman, Syed Muhammad

2017-01-01

Epileptic seizures occur due to a disorder in brain functionality and can affect a patient's health. Predicting epileptic seizures before their onset is useful for preventing the seizure by medication. Machine learning techniques and computational methods are used for predicting epileptic seizures from electroencephalogram (EEG) signals. However, preprocessing of EEG signals for noise removal and feature extraction are two major issues that have an adverse effect on both anticipation time and true positive prediction rate. Therefore, we propose a model that provides reliable methods for both preprocessing and feature extraction. Our model predicts epileptic seizures a sufficient time before seizure onset and provides a better true positive rate. We have applied empirical mode decomposition (EMD) for preprocessing and have extracted time- and frequency-domain features for training a prediction model. The proposed model detects the start of the preictal state, which is the state that starts a few minutes before the onset of the seizure, with a higher true positive rate than traditional methods (92.23%), a maximum anticipation time of 33 minutes, and an average prediction time of 23.6 minutes on the scalp EEG CHB-MIT dataset of 22 subjects. PMID:29410700
An Android malware detection system based on machine learning

NASA Astrophysics Data System (ADS)

Wen, Long; Yu, Haiyang

2017-08-01

The Android smartphone, with its open source character and excellent performance, has attracted many users. However, the convenience of the Android platform has also motivated the development of malware. The traditional method, which detects malware based on signatures, is unable to detect unknown applications. The article proposes a machine learning-based lightweight system that is capable of identifying malware on Android devices. In this system we extract features based on static analysis and dynamic analysis; a new feature selection approach based on principal component analysis (PCA) and Relief is then presented to decrease the dimensions of the features. After that, a model is constructed with a support vector machine (SVM) for classification. Experimental results show that our system provides an effective method for Android malware detection.
Development of OCR system for portable passport and visa reader

NASA Astrophysics Data System (ADS)

Visilter, Yury V.; Zheltov, Sergey Y.; Lukin, Anton A.

1999-01-01

Modern passport and visa documents include special machine-readable zones that satisfy the ICAO standards. This allows the development of special automatic passport and visa readers. However, there are some special problems in such OCR systems: low resolution of character images captured by a CCD camera (down to 150 dpi), substantial shifts and slopes (up to 10 degrees), rich paper texture under the character symbols, and non-homogeneous illumination. This paper presents the structure and some special aspects of an OCR system for a portable passport and visa reader. In our approach the binarization procedure is performed after the segmentation step, and it is applied to each character site separately. The character recognition procedure uses the structural information of the machine-readable zone. Special algorithms are developed for machine-readable zone extraction and character segmentation.
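One piece of structural information that a machine-readable zone carries is the check digit attached to several fields, computed with repeating weights 7, 3, 1 modulo 10 as specified in ICAO Doc 9303. The sketch below shows how a reader could use it to flag a misrecognized character. It is an illustration only: the abstract does not say whether the authors' system uses check digits, and the function names are hypothetical.

```python
def mrz_char_value(ch):
    """Numeric value of one MRZ character: digits, A-Z as 10-35, filler '<' as 0."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field):
    """Check digit: weighted sum with repeating weights 7, 3, 1, modulo 10."""
    weights = (7, 3, 1)
    return sum(mrz_char_value(c) * weights[i % 3]
               for i, c in enumerate(field)) % 10


def field_is_consistent(field_with_check):
    """True when the trailing digit matches the check digit of the rest."""
    body, check = field_with_check[:-1], field_with_check[-1]
    return "0" <= check <= "9" and mrz_check_digit(body) == int(check)


# Document number and check digit from the widely used ICAO specimen passport.
specimen_ok = field_is_consistent("L898902C36")
# The same field with one character misread by OCR (3 read as 8).
specimen_misread = field_is_consistent("L898902C86")
```

A failed check tells the recognizer that at least one character in the field is wrong, so it can retry with the next most likely candidates for low-confidence characters.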
Machine learning approach for automated screening of malaria parasite using light microscopic images.

PubMed

Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan

2013-02-01

The aim of this paper is to address the development of computer-assisted malaria parasite characterization and classification using a machine learning approach based on light microscopic images of peripheral blood smears. To this end, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection, and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker-controlled watershed transformation, and subsequently a total of ninety-six features describing the shape, size, and texture of erythrocytes are extracted for parasitemia-infected versus non-infected cells. Ninety-four features are found to be statistically significant in discriminating six classes. Here a feature selection-cum-classification scheme has been devised by combining the F-statistic with statistical learning techniques, i.e., Bayesian learning and support vector machine (SVM), in order to provide higher classification accuracy using the best set of discriminating features. Results show that the Bayesian approach provides the highest accuracy, i.e., 84%, for malaria classification by selecting the 19 most significant features, while SVM provides its highest accuracy, i.e., 83.5%, with the 9 most significant features. Finally, the performance of these two classifiers under the feature selection framework has been compared for malaria parasite classification. Copyright © 2012 Elsevier Ltd. All rights reserved.
Detecting Dementia Through Interactive Computer Avatars

PubMed Central

Adachi, Hiroyoshi; Ukita, Norimichi; Ikeda, Manabu; Kazui, Hiroaki; Kudo, Takashi; Nakamura, Satoshi

2017-01-01

This paper proposes a new approach to automatically detect dementia. Even though some works have detected dementia from speech and language attributes, most have applied detection using picture descriptions, narratives, and cognitive tasks. In this paper, we propose a new computer avatar with spoken dialog functionalities that produces spoken queries based on the mini-mental state examination, the Wechsler memory scale-revised, and other related neuropsychological questions. We recorded the interactive data of spoken dialogues from 29 participants (14 dementia and 15 healthy controls) and extracted various audiovisual features. We tried to predict dementia using audiovisual features and two machine learning algorithms (support vector machines and logistic regression). Here, we show that the support vector machines outperformed logistic regression, and by using the extracted features they classified the participants into two groups with 0.93 detection performance, as measured by the areas under the receiver operating characteristic curve. We also newly identified some contributing features, e.g., gap before speaking, the variations of fundamental frequency, voice quality, and the ratio of smiling. We concluded that our system has the potential to detect dementia through spoken dialog systems and that the system can assist health care workers. In addition, these findings could help medical personnel detect signs of dementia. PMID:29018636
A Hybrid Generalized Hidden Markov Model-Based Condition Monitoring Approach for Rolling Bearings

PubMed Central

Liu, Jie; Hu, Youmin; Wu, Bo; Wang, Yan; Xie, Fengyun

2017-01-01

The operating condition of rolling bearings affects productivity and quality in the rotating machine process. Developing an effective rolling bearing condition monitoring approach is critical to accurately identify the operating condition. In this paper, a hybrid generalized hidden Markov model-based condition monitoring approach for rolling bearings is proposed, where interval valued features are used to efficiently recognize and classify machine states in the machine process. In the proposed method, vibration signals are decomposed into multiple modes with variational mode decomposition (VMD). Parameters of the VMD, in the form of generalized intervals, provide a concise representation for aleatory and epistemic uncertainty and improve the robustness of identification. The multi-scale permutation entropy method is applied to extract state features from the decomposed signals in different operating conditions. Traditional principal component analysis is adopted to reduce feature size and computational cost. With the extracted features’ information, the generalized hidden Markov model, based on generalized interval probability, is used to recognize and classify the fault types and fault severity levels. Finally, the experiment results show that the proposed method is effective at recognizing and classifying the fault types and fault severity levels of rolling bearings. This monitoring method is also efficient enough to quantify the two uncertainty components. PMID:28524088
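The multi-scale permutation entropy step named in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default order and delay, and the coarse-graining by block averaging are assumptions chosen for illustration, and the VMD decomposition and interval-valued parameters are omitted.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1):
    """Normalized permutation entropy of a 1-D sequence (0 = fully regular, near 1 = random-like)."""
    n_patterns = len(signal) - (order - 1) * delay
    if n_patterns <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    counts = Counter()
    for i in range(n_patterns):
        window = [signal[i + j * delay] for j in range(order)]
        # The ordinal pattern is the order of positions when sorted by value.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    probs = [c / n_patterns for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by log(order!) so the result lies in [0, 1].
    return entropy / math.log(math.factorial(order))


def coarse_grain(signal, scale):
    """Average consecutive non-overlapping blocks of length `scale` (assumed coarse-graining)."""
    n_blocks = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n_blocks)]


def multiscale_permutation_entropy(signal, order=3, max_scale=3):
    """Permutation entropy of the coarse-grained signal at scales 1..max_scale."""
    return [permutation_entropy(coarse_grain(signal, s), order)
            for s in range(1, max_scale + 1)]
```

In a pipeline like the one described, each decomposed mode would be passed to `multiscale_permutation_entropy`, and the resulting values would form the feature vector that is later reduced with principal component analysis.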
Multiple Cylinder Free-Piston Stirling Machinery

NASA Astrophysics Data System (ADS)

Berchowitz, David M.; Kwon, Yong-Rak

In order to improve the specific power of piston-cylinder type machinery, there is a point in capacity or power where an advantage accrues with increasing number of piston-cylinder assemblies. In the case of Stirling machinery where primary energy is transferred across the casing wall of the machine, this consideration is even more important. This is due primarily to the difference in scaling of basic power and the required heat transfer. Heat transfer is found to be progressively limited as the size of the machine increases. Multiple cylinder machines tend to preserve the surface area to volume ratio at more favorable levels. In addition, the spring effect of the working gas in the so-called alpha configuration is often sufficient to provide a high frequency resonance point that improves the specific power. There are a number of possible multiple cylinder configurations. The simplest is an opposed pair of piston-displacer machines (beta configuration). A three-cylinder machine requires stepped pistons to obtain proper volume phase relationships. Four to six cylinder configurations are also possible. A small demonstrator inline four cylinder alpha machine has been built to demonstrate both cooling operation and power generation. Data from this machine verifies theoretical expectations and is used to extrapolate the performance of future machines. Vibration levels are discussed and it is argued that some multiple cylinder machines have no linear component to the casing vibration but may have a nutating couple. Example applications are discussed ranging from general purpose coolers, computer cooling, exhaust heat power extraction and some high power engines.
A Unified Approach to Abductive Inference

DTIC Science & Technology

2014-09-30

learning in “Big data” domains. COMBINING MARKOV LOGIC AND SUPPORT VECTOR MACHINES FOR EVENT EXTRACTION Event extraction is the task of...and achieves state-of-the-art performance. This makes it an ideal candidate for learning in “Big data”...
Extraction and Analysis of Mega Cities’ Impervious Surface on Pixel-based and Object-oriented Support Vector Machine Classification Technology: A case of Bombay

NASA Astrophysics Data System (ADS)

Yu, S. S.; Sun, Z. C.; Sun, L.; Wu, M. F.

2017-02-01

The objective of this paper is to study impervious surface extraction from remote sensing imagery and to monitor the spatiotemporal change patterns of mega cities. The megacity of Bombay was selected as the study area. First, pixel-based and object-oriented support vector machine (SVM) classification methods were used to produce land use/land cover (LULC) maps of Bombay for 2010. The overall accuracy (OA) and overall Kappa (OK) of the pixel-based method were 94.97% and 0.96 with a running time of 78 minutes, while the OA and OK of the object-oriented method were 93.72% and 0.94 with a running time of only 17 s. Additionally, the OA and OK of the object-oriented method after post-classification improved to 95.8% and 0.94. Then, the dynamic impervious surfaces of Bombay in the period 1973-2015 were extracted and the urbanization pattern of Bombay was analysed. The results show that both SVM classification methods can accomplish impervious surface extraction, but the object-oriented method is the better choice. Bombay has urbanized rapidly over the past 42 years, implying dramatic urban sprawl of mega cities in the developing countries along the One Belt and One Road (OBOR).
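For readers unfamiliar with the two accuracy measures quoted in this record, the following is a minimal sketch of how overall accuracy and Cohen's kappa are commonly computed from a confusion matrix. The function name and the example matrix are hypothetical and are not taken from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of reference class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example (impervious vs. pervious), not data from the study.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

Under this definition kappa discounts chance agreement, so it cannot exceed the overall accuracy expressed as a fraction.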

Dynamical analysis of contrastive divergence learning: Restricted Boltzmann machines with Gaussian visible units.

PubMed

Karakida, Ryo; Okada, Masato; Amari, Shun-Ichi

2016-07-01

The restricted Boltzmann machine (RBM) is an essential constituent of deep learning, but it is hard to train by using maximum likelihood (ML) learning, which minimizes the Kullback-Leibler (KL) divergence. Instead, contrastive divergence (CD) learning has been developed as an approximation of ML learning and widely used in practice. To clarify the performance of CD learning, in this paper, we analytically derive the fixed points where ML and CDn learning rules converge in two types of RBMs: one with Gaussian visible and Gaussian hidden units and the other with Gaussian visible and Bernoulli hidden units. In addition, we analyze the stability of the fixed points. As a result, we find that the stable points of CDn learning rule coincide with those of ML learning rule in a Gaussian-Gaussian RBM. We also reveal that larger principal components of the input data are extracted at the stable points. Moreover, in a Gaussian-Bernoulli RBM, we find that both ML and CDn learning can extract independent components at one of stable points. Our analysis demonstrates that the same feature components as those extracted by ML learning are extracted simply by performing CD1 learning. Expanding this study should elucidate the specific solutions obtained by CD learning in other types of RBMs or in deep networks. Copyright © 2016 Elsevier Ltd. All rights reserved.
40 CFR 61.31 - Definitions.

Code of Federal Regulations, 2010 CFR

2010-07-01

... associated elements. (b) Extraction plant means a facility chemically processing beryllium ore to beryllium..., electrochemical machining, etching, or other similar operations. (e) Ceramic plant means a manufacturing plant... which contains more than 0.1 percent beryllium by weight. (k) Propellant plant means any facility...
40 CFR 61.31 - Definitions.

Code of Federal Regulations, 2011 CFR

2011-07-01

... associated elements. (b) Extraction plant means a facility chemically processing beryllium ore to beryllium..., electrochemical machining, etching, or other similar operations. (e) Ceramic plant means a manufacturing plant... which contains more than 0.1 percent beryllium by weight. (k) Propellant plant means any facility...
30 CFR 75.1719-1 - Illumination in working places.

Code of Federal Regulations, 2013 CFR

2013-07-01

... machine by cables, ropes, or chains. (c) The lighting prescribed in this section shall be in addition to... between the gob-side of the travelway and the side of the block of coal from which coal is being extracted...
30 CFR 75.1719-1 - Illumination in working places.

Code of Federal Regulations, 2014 CFR

2014-07-01

... machine by cables, ropes, or chains. (c) The lighting prescribed in this section shall be in addition to... between the gob-side of the travelway and the side of the block of coal from which coal is being extracted...
30 CFR 75.1719-1 - Illumination in working places.

Code of Federal Regulations, 2012 CFR

2012-07-01

... machine by cables, ropes, or chains. (c) The lighting prescribed in this section shall be in addition to... between the gob-side of the travelway and the side of the block of coal from which coal is being extracted...
Revisit of Machine Learning Supported Biological and Biomedical Studies.

PubMed

Yu, Xiang-Tian; Wang, Lu; Zeng, Tao

2018-01-01

Generally, machine learning includes many in silico methods that transform the principles underlying natural phenomena into information humans can understand, with the aims of saving human labor, assisting human judgment, and creating human knowledge. It should have wide application potential in biological and biomedical studies, especially in the era of big biological data. To survey the application of machine learning alongside biological development, this review provides a wide range of cases to introduce the selection of machine learning methods in different practical scenarios across the whole biological and biomedical study cycle, and further discusses machine learning strategies for analyzing omics data in some cutting-edge biological studies. Finally, notes on new challenges for machine learning arising from small-sample, high-dimensional data are summarized around the key points of sample imbalance, white-box models, and causality.
Utilizing uncoded consultation notes from electronic medical records for predictive modeling of colorectal cancer.

PubMed

Hoogendoorn, Mark; Szolovits, Peter; Moons, Leon M G; Numans, Mattijs E

2016-05-01

Machine learning techniques can be used to extract predictive models for diseases from electronic medical records (EMRs). However, the nature of EMRs makes it difficult to apply off-the-shelf machine learning techniques while still exploiting the rich content of the EMRs. In this paper, we explore the usage of a range of natural language processing (NLP) techniques to extract valuable predictors from uncoded consultation notes and study whether they can help to improve predictive performance. We study a number of existing techniques for the extraction of predictors from the consultation notes, namely a bag of words based approach and topic modeling. In addition, we develop a dedicated technique to match the uncoded consultation notes with a medical ontology. We apply these techniques as an extension to an existing pipeline to extract predictors from EMRs. We evaluate them in the context of predictive modeling for colorectal cancer (CRC), a disease known to be difficult to diagnose before performing an endoscopy. Our results show that we are able to extract useful information from the consultation notes. The predictive performance of the ontology-based extraction method moves significantly beyond the benchmark of age and gender alone (area under the receiver operating characteristic curve (AUC) of 0.870 versus 0.831). We also observe more accurate predictive models by adding features derived from processing the consultation notes compared to solely using coded data (AUC of 0.896 versus 0.882) although the difference is not significant. The extracted features from the notes are shown to be equally predictive (i.e. there is no significant difference in performance) compared to the coded data of the consultations. It is possible to extract useful predictors from uncoded consultation notes that improve predictive performance. Techniques linking text to concepts in medical ontologies to derive these predictors are shown to perform best for predicting CRC in our EMR dataset.
Copyright © 2016 Elsevier B.V. All rights reserved.
Evaluation of machinability and flexural strength of a novel dental machinable glass-ceramic.

PubMed

Qin, Feng; Zheng, Shucan; Luo, Zufeng; Li, Yong; Guo, Ling; Zhao, Yunfeng; Fu, Qiang

2009-10-01

To evaluate the machinability and flexural strength of a novel dental machinable glass-ceramic (named PMC), and to compare the machinability property with that of Vita Mark II and human enamel. The raw batch materials were selected and mixed. Four groups of novel glass-ceramics were formed at different nucleation temperatures, and were assigned to Group 1, Group 2, Group 3 and Group 4. The machinability of the four groups of novel glass-ceramics, Vita Mark II ceramic and freshly extracted human premolars were compared by means of drilling depth measurement. A three-point bending test was used to measure the flexural strength of the novel glass-ceramics. The crystalline phases of the group with the best machinability were identified by X-ray diffraction. In terms of the drilling depth, Group 2 of the novel glass-ceramics proves to have the largest drilling depth. There was no statistical difference among Group 1, Group 4 and the natural teeth. The drilling depth of Vita MK II was statistically less than that of Group 1, Group 4 and the natural teeth. Group 3 had the least drilling depth. In respect of the flexural strength, Group 2 exhibited the maximum flexural strength; Group 1 was statistically weaker than Group 2; there was no statistical difference between Group 3 and Group 4, and they were the weakest materials. XRD of Group 2 ceramic showed that a new type of dental machinable glass-ceramic containing calcium-mica had been developed by the present study and was named PMC. PMC is promising for application as a dental machinable ceramic due to its good machinability and relatively high strength.
PHOTOMETRIC SUPERNOVA CLASSIFICATION WITH MACHINE LEARNING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lochner, Michelle; Peiris, Hiranya V.; Lahav, Ofer

Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDTs algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
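The AUC metric used in this record can be sketched as follows. This is an illustrative pairwise formulation (the probability that a randomly chosen positive example outscores a randomly chosen negative one), not the pipeline's own code; the function name and the sample labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; tied scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four objects (1 = the class of interest).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a small illustration; a rank-based computation would be the usual choice for large simulated samples.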
Automatic segmentation of airway tree based on local intensity filter and machine learning technique in 3D chest CT volume.

PubMed

Meng, Qier; Kitasaka, Takayuki; Nimura, Yukitaka; Oda, Masahiro; Ueno, Junji; Mori, Kensaku

2017-02-01

Airway segmentation plays an important role in analyzing chest computed tomography (CT) volumes for computerized lung cancer detection, emphysema diagnosis and pre- and intra-operative bronchoscope navigation. However, obtaining a complete 3D airway tree structure from a CT volume is quite a challenging task. Several researchers have proposed automated airway segmentation algorithms basically based on region growing and machine learning techniques. However, these methods fail to detect the peripheral bronchial branches, which results in a large amount of leakage. This paper presents a novel approach for more accurate extraction of the complex airway tree. This proposed segmentation method is composed of three steps. First, Hessian analysis is utilized to enhance the tube-like structure in CT volumes; then, an adaptive multiscale cavity enhancement filter is employed to detect the cavity-like structure with different radii. In the second step, support vector machine learning will be utilized to remove the false positive (FP) regions from the result obtained in the previous step. Finally, the graph-cut algorithm is used to refine the candidate voxels to form an integrated airway tree. A test dataset including 50 standard-dose chest CT volumes was used for evaluating our proposed method. The average extraction rate was about 79.1 % with the significantly decreased FP rate. A new method of airway segmentation based on local intensity structure and machine learning technique was developed. The method was shown to be feasible for airway segmentation in a computer-aided diagnosis system for a lung and bronchoscope guidance system.
Automatic classification of written descriptions by healthy adults: An overview of the application of natural language processing and machine learning techniques to clinical discourse analysis

PubMed Central

Toledo, Cíntia Matsuda; Cunha, Andre; Scarton, Carolina; Aluísio, Sandra

2014-01-01

Discourse production is an important aspect in the evaluation of brain-injured individuals. We believe that studies comparing the performance of brain-injured subjects with that of healthy controls must use groups with compatible education. A pioneering application of machine learning methods using Brazilian Portuguese for clinical purposes is described, highlighting education as an important variable in the Brazilian scenario. Objective: The aims were to describe how to: (i) develop machine learning classifiers using features generated by natural language processing tools to distinguish descriptions produced by healthy individuals into classes based on their years of education; and (ii) automatically identify the features that best distinguish the groups. Methods: The approach proposed here extracts linguistic features automatically from the written descriptions with the aid of two Natural Language Processing tools: Coh-Metrix-Port and AIC. It also includes nine task-specific features (three new ones, two extracted manually, besides description time; type of scene described – simple or complex; presentation order – which type of picture was described first; and age). In this study, the descriptions by 144 of the subjects studied in Toledo [18] were used, which included 200 healthy Brazilians of both genders. Results and Conclusion: A Support Vector Machine (SVM) with a radial basis function (RBF) kernel is the most recommended approach for the binary classification of our data, classifying three of the four initial classes. CfsSubsetEval (CFS) is a strong candidate to replace manual feature selection methods. PMID:29213908
Extracting information from the text of electronic medical records to improve case detection: a systematic review

PubMed Central

Carroll, John A; Smith, Helen E; Scott, Donia; Cassell, Jackie A

2016-01-01

Background Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic 95% (codes + text) vs 88% (codes), P = .025). Conclusions Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall). PMID:26911811
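The two reporting metrics this review recommends, sensitivity (recall) and positive predictive value (precision), can be sketched for a case-detection algorithm as below. The function name and the patient identifiers are hypothetical and do not come from any of the reviewed studies.

```python
def case_detection_metrics(true_cases, flagged):
    """Sensitivity and positive predictive value of a set of flagged patient IDs."""
    true_cases, flagged = set(true_cases), set(flagged)
    tp = len(true_cases & flagged)   # true cases that were flagged
    fn = len(true_cases - flagged)   # true cases that were missed
    fp = len(flagged - true_cases)   # flagged patients who are not cases
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return {"sensitivity": sensitivity, "ppv": ppv}


# Hypothetical example: patients found via structured codes, and via free text.
true_cases = {1, 2, 3, 4, 5}
found_by_codes = {1, 2, 3, 9}
found_by_text = {4, 10}

codes_only = case_detection_metrics(true_cases, found_by_codes)
codes_plus_text = case_detection_metrics(true_cases, found_by_codes | found_by_text)
```

In this made-up example, adding text raises sensitivity while lowering positive predictive value, which is why reporting both metrics together is useful.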
A two-dimensional matrix image based feature extraction method for classification of sEMG: A comparative analysis based on SVM, KNN and RBF-NN.

PubMed

Wen, Tingxi; Zhang, Zhongnan; Qiu, Ming; Zeng, Ming; Luo, Weizhen

2017-01-01

The computer mouse is an important human-computer interaction device, but patients with physical finger disability are unable to operate it. Surface EMG (sEMG) can be monitored by electrodes on the skin surface and is a reflection of neuromuscular activity. Therefore, limb auxiliary equipment can be controlled through sEMG classification in order to help physically disabled patients operate the mouse. The objective was to develop a new method to extract sEMG generated by finger motion and to apply novel features to classify sEMG. A window-based data acquisition method was presented to extract signal samples from sEMG electrodes. Afterwards, a two-dimensional matrix image based feature extraction method, which differs from the classical methods based on time domain or frequency domain, was employed to transform signal samples into feature maps used for classification. In the experiments, sEMG data samples produced by the index and middle fingers at the click of a mouse button were separately acquired. Then, characteristics of the samples were analyzed to generate a feature map for each sample. Finally, machine learning classification algorithms (SVM, KNN, RBF-NN) were employed to classify these feature maps on a GPU. The study demonstrated that all classifiers can identify and classify sEMG samples effectively. In particular, the accuracy of the SVM classifier reached up to 100%. The signal separation method is convenient, efficient and quick, and can effectively extract the sEMG samples produced by fingers. In addition, unlike the classical methods, the new method extracts features by appropriately enlarging the energy of the sample signals. The classical machine learning classifiers all performed well using these features.
RHE: A JVM Courseware

ERIC Educational Resources Information Center

Liu, S.; Tang, J.; Deng, C.; Li, X.-F.; Gaudiot, J.-L.

2011-01-01

Java Virtual Machine (JVM) education has become essential in training embedded software engineers as well as virtual machine researchers and practitioners. However, due to the lack of suitable instructional tools, it is difficult for students to obtain any kind of hands-on experience and to attain any deep understanding of JVM design. To address…
Influence of cutting data on surface quality when machining 17-4 PH stainless steel

NASA Astrophysics Data System (ADS)

Popovici, T. D.; Dijmărescu, M. R.

2017-08-01

The aim of the research presented in this paper is to analyse the cutting data influence upon surface quality for 17-4 PH stainless steel milling machining. The cutting regime parameters considered for the experiments were established using cutting regimes from experimental researches or from industrial conditions as basis, within the recommended ranges. The experimental program structure was determined by taking into account compatibility and orthogonality conditions, minimal use of material and labour. The machined surface roughness was determined by measuring the Ra roughness parameter, followed by surface profile registration in the form of graphics which were saved on a computer with MarSurf PS1Explorer software. Based on Ra roughness parameter, maximum values were extracted from these graphics and the influence charts of the cutting regime parameters upon surface roughness were traced using Microsoft Excel software. After a thorough analysis of the resulting data, relevant conclusions were drawn, presenting the interdependence between the surface roughness of the machined 17-4 PH samples and the cutting data variation.
Hardware Acceleration of Adaptive Neural Algorithms.

DOE Office of Scientific and Technical Information (OSTI.GOV)

James, Conrad D.

As traditional numerical computing has faced challenges, researchers have turned towards alternative computing approaches to reduce power-per-computation metrics and improve algorithm performance. Here, we describe an approach towards non-conventional computing that strengthens the connection between machine learning and neuroscience concepts. The Hardware Acceleration of Adaptive Neural Algorithms (HAANA) project has developed neural machine learning algorithms and hardware for applications in image processing and cybersecurity. While machine learning methods are effective at extracting relevant features from many types of data, the effectiveness of these algorithms degrades when subjected to real-world conditions. Our team has generated novel neural-inspired approaches to improve the resiliency and adaptability of machine learning algorithms. In addition, we have also designed and fabricated hardware architectures and microelectronic devices specifically tuned towards the training and inference operations of neural-inspired algorithms. Finally, our multi-scale simulation framework allows us to assess the impact of microelectronic device properties on algorithm performance.
An ultra low power feature extraction and classification system for wearable seizure detection.

PubMed

Page, Adam; Pramod Tim Oates, Siddharth; Mohsenin, Tinoosh

2015-01-01

In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198-element feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among the five classifiers that were explored, logistic regression (LR) proved to have minimum hardware complexity while providing an average F1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area and power consumption and the lowest latency when compared to previous work.
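The layout of the classifier input described above (9 features for each of 22 channels over a 1-second window, giving 198 values) can be sketched as follows. The abstract does not list the nine features, so the ones computed here are placeholders chosen purely for illustration, and the 256 Hz sampling rate and function names are likewise assumptions.

```python
import math


def channel_features(x):
    """Nine placeholder features for one channel (hypothetical, not the paper's feature set)."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    energy = sum(v * v for v in x)
    line_length = sum(abs(b - a) for a, b in zip(x, x[1:]))
    zero_crossings = sum(1 for a, b in zip(x, x[1:]) if a * b < 0)
    mean_abs = sum(abs(v) for v in x) / n
    return [mean, variance, min(x), max(x), max(x) - min(x),
            energy, line_length, zero_crossings, mean_abs]


def window_feature_vector(window):
    """Concatenate per-channel features; `window` holds one list of samples per channel."""
    vector = []
    for channel in window:
        vector.extend(channel_features(channel))
    return vector


# Synthetic 1-second window: 22 channels at an assumed 256 Hz sampling rate.
SAMPLING_RATE = 256
window = [[math.sin(2 * math.pi * (ch + 1) * t / SAMPLING_RATE)
           for t in range(SAMPLING_RATE)] for ch in range(22)]
features = window_feature_vector(window)  # 22 channels x 9 features
```

A fixed-length vector like this is what each of the compared classifiers would consume, one vector per window.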
Investigating multiple dysregulated pathways in rheumatoid arthritis based on pathway interaction network.

PubMed

Song, Xian-Dong; Song, Xian-Xu; Liu, Gui-Bo; Ren, Chun-Hui; Sun, Yuan-Bo; Liu, Ke-Xin; Liu, Bo; Liang, Shuang; Zhu, Zhu

2018-03-01

The traditional methods of identifying biomarkers in rheumatoid arthritis (RA) have focussed on differentially expressed pathways or individual pathways and therefore neglect the interactions between pathways. To better understand the pathogenesis of RA, we aimed to identify dysregulated pathway sets using a pathway interaction network (PIN), which considers interactions among pathways. Firstly, RA-related gene expression profile data, protein-protein interaction (PPI) data and pathway data were retrieved from the corresponding databases. Secondly, principal component analysis was used to calculate the activity of each pathway, and a seed pathway was then identified from the pathway activity data. A PIN was then constructed based on the gene expression profile, pathway data, and PPI information. Finally, the dysregulated pathways were extracted from the PIN based on the seed pathway using support vector machines and an area under the curve (AUC) index. The PIN comprised a total of 854 pathways and 1064 pathway interactions. The greatest change in the activity score between RA and control samples was observed in the pathway of epigenetic regulation of gene expression, which was extracted and regarded as the seed pathway. Starting with this seed pathway, one maximum pathway set containing 10 dysregulated pathways was extracted from the PIN, with an AUC of 0.8249, indicating that this pathway set could distinguish RA from the controls. These 10 dysregulated pathways might be potential biomarkers for RA diagnosis and treatment in the future.
Identification of Shearer Cutting Patterns Using Vibration Signals Based on a Least Squares Support Vector Machine with an Improved Fruit Fly Optimization Algorithm

PubMed Central

Si, Lei; Wang, Zhongbin; Liu, Xinhua; Tan, Chao; Liu, Ze; Xu, Jing

2016-01-01

Shearers play an important role at fully mechanized coal mining faces, and accurately identifying their cutting pattern is very helpful for improving the automation level of shearers and ensuring the safety of coal mining. The least squares support vector machine (LSSVM) has been proven to offer strong potential in prediction and classification problems, particularly when an appropriate meta-heuristic algorithm is employed to determine the values of its two parameters. However, these meta-heuristic algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. In this paper, an improved fruit fly optimization algorithm (IFOA) for optimizing the parameters of LSSVM is presented, and the LSSVM coupled with IFOA (IFOA-LSSVM) is used to identify the shearer cutting pattern. The vibration acceleration signals of five cutting patterns were collected and the special state features were extracted based on ensemble empirical mode decomposition (EEMD) and the kernel function. Some examples of the IFOA-LSSVM model are further presented and the results are compared in detail with LSSVM, PSO-LSSVM, GA-LSSVM and FOA-LSSVM models. The comparison results indicate that the proposed approach is feasible and efficient and outperforms the others. Finally, an industrial application example at a coal mining face is presented to demonstrate the effect of the proposed system. PMID:26771615

Extractive Regimes: Toward a Better Understanding of Indonesian Development

ERIC Educational Resources Information Center

Gellert, Paul K.

2010-01-01

This article proposes the concept of an extractive regime to understand Indonesia's developmental trajectory from 1966 to 1998. The concept contributes to world-systems, globalization, and commodity-based approaches to understanding peripheral development. An extractive regime is defined by its reliance on extraction of multiple natural resources…
Machine-smoking studies of cigarette filter color to estimate tar yield by visual assessment and through the use of a colorimeter.

PubMed

Morton, Michael J; Williams, David L; Hjorth, Heather B; Smith, Jennifer H

2010-04-01

This paper explores using the intensity of the stain on the end of the filter ("filter color") as a vehicle for estimating cigarette tar yield, both by instrument reading of the filter color and by visual comparison to a template. The correlation of machine-measured tar yield to filter color measured with a colorimeter was reasonably strong and was relatively unaffected by different puff volumes or different tobacco moistures. However, the correlation of filter color to machine-measured nicotine yield was affected by the moisture content of the cigarette. Filter color, as measured by a colorimeter, was generally comparable to filter extraction of either nicotine or solanesol in its correlation to machine-smoked tar yields. It was found that the color of the tar stain changes over time. Panelists could generally correctly order the filters from machine-smoked cigarettes by tar yield using the intensity of the tar stain. However, there was considerable variation in the panelist-to-panelist tar yield estimates. The wide person-to-person variation in tar yield estimates, and other factors discussed in the text could severely limit the usefulness and practicality of this approach for visually estimating the tar yield of machine-smoked cigarettes. Copyright 2009 Elsevier Inc. All rights reserved.
Machine vision based quality inspection of flat glass products

NASA Astrophysics Data System (ADS)

Zauner, G.; Schagerl, M.

2014-03-01

This application paper presents a machine vision solution for the quality inspection of flat glass products. A contact image sensor (CIS) is used to generate digital images of the glass surfaces. The presented machine vision based quality inspection at the end of the production line aims to classify five different glass defect types. The defect images are usually characterized by very little 'image structure', i.e. homogeneous regions without distinct image texture. Additionally, these defect images usually consist of only a few pixels. At the same time the appearance of certain defect classes can be very diverse (e.g. water drops). We used simple state-of-the-art image features like histogram-based features (std. deviation, kurtosis, skewness), geometric features (form factor/elongation, eccentricity, Hu-moments) and texture features (grey level run length matrix, co-occurrence matrix) to extract defect information. The main contribution of this work lies in the systematic evaluation of various machine learning algorithms to identify appropriate classification approaches for this specific class of images. To this end, the following machine learning algorithms were compared: decision tree (J48), random forest, JRip rules, naive Bayes, Support Vector Machine (multi class), neural network (multilayer perceptron) and k-Nearest Neighbour. We used a representative image database of 2300 defect images and applied cross validation for evaluation purposes.
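
The histogram-based features named above (standard deviation, skewness, kurtosis) can be sketched as follows. This is a minimal illustration using population moments on a flat list of grey values; the function name `histogram_features` and the sample pixels are assumptions, and the paper's exact definitions may differ.

```python
import math


def histogram_features(pixels):
    """Standard deviation, skewness and (non-excess) kurtosis of grey values."""
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of order 2, 3 and 4 (population form, divided by n).
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    if m2 == 0:
        # A perfectly homogeneous patch has no spread; shape features are undefined.
        return {"std": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    return {
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical grey values of a tiny defect region.
features = histogram_features([1, 2, 3, 4, 5])
print(features)
```

Because defect regions are described as only a few pixels wide, such global statistics are cheap to compute and do not rely on texture being present.
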
The QuEST for multi-sensor big data ISR situation understanding

NASA Astrophysics Data System (ADS)

Rogers, Steven; Culbertson, Jared; Oxley, Mark; Clouse, H. Scott; Abayowa, Bernard; Patrick, James; Blasch, Erik; Trumpfheller, John

2016-05-01

The challenges for providing war fighters with the best possible actionable information from diverse sensing modalities using advances in big-data and machine learning are addressed in this paper. We start by presenting intelligence, surveillance, and reconnaissance (ISR) related big-data challenges associated with the Third Offset Strategy. Current approaches to big-data are shown to be limited with respect to reasoning/understanding. We present a discussion of what meaning making and understanding require. We posit that for human-machine collaborative solutions to address the requirements for the strategy a new approach, Qualia Exploitation of Sensor Technology (QuEST), will be required. The requirements for developing a QuEST theory of knowledge are discussed and finally, an engineering approach for achieving situation understanding is presented.
Modeling a Spatio-Temporal Individual Travel Behavior Using Geotagged Social Network Data: a Case Study of Greater Cincinnati

NASA Astrophysics Data System (ADS)

Saeedimoghaddam, M.; Kim, C.

2017-10-01

Understanding individual travel behavior is vital in travel demand management as well as in urban and transportation planning. New data sources including mobile phone data and location-based social media (LBSM) data allow us to understand mobility behavior at an unprecedented level of detail. Recent studies of trip purpose prediction tend to use machine learning (ML) methods, since they generally produce high levels of predictive accuracy. Few studies have used LBSM as a large data source to predict individual travel destinations using ML techniques. In the presented research, we created a spatio-temporal probabilistic model based on an ensemble ML framework named "Random Forests", utilizing the travels extracted from geotagged Tweets in 419 census tracts of the Greater Cincinnati area, for predicting the tract ID of an individual's travel destination at any time using the information of its origin. We evaluated the model accuracy using the travels extracted from the Tweets themselves as well as the travels from a household travel survey. The Tweet-based and survey-based travels that start from the same tract in the southwestern parts of the study area are more likely to select the same destination than those in the other parts. Also, both Tweet-based and survey-based travels were affected by the attraction points in downtown Cincinnati and the tracts in the northeastern part of the area. Finally, both evaluations show that the model predictions are acceptable, but the model cannot predict destinations from other data sources as precisely as from the Tweet-based data.
PDF text classification to leverage information extraction from publication reports.

PubMed

Bui, Duy Duc An; Del Fiol, Guilherme; Jonnalagadda, Siddhartha

2016-06-01

Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges to the underlying natural language processing algorithm. Our goal is to categorize PDF texts for strategic use by IE systems. We used an open-source tool to extract raw texts from a PDF document and developed a text classification algorithm that follows a multi-pass sieve framework to automatically classify PDF text snippets (for brevity, texts) into TITLE, ABSTRACT, BODYTEXT, SEMISTRUCTURE, and METADATA categories. To validate the algorithm, we developed a gold standard of PDF reports that were included in the development of previous systematic reviews by the Cochrane Collaboration. In a two-step procedure, we evaluated (1) classification performance, and compared it with machine learning classifiers, and (2) the effects of the algorithm on an IE system that extracts clinical outcome mentions. The multi-pass sieve algorithm achieved an accuracy of 92.6%, which was 9.7% (p<0.001) higher than the best performing machine learning classifier that used a logistic regression algorithm. F-measure improvements were observed in the classification of TITLE (+15.6%), ABSTRACT (+54.2%), BODYTEXT (+3.7%), SEMISTRUCTURE (+34%), and METADATA (+14.2%). In addition, use of the algorithm to filter semi-structured texts and publication metadata improved performance of the outcome extraction system (F-measure +4.1%, p=0.002). It also reduced the number of sentences to be processed by 44.9% (p<0.001), which corresponds to a processing time reduction of 50% (p=0.005).
The rule-based multi-pass sieve framework can be used effectively in categorizing texts extracted from PDF documents. Text classification is an important prerequisite step to leverage information extraction from PDF documents. Copyright © 2016 Elsevier Inc. All rights reserved.
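
A multi-pass sieve applies rule sets in a fixed order, with each pass labelling only the snippets it is confident about and leaving the rest to later passes. The sketch below shows that control flow only; every rule, regular expression, and sample snippet is a hypothetical stand-in and not one of the rules used in the study.

```python
import re


def metadata_pass(text, index):
    # Hypothetical cue words for publication metadata.
    if re.search(r"(doi:|©|copyright|@|\bvol\.)", text, re.IGNORECASE):
        return "METADATA"
    return None


def title_pass(text, index):
    # Assumption: the first short snippet of a document is its title.
    if index == 0 and len(text.split()) <= 25:
        return "TITLE"
    return None


def abstract_pass(text, index):
    if re.match(r"\s*(abstract|background|objective)\b", text, re.IGNORECASE):
        return "ABSTRACT"
    return None


def semistructure_pass(text, index):
    # Snippets dominated by numeric tokens are treated as table-like text.
    tokens = text.split()
    numeric = sum(1 for t in tokens if re.fullmatch(r"[\d.,%()±-]+", t))
    if tokens and numeric / len(tokens) > 0.5:
        return "SEMISTRUCTURE"
    return None


def default_pass(text, index):
    # Final sieve: everything still unlabelled is body text.
    return "BODYTEXT"


SIEVES = [metadata_pass, title_pass, abstract_pass, semistructure_pass, default_pass]


def classify(snippets):
    labels = [None] * len(snippets)
    for sieve in SIEVES:  # earlier passes take precedence over later ones
        for i, text in enumerate(snippets):
            if labels[i] is None:
                labels[i] = sieve(text, i)
    return labels


# Hypothetical snippets extracted from a PDF report.
snippets = [
    "Effects of drug X on blood pressure",
    "doi:10.1000/example © 2016",
    "Abstract Data extraction is time-consuming.",
    "12.5 (3.1) 14.2 (2.9) 0.04",
    "We enrolled patients from three clinics.",
]
print(classify(snippets))
```

Ordering the passes from most to least precise means a later, broader rule can never overwrite a label assigned by an earlier, stricter one.
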
Computational Foundations of Natural Intelligence

PubMed Central

van Gerven, Marcel

2017-01-01

New developments in AI and neuroscience are revitalizing the quest to understand natural intelligence, offering insight about how to equip machines with human-like capabilities. This paper reviews some of the computational principles relevant for understanding natural intelligence and, ultimately, achieving strong AI. After reviewing basic principles, a variety of computational modeling approaches is discussed. Subsequently, I concentrate on the use of artificial neural networks as a framework for modeling cognitive processes. This paper ends by outlining some of the challenges that remain to fulfill the promise of machines that show human-like intelligence. PMID:29375355
40 CFR 61.31 - Definitions.

Code of Federal Regulations, 2012 CFR

2012-07-01

... associated elements. (b) Extraction plant means a facility chemically processing beryllium ore to beryllium..., electrochemical machining, etching, or other similar operations. (e) Ceramic plant means a manufacturing plant... compounds used or generated during any process or operation performed by a source subject to this subpart...
40 CFR 61.31 - Definitions.

Code of Federal Regulations, 2013 CFR

2013-07-01

... associated elements. (b) Extraction plant means a facility chemically processing beryllium ore to beryllium..., electrochemical machining, etching, or other similar operations. (e) Ceramic plant means a manufacturing plant... compounds used or generated during any process or operation performed by a source subject to this subpart...
40 CFR 61.31 - Definitions.

Code of Federal Regulations, 2014 CFR

2014-07-01

... associated elements. (b) Extraction plant means a facility chemically processing beryllium ore to beryllium..., electrochemical machining, etching, or other similar operations. (e) Ceramic plant means a manufacturing plant... compounds used or generated during any process or operation performed by a source subject to this subpart...
Clinical data miner: an electronic case report form system with integrated data preprocessing and machine-learning libraries supporting clinical diagnostic model research.

PubMed

Installé, Arnaud Jf; Van den Bosch, Thierry; De Moor, Bart; Timmerman, Dirk

2014-10-20

Using machine-learning techniques, clinical diagnostic model research extracts diagnostic models from patient data. Traditionally, patient data are often collected using electronic Case Report Form (eCRF) systems, while mathematical software is used for analyzing these data using machine-learning techniques. Due to the lack of integration between eCRF systems and mathematical software, extracting diagnostic models is a complex, error-prone process. Moreover, due to the complexity of this process, it is usually only performed once, after a predetermined number of data points have been collected, without insight into the predictive performance of the resulting models. The objective of the study of Clinical Data Miner (CDM) software framework is to offer an eCRF system with integrated data preprocessing and machine-learning libraries, improving efficiency of the clinical diagnostic model research workflow, and to enable optimization of patient inclusion numbers through study performance monitoring. The CDM software framework was developed using a test-driven development (TDD) approach, to ensure high software quality. Architecturally, CDM's design is split over a number of modules, to ensure future extendability. The TDD approach has enabled us to deliver high software quality. CDM's eCRF Web interface is in active use by the studies of the International Endometrial Tumor Analysis consortium, with over 4000 enrolled patients, and more studies planned. Additionally, a derived user interface has been used in six separate interrater agreement studies. CDM's integrated data preprocessing and machine-learning libraries simplify some otherwise manual and error-prone steps in the clinical diagnostic model research workflow. Furthermore, CDM's libraries provide study coordinators with a method to monitor a study's predictive performance as patient inclusions increase. To our knowledge, CDM is the only eCRF system integrating data preprocessing and machine-learning libraries. 
This integration improves the efficiency of the clinical diagnostic model research workflow. Moreover, by simplifying the generation of learning curves, CDM enables study coordinators to assess more accurately when data collection can be terminated, resulting in better models or lower patient recruitment costs.
Modal identification of spindle-tool unit in high-speed machining

NASA Astrophysics Data System (ADS)

Gagnol, Vincent; Le, Thien-Phu; Ray, Pascal

2011-10-01

The accurate knowledge of high-speed motorised spindle dynamic behaviour during machining is important in order to ensure the reliability of machine tools in service and the quality of machined parts. More specifically, the prediction of stable cutting regions, which is a critical requirement for high-speed milling operations, requires the accurate estimation of tool/holder/spindle set dynamic modal parameters. These estimations are generally obtained through Frequency Response Function (FRF) measurements of the non-rotating spindle. However, significant changes in modal parameters are expected to occur during operation, due to high-speed spindle rotation. The spindle's modal variations are highlighted through an integrated finite element model of the dynamic high-speed spindle-bearing system, taking into account rotor dynamics effects. The dependency of dynamic behaviour on speed range is then investigated and determined with accuracy. The objective of the proposed paper is to validate these numerical results through an experiment-based approach. Hence, an experimental setup is elaborated to measure rotating tool vibration during the machining operation in order to determine the spindle's modal frequency variation with respect to spindle speed in an industrial environment. The identification of natural frequencies of the spindle under rotating conditions is challenging, due to the low number of sensors and the presence of many harmonics in the measured signals. In order to overcome these issues and to extract the characteristics of the system, the spindle modes are determined through a 3-step procedure. First, spindle modes are highlighted using the Frequency Domain Decomposition (FDD) technique, with a new formulation at the considered rotating speed. These extracted modes are then analysed through the value of their respective damping ratios in order to separate the harmonics component from structural spindle natural frequencies. 
Finally, the stochastic properties of the modes are also investigated by considering the probability density of the retained modes. Results show a good correlation between numerical and experiment-based identified frequencies. The identified spindle-tool modal properties during machining allow the numerical model to be considered as representative of the real dynamic properties of the system.
A Hybrid Human-Computer Approach to the Extraction of Scientific Facts from the Literature.

PubMed

Tchoua, Roselyne B; Chard, Kyle; Audus, Debra; Qin, Jian; de Pablo, Juan; Foster, Ian

2016-01-01

A wealth of valuable data is locked within the millions of research articles published each year. Reading and extracting pertinent information from those articles has become an unmanageable task for scientists. This problem hinders scientific progress by making it hard to build on results buried in literature. Moreover, these data are loosely structured, encoded in manuscripts of various formats, embedded in different content types, and are, in general, not machine accessible. We present a hybrid human-computer solution for semi-automatically extracting scientific facts from literature. This solution combines an automated discovery, download, and extraction phase with a semi-expert crowd assembled from students to extract specific scientific facts. To evaluate our approach we apply it to a challenging molecular engineering scenario, extraction of a polymer property: the Flory-Huggins interaction parameter. We demonstrate useful contributions to a comprehensive database of polymer properties.
Extracting important information from Chinese Operation Notes with natural language processing methods.

PubMed

Wang, Hui; Zhang, Weide; Zeng, Qiang; Li, Zuofeng; Feng, Kaiyan; Liu, Lei

2014-04-01

Extracting information from unstructured clinical narratives is valuable for many clinical applications. Although natural language processing (NLP) methods have been extensively studied for electronic medical records (EMR), few studies have explored NLP for extracting information from Chinese clinical narratives. In this study, we report the development and evaluation of a method for extracting tumor-related information from operation notes for hepatic carcinomas, which were written in Chinese. Using 86 operation notes manually annotated by physicians as the training set, we explored both rule-based and supervised machine-learning approaches. Evaluated on 29 unseen operation notes, our best approach yielded 69.6% precision, 58.3% recall and a 63.5% F-score. Copyright © 2014 Elsevier Inc. All rights reserved.
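
The reported F-score can be checked against the reported precision and recall, assuming it is the balanced F1 measure (the harmonic mean of the two). The function name `f_score` is illustrative.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract above.
reported_f = round(100 * f_score(0.696, 0.583), 1)
print(reported_f)  # 63.5, matching the reported F-score
```
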
Statistical interpretation of machine learning-based feature importance scores for biomarker discovery.

PubMed

Huynh-Thu, Vân Anh; Saeys, Yvan; Wehenkel, Louis; Geurts, Pierre

2012-07-01

Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple and fast, and their output is easily interpretable by biologists, but they can only identify variables that provide a significant amount of information in isolation from the other variables. As biological processes are expected to involve complex interactions between variables, univariate methods thus potentially miss some informative biomarkers. Variable relevance scores provided by machine learning techniques, however, are potentially able to highlight multivariate interacting effects, but unlike the p-values returned by univariate tests, these relevance scores are usually not statistically interpretable. This lack of interpretability hampers the determination of a relevance threshold for extracting a feature subset from the rankings and also prevents the wide adoption of these methods by practitioners. We evaluated several existing and novel procedures that extract relevant features from rankings derived from machine learning approaches. These procedures replace the relevance scores with measures that can be interpreted in a statistical way, such as p-values, false discovery rates, or family wise error rates, for which it is easier to determine a significance level. Experiments were performed on several artificial problems as well as on real microarray datasets. Although the methods differ in terms of computing times and the tradeoff they achieve between false positives and false negatives, some of them greatly help in the extraction of truly relevant biomarkers and should thus be of great practical interest for biologists and physicians. As a side conclusion, our experiments also clearly highlight that using model performance as a criterion for feature selection is often counter-productive. Python source code for all tested methods, as well as the MATLAB scripts used for data simulation, can be found in the Supplementary Material.
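
One generic way to turn a relevance score into a p-value is a label-permutation test: recompute the score under shuffled class labels and report how often the shuffled score reaches the observed one. The sketch below illustrates that idea only; it is not claimed to be one of the procedures evaluated in the paper, and the relevance measure (absolute difference of class means), the data, and all names are assumptions standing in for a machine-learning importance score.

```python
import random


def relevance(column, labels):
    """Stand-in relevance score: absolute difference of the two class means."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def permutation_p_values(columns, labels, n_perm=999, seed=0):
    rng = random.Random(seed)
    observed = [relevance(c, labels) for c in columns]
    exceed = [0] * len(columns)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between features and labels
        for j, c in enumerate(columns):
            if relevance(c, shuffled) >= observed[j]:
                exceed[j] += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return [(1 + e) / (1 + n_perm) for e in exceed]


# Hypothetical data: one informative feature and one noise feature.
labels = [1] * 6 + [0] * 6
informative = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 1.0, 1.2, 0.9, 1.1, 0.8, 1.3]
noise = [0.3, 1.9, 0.7, 1.4, 0.2, 1.1, 1.0, 0.4, 1.6, 0.5, 1.8, 0.9]
p_values = permutation_p_values([informative, noise], labels)
print(p_values)
```

A threshold such as 0.05 can then be applied to the p-values directly, which is the kind of statistically interpretable cutoff that raw relevance scores lack.
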
The Remote Maxwell Demon as Energy Down-Converter

NASA Astrophysics Data System (ADS)

Hossenfelder, S.

2016-04-01

It is demonstrated that Maxwell's demon can be used to allow a machine to extract energy from a heat bath by use of information that is processed by the demon at a remote location. The model proposed here effectively replaces transmission of energy by transmission of information. For that we use a feedback protocol that enables a net gain by stimulating emission in selected fluctuations around thermal equilibrium. We estimate the down conversion rate and the efficiency of energy extraction from the heat bath.
Method of modifying a volume mesh using sheet extraction

DOEpatents

Borden, Michael J [Albuquerque, NM; Shepherd, Jason F [Albuquerque, NM

2007-02-20

A method and machine-readable medium provide a technique to modify a hexahedral finite element volume mesh using dual generation and sheet extraction. After generating a dual of a volume stack (mesh), a predetermined algorithm may be followed to modify the volume mesh of hexahedral elements. The predetermined algorithm may include the steps of determining a sheet of hexahedral mesh elements, generating nodes for merging, and merging the nodes to delete the sheet of hexahedral mesh elements and modify the volume mesh.
Applying a weighted random forests method to extract karst sinkholes from LiDAR data

NASA Astrophysics Data System (ADS)

Zhu, Junfeng; Pierskalla, William P.

2016-02-01

Detailed mapping of sinkholes provides critical information for mitigating sinkhole hazards and understanding groundwater and surface water interactions in karst terrains. LiDAR (Light Detection and Ranging) measures the earth's surface at high resolution and high density and has shown great potential to drastically improve locating and delineating sinkholes. However, processing LiDAR data to extract sinkholes requires separating sinkholes from other depressions, which can be laborious because of the sheer number of depressions commonly generated from LiDAR data. In this study, we applied random forests, a machine learning method, to automatically separate sinkholes from other depressions in a karst region in central Kentucky. The sinkhole-extraction random forest was grown on a training dataset built from an area where LiDAR-derived depressions were manually classified through a visual inspection and field verification process. Based on the geometry of depressions, as well as natural and human factors related to sinkholes, 11 parameters were selected as predictive variables to form the dataset. Because the training dataset was imbalanced with the majority of depressions being non-sinkholes, a weighted random forests method was used to improve the accuracy of predicting sinkholes. The weighted random forest achieved an average accuracy of 89.95% for the training dataset, demonstrating that the random forest can be an effective sinkhole classifier. Testing of the random forest in another area, however, resulted in moderate success with an average accuracy rate of 73.96%. This study suggests that an automatic sinkhole extraction procedure like the random forest classifier can significantly reduce time and labor costs and make it more tractable to map sinkholes using LiDAR data for large areas. However, the random forests method cannot totally replace manual procedures, such as visual inspection and field verification.
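
Weighting is a common remedy for class imbalance: errors on the rare class are made to count more during training. The abstract does not state which weighting scheme was used, so the sketch below shows one widely used heuristic (weights inversely proportional to class frequency) purely as an assumption; the class counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced training set: 10 sinkholes among 100 depressions.
labels = ["sinkhole"] * 10 + ["other"] * 90
weights = balanced_class_weights(labels)
print(weights)  # the rare class receives the larger weight
```

With these weights, misclassifying one sinkhole costs as much as misclassifying nine other depressions, so the forest is not rewarded for simply predicting the majority class.
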
NEMO: Extraction and normalization of organization names from PubMed affiliations.

PubMed

Jonnalagadda, Siddhartha Reddy; Topham, Philip

2010-10-04

Today, there are more than 18 million articles related to biomedical research indexed in MEDLINE, and information derived from them could be used effectively to save the great amount of time and resources spent by government agencies in understanding the scientific landscape, including key opinion leaders and centers of excellence. Associating biomedical articles with organization names could significantly benefit the pharmaceutical marketing industry, health care funding agencies and public health officials and be useful for other scientists in normalizing author names, automatically creating citations, indexing articles and identifying potential resources or collaborators. A large amount of extracted information helps in disambiguating organization names using machine-learning algorithms. We propose NEMO, a system for extracting organization names in the affiliation and normalizing them to a canonical organization name. Our parsing process involves multi-layered rule matching with multiple dictionaries. The system achieves more than 98% f-score in extracting organization names. Our normalization process involves clustering based on local sequence alignment metrics and local learning based on finding connected components. A high precision was also observed in normalization. NEMO is the missing link in associating each biomedical paper and its authors with an organization name in its canonical form and the geopolitical location of the organization. This research could potentially help in analyzing large social networks of organizations for landscaping a particular topic, improving performance of author disambiguation, adding weak links in the co-author network of authors, augmenting NLM's MARS system for correcting errors in OCR output of the affiliation field, and automatically indexing the PubMed citations with the normalized organization name and country. Our system is available as a graphical user interface for download along with this paper.
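
The normalization idea (a local sequence alignment score as the similarity metric, then connected components as clusters) can be sketched as follows. This is an illustration under stated assumptions, not NEMO's implementation: the Smith-Waterman scoring parameters, the 0.8 threshold, the length normalization, and the sample names are all hypothetical.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between two strings."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def similarity(a, b, match=2):
    """Alignment score scaled to [0, 1] by the best score the shorter string allows."""
    a, b = a.lower(), b.lower()
    shorter = min(len(a), len(b))
    return local_alignment_score(a, b, match=match) / (match * shorter) if shorter else 0.0


def connected_components(names, threshold=0.8):
    """Link names whose similarity reaches the threshold; return the clusters."""
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())


# Hypothetical affiliation strings.
names = ["Arizona State University", "Arizona State Univ", "Mayo Clinic", "Mayo Clinic Rochester"]
clusters = connected_components(names)
print(clusters)
```

Local alignment tolerates abbreviations and trailing qualifiers because only the best-matching substring contributes to the score; a canonical name could then be chosen per cluster, for example the most frequent variant.
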
Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction.

PubMed

Le, Hoang-Quynh; Tran, Mai-Vu; Dang, Thanh Hai; Ha, Quang-Thuy; Collier, Nigel

2016-07-01

The BioCreative V chemical-disease relation (CDR) track was proposed to accelerate the progress of text mining in facilitating integrative understanding of chemicals, diseases and their relations. In this article, we describe an extension of our system (namely UET-CAM) that participated in the BioCreative V CDR. The original UET-CAM system's performance was ranked fourth among 18 participating systems by the BioCreative CDR track committee. In the Disease Named Entity Recognition and Normalization (DNER) phase, our system employed joint inference (decoding) with a perceptron-based named entity recognizer (NER) and a back-off model with Semantic Supervised Indexing and Skip-gram for named entity normalization. In the chemical-induced disease (CID) relation extraction phase, we proposed a pipeline that includes a coreference resolution module and a Support Vector Machine relation extraction model. The former module utilized a multi-pass sieve to extend entity recall. In this article, the UET-CAM system was improved by adding a 'silver' CID corpus to train the prediction model. This silver standard corpus of more than 50 thousand sentences was automatically built based on the Comparative Toxicogenomics Database (CTD). We evaluated our method on the CDR test set. Results showed that our system could reach state-of-the-art performance with F1 of 82.44 for the DNER task and 58.90 for the CID task. Analysis demonstrated substantial benefits of both the multi-pass sieve coreference resolution method (F1 +4.13%) and the silver CID corpus (F1 +7.3%). Database URL: SilverCID, the silver-standard corpus for CID relation extraction, is freely available online at: https://zenodo.org/record/34530 (doi:10.5281/zenodo.34530). © The Author(s) 2016. Published by Oxford University Press.

Is synthetic biology mechanical biology?

PubMed

Holm, Sune

2015-12-01

A widespread and influential characterization of synthetic biology emphasizes that synthetic biology is the application of engineering principles to living systems. Furthermore, there is a strong tendency to express the engineering approach to organisms in terms of what seems to be an ontological claim: organisms are machines. In the paper I investigate the ontological and heuristic significance of the machine analogy in synthetic biology. I argue that the use of the machine analogy and the aim of producing rationally designed organisms does not necessarily imply a commitment to mechanical biology. The ideal of applying engineering principles to biology is best understood as expressing recognition of the machine-unlikeness of natural organisms and the limits of human cognition. The paper suggests an interpretation of the identification of organisms with machines in synthetic biology according to which it expresses a strategy for representing, understanding, and constructing living systems that are more machine-like than natural organisms.
Parameter optimization of electrochemical machining process using black hole algorithm

NASA Astrophysics Data System (ADS)

Singh, Dinesh; Shukla, Rajkamal

2017-12-01

Advanced machining processes are significant because the manufacturing industries require higher accuracy in machined components. Parameter optimization of machining processes gives optimum control to achieve the desired goals. In this paper, the performance of the electrochemical machining (ECM) process is evaluated using the black hole algorithm (BHA). BHA builds on the fundamental idea of black hole theory and has fewer operating parameters to tune. The two performance parameters, material removal rate (MRR) and overcut (OC), are considered separately to obtain optimum machining parameter settings using BHA. The variations of process parameters with respect to the performance parameters are reported for a better understanding of the considered process, using a single objective at a time. The results obtained using BHA are found to be better when compared with the results of other metaheuristic algorithms, such as genetic algorithm (GA), artificial bee colony (ABC) and bio-geography based optimization (BBO), attempted by previous researchers.
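
The black hole algorithm is commonly described as follows: the best candidate acts as the black hole, the other candidates (stars) drift toward it by a random fraction of the distance, and any star that crosses the event horizon is replaced by a fresh random star. The sketch below follows that common description for single-objective minimization. The objective is a hypothetical toy function, not an ECM process model, and the sketch assumes a non-negative objective so that the horizon radius is meaningful.

```python
import math
import random


def black_hole_minimize(f, bounds, n_stars=20, n_iter=100, seed=0):
    rng = random.Random(seed)

    def new_star():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    stars = [new_star() for _ in range(n_stars)]
    fit = [f(s) for s in stars]
    bh = min(range(n_stars), key=fit.__getitem__)  # index of the black hole

    for _ in range(n_iter):
        # Each star moves a random fraction of the way toward the black hole.
        for i in range(n_stars):
            if i == bh:
                continue
            stars[i] = [x + rng.random() * (b - x) for x, b in zip(stars[i], stars[bh])]
            fit[i] = f(stars[i])
            if fit[i] < fit[bh]:
                bh = i  # a better star becomes the new black hole
        # Event horizon radius: black hole fitness over total fitness.
        total = sum(fit)
        radius = fit[bh] / total if total > 0 else 0.0
        # Stars inside the horizon are swallowed and replaced by random ones.
        for i in range(n_stars):
            if i != bh and math.dist(stars[i], stars[bh]) < radius:
                stars[i] = new_star()
                fit[i] = f(stars[i])
                if fit[i] < fit[bh]:
                    bh = i
    return list(stars[bh]), fit[bh]


def toy_objective(x):
    """Hypothetical stand-in for a single performance measure; minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


BOUNDS = [(-5.0, 5.0), (-5.0, 5.0)]
best_x, best_f = black_hole_minimize(toy_objective, BOUNDS)
print(best_x, best_f)
```

Apart from the population size and iteration count, the scheme has no tuning parameters, which is the property the abstract highlights. A quantity to be maximized, such as MRR, would need to be recast as a non-negative cost before using this sketch.
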
Device-Free Localization via an Extreme Learning Machine with Parameterized Geometrical Feature Extraction.

PubMed

Zhang, Jie; Xiao, Wendong; Zhang, Sen; Huang, Shoudong

2017-04-17

Device-free localization (DFL) is becoming one of the new technologies in the wireless localization field, owing to its advantage that the target to be localized does not need to be attached to any electronic device. In the radio-frequency (RF) DFL system, radio transmitters (RTs) and radio receivers (RXs) are used to sense the target collaboratively, and the location of the target can be estimated by fusing the changes of the received signal strength (RSS) measurements associated with the wireless links. In this paper, we propose an extreme learning machine (ELM) approach for DFL to improve the efficiency and accuracy of the localization algorithm. Unlike conventional machine learning approaches for wireless localization, in which the above differential RSS measurements are trivially used as the only input features, we introduce a parameterized geometrical representation for an affected link, which consists of its geometrical intercepts and differential RSS measurement. Parameterized geometrical feature extraction (PGFE) is performed for the affected links, and the features are used as the inputs of the ELM. The proposed PGFE-ELM for DFL is trained in the offline phase and performs real-time localization in the online phase, where the estimated location of the target is obtained through the trained ELM. PGFE-ELM has the advantages that the affected links used by the ELM in the online phase can differ from those used for training in the offline phase, and that it is more robust to the uncertain combination of detectable wireless links. Experimental results show that the proposed PGFE-ELM can improve the localization accuracy and learning speed significantly compared with a number of the existing machine learning and DFL approaches, including the weighted K-nearest neighbor (WKNN), support vector machine (SVM), back propagation neural network (BPNN), as well as the well-known radio tomographic imaging (RTI) DFL approach.
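
An extreme learning machine keeps its hidden-layer weights random and fixed and fits only the output weights, which reduces training to a linear least-squares problem. The sketch below shows that core idea for a single output coordinate. The ridge term, the tanh activation, the tiny Gauss-Jordan solver, and the toy features are all assumptions for illustration; they are not the paper's PGFE features or configuration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def train_elm(X, y, n_hidden=6, ridge=1e-3, seed=0):
    rng = random.Random(seed)
    d = len(X[0])
    # Hidden weights and biases are drawn once at random and never trained.
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_hidden)]
    bias = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def hidden(x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(W, bias)]

    H = [hidden(x) for x in X]
    # Output weights from the regularized normal equations (H^T H + ridge I) beta = H^T y.
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    beta = solve(A, rhs)
    return lambda x: sum(b * h for b, h in zip(beta, hidden(x)))


# Hypothetical link features (offline phase) and one target coordinate.
features = [[0.1 * i, 0.05 * (i % 4)] for i in range(12)]
x_coords = [f[0] + 2.0 * f[1] for f in features]
predict_x = train_elm(features, x_coords)
print(predict_x(features[3]), x_coords[3])  # online phase: one cheap forward pass
```

Because only a small linear system is solved, training avoids the iterative weight updates of back propagation, which is consistent with the learning-speed advantage the abstract reports.
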
Device-Free Localization via an Extreme Learning Machine with Parameterized Geometrical Feature Extraction

PubMed Central

Zhang, Jie; Xiao, Wendong; Zhang, Sen; Huang, Shoudong

2017-01-01

Device-free localization (DFL) is becoming one of the new technologies in wireless localization field, due to its advantage that the target to be localized does not need to be attached to any electronic device. In the radio-frequency (RF) DFL system, radio transmitters (RTs) and radio receivers (RXs) are used to sense the target collaboratively, and the location of the target can be estimated by fusing the changes of the received signal strength (RSS) measurements associated with the wireless links. In this paper, we will propose an extreme learning machine (ELM) approach for DFL, to improve the efficiency and the accuracy of the localization algorithm. Different from the conventional machine learning approaches for wireless localization, in which the above differential RSS measurements are trivially used as the only input features, we introduce the parameterized geometrical representation for an affected link, which consists of its geometrical intercepts and differential RSS measurement. Parameterized geometrical feature extraction (PGFE) is performed for the affected links and the features are used as the inputs of ELM. The proposed PGFE-ELM for DFL is trained in the offline phase and performed for real-time localization in the online phase, where the estimated location of the target is obtained through the created ELM. PGFE-ELM has the advantages that the affected links used by ELM in the online phase can be different from those used for training in the offline phase, and can be more robust to deal with the uncertain combination of the detectable wireless links. Experimental results show that the proposed PGFE-ELM can improve the localization accuracy and learning speed significantly compared with a number of the existing machine learning and DFL approaches, including the weighted K-nearest neighbor (WKNN), support vector machine (SVM), back propagation neural network (BPNN), as well as the well-known radio tomographic imaging (RTI) DFL approach. PMID:28420187
A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine

NASA Astrophysics Data System (ADS)

Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing

2017-09-01

Dynamically monitoring the behavior of a program is widely used to discriminate between benign programs and malware. The judgment is usually based on dynamic characteristics of the program, such as its API call sequence or API call frequency. The key innovation of this paper is to consider the full Native API sequence and use a support vector machine to detect shellcode. We also use a Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper has high accuracy and low detection rate.
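One plausible reading of "use the Markov chain to extract and digitize" a call sequence is to estimate first-order transition probabilities between calls and flatten them into a fixed-length vector that a classifier such as an SVM can consume. The sketch below illustrates that idea only; the function name, the three-call vocabulary, and the trace are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def transition_features(calls, vocabulary):
    """Flatten first-order Markov transition probabilities into a feature vector.

    The vector has len(vocabulary) ** 2 entries, ordered as
    (vocabulary[0] -> vocabulary[0]), (vocabulary[0] -> vocabulary[1]), ...
    """
    pair_counts = Counter(zip(calls, calls[1:]))
    # Number of transitions that leave each call (the last call has no successor).
    out_counts = Counter(calls[:-1])
    features = []
    for src, dst in product(vocabulary, repeat=2):
        total = out_counts[src]
        features.append(pair_counts[(src, dst)] / total if total else 0.0)
    return features


# Hypothetical trace and vocabulary, for illustration only.
trace = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtClose", "NtOpenFile", "NtReadFile"]
vocab = ["NtOpenFile", "NtReadFile", "NtClose"]
vector = transition_features(trace, vocab)
print(vector)
```

Each row of the implied transition matrix sums to one when the source call occurs at least once with a successor, so traces of different lengths map to vectors on the same scale.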
A KARAOKE System Singing Evaluation Method that More Closely Matches Human Evaluation

NASA Astrophysics Data System (ADS)

Takeuchi, Hideyo; Hoguro, Masahiro; Umezaki, Taizo

KARAOKE is a popular amusement for old and young. Many KARAOKE machines have a singing evaluation function. However, it is often said that the scores given by KARAOKE machines do not match human evaluation. In this paper a KARAOKE scoring method strongly correlated with human evaluation is proposed. This paper proposes a way to evaluate songs based on the distance between singing pitch and musical scale, employing a vibrato extraction method based on template matching of spectrum. The results show that correlation coefficients between scores given by the proposed system and human evaluation range from -0.76 to -0.89.
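The "distance between singing pitch and musical scale" can be illustrated as a deviation in cents from the nearest note. The sketch below assumes 12-tone equal temperament with A4 = 440 Hz and a pitch track given in Hz with 0 marking unvoiced frames; these assumptions, the function names, and the omission of the vibrato extraction step are ours, not the paper's.

```python
import math


def cents_from_nearest_semitone(freq_hz, a4_hz=440.0):
    """Signed distance, in cents, from freq_hz to the nearest equal-tempered note."""
    semitones = 12.0 * math.log2(freq_hz / a4_hz)
    return 100.0 * (semitones - round(semitones))


def mean_abs_deviation(pitch_track_hz):
    """Average absolute deviation in cents over voiced frames (frequency > 0)."""
    voiced = [f for f in pitch_track_hz if f > 0]
    if not voiced:
        return 0.0
    return sum(abs(cents_from_nearest_semitone(f)) for f in voiced) / len(voiced)


# A frame sung a quarter of a semitone sharp of A4 is about 25 cents off.
sharp_a4 = 440.0 * 2 ** (0.25 / 12)
print(round(cents_from_nearest_semitone(sharp_a4), 3))
```

A larger mean deviation means less accurate pitch, which is consistent with the negative correlation between such a distance and human scores reported in the abstract.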
Implementation of support vector machine for classification of speech marked hijaiyah letters based on Mel frequency cepstrum coefficient feature extraction

NASA Astrophysics Data System (ADS)

Adhi Pradana, Wisnu; Adiwijaya; Novia Wisesty, Untari

2018-03-01

The support vector machine (SVM) is a method that can be used to classify data. An SVM separates data from two different classes with a hyperplane. In this study, a system was built using SVM for Arabic speech recognition. Two kinds of speakers were tested in developing the system: dependent speakers and independent speakers. The system achieved an accuracy of 85.32% for dependent speakers and 61.16% for independent speakers.
Understanding user intents in online health forums.

PubMed

Zhang, Thomas; Cho, Jason H D; Zhai, Chengxiang

2015-07-01

Online health forums provide a convenient way for patients to obtain medical information and connect with physicians and peers outside of clinical settings. However, large quantities of unstructured and diversified content generated on these forums make it difficult for users to digest and extract useful information. Understanding user intents would enable forums to find and recommend relevant information to users by filtering out threads that do not match particular intents. In this paper, we derive a taxonomy of intents to capture user information needs in online health forums and propose novel pattern-based features for use with a multiclass support vector machine (SVM) classifier to classify original thread posts according to their underlying intents. Since no dataset existed for this task, we employ three annotators to manually label a dataset of 1192 HealthBoards posts spanning four forum topics. Experimental results show that a SVM using pattern-based features is highly capable of identifying user intents in forum posts, reaching a maximum precision of 75%, and that a SVM-based hierarchical classifier using both pattern and word features outperforms its SVM counterpart that uses only word features. Furthermore, comparable classification performance can be achieved by training and testing on posts from different forum topics.
e-Learning Application for Machine Maintenance Process using Iterative Method in XYZ Company

NASA Astrophysics Data System (ADS)

Nurunisa, Suaidah; Kurniawati, Amelia; Pramuditya Soesanto, Rayinda; Yunan Kurnia Septo Hediyanto, Umar

2016-02-01

XYZ Company manufactures airplane parts; one of the machines categorized as a key facility in the company is the Millac 5H6P. As a key facility, the machine must be kept working well and in peak condition, so periodic maintenance is needed. Data gathering shows that maintenance staff lack the competency to maintain machine types other than those assigned to them by the supervisor, which indicates that knowledge is unevenly distributed among the maintenance staff. The purpose of this research is to create a knowledge-based e-learning application as a realization of the externalization process in knowledge transfer for machine maintenance. The application's features are adjusted for maintenance purposes using an e-learning framework for the maintenance process, and its content supports multimedia for learning purposes. QFD is used in this research to understand user needs. The application is built using Moodle, with an iterative method for the software development cycle and UML diagrams. The result of this research is an e-learning application that serves as a knowledge-sharing medium for maintenance staff in the company. Testing shows that the application makes it easy for maintenance staff to understand the competencies.
Histogram of gradient and binarized statistical image features of wavelet subband-based palmprint features extraction

NASA Astrophysics Data System (ADS)

Attallah, Bilal; Serir, Amina; Chahir, Youssef; Boudjelal, Abdelwahhab

2017-11-01

Palmprint recognition systems are dependent on feature extraction. A method of feature extraction using higher discrimination information was developed to characterize palmprint images. In this method, two individual feature extraction techniques are applied to a discrete wavelet transform of a palmprint image, and their outputs are fused. The two techniques used in the fusion are the histogram of gradient and the binarized statistical image features. They are then evaluated using an extreme learning machine classifier before selecting a feature based on principal component analysis. Three palmprint databases, the Hong Kong Polytechnic University (PolyU) Multispectral Palmprint Database, Hong Kong PolyU Palmprint Database II, and the Delhi Touchless (IIDT) Palmprint Database, are used in this study. The study shows that our method effectively identifies and verifies palmprints and outperforms other methods based on feature extraction.
Machine Learning and Radiology

PubMed Central

Wang, Shijun; Summers, Ronald M.

2012-01-01

In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077
Reproducing an Early-20th-Century Wave Machine

ERIC Educational Resources Information Center

Daffron, John A.; Greenslade, Thomas B., Jr.

2016-01-01

Physics students often have problems understanding waves. Over the years numerous mechanical devices have been devised to show the propagation of both transverse and longitudinal waves (Ref. 1). In this article an updated version of an early-20th-century transverse wave machine is discussed. The original, Fig. 1, is at Creighton University in…
Teaching about the U.S. Constitution through Metaphor: Government as a Machine.

ERIC Educational Resources Information Center

Mills, Randy K.

1988-01-01

Briefly reviews theories of brain hemisphere functions and draws implications for social studies instruction. Maintains that the metaphor aids the development of understanding because it connects right and left brain functions. Provides a learning activity based on the metaphor of the U.S. government functioning as a machine. (BSR)
Machine Learning Based Evaluation of Reading and Writing Difficulties.

PubMed

Iwabuchi, Mamoru; Hirabayashi, Rumi; Nakamura, Kenryu; Dim, Nem Khan

2017-01-01

The possibility of auto evaluation of reading and writing difficulties was investigated using non-parametric machine learning (ML) regression technique for URAWSS (Understanding Reading and Writing Skills of Schoolchildren) [1] test data of 168 children of grade 1 - 9. The result showed that the ML had better prediction than the ordinary rule-based decision.
Machine Translation in the German Classroom: Detection, Reaction, Prevention

ERIC Educational Resources Information Center

Steding, Soren

2009-01-01

There are many websites today that offer free machine translations and although beginning students of German are not always proficient enough to judge the quality of these translations or to fully understand certain translation results, they use these services nonetheless for their assignments. The problem for the educator is to distinguish…
Improved soldering iron tip

NASA Technical Reports Server (NTRS)

Vanasse, M. A.

1976-01-01

Nickel-plated device, with machined recesses matching the multipin pattern of particular circuit module, facilitates repairs to electronic systems and reduces chance of damage to adjacent components. Nickel-plating reduces oxidation and scaling. Recesses retain sufficient amount of molten solder to uniformly wet pins for simultaneous heating and extraction.
Accessible engineering drawings for visually impaired machine operators.

PubMed

Ramteke, Deepak; Kansal, Gayatri; Madhab, Benu

2014-01-01

An engineering drawing provides manufacturing information to a machine operator. An operator plans and executes machining operations based on this information. A visually impaired (VI) operator does not have direct access to the drawings. Drawing information is provided to them verbally or by using sample parts. Both methods have limitations that affect the quality of output. Use of engineering drawings is a standard practice for every industry; this hampers employment of a VI operator. Accessible engineering drawings are required to increase both the independence and the employability of VI operators. Today, Computer Aided Design (CAD) software is used for making engineering drawings, which are saved in CAD files. Required information is extracted from the CAD files and converted into Braille or voice. The authors of this article propose a method to make engineering drawing information directly accessible to a VI operator.
Machine learning for micro-tomography

NASA Astrophysics Data System (ADS)

Parkinson, Dilworth Y.; Pelt, Daniël M.; Perciano, Talita; Ushizima, Daniela; Krishnan, Harinarayan; Barnard, Harold S.; MacDowell, Alastair A.; Sethian, James

2017-09-01

Machine learning has revolutionized a number of fields, but many micro-tomography users have never used it for their work. The micro-tomography beamline at the Advanced Light Source (ALS), in collaboration with the Center for Applied Mathematics for Energy Research Applications (CAMERA) at Lawrence Berkeley National Laboratory, has now deployed a series of tools to automate data processing for ALS users using machine learning. This includes new reconstruction algorithms, feature extraction tools, and image classification and recommendation systems for scientific images. Some of these tools run either in automated pipelines that operate on data as it is collected or as stand-alone software. Others are deployed on computing resources at Berkeley Lab, from workstations to supercomputers, and made accessible to users through either scripting or easy-to-use graphical interfaces. This paper presents a progress report on this work.
Combined empirical mode decomposition and texture features for skin lesion classification using quadratic support vector machine.

PubMed

Wahba, Maram A; Ashour, Amira S; Napoleon, Sameh A; Abd Elnaby, Mustafa M; Guo, Yanhui

2017-12-01

Basal cell carcinoma is one of the most common malignant skin lesions. Automated lesion identification and classification using image processing techniques is needed to reduce diagnostic errors. In this study, a novel technique is applied to classify skin lesion images into two classes, namely the malignant basal cell carcinoma and the benign nevus. A hybrid combination of bi-dimensional empirical mode decomposition and gray-level difference method features is proposed after hair removal. The combined features are further classified using a quadratic support vector machine (Q-SVM). The proposed system achieved 100% accuracy, sensitivity and specificity, outperforming other support vector machine procedures as well as other extracted feature sets. Basal cell carcinoma is effectively classified using Q-SVM with the proposed combined features.
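For readers unfamiliar with the three figures quoted above, the following sketch shows how accuracy, sensitivity and specificity are derived from binary predictions, with the malignant class treated as positive. The labels are made-up illustrative values, not data from the study, and the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Illustrative labels only: 1 = malignant, 0 = benign.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(truth, preds)
print(metrics)
```

All three values reach 1.0 only when every prediction matches its label, which is the situation the abstract reports.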
Multispectral image analysis for object recognition and classification

NASA Astrophysics Data System (ADS)

Viau, C. R.; Payeur, P.; Cretu, A.-M.

2016-05-01

Computer and machine vision applications are used in numerous fields to analyze static and dynamic imagery in order to assist or automate decision-making processes. Advancements in sensor technologies now make it possible to capture and visualize imagery at various wavelengths (or bands) of the electromagnetic spectrum. Multispectral imaging has countless applications in various fields including (but not limited to) security, defense, space, medical, manufacturing and archeology. The development of advanced algorithms to process and extract salient information from the imagery is a critical component of the overall system performance. The fundamental objective of this research project was to investigate the benefits of combining imagery from the visual and thermal bands of the electromagnetic spectrum to improve the recognition rates and accuracy of commonly found objects in an office setting. A multispectral dataset (visual and thermal) was captured and features from the visual and thermal images were extracted and used to train support vector machine (SVM) classifiers. The SVM's class prediction ability was evaluated separately on the visual, thermal and multispectral testing datasets.

Langmuir probes for SPIDER (source for the production of ions of deuterium extracted from radio frequency plasma) experiment: Tests in BATMAN (Bavarian test machine for negative ions)

NASA Astrophysics Data System (ADS)

Brombin, M.; Spolaore, M.; Serianni, G.; Pomaro, N.; Taliercio, C.; Palma, M. Dalla; Pasqualotto, R.; Schiesko, L.

2014-11-01

A prototype system of the Langmuir probes for SPIDER (Source for the production of Ions of Deuterium Extracted from RF plasma) was manufactured and experimentally qualified. The diagnostic was operated in RF (Radio Frequency) plasmas with cesium evaporation on the BATMAN (BAvarian Test MAchine for Negative ions) test facility, which can provide plasma conditions as expected in the SPIDER source. An RF passive compensation circuit was realised to operate the Langmuir probes in RF plasmas. The sensors' holder, designed to better simulate the bias plate conditions in SPIDER, was exposed to a severe experimental campaign in BATMAN with cesium evaporation. No detrimental effect on the diagnostic due to cesium evaporation was found during the exposure to the BATMAN plasma and in particular the insulation of the electrodes was preserved. The paper presents the system prototype, the RF compensation circuit, the acquisition system (as foreseen in SPIDER), and the results obtained during the experimental campaigns.
Langmuir probes for SPIDER (Source for the production of Ions of Deuterium Extracted from Radio Frequency plasma) experiment: tests in BATMAN (BAvarian Test Machine for Negative ions).

PubMed

Brombin, M; Spolaore, M; Serianni, G; Pomaro, N; Taliercio, C; Dalla Palma, M; Pasqualotto, R; Schiesko, L

2014-11-01

A prototype system of the Langmuir probes for SPIDER (Source for the production of Ions of Deuterium Extracted from RF plasma) was manufactured and experimentally qualified. The diagnostic was operated in RF (Radio Frequency) plasmas with cesium evaporation on the BATMAN (BAvarian Test MAchine for Negative ions) test facility, which can provide plasma conditions as expected in the SPIDER source. An RF passive compensation circuit was realised to operate the Langmuir probes in RF plasmas. The sensors' holder, designed to better simulate the bias plate conditions in SPIDER, was exposed to a severe experimental campaign in BATMAN with cesium evaporation. No detrimental effect on the diagnostic due to cesium evaporation was found during the exposure to the BATMAN plasma and in particular the insulation of the electrodes was preserved. The paper presents the system prototype, the RF compensation circuit, the acquisition system (as foreseen in SPIDER), and the results obtained during the experimental campaigns.
A Feature Fusion Based Forecasting Model for Financial Time Series

PubMed Central

Guo, Zhiqiang; Wang, Huaiqing; Liu, Quan; Yang, Jie

2014-01-01

Predicting the stock market has become an increasingly interesting research area for both researchers and investors, and many prediction models have been proposed. In these models, feature selection techniques are used to pre-process the raw data and remove noise. In this paper, a prediction model is constructed to forecast stock market behavior with the aid of independent component analysis, canonical correlation analysis, and a support vector machine. First, two types of features are extracted from the historical closing prices and 39 technical variables obtained by independent component analysis. Second, a canonical correlation analysis method is utilized to combine the two types of features and extract intrinsic features to improve the performance of the prediction model. Finally, a support vector machine is applied to forecast the next day's closing price. The proposed model is applied to the Shanghai stock market index and the Dow Jones index, and experimental results show that the proposed model predicts better than two other similar models. PMID:24971455
Semi-supervised vibration-based classification and condition monitoring of compressors

NASA Astrophysics Data System (ADS)

Potočnik, Primož; Govekar, Edvard

2017-09-01

Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.
Age group classification and gender detection based on forced expiratory spirometry.

PubMed

Cosgun, Sema; Ozbek, I Yucel

2015-08-01

This paper investigates the utility of forced expiratory spirometry (FES) test with efficient machine learning algorithms for the purpose of gender detection and age group classification. The proposed method has three main stages: feature extraction, training of the models and detection. In the first stage, some features are extracted from volume-time curve and expiratory flow-volume loop obtained from FES test. In the second stage, the probabilistic models for each gender and age group are constructed by training Gaussian mixture models (GMMs) and Support vector machine (SVM) algorithm. In the final stage, the gender (or age group) of test subject is estimated by using the trained GMM (or SVM) model. Experiments have been evaluated on a large database from 4571 subjects. The experimental results show that average correct classification rate performance of both GMM and SVM methods based on the FES test is more than 99.3 % and 96.8 % for gender and age group classification, respectively.
Ranking support vector machine for multiple kernels output combination in protein-protein interaction extraction from biomedical literature.

PubMed

Yang, Zhihao; Lin, Yuan; Wu, Jiajin; Tang, Nan; Lin, Hongfei; Li, Yanpeng

2011-10-01

Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. However, the volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database curators to detect and curate protein interaction information manually. We present a multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, and graph and combines their output with Ranking support vector machine (SVM). Experimental evaluations show that the features in individual kernels are complementary and the kernel combined with Ranking SVM achieves better performance than those of the individual kernels, equal weight combination and optimal weight combination. Our approach can achieve state-of-the-art performance with respect to the comparable evaluations, with 64.88% F-score and 88.02% AUC on the AImed corpus. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Automatic sleep staging using multi-dimensional feature extraction and multi-kernel fuzzy support vector machine.

PubMed

Zhang, Yanjun; Zhang, Xiangmin; Liu, Wenhui; Luo, Yuxi; Yu, Enjia; Zou, Keju; Liu, Xiaoliang

2014-01-01

This paper employed the clinical Polysomnographic (PSG) data, mainly including all-night Electroencephalogram (EEG), Electrooculogram (EOG) and Electromyogram (EMG) signals of subjects, and adopted the American Academy of Sleep Medicine (AASM) clinical staging manual as standards to realize automatic sleep staging. Authors extracted eighteen different features of EEG, EOG and EMG in time domains and frequency domains to construct the vectors according to the existing literatures as well as clinical experience. By adopting sleep samples self-learning, the linear combination of weights and parameters of multiple kernels of the fuzzy support vector machine (FSVM) were learned and the multi-kernel FSVM (MK-FSVM) was constructed. The overall agreement between the experts' scores and the results presented was 82.53%. Compared with previous results, the accuracy of N1 was improved to some extent while the accuracies of other stages were approximate, which well reflected the sleep structure. The staging algorithm proposed in this paper is transparent, and worth further investigation.
Deep Learning in Label-free Cell Classification

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia

Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individual cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. In conclusion, this system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells.
Concentration of Trichloroethylene in Breast Milk and Household Water from Nogales, Arizona

PubMed Central

Beamer, Paloma I.; Luik, Catherine E.; Abrell, Leif; Campos, Swilma; Martínez, María Elena; Sáez, A. Eduardo

2013-01-01

The United States Environmental Protection Agency has identified quantification of trichloroethylene (TCE), an industrial solvent, in breast milk as a high priority need for risk assessment. Water and milk samples were collected from 20 households by a lactation consultant in Nogales, Arizona. Separate water samples (including tap, bottled and vending machine) were collected for all household uses: drinking, bathing, cooking, and laundry. A risk factor questionnaire was administered. Liquid-liquid extraction with diethyl ether was followed by GC-MS for TCE quantification in water. Breast milk underwent homogenization, lipid hydrolysis and centrifugation prior to extraction. The limit of detection was 1.5 ng/mL. TCE was detected in 7 of 20 mothers’ breast milk samples. The maximum concentration was 6 ng/mL. TCE concentration in breast milk was significantly correlated with the concentration in water used for bathing (ρ=0.59, p=0.008). Detection of TCE in breast milk was more likely if the infant had a body mass index <14 (RR=5.2, p=0.02). Based on average breast milk consumption, TCE intake for 5% of the infants may exceed the proposed US EPA Reference Dose. Results of this exploratory study warrant more in depth studies to understand risk of TCE exposures from breast milk intake. PMID:22827160
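The correlation quoted above is written as ρ, which we assume denotes Spearman's rank correlation; the abstract does not name the statistic. With 13 of 20 milk samples below detection, many values would be tied, so the sketch below assigns average ranks to ties before computing a Pearson correlation on the ranks. Function names are hypothetical and the numbers in the usage line are made up for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman rank correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values; zeros stand in for non-detects.
print(spearman_rho([0.0, 0.0, 1.2, 3.4, 6.0], [0.1, 0.0, 0.8, 2.0, 1.5]))
```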
Deep Learning in Label-free Cell Classification

DOE PAGES

Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia; ...

2016-03-15

Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individual cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. In conclusion, this system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells.
DeepNeuron: an open deep learning toolbox for neuron tracing.

PubMed

Zhou, Zhi; Kuo, Hsien-Chi; Peng, Hanchuan; Long, Fuhui

2018-06-06

Reconstructing three-dimensional (3D) morphology of neurons is essential for understanding brain structures and functions. Over the past decades, a number of neuron tracing tools including manual, semiautomatic, and fully automatic approaches have been developed to extract and analyze 3D neuronal structures. Nevertheless, most of them were developed based on coding certain rules to extract and connect structural components of a neuron, showing limited performance on complicated neuron morphology. Recently, deep learning has outperformed many other machine learning methods in a wide range of image analysis and computer vision tasks. Here we developed a new Open Source toolbox, DeepNeuron, which uses deep learning networks to learn features and rules from data and trace neuron morphology in light microscopy images. DeepNeuron provides a family of modules to solve basic yet challenging problems in neuron tracing. These problems include but are not limited to: (1) detecting neuron signal under different image conditions, (2) connecting neuronal signals into tree(s), (3) pruning and refining tree morphology, (4) quantifying the quality of morphology, and (5) classifying dendrites and axons in real time. We have tested DeepNeuron using light microscopy images including bright-field and confocal images of human and mouse brain, on which DeepNeuron demonstrates robustness and accuracy in neuron tracing.
Feature extraction using convolutional neural network for classifying breast density in mammographic images

NASA Astrophysics Data System (ADS)

Thomaz, Ricardo L.; Carneiro, Pedro C.; Patrocinio, Ana C.

2017-03-01

Breast cancer is the leading cause of death for women in most countries. The high levels of mortality relate mostly to late diagnosis and to the directly proportional relationship between breast density and breast cancer development. Therefore, the correct assessment of breast density is important to provide better screening for higher risk patients. However, in modern digital mammography the discrimination among breast densities is highly complex due to increased contrast and visual information for all densities. Thus, a computational system for classifying breast density might be a useful tool for aiding medical staff. Several machine-learning algorithms are already capable of classifying a small number of classes with good accuracy. However, the main constraint of machine-learning algorithms relates to the set of features extracted and used for classification. Although well-known feature extraction techniques might provide a good set of features, it is a complex task to select an initial set during design of a classifier. Thus, we propose feature extraction using a Convolutional Neural Network (CNN) for classifying breast density by a usual machine-learning classifier. We used 307 mammographic images downsampled to 260x200 pixels to train a CNN and extract features from a deep layer. After training, the activations of 8 neurons from a deep fully connected layer are extracted and used as features. Then, these features are fed forward to a single hidden layer neural network that is cross-validated using 10 folds to classify among four classes of breast density. The global accuracy of this method is 98.4%, presenting only 1.6% of misclassification. However, the small set of samples and memory constraints required the reuse of data in both CNN and MLP-NN, therefore overfitting might have influenced the results even though we cross-validated the network. Thus, although we presented a promising method for extracting features and classifying breast density, a greater database is still required for evaluating the results.
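The 10-fold cross-validation mentioned in the abstract above can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name `k_fold_indices`, the fixed seed, and the round-robin fold assignment are assumptions, and only the sample count (307) and fold count (10) come from the abstract.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: indices are shuffled once, then dealt
    round-robin into k folds; each fold serves as the test set once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 307 images and 10 folds, as stated in the abstract.
for fold_number, (train, test) in enumerate(k_fold_indices(307, 10), start=1):
    print(fold_number, len(train), len(test))
```

Note that this only keeps training and test indices disjoint within each fold. It does not address the concern raised in the abstract, where the same images were reused to train both the CNN and the downstream classifier.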
Comparison of water extraction methods in Tibet based on GF-1 data

NASA Astrophysics Data System (ADS)

Jia, Lingjun; Shang, Kun; Liu, Jing; Sun, Zhongqing

2018-03-01

In this study, we compared four different water extraction methods with GF-1 data according to different water types in Tibet, including Support Vector Machine (SVM), Principal Component Analysis (PCA), Decision Tree Classifier based on False Normalized Difference Water Index (FNDWI-DTC), and PCA-SVM. The results show that all four methods can extract large water bodies, but only SVM and PCA-SVM obtain satisfactory extraction results for small water bodies. The methods were evaluated by both overall accuracy (OAA) and Kappa coefficient (KC). The OAAs of PCA-SVM, SVM, FNDWI-DTC, and PCA are 96.68%, 94.23%, 93.99%, and 93.01%, and the KCs are 0.9308, 0.8995, 0.8962, and 0.8842, respectively, consistent with visual inspection. In summary, SVM is better for extracting narrow rivers and PCA-SVM is suitable for water extraction of various types. As for dark blue lakes, the methods using PCA can extract more quickly and accurately.
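For readers unfamiliar with the two evaluation metrics named above, the following sketch computes overall accuracy and Cohen's Kappa coefficient from a confusion matrix. The function name and the two-class water / non-water matrix are hypothetical and chosen only for illustration; the abstract does not report its confusion matrices.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of reference class i
    that were assigned to class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical matrix: rows = reference (water, non-water), columns = predicted.
example = [[45, 5],
           [10, 40]]
oa, kappa = overall_accuracy_and_kappa(example)
print(f"overall accuracy = {oa:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for every method in the abstract.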
Machine Learning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.

The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning based data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and Probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networks and Hidden Markov Models are introduced as an example of a widely used data driven classification/modeling strategy.
Phytochemical characterization, antimicrobial activity and reducing potential of seed oil, latex, machine oil and presscake of Jatropha curcas

PubMed Central

Sharma, Amit Kumar; Gangwar, Mayank; Kumar, Dharmendra; Nath, Gopal; Kumar Sinha, Akhoury Sudhir; Tripathi, Yamini Bhushan

2016-01-01

Objective: This study aims to evaluate the antimicrobial activity, phytochemical studies and thin layer chromatography analysis of machine oil, hexane extract of seed oil and methanol extract of presscake & latex of Jatropha curcas Linn (family Euphorbiaceae). Materials and Methods: J. curcas extracts were subjected to preliminary qualitative phytochemical screening to detect the major phytochemicals followed by its reducing power and content of phenol and flavonoids in different fractions. Thin layer chromatography was also performed using different solvent systems for the analysis of a number of constituents in the plant extracts. Antimicrobial activity was evaluated by the disc diffusion method, while the minimum inhibitory concentration, minimum bactericidal concentration and minimum fungicidal concentration were calculated by micro dilution method. Results: The methanolic fraction of latex and cake exhibited marked antifungal and antibacterial activities against Gram-positive and Gram-negative bacteria. Phytochemical analysis revealed the presence of alkaloids, saponins, tannins, terpenoids, steroids, glycosides, phenols and flavonoids. Reducing power showed dose dependent increase in concentration compared to standard Quercetin. Furthermore, this study recommended the isolation and separation of bioactive compounds responsible for the antibacterial activity which would be done by using different chromatographic methods such as high-performance liquid chromatography (HPLC), GC-MS etc. Conclusion: The results of the above study suggest that all parts of the plants possess potent antibacterial activity. Hence, it is important to isolate the active principles for further testing of antimicrobial and other biological efficacy. PMID:27516977
Phytochemical characterization, antimicrobial activity and reducing potential of seed oil, latex, machine oil and presscake of Jatropha curcas.

PubMed

Sharma, Amit Kumar; Gangwar, Mayank; Kumar, Dharmendra; Nath, Gopal; Kumar Sinha, Akhoury Sudhir; Tripathi, Yamini Bhushan

2016-01-01

This study aims to evaluate the antimicrobial activity, phytochemical studies and thin layer chromatography analysis of machine oil, hexane extract of seed oil and methanol extract of presscake & latex of Jatropha curcas Linn (family Euphorbiaceae). J. curcas extracts were subjected to preliminary qualitative phytochemical screening to detect the major phytochemicals followed by its reducing power and content of phenol and flavonoids in different fractions. Thin layer chromatography was also performed using different solvent systems for the analysis of a number of constituents in the plant extracts. Antimicrobial activity was evaluated by the disc diffusion method, while the minimum inhibitory concentration, minimum bactericidal concentration and minimum fungicidal concentration were calculated by micro dilution method. The methanolic fraction of latex and cake exhibited marked antifungal and antibacterial activities against Gram-positive and Gram-negative bacteria. Phytochemical analysis revealed the presence of alkaloids, saponins, tannins, terpenoids, steroids, glycosides, phenols and flavonoids. Reducing power showed dose dependent increase in concentration compared to standard Quercetin. Furthermore, this study recommended the isolation and separation of bioactive compounds responsible for the antibacterial activity which would be done by using different chromatographic methods such as high-performance liquid chromatography (HPLC), GC-MS etc. The results of the above study suggest that all parts of the plants possess potent antibacterial activity. Hence, it is important to isolate the active principles for further testing of antimicrobial and other biological efficacy.
Feature generation using genetic programming with application to fault classification.

PubMed

Guo, Hong; Jack, Lindsay B; Nandi, Asoke K

2005-02-01

One of the major challenges in pattern recognition problems is the feature extraction process which derives new features from existing features, or directly from raw data in order to reduce the cost of computation during the classification process, while improving classifier efficiency. Most current feature extraction techniques transform the original pattern vector into a new vector with increased discrimination capability but lower dimensionality. This is conducted within a predefined feature space, and thus, has limited searching power. Genetic programming (GP) can generate new features from the original dataset without prior knowledge of the probabilistic distribution. In this paper, a GP-based approach is developed for feature extraction from raw vibration data recorded from a rotating machine with six different conditions. The created features are then used as the inputs to a neural classifier for the identification of six bearing conditions. Experimental results demonstrate the ability of GP to discover automatically the different bearing conditions using features expressed in the form of nonlinear functions. Furthermore, four sets of results--using GP extracted features with artificial neural networks (ANN) and support vector machines (SVM), as well as traditional features with ANN and SVM--have been obtained. This GP-based approach is used for bearing fault classification for the first time and exhibits superior searching power over other techniques. Additionally, it significantly reduces the time for computation compared with genetic algorithm (GA), and therefore makes for a more practical realization of the solution.
Realtime automatic metal extraction of medical x-ray images for contrast improvement

NASA Astrophysics Data System (ADS)

Prangl, Martin; Hellwagner, Hermann; Spielvogel, Christian; Bischof, Horst; Szkaliczki, Tibor

2006-03-01

This paper focuses on an approach for real-time metal extraction from x-ray images taken by modern x-ray machines like C-arms. Such machines are used for vessel diagnostics, surgical interventions, as well as cardiology, neurology and orthopedic examinations. They are very fast in taking images from different angles. For this reason, manual adjustment of contrast is infeasible and automatic adjustment algorithms have been applied to try to select the optimal radiation dose for contrast adjustment. Problems occur when metallic objects, e.g., a prosthesis or a screw, are in the absorption area of interest. In this case, the automatic adjustment mostly fails because the dark, metallic objects lead the algorithm to overdose the x-ray tube. This outshining effect results in overexposed images and bad contrast. To overcome this limitation, metallic objects have to be detected and extracted from images that are taken as input for the adjustment algorithm. In this paper, we present a real-time solution for extracting metallic objects from x-ray images. We will explore the characteristic features of metallic objects in x-ray images and their distinction from bone fragments, which form the basis for successful object segmentation and classification. Subsequently, we will present our edge based real-time approach for successful and fast automatic segmentation and classification of metallic objects. Finally, experimental results on the effectiveness and performance of our approach based on a vast amount of input image data sets will be presented.
Textractor: a hybrid system for medications and reason for their prescription extraction from clinical text documents.

PubMed

Meystre, Stéphane M; Thibault, Julien; Shen, Shuying; Hurdle, John F; South, Brett R

2010-01-01

OBJECTIVE To describe a new medication information extraction system, Textractor, developed for the 'i2b2 medication extraction challenge'. The development, functionalities, and official evaluation of the system are detailed. Textractor is based on the Apache Unstructured Information Management Architecture (UIMA) framework, and uses methods that are a hybrid between machine learning and pattern matching. Two modules in the system are based on machine learning algorithms, while other modules use regular expressions, rules, and dictionaries, and one module embeds MetaMap Transfer. The official evaluation was based on a reference standard of 251 discharge summaries annotated by all teams participating in the challenge. The metrics used were recall, precision, and the F(1)-measure. They were calculated with exact and inexact matches, and were averaged at the level of systems and documents. The reference metric for this challenge, the system-level overall F(1)-measure, reached about 77% for exact matches, with a recall of 72% and a precision of 83%. Performance was the best with route information (F(1)-measure about 86%), and was good for dosage and frequency information, with F(1)-measures of about 82-85%. Results were not as good for durations, with F(1)-measures of 36-39%, and for reasons, with F(1)-measures of 24-27%. The official evaluation of Textractor for the i2b2 medication extraction challenge demonstrated satisfactory performance. This system was among the 10 best performing systems in this challenge.
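As a quick consistency check on the figures quoted above, the F(1)-measure is the harmonic mean of precision and recall. The sketch below uses the standard definition; the function name `f1_measure` is an assumption, and the only inputs are the rounded precision (83%) and recall (72%) stated in the abstract.

```python
def f1_measure(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded system-level values for exact matches, as reported in the abstract.
f1 = f1_measure(precision=0.83, recall=0.72)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.771"
```

The result agrees with the reported overall F(1)-measure of about 77%.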
Information extraction with object based support vector machines and vegetation indices

NASA Astrophysics Data System (ADS)

Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun

2016-07-01

Information extraction from remote sensing data is important for policy and decision makers, as the extracted information provides base layers for many real-world applications. Classification of remotely sensed data is one of the most common methods of extracting information; however, it is still a challenging issue because several factors affect the accuracy of the classification. Resolution of the imagery, number and homogeneity of land cover classes, purity of training data and characteristics of the adopted classifiers are just some of these challenging factors. Object based image classification has some advantages over pixel based classification for high resolution images since it uses geometry and structure information besides spectral information. Vegetation indices are also commonly used in the classification process since they provide additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object based Support Vector Machines were implemented for the classification of crop types for the study area located in the Aegean region of Turkey. Results demonstrated that the incorporation of NDRE increases the overall classification accuracy from 79.96% to 86.80%, whereas NDVI decreases it from 79.96% to 78.90%. Moreover, it is shown that object based classification with RapidEye data gives promising results for crop type mapping and analysis.
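The two vegetation indices compared above share the same normalized-difference form and differ only in which band is paired with the near-infrared (NIR) band. The sketch below uses the commonly cited definitions, NDVI = (NIR - Red) / (NIR + Red) and NDRE = (NIR - RedEdge) / (NIR + RedEdge). The function names and reflectance values are hypothetical, and the abstract does not state how the indices were fed into the classifier.

```python
def normalized_difference(band_a, band_b):
    """Generic normalized difference (a - b) / (a + b); 0.0 if both bands are zero."""
    denominator = band_a + band_b
    if denominator == 0:
        return 0.0
    return (band_a - band_b) / denominator


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized Difference Red Edge Index."""
    return normalized_difference(nir, red_edge)


# Hypothetical surface reflectances for one vegetated pixel.
pixel = {"red": 0.10, "red_edge": 0.30, "nir": 0.50}
print(f"NDVI = {ndvi(pixel['nir'], pixel['red']):.3f}")
print(f"NDRE = {ndre(pixel['nir'], pixel['red_edge']):.3f}")
```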

Machine learning to parse breast pathology reports in Chinese.

PubMed

Tang, Rong; Ouyang, Lizhi; Li, Clara; He, Yue; Griffin, Molly; Taghian, Alphonse; Smith, Barbara; Yala, Adam; Barzilay, Regina; Hughes, Kevin

2018-06-01

Large structured databases of pathology findings are valuable in deriving new clinical insights. However, they are labor intensive to create and generally require manual annotation. There has been some work in the bioinformatics community to support automating this work via machine learning in English. Our contribution is to provide an automated approach to construct such structured databases in Chinese, and to set the stage for extraction from other languages. We collected 2104 de-identified Chinese benign and malignant breast pathology reports from Hunan Cancer Hospital. Physicians with native Chinese proficiency reviewed the reports and annotated a variety of binary and numerical pathologic entities. After excluding 78 cases with a bilateral lesion in the same report, 1216 cases were used as a training set for the algorithm, which was then refined by 405 development cases. The natural language processing algorithm was tested by using the remaining 405 cases to evaluate the machine learning outcome. The model was used to extract 13 binary entities and 8 numerical entities. When compared to physicians with native Chinese proficiency, the model showed a per-entity accuracy from 91 to 100% for all common diagnoses on the test set. The overall accuracy of binary entities was 98% and of numerical entities was 95%. In a per-report evaluation for binary entities with more than 100 training cases, 85% of all the testing reports were completely correct and 11% had an error in 1 out of 22 entities. We have demonstrated that Chinese breast pathology reports can be automatically parsed into structured data using standard machine learning approaches. The results of our study demonstrate that techniques effective in parsing English reports can be scaled to other languages.
Real-time sensing of lint quality

USDA-ARS's Scientific Manuscript database

Modern cotton gins have the purpose of extracting lint (the cotton) from trash and seeds- usually the sticks, leaves and burrs that are entrained with the cotton. These modern gins include many individual machine components that are operated sequentially to form the gin processing line. Recent on-line...
Type 2 Diabetes Screening Test by Means of a Pulse Oximeter.

PubMed

Moreno, Enrique Monte; Lujan, Maria Jose Anyo; Rusinol, Montse Torrres; Fernandez, Paqui Juarez; Manrique, Pilar Nunez; Trivino, Cristina Aragon; Miquel, Magda Pedrosa; Rodriguez, Marife Alvarez; Burguillos, M Jose Gonzalez

2017-02-01

In this paper, we propose a method for screening for the presence of type 2 diabetes by means of the signal obtained from a pulse oximeter. The screening system consists of two parts: the first analyzes the signal obtained from the pulse oximeter, and the second consists of a machine-learning module. The system consists of a front end that extracts a set of features from the pulse oximeter signal. These features are based on physiological considerations. The set of features was the input of a machine-learning algorithm that determined the class of the input sample, i.e., whether the subject had diabetes or not. The machine-learning algorithms were random forests, gradient boosting, and linear discriminant analysis as benchmark. The system was tested on a database of [Formula: see text] subjects (two samples per subject) collected from five community health centers. The mean receiver operating characteristic area found was [Formula: see text]% (median value [Formula: see text]% and range [Formula: see text]%), with a specificity = [Formula: see text]% for a threshold that gave a sensitivity = [Formula: see text]%. We present a screening method for detecting diabetes that has a performance comparable to the glycated haemoglobin (haemoglobin A1c HbA1c) test, does not require blood extraction, and yields results in less than 5 min.
Machine-learning-based diagnosis of schizophrenia using combined sensor-level and source-level EEG features.

PubMed

Shim, Miseon; Hwang, Han-Jeong; Kim, Do-Won; Lee, Seung-Hwan; Im, Chang-Hwan

2016-10-01

Recently, an increasing number of researchers have endeavored to develop practical tools for diagnosing patients with schizophrenia using machine learning techniques applied to EEG biomarkers. Although a number of studies showed that source-level EEG features can potentially be applied to the differential diagnosis of schizophrenia, most studies have used only sensor-level EEG features such as ERP peak amplitude and power spectrum for machine learning-based diagnosis of schizophrenia. In this study, we used both sensor-level and source-level features extracted from EEG signals recorded during an auditory oddball task for the classification of patients with schizophrenia and healthy controls. EEG signals were recorded from 34 patients with schizophrenia and 34 healthy controls while each subject was asked to attend to oddball tones. Our results demonstrated higher classification accuracy when source-level features were used together with sensor-level features, compared to when only sensor-level features were used. In addition, the selected sensor-level features were mostly found in the frontal area, and the selected source-level features were mostly extracted from the temporal area, which coincide well with the well-known pathological region of cognitive processing in patients with schizophrenia. Our results suggest that our approach would be a promising tool for the computer-aided diagnosis of schizophrenia. Copyright © 2016 Elsevier B.V. All rights reserved.
Application of Multi-task Sparse Lasso Feature Extraction and Support Vector Machine Regression in the Stellar Atmospheric Parameterization

NASA Astrophysics Data System (ADS)

Gao, Wei; Li, Xiang-ru

2017-07-01

Multi-task learning analyzes multiple tasks jointly so as to uncover the correlations among them and thereby improve the accuracy of the results. This kind of method has been widely applied in machine learning, pattern recognition, computer vision, and other related fields. This paper investigates the application of multi-task learning in estimating the stellar atmospheric parameters, including the surface temperature (Teff), surface gravitational acceleration (lg g), and chemical abundance ([Fe/H]). Firstly, the spectral features of the three stellar atmospheric parameters are extracted by using the multi-task sparse group Lasso algorithm, then the support vector machine is used to estimate the atmospheric physical parameters. The proposed scheme is evaluated on both the Sloan stellar spectra and the theoretical spectra computed from Kurucz's New Opacity Distribution Function (NEWODF) model. The mean absolute errors (MAEs) on the Sloan spectra are: 0.0064 for lg (Teff /K), 0.1622 for lg (g/(cm · s⁻²)), and 0.1221 dex for [Fe/H]; the MAEs on the synthetic spectra are 0.0006 for lg (Teff /K), 0.0098 for lg (g/(cm · s⁻²)), and 0.0082 dex for [Fe/H]. Experimental results show that the proposed scheme has a rather high accuracy for the estimation of stellar atmospheric parameters.
A method for the evaluation of image quality according to the recognition effectiveness of objects in the optical remote sensing image using machine learning algorithm.

PubMed

Yuan, Tao; Zheng, Xinqi; Hu, Xuan; Zhou, Wei; Wang, Wei

2014-01-01

Objective and effective image quality assessment (IQA) is directly related to the application of optical remote sensing images (ORSI). In this study, a new IQA method of standardizing the target object recognition rate (ORR) is presented to reflect quality. First, several quality degradation treatments with high-resolution ORSIs are implemented to model the ORSIs obtained in different imaging conditions; then, a machine learning algorithm is adopted for recognition experiments on a chosen target object to obtain ORRs; finally, a comparison with commonly used IQA indicators was performed to reveal their applicability and limitations. The results showed that the ORR of the original ORSI was calculated to be up to 81.95%, whereas the ORR ratios of the quality-degraded images to the original images were 65.52%, 64.58%, 71.21%, and 73.11%. The results show that these data can more accurately reflect the advantages and disadvantages of different images in object identification and information extraction when compared with conventional digital image assessment indexes. By recognizing the difference in image quality from the application effect perspective, using a machine learning algorithm to extract regional gray scale features of typical objects in the image for analysis, and quantitatively assessing quality of ORSI according to the difference, this method provides a new approach for objective ORSI assessment.
Automatic firearm class identification from cartridge cases

NASA Astrophysics Data System (ADS)

Kamalakannan, Sridharan; Mann, Christopher J.; Bingham, Philip R.; Karnowski, Thomas P.; Gleason, Shaun S.

2011-03-01

We present a machine vision system for automatic identification of the class of firearms by extracting and analyzing two significant properties from spent cartridge cases, namely the Firing Pin Impression (FPI) and the Firing Pin Aperture Outline (FPAO). Within the framework of the proposed machine vision system, a white light interferometer is employed to image the head of the spent cartridge cases. As a first step of the algorithmic procedure, the Primer Surface Area (PSA) is detected using a circular Hough transform. Once the PSA is detected, a customized statistical region-based parametric active contour model is initialized around the center of the PSA and evolved to segment the FPI. Subsequently, the scaled version of the segmented FPI is used to initialize a customized Mumford-Shah based level set model in order to segment the FPAO. Once the shapes of FPI and FPAO are extracted, a shape-based level set method is used in order to compare these extracted shapes to an annotated dataset of FPIs and FPAOs from varied firearm types. A total of 74 cartridge case images non-uniformly distributed over five different firearms are processed using the aforementioned scheme, and the promising nature of the results (95% classification accuracy) demonstrates the efficacy of the proposed approach.
Building a protein name dictionary from full text: a machine learning term extraction approach.

PubMed

Shi, Lei; Campagne, Fabien

2005-04-07

The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt.
Feature Extraction and Classification of EHG between Pregnancy and Labour Group Using Hilbert-Huang Transform and Extreme Learning Machine.

PubMed

Chen, Lili; Hao, Yaru

2017-01-01

Preterm birth (PTB) is the leading cause of perinatal mortality and long-term morbidity, which results in significant health and economic problems. The early detection of PTB has great significance for its prevention. The electrohysterogram (EHG) related to uterine contraction is a noninvasive, real-time, and automatic novel technology which can be used to detect, diagnose, or predict PTB. This paper presents a method for feature extraction and classification of EHG between pregnancy and labour group, based on Hilbert-Huang transform (HHT) and extreme learning machine (ELM). For each sample, each channel was decomposed into a set of intrinsic mode functions (IMFs) using empirical mode decomposition (EMD). Then, the Hilbert transform was applied to IMF to obtain analytic function. The maximum amplitude of analytic function was extracted as feature. The identification model was constructed based on ELM. Experimental results reveal that the best classification performance of the proposed method can reach an accuracy of 88.00%, a sensitivity of 91.30%, and a specificity of 85.19%. The area under receiver operating characteristic (ROC) curve is 0.88. Finally, experimental results indicate that the method developed in this work could be effective in the classification of EHG between pregnancy and labour group.
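The accuracy, sensitivity, and specificity quoted above are all derived from a binary confusion matrix. The sketch below shows the standard formulas. The counts are hypothetical: the abstract does not report its confusion matrix or say which group was treated as the positive class, and these counts were chosen only because they happen to reproduce the reported percentages.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts (not taken from the paper): 23 positives, 27 negatives.
accuracy, sensitivity, specificity = binary_metrics(tp=21, fn=2, tn=23, fp=4)
print(f"accuracy    = {accuracy:.2%}")
print(f"sensitivity = {sensitivity:.2%}")
print(f"specificity = {specificity:.2%}")
```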
Building a protein name dictionary from full text: a machine learning term extraction approach

PubMed Central

Shi, Lei; Campagne, Fabien

2005-01-01

Background The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. Results We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. Conclusion This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt. PMID:15817129
A Deep Learning Approach for Fault Diagnosis of Induction Motors in Manufacturing

NASA Astrophysics Data System (ADS)

Shao, Si-Yu; Sun, Wen-Jun; Yan, Ru-Qiang; Wang, Peng; Gao, Robert X.

2017-11-01

Extracting features from original signals is a key procedure for traditional fault diagnosis of induction motors, as it directly influences the performance of fault recognition. However, high quality features need expert knowledge and human intervention. In this paper, a deep learning approach based on deep belief networks (DBN) is developed to learn features from frequency distribution of vibration signals with the purpose of characterizing working status of induction motors. It combines feature extraction procedure with classification task together to achieve automated and intelligent fault diagnosis. The DBN model is built by stacking multiple-units of restricted Boltzmann machine (RBM), and is trained using layer-by-layer pre-training algorithm. Compared with traditional diagnostic approaches where feature extraction is needed, the presented approach has the ability of learning hierarchical representations, which are suitable for fault classification, directly from frequency distribution of the measurement data. The structure of the DBN model is investigated as the scale and depth of the DBN architecture directly affect its classification performance. Experimental study conducted on a machine fault simulator verifies the effectiveness of the deep learning approach for fault diagnosis of induction motors. This research proposes an intelligent diagnosis method for induction motor which utilizes deep learning model to automatically learn features from sensor data and realize working status recognition.
Enhancing understanding and improving prediction of severe weather through spatiotemporal relational learning.

PubMed

McGovern, Amy; Gagne, David J; Williams, John K; Brown, Rodger A; Basara, Jeffrey B

Severe weather, including tornadoes, thunderstorms, wind, and hail, annually causes significant loss of life and property. We are developing spatiotemporal machine learning techniques that will enable meteorologists to improve the prediction of these events by improving their understanding of the fundamental causes of the phenomena and by building skillful empirical predictive models. In this paper, we present significant enhancements of our Spatiotemporal Relational Probability Trees that enable autonomous discovery of spatiotemporal relationships as well as learning with arbitrary shapes. We focus our evaluation on two real-world case studies using our technique: predicting tornadoes in Oklahoma and predicting aircraft turbulence in the United States. We also discuss how to evaluate success for a machine learning algorithm in the severe weather domain, which will enable new methods such as ours to transfer from research to operations, provide a set of lessons learned for embedded machine learning applications, and discuss how to field our technique.
Must We Embody Context?

PubMed

Hahn, Barbara

The essays in this forum brace this meditation on the historiography of technology. Understanding devices incorporates the context of any particular hardware, as John Staudenmaier showed by quantifying the contents of the first decades of Technology and Culture. As contextualist approaches have widened from systems theory through social construction and into the assemblages of actor-network theory, the discipline has kept artifacts at the analytical center: it is the history of technology that scholars seek to understand. Even recognizing that the machine only embodies the technology, the discipline has long sought to explain the machine. These essays invite consideration of how the history of technology might apply to non-corporeal things-methods as well as machines, and all the worldly phenomena that function in technological ways even without physicality. Materiality is financial as well as corporeal, the history of capitalism reminds us, and this essay urges scholars to apply history-of-technology approaches more broadly.
A comparative study of surface EMG classification by fuzzy relevance vector machine and fuzzy support vector machine.

PubMed

Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei

2015-02-01

We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance in classifying multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers from possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely AR model coefficients and root mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for a broad comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample size. FRVM yielded comparable classification accuracy with dramatically fewer support vectors than FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst the training time of FSVM was much shorter than that of FRVM. The results indicate that an FRVM classifier trained using sufficient samples can achieve generalization capability comparable to FSVM with significant sparsity in multi-channel sEMG classification, which makes it more suitable for sEMG-based real-time control applications.
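The root mean square (RMS) part of the AR-RMS feature set named above can be sketched in a few lines. This minimal example computes RMS over consecutive non-overlapping windows; the window size, the toy samples, and the function names are assumptions for illustration, and the AR coefficients and the FRVM classifier are not covered.

```python
import math

def rms(window):
    """Root mean square of one window of samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))

def windowed_rms(samples, window_size):
    """RMS of consecutive non-overlapping windows; a trailing partial window is dropped."""
    return [rms(samples[i:i + window_size])
            for i in range(0, len(samples) - window_size + 1, window_size)]

# Toy samples standing in for one sEMG channel (not recorded data).
toy_semg = [3.0, -4.0, 3.0, -4.0, 0.0, 0.0, 0.0, 0.0]
features = windowed_rms(toy_semg, window_size=4)
```

Each window yields one scalar, so a multi-channel recording would produce one RMS value per channel per window.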
Label-free sensor for automatic identification of erythrocytes using digital in-line holographic microscopy and machine learning.

PubMed

Go, Taesik; Byeon, Hyeokjun; Lee, Sang Joon

2018-04-30

Cell types of erythrocytes should be identified because they are closely related to their functionality and viability. Conventional methods for classifying erythrocytes are time consuming and labor intensive. Therefore, an automatic and accurate erythrocyte classification system is indispensable in healthcare and biomedical fields. In this study, we proposed a new label-free sensor for automatic identification of erythrocyte cell types using digital in-line holographic microscopy (DIHM) combined with machine learning algorithms. A total of 12 features, including information on intensity distributions, morphological descriptors, and optical focusing characteristics, are quantitatively obtained from numerically reconstructed holographic images. All individual features for discocytes, echinocytes, and spherocytes are statistically different. To improve the performance of cell type identification, we adopted several machine learning algorithms: a decision tree model, support vector machine, linear discriminant classification, and k-nearest neighbor classification. With the aid of these machine learning algorithms, the extracted features are effectively utilized to distinguish erythrocytes. Among the four tested algorithms, the decision tree model exhibits the best identification performance for the training sets (n = 440, 98.18%) and test sets (n = 190, 97.37%). The proposed methodology, which combines DIHM and machine learning, would be helpful for sensing abnormal erythrocytes and for computer-aided diagnosis of hematological diseases in the clinic.
Opinion mining on book review using CNN-L2-SVM algorithm

NASA Astrophysics Data System (ADS)

Rozi, M. F.; Mukhlash, I.; Soetrisno; Kimura, M.

2018-03-01

A product review can reflect the quality of the product itself, and extracting information from the review reveals the sentiment of the opinion it expresses. The process of extracting useful information from user reviews is called opinion mining. Deep learning models are increasingly used for review extraction and have been used by many researchers to obtain excellent performance in natural language processing. In this research, a deep learning model, the Convolutional Neural Network (CNN), is used for feature extraction, and an L2 Support Vector Machine (SVM) is used as the classifier. These methods are applied to determine the sentiment of book review data. The method shows state-of-the-art performance, with 83.23% in the training phase and 64.6% in the testing phase.
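To make the classifier stage concrete, here is a minimal sketch of an L2-SVM objective, assuming that "L2 SVM" refers to the squared-hinge (L2-loss) formulation, which the abstract does not spell out. The feature vectors below are toy stand-ins for CNN outputs, and the function name, the binary ±1 labels, and the parameter `c` are illustrative assumptions.

```python
def l2_svm_loss(weights, bias, features, labels, c=1.0):
    """Squared-hinge objective: 0.5*||w||^2 + C * sum(max(0, 1 - y*(w.x + b))^2)."""
    reg = 0.5 * sum(w * w for w in weights)
    penalty = 0.0
    for x, y in zip(features, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        margin = max(0.0, 1.0 - y * score)
        penalty += margin * margin  # squared hinge, unlike the standard L1 hinge
    return reg + c * penalty

# Toy feature vectors standing in for CNN outputs; labels are +1 / -1 (assumed binary case).
toy_features = [[2.0, 0.0], [0.0, 0.5]]
toy_labels = [1, -1]
loss = l2_svm_loss([1.0, 0.0], 0.0, toy_features, toy_labels)
```

The first sample sits outside the margin and contributes nothing; the second lies on the decision boundary and contributes a squared margin violation of 1.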
Classification and pose estimation of objects using nonlinear features

NASA Astrophysics Data System (ADS)

Talukder, Ashit; Casasent, David P.

1998-03-01

A new nonlinear feature extraction method called the maximum representation and discrimination feature (MRDF) method is presented for extraction of features from input image data. It implements transformations similar to the Sigma-Pi neural network. However, the weights of the MRDF are obtained in closed form, and offer advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We show its use in estimating the class and pose of images of real objects and rendered solid CAD models of machine parts from single views using a feature-space trajectory (FST) neural network classifier. We show more accurate classification and pose estimation results than are achieved by standard principal component analysis (PCA) and Fukunaga-Koontz (FK) feature extraction methods.
Gott Time Machines, BTZ Black Hole Formation, and Choptuik Scaling

NASA Astrophysics Data System (ADS)

Birmingham, Danny; Sen, Siddhartha

2000-02-01

We study the formation of Bañados-Teitelboim-Zanelli black holes by the collision of point particles. It is shown that the Gott time machine, originally constructed for the case of vanishing cosmological constant, provides a precise mechanism for black hole formation. As a result, one obtains an exact analytic understanding of the Choptuik scaling.
Web-Based Machine Translation as a Tool for Promoting Electronic Literacy and Language Awareness

ERIC Educational Resources Information Center

Williams, Lawrence

2006-01-01

This article addresses a pervasive problem of concern to teachers of many foreign languages: the use of Web-Based Machine Translation (WBMT) by students who do not understand the complexities of this relatively new tool. Although networked technologies have greatly increased access to many language and communication tools, WBMT is still…
The Value Simulation-Based Learning Added to Machining Technology in Singapore

ERIC Educational Resources Information Center

Fang, Linda; Tan, Hock Soon; Thwin, Mya Mya; Tan, Kim Cheng; Koh, Caroline

2011-01-01

This study seeks to understand the value simulation-based learning (SBL) added to the learning of Machining Technology in a 15-week core subject course offered to university students. The research questions were: (1) How did SBL enhance classroom learning? (2) How did SBL help participants in their test? (3) How did SBL prepare participants for…
